According to the numba documentation, sets are supported: “All methods and operations on sets are supported in JIT-compiled functions. […] Sets must be strictly homogeneous”
However, when I try creating a set with string values, it doesn’t work:
import numba

@numba.njit
def set_numba():
    s = set()
    s.add('TEST')

set_numba()
yields:
TypingError: - Resolution failure for literal arguments:
AssertionError()
- Resolution failure for non-literal arguments:
AssertionError()
During: resolving callee type: BoundFunction(set.add for set(undefined))
Can someone explain what’s going on here?
Hi @kartiksubbarao,
This means it’s not supported. The message is from the older internal API used by Numba which is in part why it’s not hugely clear. The error comes from here:
numba/core/types/containers.py", line 523, in __init__
assert isinstance(dtype, (Hashable, Undefined))
AssertionError
and it’s due to two bugs: a) UnicodeType should inherit from Hashable, and b) set can only deal with non-reference-counted types (so no string support yet). This is fixed (the inheritance is correct and strings etc. are forbidden) in the upcoming Numba 0.52, see https://github.com/numba/numba/pull/5639. I’ve got a PR lined up to note this in the documentation too.
If you really want something set-like that can handle strings, consider a typed.Dict as a workaround: the keys can be strings, and it can be wrapped in a StructRef or similar for ease of use.
Hope this helps?
Thanks for the explanation @stuartarchibald. I looked into using a typed Dictionary but unfortunately ran into either errors or performance issues.
The reason I need to populate a set/dict in the numba function is to check for uniqueness in a numpy array (passed as an argument) while doing other calculations. If I pass the numpy array as-is and try to store/query its items in the dict, I run into issues like this: https://github.com/numba/numba/issues/4505
If I convert the numpy array to an ordinary python list, I get a deprecation warning https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types
If I convert the numpy array into a typed List, that takes a lot of time and makes the njit version slower than the ordinary python version.
I see, thanks for expanding on this. I’d recommend opening a question in the "How do I do...?" section along with a minimal working example of what you are trying to do. With some code to look at, it might be possible to come up with a solution. Thanks.
Ok, will post an example in that section to illustrate the issue.