Trying to use a set with string values causes AssertionError()

According to the numba documentation, sets are supported: “All methods and operations on sets are supported in JIT-compiled functions. […] Sets must be strictly homogeneous”

However, when I try creating a set with string values, it doesn’t work:

import numba

@numba.njit
def set_numba():
    s = set()
    s.add('TEST')

set_numba()

yields:

TypingError: - Resolution failure for literal arguments:
AssertionError()
- Resolution failure for non-literal arguments:
AssertionError()

During: resolving callee type: BoundFunction(set.add for set(undefined))

Can someone explain what’s going on here?

Hi @kartiksubbarao,

This means it’s not supported. The message is from the older internal API used by Numba which is in part why it’s not hugely clear. The error comes from here:

numba/core/types/containers.py", line 523, in __init__
            assert isinstance(dtype, (Hashable, Undefined))
        AssertionError

and it’s due to two bugs: a) UnicodeType should inherit from Hashable, and b) set can only deal with non-reference-counted types (so no string support yet). Both are fixed in the upcoming Numba 0.52 (the inheritance is correct and strings etc. are forbidden), see https://github.com/numba/numba/pull/5639. I’ve also got a PR lined up to note this in the documentation.

If you really want something set-like that can handle strings, consider a typed.Dict as a workaround: the keys can be strings, and it could be wrapped in a StructRef or similar for ease of use.

Hope this helps?

Thanks for the explanation @stuartarchibald. I looked into using a typed Dictionary but unfortunately ran into either errors or performance issues.

The reason I need to populate a set/dict in the numba function is to check for uniqueness in a numpy array (passed as an argument) while doing other calculations. If I pass the numpy array as-is and try to store/query its items in the dict, I run into issues like this: https://github.com/numba/numba/issues/4505

If I convert the numpy array to an ordinary Python list, I get a deprecation warning: https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

If I convert the numpy array into a typed List, that takes a lot of time and makes the njit version slower than the ordinary python version.

I see, thanks for expanding on this. I’d recommend opening a question in the "How do I do...?" section along with a minimal working example of what you are trying to do. With some code to look at, it might be possible to come up with a solution. Thanks!

Ok, will post an example in that section to illustrate the issue.