Hi team!
Does anyone know how to make a list comprehension work with NumPy arrays inside?
I’m having trouble getting the following jitted function to work. It is a simplified version of a larger piece of code:
import numba as nb
import numpy as np

@nb.njit()
def multimask(search_arr: np.ndarray, bigarray: np.ndarray):
    return np.array([bigarray == q for q in search_arr])  # <- Numba error
It doesn’t work when jitted. Below are more explanations and questions…
Expected return:
The function should return a boolean 2D array where each row shows where the corresponding value in search_arr was found in bigarray. For example:
>>> a = np.array([10,20,10,30,10])
>>> s = np.array([10,30])
>>> multimask(s, a)
array([[ True, False,  True, False,  True],   # Found 10 in 1st, 3rd, 5th
       [False, False, False,  True, False]])  # Found 30 in 4th item
But I get the following error:
ERROR:
No implementation of function Function(<built-in function setitem>) found for signature:

 >>> setitem(array(undefined, 1d, C), int64, array(bool, 1d, C))

There are 16 candidate implementations:
  - Of which 16 did not match due to:
    Overload of function 'setitem': File: <numerous>: Line N/A.
      With argument(s): '(array(undefined, 1d, C), int64, array(bool, 1d, C))':
     No match.
Workaround, but verbose…
While I managed to work around the JIT error by unrolling the list comprehension into an explicit loop, I don’t like the verbosity:
@nb.njit()
def multimask_okcompile(search_arr: np.ndarray, bigarray: np.ndarray):
    size_search = len(search_arr)
    size_bigarray = len(bigarray)
    arr_mask = np.empty((size_search, size_bigarray), dtype="bool")
    for pos in range(size_search):
        q = search_arr[pos]
        arr_mask[pos, :] = (bigarray == q)
    return arr_mask
… I would prefer a cleaner, more Pythonic solution, ideally a list comprehension or something similar (if that is possible in Numba or NumPy).
Any ideas?
I know that list comprehension of simple types works fine in Numba, but I’m not sure how to make List Comprehension work with NumPy arrays inside. I’m wondering:
- Do I need to add (the right) type hints?
- Can TypedList be used somehow to avoid the explicit loop?
- Or maybe I need to overload np.array or the built-in function setitem in some way? (I have taken a look at Numba issue 4470 (“Can’t create a numpy array from a numpy array”) to overload np.array so that it accepts arrays. That would be fine for me if something like this works while waiting for Numba to support arrays inside list comprehensions.)
Any direction or help on this will be highly appreciated!
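One direction that might avoid both the list comprehension and the explicit loop is NumPy broadcasting inside the jitted function. The following is only a sketch: the name multimask_broadcast is hypothetical, it assumes search_arr is a contiguous 1D array, and whether it compiles may depend on the installed Numba version. The try/except fallback is there only so the snippet still runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): a no-op decorator factory,
    # so the sketch still runs as plain NumPy code without Numba.
    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit()
def multimask_broadcast(search_arr, bigarray):
    # View search_arr as a column of shape (n, 1). Comparing it with the
    # 1D bigarray of shape (m,) broadcasts to an (n, m) boolean array.
    column = search_arr.reshape(search_arr.size, 1)
    return column == bigarray


a = np.array([10, 20, 10, 30, 10])
s = np.array([10, 30])
mask = multimask_broadcast(s, a)  # one row per value in s
```

Each row of mask corresponds to one value of s, as in the expected return described earlier in this post.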
What I have explored so far:
Before posting here, I did a lot of testing and research (in the documentation, this forum, and even ChatGPT). So far, I have tried:
- Adding type hints within the function in different ways.
- Defining the types outside the JIT scope and adding them as a dtype parameter.
- Searching unsuccessfully for a different NumPy function to replace the Numba code (I have tried np.isin, np.vstack, and different NumPy array creation functions) in case it is not possible to make list comprehension work with NumPy arrays inside.
- Reading a lot about Numba arrays of arrays or lists of arrays (for example, “passing a list of numpy arrays into np array with numba”). However, something like make_2d() is just similar to my verbose workaround.
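For reference, outside of Numba the expected result can be written without a list comprehension or a loop. This is a minimal pure-NumPy sketch (no jitting involved) using the example arrays from this post. The names mask_broadcast and mask_outer are only illustrative.

```python
import numpy as np

a = np.array([10, 20, 10, 30, 10])
s = np.array([10, 30])

# s[:, None] has shape (2, 1); comparing it with a (shape (5,))
# broadcasts to a (2, 5) boolean array, one row per value in s.
mask_broadcast = s[:, None] == a

# Equivalent formulation: apply the np.equal ufunc to every (s_i, a_j) pair.
mask_outer = np.equal.outer(s, a)
```

Both expressions build the whole 2D mask in one step, so memory use grows with len(s) * len(a).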