# How to pass a NumPy array of lists to a @guvectorize function?

I have written several functions with @vectorize/@guvectorize; they work very well and the performance is very good.
Now I would like to pass a NumPy array of lists as an argument, but I know Numba does not support this because it is an array of pyobjects.
In my code, I need an array that can hold data of different lengths. I am implementing a physics problem where objects (molecules) can have a different number of components (e.g. H2O, C4H6O2, …), and I would like all of this to be contained in a NumPy array. Each element of the array represents a molecule with its various components. So far, my idea is to use an array declared as follows:

    array = np.zeros(size, dtype=object)

This allows me to do something like this:

    array[0] = [2, 1]
    array[1] = [4, 6, 2]

Each list corresponds to the components of a molecule.
However, I now need to pass this array as an argument to a @guvectorize function, and I know that Numba does not accept this type of data (pyobject).
Any idea? A work-around?
Or any alternative to replace this array of lists by something that could be accepted by Numba?

Hi @xtof2020, would you be able to use a certain number to designate no data? Then you can normalize all the arrays to the same length. 2D NumPy arrays can be passed to `@guvectorize`.

For example, below I use -1.0 to designate no data in a simple sum function across a 2D array:

```python
import numpy as np
from numba import guvectorize

@guvectorize(['f8[:,:], f8[:]'], '(m,n) -> ()')
def sum(array2d, result):
    m, n = array2d.shape
    tmp_result = 0.0
    for i in range(m):
        tmp = array2d[i]
        tmp = tmp[np.where(tmp != -1.0)]
        for val in tmp:
            tmp_result += val
    result[0] = tmp_result

nodata = -1.0
a = np.array((1.0, 2.0, 3.0, nodata))
b = np.array((1.0, 2.0, nodata, nodata))
c = np.array((1.0, nodata, nodata, nodata))
abc = np.vstack((a, b, c))

sum(abc)  # 10.0
```
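
The normalization step itself could look roughly like the sketch below, assuming the data starts out as Python lists of different lengths. The helper name `pad_to_2d` is hypothetical (it is not part of NumPy or Numba), and -1.0 is just the sentinel used in the example above:

```python
import numpy as np

def pad_to_2d(rows, nodata=-1.0):
    """Pack variable-length lists into one 2D float array, padded with nodata."""
    width = max(len(row) for row in rows)
    # Start from an array filled with the sentinel, then overwrite the real data.
    padded = np.full((len(rows), width), nodata, dtype=np.float64)
    for i, row in enumerate(rows):
        padded[i, :len(row)] = row
    return padded

molecules = [[2, 1], [4, 6, 2]]  # the two lists from the question
padded = pad_to_2d(molecules)
print(padded)
```

The padding runs in plain Python, once, before any compiled function is called, so the object-dtype array never has to reach Numba.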

Hi @xtof2020, if you are not able to create 2D arrays from your data, as suggested by @ryanchien, I recommend you look into Awkward Array (https://github.com/scikit-hep/awkward-1.0). It was designed precisely for dealing with arrays that contain arrays of different lengths. It’s compatible with Numba, so you can pass Awkward arrays into jitted functions.

Do you really need `guvectorize`? Numba works well with explicit loops, you don’t need to “vectorize” to get speed. I don’t know if awkward array works with `guvectorize`.
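
As a rough sketch of the explicit-loop approach, the function below computes one sum per row of a padded 2D array and skips the sentinel value. The name `row_sums` and the -1.0 sentinel are assumptions for illustration; the `try`/`except` is only there so the snippet also runs (uncompiled) where Numba is not installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs as plain Python without Numba.
    def njit(func):
        return func

@njit
def row_sums(array2d, nodata):
    m, n = array2d.shape
    out = np.zeros(m)
    for i in range(m):
        for j in range(n):
            val = array2d[i, j]
            if val != nodata:  # skip padding entries
                out[i] += val
    return out

abc = np.array([[1.0, 2.0, 3.0, -1.0],
                [1.0, 2.0, -1.0, -1.0],
                [1.0, -1.0, -1.0, -1.0]])
totals = row_sums(abc, -1.0)
print(totals)
```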

Hi @ryanchien and @luk-f-a.
Thanks for your suggestions. Actually, since I sent my post yesterday, I have been working on an approach using 2D arrays, as suggested by @ryanchien. I think that it is the simplest way. Since the data (uint) represents the number of atoms of a given type, I guess that using 0 as nodata would work.
Nevertheless, Awkward arrays also look interesting. I will have a look at them because other parts of my code could use them…
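
To illustrate the zero-padding idea, here is a small sketch assuming each row holds the atom counts of one molecule and 0 marks an unused slot. It relies on the assumption that a real entry is never 0; the variable names are illustrative only:

```python
import numpy as np

# Rows: H2O -> [2, 1], C4H6O2 -> [4, 6, 2], padded with 0 to a common width.
counts = np.array([[2, 1, 0],
                   [4, 6, 2]], dtype=np.uint32)

# Zeros add nothing to a sum, so no masking is needed here.
atoms_per_molecule = counts.sum(axis=1)
# The number of real components is the number of non-zero entries per row.
components_per_molecule = np.count_nonzero(counts, axis=1)

print(atoms_per_molecule, components_per_molecule)
```

With 0 as the sentinel, sums work without any filtering, whereas operations such as a product or a minimum would still need to skip the padding explicitly.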

