Assigning to NumPy structured array using a tuple in `@jitclass`

I wrote this NumPy dynamic array class that I’m planning to make generic as outlined in this question, but in order to do that, I would need to be able to assign to the elements of the underlying array in a different way, preferably using tuples. This is generally possible but the following code doesn’t work:

Structure = np.dtype([('a', np.int8), ('b', np.float32)])

@jitclass([('capacity', nb.types.int32), ('length', nb.types.int32), ('array', nb.from_dtype(Structure)[:])])
class NumpyArrayList():
  def __init__(self, capacity):
    self.capacity = capacity
    self.length = 0
    self.array = np.empty(capacity, dtype=Structure)

  def append(self, element):
    if self.length >= self.capacity:
      new_size = self.capacity * 2

      new_array = np.empty(new_size, dtype=self.array.dtype)
      new_array[:self.length] = self.array

      self.capacity = new_size

      self.array = new_array

    self.array[self.length] = element

    self.length += 1
  
  def __getitem__(self, i):
    return self.array[i]

  def get_np_array(self):
    return self.array[:self.length]

array = NumpyArrayList(10)

array.append((1, 10.3))

…and throws the following error:

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
- Resolution failure for literal arguments:
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function setitem>) found for signature:

 >>> setitem(unaligned array(Record(a[type=int8;offset=0],b[type=float32;offset=1];5;False), 1d, A), int32, Tuple(int64, float64))

There are 16 candidate implementations:
    - Of which 16 did not match due to:
    Overload of function 'setitem': File: <numerous>: Line N/A.
      With argument(s): '(unaligned array(Record(a[type=int8;offset=0],b[type=float32;offset=1];5;False), 1d, A), int32, Tuple(int64, float64))':
     No match.

During: typing of setitem at <ipython-input-16-cf2dbcc19977> (21)

File "<ipython-input-16-cf2dbcc19977>", line 21:
  def append(self, element):
      <source elided>

    self.array[self.length] = element
    ^

- Resolution failure for non-literal arguments:
None

During: resolving callee type: BoundFunction((<class 'numba.core.types.misc.ClassInstanceType'>, 'append') for instance.jitclass.NumpyArrayList#7f18dd5a2040<capacity:int32,length:int32,array:unaligned array(Record(a[type=int8;offset=0],b[type=float32;offset=1];5;False), 1d, A)>)
During: typing of call at <string> (3)

In regular NumPy I am able to do this (it even works if I just remove the @jitclass decorator), but under Numba the only way I could achieve something similar is to assign to each field of the custom dtype manually, which obviously wouldn’t be possible in a generic scenario. For example:

def append(self, element):
  ...

  self.array[self.length].a = element[0]
  self.array[self.length].b = element[1]

  ...

As far as I know, individual assignment is the only available option at the moment. I don’t think assigning a tuple to a record works in Numba. It could work, but it has not been implemented.

Luk

Oh, that’s unfortunate. Do you know of any other way something like this could be achieved? And is there a feature request for this, or should I start one? I looked through the issues and couldn’t find one, but maybe I just used the wrong wording.

I didn’t understand why you cannot make the element-by-element assignment work. Is it because you plan to use records with different field names?

Yeah, I plan to make this generic using a factory class/function which would mean the record field names could be anything and field-by-field assignment wouldn’t be possible.

How much do you need this functionality? There are solutions that could work, but they involve going out of your way to make them work. It’s not the straightforward, this-is-how-it-works-in-NumPy way.

For example, have a look at this example from the documentation (Supported NumPy features, Numba 0.52.0 documentation):

import numpy as np
from numba import njit, literal_unroll

arr = np.array([(1, 2)], dtype=[('a1', 'f8'), ('a2', 'f8')])
fields_gl = ('a1', 'a2')

@njit
def get_field_sum(rec):
    out = 0
    for f in literal_unroll(fields_gl):
        out += rec[f]
    return out

You can do the same when writing to the array:

fields_gl = ('a1', 'a2')

@njit
def set_fields(rec, tup):
    i = 0
    for f in literal_unroll(fields_gl):
        rec[f] = tup[i]
        i += 1

Now, if the elements of the tuple are not all of the same type, then the above might fail. There’s probably a solution, but it gets more and more convoluted the more problems you have to solve.

Another route, one that works for sure, is to use code generation. You just generate a string with the code you need (element-by-element assignment) and then use Python’s exec to turn it into a function. That’s the easiest way to get around code that is too dynamic: dynamically generate a version of the code that is more static.
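A minimal sketch of that code-generation idea could look like the following. The names make_field_setter and decorator are hypothetical, made up for this illustration. A plain dict stands in for a record so the snippet runs without NumPy or Numba. Passing numba.njit as the decorator is the intended use, but that part is an assumption and is not exercised here.

```python
def make_field_setter(field_names, decorator=None):
    """Generate set_fields(rec, tup) with one assignment per field.

    make_field_setter and decorator are hypothetical names for this sketch.
    """
    if not field_names:
        raise ValueError("need at least one field name")

    lines = ["def set_fields(rec, tup):"]
    for i, name in enumerate(field_names):
        # Field name and tuple index are baked in as constants.
        lines.append(f"    rec[{name!r}] = tup[{i}]")
    source = "\n".join(lines) + "\n"

    namespace = {}
    exec(source, namespace)  # turn the generated source into a real function
    func = namespace["set_fields"]
    # Optionally wrap the result, e.g. decorator=numba.njit (assumed, not tested here).
    return func if decorator is None else decorator(func)


# A plain dict stands in for a record so the sketch runs on its own.
set_ab = make_field_setter(("a", "b"))
rec = {}
set_ab(rec, (1, 10.3))
print(rec)  # {'a': 1, 'b': 10.3}
```

For the generic list class, the field names would presumably come from the dtype (for example Structure.names), and append would call the generated function on self.array[self.length] instead of assigning the tuple directly.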

Hope this helps,
Luk