C like struct in numba

Hi guys,

Since the Numba docs say nothing about memory management, I have to ask: is there any way in Numba to emulate C-like structs without using @jitclass or typed dicts? jitclass is highly experimental and dicts are slow by nature.

I know that you can emulate C like structures with np.dtypes like this:

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty = nb.from_dtype(struct_dtype)
ty_inst = np.arange(1, dtype=struct_dtype)

ty_inst[0]['row']

Is there any way to instantiate/allocate new structs with type ‘ty’ besides np.arange?
Also, I want to be able to access fields with dot notation, like this: ty_inst.row

There may be a better solution (someone else may provide it, or I may find something later), but one thing that springs to mind is the Interval example for extending Numba. It implements an extension that represents an interval using a struct of two floats: http://numba.pydata.org/numba-doc/latest/extending/interval-example.html

The example addresses the two operations (initializing and using dot notation) by demonstrating:

  • Lowering the constructor (with @type_callable) - this allows initializing the struct in a customized way. If you want to do the initialization outside a jitted function, the Python implementation of the class takes care of this - if you do it in a jitted function, the native code to initialize the struct is generated.
  • Attribute access (with make_attribute_wrapper) - i.e. you can access attributes with the dot notation, instead of the getitem-like mechanism.

Based on this example, are you able to build an extension that would fit your needs?

@gmarkall I saw the Interval example, but it’s not mutable. I need it to be mutable. The docs say that making it mutable is more complicated, without explaining how to do it…

Another way is to use named tuples, but those are also immutable.

I think NumPy records do what you want. Adjusting your example:

ty_inst = np.rec.array(1, dtype=struct_dtype)[0]
ty_inst.row

I work a lot with records and I don’t know any way to instantiate one directly, without creating an array and indexing into it.

If you can live with the limitation of only working with NumPy types (so no lists, etc.), then this is a great way. Otherwise you have to use jitclass or the new StructRef (https://github.com/numba/numba/pull/5993), but both are experimental.

I think you have a typo.

You want either:

ty_rec = np.recarray(1, dtype=struct_dtype)[0]

or

ty_arr = np.zeros(1, dtype=struct_dtype)
ty_rec = np.rec.array(ty_arr, dtype=struct_dtype)[0]

or a view:

ty_rec = ty_arr.view(np.recarray)[0]

I guess all three options end up with the same result, don’t they?
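
A quick way to check this in plain NumPy is sketched below. The names rec_a, rec_b and rec_c are only for illustration. All three give an np.record that supports dot notation, but they differ in which memory they refer to: np.rec.array copies its input by default, while a view shares memory with the original array.

```python
import numpy as np

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty_arr = np.zeros(1, dtype=struct_dtype)

# Option 1: fresh one-element record array (contents uninitialised).
rec_a = np.recarray(1, dtype=struct_dtype)[0]

# Option 2: np.rec.array copies ty_arr by default (copy=True).
rec_b = np.rec.array(ty_arr, dtype=struct_dtype)[0]

# Option 3: a view shares memory with ty_arr.
rec_c = ty_arr.view(np.recarray)[0]

# All three are np.record scalars, so dot notation works on each.
print(type(rec_a), type(rec_b), type(rec_c))

rec_b.row = 1.0          # writes to the copy only
row_after_b = ty_arr['row'][0]

rec_c.row = 2.0          # writes through to ty_arr
row_after_c = ty_arr['row'][0]

print(row_after_b, row_after_c)
```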

Did the answer help despite the typo?

Yes it did, thanks. I am now trying the StructRef thing. I didn’t know about that one because the official docs are not up to date.

@luk-f-a do you have any idea about how to pass type definitions to StructRef fields?

Sorry, I haven’t used them yet.

@rpopovici to follow up on making things mutable, it seems to me that there’s a straightforward way to make the Interval example (and anything similar to it) mutable - however, I fear I may be missing some edge case or other issue, so I’ve created this thread with the addition to the example and a couple of questions in case it’s not as straightforward as I think it is.

Calling np.recarray() is not supported in njit mode, but if I call it from outside, it works, and I can also pass the record as a parameter to njit functions. Why? Is there any way to create np.recarrays from njit functions?

import numpy as np
import numba as nb
from numba import njit

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty = nb.from_dtype(struct_dtype)
ty_inst = np.arange(1, dtype=struct_dtype)

ty_rec = np.recarray(1, dtype=struct_dtype)[0]
ty_rec.row = 11.5
print(ty_rec)

@njit
def test(rec):
    ty_arr = np.zeros(1, dtype=struct_dtype)

    ty_rec = np.recarray(1, dtype=struct_dtype)[0]   # error here
    ty_rec.row = 11.5

    rec.col = -15.5
    print(rec)
    print(ty_arr)

test(ty_rec)

@njit
def test(rec):
    ty_arr = np.zeros(1, dtype=struct_dtype)

    ty_rec = np.zeros(1, dtype=struct_dtype)[0]   
    ty_rec.row = 11.5

    rec.col = -15.5
    print(rec)
    print(ty_arr)

AFAIK, there’s no distinction between structured arrays and record arrays in Numba.

We are having the same problem with the struct. We are using jitclass for now but wondering if generating thousands of these is slower than using something else. What were your findings?

The only options I know of for heap allocated data structures are:
