C like struct in numba

rpopovici · July 28, 2020, 4:03pm

Hi guys,

Since the Numba docs say nothing about memory management, I have to ask: is there any way in Numba to emulate C-like structs without using @jitclass or typed dicts? jitclass is highly experimental and dicts are slow by nature.

I know that you can emulate C like structures with np.dtypes like this:

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty = nb.from_dtype(struct_dtype)
ty_inst = np.arange(1, dtype=struct_dtype)

ty_inst[0]['row']

Is there any way to instantiate/allocate new structs of type ‘ty’ besides np.arange?
Also, I want to be able to access fields with dot notation, like this: ty_inst.row

gmarkall · July 28, 2020, 4:30pm

There may be a better solution (someone else may provide it or I may find something later), but one thing that springs to mind is the Interval example for extending Numba. It implements an extension that represents an interval using a struct of two floats: http://numba.pydata.org/numba-doc/latest/extending/interval-example.html

The example addresses the two operations (initializing and using dot notation) by demonstrating:

Lowering the constructor (with @type_callable) - this allows initializing the struct in a customized way. If you want to do the initialization outside a jitted function, the Python implementation of the class takes care of this - if you do it in a jitted function, the native code to initialize the struct is generated.
Attribute access (with make_attribute_wrapper) - i.e. you can access attributes with the dot notation, instead of the getitem-like mechanism.

Based on this example, are you able to build an extension that would fit your needs?

rpopovici · July 28, 2020, 4:42pm

@gmarkall I saw the Interval example, but it’s not mutable. I need it to be mutable. The docs say that making it mutable is more complicated, without explaining how to do it…

rpopovici · July 28, 2020, 5:08pm

Another way is to use named tuples, but those are also immutable.

luk-f-a · July 28, 2020, 7:29pm

I think NumPy records do what you want. Adjusting your example:

ty_inst = np.rec.array(1, dtype=struct_dtype)[0]
ty_inst.row

I work a lot with records and I don’t know any way to instantiate directly, without creating an array and slicing it.

If you can live with the limitation of only working with NumPy types (so no lists, etc.) then this is a great way. Otherwise you have to use jitclass or the new StructRef (https://github.com/numba/numba/pull/5993), but both are experimental.
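As a minimal sketch of the record approach described above (plain NumPy, outside any jitted function): the variable names `arr` and `rec` are illustrative, and `np.zeros` is used here only so the fields start from known values.

```python
import numpy as np

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])

# A zero-initialised structured array, viewed as a recarray so that
# fields can be reached with dot notation.
arr = np.zeros(1, dtype=struct_dtype).view(np.recarray)
rec = arr[0]            # a single np.record

# Fields are mutable through attribute access.
rec.row = 11.5
rec.col = -15.5

print(rec.row, rec.col)
# The record is a view into the array, so the array sees the writes too.
print(arr.row, arr.col)
```

There is still an array behind the record, which matches the point above that a record is obtained by creating an array and indexing it.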

rpopovici · July 29, 2020, 7:42am

I think you have a typo.

You can use either:

ty_rec = np.recarray(1, dtype=struct_dtype)[0]

or

ty_arr = np.zeros(1, dtype=struct_dtype)
ty_rec = np.rec.array(ty_arr, dtype=struct_dtype)[0]

or view

ty_rec = ty_arr.view(np.recarray)[0]

luk-f-a · July 29, 2020, 10:56am

I guess all 3 options end up the same, don’t they?

Did the answer help despite the typo?
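A quick way to compare the three options is to build one record with each and inspect the results. This is only a sketch; the names `rec_a`, `rec_b` and `rec_c` are made up for the comparison. As far as I can tell, all three give an `np.record` with the same fields, but they differ in whether the record shares memory with `ty_arr`.

```python
import numpy as np

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty_arr = np.zeros(1, dtype=struct_dtype)

# Option 1: fresh recarray (its memory is uninitialised).
rec_a = np.recarray(1, dtype=struct_dtype)[0]
# Option 2: np.rec.array copies its input by default (copy=True).
rec_b = np.rec.array(ty_arr, dtype=struct_dtype)[0]
# Option 3: a view shares memory with ty_arr.
rec_c = ty_arr.view(np.recarray)[0]

for rec in (rec_a, rec_b, rec_c):
    print(type(rec).__name__, rec.dtype.names)

# Writing through the copy leaves ty_arr alone; writing through the
# view changes it.
rec_b.row = 1.0
rec_c.row = 2.0
print(ty_arr['row'])
```

So the record type is the same in each case, but only the view keeps `ty_arr` in sync with the record.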

rpopovici · July 29, 2020, 11:18am

Yes, it did. Thanks. I am now trying the StructRef thing. I didn’t know about that one because the official docs are not up to date.

rpopovici · July 30, 2020, 1:11pm

@luk-f-a do you have any idea about how to pass type definitions to StructRef fields?

luk-f-a · July 30, 2020, 1:24pm

Sorry, I haven’t used them yet.

gmarkall · August 3, 2020, 9:58am

@rpopovici to follow up on making things mutable, it seems to me that there’s a straightforward way to make the Interval example (and anything similar to it) mutable - however, I fear I may be missing some edge case or other issue, so I’ve created this thread with the addition to the example and a couple of questions in case it’s not as straightforward as I think it is.

rpopovici · August 18, 2020, 7:56pm

Calling np.recarray() is not supported in njit mode, but if I call it from outside then it works, and I can also pass the record as a parameter to njit functions. Why? Is there any way to create np.recarrays from njit functions?

import numpy as np
import numba as nb
from numba import typed, typeof, njit, int64, types

struct_dtype = np.dtype([('row', np.float64), ('col', np.float64)])
ty = nb.from_dtype(struct_dtype)
ty_inst = np.arange(1, dtype=struct_dtype)

ty_rec = np.recarray(1, dtype=struct_dtype)[0]
ty_rec.row = 11.5
print(ty_rec)

@njit
def test(rec):
    ty_arr = np.zeros(1, dtype=struct_dtype)

    ty_rec = np.recarray(1, dtype=struct_dtype)[0]   # error here
    ty_rec.row = 11.5

    rec.col = -15.5
    print(rec)
    print(ty_arr)

test(ty_rec)

luk-f-a · August 18, 2020, 9:18pm

@njit
def test(rec):
    ty_arr = np.zeros(1, dtype=struct_dtype)

    ty_rec = np.zeros(1, dtype=struct_dtype)[0]   
    ty_rec.row = 11.5

    rec.col = -15.5
    print(rec)
    print(ty_arr)

AFAIK, there’s no distinction between structured arrays and record arrays in Numba.

wkerzendorf · October 12, 2020, 12:19pm

We are having the same problem with the struct. We are using jitclass for now but wondering if generating thousands of these is slower than using something else. What were your findings?

rpopovici · October 12, 2020, 2:23pm

The only options I know of for heap allocated data structures are:

StructModel: stack allocated, originally intended for low-level LLVM use. More info in the thread “Interval example: Why do mutable types need more sophisticated data models?”
StructRef: new, experimental, heap allocated
jitclass: heap allocated
numpy.recarray
named tuple: immutable
