Interval example: Why do mutable types need more sophisticated data models?

EDIT / UPDATE: The answer is that types with a StructModel are passed by value, so mutations by callees aren’t reflected in the caller.

Original post:

The Interval example states, in “Defining the Data Model for Native Intervals”:

Mutable types need more sophisticated data models to be able to persist their values after modification. They typically cannot be stored and passed on the stack or in registers like immutable types do.

However, it seems to me that supporting mutation of its attributes could be implemented with:

from numba.core import cgutils
from numba.core.extending import lower_setattr_generic, lower_setattr


@lower_setattr_generic(IntervalType)
def interval_set(context, builder, sig, args, attr):
    # Load the struct from the stack, update the named attribute...
    val = cgutils.create_struct_proxy(interval_type)(context, builder, value=args[0])
    setattr(val, attr, args[1])
    # ...and write the whole struct back to the stack slot it was loaded from.
    builder.store(val._getvalue(), args[0].operands[0])


@lower_setattr(IntervalType, 'hi')
def interval_set_hi(context, builder, sig, args):
    val = cgutils.create_struct_proxy(interval_type)(context, builder, value=args[0])
    setattr(val, 'hi', args[1])
    builder.store(val._getvalue(), args[0].operands[0])

The above functions are a generic set (to work with any named attribute) and one specifically for hi. After loading the struct and modifying its value, we write the modified struct back to the stack.
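
For context, this builds on the data model and attribute wrappers from the documentation’s Interval example, roughly as below - note that make_attribute_wrapper only exposes attributes read-only, which is why a setattr lowering is needed at all:

from numba.core import types
from numba.core.extending import models, register_model, make_attribute_wrapper

# IntervalType / interval_type as defined in the documentation example.

@register_model(IntervalType)
class IntervalModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [
            ('lo', types.float64),
            ('hi', types.float64),
        ]
        models.StructModel.__init__(self, dmm, fe_type, members)

# Read-only attribute access; the setattr lowering above adds the write side.
make_attribute_wrapper(IntervalType, 'lo', 'lo')
make_attribute_wrapper(IntervalType, 'hi', 'hi')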

I can’t think of a situation in which this won’t work - am I missing the reason why we can’t have mutable stack-allocated structs in extensions? Or could the example do with some modification to add setattr for Interval as well?

The only reason I can think of why you should not make a stack-allocated object mutable is when you try to pass it by pointer while returning from a function, but other than that, I see no reason why. A stack-allocated object should be the same as a heap-allocated one, with the exception that it is usually short-lived and much cheaper to allocate.

The only reason I can think of why you should not make a stack-allocated object mutable is when you try to pass it by pointer while returning from a function, but other than that, I see no reason why.

That’s true - though, I think there’s no way in Numba to end up returning a stack-allocated structure as a pointer (which would be an error in general), so this shouldn’t be an issue for the above implementation.

@sklam might remember what the intent was here. I suspect the hidden assumption in that text is that these mutable values are reference counted and passed by pointer, with the associated additional complexity. That’s not true generically, however, as you note.

OK, I’ve realised the rationale, which seems obvious in retrospect… Types with a StructModel are passed into functions by value, so mutations by a callee are not reflected in the caller. E.g.:

from numba import jit

# Interval is the pure-Python class from the documentation example,
# registered with Numba as shown there.

@jit(nopython=True)
def mutate_interval(i):
    i.lo = 10
    print(i.lo)

@jit(nopython=True)
def func():
    y = Interval(9.1, 9.2)
    print(y.lo)
    mutate_interval(y)
    print(y.lo)

func()

prints

9.1
10.0
9.1

which is obviously incorrect.

:crazy_face:

@gmarkall this appears to be wrong by design. Passing structs by value is usually slow, unless your struct has only two fields for didactic purposes… leaving aside the fact that you don’t get to choose whether you want to pass by reference

this appears to be wrong by design.

I’m not sure that “wrong by design” is an accurate description of the situation:

Passing structs by value is usually slow, unless your struct has only two fields for didactic purposes

“Pass by value” here refers to the semantics, not necessarily to what the generated code actually ends up doing. In practice the callee will be inlined into the caller as part of the Numba compilation, and I’d expect LLVM to optimize away the copy itself and to pass more directly only the members that are actually accessed. If I modify the example above to get rid of the print calls (because they complicate the IR a bit and I’d like a simple example):

@jit(nopython=True)
def mutate_interval(i):
    i.lo = 10
    return i.lo

@jit(nopython=True)
def func():
    y = Interval(9.1, 9.2)
    mutated_lo = mutate_interval(y)
    return y.lo, mutated_lo

print(func())

this gives:

(9.1, 10.0) # Still incorrect, demonstrates behaviour unchanged

the LLVM IR for the function optimizes down to:

  %retptr.repack5 = bitcast [2 x double]* %retptr to double*
  store double 9.100000e+00, double* %retptr.repack5, align 8
  %retptr.repack2 = getelementptr inbounds [2 x double], [2 x double]* %retptr, i64 0, i64 1
  %0 = bitcast double* %retptr.repack2 to i64*
  store i64 4621819117588971520, i64* %0, align 8

Here there is no call, no passing of a struct (even the struct itself is optimized away), and no copying.
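
For anyone who wants to reproduce this: once the function has been compiled, the dispatcher can dump the optimized LLVM IR, e.g.:

# After func() has run once (so a compiled specialization exists):
for sig, llvm_ir in func.inspect_llvm().items():
    print(sig)
    print(llvm_ir)

# func.inspect_asm() and func.inspect_types() are similarly useful.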

leaving aside the fact that you don’t get to choose whether you want to pass by reference

You can make the choice by defining your extension type either as a struct like in the Interval example, or as a StructRef.
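
For illustration, here is a rough sketch of how an Interval could be defined with a StructRef instead, following the pattern of the numba.experimental.structref documentation example (the class names here are made up for this sketch). Because a StructRef is a refcounted pointer to heap-allocated data, a callee’s mutations are visible to the caller:

from numba import njit
from numba.core import types
from numba.experimental import structref


@structref.register
class IntervalRefType(types.StructRef):
    pass


class IntervalRef(structref.StructRefProxy):
    pass


# Wire the Python proxy class up to the Numba type and its two fields.
structref.define_proxy(IntervalRef, IntervalRefType, ["lo", "hi"])


@njit
def mutate_interval(i):
    i.lo = 10.0   # mutates the heap-allocated struct in place


@njit
def get_lo(i):
    return i.lo


y = IntervalRef(9.1, 9.2)   # constructed via the generated constructor
mutate_interval(y)
print(get_lo(y))            # 10.0 - the mutation is visible to other users of y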

Here there is no call, no passing of a struct (even the struct itself is optimized away), and no copying.

That’s a happy situation, when the compiler can optimize away the overhead of passing the entire structure by value. Most of the time, that is not the case.

You can make the choice by defining your extension type either as a struct like in the Interval example, or as a StructRef.

This is what worries me. Numba does not have a clear memory management strategy, or it does but it is not documented anywhere. I have to take a look at the generated code from LLVM to see what is going on…
Other than that, the two types of data structures proposed here should have been one and the same, with a jit flag to indicate whether you want to pass a copy or not, and maybe one more flag for deep copies

Besides this, the name StructRef is not very intuitive. From what I have seen, StructRef behaves more like a type template from which you can instantiate new data types, and the ref aspect of it indicates that it is being passed by reference

And by “memory management strategy” I am talking about what happens with your primitives/objects when you cross the bridge back and forth from Python to native.
I am talking here about reference counting (when and why), escape analysis, and stack allocation vs. heap allocation.
I don’t even know if Numba uses any kind of automatic memory management, like a garbage collector or automatic reference counting. I am not saying that it should, but it is not indicated anywhere

Another one: mutability vs. immutability - how is it being handled? Like Python does it, like C++ does it, or some other way?

Even more :slight_smile: NumPy. We know that standard NumPy doesn’t handle slicing like Python does. How does Numba do it?

That’s a happy situation, when the compiler can optimize away the overhead of passing the entire structure by value. Most of the time, that is not the case.

Since Numba usually inlines the callees in a @jit function, and structs are immutable and passed by value, I tend to find that the overhead is optimized away in the cases I’ve looked at. I’ll agree that this isn’t true for compilers and languages in general, but I’d be keen to see cases where LLVM fails to optimize away these passes by value in Numba-generated IR - if you have specific examples you can share, it would be good to investigate them.

Numba does not have a clear memory management strategy, or it does but it is not documented anywhere.

My understanding is that there’s a strategy, but it is still going to be subject to some changes as necessary before 1.0, and it does require further documentation at some point - for example StructRef has only just been added recently. I’m not a core developer so my opinion here should be taken with a grain of salt, but my view is that the strategy will become more concrete and clearly documented on the route towards 1.0.

I have to take a look at the generated code from LLVM to see what is going on…

I too spend a lot of time looking at the LLVM IR. I mostly look at the optimized IR because it’s simpler to read, but some debugging and understanding needs a look at the unoptimized IR, or at the generated assembly.

Other than that, the two types of data structures proposed here should have been one and the same, with a jit flag to indicate whether you want to pass a copy or not, and maybe one more flag for deep copies

I’m struggling to picture how this could be implemented without introducing inconsistencies - anything decorated with @jit should execute with the same semantics as the undecorated function, so having a flag in there that changes semantics seems counter to this.

In implementing a Numba extension, the choice of StructModel vs. StructRef, etc., should be made based on what it takes to replicate the plain Python semantics for the type being implemented.

Besides this, the name StructRef is not very intuitive. From what I have seen, StructRef behaves more like a type template from which you can instantiate new data types, and the ref aspect of it indicates that it is being passed by reference

Numba types are in general templates from which you can instantiate new types - for example in numba/core/types/__init__.py:

int8 = Integer('int8')
int16 = Integer('int16')
int32 = Integer('int32')
int64 = Integer('int64')

float32 = Float('float32')
float64 = Float('float64')

The Integer type is used to instantiate various other data types (int8, int16, etc.) - you could instantiate other-sized integers if you wanted to - and similarly for Float and the others in that file.
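
The Interval example follows the same pattern - IntervalType is the “template”, and a single instance of it is what gets used during typing and lowering:

from numba.core import types

class IntervalType(types.Type):
    def __init__(self):
        super(IntervalType, self).__init__(name='Interval')

interval_type = IntervalType()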

And by “memory management strategy” I am talking about what happens with your primitives/objects when you cross the bridge back and forth from Python to native.

Some description of Boxing and Unboxing (conversion of Python objects to native and vice-versa) is in Low-level extension API — Numba 0.50.1 documentation and part of the Interval example: Example: an interval type — Numba 0.50.1 documentation
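
For reference, the unboxing half of the Interval example looks roughly like this (adapted from the documentation; boxing is the reverse direction):

from numba.core import cgutils
from numba.core.extending import unbox, NativeValue

@unbox(IntervalType)
def unbox_interval(typ, obj, c):
    """Convert an Interval Python object to a native interval structure."""
    lo_obj = c.pyapi.object_getattr_string(obj, "lo")
    hi_obj = c.pyapi.object_getattr_string(obj, "hi")
    interval = cgutils.create_struct_proxy(typ)(c.context, c.builder)
    interval.lo = c.pyapi.float_as_double(lo_obj)
    interval.hi = c.pyapi.float_as_double(hi_obj)
    c.pyapi.decref(lo_obj)
    c.pyapi.decref(hi_obj)
    is_error = cgutils.is_not_null(c.builder.load(c.pyapi.err_occurred()))
    return NativeValue(interval._getvalue(), is_error=is_error)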

I don’t even know if Numba uses any kind of automatic memory management, like a garbage collector or automatic reference counting. I am not saying that it should, but it is not indicated anywhere

There is a reference counting implementation, which is referred to as the “Numba Runtime”: Notes on Numba Runtime — Numba 0.50.1 documentation
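
A quick way to see that reference counting in action is the runtime’s allocation statistics. This is only a sketch: the rtsys import path and get_allocation_stats are as I remember them, and in some Numba versions the statistics counters may need to be enabled before they are populated.

import numpy as np
from numba import njit
from numba.core.runtime import rtsys

@njit
def alloc_and_sum(n):
    a = np.ones(n)   # this allocation is managed by the Numba Runtime (NRT)
    return a.sum()   # the array's refcount drops to zero on return and it is freed

alloc_and_sum(10)
print(rtsys.get_allocation_stats())  # counts of NRT allocations and frees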

Another one: mutability vs. immutability - how is it being handled? Like Python does it, like C++ does it, or some other way?

I’m not quite clear about what “Like Python does” means vs. “Like C++ does it”, but Numba aims to match the semantics of the undecorated function it compiles, so it should probably be thought of as being handled how Python handles it.

Even more :slight_smile: NumPy. We know that standard NumPy doesn’t handle slicing like Python does. How does Numba do it?

For slicing NumPy arrays, the slicing in a Numba-compiled function should match how NumPy does it - for slicing other things, it should match how Python does it. Any deviations from this are likely to be considered bugs.
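
A small check of the array case - slices are views that write through to the original array, just as in NumPy:

import numpy as np
from numba import njit

@njit
def set_via_slice(a):
    s = a[2:5]    # a view, not a copy - same as NumPy
    s[0] = 99.0   # writes through to the original array
    return a

print(set_via_slice(np.zeros(6)))  # element 2 is now 99.0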

I think when approaching Numba internals and extensions, it’s important to keep in mind that Numba is a project with quite a lot of complexity and change that has only a small core team working on it, and borne out of this is a limited set of documentation for a lot of aspects of how it works, especially for more recently added, experimental, and in-progress features. I’m doing what I personally can to improve documentation and provide help / explanations as things progress, but it will necessarily take some time before everything is settled, stable, and complete with comprehensive documentation and examples.

If you have more questions about how things work I’d be happy to try and continue answering them (and likely discovering the specifics of various answers myself at the same time). If you’re able to make any contributions to the documentation that clarify things you’ve found unclear as you discover the answers, I think that would be very much appreciated, and would help move Numba towards the state you’d like to see it in.

Python passes primitives by value (immutable) and objects by reference (mutable). In C++, the default is pass by value unless you pass by reference (&). C pointers are considered pass by value since they hold an address. If I remember correctly, the C++ standard does not specify clearly how a C++ & should be handled by the compiler, but it is probably always implemented as a pointer
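
As a small aside, a pure-Python illustration of that point - mutating an argument is visible to the caller, while rebinding the parameter name is not:

def mutate(items):
    items.append(1)   # mutates the caller's list

def rebind(items):
    items = [99]      # rebinds the local name only; the caller is unaffected

xs = []
mutate(xs)
rebind(xs)
print(xs)  # [1]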

Yes, I would like to contribute, but I am not yet knowledgeable enough about Numba

@gmarkall I asked you about optional types in this thread: https://github.com/numba/numba/issues/6063 - if you have any idea about how to handle that, I would appreciate it if you are willing to share it

On StructModel vs StructRef: StructModel was directly modeled after LLVM literal structure types, and it inherited its immutability from LLVM. It was meant for low-level usage, for accessing an LLVM structure. StructModel was written when there was no Numba-allocated heap; all allocations were done externally. Now, Numba allocates objects on the heap and uses automatic reference counting. StructRef is a refcounted pointer to a StructModel allocated on the heap.

On pass-by-ref vs pass-by-value semantics: these are language semantics regardless of how they are lowered (e.g. whether values are passed by pointer or not). Whether Numba passes a StructModel in registers or on the stack is a matter of calling convention. For a large StructModel, Numba should really be passing it in a caller-allocated stack slot, much like the C++ calling convention, but it is not doing that yet.

StructModels are like Python tuples; in fact, that’s how Numba lowers a tuple. It allows tuples to be made cheaply because it does not involve the heap. Numba has not resolved the optimization for large tuples; currently, large tuples suffer in performance at function boundaries.
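
A small illustration of the tuple analogy - like the StructModel-based Interval, a tuple is a cheap value that a callee cannot modify in place; it can only build and return a new one:

from numba import njit

@njit
def shifted(point, dx, dy):
    # Tuples are immutable, so the only option is to return a new one.
    return (point[0] + dx, point[1] + dy)

@njit
def demo():
    p = (1.0, 2.0)            # lowered as a struct value, no heap allocation
    q = shifted(p, 0.5, 0.5)
    return p, q               # p is unchanged: ((1.0, 2.0), (1.5, 2.5))

print(demo())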

Lastly, if I were to write StructModel and StructRef as C++, it would look something like:

// StructModel: a plain value type whose fields live inline (on the stack or
// in registers); passing it copies the whole struct.
// (Pseudocode: "..." stands for any number of additional fields.)
template <typename Tfield0, typename Tfield1, ...>
struct StructModel{
   Tfield0 attr0;
   Tfield1 attr1;
   ...
};


// StructRef: a refcounted pointer to a StructModel allocated on the heap,
// so every holder of the pointer sees mutations.
template <typename Tfield0, typename Tfield1, ...>
shared_ptr< StructModel<Tfield0, Tfield1, ...> > make_structref(Tfield0 attr0, Tfield1 attr1, ...);