Hi there, I’m looking into numba/numba issue #7693 on GitHub, “Incorrect result copying array-typed field of structured array in 0.55.0rc1”, and I need some help/information.
I’m trying to understand the difference between the functions set1 (which doesn’t work, as per the issue above) and set2 (which does work).

    import numpy as np
    from numba import njit

    src_dtype = np.dtype([
        ("user", np.float64),
        ("array", np.int16, (3,)),
    ], align=True)

    dest_dtype = np.dtype([
        ("user1", np.float64),
        ("array1", np.int16, (3,)),
    ], align=True)

    source = np.empty(5, dtype=src_dtype)
    dest = np.empty(5, dtype=dest_dtype)

    # Fill the first record (assigning to `source` itself would replace the array with a tuple)
    source[0] = (1.2, [1, 2, 3])

    @njit
    def set1(index, src, dest):
        # Direct assignment to the array-typed field: gives the wrong result
        dest['array1'] = src[index]['array']
        return dest

    @njit
    def set2(index, src, dest):
        # Slice assignment into the array-typed field: works
        dest['array1'][:] = src[index]['array']
        return dest

    set1(0, source, dest)
    print(dest)
    set2(0, source, dest)
    print(dest)

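As a baseline for what "works" should mean, here is a minimal sketch of the same two assignments in plain NumPy, without @njit. The names set1_py and set2_py are made up for this comparison, and I use np.zeros instead of np.empty so the untouched fields are deterministic. As far as I understand NumPy's semantics, both forms broadcast the three int16 values into every record and give the same result, so this is the behaviour I would expect from the jitted versions too.

```python
import numpy as np

src_dtype = np.dtype([("user", np.float64), ("array", np.int16, (3,))], align=True)
dest_dtype = np.dtype([("user1", np.float64), ("array1", np.int16, (3,))], align=True)

source = np.zeros(5, dtype=src_dtype)
source[0] = (1.2, [1, 2, 3])


def set1_py(index, src, dest):
    # Hypothetical pure-Python twin of set1: direct field assignment
    dest['array1'] = src[index]['array']
    return dest


def set2_py(index, src, dest):
    # Hypothetical pure-Python twin of set2: slice assignment into the field view
    dest['array1'][:] = src[index]['array']
    return dest


dest_a = set1_py(0, source, np.zeros(5, dtype=dest_dtype))
dest_b = set2_py(0, source, np.zeros(5, dtype=dest_dtype))

# Both variants should write the same values into the array-typed field
same = np.array_equal(dest_a['array1'], dest_b['array1'])
print(same)
print(dest_a['array1'])
```

If the jitted set1 disagrees with set1_py on the same inputs, that narrows the problem to Numba's lowering of the field assignment rather than to the indexing semantics.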
The Numba IR of set1 seems fine (I guess); the full IR is on Pastebin (set1_numba_ir).
This part seems to handle the assignment, and I can’t see anything obviously wrong:

    $const8.3 = const(str, array) :: Literal[str](array)
    $10binary_subscr.4 = static_getitem(value=$6binary_subscr.2, index=array, index_var=$const8.3, fn=<built-in function getitem>)
    dest['array1'] = $10binary_subscr.4

Not seeing anything wrong in the Numba IR, I’m now looking at the LLVM IR to see how the assignment works at that level. I didn’t get very far because I’m confused by how the record array is represented in LLVM IR.
My understanding of the representation of normal, non-record arrays is that they are described by seven members in the data model:

    members = [
        ('meminfo', types.MemInfoPointer(fe_type.dtype)),
        ('parent', types.pyobject),
        ('nitems', types.intp),
        ('itemsize', types.intp),
        ('data', types.CPointer(fe_type.dtype)),
        ('shape', types.UniTuple(types.intp, ndim)),
        ('strides', types.UniTuple(types.intp, ndim)),
    ]

which, for a one-dimensional array, translate to seven variables in LLVM:
i8* %arg.arr.0, i8* %arg.arr.1, i64 %arg.arr.2, i64 %arg.arr.3, double* %arg.arr.4, i64 %arg.arr.5.0, i64 %arg.arr.6.0
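To make the count of seven concrete, here is a small sketch of how I read the mapping between the member names and a 1-D float64 array on the Python side. The correspondence to NumPy attributes (size, itemsize, ctypes.data, shape, strides) is my own assumption, not something taken from the Numba source; meminfo and parent have no direct NumPy counterpart, so they are left out of the dictionary.

```python
import numpy as np

arr = np.zeros(3, dtype=np.float64)

# Assumed Python-side counterparts of the data-model members
members = {
    "nitems": arr.size,          # number of elements
    "itemsize": arr.itemsize,    # bytes per element
    "data": arr.ctypes.data,     # address of the first element
    "shape": arr.shape,          # tuple of ndim integers
    "strides": arr.strides,      # tuple of ndim integers (in bytes)
}

# meminfo, parent, nitems, itemsize and data are one argument each;
# shape and strides appear to contribute ndim integers each
# (%arg.arr.5.0 and %arg.arr.6.0 in the signature above).
n_llvm_args = 5 + arr.ndim + arr.ndim
print(members["nitems"], members["itemsize"], members["shape"], members["strides"])
print(n_llvm_args)
```

Under that reading, a 2-D array would be passed as nine arguments rather than seven.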
For a record array I see different LLVM IR, where only the type of the data member changes:
i8* %arg.arr.0, i8* %arg.arr.1, i64 %arg.arr.2, i64 %arg.arr.3, [8 x i8]* %arg.arr.4, i64 %arg.arr.5.0, i64 %arg.arr.6.0
This was generated from the Python array np.zeros(3, dtype=[('a', np.float64)]). The data member is an array pointer, but I don’t understand the [8 x i8] part. Why 8 times i8?
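One observation that may be relevant: the 8 matches the itemsize of that record dtype in bytes. My guess, which I have not confirmed against the Numba data model, is that a record is represented as an opaque block of itemsize bytes, so data would be a pointer to such a block. A quick sketch to check the sizes on the NumPy side:

```python
import numpy as np

# The dtype used to generate the LLVM signature above
rec_dtype = np.dtype([('a', np.float64)])
print(rec_dtype.itemsize)  # 8

# The aligned source dtype from the reproducer, for comparison
src_dtype = np.dtype([("user", np.float64), ("array", np.int16, (3,))], align=True)
print(src_dtype.itemsize)
print(src_dtype.fields["array"][1])  # byte offset of the array-typed field
```

If the guess holds, the reproducer's src array should show up with a pointer to a byte array whose length equals src_dtype.itemsize, and field accesses would be byte offsets into that block.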