Hi there, I’m looking into numba/numba issue #7693, “Incorrect result copying array-typed field of structured array in 0.55.0rc1”, and I need some help/information.
Background
I’m trying to understand the differences between the functions set1 (which doesn’t work, as per the issue above) and set2 (which does work).
import numpy as np
from numba import njit
src_dtype = np.dtype([
    ("user", np.float64),
    ("array", np.int16, (3,))
], align=True)
dest_dtype = np.dtype([
    ("user1", np.float64),
    ("array1", np.int16, (3,))
], align=True)
source = np.empty(5, dtype=src_dtype)
dest = np.empty(5, dtype=dest_dtype)
source[0] = (1.2, [1, 2, 3])
@njit
def set1(index, src, dest):
    dest['array1'] = src[index]['array']
    return dest
@njit
def set2(index, src, dest):
    dest['array1'][:] = src[index]['array']
    return dest
set1(0, source, dest[0])
print(dest[0])
set2(0, source, dest[0])
print(dest[0])
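For reference, here is what I’d expect both functions to produce, sketched in plain NumPy without @njit. This assumes NumPy’s documented behaviour that a structured scalar such as dest[0] acts as a view into its parent array. The names set1_py, set2_py, dest_a and dest_b are only mine for this illustration.

```python
import numpy as np

src_dtype = np.dtype([("user", np.float64), ("array", np.int16, (3,))], align=True)
dest_dtype = np.dtype([("user1", np.float64), ("array1", np.int16, (3,))], align=True)

def set1_py(index, src, dest):
    # Same body as set1, but interpreted by NumPy instead of compiled by Numba.
    dest['array1'] = src[index]['array']
    return dest

def set2_py(index, src, dest):
    # Same body as set2: assign through a slice of the field view.
    dest['array1'][:] = src[index]['array']
    return dest

source = np.zeros(5, dtype=src_dtype)
source[0] = (1.2, [1, 2, 3])

# Separate destinations so the two calls cannot mask each other.
dest_a = np.zeros(5, dtype=dest_dtype)
dest_b = np.zeros(5, dtype=dest_dtype)
set1_py(0, source, dest_a[0])
set2_py(0, source, dest_b[0])
print(dest_a[0]['array1'], dest_b[0]['array1'])
```

In plain NumPy both variants should leave the same three values in array1, which is the behaviour I’d expect from the jitted versions too.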
The Numba IR of set1 seems fine (I guess); the full IR is here: set1_numba_ir - Pastebin.com
This part seems to handle the assignment, and I can’t see anything obviously wrong:
$const8.3 = const(str, array) :: Literal[str](array)
$10binary_subscr.4 = static_getitem(value=$6binary_subscr.2, index=array, index_var=$const8.3, fn=<built-in function getitem>)
dest['array1'] = $10binary_subscr.4
Not seeing anything wrong in the Numba IR, I’m now looking at the LLVM IR to see how the assignment works at that level. I didn’t get very far because I’m confused by how a record array is represented in LLVM IR.
My understanding of the representation of normal, non-record arrays is that they are described by seven members in the data model:
members = [
    ('meminfo', types.MemInfoPointer(fe_type.dtype)),
    ('parent', types.pyobject),
    ('nitems', types.intp),
    ('itemsize', types.intp),
    ('data', types.CPointer(fe_type.dtype)),
    ('shape', types.UniTuple(types.intp, ndim)),
    ('strides', types.UniTuple(types.intp, ndim)),
]
which translate to seven function arguments in the LLVM IR:
i8* %arg.arr.0,
i8* %arg.arr.1,
i64 %arg.arr.2,
i64 %arg.arr.3,
double* %arg.arr.4,
i64 %arg.arr.5.0,
i64 %arg.arr.6.0
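To check my reading of this, here is a small sketch that mirrors the members with plain NumPy attributes and counts how many scalar arguments they flatten to. The mapping from members to NumPy attributes (nitems to arr.size, data to arr.ctypes.data, and so on) is my own assumption, and flattened_count is a hypothetical helper, not a Numba function.

```python
import numpy as np

arr = np.zeros(3, dtype=np.float64)

# Hypothetical mirror of the data model members, filled from NumPy attributes.
# meminfo and parent are runtime pointers, so they are left as placeholders.
members = {
    "meminfo": None,
    "parent": None,
    "nitems": arr.size,
    "itemsize": arr.itemsize,
    "data": arr.ctypes.data,
    "shape": arr.shape,
    "strides": arr.strides,
}

def flattened_count(members):
    # Tuple members are expanded element by element (arg.arr.5.0, arg.arr.6.0, ...).
    return sum(len(v) if isinstance(v, tuple) else 1 for v in members.values())

print(flattened_count(members))
```

If this reading is right, the count is 5 + 2 * ndim, so seven arguments is specific to a 1D array and a 2D array would flatten to nine.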
For a record array I see different LLVM IR:
i8* %arg.arr.0,
i8* %arg.arr.1,
i64 %arg.arr.2,
i64 %arg.arr.3,
[8 x i8]* %arg.arr.4,
i64 %arg.arr.5.0,
i64 %arg.arr.6.0
This was generated from this Python array: np.zeros(3, dtype=[('a', np.float64)]). The data member is an array pointer, but I don’t understand the [8 x i8] part. Why 8 times i8?
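One thing I can check from the NumPy side is the record size. A minimal sketch follows; my guess (an assumption, I have not confirmed it in the Numba source) is that the 8 in [8 x i8] is the record’s itemsize in bytes, i.e. each record is treated as an opaque block of bytes. The name rec_dtype is only for this illustration.

```python
import numpy as np

# The dtype used to generate the LLVM IR above: one float64 field.
rec_dtype = np.dtype([('a', np.float64)])
print(rec_dtype.itemsize)

# The aligned dtype from my example: float64 plus three int16 values.
src_dtype = np.dtype([("user", np.float64), ("array", np.int16, (3,))], align=True)
print(src_dtype.itemsize)

# Byte offset of each field inside one record.
for name, (field_dtype, offset) in src_dtype.fields.items():
    print(name, field_dtype, offset)
```

If the guess holds, I’d expect a record array of src_dtype to show up with a pointer whose array length equals src_dtype.itemsize rather than 8.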
Thanks,
Luk