What does it mean that argument types are unaligned?

I have the following dtype definition

pointDtype = np.dtype([('x', 'f4'),('y', 'f4'),('z', 'f4')])
pointNBtype = nb.from_dtype(pointDtype)

and the following function,

@njit(nb.bool_(pointNBtype, pointNBtype))
def equal(p0, p1):
    return p0.x == p1.x and p0.y == p1.y and p0.z == p1.z

After I created the following,

p0 = np.array((1,2,0), dtype= pointNBtype)
p1 = np.array((0,0,0), dtype= pointNBtype)

I get the following error,

00 args = [self.typeof_pyval(a) for a in args]
    701 msg = ("No matching definition for argument type(s) %s"
    702        % ', '.join(map(str, args)))
--> 703 raise TypeError(msg)

TypeError: No matching definition for argument type(s) unaligned array(Record(x[type=float32;offset=0],y[type=float32;offset=4],z[type=float32...more stuff

This runs fine if I do not specify the signature (i.e. @njit without any signature specs).

Why not leave out the signature, call the function, then print out the deduced nopython signature?

Thanks @nelson2005
I believe that I did that but I was still unsuccessful. Here if I run the following,

pointNBtype

I get the following signature,

Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False)

After running the function I posted with p0 and p1 as defined above, I get the following nopython_signatures:

[(Array(Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False), 0, 'C', False, aligned=False), Array(Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False), 0, 'C', False, aligned=False)) -> bool]

The only difference I see is that the first one does not start with `Array`, and I am not entirely certain why that is.

Maybe I’m missing something, but p0 and p1 look to me like they are arrays?

@nelson2005 I see what you mean. But wouldn’t p0 and p1 be records here?
they were defined as pointNBtype which is what I used when specifying my signature.

They’re arrays of pointNBtype, right? Suppose you used dtype=float. What type would you expect p0 to be?
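
To make the distinction concrete, here is a minimal NumPy-only sketch (the name point_dtype is illustrative, not from the posts above). It assumes the same three-field dtype and shows that np.array() with a single tuple yields a 0-dimensional array, while indexing that array with an empty tuple yields the record itself.

```python
import numpy as np

# Same layout as pointDtype in the question (name chosen for this sketch).
point_dtype = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

# A tuple plus a structured dtype gives a 0-dimensional array, not a record.
p0 = np.array((1, 2, 0), dtype=point_dtype)
print(type(p0).__name__, p0.ndim, p0.shape)

# Indexing with an empty tuple extracts the single record (a np.void scalar).
rec = p0[()]
print(type(rec).__name__, rec['x'], rec['y'], rec['z'])
```

The array is what Numba reports as array(Record(...)) in the error message; the extracted scalar is what a signature written with pointNBtype would expect.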

I see it now… I was confused thinking that the dtype was actually the array. However, if I change the signature to the following,

@njit(nb.bool_(pointNBtype[::1], pointNBtype[::1]))

that should be the same as the one provided by .nopython_signatures, but it still blows up, just with a different error.

[...]
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No conversion from array(bool, 1d, C) to bool for '$98return_value.1', defined at None

Your function does operations on arrays, and returns an array, right? Is there some reason you don’t want to use the automatically deduced signature?
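
As a rough illustration of why the return type becomes an array, here is a plain NumPy sketch (not compiled with Numba; the names a, b and point_dtype are made up for this example). Comparing a field of two 1-d structured arrays is elementwise, so the result has one boolean per record, which matches the array(bool, 1d, C) in the TypingError.

```python
import numpy as np

point_dtype = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

a = np.array([(1, 2, 0), (0, 0, 0)], dtype=point_dtype)
b = np.array([(1, 2, 0), (1, 0, 0)], dtype=point_dtype)

# Comparing one field of two 1-d arrays is elementwise: one boolean per record.
x_equal = a['x'] == b['x']
print(x_equal.dtype, x_equal.shape)

# Reducing over every field gives a single Python bool for the whole array.
all_equal = bool(np.all([a[name] == b[name] for name in point_dtype.names]))
print(all_equal)
```

A signature that declares a scalar bool return therefore only fits a function that receives single records, or one that reduces the elementwise result explicitly.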

Hi @nelson2005, that is an appropriate comment. I am transforming a long program by rewriting all of my functions for use with numba in order to optimize it. So I thought it better to specify as much as I can. But it might just be that I don’t really need to.

Right - the general recommendation as I understand it is to let numba figure out the signature unless you have a particular reason to specify it. Even then, an inferred signature is a great place to start.

It will certainly make life easier ! Thanks for your help!

Hey @MLLO ,

If you want to compare structured arrays for equality you can use the function np.array_equal to check the underlying data.
If you want to compare single records for equality you can use their field names as attributes.
What are the benefits to use structured instead of standard arrays in your case?

import numpy as np
import numba as nb

record_type = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

#=============================================================================
# This defines a 0-dimensional array of record_type not a record
#=============================================================================
mypoint = np.array((0,0,0), dtype=np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')]))
print(f'type: {type(mypoint)}, ndim: {mypoint.ndim}')
# type: <class 'numpy.ndarray'>, ndim: 0

#=============================================================================
# Compare arrays
#=============================================================================
@nb.njit
def equal_arrays(p0, p1):
    return np.array_equal(p0.view(np.float32), p1.view(np.float32))

p0 = np.array([(0,0,0), (1,0,0)], dtype=record_type)
p1 = np.array([(0,0,0), (1,0,0)], dtype=record_type)
print(f'Arrays equal? => {equal_arrays(p0, p1)}')
# Arrays equal? => True

#=============================================================================
# Compare records
#=============================================================================
@nb.njit
def equal_records(p0, p1):
    return p0.x == p1.x and p0.y == p1.y and p0.z == p1.z

points = np.array([(0,0,0),
                   (0,0,0)], dtype=record_type)

print(f'Records equal? => {equal_records(points[0], points[1])}')
# Records equal? => True

Thanks @Oyibo !

I still seem to be confused about how numba and numpy structured arrays behave. A case in point (building on the point spec above):

entryDtype = np.dtype([('error', 'f4'),
                      ('triangle_id', 'i8'),
                      ('point', pointNBtype)])

entryNBtype = nb.from_dtype(entryDtype)

p0 = np.array((0,0,0), dtype= pointNBtype)
entry0 = np.array((0.5, 0, p0), dtype=entryNBtype)

I define a numba list as such,

a = nb.typed.List.empty_list(entryNBtype)

I try the following,

a.append(entry0)

and it blows up. There is obviously something that I am missing, but I can’t see what it is.

Complete programs are helpful for responders.
In this case you’re trying to append an array to a list-of-entryNBtype? Anytime you use np.array(), the resulting type is an array.

Hi @nelson2005
I don’t know if I fully understand your answer.
I am looking to have a list that will store entries (as specified), so a list of structured numpy arrays, if that makes sense. How would this be accomplished?

@Oyibo
Following up on your comments…
How would you generate a list and a dictionary to store a structured array such as your mypoint in numba? I know how I would do this if mypoint was defined as

mypoint = np.array([ (0,0,0) ], dtype=record_type)

with square brackets. I ask because, if the former is not possible, then your second option is the one I need to adopt (though I would prefer NOT to use square brackets).

Hey @MLLO ,

A NumPy structured array is like a lightweight version of a DataFrame. It’s useful when you need to store data with different types in a table format. However, if you’re dealing with homogeneous data, such as point coordinates, a regular NumPy array is sufficient, too. On the other hand, if structured data storage and advanced data operations are needed, consider using a Pandas DataFrame, which offers more functionality, especially for joining data based on indices.

#=============================================================================
# imports
#=============================================================================
import numpy as np
from numpy.lib import recfunctions as rfn
import pandas as pd

#=============================================================================
# Creating a numpy array
#=============================================================================
numpy_array = np.arange(9, dtype=np.float32).reshape(3, 3)
print("Numpy Array:")
print(numpy_array)

#=============================================================================
# Creating a structured array
#=============================================================================
dtype = [('x', np.float32), ('y', np.float32), ('z', np.float32)]
structured_array = rfn.unstructured_to_structured(numpy_array, dtype=dtype)
print("\nStructured Array:")
print(structured_array.dtype.names)
print(structured_array.view(np.float32).reshape(3, 3))

#=============================================================================
# Creating a pandas dataframe
#=============================================================================
df = pd.DataFrame(structured_array)
print("\nDataframe:")
print(df)

# Numpy Array:
# [[0. 1. 2.]
#  [3. 4. 5.]
#  [6. 7. 8.]]

# Structured Array:
# ('x', 'y', 'z')
# [[0. 1. 2.]
#  [3. 4. 5.]
#  [6. 7. 8.]]

# Dataframe:
#      x    y    z
# 0  0.0  1.0  2.0
# 1  3.0  4.0  5.0
# 2  6.0  7.0  8.0

thanks @Oyibo

No, I was not looking for anything as complicated as a dataframe (another possibility would be an xarray). I am more interested in a very light data structure. In the example you produced earlier you were handling a point data type as a record without making it into an array. I was interested in that possibility (e.g. the possibility of making a list of point records). I don’t know if this makes sense.

M.

Hey @MLLO ,
if you don’t want to define structured arrays but just single records you can use the record class.

import numpy as np
import numba as nb

record_type = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
p0 = np.record((0,0,0), dtype=record_type)
p1 = np.record((1,0,0), dtype=record_type)
pointType = nb.typeof(p0)
points = nb.typed.List.empty_list(pointType)
points.append(p0)
points.append(p1)
print(points)
# [(0., 0., 0.), (1., 0., 0.), ...]

Perfect! @Oyibo
I have used a lot of numpy but never used this! Thank you.