What does it mean that argument types are unaligned?

MLLO · October 18, 2023, 11:09pm

I have the following dtype definition

pointDtype = np.dtype([('x', 'f4'),('y', 'f4'),('z', 'f4')])
pointNBtype = nb.from_dtype(pointDtype)

and the following function,

@njit(nb.bool_(pointNBtype, pointNBtype))
def equal(p0, p1):
    return p0.x == p1.x and p0.y == p1.y and p0.z == p1.z

After I created the following,

p0 = np.array((1,2,0), dtype= pointNBtype)
p1 = np.array((0,0,0), dtype= pointNBtype)

I get the following error,

    700 args = [self.typeof_pyval(a) for a in args]
    701 msg = ("No matching definition for argument type(s) %s"
    702        % ', '.join(map(str, args)))
--> 703 raise TypeError(msg)

TypeError: No matching definition for argument type(s) unaligned array(Record(x[type=float32;offset=0],y[type=float32;offset=4],z[type=float32...more stuff

This runs fine if I do not specify the signature (i.e., @njit without any signature specs).
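
A note on the word "unaligned" in that message: as far as I can tell, it reflects how the structured dtype was declared rather than a problem with the data. NumPy packs the fields of a structured dtype back to back unless `align=True` is passed, and Numba appears to carry that flag over into its record and array types. The NumPy-only sketch below illustrates the flag; the `mixed_fields` layout is a made-up example and not part of the original post.

```python
import numpy as np

point_fields = [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]

# Default: fields are packed back to back and the dtype is not flagged as aligned.
packed_point = np.dtype(point_fields)
# align=True asks NumPy to lay the fields out the way a C compiler would.
aligned_point = np.dtype(point_fields, align=True)

print(packed_point.isalignedstruct, aligned_point.isalignedstruct)
# For three float32 fields the byte layout is identical; only the flag differs.
print(packed_point.itemsize, aligned_point.itemsize)

# Hypothetical mixed layout, where alignment does change offsets and size.
mixed_fields = [('flag', 'u1'), ('value', 'f8')]
packed_mixed = np.dtype(mixed_fields)
aligned_mixed = np.dtype(mixed_fields, align=True)
print(packed_mixed.itemsize, aligned_mixed.itemsize)
```

So "unaligned" here is a property of the type description. The mismatch in the error comes from something else, which the replies below work out.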

nelson2005 · October 19, 2023, 1:53am

Why not leave out the signature, call the function, then print out the deduced nopython signature?

MLLO · October 19, 2023, 2:06am

Thanks @nelson2005
I believe that I did that but I was still unsuccessful. Here if I run the following,

pointNBtype

I get the following signature,

Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False)

After running the function I posted with p0, p1 as defined above, I get the following nopython_signatures:

[(Array(Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False), 0, 'C', False, aligned=False), Array(Record([('x', {'type': float32, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float32, 'offset': 4, 'alignment': None, 'title': None, }), ('z', {'type': float32, 'offset': 8, 'alignment': None, 'title': None, })], 12, False), 0, 'C', False, aligned=False)) -> bool]

The only difference I see is that the first one does not start with `Array`, and I am not entirely certain why that is.

nelson2005 · October 19, 2023, 2:40am

Maybe I’m missing something, but p0 and p1 look to me like they are arrays?

MLLO · October 19, 2023, 2:47am

@nelson2005 I see what you mean. But wouldn’t p0, p1 be records here???
they were defined as pointNBtype which is what I used when specifying my signature.

nelson2005 · October 19, 2023, 2:55am

They’re arrays of pointNBtype, right? Suppose you used dtype=float. What type would you expect p0 to be?
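
To make the distinction concrete, here is a small NumPy-only sketch. It assumes the same three-field layout as `pointDtype`; the names `point_dtype`, `as_float`, and `record` are illustrative.

```python
import numpy as np

point_dtype = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

# np.array always returns an ndarray; a single tuple gives a 0-dimensional one.
p0 = np.array((1, 2, 0), dtype=point_dtype)
print(type(p0).__name__, p0.ndim, p0.shape)

# Same story with a plain float dtype: still an array, not a float.
as_float = np.array(1.0, dtype=float)
print(type(as_float).__name__, as_float.ndim)

# Indexing a 0-d array with an empty tuple extracts the scalar it holds.
record = p0[()]
print(type(record).__name__)  # a NumPy void scalar, i.e. a single record
print(record['x'], record['y'], record['z'])
```

The dtype describes what each element looks like; the array is the container around it, even when it holds exactly one element.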

MLLO · October 19, 2023, 2:58am

I see it now… I was confused thinking that the dtype was actually the array. However, if I change the signature to the following,

@njit(nb.bool_(pointNBtype[::1], pointNBtype[::1]))

that should be the same as the one provided by .nopython_signatures, but it still blows up. Just a different error.

[...]
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No conversion from array(bool, 1d, C) to bool for '$98return_value.1', defined at None

nelson2005 · October 19, 2023, 3:06am

Your function does operations on arrays, and returns an array, right? Is there some reason you don’t want to use the automatically deduced signature?
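
The `array(bool, 1d, C)` in that message is what a field-by-field comparison produces once the arguments are 1-d arrays: one boolean per element rather than a single `bool`. A NumPy-only sketch with made-up data (`a`, `b`, and `per_point` are illustrative names):

```python
import numpy as np

point_dtype = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

a = np.zeros(3, dtype=point_dtype)
b = a.copy()
b[2] = (1, 0, 0)  # make the last point differ

# Comparing a field of two 1-d arrays is elementwise and yields a bool array.
per_point = (a['x'] == b['x']) & (a['y'] == b['y']) & (a['z'] == b['z'])
print(per_point)

# Reducing it is what turns the array into a single truth value.
all_equal = bool(per_point.all())
print(all_equal)
```

A function that is declared to return `bool_` therefore needs either single records as inputs or an explicit reduction such as `.all()`.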

MLLO · October 19, 2023, 3:10am

Hi @nelson2005, that is an appropriate comment. I am transforming a long program by rewriting all the functions I have to be used with numba in order to optimize it. So… I thought better to specify as much as I can. But it might just be that I don’t really need to.

nelson2005 · October 19, 2023, 3:12am

Right - the general recommendation as I understand it is to let numba figure out the signature unless you have a particular reason to specify it. Even then, an inferred signature is a great place to start.

MLLO · October 19, 2023, 3:14am

It will certainly make life easier ! Thanks for your help!

Oyibo · October 19, 2023, 6:57am

Hey @MLLO ,

If you want to compare structured arrays for equality you can use the function np.array_equal to check the underlying data.
If you want to compare single records for equality you can use their field names as attributes.
What are the benefits of using structured instead of standard arrays in your case?

import numpy as np
import numba as nb

record_type = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

#=============================================================================
# This defines a 0-dimensional array of record_type not a record
#=============================================================================
mypoint = np.array((0,0,0), dtype=record_type)
print(f'type: {type(mypoint)}, ndim: {mypoint.ndim}')
# type: <class 'numpy.ndarray'>, ndim: 0

#=============================================================================
# Compare arrays
#=============================================================================
@nb.njit
def equal_arrays(p0, p1):
    return np.array_equal(p0.view(np.float32), p1.view(np.float32))

p0 = np.array([(0,0,0), (1,0,0)], dtype=record_type)
p1 = np.array([(0,0,0), (1,0,0)], dtype=record_type)
print(f'Arrays equal? => {equal_arrays(p0, p1)}')
# Arrays equal? => True

#=============================================================================
# Compare records
#=============================================================================
@nb.njit
def equal_records(p0, p1):
    return p0.x == p1.x and p0.y == p1.y and p0.z == p1.z

points = np.array([(0,0,0),
                   (0,0,0)], dtype=record_type)

print(f'Records equal? => {equal_records(points[0], points[1])}')
# Records equal? => True

MLLO · October 19, 2023, 6:42pm

Thanks @Oyibo !

I still seem to be confused about how numba and numpy structured arrays behave. Case in point (building on the point spec above):

entryDtype = np.dtype([('error', 'f4'),
                      ('triangle_id', 'i8'),
                      ('point', pointNBtype)])

entryNBtype = nb.from_dtype(entryDtype)

p0 = np.array((0,0,0), dtype= pointNBtype)
entry0 = np.array((0.5, 0, p0), dtype=entryNBtype)

I define a numba list as such,

a = nb.typed.List.empty_list(entryNBtype)

I try the following,

a.append(entry0)

and it blows up. There is obviously something that I am missing, but I can't see what it is.

nelson2005 · October 19, 2023, 8:49pm

Complete programs are helpful for responders.
In this case you’re trying to append an array to a list-of-entryNBtype? Any time you use np.array(), the resulting type is an array.
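
In NumPy terms, the difference looks like this. The sketch rebuilds the nested dtype from plain NumPy dtypes (an assumption, to keep it free of Numba) and pulls the single record out of the 0-d array. It does not cover the `typed.List` step.

```python
import numpy as np

point_dtype = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
entry_dtype = np.dtype([('error', 'f4'),
                        ('triangle_id', 'i8'),
                        ('point', point_dtype)])

# np.array(...) with a single tuple: a 0-d array holding one entry.
entry0 = np.array((0.5, 0, (0, 0, 0)), dtype=entry_dtype)
print(type(entry0).__name__, entry0.ndim)

# The record itself is the scalar stored inside that array.
entry_record = entry0[()]
print(type(entry_record).__name__)
print(entry_record['error'], entry_record['triangle_id'], entry_record['point']['x'])
```

A list typed to hold entries expects something like `entry_record`, not the array `entry0` that wraps it.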

MLLO · October 19, 2023, 9:44pm

Hi @nelson2005
I don’t know if I fully understand your answer.
I am looking to have a list that will store entries (as specified). So a list of structured numpy arrays. If that makes sense. How would this be accomplished?

MLLO · October 19, 2023, 10:27pm

@Oyibo
Following up on your comments…
how would you generate a list and a dictionary to store a structured array like your mypoint in numba? I know how I would do this if mypoint was defined as

mypoint = np.array([ (0,0,0) ], dtype=record_type)

with square brackets. I ask this because if the latter is not possible then your second option is the one I need to adopt (though I would prefer NOT to use square brackets)

Oyibo · October 19, 2023, 11:25pm

Hey @MLLO ,

A NumPy structured array is like a lightweight version of a DataFrame. It’s useful when you need to store data with different types in a table format. However, if you’re dealing with homogeneous data, such as point coordinates, a regular NumPy array is sufficient, too. On the other hand, if structured data storage and advanced data operations are needed, consider using a Pandas DataFrame, which offers more functionality, especially for joining data based on indices.

#=============================================================================
# imports
#=============================================================================
import numpy as np
from numpy.lib import recfunctions as rfn
import pandas as pd

#=============================================================================
# Creating a numpy array
#=============================================================================
numpy_array = np.arange(9, dtype=np.float32).reshape(3, 3)
print("Numpy Array:")
print(numpy_array)

#=============================================================================
# Creating a structured array
#=============================================================================
dtype = [('x', np.float32), ('y', np.float32), ('z', np.float32)]
structured_array = rfn.unstructured_to_structured(numpy_array, dtype=dtype)
print("\nStructured Array:")
print(structured_array.dtype.names)
print(structured_array.view(np.float32).reshape(3, 3))

#=============================================================================
# Creating a pandas dataframe
#=============================================================================
df = pd.DataFrame(structured_array)
print("\nDataframe:")
print(df)

# Numpy Array:
# [[0. 1. 2.]
#  [3. 4. 5.]
#  [6. 7. 8.]]

# Structured Array:
# ('x', 'y', 'z')
# [[0. 1. 2.]
#  [3. 4. 5.]
#  [6. 7. 8.]]

# Dataframe:
#      x    y    z
# 0  0.0  1.0  2.0
# 1  3.0  4.0  5.0
# 2  6.0  7.0  8.0

MLLO · October 19, 2023, 11:39pm

thanks @Oyibo

No, I was not looking for anything as complicated as a dataframe (another possibility would be an xarray). I am more interested in a very light data structure. In the example you produced earlier you were handling a point data type as a record without making it into an array. I was interested in that possibility (e.g. the possibility of doing a list of point records). Don’t know if this makes sense.

M.

Oyibo · October 20, 2023, 12:11am

Hey @MLLO ,
if you don’t want to define structured arrays but just single records you can use the record class.

import numpy as np
import numba as nb

record_type = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
p0 = np.record((0,0,0), dtype=record_type)
p1 = np.record((1,0,0), dtype=record_type)
pointType = nb.typeof(p0)
points = nb.typed.List.empty_list(pointType)
points.append(p0)
points.append(p1)
print(points)
# [(0., 0., 0.), (1., 0., 0.), ...]
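
Records can also be obtained by indexing an existing structured array, which may be handy if the installed NumPy does not accept data plus `dtype` in the `np.record` call (I believe that form needs a fairly recent NumPy, but treat that as an assumption). A NumPy-only sketch with illustrative names:

```python
import numpy as np

record_type = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
points_array = np.array([(0, 0, 0), (1, 0, 0)], dtype=record_type)

# Indexing a structured array gives a np.void scalar; fields are read by key.
void_point = points_array[1]
print(type(void_point).__name__, void_point['x'])

# Viewing it as a recarray gives np.record scalars with attribute access.
rec_point = points_array.view(np.recarray)[1]
print(type(rec_point).__name__, rec_point.x)
```

Both scalars describe one point with the same dtype; the `np.record` flavour just adds attribute-style field access on the Python side.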

MLLO · October 20, 2023, 12:16am

Perfect! @Oyibo
I have used a lot of numpy but never used this! Thank you.
