Thanks for making Numba, I am a big fan! I’ve got the Numba fan-poster, t-shirt, cap, underwear and everything, even my bed-sheets say Numba! (OK, that’s not really true, but it could be!)
My question relates to the best practices of passing read-only Python lists to Numba jit functions.
Versions
- Numba: 0.54.1
- Numpy: 1.20.3
- Python: 3.8.12
Simple Example
Let me first show the problem with a simple example, where we just sum all the elements of a list:
# Number of elements in the list.
n = 100000
# Numpy array with random floats.
x_np = np.random.normal(size=n)
# Convert Numpy array to Python list.
x_list = x_np.tolist()
@njit
def sum_list(x):
# Sum all elements of the list/array x.
s = 0
for i in range(len(x)):
s += x[i]
return s
%timeit sum_list(x=x_np)
# 126 µs ± 3.64 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit sum_list(x=x_list)
# This works but generates a warning:
# NumbaPendingDeprecationWarning: Encountered the use of a type that is
# scheduled for deprecation: type 'reflected list' found for argument
# 'x' of function 'sum_list'.
# 91 ms ± 5.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit sum_list(x=numba.typed.List(x_list))
# 141 ms ± 3.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The fastest method by far is passing a Numpy array. The slowest is to use the new and experimental numba.typed.List
.
Sparse Matrix Example
In my actual use-case I have made a function that operates on a sparse matrix, whose elements are defined in a list of tuples like this: [(i0, j0, v0), (i1, j1, v1), ...]
where the iX
and jX
are integer-values for the coordinates into the sparse matrix and vX
are the float values for the matrix-elements at those coordinates. This does not seem to be supported very well in Numba as it runs very slowly.
Below is a somewhat contrived example which just sums all the elements of a sparse matrix and could therefore ignore the coordinates for the sparse matrix, but my actual problem needs to use the coordinates of the sparse matrix as well.
import scipy.sparse
# Generate a sparse matrix with random elements.
n = 100
x_sparse = scipy.sparse.random(n, n, density=0.1, format='coo')
# Convert the sparse matrix to a dense matrix.
x_dense = x_sparse.toarray()
# Numpy arrays for the column and row indices, and the matrix values.
x_col_np = x_sparse.col
x_row_np = x_sparse.row
x_val_np = x_sparse.data
# Convert Numpy arrays to Python lists.
x_col_list = x_col_np.tolist()
x_row_list = x_row_np.tolist()
x_val_list = x_val_np.tolist()
# Convert to a list of tuples, which is the format I prefer in my use-case.
x_list_tup = list(zip(x_col_list, x_row_list, x_val_list))
@njit
def sum_matrix_dense(x):
# Sum all elements of a dense matrix.
s = 0
for i in range(x.shape[0]):
for j in range(x.shape[1]):
s += x[i,j]
return s
%timeit sum_matrix_dense(x=x_dense)
# 13.1 µs ± 354 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
@njit
def sum_matrix_sparse1(x_list_tup):
# Sum all elements of a sparse matrix.
s = 0
for i, j, v in x_list_tup:
# This is a bit contrived because the indices i and j are ignored,
# but in my real use-case I need those as well.
s += v
return s
%timeit sum_matrix_sparse1(x_list_tup=x_list_tup)
# This runs but generates a warning:
# NumbaPendingDeprecationWarning: Encountered the use of a type that is
# scheduled for deprecation: type 'reflected list' found for argument
# 'x_list_tup' of function 'sum_matrix_sparse1'.
# 24.7 ms ± 1.58 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit sum_matrix_sparse1(x_list_tup=numba.typed.List(x_list_tup))
# This runs without warnings but is much slower than 'sum_matrix_dense()'.
# 1.85 ms ± 118 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
@njit
def sum_matrix_sparse2(x_col_list, x_row_list, x_val_list):
# Sum all elements of a sparse matrix.
s = 0
for k in range(len(x_col_list)):
col = x_col_list[k]
row = x_row_list[k]
val = x_val_list[k]
s += val
return s
%timeit sum_matrix_sparse2(x_col_list=x_col_list, x_row_list=x_row_list, x_val_list=x_val_list)
# This runs but generates a warning:
# NumbaPendingDeprecationWarning: Encountered the use of a type that is
# scheduled for deprecation: type 'reflected list' found for argument
# 'x_col_list' of function 'sum_matrix_sparse2'.
# 5.69 ms ± 2.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit sum_matrix_sparse2(x_col_list=numba.typed.List(x_col_list), x_row_list=numba.typed.List(x_row_list), x_val_list=numba.typed.List(x_val_list))
# This runs without a warning but is much slower than 'sum_matrix_dense()'.
# 4.33 ms ± 29.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit sum_matrix_sparse2(x_col_list=x_col_np, x_row_list=x_row_np, x_val_list=x_val_np)
# This is very fast when using separate Numpy arrays as the arguments.
# 1.88 µs ± 7.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Conclusion
It seems that in the current version of Numba (0.54.1), the fastest method by far, is to pass Numpy arrays instead of Python lists to Numba jit functions.
My questions are:
-
Will
numba.typed.List
eventually run just as fast with Python lists as Numpy arrays, so I can use a Python list or even a Python list of tuples directly with my Numba jit functions? Or will it most likely always be much faster to pass Numpy arrays to Numba jit functions? -
Is there a better / faster way of using read-only Python lists (or even a list of tuples) with Numba jit than I have done above?
Thanks very much and Merry Christmas!