Numba indexing faster with unsigned integers

This is probably known, but showed up when re-implementing sparse matrix algorithms. This helper is used in the first part of a sparse matmul. It’s considerably faster if we cast the indices to uint, as I suppose the compiler need not worry about wraparound for negative indices?

import numba
import numpy as np
import scipy

@numba.njit(boundscheck=False)  # doesn't make a different, but just to be sure
def csr_matmat_nnz(n_row, n_col, x_ptr, x_ind, x_data, y_ptr, y_ind, y_data):
    nnz = 0
    mask = np.full(n_col, -1, dtype=np.int32)
    for i in range(n_row):
        row_nnz = 0
        for jj in range(x_ptr[i], x_ptr[i + 1]):
            j = x_ind[jj]
            for kk in range(y_ptr[j], y_ptr[j + 1]):
                k = y_ind[kk]
                if mask[k] != i:
                    mask[k] = i
                    row_nnz += 1
        nnz += row_nnz
    return nnz

n, p, k = 100_000, 50, 10
A = scipy.sparse.random(n, p, density=0.1, format="csr")
B = scipy.sparse.random(p, k, density=0.05, format="csr")
x_ptr, x_ind, x_data = A.indptr, A.indices, A.data
y_ptr, y_ind, y_data = B.indptr, B.indices, B.data
n_row, n_col = A.shape[0], B.shape[1]

x_ind_u = x_ind.view(np.uint32)
y_ind_u = y_ind.view(np.uint32)
x_ptr_u = x_ptr.view(np.uint32)
y_ptr_u = y_ptr.view(np.uint32)

r = csr_matmat_nnz(n_row, n_col, x_ptr, x_ind, x_data, y_ptr, y_ind, y_data)
r_u = csr_matmat_nnz(n_row, n_col, x_ptr_u, x_ind_u, x_data, y_ptr_u, y_ind_u, y_data)
assert r == r_u

%timeit csr_matmat_nnz(n_row, n_col, x_ptr, x_ind, x_data, y_ptr, y_ind, y_data)  
# 3.39 ms ± 132 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit csr_matmat_nnz(n_row, n_col, x_ptr_u, x_ind_u, x_data, y_ptr_u, y_ind_u, y_data)
# 2.59 ms ± 24.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Is there a more idiomatic way of promising to numba that integers will never be negative?

Right, that’s fairly well known. We just cast it in our code.