I have noticed a significant performance difference when indexing a numpy array with np.uint64 vs np.int64. Take the following simple, contrived example:
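import time

import numpy as np
from numba import njit


@njit
def some_func(a, idx):
    # Write through the index array many times so that indexing dominates the runtime
    for _ in range(100_000_000_000):
        a[idx] = 10.0


a = np.random.rand(5)
a_copy = a.copy()
# 4 is the last valid index of a 5-element array (5 would be out of bounds)
idx = np.array([4], dtype=np.int64)  # int64
uidx = np.array([4], dtype=np.uint64)  # uint64

# int64 case; the first run is excluded from the mean because it includes JIT compilation
lst = []
for _ in range(6):
    a_copy[:] = a
    tic = time.time()
    some_func(a_copy, idx)
    toc = time.time()
    lst.append(toc - tic)
print(np.mean(lst[1:]))

# uint64 case, timed the same way
lst = []
for _ in range(6):
    a_copy[:] = a
    tic = time.time()
    some_func(a_copy, uidx)
    toc = time.time()
    lst.append(toc - tic)
print(np.mean(lst[1:]))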
In the np.int64 case, it takes an average of 43s, while in the np.uint64 case it takes an average of 35s. So the np.uint64 case was around 19% faster!
This was extremely surprising, as I would’ve expected a negligible difference from using np.int64 to index into the numpy array.
My question: is this normal/expected/known behavior?
This is known and expected behavior. Think about it: if an index could be negative, you have to check whether it is negative and, if so, do the index wraparound calculation. This adds time, and even if all the indices are positive, the presence of the checking code also prohibits vectorization. This is why parallel=True uses unsigned indices where they are known to be positive. It is also why people occasionally have problems when they add a signed number to such an index and try to use the result to index an array: that won’t work, because signed and unsigned unify as float.
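To make the two points above concrete, here is a minimal sketch. The helper names `store_signed` and `store_unsigned` are hypothetical and only model, in plain Python, the extra branch a signed index requires; they are not what Numba actually generates. The last line uses NumPy's own promotion rule as an illustration of signed and unsigned 64-bit integers having no common integer type; the reply above states that Numba unifies the two the same way.

```python
import numpy as np


def store_signed(buf, i, value):
    """Model of a store through a signed index (hypothetical helper)."""
    n = len(buf)
    if i < 0:      # extra branch needed on every access
        i += n     # wraparound: -1 refers to the last element
    buf[i] = value


def store_unsigned(buf, i, value):
    """Model of a store through an unsigned index: no sign check needed."""
    buf[i] = value


buf = [0.0] * 5
store_signed(buf, -1, 10.0)    # goes through the wraparound path
store_unsigned(buf, 0, 20.0)   # index is used directly

# In NumPy, int64 and uint64 promote to float64, which is not a valid index type
promoted = np.result_type(np.int64, np.uint64)
print(buf, promoted)
```

The branch in `store_signed` is the kind of per-access check that, according to the explanation above, costs time and gets in the way of vectorization.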
@DrTodd13 Thank you so much for enlightening me! I learned something new. Would you happen to know what the type of i would be in for i in range(10)? Is i a uint64 or an int64? I plan to add i to a value from a numpy array of dtype=np.uint64 and then use the resulting sum to index into another numpy array:
idx = np.array([1, 4, 9, 16, 25, 36, 49, 64, 81, 100], dtype=np.uint64)
a = np.random.rand(200)
for i in range(10):
    j = idx[i] + i
    a[j] = 5
Is there some way to ensure that i is also np.uint64?
“i” will be typed as int64. However, you can use numba.uint64(i) to create a new variable from it that is unsigned.
Thanks @seanlaw for bringing it up and @DrTodd13 for the explanation! I have a lot of code that uses indexing, and I now wonder if I can squeeze out some more performance using numba.uint64(i).
I am also wondering whether the type of range with 1 argument shouldn’t be a uint64, since it will never be negative. The 3-argument version can be negative, but I don’t think the 1-argument version can be.
Luk
Related (if tangentially) issue here
Thank you @DrTodd13. If I have many integers, I wonder if casting with numba.uint64(i) would incur the same/similar cost as checking for negative integers before using the value as an array index?
Casting has almost no overhead.