Numba has support for both the standard Python `random` module and the `np.random` module, which is great, since calling NumPy's random functions one value at a time from Python is generally quite slow.
I was wondering whether vanilla Python/NumPy random differs in any way from the Numba implementation.
That is, suppose (1) I have a function that generates some random numbers, and (2) another function that is an njit-ed version of (1).
Does the behavior of these functions differ in any characteristic way? Will the random numbers generated by both functions be of the same “quality”?
From memory, Numba replicates the standard NumPy results exactly: if you give Numba and NumPy the same seed as a starting state, the respective RNGs will emit the same sequence. Numba's RNG is also thread-safe, with each thread's state kept in thread-local storage. Example of parity:
from numba import njit
import numpy as np

@njit
def foo(n, seed=0):
    np.random.seed(seed)  # set the RNG state
    ret = np.empty(n)
    for i in range(n):
        ret[i] = np.random.random()
    return ret

n = 10
for x, y in zip(foo(n), foo.py_func(n)):  # run the Numba and pure-Python versions
    assert x == y
    print(x, y)
Thanks for the detailed answer. That definitely puts me at ease about using random variables in jitted functions.
I was also surprised to see that jitted random functions are so much faster than the standard numpy ones.
%timeit foo(n) #> 2.46 µs ± 155 ns per loop
%timeit foo.py_func(n) #> 9.17 µs ± 278 ns per loop
# The above function but with the following line changed
# ret[i] = np.random.randint(n)
%timeit foo(n) #> 2.4 µs ± 122 ns per loop
%timeit foo.py_func(n) #> 34 µs ± 2.09 µs per loop
About 3-4 times faster for np.random.random() and roughly 14 times faster(!) for np.random.randint() in the timings above.
That’s just too good.
In fact from now on, I’m always going to wrap numpy random functions inside an njit function.