When I run a parallel loop with numba and print the indices i, j, they come out in a random order. I’m wondering if there’s a way to have the iterations run in ascending index order, such as when running with numba.cuda (all blocks and threads are done in ascending order).
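from numba import njit, prange
import numpy as np

A = np.ones((3, 3))

@njit(parallel=True)
def total(array):
    s = 0.0
    # Iterations of the outer prange may execute in any order.
    for i in prange(array.shape[0]):
        for j in prange(array.shape[1]):
            s += array[i, j]  # s is treated as a sum reduction
    return s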
Not sure about the answer to your question, but:
"such as when running with numba.cuda (all blocks and threads are done in ascending order)."
This isn’t the case: blocks and threads are not scheduled in ascending order in CUDA, and you cannot rely on any particular scheduling order.
I think both my implementation and my understanding of CUDA were wrong. I was trying to use random_seed() with parallel=True so that each index in the array would get a corresponding random value generated from the random seed. I’ll try other ways to implement this. Thank you!
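One possible way to get a reproducible random value per index, sketched here in plain NumPy as an illustration rather than as the approach the poster ended up using: draw the whole table of values once from a seeded generator before the parallel loop, then look values up by index. The seed value and the names `values`, `order` and `result` are hypothetical. The shuffled visiting order only imitates unordered parallel scheduling.

```python
import numpy as np

seed = 1234                      # hypothetical seed, chosen for illustration
shape = (3, 3)

# Draw every random value up front, in one sequential call.
values = np.random.default_rng(seed).random(shape)

# Visit the indices in a scrambled order to imitate unordered scheduling.
order = np.random.default_rng(0).permutation(shape[0] * shape[1])

result = np.empty(shape)
for flat in order:
    i, j = divmod(int(flat), shape[1])
    # The value for (i, j) depends only on the index, not on visiting order.
    result[i, j] = values[i, j]
```

Because the table is filled before any parallel work starts, an array like `values` could then be passed to a parallel function and indexed inside the loop without the result depending on scheduling.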
Also note that in nested pranges, every prange except the outermost is converted into a standard range, that is, the inner loops are serialized. See "Loop serialization" in "Automatic parallelization with @jit" (Numba 0.57.0+0.g4fd4e39c6.dirty documentation).