Would this result in a race condition?

Hello there,

I am using this awesome numba package for some experiments. I am looking at parallelizing certain processes, and I want to know whether the code snippet below yields a race condition when I parallelize it (this is a bit of a noob question).

from numba import njit, prange

# original_vectors is (n, d)
# new_vectors is (m, d)
# indexes is (n, k), where every entry is an index into new_vectors

# njit already implies nopython mode, and the CPU is the default target
@njit(parallel=True)
def generate_vectors(original_vectors, new_vectors, indexes):
    # Loop through the walks
    for ii in prange(indexes.shape[0]):
        for jj in prange(indexes.shape[1]):
            new_vectors[indexes[ii, jj]] += original_vectors[ii]
    # new_vectors is updated in place; return it rather than the unchanged input
    return new_vectors

According to this tutorial, using indexes of a vector in a parallel for-loop can risk yielding a race condition.

However, when I test this, it yields the same output vector whether or not I use parallel and prange. The parallel version is faster by a factor of 2, so it would be nice if it did not yield a race condition.

Thanks!!

You do have a race condition in theory - the same location in new_vectors can be updated by more than one thread at the same time.
With the particular logic you have there the order doesn’t matter (apart from the “floating point arithmetic isn’t associative” problem), so in most cases you’ll be fine. However, the += operator is not atomic, so it’s possible that another thread will update that location in new_vectors between the addition starting and the result being written to the location, in which case one of the two updates is lost.
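
To make the non-atomic += concrete, here is a minimal sketch in plain Python that replays one unlucky interleaving by hand. No real threads are involved, and the names (shared, read_a, read_b and so on) are made up for illustration; it only shows the read, add, write steps that a += breaks down into.

```python
# One element of new_vectors that two threads both want to increment by 1.0.
shared = [0.0]

# Step 1: both threads read the current value before either has written.
read_a = shared[0]
read_b = shared[0]

# Step 2: each thread adds its contribution to its own private copy.
result_a = read_a + 1.0
result_b = read_b + 1.0

# Step 3: both threads write back; the second write overwrites the first.
shared[0] = result_a
shared[0] = result_b

print(shared[0])  # 1.0, even though two increments were performed
```

With real threads this interleaving only happens occasionally, which is why a test run can return the correct result many times in a row.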

Thank you for your reply, it is of great help! Okay, then for safety, I should probably avoid this specific parallelization even though it speeds things up. Perhaps there is some workaround that avoids the race condition?
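
One possible workaround, sketched here as an assumption rather than something proposed in this thread, is to parallelize over the columns of the vectors instead of over the rows of indexes. Each prange iteration then owns one column of new_vectors, so no two threads ever write to the same element. The function name generate_vectors_by_column is hypothetical, and the try/except is only there so the sketch still runs (slowly, in plain Python) if numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs without numba (no speed-up then).
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


@njit(parallel=True)
def generate_vectors_by_column(original_vectors, new_vectors, indexes):
    # Each parallel iteration handles exactly one column c, so writes
    # from different threads never target the same array element.
    for c in prange(original_vectors.shape[1]):
        for ii in range(indexes.shape[0]):
            for jj in range(indexes.shape[1]):
                new_vectors[indexes[ii, jj], c] += original_vectors[ii, c]
    return new_vectors


# Small made-up inputs: n=6, d=4, k=2, m=3.
rng = np.random.default_rng(0)
original = rng.random((6, 4))
indexes = rng.integers(0, 3, size=(6, 2))
result = generate_vectors_by_column(original, np.zeros((3, 4)), indexes)

# Serial reference: np.add.at accumulates correctly over repeated indices.
expected = np.zeros((3, 4))
np.add.at(expected, indexes.ravel(), np.repeat(original, indexes.shape[1], axis=0))
print(np.allclose(result, expected))
```

This only pays off when d is large enough to keep all threads busy. For small d, giving each thread its own private copy of new_vectors and summing the copies at the end would be another option.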

At the risk of making a fool of myself: @rhjmoore is completely right about the atomicity and the race conditions that come with it. However, to me it looks as if your loop hits every entry in new_vectors no more than once. If that is the case, then I think you should be fine with this being executed in parallel, since there are no competing threads working on a given element?

That depends on what’s in indexes. If it looks like this, you’ll hit elements in new_vectors multiple times:
indexes = np.asarray([[0, 1], [0, 2], [3, 1]])
With this value of indexes, you’ll hit elements 0 and 1 of new_vectors twice.
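
A quick way to check a given indexes array for repeated targets is to count how often each value occurs. This is only a sketch using np.unique, and the variable names are made up; if every count is 1, no two loop iterations write to the same row of new_vectors.

```python
import numpy as np

indexes = np.asarray([[0, 1], [0, 2], [3, 1]])

# Count how many times each row of new_vectors is targeted.
values, counts = np.unique(indexes, return_counts=True)

# Rows that are written more than once are where threads can collide.
repeated = values[counts > 1]
print(repeated)  # [0 1]
```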

You are absolutely right, I glanced over the fact that the indices were coming from somewhere else!