I have a Python newbie question about a compiler error I received. I am trying to refactor the CUDA atomic test cases to reduce code duplication.
In https://github.com/numba/numba/blob/master/numba/cuda/tests/cudapy/test_atomics.py there are two functions that illustrate the duplication I want to remove:
    def atomic_add(ary):
        tid = cuda.threadIdx.x
        sm = cuda.shared.array(32, uint32)
        sm[tid] = 0
        cuda.syncthreads()
        bin = ary[tid] % 32
        cuda.atomic.add(sm, bin, 1)
        cuda.syncthreads()
        ary[tid] = sm[tid]

    def atomic_sub(ary):
        tid = cuda.threadIdx.x
        sm = cuda.shared.array(32, uint32)
        sm[tid] = 0
        cuda.syncthreads()
        bin = ary[tid] % 32
        cuda.atomic.sub(sm, bin, 1)
        cuda.syncthreads()
        ary[tid] = sm[tid]
The only difference between the two functions is the call to either "cuda.atomic.add" or "cuda.atomic.sub". Recognizing this, I wrote a helper that captures the common code and takes the atomic operation as a function argument, then rewrote "atomic_add" to call it:
    def atomic_binary_1dim_shared(ary, op2, ary_dtype, ary_nelements,
                                  binop_func):
        tid = cuda.threadIdx.x
        sm = cuda.shared.array(ary_nelements, ary_dtype)
        sm[tid] = 0
        cuda.syncthreads()
        bin = ary[tid] % ary_nelements
        binop_func(sm, bin, op2)
        cuda.syncthreads()
        ary[tid] = sm[tid]

    def atomic_add(ary):
        atomic_binary_1dim_shared(ary, 1, uint32, 32, cuda.atomic.add)
However, when I try this, I receive an error from the compiler:
    Untyped global name 'atomic_binary_1dim_shared': cannot determine Numba type of <class 'function'>

    File "numba/cuda/tests/cudapy/test_atomics.py", line 24:
    def atomic_add(ary):
        atomic_binary_1dim_shared(ary, 1, uint32, 32, cuda.atomic.add)
I could use some help understanding and resolving this problem.
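
To make the intended behavior concrete, here is a minimal pure-Python sketch of what the refactored helper is meant to compute. It does not use Numba or a GPU. The names `fake_atomic_add`, `fake_atomic_sub`, and `run_block` are hypothetical stand-ins invented for this illustration, and the sequential loop only emulates one thread block. The sketch suggests that passing the operation as an argument is fine at the Python level, so the error appears to come from Numba not knowing how to type the uncompiled helper. Compiling the helper as a device function (for example with `cuda.jit(device=True)`) may be the missing piece, but that is an assumption and is not verified here.

```python
# Pure-Python emulation of the refactored helper; no Numba or GPU involved.
# fake_atomic_add, fake_atomic_sub and run_block are hypothetical stand-ins.

UINT32_MASK = 0xFFFFFFFF  # mimic uint32 wraparound of the shared array


def fake_atomic_add(arr, idx, val):
    """Stand-in for cuda.atomic.add: update arr[idx], return the old value."""
    old = arr[idx]
    arr[idx] = (old + val) & UINT32_MASK
    return old


def fake_atomic_sub(arr, idx, val):
    """Stand-in for cuda.atomic.sub: update arr[idx], return the old value."""
    old = arr[idx]
    arr[idx] = (old - val) & UINT32_MASK
    return old


def run_block(ary, op2, nelements, binop_func):
    """Emulate one block of len(ary) threads (len(ary) <= nelements)."""
    sm = [0] * nelements                 # plays the role of the shared array
    for tid in range(len(ary)):          # work done before the final sync
        bin_idx = ary[tid] % nelements
        binop_func(sm, bin_idx, op2)     # the operation is just an argument
    for tid in range(len(ary)):          # work done after the final sync
        ary[tid] = sm[tid]
    return ary


data = [0, 0, 1, 33]                     # bins: 0, 0, 1, 1
print(run_block(list(data), 1, 32, fake_atomic_add))  # [2, 2, 0, 0]
print(run_block(list(data), 1, 32, fake_atomic_sub))
```

The same `run_block` body serves both operations, which is the deduplication I am after in the real test file.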