From this post:
Would this make it possible for extensions like Awkward Array to run in `numba.cuda.jit`ed functions? We’re currently developing a CUDA backend for Awkward, which uses CUDA pointers in place of main memory pointers and converts to and from CuPy, rather than NumPy, for the low-level buffers. I suspect some modifications would be necessary, but would Numba based on NVVM reduce barriers to something like the following in the future?

    @nb.cuda.autojit
    def something(awkward_array, output_numpy):
        i = nb.cuda.grid(1)
        data_structure = awkward_array[i]
        walk over data_structure, do something...
        output_numpy[i] = whatever...

For context, Awkward’s DataModel in Numba is a set of arrays and integer indexes into those arrays that are interpreted by type-specific, hard-coded implementations of `__getitem__`, `__len__`, etc. So it’s all arrays and numbers internally; the `pyobject` is only used for reference counting and boxing return values (which won’t be the case for CUDA kernels, since they act in place). With the Awkward CUDA backend, these arrays can be GPU-resident/CuPy.
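To make the "arrays and integer indexes" description concrete, here is a minimal pure-Python sketch of a jagged list stored as one flat content buffer plus an offsets array. The names `JaggedView`, `offsets`, and `content` are illustrative assumptions and are not Awkward's actual classes or its Numba data model; the point is only that `__len__` and `__getitem__` reduce to integer arithmetic on plain buffers.

```python
class JaggedView:
    """Illustrative only: a list of lists stored as flat content plus offsets."""

    def __init__(self, offsets, content):
        # offsets[i]:offsets[i + 1] delimits the i-th sublist inside content.
        self.offsets = offsets
        self.content = content

    def __len__(self):
        # One more offset than there are sublists.
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.content[self.offsets[i]:self.offsets[i + 1]]


# Represents [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
view = JaggedView(offsets=[0, 3, 3, 5], content=[1.1, 2.2, 3.3, 4.4, 5.5])
lengths = [len(view[i]) for i in range(len(view))]
```

Because nothing here depends on where the buffers live, the same indexing logic would in principle apply whether `offsets` and `content` are host arrays or GPU-resident ones.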
I’d be really keen to find a way to support this. There are two things worth looking at right now that might enable the use of Awkward Array types and functions in CUDA kernels:
- It’s possible to extend the CUDA target to some extent. The Interval example from the documentation works with some small modifications: https://gist.github.com/gmarkall/ccc57848fd59312559513774aad870b5. Following the same pattern, I’d hope it would be possible to make the Awkward Array Numba support (data model, etc.) work correctly in the CUDA target. If it isn’t, let’s look at fixing the CUDA target’s extensibility so that it is possible!
- Passing Awkward Arrays into the kernel might need an Extension to be registered (unfortunately a bit of an overloaded term; here it means an extension for processing arguments to kernels). When dispatching a CUDA kernel, Numba translates the Python arguments into kernel arguments, but extensions get to mutate them first if necessary (see https://github.com/numba/numba/blob/master/numba/cuda/compiler.py#L706). An Extension may be needed to make Numba accept Awkward Arrays here. Some examples of these extensions are in args.py.
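As a rough illustration of the second point, a kernel-argument extension is, as far as I understand it, just an object with a `prepare_args(ty, val, **kwargs)` method that returns a possibly modified `(type, value)` pair. The sketch below is an assumption-laden toy: `DeviceBacked` and `UnwrapDeviceBacked` are hypothetical names, and a real Awkward extension would need to deal with more than a single buffer. The `typeof` callable is injected so the class itself can be exercised without Numba or a GPU.

```python
class DeviceBacked:
    """Hypothetical stand-in for a container that owns one low-level buffer."""

    def __init__(self, buffer):
        self.buffer = buffer


class UnwrapDeviceBacked:
    """Hypothetical kernel-argument extension.

    Numba is assumed to call prepare_args(ty, val, **kwargs) for each kernel
    argument and to use the returned (type, value) pair in its place.
    """

    def __init__(self, typeof):
        # typeof maps a value to its Numba type; pass numba.typeof in real use.
        self.typeof = typeof

    def prepare_args(self, ty, val, **kwargs):
        if isinstance(val, DeviceBacked):
            # Swap the wrapper for the buffer it holds, and retype accordingly.
            inner = val.buffer
            return self.typeof(inner), inner
        # Leave every other argument untouched.
        return ty, val


# Intended use (assumption; needs Numba and a CUDA device, so left commented out):
#
#   from numba import cuda, typeof
#
#   @cuda.jit(extensions=[UnwrapDeviceBacked(typeof)])
#   def kernel(buf, out):
#       i = cuda.grid(1)
#       if i < out.size:
#           out[i] = buf[i]
```

With something like this registered, a call such as `kernel[blocks, threads](DeviceBacked(device_array), out)` would hand the kernel the underlying buffer rather than the wrapper.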
How does this look from your perspective? Do you need me to expand a bit more on either of the items I mentioned above?