@ed-o-saurus mentioned this already, but I put together a small example using pynvjitlink to embed the CUDA C++ code in the Python source:
from numba import cuda, int32
from pynvjitlink import patch

# Patch Numba's linker so CUDA C++ source can be linked at JIT time
patch.patch_numba_linker()

# CUDA C++ device function embedded directly in the Python source.
# Numba's device-function ABI returns an error code and writes the
# actual result through the first pointer argument.
cu_functions = cuda.CUSource('''
extern "C" __device__ int foo(int* return_value){
    *return_value = 42;
    return 0;
}
''')

# Declare the device function to Numba so it can be called from kernels
foo = cuda.declare_device('foo', int32())

@cuda.jit(link=[cu_functions])
def kernel():
    print(foo())

kernel[1, 1]()
cuda.synchronize()
I think extending cppyy's numba_ext to support the CUDA target will also require some source-code processing to handle the differences between the usual C++ ABI and the one Numba uses internally, along with qualifying functions with the __device__ keyword. There may be other things I haven't thought of yet.
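To make the kind of processing I mean a bit more concrete, here's a minimal sketch of rewriting an ordinary C++ call into the form Numba's device-function ABI expects: an extern "C" __device__ function that returns an error code and writes the real result through its first pointer argument. The helper name wrap_for_numba_abi and the wrapped function are made up for this example; it's not something cppyy or Numba provide today.

def wrap_for_numba_abi(name, ret_type, call_expr):
    # Hypothetical helper: generate a device wrapper that matches Numba's
    # device-function calling convention (error code returned, real result
    # written through the first pointer argument).
    return f'''
extern "C" __device__ int {name}({ret_type}* return_value) {{
    *return_value = {call_expr};
    return 0;  // 0 tells Numba the call succeeded
}}
'''

# 'bar()' is assumed to be defined elsewhere in the linked source.
wrapped_src = cuda.CUSource(wrap_for_numba_abi('bar_numba', 'int', 'bar()'))
bar = cuda.declare_device('bar_numba', int32())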
To answer your original question about how to determine the current target, I think you can do:
from numba.core.target_extension import current_target
target = current_target()
and target will then be either "cuda" or "cpu", depending on the current target.
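As a rough sketch of how an extension might use that (the helper functions here are hypothetical, just to show the branching):

from numba.core.target_extension import current_target

def pick_implementation():
    # Hypothetical dispatch on the target Numba is currently compiling for.
    if current_target() == "cuda":
        return build_device_wrapper()  # e.g. the CUSource route above (made-up helper)
    return build_cpu_wrapper()         # e.g. cppyy's existing CPU path (made-up helper)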