Passing a pointer from C++ to Python

Razi · August 21, 2022, 12:29pm

Hi,
To pass a CPU pointer from C++ to Python as a NumPy array, I am using this code:
np_arg = reinterpret_cast<PyArrayObject*>(PyArray_SimpleNewFromData(ND, dims, NPY_LONGDOUBLE, reinterpret_cast<void*>(c_arr)));

My question is: how do I pass a C++ GPU pointer to Python for use with Numba?
Thanks

gmarkall · August 23, 2022, 9:33am

You need to wrap the pointer value in an object that implements the CUDA Array Interface. The interface needs more than just the pointer: it also needs the shape and type of the data, so you will have to provide those in your implementation as well.

Here’s an example of a simple object implementing the interface. MyArray wraps a pointer obtained from cudaMalloc (though it could have come from anywhere, a C++ function, etc.):

from numba import cuda
from ctypes import CDLL, POINTER, byref, c_void_p, c_size_t
import cupy as cp


class MyArray:
    def __init__(self, shape, typestr, data):
        if isinstance(shape, int):
            shape = (shape,)

        self._shape = shape
        self._data = data
        self._typestr = typestr

    @property
    def __cuda_array_interface__(self):
        return {
            'shape': self._shape,
            'typestr': self._typestr,
            'data': (self._data, False),
            'version': 2
        }


# Use ctypes to get the cudaMalloc function from Python
cudart = CDLL('libcudart.so')
cudaMalloc = cudart.cudaMalloc
cudaMalloc.argtypes = [POINTER(c_void_p), c_size_t]

# Allocate some Numba-external memory with cudaMalloc
ptr = c_void_p()
float32_size = 4
nelems = 32
alloc_size = float32_size * nelems
cudaMalloc(byref(ptr), alloc_size)

# Wrap our memory in a CUDA Array Interface object
arr = MyArray(nelems, 'f4', ptr.value)


# Call a kernel on our object wrapping the pointer

@cuda.jit
def initialize(x):
    i = cuda.grid(1)
    if i < len(x):
        x[i] = 3.14


initialize[1, nelems](arr)


# Use CuPy for a convenient way to print our data to show that the kernel
# initialized it
print(cp.asarray(arr))

This produces:

[3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14
 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14 3.14
 3.14 3.14 3.14 3.14]

gmarkall · August 23, 2022, 9:37am

There is some more background on this in https://raw.githubusercontent.com/numba/nvidia-cuda-tutorial/main/session-3/session-3-with-notes.pdf (session 3 from the NVIDIA Numba CUDA tutorial) - see slides 33-35.

I’ve now added the above code as an example in that repo: nvidia-cuda-tutorial/cai_implementation.py (main branch of numba/nvidia-cuda-tutorial on GitHub).

Razi · August 23, 2022, 4:21pm

Hi Graham,
Thanks for the quick response.
I have C++ code that creates a numpy.ndarray from a CPU pointer and sends it to Python:

np_arg = reinterpret_cast<PyArrayObject*>(PyArray_SimpleNewFromData(ND, dims, NPY_LONGDOUBLE, reinterpret_cast<void*>(c_arr)));
presult = PyObject_CallObject(pFunc, pArgs);

np_arg is of type numpy.ndarray.
I want to create a numpy.ndarray from a GPU pointer in the C++ code and pass it to Python. What is the correct way to do this?

gmarkall · August 23, 2022, 9:43pm

I think I’m still not sure what you’re attempting to do. What do you mean by “numpy.ndarray GPU pointer”? Do you want a CUDA device array? Further down that page there are also ways to allocate Pinned Memory and Unified Memory, and for mapping an existing ndarray into GPU memory. Do any of these look like what you need?

Razi · August 24, 2022, 2:58pm

Hi Graham,
I want to send a GPU vector from C++ to Python (Numba).
Sending a CPU array is very easy. For example:

First, declare and initialize the array:
double c_arr[SIZE] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
Second, wrap it:
np_arg = reinterpret_cast<PyArrayObject*>(PyArray_SimpleNewFromData(1, &dims, getType(), reinterpret_cast<void*>(data)));
This line creates a numpy.ndarray, which I can then send to Python:
presult = PyObject_CallObject(pFunc, pArgs);
Now here is the problem.
Instead of the CPU array double c_arr[SIZE] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
I am creating a GPU array:
double* in_d;
cudaMalloc((void**)&in_d, SIZE * sizeof(double));
cudaMemcpy(in_d, data, SIZE * sizeof(double), cudaMemcpyHostToDevice);
I want to send in_d to Python.
How can I do it?
Thanks
Razi

gmarkall · August 24, 2022, 3:49pm

Perhaps you can create a Python int from the pointer with PyLong_FromVoidPtr() to send over to the Python side, then pass that to the constructor of a MyArray-like object (from the example above).
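
For illustration, here is a minimal sketch of what the Python side of that suggestion might look like. The names DevicePointerArray and process are hypothetical and not from the thread, the 'f8' typestr assumes the C++ data is double, and the address used in the demo is a placeholder that is never dereferenced. To stay runnable without C++ or a GPU, the sketch calls PyLong_FromVoidPtr through ctypes.pythonapi to stand in for the C++ caller:

```python
import ctypes


class DevicePointerArray:
    """Hypothetical MyArray-like wrapper around a raw device address."""

    def __init__(self, address, shape, typestr):
        if isinstance(shape, int):
            shape = (shape,)
        self._address = int(address)
        self._shape = tuple(shape)
        self._typestr = typestr

    @property
    def __cuda_array_interface__(self):
        return {
            'shape': self._shape,
            'typestr': self._typestr,
            # (pointer as a Python int, read-only flag)
            'data': (self._address, False),
            'version': 2,
        }


def process(address, nelems):
    """Hypothetical entry point the C++ side would call via PyObject_CallObject."""
    # Assumption: the device buffer holds C++ doubles, hence 'f8'
    return DevicePointerArray(address, nelems, 'f8')


# Stand-in for the C++ caller: PyLong_FromVoidPtr turns a void* into a Python int.
PyLong_FromVoidPtr = ctypes.pythonapi.PyLong_FromVoidPtr
PyLong_FromVoidPtr.restype = ctypes.py_object
PyLong_FromVoidPtr.argtypes = [ctypes.c_void_p]

placeholder_address = 0x7F0000001000  # not a real device pointer
address = PyLong_FromVoidPtr(placeholder_address)
arr = process(address, 9)
```

In a real program the integer would come from the C++ code (for example, from in_d above), and the returned object could then be passed to a Numba CUDA kernel in the same way as arr in the earlier example.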

Razi · August 25, 2022, 10:49am

Thanks,
it’s working fine.
