I am trying to share a PyTorch CUDA tensor with another process through Numba's IPC mechanism, and I get strange results: when I read the tensor through the IPC handle with Numba, the network occasionally produces an incorrect result. After I added a delay right after converting the tensor, it started working correctly, and I don't understand why that happens.
Code sample:
import pickle

import numpy as np
import torch
from numba import cuda
from numba.cuda.api import _prepare_shape_strides_dtype  # private Numba helper
from numba.cuda.cudadrv import devices, driver

backbone_features = torch.cat(backbone_features_list, dim=0)

# Describe the tensor's GPU buffer via the CUDA array interface.
desc = backbone_features.__cuda_array_interface__
shape = desc["shape"]
strides = desc.get("strides")
dtype = np.dtype(desc["typestr"])
shape, strides, dtype = _prepare_shape_strides_dtype(shape, strides, dtype, order="C")
size = driver.memory_size_from_info(shape, strides, dtype.itemsize)

# Wrap the raw device pointer; `owner` keeps the tensor alive.
devptr = driver.get_devptr_for_active_ctx(desc["data"][0])
data = driver.MemoryPointer(
    devices.get_context(), devptr, size=size, owner=backbone_features)

# Export an IPC handle and pickle it for the other process.
ipch = devices.get_context().get_ipc_handle(data)
desc = dict(shape=shape, strides=strides, dtype=dtype)
handle = pickle.dumps([ipch, desc])
Can it happen that new data is written into the tensor before the current data has been read on the other side?
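My current guess is that CUDA kernels launched by PyTorch are asynchronous, so torch.cat can return before the concatenation has actually finished writing to the buffer, and the consumer then reads stale memory. A sketch of the workaround I am testing, under that assumption (export_after_sync is my own name; torch.cuda.synchronize() is the only change relative to the code above):

```python
import torch


def export_after_sync(tensors):
    """Concatenate on the GPU, then block until the kernel has
    actually finished before the buffer is shared via IPC.

    Sketch only: assumes the race is caused by asynchronous CUDA
    execution, which is what the delay seems to paper over.
    """
    out = torch.cat(tensors, dim=0)
    # torch.cat is launched asynchronously; without this barrier the
    # IPC consumer may observe the buffer before the write completes.
    torch.cuda.synchronize()
    return out


if torch.cuda.is_available():
    t = export_after_sync([torch.ones(2, 2, device="cuda"),
                           torch.zeros(2, 2, device="cuda")])
```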
I also tried the basic example using the numba.cuda API:
arr = cuda.to_device(tensor)
handle = arr.get_ipc_handle()
handle = pickle.dumps(handle)
and on the receiving side:
handle = pickle.loads(data)  # bytes received from the sender
with handle as ipc_array:
    # IpcArrayHandle is a context manager that yields the device array.
    hary = ipc_array.copy_to_host(stream=cuda.stream())
The handle is sent between the processes over localhost.
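For completeness, this is roughly how the pickled handle travels between my processes. The socket code below is a stdlib-only sketch (send_payload, recv_payload, and the demo dict standing in for [ipch, desc] are all mine, not from the real system):

```python
import pickle
import socket
import struct
import threading


def send_payload(sock, obj):
    """Pickle obj and send it length-prefixed so the receiver knows
    how many bytes to expect."""
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)


def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf += chunk
    return buf


def recv_payload(sock):
    """Read the 4-byte length prefix, then exactly that many bytes."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return pickle.loads(_recv_exact(sock, length))


# Demo over localhost: a plain dict stands in for the pickled handle.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

received = {}


def serve():
    conn, _ = server.accept()
    received["payload"] = recv_payload(conn)
    conn.close()


t = threading.Thread(target=serve)
t.start()

client = socket.create_connection(("127.0.0.1", port))
send_payload(client, {"shape": (4, 2), "typestr": "<f2"})
client.close()
t.join()
server.close()
```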