Combine IPC and P2P Access

I have two GPUs under the same PCIe switch. When I enable P2P access and allocate memory blocks on device 1 and device 0, I can access device 1's memory block in kernels launched on device 0 within the current process.
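For reference, this is roughly what my current single-process P2P setup looks like, as a minimal sketch using the CUDA runtime API (error checking omitted):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void read_remote(const float* remote, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = remote[i];  // dereferences memory that physically lives on device 1
}

int main()
{
    const int n = 1024;

    // Check that device 0 can access device 1's memory, then enable peer
    // access from device 0's context.
    int can_access = 0;
    cudaDeviceCanAccessPeer(&can_access, 0, 1);
    if (!can_access) { printf("No P2P path between devices 0 and 1\n"); return 1; }
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);

    // Allocate a block on device 1...
    cudaSetDevice(1);
    float* d1_buf;
    cudaMalloc(&d1_buf, n * sizeof(float));
    cudaMemset(d1_buf, 0, n * sizeof(float));

    // ...and a block on device 0.
    cudaSetDevice(0);
    float* d0_out;
    cudaMalloc(&d0_out, n * sizeof(float));

    // A kernel launched on device 0 reads device 1's buffer directly.
    read_remote<<<(n + 255) / 256, 256>>>(d1_buf, d0_out, n);
    cudaDeviceSynchronize();
    return 0;
}
```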
I wonder whether a CUDA IPC memory handle, allocated on device 1 by another process, can somehow be accessed directly in a kernel launched on device 0 within the current process.

Perhaps you could use managed memory instead, which provides a single address space shared by all devices and the host:

https://numba.readthedocs.io/en/latest/cuda-reference/memory.html#numba.cuda.managed_array

More on managed memory / UVM: Unified Memory for CUDA Beginners on the NVIDIA Developer Blog: https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
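To make that concrete, here is a minimal sketch of the same idea at the CUDA C level, using cudaMallocManaged (the runtime-API counterpart of what managed_array allocates); error checking omitted:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void increment(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] += 1.0f;
}

int main()
{
    const int n = 1024;

    // One allocation, one address, visible to the host and to every device:
    // no explicit P2P or IPC plumbing is needed within a process.
    float* data;
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i)
        data[i] = 0.0f;

    // Launch on device 0, then on device 1, passing the same pointer.
    cudaSetDevice(0);
    increment<<<(n + 255) / 256, 256>>>(data, n);
    cudaDeviceSynchronize();

    cudaSetDevice(1);
    increment<<<(n + 255) / 256, 256>>>(data, n);
    cudaDeviceSynchronize();

    printf("data[0] = %f\n", data[0]);  // the host reads the result directly: 2.0
    cudaFree(data);
    return 0;
}
```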

Also, as a related note: the IPC mechanism exposed in Numba is an older one. The recommended mechanism going forward is based on the driver's Virtual Memory Management APIs (cuMemCreate etc., and cuMemExportToShareableHandle), documented in the CUDA Driver API reference: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__VA.html. These aren't yet implemented / exposed in Numba (I haven't got a timeline for doing this yet).
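For the record, the newer flow looks roughly like the following untested sketch (Linux with POSIX file descriptor handles; error checking, cuInit / context setup, and shipping the fd to the other process over a Unix domain socket are all omitted):

```cpp
#include <cuda.h>
#include <cstdint>

// Exporting process: create a physical allocation on device 1 and export it
// as a file descriptor that can be sent to another process.
void export_allocation(size_t bytes, int* fd_out, size_t* size_out)
{
    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = 1;  // physical backing on device 1
    prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR;

    // Sizes must be a multiple of the allocation granularity.
    size_t gran = 0;
    cuMemGetAllocationGranularity(&gran, &prop, CU_MEM_ALLOC_GRANULARITY_MINIMUM);
    size_t size = ((bytes + gran - 1) / gran) * gran;

    CUmemGenericAllocationHandle handle;
    cuMemCreate(&handle, size, &prop, 0);
    cuMemExportToShareableHandle(fd_out, handle, CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR, 0);
    *size_out = size;
}

// Importing process: map the received fd into the address space and grant
// device 0 access to it.
CUdeviceptr import_allocation(int fd, size_t size)
{
    CUmemGenericAllocationHandle handle;
    cuMemImportFromShareableHandle(&handle, (void*)(uintptr_t)fd,
                                   CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR);

    CUdeviceptr ptr = 0;
    cuMemAddressReserve(&ptr, size, 0, 0, 0);
    cuMemMap(ptr, size, 0, handle, 0);

    // Access is granted per device, so device 0 can be given access to
    // memory physically resident on device 1 (the devices still need to be
    // P2P-capable, as in your setup).
    CUmemAccessDesc access = {};
    access.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    access.location.id = 0;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(ptr, size, &access, 1);

    return ptr;  // usable in kernels launched on device 0
}
```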