I’m performing mathematical calculations with Numba, allocating cuda.local.array buffers inside my CUDA kernels. After the computations finish, I call cuda.current_context().reset() to clear the cache. This frees about 500 MB, but a significant amount of GPU memory remains occupied.
I need this memory completely freed for further GPU computations. If I use cuda.close() instead, I lose the context, and any Numba CUDA code I run afterward fails with CudaAPIError: [400] Call to cuLaunchKernel results in CUDA_ERROR_INVALID_HANDLE.
How can I effectively clear all GPU memory without losing the context?