BUG: Numba using a lot of GPU memory

Hi everyone. I think I've found a bug in Numba, and I've created a reproducible demo. It should allocate only 256 KB of memory, but it allocates more than 2 GB. Can someone take a look?

import numba as nb
from numba import cuda
from time import sleep

######################

@cuda.jit("void(int32[:])", device=True)
def GPU_device_function(arr):
    return

@cuda.jit("void()")
def GPU_entry_point():
    # When this if statement is removed, it works normally
    if cuda.grid(1):
        return

    # Should use only 256 KB of memory.
    arr = cuda.local.array(shape=65536, dtype=nb.int32)

    # When this assignment is removed, it works normally
    arr[0] = 0

    # When this call is removed, it works normally
    GPU_device_function(arr)

######################

if __name__ == '__main__':
    print(cuda.select_device(0))
    print("LOADED")

    GPU_entry_point[1, 1]()  # Run once
    cuda.synchronize()

    print("DONE")
    sleep(3)  # Wait, so the memory spike will show up in Task manager before deallocation
    print("END")
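As a quick sanity check, the expected per-thread footprint of the local array is just the element count times the int32 size:

```python
# Expected per-thread footprint of cuda.local.array(shape=65536, dtype=nb.int32)
elements = 65536
bytes_per_int32 = 4
expected = elements * bytes_per_int32
print(expected, "bytes =", expected // 1024, "KB")  # 262144 bytes = 256 KB
```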

Thanks for the report! I can reproduce the behaviour, except with a Quadro RTX 8000 it uses 18GB of RAM! :slight_smile:

I’ve written it up on the issue tracker, and will report further progress there: https://github.com/numba/numba/issues/6352

Any progress on this?

@srujan The linked issue is quite old, and from reading back through it, it's not clear that we weren't just seeing an artifact of driver behaviour.

If you're having an issue, I'd suggest posting a description of the problem you're facing, along with code that can be executed to reproduce it.
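For what it's worth, one back-of-the-envelope check (purely illustrative; the SM count and resident-thread limit below are assumed figures, and this explanation was never confirmed in the thread): if local memory were reserved for every thread that could be resident on the device, rather than only the threads actually launched, the 256 KB per thread would multiply out to roughly the 18 GB observed above.

```python
# Back-of-the-envelope check (illustrative, unconfirmed figures):
per_thread_bytes = 65536 * 4      # 256 KB of int32 local array per thread
num_sms = 72                      # assumed SM count of a Quadro RTX 8000
max_threads_per_sm = 1024         # assumed resident-thread limit per SM
total = per_thread_bytes * num_sms * max_threads_per_sm
print(total / 2**30, "GiB")       # ≈ 18 GiB
```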