How do I use the `sharedmem` parameter in a kernel?

The docs describe that a dispatcher object can be configured with sharedmem:

func[griddim, blockdim, stream, sharedmem]

How do I refer to this sharedmem value from within the kernel function?

Thank you!

You declare a shared array of 0 size, like:

cuda.shared.array(0, dtype=np.float32)

Note that all dynamic shared memory arrays alias, so if you want to have multiple dynamic shared arrays, you need to take disjoint views of the arrays. As an example, consider:

from numba import cuda
import numpy as np


@cuda.jit
def f():
    # Both declarations alias the same dynamic shared memory allocation
    f32_arr = cuda.shared.array(0, dtype=np.float32)
    i32_arr = cuda.shared.array(0, dtype=np.int32)
    f32_arr[0] = 3.14
    print(f32_arr[0])
    print(i32_arr[0])


# 1 block, 1 thread, default stream, 4 bytes of dynamic shared memory
f[1, 1, 0, 4]()
cuda.synchronize()

This allocates 4 bytes of shared memory (large enough for one int32 or one float32) and declares dynamic shared memory arrays of type float32 and of type int32. When f32_arr[0] is set, this also sets the value of i32_arr[0], because both arrays point at the same memory. So we see as output:

$ python repro.py 
3.140000
1078523331
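As a host-side sanity check (this snippet is not from the original thread; it uses only the standard struct module, no GPU required), you can confirm that these two outputs are the same bits:

```python
import struct

# Pack 3.14 as a little-endian float32, then reinterpret those
# 4 bytes as a little-endian int32
(bits,) = struct.unpack('<i', struct.pack('<f', 3.14))
print(bits)  # 1078523331
```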

The value 1078523331 is the int32 whose bit pattern equals that of the float32 3.14. To avoid this aliasing, we take disjoint views of the dynamic shared memory:

@cuda.jit
def f_with_view():
    f32_arr = cuda.shared.array(0, dtype=np.float32)
    # Skip the first int32 element (4 bytes) so this view does not
    # overlap f32_arr[0]
    i32_arr = cuda.shared.array(0, dtype=np.int32)[1:]
    f32_arr[0] = 3.14
    i32_arr[0] = 1
    print(f32_arr[0])
    print(i32_arr[0])


# 8 bytes of dynamic shared memory: one float32 + one int32
f_with_view[1, 1, 0, 8]()
cuda.synchronize()

This time we allocate 8 bytes of dynamic shared memory, using the first four for a float32 value and the next four for an int32 value. Now we can set both the int32 and the float32 values without them aliasing:

$ python repro.py
3.140000
1
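As an aside, a small host-side helper (hypothetical, just itemsize arithmetic with NumPy) makes the sharedmem launch argument less error-prone; for f_with_view above it gives the 8 bytes used in the launch:

```python
import numpy as np


def dynsmem_bytes(n_f32, n_i32):
    # Bytes of dynamic shared memory needed for n_f32 float32 elements
    # followed by n_i32 int32 elements
    return (n_f32 * np.dtype(np.float32).itemsize
            + n_i32 * np.dtype(np.int32).itemsize)


print(dynsmem_bytes(1, 1))  # 8
```

You could then launch with e.g. f_with_view[1, 1, 0, dynsmem_bytes(1, 1)]().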

Thank you, Graham. This informative answer belongs in the docs! 🙂

Unless it’s already there, somewhere…

You’re right, it does belong there, and it’s not really written anywhere at the moment. I think when the docs were first written it was assumed that readers would learn or know the semantics from the CUDA C Programming Guide, but these days it’s reasonable to expect the Numba docs alone to cover things like this.

Indeed, I know just enough CUDA C to be terrible at it, but I wasn’t certain. My first instinct was to set the size to 0, but I didn’t dare rely on that… Anyway, if it goes into the docs it will help. Thanks!
