The future of kernel programming style with Numba

I did a few tests using stencil and the results are not great. This snippet:

import numba as nb
import numpy as np
import timeit as ti

def ker0(a):
  return 42*a

@nb.stencil
def ker(a):
  return 42*a[0, 0]

@nb.njit(fastmath=True)
def ker1(a):
  return ker(a)

@nb.njit(fastmath=True, parallel=True)
def ker2(a):
  return ker(a)

a = np.arange(10000).reshape((100, 100))

for i in range(3):
  fun = f'ker{i}(a)'
  # setup=fun runs the call once beforehand, so JIT compilation is excluded from the timings
  t = 1000 * np.array(ti.repeat(stmt=fun, setup=fun, globals=globals(), number=1, repeat=100))
  print(f'{fun}:  {np.amin(t):6.3f}ms  {np.median(t):6.3f}ms')

produces:

ker0(a):   0.005ms   0.005ms
ker1(a):   0.009ms   0.009ms
ker2(a):   0.020ms   0.020ms

on a 6-core CPU with Python 3.10.4. Parallel mode actually slows the stencil down. Are there more performant ways to write kernels? Perhaps numba-dpex (IntelPython/numba-dpex, the Numba extension for Intel XPUs)? Anything else? Where is the development effort going these days?

@pauljurczak There is a stencil issue I opened and an ongoing pull request by Dr. Todd on this topic.
It would be great if you could add your findings to the issue thread.

It took a while, but I just did.