It depends. How important are caching and short compilation times to you? If they are not a high priority, you can use something like the following without sacrificing performance:
import numba as nb
import numpy as np

@nb.njit
def kernel_linear(x1, x2):
    # Dot product of two 1D vectors.
    s = 0.0
    for i in range(x1.shape[0]):
        s += x1[i] * x2[i]
    return s

@nb.njit
def kernel_rbf(x1, x2, gamma):
    # Squared Euclidean distance, then the RBF kernel value.
    s = 0.0
    for i in range(x1.shape[0]):
        s += (x1[i] - x2[i]) ** 2
    return np.exp(-gamma * s)

@nb.njit
def test(x, kernel_func, *kernel_params):
    # Calls the kernel on every pair of rows; out[i] keeps the value for
    # the last j, so this mainly exercises the kernel calls.
    out = np.empty(x.shape[0], x.dtype)
    for i in range(x.shape[0]):
        for j in range(x.shape[0]):
            out[i] = kernel_func(x[i], x[j], *kernel_params)
    return out

x = np.random.rand(5_000, 3)
test(x, kernel_linear)
test(x, kernel_rbf, 1.0)
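If caching or compilation time does become an issue, a common workaround is a small factory that closes over the kernel instead of passing it in as a runtime argument, so each kernel gets its own specialized loop. This is only a sketch, reusing the definitions above: make_test is a hypothetical helper, not part of the code from the answer, and whether the returned functions can also be cached on disk depends on your Numba version.

def make_test(kernel_func):
    # Hypothetical factory: bakes one specific kernel into its own jitted
    # loop, so the kernel is resolved at compile time instead of being
    # passed around as a first-class function argument.
    @nb.njit
    def test(x, *kernel_params):
        out = np.empty(x.shape[0], x.dtype)
        for i in range(x.shape[0]):
            for j in range(x.shape[0]):
                out[i] = kernel_func(x[i], x[j], *kernel_params)
        return out
    return test

test_linear = make_test(kernel_linear)
test_rbf = make_test(kernel_rbf)
test_linear(x)
test_rbf(x, 1.0)

The trade-off is flexibility: the kernel is fixed when make_test is called, so you build one specialized function per kernel up front rather than choosing the kernel at call time.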
I just checked: the pros and cons of various alternatives for this kind of problem were discussed in the thread I posted above: