NumPy operations that are not supported in the CUDA target need to be rewritten in terms of loops and operations on individual elements. I haven't had time to translate all of the above, but here are a couple of starting points (the code below is untested, but it illustrates the general idea):
- numpy.linalg.inv: It looks like your matrix is 3x3, so you could use a function like this:
from numba import njit

@njit
def invert_3x3_matrix(m, res):
    # Unpack the nine entries of the input matrix
    a = m[0, 0]
    b = m[0, 1]
    c = m[0, 2]
    d = m[1, 0]
    e = m[1, 1]
    f = m[1, 2]
    g = m[2, 0]
    h = m[2, 1]
    i = m[2, 2]
    # Reciprocal of the determinant
    D = 1.0 / (a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g))
    # Entries of the inverse (the adjugate scaled by 1/det)
    a1 = D * (e*i - f*h)
    b1 = D * (c*h - b*i)
    c1 = D * (b*f - c*e)
    d1 = D * (f*g - d*i)
    e1 = D * (a*i - c*g)
    f1 = D * (c*d - a*f)
    g1 = D * (d*h - e*g)
    h1 = D * (b*g - a*h)
    i1 = D * (a*e - b*d)
    # Write the result into the caller-provided output array
    res[0, 0] = a1
    res[0, 1] = b1
    res[0, 2] = c1
    res[1, 0] = d1
    res[1, 1] = e1
    res[1, 2] = f1
    res[2, 0] = g1
    res[2, 1] = h1
    res[2, 2] = i1
Then call it by declaring a local array for the result, e.g.:
invertCsensor = cuda.local.array((3, 3), np.float64)
invert_3x3_matrix(C_sensor_n, invertCsensor)
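For context, this is roughly how it might sit inside a kernel. The kernel below is only a minimal sketch: the kernel name, array names, and shapes are hypothetical, and it assumes a Numba version in which @njit functions can be called from CUDA kernels:

import numpy as np
from numba import cuda

@cuda.jit
def invert_kernel(matrices, results):
    # One thread per 3x3 matrix; matrices and results have shape (n, 3, 3)
    idx = cuda.grid(1)
    if idx < matrices.shape[0]:
        # Scratch space private to this thread
        inv = cuda.local.array((3, 3), np.float64)
        invert_3x3_matrix(matrices[idx], inv)
        # Copy the local result out to global memory
        for r in range(3):
            for c in range(3):
                results[idx, r, c] = inv[r, c]

It could then be launched with something like invert_kernel[(n + 127) // 128, 128](matrices, results).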
- np.dot: For a dot product, you could rewrite it like this:
calcVal = 0.0
for i in range(len(leftSide)):
    calcVal += leftSide[i] * rightSide[i]
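If the np.dot you need is actually a matrix-vector product (e.g. multiplying the 3x3 inverse by a 3-vector), the same idea extends to a nested loop. This is just a sketch with a hypothetical helper name, following the same pass-in-the-output pattern as the inverse above:

from numba import njit

@njit
def matvec_3x3(mat, vec, out):
    # out[r] = sum over c of mat[r, c] * vec[c], for a 3x3 matrix and length-3 vectors
    for r in range(3):
        acc = 0.0
        for c in range(3):
            acc += mat[r, c] * vec[c]
        out[r] = acc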
I hope this helps give the idea. I think you'll also need to rewrite conj as a loop.
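For conj, a hypothetical element-wise helper along the same lines might look like the sketch below, assuming a 1D complex input and a caller-provided output of the same length (I've used the real/imag attributes and basic arithmetic rather than the conjugate() method, to stay within operations I'd expect to compile):

from numba import njit

@njit
def conj_1d(arr, out):
    # Element-wise complex conjugate, built from the real and imaginary parts
    for i in range(len(arr)):
        out[i] = arr[i].real - 1j * arr[i].imag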