NumbaRuntimeError: Failed in cuda mode pipeline NRT required but not enabled

I am new to Numba and want to speed up my code with a CUDA kernel, but I keep getting errors like this:

NotImplementedError                       Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/numba/core/lowering.py in lower_expr(self, resty, expr)
   1323         # raise NotImplementedError if the types aren't supported
-> 1324         impl = self.context.get_function("static_getitem", signature)
   1325         return impl(self.builder,

38 frames
NotImplementedError: No definition for lowering static_getitem(List(int64, True), Literal[int](0)) -> int64

During handling of the above exception, another exception occurred:

NumbaRuntimeError                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/numba/core/runtime/context.py in _require_nrt(self)
     40     def _require_nrt(self):
     41         if not self._enabled:
---> 42             raise errors.NumbaRuntimeError("NRT required but not enabled")
     43
     44     def _check_null_result(func):

NumbaRuntimeError: Failed in cuda mode pipeline (step: native lowering)
NRT required but not enabled
During: lowering "$60binary_subscr.2 = static_getitem(value=point1, index=0, index_var=$const58.1, fn=)" at (11)
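
One reading of this traceback, offered as a guess rather than a confirmed diagnosis: the lowering message types `point1` as `List(int64, True)`, which is a reflected Python list, and list support needs the Numba runtime (NRT), which the CUDA pipeline reports as not enabled. If that is the cause, converting the list arguments to NumPy arrays on the host before the launch should avoid it. A minimal sketch of the host-side conversion follows; the `float64` dtype is an assumption, not something the traceback dictates.

```python
import numpy as np

# Plain Python lists are typed by Numba as reflected lists, which need NRT.
# NumPy arrays are typed as arrays and can be indexed inside a kernel.
point1 = np.array([-1.0, 0.0], dtype=np.float64)  # dtype is an assumption
point2 = np.array([1.0, 0.0], dtype=np.float64)

print(type(point1).__name__, point1.dtype, point1.shape)
```

The kernel call would then take `point1` and `point2` (or `cuda.to_device(point1)` and `cuda.to_device(point2)`) in place of the list literals.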

And this is my code:

import cmath
import math

import numpy as np
from numba import cuda


@cuda.jit
def opti_field_line(point1, point2, z, u_final):
    # a indexes rows (height), b indexes columns (width)
    a, b = cuda.grid(2)
    h = 540
    w = 960
    wavelength = 0.000532
    PS = 0.0108
    PI = 3.14159265358979323846

    # Skip threads that fall outside the h x w array
    if a < h and b < w:
        if point1[0]-(b-w/2)*PS < 0 and point2[0]-(b-w/2)*PS > 0:
            t1 = math.sqrt(PI/wavelength/z) * (point1[0]-(b-w/2)*PS)
            t2 = math.sqrt(PI/wavelength/z) * (point2[0]-(b-w/2)*PS)
            # c1, c2: power-series expressions in t1, t2, used when |t| < 1.609
            c1 = 1/(1j-1) * math.sqrt(2/PI) * ((2*t1-1/5*t1**5+1/108*t1**9) + 1j*(-2/3*t1**3+1/21*t1**7+1/660*t1**11))
            c2 = 1/(1j-1) * math.sqrt(2/PI) * ((2*t2-1/5*t2**5+1/108*t2**9) + 1j*(-2/3*t2**3+1/21*t2**7+1/660*t2**11))
            # c3, c4: expressions used when |t| > 1.609
            c3 = -1*math.copysign(1, t1) + (1-1j)/2 * math.sqrt(2/PI) * cmath.exp(-1j*t1**2)/t1
            c4 = -1*math.copysign(1, t2) + (1-1j)/2 * math.sqrt(2/PI) * cmath.exp(-1j*t2**2)/t2
            # c5: common factor applied to every branch
            c5 = 1/4 * (1+1j) * math.sqrt(2*wavelength*z) * 1/z * cmath.exp(1j*2*PI/wavelength*z) * cmath.exp(1j*PI/wavelength/z*(point1[1]-(a-h/2)*PS)**2)
            if abs(t1) < 1.609 and abs(t2) < 1.609:
                c = c1 + c2
                u_final[a, b] += c5 * c
            elif abs(t1) < 1.609 and abs(t2) > 1.609:
                c = c1 + c4
                u_final[a, b] += c5 * c
            elif abs(t1) > 1.609 and abs(t2) < 1.609:
                c = c3 + c2
                u_final[a, b] += c5 * c
            elif abs(t1) > 1.609 and abs(t2) > 1.609:
                c = c3 + c4
                u_final[a, b] += c5 * c


h = 540
w = 960
u_final = np.zeros((h, w), dtype=np.complex128)

# Launch configuration: 16 x 16 threads per block, enough blocks to cover u_final
TPB = 16
threadsperblock = (TPB, TPB)
blockspergrid_x = math.ceil(u_final.shape[0]/threadsperblock[0])
blockspergrid_y = math.ceil(u_final.shape[1]/threadsperblock[1])
blockspergrid = (blockspergrid_x, blockspergrid_y)

u_final_gpu = cuda.to_device(u_final)
# point1 and point2 are passed here as plain Python lists
opti_field_line[blockspergrid, threadsperblock]([-1, 0], [1, 0], 100, u_final_gpu)
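
To sanity-check the launch configuration above, the rounding-up can be reproduced on the host with the standard library alone. This sketch only recomputes the block counts for the 540 x 960 array and the 16 x 16 threads per block used in the code; the helper name `blocks_for` is hypothetical and not part of Numba.

```python
import math


def blocks_for(shape, threads_per_block):
    """Hypothetical helper: blocks needed per axis so every element gets a thread."""
    return tuple(math.ceil(n / t) for n, t in zip(shape, threads_per_block))


blockspergrid = blocks_for((540, 960), (16, 16))
print(blockspergrid)  # (34, 60)
```

Since 34 blocks of 16 threads cover 544 rows, a few threads land past row 539, which is why the kernel needs its `a < h and b < w` guard.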

I’d be very grateful if you could help me.