Call scipy 'splev' routine in numba jitted function

Hello devs,

I have a situation where almost 90% of my code is numerical computation, with 10% for reading data in the beginning. Numba has really helped me to reduce the computation time while maintaining excellent readability.
I have to use interpolation in my code in between for some data. I use scipy’s ‘splrep’ to calculate coefficients at the beginning and further ‘splev’ inside the program for interpolation. The problem is that the subroutine containing ‘splev’ can’t be jitted. So some python overhead comes in between my numerical computation because of just one subroutine which has to call the interpolation subroutine. What is the best course of action in this situation, (1) Should I think about extending numba to recogonise ‘splev’ or (2) Should I just write a python version of ‘splev’ and jit the same. If someone can provide an example for option (1) that is posted elsewhere it would be very helpful.

Thank you

I’m wondering if another option might work reasonably well.

splev is already a compiled routine and jitting a reimplemention might not go any faster. Perhaps you could rearrange your code so that there is no need for the routine to be jitted? Or do you call this routine so many times that the python variable translation, etc., overhead becomes significant?

Yes,

Currently I have arranged my code so that all the interpolation calls to ‘splev’ is done in one subroutine per iteration, but my simulation itself involves a lot of iterations.

I have managed to port the ‘scipy splev’ function to the numba jitted version as shown below,

import numpy as np
import numba
from scipy import interpolate
import matplotlib.pyplot as plt
import time

x = np.linspace(0,100,1000)
y = np.sin(x)

plt.plot(x,y,'.',linewidth=0.7,color='green', label='data')

@numba.njit(cache=True)
def numba_splev(x, coeff): # spline is extrapolated from the end spans for points not in the support.

  t,c,k = coeff

  n = t.size
  m = x.size

  k1  = k+1
  k2  = k1+1
  nk1 = n - k1

  l  = k1
  l1 = l+1

  y = np.zeros(m)

  for i in range(m):

    # fetch a new x-value arg
    arg = x[i]

    # search for knot interval t[l] <= arg <= t[l+1]
    while not ((arg >= t[l-1]) or (l1 == k2)):
      l1 = l
      l  = l-1
    while not ((arg < t[l1-1]) or (l == nk1)):
      l = l1
      l1 = l+1
    
    # evaluate the non-zero b-splines at arg.    
    h  = np.zeros(20)
    hh = np.zeros(19)
    
    h[0] = 1.0
    
    for j in range(k):

      for ll in range(j+1):

        hh[ll] = h[ll]
      h[0] = 0.0
      for ll in range(j+1):
        li = l + ll 
        lj = li - j - 1
        if(t[li] != t[lj]):
          f = hh[ll]/(t[li]-t[lj])
          h[ll] += f*(t[li]-arg)
          h[ll+1] = f*(arg-t[lj])
        else:
          h[ll+1] = 0.0
          break
    sp = 0.0
    ll = l - 1 - k1
    for j in range(k1):
      ll += 1
      sp += c[ll]*h[j]
    y[i] = sp
  return y

coeff = interpolate.splrep(x,y, k=3) # Calculate coeffcients using scipy splrep

x2 = np.linspace(0,105,10000) 

y2 = interpolate.splev(x2, coeff) # Interpolation with scipy splev
y3 = numba_splev(x2, coeff)       # Interpolation with custom numba splev 

plt.plot(x2,y2,'-',linewidth=2.0,color='blue', label='scipy splev')
plt.plot(x2,y3,'-',linewidth=1.0,color='red',  label='numba splev')
plt.legend()
plt.show()

t1 = time.time()
for n in range(0,10000):
  y2 = interpolate.splev(x2, coeff)
print(time.time()-t1)

t2 = time.time()
for n in range(0,10000):
  y3 = numba_splev(x2, coeff) 
print(time.time()-t2)

Figure_1

For sample 10000 iterations of interpolation, the performance of numba version was not as good compared to the native scipy splev. Times were as follows,
scipy splev - 2.74 s
numba splev - 21.66s

I am wondering where the overhead is in the numba jitted version (I have coded it very similar to the scipy splev.f version). Why is it that the wrapped fortran code performs better than the direct call to the numba jitted version. Can somebody please take a look and explain what is going on. If my code is badly optimized can somebody please suggest a solution?

Alternatively, if someone can kindly suggest an example for extending numba to recognize scipy’s splev routine, it would be very helpful.

Thank you

Something simple that can be done is move the array declaration from inside to outside the loop, creating a new array at each iteration is quite costly. You can then reuse the array each time, and simply reset it’s contents to zero at each iteration.

So something like:

@numba.njit
def numba_splev(x, t,c,k):
    
    # declare only once
    h    = np.zeros(20)
    hh = np.zeros(19)

    for i in range(m):

        ....

        # reset       
        h[:] = 0
        hh[:] = 0

Perhaps set the datatype of those arrays to be float32, but I don’t know if the reduced accuracy will impact the results, that would require some testing.

Making this change makes it almost as fast as Scipy for me.

2 Likes

@Rutger thank you so much. Moving the declaration outside the loop really worked!!! Numba jitted splev performance is now very close to scipy version. So, probably my naive code implementation could be the reason why the numba version can’t outperform the wrapped fortran version. Any other suggestions for code improvements? I will test the code in my main simulation program and report the performance improvement.

Can you describe where you have problems with method 1 (using the already implemented method from Numba)?

Maybe changing the division by zero behavior of the Numba implementation, or allowing more optimizations which can have influence on the result (fastmath) can improve the performance a bit further.

@numba.njit(cache=True,fastmath=True,error_model='numpy')
1 Like

Hello devs,

I did some more optimizations to the numba accelerated port of ‘scipy’s splev’ interpolation routine for my application needs. My input data for generating the spline coeff’s are usually equispaced and further, during the simulation, the interpolation is done at random values (based on results of some calculations). Considering both these into account I have modified the code as shown below,

import numpy as np
import numba
from scipy import interpolate
import matplotlib.pyplot as plt
import time

# Custom wrap of scipy's splrep
def custom_splrep(x, y, k=3):
    
    """
    Custom wrap of scipy's splrep for calculating spline coefficients, 
    which also check if the data is equispaced.
    
    """
    
    # Check if x is equispaced
    x_diff = np.diff(x)
    equi_spaced = all(np.round(x_diff,5) == np.round(x_diff[0],5))
    dx = x_diff[0]
    
    # Calculate knots & coefficients (cubic spline by default)
    t,c,k = interpolate.splrep(x,y, k=k) 
    
    return (t,c,k,equi_spaced,dx) 

# Numba accelerated implementation of scipy's splev
@numba.njit(cache=True)
def numba_splev(x, coeff):
    
    """
    Custom implementation of scipy's splev for spline interpolation, 
    with additional section for faster search of knot interval, if knots are equispaced.
    Spline is extrapolated from the end spans for points not in the support.
    
    """
    t,c,k, equi_spaced, dx = coeff
    
    t0 = t[0]
    
    n = t.size
    m = x.size
    
    k1  = k+1
    k2  = k1+1
    nk1 = n - k1
    
    l  = k1
    l1 = l+1
    
    y = np.zeros(m)
    
    h  = np.zeros(20)
    hh = np.zeros(19)

    for i in range(m):
        
       # fetch a new x-value arg
       arg = x[i]
       
       # search for knot interval t[l] <= arg <= t[l+1]
       if(equi_spaced):
           l = int((arg-t0)/dx) + k
           l = min(max(l, k1), nk1)
       else:
           while not ((arg >= t[l-1]) or (l1 == k2)):
               l1 = l
               l  = l-1
           while not ((arg < t[l1-1]) or (l == nk1)):
               l = l1
               l1 = l+1
       
       # evaluate the non-zero b-splines at arg.    
       h[:]  = 0.0
       hh[:] = 0.0
       
       h[0] = 1.0
       
       for j in range(k):
       
           for ll in range(j+1):
               hh[ll] = h[ll]
           h[0] = 0.0
       
           for ll in range(j+1):
               li = l + ll 
               lj = li - j - 1
               if(t[li] != t[lj]):
                   f = hh[ll]/(t[li]-t[lj])
                   h[ll] += f*(t[li]-arg)
                   h[ll+1] = f*(arg-t[lj])
               else:
                   h[ll+1] = 0.0
                   break
       
       sp = 0.0
       ll = l - 1 - k1
       
       for j in range(k1):
           ll += 1
           sp += c[ll]*h[j]
       y[i] = sp
    
    return y

######################### Testing and comparison #############################

# Generate a data set for interpolation
x, dx = np.linspace(10,100,200, retstep=True)
y = np.sin(x)

# Calculate the cubic spline spline coeff's
coeff_1 = interpolate.splrep(x,y, k=3)	# scipy's splrep
coeff_2 = custom_splrep(x,y, k=3)	    # Custom wrap of scipy's splrep

# Generate data for interpolation and randomize
x2 = np.linspace(0,110,1000) 
np.random.shuffle(x2)

# Interpolate
y2 = interpolate.splev(x2, coeff_1) # scipy's splev
y3 = numba_splev(x2, coeff_2)	    # Numba accelerated implementation of scipy's splev

# Plot data
plt.plot(x,y,'--', linewidth=1.0,color='green', label='data')
plt.plot(x2,y2,'o',color='blue', markersize=2.0, label='scipy splev')
plt.plot(x2,y3,'.',color='red',  markersize=1.0, label='numba splev')
plt.legend()
plt.show()

print("\nTime for random interpolations")
# Calculation time evaluation for scipy splev
t1 = time.time()
for n in range(0,10000):
  y2 = interpolate.splev(x2, coeff_1)
print("scipy splev", time.time() - t1)

# Calculation time evaluation for numba splev
t1 = time.time()
for n in range(0,10000):
  y2 = numba_splev(x2, coeff_2)
print("numba splev",time.time() - t1)

print("\nTime for non random interpolations")
# Generate data for interpolation without randomize
x2 = np.linspace(0,110,1000) 

# Calculation time evaluation for scipy splev
t1 = time.time()
for n in range(0,10000):
  y2 = interpolate.splev(x2, coeff_1)
print("scipy splev", time.time() - t1)

# Calculation time evaluation for numba splev
t1 = time.time()
for n in range(0,10000):
  y2 = numba_splev(x2, coeff_2)
print("numba splev",time.time() - t1)


With additional optimizations (to the knot interval search), these are the results of the above program in my corei7 machine,

If the interpolation is done at random values, numba version is now twice as fast,
Scipy’s splev = 0.896 s
Numba splev = 0.375 s

If the interpolation is not done at random values scipy’s version is slightly faster,
Scipy’s splev = 0.281 s
Numba splev = 0.375 s

Since my usecase involves random interpolation, the numba accelerated code works perfectly for me.
Figure_1

Thank you @rickhg12hs @Rutger @max9111 for your valuable suggestions. I am working on the suggestion of @max9111 for method(1) in my original question as an exercise and will post an update later.

Ref : scipy/scipy/interpolate/fitpack at v1.7.1 · scipy/scipy · GitHub , GitHub - dbstein/fast_splines: numba accelerated replacement to RectBivariateSpline, accurate to boundary edges
and
GitHub - dbstein/fast_splines: numba accelerated replacement to RectBivariateSpline, accurate to boundary edges