I am trying to create a moving linear regression and I wanted to utilize numba. However, I am struggling with the latter part as I lack the relevant experience.
Here’s what I have so far using pure numpy. It works, but without numba it is quite slow once you throw large arrays at it.
import numpy as np


def ols_1d(y, window):
    # Each row of y_roll is one window of `window` consecutive values of y.
    y_roll = np.lib.stride_tricks.sliding_window_view(y, window_shape=window)

    m = list()
    c = list()
    for row in np.arange(y_roll.shape[0]):
        # Design matrix: x = 1..window plus an intercept column.
        A = np.vstack([np.arange(1, window + 1), np.ones(window)]).T
        tmp_m, tmp_c = np.linalg.lstsq(A, y_roll[row], rcond=None)[0]
        m.append(tmp_m)
        c.append(tmp_c)

    m, c = np.array([m, c])

    # Fitted value at the last point of each window, NaN-padded at the start.
    return np.hstack((np.full((window - 1), np.nan), m * window + c))


def ols_2d(y, window):
    out = list()
    for col in range(y.shape[1]):
        out.append(ols_1d(y=y[:, col], window=window))

    return np.array(out).T


if __name__ == "__main__":
    a = np.random.randn(
        10000, 10
    )  # function is slow once you really increase the number of columns

    print(ols_2d(a, 10))
I need those functions to become as fast as possible. I am working with arrays that are potentially quite large (10,000 by 1,000,000).
Any help in applying numba is highly appreciated.
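For reference, here is a rough sketch of the direction I imagine a numba version could take. It is not a tested solution. It assumes the input is a 2-D float array and that `window >= 2`, and it replaces `np.linalg.lstsq` with the closed-form simple-regression formulas for x = 1..window, which is my own substitution. The function name `rolling_ols_endpoint` is made up, and the `ImportError` fallback is only there so the snippet still runs (slowly) without numba installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback (assumption): run as plain Python if numba is not installed.
    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap

    prange = range


@njit(parallel=True)
def rolling_ols_endpoint(y, window):
    """Fitted OLS value at the last point of each rolling window, per column.

    Hypothetical helper; assumes y is a 2-D float array and window >= 2.
    """
    n_rows, n_cols = y.shape
    out = np.full((n_rows, n_cols), np.nan)

    # For x = 1..window: mean of x and sum of squared deviations of x.
    x_mean = (window + 1) / 2.0
    s_xx = window * (window * window - 1) / 12.0

    for col in prange(n_cols):  # columns are independent, so parallelise here
        for end in range(window - 1, n_rows):
            s_y = 0.0
            s_xy = 0.0
            for k in range(window):
                v = y[end - window + 1 + k, col]
                s_y += v
                s_xy += (k + 1 - x_mean) * v
            slope = s_xy / s_xx
            # m * window + c, rewritten as y_mean + slope * (window - x_mean)
            out[end, col] = s_y / window + slope * (window - x_mean)

    return out


a = np.random.randn(1000, 10)
print(rolling_ols_endpoint(a, 10).shape)
```

The inner loop recomputes the sums for every window, so the cost is still proportional to rows × window per column. Running sums would probably reduce that, but I kept the simpler form to make it easier to check against `ols_2d`.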