I haven’t used numba extensively, so it’s entirely possible that I’m just doing something dumb here, but I encountered a very strange result and I’m not sure whether it’s a bug or my own fault.
I wrote a function that implements softmax with an analytical derivative w.r.t. the input values. My original attempt appeared to work for some inputs (2D arrays), but when called with a 1D array it returns the value of its second argument from the previous invocation. It’s very odd.
This is my original attempt, which demonstrates the bug:
import numba
import numpy as np
from numba import float64

@numba.guvectorize([(float64[:], float64, float64, float64[:])],
                   '(n),()->(),(n)',
                   target='cpu')
def softmax_test(x, alpha, sm, dsm):
    eax = np.exp(alpha * x)
    num = np.sum(x * eax)
    den = np.sum(eax)
    sm = num / den
    dsm = eax * (den*(1 + alpha*x) - alpha*num) / den**2
x = np.random.rand(10, 1000)

sm, dsm = softmax_test(x, -10.4)
print(sm)
# prints correct output

sm, dsm = softmax_test(x[4, :], -10.4)
print(sm)
# prints -10.4

sm, dsm = softmax_test(x[4, :], -10.5)
print(sm)
# prints -10.4 (the alpha from the previous call)

sm, dsm = softmax_test(x[4, :], -10.6)
print(sm)
# prints -10.5 (again the previous alpha)
Here is my current version, which does not show the strange behavior and always produces correct results as far as I can tell. Maybe this is just the correct way to write it, and that’s fine with me, but my reading of the documentation leads me to believe that my original version should have worked, and regardless, the failure mode is nuts!
@numba.guvectorize([(float64[:], float64[:], float64[:], float64[:])],
                   '(n),()->(),(n)',
                   target='cpu')
def softmax(x, alpha, sm, dsm):
    eax = np.exp(alpha * x)
    num = np.sum(x * eax)
    den = np.sum(eax)
    sm[:] = num / den
    dsm[:] = eax * (den*(1 + alpha*x) - alpha*num) / den**2
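In case it helps anyone check the math independently of numba: this is the plain-NumPy reference I compare against (`softmax_ref` is just my name for it, not part of the code above), including a central-difference check that the analytical derivative is right.

```python
import numpy as np

def softmax_ref(x, alpha):
    # same math as the guvectorized kernel, written in plain NumPy;
    # works on a 1D vector or row-wise on a 2D array
    eax = np.exp(alpha * x)
    num = np.sum(x * eax, axis=-1, keepdims=True)
    den = np.sum(eax, axis=-1, keepdims=True)
    sm = num / den
    dsm = eax * (den * (1 + alpha * x) - alpha * num) / den**2
    return np.squeeze(sm, -1), dsm

rng = np.random.default_rng(0)
x = rng.random(8)
sm, dsm = softmax_ref(x, -10.4)

# central-difference check of the analytical derivative
eps = 1e-6
for i in range(x.size):
    xp, xm = x.copy(), x.copy()
    xp[i] += eps
    xm[i] -= eps
    fd = (softmax_ref(xp, -10.4)[0] - softmax_ref(xm, -10.4)[0]) / (2 * eps)
    assert np.isclose(fd, dsm[i], rtol=1e-3, atol=1e-8)
print("derivative check passed")
```

A handy sanity case: for a constant input vector of length n, the weighted softmax returns the constant itself and every derivative component is 1/n.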
So: have I found a bug (possibly a documentation bug), or is this user error?