Here is a minimal working example (MWE):
import numba
import numpy as np

@numba.njit
def foo(x):
    # round-trips the scalar through a 0-d array on every iteration
    for i in range(100):
        x = np.asarray(x).item() * -1
    return x

@numba.njit
def bar(x):
    # plain scalar arithmetic
    for i in range(100):
        x = x * -1
    return x

@numba.njit
def baz(x):
    # explicit int -> float -> int cast on every iteration
    for i in range(100):
        x = int(float(x * -1))
    return x
assert foo(5) == bar(5) == baz(5)
%timeit foo(5) # 3.88 μs ± 1.08 μs per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit bar(5) # 151 ns ± 1.7 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit baz(5) # 567 ns ± 3.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Any idea why the compiler doesn't optimize away the no-op round-trip? I understand that np.asarray could introduce a cast, but even that should be cheaper (as baz shows).
I suspect some range checking is involved? For example, np.asarray(1 << 500) raises inside an njit function.
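For reference, a minimal sketch of that raise (the exact exception may depend on the Numba version; the point is just that 1 << 500 does not fit in any fixed-width integer type, so Numba rejects it):

import numba
import numpy as np

@numba.njit
def overflow():
    # CPython constant-folds 1 << 500 into a big int that does not
    # fit in an int64, so Numba cannot type the constant
    return np.asarray(1 << 500)

overflow()  # raises (a typing/overflow error, depending on the Numba version)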