I noticed that the operation `value ** p` shows a huge difference in computing time between the two cases `p = 2.0` and `p = 2`. (Likewise between `p = 3.0` and `p = 3`, and so on.)
Minimal working example:
```python
from numba import njit

@njit(fastmath=True)
def my_func(a, p):
    n = len(a)
    x = 0
    for i in range(n):
        for j in range(n):
            x = x + (a[i] ** p) + (a[j] ** p)
    return x
```
with input:

```python
import numpy as np

n = 20000
seed = 0
np.random.seed(seed)
a = np.random.rand(n)
```
Then run the following code for `p = 2.0` and for `p = 2` (the first call triggers JIT compilation, so the timed second call measures execution only):

```python
import time

my_func(a, p)  # warm-up: triggers compilation

tic = time.time()
my_func(a, p)
toc = time.time()
print(toc - tic)
```
The `p = 2` case is a lot faster (by about 80% on my PC).
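My only guess so far (an assumption, I have not inspected the generated machine code) is that an integer exponent can be lowered to a few plain multiplications via exponentiation by squaring, while a float exponent forces a call to the generic `pow` routine. A pure-Python sketch of the squaring trick, with `int_pow` being a hypothetical helper name, not anything from Numba:

```python
def int_pow(base, exp):
    """Raise `base` to a non-negative integer `exp` using exponentiation
    by squaring (repeated multiplication) -- the kind of lowering a
    compiler can only do when the exponent is a known integer."""
    result = 1.0
    while exp > 0:
        if exp & 1:          # low bit of the exponent is set:
            result *= base   # fold the current power into the result
        base *= base         # square the base for the next bit
        exp >>= 1            # move on to the next bit
    return result

# Matches the builtin ** operator for integer exponents.
assert int_pow(1.5, 4) == 1.5 ** 4
```

If that guess is right, it would explain why the integer case gets cheaper per element, but I would like a confirmation from someone who knows what Numba/LLVM actually emit.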
However, if we remove the `@njit` decorator, `p = 2.0` becomes (slightly) faster than `p = 2`!
Can anyone help me understand what is going on here?