@Brohrer thank you for adding the script for reproduction. I played around with it and will share my findings below:

I initially downloaded the script as `compare.py` and ran it to get the following results:

```
💣 zsh» python compare.py
Numpy array addition
32 ms
Numpy array addition, jitted by Numba
97 ms
Numba-jitted element-wise array addition
31 ms
```

I then applied the following `diff` to compare against the non-jitted version of `C = A + B`:

```
diff --git i/compare.py w/compare.py
index 9ac7a73b9b..b18cb6587f 100644
--- i/compare.py
+++ w/compare.py
@@ -6,7 +6,7 @@ n_reps = 10
 n_elements = int(1e8)
-@njit
+#@njit
 def add_numpy_jit(A, B, C):
     C = A + B
     return C
```

Running this again:

```
💣 zsh» python compare.py
Numpy array addition
32 ms
Numpy array addition, jitted by Numba
96 ms
Numba-jitted element-wise array addition
31 ms
```

So this is a bit weird, I thought to myself… I then realized that the benchmark compares quite different things: `np.add` is being tested as the three-argument version `fcn(A, B, C)`, which writes into the preallocated `C`, while `add_numpy_jit` effectively does a two-argument addition, since `C = A + B` allocates a new array and ignores the `C` that was passed in. Whether or not the `njit` decorator is applied doesn't really make a difference here. Here is a patch that compares the jitted and non-jitted functions, using the `py_func` attribute of the compiled function to execute the non-jitted version:

```
diff --git i/compare.py w/compare.py
index 9ac7a73b9b..cfb585e75b 100644
--- i/compare.py
+++ w/compare.py
@@ -8,8 +8,7 @@ n_elements = int(1e8)
 @njit
 def add_numpy_jit(A, B, C):
-    C = A + B
-    return C
+    return np.add(A, B, C)
 @njit
@@ -43,6 +42,6 @@ C = np.random.sample(n_elements)
 add_numpy_jit(A, B, C)
 add_elements(A, B, C)
-time_function(np.add, "Numpy array addition")
+time_function(add_numpy_jit.py_func, "Numpy array addition py_func")
 time_function(add_numpy_jit, "Numpy array addition, jitted by Numba ")
 time_function(add_elements, "Numba-jitted element-wise array addition")
```

Running it:

```
💣 zsh» python compare.py
Numpy array addition py_func
32 ms
Numpy array addition, jitted by Numba
32 ms
Numba-jitted element-wise array addition
31 ms
```

I am not sure what conclusions to draw from these numbers, though. I do wonder why Numba seems to `jit` the three-argument variant but rejects an explicit use of the `out=` kwarg. I will think about this some more.
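
For reference, here is a minimal NumPy-only sketch of the two-argument versus three-argument difference described above. The function names `add_two_arg` and `add_three_arg` and the tiny arrays are my own placeholders, not part of the original script, and Numba is deliberately left out so the snippet only shows the allocation behaviour, not any timing.

```python
import numpy as np


def add_two_arg(A, B, C):
    # Hypothetical stand-in for the original add_numpy_jit body:
    # this rebinds the local name C to a freshly allocated array,
    # so the caller's preallocated C is never written to.
    C = A + B
    return C


def add_three_arg(A, B, C):
    # Hypothetical stand-in for the patched body: the third positional
    # argument of np.add is the output array, so the result is written
    # into the caller's C and no new array is allocated.
    return np.add(A, B, C)


A = np.arange(5, dtype=np.float64)
B = np.ones(5)
C = np.zeros(5)

result_two = add_two_arg(A, B, C)
# The caller's C is still all zeros after the two-argument version.
c_untouched = bool(np.all(C == 0))

result_three = add_three_arg(A, B, C)

print("two-arg result is C:  ", result_two is C)
print("C untouched by two-arg:", c_untouched)
print("three-arg result is C:", result_three is C)
```

If this matches what the benchmark does, the extra time in the original "jitted by Numba" row would plausibly come from allocating a new 1e8-element array on every call rather than from Numba itself, but I have not verified that beyond the timings above.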