I want to perform matrix multiplication in Numba with a pre-allocated output array (i.e. use the `out` parameter to `numpy.matmul`). However, it doesn’t seem possible when the output array is in Fortran-order.

NumPy has two ways to perform a matrix multiply with an `out` argument: `numpy.dot` and `numpy.matmul`. Using `matmul` works with any combination of memory orders, while `dot` only works for a C-order output array.
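To make the difference concrete, here is a small plain-NumPy sketch (no Numba involved). The array shapes and variable names are arbitrary choices for illustration and are not taken from the gist.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((4, 5))
expected = a @ b

# matmul accepts a pre-allocated output in either memory order.
out_c = np.empty((3, 5), order="C")
out_f = np.empty((3, 5), order="F")
np.matmul(a, b, out=out_c)
np.matmul(a, b, out=out_f)

# dot accepts the C-order output but rejects the Fortran-order one.
np.dot(a, b, out=out_c)
try:
    np.dot(a, b, out=out_f)
    dot_f_error = None
except ValueError as exc:
    dot_f_error = exc

print("dot with F-order out raised:", type(dot_f_error).__name__)
```

Note that the shapes need both dimensions greater than 1; an array with a length-1 axis can be C- and Fortran-contiguous at the same time, which would hide the problem.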

Numba duplicates the functionality of `numpy.dot`, but it doesn’t support `numpy.matmul`, so there is no way to perform a multiplication with a Fortran-order output.

Here’s a gist with a minimal example. If the `if` statement at the end of the loop is removed, the `numba_dot` call will fail for any case where `order_out` is `F`.

Is there any workaround that I’m missing?
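One candidate, sketched in plain NumPy only and not verified under Numba: the transpose of a Fortran-order array is a C-order view of the same memory, and `(a @ b).T` equals `b.T @ a.T`, so `dot` can write the transposed product into `out.T`. The helper name `dot_into_f_order` is hypothetical, and whether Numba's `dot` accepts the transposed view as a C-order output is an assumption here.

```python
import numpy as np

def dot_into_f_order(a, b, out):
    """Hypothetical helper: write a @ b into a Fortran-order `out` via np.dot."""
    # out.T is a C-contiguous view of out's memory, and (a @ b).T == b.T @ a.T,
    # so filling out.T with b.T @ a.T fills out with a @ b.
    np.dot(b.T, a.T, out=out.T)
    return out

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((4, 5))

out_f = np.empty((3, 5), order="F")
dot_into_f_order(a, b, out_f)
```

No copy is made: `out.T` shares memory with `out`, so the result lands directly in the Fortran-order array.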

Edit: I’m not able to include the link to the gist. Just add `https://` to this:

`gist.github.com/joshayers/3db3315684442aa9c22fe7959cafeee3`