I want to perform matrix multiplication in Numba with a pre-allocated output array (i.e. the equivalent of passing the out parameter to numpy.matmul). However, this doesn’t seem possible when the output array is in Fortran order.
NumPy has two ways to perform a matrix multiplication with an out argument: numpy.dot and numpy.matmul. matmul works with any combination of memory orders, while dot only works with a C-order output array.
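To make the difference concrete, here is a minimal plain-NumPy sketch (no Numba involved). The array shapes and variable names are arbitrary choices for illustration and are not taken from the gist.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((4, 5))
expected = a @ b

# matmul writes into a Fortran-order output array without complaint.
out_f = np.empty((3, 5), order="F")
np.matmul(a, b, out=out_f)
matmul_ok = np.allclose(out_f, expected)

# dot rejects an output array that is not C-contiguous.
try:
    np.dot(a, b, out=np.empty((3, 5), order="F"))
    dot_f_error = None
except ValueError as exc:
    dot_f_error = exc

# dot is fine with a C-order output array.
out_c = np.empty((3, 5), order="C")
np.dot(a, b, out=out_c)
dot_c_ok = np.allclose(out_c, expected)

print(matmul_ok, type(dot_f_error).__name__, dot_c_ok)
```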
Numba duplicates the functionality of numpy.dot, including the C-order requirement on the output array, but it doesn’t support numpy.matmul. As a result, there is no apparent way to perform a multiplication with a Fortran-order output.
Here’s a gist with a minimal example. If the if statement at the end of the loop is removed, the numba_dot call will fail for any case where order_out is F.
Is there any workaround that I’m missing?
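One possible direction, sketched here in plain NumPy only: since (A B)^T = B^T A^T, and the transpose of a Fortran-order array is a C-order view of the same memory, dot can write into that view instead. Whether Numba's implementation of numpy.dot accepts the transposed view as its out argument is an assumption that this sketch does not verify.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((4, 5))

# Pre-allocated Fortran-order output for a @ b.
out_f = np.empty((3, 5), order="F")

# out_f.T is a C-contiguous (5, 3) view sharing memory with out_f.
out_view = out_f.T
view_is_c = out_view.flags.c_contiguous

# Compute (a @ b).T = b.T @ a.T directly into the view.
np.dot(b.T, a.T, out=out_view)

# out_f now holds a @ b in Fortran order.
result_ok = np.allclose(out_f, a @ b)
print(view_is_c, result_ok)
```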
Edit: I’m not able to include the link to the gist. Just add https:// to this.
gist.github.com/joshayers/3db3315684442aa9c22fe7959cafeee3