Thank you for crosschecking, @ofk123.
In the given example we used mutable global variables to pass the C functions to the jitted functions. You have to change this approach to be able to cache the compiled functions. Can you provide a minimal working example of what you want to accomplish, both using SciPy and the version that fails (from your other thread)? This will increase the likelihood of getting help from other users.
If you can’t measure a difference in performance between using F-order and C-order, then it’s likely that the memory layout doesn’t significantly affect performance in your case. Since the matrices are square and symmetric, the computation will produce the same results regardless of the memory layout.
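A quick way to convince yourself of the layout point is the minimal NumPy sketch below. The matrix is random and purely illustrative; it only shows that a symmetric matrix is stored as the same sequence of numbers in C-order and in F-order.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M + M.T  # symmetric by construction (illustrative data only)

A_c = np.ascontiguousarray(A)  # row-major (C-order) copy
A_f = np.asfortranarray(A)     # column-major (F-order) copy

# order="K" reads the elements in the order they are laid out in memory
same_memory_sequence = np.array_equal(A_c.ravel(order="K"), A_f.ravel(order="K"))
print(same_memory_sequence)
```

Because A[i, j] equals A[j, i], walking the rows of the C-ordered copy visits the same values as walking the columns of the F-ordered copy.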
Since your data array is always 1D and you don’t have multiple right-hand sides (RHS), you can simplify the process by passing a vector instead of a matrix to the LAPACK function dpotrs. The result is then a vector, too. This change should be sufficient for your specific case.
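import numpy as np
from numba import njit

# dpotrs_fn is the global handle to LAPACK's dpotrs from the earlier example.

@njit
def numba_dpotrs(A, B, lower=True):
    """DPOTRS solves a system of linear equations A @ X = B.

    A holds the Cholesky factor (as computed by DPOTRF); B is overwritten
    with the solution.
    """
    # 'U' for lower=True looks swapped; presumably intentional, since a
    # C-ordered lower triangle appears as an upper triangle to column-major LAPACK.
    UPLO = np.array(ord('U') if lower else ord('L'), np.int32)
    INFO = np.array(0, dtype=np.int32)
    size = A.shape[0]
    N = np.array(size, dtype=np.int32)
    LDA = np.array(size, dtype=np.int32)
    if B.ndim == 1:
        # Single right-hand side passed as a vector
        NRHS = np.array(1, dtype=np.int32)
        LDB = np.array(B.size, dtype=np.int32)
    else:
        NRHS = np.array(B.shape[0], dtype=np.int32)
        LDB = np.array(B.shape[1], dtype=np.int32)
    dpotrs_fn(
        UPLO.ctypes,
        N.ctypes,
        NRHS.ctypes,
        A.ctypes,
        LDA.ctypes,
        B.ctypes,  # out
        LDB.ctypes,
        INFO.ctypes,  # out
    )
    if INFO:
        raise Exception(f'Oh no, something went wrong. INFO: {INFO}')
    return B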
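To check the vector variant, it can help to have a plain NumPy reference to compare against. The sketch below is only an assumption about how you might test it: the helper name reference_cho_solve and the random test data are hypothetical, and it uses the generic np.linalg.solve rather than LAPACK's dpotrs.

```python
import numpy as np

def reference_cho_solve(L, b):
    """Solve (L @ L.T) x = b for a lower Cholesky factor L (hypothetical helper)."""
    y = np.linalg.solve(L, b)       # first solve L y = b
    return np.linalg.solve(L.T, y)  # then solve L.T x = y

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)  # symmetric positive definite test matrix
L = np.linalg.cholesky(A)      # lower-triangular factor, A = L @ L.T
b = rng.standard_normal(4)     # single right-hand side as a 1D vector

x = reference_cho_solve(L, b)

# Comparison with the jitted version (not executed here); pass a copy,
# because B is overwritten in place:
# x_numba = numba_dpotrs(L, b.copy(), lower=True)
# np.allclose(x, x_numba)
```

If the two results agree for a 1D b, the vector path through dpotrs should be doing what you expect.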