How can I improve the runtime of this linear system solve?

Oyibo · February 26, 2024, 11:48pm

Thank you for cross-checking, @ofk123.

In the given example, we used mutable global variables to pass the C functions to the jitted functions. You will have to change this approach to be able to cache. Can you provide a minimal working example of what you want to accomplish using SciPy, along with the version that fails (in your other thread)? This will increase the likelihood of getting help from other users.
If you can’t measure a difference in performance between F-order and C-order, then the memory layout most likely doesn’t significantly affect performance in your case. Since the matrices are square and symmetric, the computation will produce the same results regardless of the memory layout.
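To make the layout point concrete, here is a small NumPy-only sketch (not part of the Numba wrappers in this thread; the matrix and all variable names are made up for illustration). It checks that a symmetric matrix has the same buffer contents in row-major and column-major order, while a triangular Cholesky factor does not: a column-major reader sees it transposed.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)   # symmetric positive definite test matrix
A = (A + A.T) / 2.0             # force exact symmetry despite rounding

# Symmetric matrix: row-major and column-major flattening give identical data.
same_buffer = np.array_equal(A.ravel(order="C"), A.ravel(order="F"))

# Triangular factor: not symmetric, so the layout matters here.
L = np.linalg.cholesky(A)       # lower-triangular factor, A = L @ L.T

# Take the row-major data of L and read it column-major,
# the way a Fortran routine would interpret a C-ordered buffer.
as_seen_by_fortran = L.ravel(order="C").reshape(L.shape, order="F")
factor_is_transposed = np.array_equal(as_seen_by_fortran, L.T)

print(same_buffer, factor_is_transposed)
```

So for A itself the order is irrelevant, but for the factor the triangle flag passed to LAPACK has to account for how the buffer is read.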
Since your data array is always 1D and you don’t have multiple right-hand sides (RHS), you can simplify the process by passing a vector instead of a matrix to the LAPACK function dpotrs. You will receive a vector as a result, too. Note that dpotrs overwrites B with the solution, so pass a copy if you still need the original right-hand side. This change should be sufficient for your specific case.

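import numpy as np
from numba import njit

# dpotrs_fn is the global LAPACK function pointer from the earlier example.

@njit
def numba_dpotrs(A, B, lower=True):
    """Solve A @ X = B with DPOTRS, where A holds the Cholesky factor from DPOTRF."""
    # 'U'/'L' look swapped; presumably intentional, assuming A is C-contiguous:
    # LAPACK reads it column-major (as the transpose), so NumPy's lower is LAPACK's upper.
    UPLO = np.array(ord('U') if lower else ord('L'), np.int32)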
    INFO = np.array(0, dtype=np.int32)
    size = A.shape[0]
    N = np.array(size, dtype=np.int32)
    LDA = np.array(size, dtype=np.int32)
    if B.ndim == 1:
        NRHS = np.array(1, dtype=np.int32)
        LDB = np.array(B.size, dtype=np.int32)
    else:
        NRHS = np.array(B.shape[0], dtype=np.int32)
        LDB = np.array(B.shape[1], dtype=np.int32)
    dpotrs_fn(
        UPLO.ctypes,
        N.ctypes,
        NRHS.ctypes,
        A.ctypes,
        LDA.ctypes,
        B.ctypes,       # out
        LDB.ctypes,
        INFO.ctypes,    # out
    )
    if INFO:
        raise Exception(f'Oh no, something went wrong. INFO: {INFO}')
    return B
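As a reference to cross-check the wrapper against, here is a minimal SciPy sketch of the same vector-versus-matrix idea. It uses `scipy.linalg.cho_factor` and `scipy.linalg.cho_solve` rather than `numba_dpotrs`, and the test matrix and variable names are made up for illustration.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)     # symmetric positive definite test matrix
b = rng.standard_normal(5)        # single right-hand side as a 1D vector

factor = cho_factor(A, lower=True)        # (factor matrix, lower flag)

x_vec = cho_solve(factor, b)              # 1D in, 1D out
x_mat = cho_solve(factor, b[:, None])     # (n, 1) in, (n, 1) out

print(x_vec.shape, x_mat.shape)
print(np.allclose(A @ x_vec, b))
```

Both calls solve the same system; the 1D call just avoids the extra reshape on the way in and out. Unlike the raw LAPACK call, `cho_solve` leaves `b` untouched by default.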
