How can I improve the runtime of this linear system solve?

Where can I find an example that calls a LAPACK function at the C level to solve a linear system?
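
To make concrete what I mean, here is a minimal sketch of the Python-level call, assuming SciPy's low-level wrapper scipy.linalg.lapack.dgesv is the relevant routine (LU factorization with partial pivoting). The helper name solve_with_dgesv and the 2x2 test system are made up for illustration and are not my actual problem. What I am after is the same call made from C or Cython, for example through scipy.linalg.cython_lapack, so that the Python overhead disappears.

```python
import numpy as np
from scipy.linalg import lapack


def solve_with_dgesv(a, b):
    """Solve a @ x = b with LAPACK's dgesv; b must have shape (n, nrhs)."""
    # Fortran-ordered float64 arrays match the memory layout LAPACK expects.
    a_f = np.asfortranarray(a, dtype=np.float64)
    b_f = np.asfortranarray(b, dtype=np.float64)
    # dgesv returns the LU factors, the pivot indices, the solution and a status flag.
    lu, piv, x, info = lapack.dgesv(a_f, b_f)
    if info != 0:
        # info > 0: singular matrix; info < 0: illegal argument.
        raise np.linalg.LinAlgError(f"dgesv failed with info={info}")
    return x


# Hypothetical 2x2 system, only to show the calling convention.
a = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([[1.0], [2.0]])
x = solve_with_dgesv(a, b)
```

This still goes through the Python wrapper, so it only shows which routine and arguments are involved, not the C-level call itself.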

I previously tried the Cython API on a distributed cluster but ran into this error, so I am not sure I can make it work for solve either…