About Numba LLVM 20 support and better PTX control

Hi, I am computing higher-genus Riemann theta functions, and I need to squeeze out every last cycle on the CPU or GPU of my laptop. I hope eventually to use CPU+GPU together. I am using DeepSeek/Claude heavily for coding.

I noticed that Numba 0.61.2 is built against LLVM 15, not the latest LLVM 20.

For the CPU:
I tried a naive approach: "hacking" Numba to emit its LLVM IR and building that with clang-20, but it turned out to be more complicated; there are some libraries involved in the linking.

For the GPU:
It seems I am hitting a PTX optimization problem. The standard GPU tuning work, finding a better balance between the memory spaces (constant/shared/global) and GPU occupancy, is fine and I accept it. Maybe the better route is a Numba/CUDA kernel with some inline PTX for the critical regions.

In both cases, it seems I need more than what is in the manual. I am wondering whether you would be generous enough to share some internal technical documentation with the rest of us, since Numba is open source.

Also, do you have an internal comparative study of Numba against newer LLVM versions?
Thank you very much for any help.
Kh.

Some folks from NVIDIA updated Numba to use LLVM 20 and saw performance improvements. I'm not sure if they have published the source yet. Here's the thread: Improving Numba for CPU workloads

LLVM 20 support is being worked on here: [WIP] LLVM 20 llvmdev recipe by gmarkall · Pull Request #1226 · numba/llvmlite · GitHub

Thank you very much for your replies.
It seems that nobody has tried to extract the LLVM IR that Numba generates and build it with a different compiler.
I shall experiment further with DeepSeek.