I’m new to numba.
I run Lattice Boltzmann simulations. Using numba (specifically jit with parallel=True), I was able to speed them up a lot, for example from 13 hours to 4 hours on my home computer, which is amazing.
Then I ran my code on a cluster, and the simulation took longer than on my home computer.
When I checked the processors, I saw that the simulation was using only one of the cluster's processors, not the number of threads I had set with numba.set_num_threads(64), as I was expecting.
So I would like to understand better how this works and whether it can be fixed.
My home computer has a Ryzen 7 3700X with 16 threads. To check how numba.set_num_threads() behaves, I tried running with 8 threads (half), but when I looked at my processors, all of them appeared to be in use at about 50% capacity. This also confused me, because I thought it would literally use 8 threads, and that isn't what happened.
Any help would be great.