njit(parallel=True) or Python multiprocessing for speed-up?

Hello, I’m working with a very large array. The implementation I have now uses the Python multiprocessing library so that processes can work on different parts of the array at the same time. The child processes then call jitted functions in nopython mode.

Does njit(parallel=True) go faster than Python multiprocessing?
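For context, the current setup looks roughly like this (a minimal sketch; process_chunk and the fixed chunking are illustrative placeholders, not the actual workload):

```python
import numpy as np
from multiprocessing import Pool

from numba import njit


@njit
def process_chunk(chunk):
    # Placeholder nopython-mode work on one slice of the array.
    total = 0.0
    for x in chunk:
        total += x * x
    return total


def run_multiprocessing(arr, n_workers=4):
    # Split the array and let each child process handle one chunk.
    chunks = np.array_split(arr, n_workers)
    with Pool(n_workers) as pool:
        return sum(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    data = np.random.rand(10_000_000)
    print(run_multiprocessing(data))
```

Note that each child process compiles the jitted function on its first call, which adds start-up overhead on top of spawning the processes themselves; njit(cache=True) can reduce the compilation part.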

1 Like

There is no general answer to that, I think; it probably depends a lot on the specific problem. But large arrays generally sound like something that Numba's parallel support could be quite efficient at. It is also good to keep in mind that parallel=True not only enables prange for explicitly parallel loops, but also tells the compiler to apply parallel optimisations wherever it can identify them.
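For example, an explicitly parallel reduction over a large array could look like this (a minimal sketch; the sum-of-squares kernel is just a placeholder for the real per-element work):

```python
import numpy as np
from numba import njit, prange


@njit(parallel=True)
def sum_of_squares(arr):
    # prange splits the loop iterations across Numba's worker threads;
    # the scalar accumulation is recognized as a reduction.
    total = 0.0
    for i in prange(arr.shape[0]):
        total += arr[i] * arr[i]
    return total


data = np.random.rand(10_000_000)
print(sum_of_squares(data))
```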

1 Like

If the code that is being parallelized with multiprocessing is already jitted, then the pure execution time will be the same. Multiprocessing adds a certain amount of overhead compared to multithreading, in general and independently of Numba. Numba's parallel=True uses multithreading, so the overhead of starting the parallel calculations is lower.
However, the comparison is quite complicated. The overhead of multiprocessing sometimes pays off, for example if you work on a server with many processors.

It’s hard to say that, as a rule, one is faster than the other. Numba's parallel=True is definitely easier to set up, so in many cases it is low effort to simply try it. A good measure of whether parallel is working well is the speed-up factor: for example, if you get a 4x speed-up on 4 cores, you won't do better with another approach; if you get 1.5x, it's worth looking at the algorithm or trying something else.
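A minimal way to measure that speed-up factor (a sketch only; the sum-of-squares kernel is just a stand-in for the real workload):

```python
import time

import numpy as np
from numba import njit, prange, get_num_threads


@njit
def sum_of_squares_serial(arr):
    total = 0.0
    for i in range(arr.shape[0]):
        total += arr[i] * arr[i]
    return total


@njit(parallel=True)
def sum_of_squares_parallel(arr):
    total = 0.0
    for i in prange(arr.shape[0]):
        total += arr[i] * arr[i]
    return total


data = np.random.rand(50_000_000)

# Warm-up calls so compilation time is not included in the measurement.
sum_of_squares_serial(data)
sum_of_squares_parallel(data)

t0 = time.perf_counter()
sum_of_squares_serial(data)
t1 = time.perf_counter()
sum_of_squares_parallel(data)
t2 = time.perf_counter()

print(f"{get_num_threads()} threads, speed-up: {(t1 - t0) / (t2 - t1):.1f}x")
```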

1 Like

Profile, benchmark, try multiple solutions. Also, consider cuda.jit.
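For completeness, the cuda.jit route for the same kind of element-wise work looks roughly like this (a sketch only; it assumes a CUDA-capable GPU with a working CUDA setup for Numba, and the kernel is just an illustrative squaring operation):

```python
import numpy as np
from numba import cuda


@cuda.jit
def square_kernel(arr, out):
    i = cuda.grid(1)          # absolute index of this thread
    if i < arr.shape[0]:      # guard against threads past the end of the array
        out[i] = arr[i] * arr[i]


data = np.random.rand(10_000_000)
out = np.empty_like(data)

threads_per_block = 256
blocks = (data.size + threads_per_block - 1) // threads_per_block
# NumPy arrays passed directly are transferred to and from the device automatically.
square_kernel[blocks, threads_per_block](data, out)
```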