As usual, thanks very much for making Numba. It is a fantastic tool!
I have a Python package that uses Numba Jit on some of its functions. The functions are implemented so that they can also run in parallel, simply by switching on parallel mode when decorating them with Numba Jit. For now I have switched parallel mode off, because I don't know how well it is supported on different operating systems. I use Linux myself and have no way to test it on Windows or Mac.
We have previously discussed switching between parallel and serial mode (see post #1125 as I am not allowed to write a link here), where the conclusion was that it could be done simply by setting the number of parallel threads with Numba's function set_num_threads. A test showed that there was no performance penalty when doing this with only one thread, compared to using Jit in serial mode. But that was only tested on Linux.
These are my questions now:
If I decorate a function in my Python package with Numba Jit in parallel mode, can I expect it to work flawlessly on all operating systems: Linux, Windows and Mac?
Is there significant overhead in parallel mode on some operating systems, e.g. on Windows? If so, will that overhead presumably still be there when the number of threads is set to 1?