Parallelise large loops inside a jitted function


I am iterating over a large list with more than 1 million elements inside a jitted function. My system has 128 CPU cores and I want to make use of them: how can I parallelise this loop to achieve greater efficiency? Currently it looks like only one core is being used to iterate through the list. Would using prange do this, or would I have to create my own multiprocessing Pool?

Any help would be appreciated.

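prange is a good way to do this because it manages the threads for you, so you do not need your own multiprocessing Pool. Note that prange only runs in parallel when the function is compiled with parallel=True; otherwise it behaves like a plain range. Things that you need to consider are: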

  • Is the loop body data parallel, i.e. is each iteration independent of the others? If yes, prange will be easy. If not, you might need to partition the work manually.
  • Beware of race conditions, because prange will not automatically lock your containers. Since you mention using a list, make sure the list is not mutated by multiple threads at the same time.