Approach to parallel loops

j7zAhU · March 18, 2024, 3:02am

Hello, I have an njit'ed function that processes one daily file at a time, and I would like to know the best way to run it in parallel over multiple days. The candidates I am considering are: (1) Parallel from Joblib, (2) Numba's own parallelism (parallel=True with prange), or (3) Dask. The code looks like this:

import numpy as np
import pandas as pd
from numba import njit

@njit(cache=True)
def run_calc_one_day(col_1: np.ndarray, col_2: np.ndarray):
   """ loop logic"""
   return stats

def run_days(dates: list[str]):
   stat_container = []
   for day in dates:
      df = pd.read_parquet(f"date={day}")
      # Pass plain NumPy arrays; Numba cannot type pandas Series
      stats = run_calc_one_day(df["col_1"].to_numpy(), df["col_2"].to_numpy())
      stat_container.append({day: stats})
   return stat_container

run_days(["2024-01-01", "2024-01-02"])  # ...a few hundred days
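
To make the question concrete, here is a rough sketch of the per-day fan-out I have in mind for option (1). It uses the standard library's concurrent.futures as a stand-in for Joblib so that it runs without extra dependencies. process_day and run_days_parallel are hypothetical names, and the body of process_day is a dummy placeholder for "read the parquet file for that day and call run_calc_one_day".

```python
from concurrent.futures import ThreadPoolExecutor


def process_day(day: str) -> dict:
    # Hypothetical placeholder: the real version would read the parquet
    # file for `day` and pass its columns to run_calc_one_day.
    stats = len(day)  # dummy value standing in for the real stats
    return {day: stats}


def run_days_parallel(dates: list[str], max_workers: int = 4) -> list[dict]:
    # One task per day; pool.map returns results in the order of `dates`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_day, dates))


if __name__ == "__main__":
    print(run_days_parallel(["2024-01-01", "2024-01-02"]))
```

As far as I understand, threads like these would only speed up the compiled part if the function is compiled with nogil=True; otherwise a process-based pool would be needed.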

Many thanks!
