Hello, I have an njit’ed function that processes one daily file at a time, and I would like to know the best way to run it in parallel across multiple days. Some candidates are: (1) Parallel from Joblib, (2) Numba's own parallelism (parallel=True with prange), or (3) Dask. The code looks like this:
import numpy as np
import pandas as pd
from numba import njit

@njit(cache=True)
def run_calc_one_day(col_1: np.ndarray, col_2: np.ndarray):
    """ loop logic"""
    return stats

def run_days(dates: list[str]):
    stat_container = []
    for day in dates:
        df = pd.read_parquet(f"date={day}")
        # Pass plain NumPy arrays; Numba cannot type pandas Series
        stats = run_calc_one_day(df["col_1"].to_numpy(), df["col_2"].to_numpy())
        stat_container.append({day: stats})
    return stat_container

run_days(["2024-01-01", "2024-01-02"])  # ...a few hundred days
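To make the question concrete: each day is independent, so what I have in mind is a plain parallel map over the dates. Below is a minimal sketch of that pattern using the standard library's concurrent.futures as a stand-in (it is not Joblib, Numba, or Dask themselves). The names process_one_day and run_days_parallel are hypothetical, and the per-day body is a placeholder instead of the real parquet read and njit call, so the sketch runs without any data files.

```python
from concurrent.futures import ThreadPoolExecutor


def process_one_day(day: str) -> dict:
    """Hypothetical stand-in for: read one day's file, then call run_calc_one_day."""
    # Placeholder computation so the sketch runs without parquet files or Numba.
    stats = len(day)
    return {day: stats}


def run_days_parallel(dates: list[str], max_workers: int = 4) -> list[dict]:
    # executor.map preserves input order, so results line up with `dates`.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(process_one_day, dates))


results = run_days_parallel(["2024-01-01", "2024-01-02"])
```

As far as I understand, threads would only speed up the compiled part if the function is compiled with nogil=True so that it releases the GIL. Otherwise a process-based approach (which is what I assume Joblib or Dask would give me) seems necessary.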
Many thanks!