Pandas dataframes with numba


Sorry if double post.

Has anybody managed to speed up DataFrame computations with Numba?

I have a forecasting job that uses DataFrames and takes about 10 hours to run for the most intensive customer (one with more than 1000 products).

I have a for loop for model selection (evaluating 5 models using cross-validation on historic data) which takes up 95% of this time. Can anybody tell me how I could use Numba or Cython to speed things up? I think caching the functions would be the game changer here, as the for loop repeats the same computation over and over again.
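
To make the question concrete, here is a minimal sketch of the pattern I have in mind. Numba's nopython mode does not accept pandas DataFrames, so the idea is to pull the column out as a NumPy array and compile only the numeric inner loop. The function name, the naive moving-average "model", and the `sales` column mentioned in the comment are all hypothetical placeholders, not my real code. Note that `njit(cache=True)` stores the compiled machine code on disk between runs; it does not memoize results, so it would not by itself skip repeated identical computations.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) where Numba is not installed.
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func


# When this lives in a module file, @njit(cache=True) can be used instead
# to persist the compiled code across interpreter sessions.
@njit
def moving_average_mae(values, window):
    """Mean absolute error of a naive moving-average forecast.

    Hypothetical stand-in for one of the models being cross-validated.
    """
    n = values.shape[0]
    total = 0.0
    count = 0
    for t in range(window, n):
        s = 0.0
        for k in range(t - window, t):
            s += values[k]
        total += abs(values[t] - s / window)
        count += 1
    return total / count if count > 0 else np.nan


# With a DataFrame, the array would come from something like
# df["sales"].to_numpy(dtype=np.float64)  (hypothetical column name).
history = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = moving_average_mae(history, 2)
```

The first call pays the compilation cost, so any gain would only show up when the same compiled function is called many times, as in a loop over products and models.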

Any help is welcome and golden for me right now!


Hi @tulbureandreit, please look into cuDF from NVIDIA RAPIDS. cuDF is an [almost] drop-in replacement for the pandas DataFrame that runs on the GPU. We've seen massive performance improvements with cuDF.

Here is the link for your convenience:
