Calling a Numba function on a pandas Series

It seems that if you pass a pandas Series to a CPU-jitted function, typing fails, and it is necessary to call to_numpy() on the Series first. For example, in:

import pandas as pd
from numba import njit


@njit
def add_one(x):
    for i in range(len(x)):
        x[i] += 1


s = pd.Series([1, 2, 3])
print(s)

add_one(s.to_numpy())  # succeeds: the argument is a NumPy array
print(s)

add_one(s)  # fails: the argument is a pandas Series
print(s)

the first call to add_one() succeeds, and the second fails with:

numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type pyobject
During: typing of argument at /home/gmarkall/numbadev/issues/pd-series/repro.py (7)

File "repro.py", line 7:
def add_one(x):
    for i in range(len(x)):
    ^ 

This error may have been caused by the following argument(s):
- argument 0: Cannot determine Numba type of <class 'pandas.core.series.Series'>

Is this expected?

Hi @gmarkall

At present Numba does not support pandas data types, but it does support NumPy’s. The .to_numpy() call exposes the pandas.Series instance as a NumPy array, which is why the first call works. In general, you can check whether Numba supports a type by passing an instance of that type to numba.typeof. Example:

<snip>
In [1]: import pandas as pd

In [2]: a = pd.Series([1, 2, 3])

In [3]: from numba import typeof

In [4]: typeof(a)
<snip>
ValueError: cannot determine Numba type of <class 'pandas.core.series.Series'>

In [5]: type(a.to_numpy())
Out[5]: numpy.ndarray

In [6]: typeof(a.to_numpy())
Out[6]: array(int64, 1d, C)

Hope this helps.

@gmarkall Intel SDC (Scalable Dataframe Compiler) adds support for pandas data types to Numba.