How could i best optimize this with numba

Hi, I’m new to numba and would like to know how I could speed up my code using Numba.
All my currents attempts have resulted in slower execution times so I don’t know what to do next.
Can someone help me with this?

from itertools import product
import numpy as np
def create_coefficients(max_range, functions):
    for i in range(2, max_range):
        functions.extend(list(product([1, -1], repeat=i + 1)))


def split_complex(functions, res1, res2):
    for func in functions[0:(len(functions) // 2)]:
        res = np.roots(func)
        res1.extend([np.real(ele) for ele in res])
        res2.extend([np.imag(ele) for ele in res])


if __name__=="__main__":
    functions = []
    create_coefficients(15, functions)
    res1 = []
    res2 = []
    split_complex(functions, res1, res2)

@SeppeWouters thank you for posting this. It seems like the post is quite hard to parse so it may be worthwhile to lookup how to format stuff on Discourse. Also, what kind of execution times are you seeing and how are you measuring these?

Currently I’m hitting 3.6 s without any numba, I’m measuring this with timeit.

@SeppeWouters, thank you for updating this example. I tried to run the code locally, but I ran into the following issue:

zsh» python foo
Traceback (most recent call last):
  File "foo", line 15, in <module>
    create_coefficients(15, functions)
  File "foo", line 3, in create_coefficients
    functions.extend(list(product([1, -1], repeat=i + 1)))
NameError: name 'product' is not defined

Did you perhaps forget an import statement when copying the code to discourse?

Sorry, i forgot the import statements, I use product from itertools.

Ah yes, that makes sense. And is it the split_complex function you are trying to accelerate?

If possible both, but I think that it will be hard to improve create_coefficients due to the use of product.

You could re-implement product in pure Python and then @njit that.

I tried to reproduce the slowdown you get with Numba, but this is as far as I got:

In [1]: from itertools import product
   ...: import numpy as np
   ...: def create_coefficients(max_range, functions):
   ...:     for i in range(2, max_range):
   ...:         functions.extend(list(product([1, -1], repeat=i + 1)))
   ...:
   ...:
   ...: def split_complex(functions, res1, res2):
   ...:     for func in functions[0:(len(functions) // 2)]:
   ...:         res = np.roots(func)
   ...:         res1.extend([np.real(ele) for ele in res])
   ...:         res2.extend([np.imag(ele) for ele in res])
   ...:

In [2]:     functions = []
   ...:     create_coefficients(15, functions)

In [3]: from numba import njit

In [4]: nsc = njit(split_complex)

In [5]: %timeit res1 = []; res2 = []; split_complex(functions, res1, res2)
3.48 s ± 253 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit res1 = []; res2 = []; nsc(functions, res1, res2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-48f9ca6fedd2> in <module>
----> 1 get_ipython().run_line_magic('timeit', 'res1 = []; res2 = []; nsc(functions, res1, res2)')

~/miniconda3/envs/numba_3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2324                 kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2325             with self.builtin_trap:
-> 2326                 result = fn(*args, **kwargs)
   2327             return result
   2328

~/miniconda3/envs/numba_3.8/lib/python3.8/site-packages/decorator.py in fun(*args, **kw)
    229             if not kwsyntax:
    230                 args, kw = fix(args, kw, sig)
--> 231             return caller(func, *(extras + args), **kw)
    232     fun.__name__ = func.__name__
    233     fun.__doc__ = func.__doc__

~/miniconda3/envs/numba_3.8/lib/python3.8/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188
    189         if callable(arg):

~/miniconda3/envs/numba_3.8/lib/python3.8/site-packages/IPython/core/magics/execution.py in timeit(self, line, cell, local_ns)
   1167             for index in range(0, 10):
   1168                 number = 10 ** index
-> 1169                 time_number = timer.timeit(number)
   1170                 if time_number >= 0.2:
   1171                     break

~/miniconda3/envs/numba_3.8/lib/python3.8/site-packages/IPython/core/magics/execution.py in timeit(self, number)
    167         gc.disable()
    168         try:
--> 169             timing = self.inner(it, self.timer)
    170         finally:
    171             if gcold:

<magic-timeit> in inner(_it, _timer)

ValueError: cannot compute fingerprint of empty list

So, this problem is known, because Numba can’t deal with empty lists as arguments as it simply is unable to infer their data type (they don’t exactly have one yet). So I’m not sure how you were able to benchmark the code from the snippet above.

Numba does have it’s own (two actually) list implementation. The more modern one is the numba.typed.List which allows you to specify a datatype, even for an empty list, so that it can be handed into functions. It is also more performant for certain types of use-cases.

https://numba.readthedocs.io/en/stable/reference/pysupported.html#typed-list

It may be worth exploring if you could leverage that to get speedups.