DFS implementation with Numba


I know that Numba is not meant to it, but I wanted to try it.
For a Graph with 4 vertices:
Python Execution time: DFS Finished in 4.5497 MilliSeconds.
For Numba it will be : DFS Finished in 1768.8087 MilliSeconds.

I tried with 10 vertices, but Numba raise a exception.
In attempt to see what is the cause of it, I profile it with cProfile.
It showed that using only python 28 functions was called.
while with Numba 1155632 functions was called, a lot of them is Numba related.

Part of the code is here: NumbaDFS - Pastebin.com
The Project (including the profile data): GitHub - LiorA1/PythonNumba: An attempt to implement DFS using Numba

Can I correct it in some way ?

Edit: I know that Numba purpose is not for it, I didnt delete because it might give some insights…

A couple of points / questions:

  • Are you timing the compilation time in your code as well? You may want to check A ~5 minute guide to Numba — Numba 0.55.0.dev0+438.g3131959c9.dirty-py3.7-linux-x86_64.egg documentation
  • The graph in your repository for measuring performance seems very small (4 nodes, 6 edges) - even without measuring compile time, the overhead of calling a Numba function will probably dominate the execution time. I think you will need to benchmark with larger graphs (I’m not sure exactly how much but maybe thousands or tens of thousands of nodes would be needed).

I know that Numba purpose is not for it, I didnt delete because it might give some insights

I think that Numba can actually work very well with this use case, and I’d be surprised if it’s not possible to go a lot faster with a Numba DFS (perhaps close to the equivalent written e.g. in C) than with one implemented in Python.

I start the measurement when the wrapper function is called. (It is the one that called the jittet function) and end the measurement when the wrapper function return.
So, include the compile time.

I tried a bigger graph (10 Nodes) but the program crashed.
Used cProfile to try and understand and saw a huge overhead.
Tried to understand the origins, and think it is the use of “set”, “list” and the complexed while…if…for…if… structure.

Dont really sure cProfile is accurate with Numba, but it shed some light.