DFS implementation with Numba

Hi,

I know that Numba is not meant to it, but I wanted to try it.
For a Graph with 4 vertices:
Python Execution time: DFS Finished in 4.5497 MilliSeconds.
For Numba it will be : DFS Finished in 1768.8087 MilliSeconds.

I tried with 10 vertices, but Numba raise a exception.
In attempt to see what is the cause of it, I profile it with cProfile.
It showed that using only python 28 functions was called.
while with Numba 1155632 functions was called, a lot of them is Numba related.

Part of the code is here: NumbaDFS - Pastebin.com
The Project (including the profile data): GitHub - LiorA1/PythonNumba: An attempt to implement DFS using Numba

Can I correct it in some way ?
Thanks!

Edit: I know that Numba purpose is not for it, I didnt delete because it might give some insights…

A couple of points / questions:

  • Are you timing the compilation time in your code as well? You may want to check A ~5 minute guide to Numba — Numba 0.59.0dev0+179.ga4664180.dirty documentation
  • The graph in your repository for measuring performance seems very small (4 nodes, 6 edges) - even without measuring compile time, the overhead of calling a Numba function will probably dominate the execution time. I think you will need to benchmark with larger graphs (I’m not sure exactly how much but maybe thousands or tens of thousands of nodes would be needed).

I know that Numba purpose is not for it, I didnt delete because it might give some insights

I think that Numba can actually work very well with this use case, and I’d be surprised if it’s not possible to go a lot faster with a Numba DFS (perhaps close to the equivalent written e.g. in C) than with one implemented in Python.

I start the measurement when the wrapper function is called. (It is the one that called the jittet function) and end the measurement when the wrapper function return.
So, include the compile time.

I tried a bigger graph (10 Nodes) but the program crashed.
Used cProfile to try and understand and saw a huge overhead.
Tried to understand the origins, and think it is the use of “set”, “list” and the complexed while…if…for…if… structure.

Dont really sure cProfile is accurate with Numba, but it shed some light.

You are explicitly forcing re-compilation the _dfs function every time you call the method iterative_dfs_numba_wrapper. That makes no sense to do, and is why your code is so slow!

Simply decorate the _dfs function (or create a compiled version of that function, outside your class that you call within your class). When I do that, after the first iteration, your example in the pastebin shows:

DFS Finished in 0.06662 MilliSeconds

2 Likes

Yeah this fixed it, but encounter a different issue when tried it with a larger graph…
Lets say:

edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3), (3, 3), (1, 4), (1, 5), (4, 8), (8, 9), (5, 7), (3, 6)]
graph = Graph(edges, 10)

the traceback is:

Traceback (most recent call last):
  File "C:/Users/LiorAsis/projects/githubProjects/PythonNumba/main.py", line 260, in <module>
    graphs_10()
  File "C:/Users/LiorAsis/projects/githubProjects/PythonNumba/main.py", line 68, in graphs_10
    graph_numba = GraphNumba(edges, 10)
  File "C:\Users\LiorAsis\projects\githubProjects\PythonNumba\Graph.py", line 120, in __init__
    self.adjList2.append(inner)
  File "C:\Users\LiorAsis\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\typed\typedlist.py", line 344, in append
    _append(self, item)
  File "C:\Users\LiorAsis\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\core\dispatcher.py", line 482, in _compile_for_args
    error_rewrite(e, 'typing')
  File "C:\Users\LiorAsis\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\core\dispatcher.py", line 423, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
- Resolution failure for literal arguments:
No implementation of function Function(<function impl_append at 0x0000017E59B4C3A0>) found for signature:

 >>> impl_append(ListType[array(int32, 1d, C)], array(float64, 1d, C))

There are 2 candidate implementations:
  - Of which 2 did not match due to:
  Overload in function 'impl_append': File: numba\typed\listobject.py: Line 592.
    With argument(s): '(ListType[array(int32, 1d, C)], array(float64, 1d, C))':
   Rejected as the implementation raised a specific error:
     AssertionError: Failed in nopython mode pipeline (step: native lowering)

  raised from C:\Users\LiorAsis\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\np\arrayobj.py:5283

- Resolution failure for non-literal arguments:
None

During: resolving callee type: BoundFunction((<class 'numba.core.types.containers.ListType'>, 'append') for ListType[array(int32, 1d, C)])
During: typing of call at C:\Users\LiorAsis\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\typed\typedlist.py (83)


File "..\..\..\anaconda3\envs\CPyConcurrent\lib\site-packages\numba\typed\typedlist.py", line 83:
def _append(l, item):
    l.append(item)
    ^


Process finished with exit code 1

Will try to debug it alone in the next few days.

import timeit

from numba import jit
from numba.typed import List


@jit
def _dfs(adj_list, i_source: int):

    visited = set()
    stack = List()

    stack.append(i_source)

    current = 0
    while len(stack):
        current = stack.pop()

        if current not in visited:
            visited.add(current)

            for v in adj_list[current]:
                if v not in visited:
                    stack.append(v)

class GraphNumba:

    def __init__(self, edges, num_of_vertices: int):
        self.level = 0
        
        # I do this to force types to be discovered here on inner and outer
        # lists; possible there's a better way!
        self.adj_list = List([List([0]) for _ in range(num_of_vertices)])
        [inner_list.pop() for inner_list in self.adj_list]

        for (src, dest) in edges:
            self.adj_list[src].append(dest)

    def iterative_dfs_numba_wrapper(self, i_source: int):
        _dfs(self.adj_list, i_source)


#
# 4 Edges
#
edges = List([(0, 1), (0, 2), (1, 2), (2, 0), (2, 3), (3, 3)])

# Do it once to force compilation
graph_numba = GraphNumba(edges, num_of_vertices=max(max(edges)) + 1)
graph_numba.iterative_dfs_numba_wrapper(2)

# Now time it
graph_numba = GraphNumba(edges, num_of_vertices=max(max(edges)) + 1)
number = 1
repeat = 1
min_time = min(timeit.Timer("graph_numba.iterative_dfs_numba_wrapper(2)", globals=globals()).repeat(repeat=repeat, number=number)) / number
print(f" 4 edges: {min_time:.3e} s")


#
# 10 edges
#
edges = List([(0, 1), (0, 2), (1, 2), (2, 0), (2, 3), (3, 3), (1, 4), (1, 5), (4, 8), (8, 9), (5, 7), (3, 6)])
graph_numba = GraphNumba(edges, num_of_vertices=max(max(edges)) + 1)
min_time = min(timeit.Timer("graph_numba.iterative_dfs_numba_wrapper(2)", globals=globals()).repeat(repeat=repeat, number=number)) / number
print(f"10 edges: {min_time:.3e} s")

Result:

 4 edges: 6.597e-06 s
10 edges: 7.182e-06 s
2 Likes