Dijkstra on grid

Hi, i am using numba to accelerate a dijkstra graph search on 2d grid.
The code is pretty slow (~1s for 1000x1000 image). Is there something obvious i could do to speed this up?

import heapq
import numpy as np
from numba import jit, njit
from numba.typed import List
    
    
@jit
def dijkstra_run(matrix, costs, priority_queue, visited, transitions, dxy):
    x_max, y_max = matrix.shape
    side = np.sqrt(x_max**2 + y_max**2)
    
    while priority_queue:
        cur_cost, (cur_x, cur_y) = heapq.heappop(priority_queue)
        
        if visited[cur_x, cur_y]:
            continue
        
        for dx, dy in dxy:
            x, y = cur_x + dx, cur_y + dy
            if x < 0 or x >= x_max or y < 0 or y >= y_max:
                continue
            if ~visited[x,y]:
                if matrix[x,y] + costs[cur_x, cur_y] < costs[x,y]:       
                    costs[x,y] = matrix[x,y] + costs[cur_x,cur_y]
                    #distance = np.sqrt( (x_max-x)**2 + (y_max-y)**2 ) # activate to get Astar
                    heuristic = costs[x,y]
                    heapq.heappush(priority_queue, (heuristic, (x, y)))
                    transitions[x,y,0] = cur_x
                    transitions[x,y,1] = cur_y

        visited[cur_x,cur_y] = 1
        
    # retrieve the path
    cur_x, cur_y = x_max - 1, y_max - 1
    on_path = List()
    on_path.append((cur_x,cur_y))
    while (cur_x, cur_y) != (0, 0):
        cur_x, cur_y = transitions[(cur_x, cur_y)]
        on_path.append((cur_x,cur_y))
    return on_path


def dijkstra(matrix, diag=False):
    x_max, y_max = matrix.shape
    dxy = List()
    dxy.append((0,1))
    dxy.append((1,0))
    if diag:
        dxy.append((1,1))

    costs = np.full_like(matrix, 1.0e10)
    costs[0,0] = matrix[0,0]
    visited = np.zeros(matrix.shape, dtype=bool)
    transitions = np.zeros((x_max, y_max, 2), dtype=np.int32)
  
    priority_queue = List()
    priority_queue.append((matrix[0,0], (0,0)))

    return dijkstra_run(matrix, costs, priority_queue, visited, transitions, dxy)

# TEST
dijkstra(np.random.randn(1000,1000))

hi @etienne87 , I measured the runtime with and without jit, and found a 86x improvement. From 17.3s to 0.2s. That is a very good improvement for a jit method.
Why do you think the runtime is too slow? have you compared it to another library or language?

Luk

Hi Luk!
Thanks for taking the time to time it. You are right that it accelerates very well compared to pure python!

I was comparing to GitHub - seung-lab/dijkstra3d: Dijkstra's Shortest Path for 6, 18, and 26-Connected 3D (Volumetric) Image Volumes (but obviously this api is highly optimized so it is perhaps not very fair).
I was just checking because i know the community here is often giving good advices ^^

I didn’t see anything obviously slow, so my best guess is that popping and pushing from a list is not the most efficient way to do it. I’m not sure, but you might want to research which data structure is the fastest for those operations, specially the push since it takes place inside the for loop.

Looking at the repo that you mention, I wonder if you are using the same algorithm. The code there looks more convoluted. Is it possible they implemented some optimizations that are not related to the language but to the design?

I also see that this project provides a cython interface. you can use cython functions from numba, here: High-level extension API — Numba 0.57.0.dev0+175.g24514ab33.dirty-py3.7-linux-x86_64.egg documentation
That would allow you to use the efficient algorithm from that repo, into any njit function you need for your project.

Hope this helps,
Luk

You can also njit your Dijkstra function, you’ll only need to change the dtype of visited to np.bool_.

That allows initializing the queue from a jitted function, and when that’s done you can create your priority_queue simply as [] instead of using List(). For me that makes a big difference, about a 8x improvement for a 1000x1000, and larger for bigger grids (on a slow laptop).

I would also remove the tuple from the queue, and make it simply 3 numbers. Although that doesn’t seem to make a big difference on my machine. In terms of data structures it should be simpler for Numba (I guess?).

Make it:
[float, int, int]

Instead of:
[float, tuple(int, int)]

You can also be more explicit about typing, for example change the algorithm to be in float32 if that’s sufficient for the application.

priority_queue = [(np.float32(x), np.int(x), np.int(x)) for x in range(0)]

This is a nice publication about algorithms for the Fast Marching Method:
GĂłmez et al (2018) Fast Methods for Eikonal Equations: An Experimental Survey

Dijkstra on a regular grid is similar, but instead of solving the Eikonal equation, you just take to closest neighbor. What you did is more or less algorithm 4 in the paper, “Simplified Fast Marching Method”.

From your example it looks like you have a very specialized case (eg going from corner to corner et). But it could easily be made more general. Allowing for “walls”, custom initialization with any amount of starting cells, different connectivity for the neighbors etc. And perhaps also different distance measures, like Dijkstra (euclidean), FMM.

2 Likes

Thanks a lot @Rutger ! Indeed It gives a very nice 8x improvement!
To be honest I had no idea of the connection between Eikonal equation and Dijkstra, I will read into this. You are also right that i could make this much more generic (2d/3d grid, more connectivity/ distance measures/ a* heuristic…).