Why does the initialization of jitclass behave very slow?

SHF101202021 · April 22, 2024, 12:36pm

I am trying to define a Numba object with jitclass to speed up my code.
Here is the definition of the object:

from numba import float64
from numba.experimental import jitclass

import numpy as np
from shapely.geometry import Polygon as ShapelyPolygon
from shapely.geometry import Point
import time

@jitclass([('vertices', float64[:, :])])
class NumbaPolygon:
    def __init__(self, vertices):
        self.vertices = vertices

    def contains_point(self, point):
        """To check if the point is inside the polygon"""
        x, y = point
        inside = False
        x_vertices = self.vertices[:, 0]
        y_vertices = self.vertices[:, 1]
        n = self.vertices.shape[0]
        p1x, p1y = x_vertices[0], y_vertices[0]
        for i in range(n + 1):
            p2x, p2y = x_vertices[i % n], y_vertices[i % n]
            if y > min(p1y, p2y):
                if y <= max(p1y, p2y):
                    if x <= max(p1x, p2x):
                        if p1y != p2y:
                            xinters = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x
                        if p1x == p2x or x <= xinters:
                            inside = not inside
            p1x, p1y = p2x, p2y
        return inside

Then I use the following code to benchmark the first use of the object, which includes its compilation:

vertices = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=np.float64)
point = np.array([0.5,0.5], dtype=np.float64)

t1 = time.perf_counter_ns()
numba_poly = NumbaPolygon(vertices)
numba_poly.contains_point(point)
consuming_time = time.perf_counter_ns() - t1
print(f'JITCLASS method takes {consuming_time/1e6:.6f} ms')

t1 = time.perf_counter_ns()
poly = ShapelyPolygon(vertices)
poly.contains(Point(*point))
consuming_time = time.perf_counter_ns() - t1
print(f'Shapely method takes {consuming_time/1e6:.6f} ms')

Here are the results:

JITCLASS method takes 436.783740 ms
Shapely method takes 0.288628 ms

Although the second try does take less time, that is hardly enough to make up for the time spent on the first one.

Second Try:
JITCLASS method takes 0.064460 ms
Shapely method takes 0.511835 ms

Hence, as the title says, I am looking for a way to speed up the initialization of the object.
Thanks in advance!

Hannes · April 23, 2024, 12:36pm

The long warm-up time that you measure on the first iteration is probably not the object initialisation but the code compilation. Numba is a just-in-time compiler, so compilation is triggered the first time you call a function with argument types it has not yet been compiled for. After that, the compiled function is ready for immediate use when called again.

If you repeat the experiment many times, you should see that all subsequent runs are much faster, even if you reinitialise the NumbaPolygon object over and over. I don’t want to install shapely right now, so I cannot put up a proper benchmark from my side; you should of course still validate what I am claiming.

Best of luck!

DannyWeitekamp · April 23, 2024, 10:20pm

Unfortunately functions with jitclasses as arguments do not work with cache=True because the type specification is built on the fly in a way that is inconsistent between calls, so if you use them you’ll need to wait for your functions to recompile every time. Not fun. Structrefs don’t have this limitation, but leave a lot to be desired in terms of usability and documentation. Others have found this thread helpful for making the transition here. There are some other good conversations about structrefs here on discourse if you search for structref. Namedtuples may also cache, if I recall, but are less readily extensible.

SHF101202021 · April 24, 2024, 3:36am

@Hannes
You’re right. Sorry, my wording was imprecise.

My question is why code compilation takes so much time for jitclass.

To the best of my understanding, with njit, Numba needs some time to deduce the data types of the arguments, so giving explicit signatures should save Numba a lot of time on the first compilation.

By analogy with njit, the type specification should help jitclass compile the code quickly, but it doesn’t seem to help.
That is the point that confuses me.

SHF101202021 · April 24, 2024, 3:40am

@DannyWeitekamp

Hi Danny,

Thanks for replying.
I am a little confused about why cache is mentioned here.
I don’t use it in my demo code.

Could you explain that in more detail?
Thanks in advance

DannyWeitekamp · April 24, 2024, 11:14am

Apologies, I got a little bit ahead of myself. The usual trick for dealing with long upfront compile times is to use @njit(cache=True). For instance, your function could be written without jitclass like so:


from numba import njit
import numpy as np
import time

@njit(cache=True)
def contains_point(vertices, point):
    """To check if the point is inside the polygon specified by verticies"""
    x, y = point[0], point[1]
    inside = False
    x_vertices = vertices[:, 0]
    y_vertices = vertices[:, 1]
    n = vertices.shape[0]
    p1x, p1y = x_vertices[0], y_vertices[0]
    for i in range(n + 1):
        p2x, p2y = x_vertices[i % n], y_vertices[i % n]
        if y > min(p1y, p2y):
            if y <= max(p1y, p2y):
                if x <= max(p1x, p2x):
                    if p1y != p2y:
                        xinters = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x
                    if p1x == p2x or x <= xinters:
                        inside = not inside
        p1x, p1y = p2x, p2y
    return inside

vertices = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=np.float64)
point = np.array([0.5,0.5], dtype=np.float64)

t1 = time.perf_counter_ns()
contains_point(vertices, point)
consuming_time = time.perf_counter_ns() - t1
print(f'njit function takes {consuming_time/1e6:.6f} ms')

The first time you run code with njit(cache=True), it will compile, which will take some time. For instance, on my machine:

njit function takes 658.465777 ms

But on the second run the time will be greatly reduced. Numba still needs to initialize itself and load its cached version of the function, but the function itself does not need to recompile the second time you run the script.

njit function takes 183.041554 ms

As I mentioned above, this trick does not work with @jitclass. In general, cache=True does not work with jitclass methods or with functions that take jitclasses. So if you want to use cache=True to reduce the upfront time of recompiling your functions, you might consider just using numpy arrays without bothering with the Polygon class wrapper.
