Dictionary performance

When running the following script, I’m getting significant performance differences between plain Python and Numba that I have difficulty explaining. Does anybody have any idea why Numba is 100 times slower than Python here?

import numba
import numpy as np
from numba.typed import List
from numba import types

@numba.njit
def test_nb(names, search):
    idx = {
        nm: i
        for i, nm in enumerate(names)
    }
   
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

def test_py(names, search):
    idx = {
        nm: i
        for i, nm in enumerate(names)
    }
   
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

dt = np.dtype('U128')
search_np = np.array([f"{i}" for i in range(0, 2000, 100)], dtype=dt)
search = [f"{i}" for i in range(0, 2000, 100)]


# typed List with an explicit item type of 128-character unicode sequences
lst = List(lsttype=types.ListType(numba.from_dtype(dt)))
names = [f"{i}" for i in range(1000)]
for nm in names:
    lst.append(nm)

# warmup
test_nb(lst, search_np)

import time
t1 = time.time()
test_nb(lst, search_np)
t2 = time.time()
print(f"exectime numba: {(t2 - t1) * 1000}ms")
t1 = time.time()
test_py(names, search)
t2 = time.time()
print(f"exectime python: {(t2 - t1) * 1000}ms")
print(f"numba version = {numba.__version__}")

On my system this gives:

exectime numba: 45.153141021728516ms
exectime python: 0.24938583374023438ms
numba version = 0.56.3

Hi @jolos

In the case of Numba, almost all the time is spent creating idx. This is probably a direct result of dt = np.dtype('U128') being very large. I am not sure why this takes so much extra time: hashing (which the dictionary relies on) is usually not very sensitive to input size, and allocating memory for the dictionary should account for orders of magnitude less time than what you observe.
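One way to confirm where the time goes (a minimal sketch, reusing the lst and search_np from your script; the helper names build_idx and count_hits are my own) is to split your function so that dictionary construction and membership testing can be timed separately:

import time
import numba

@numba.njit
def build_idx(names):
    # only the dictionary construction
    return {nm: i for i, nm in enumerate(names)}

@numba.njit
def count_hits(idx, search):
    # only the membership tests
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

idx = build_idx(lst)        # warmup/compile
count_hits(idx, search_np)  # warmup/compile

t1 = time.time()
build_idx(lst)
t2 = time.time()
print(f"build only: {(t2 - t1) * 1000}ms")

t1 = time.time()
count_hits(idx, search_np)
t2 = time.time()
print(f"lookups only: {(t2 - t1) * 1000}ms")

If the first number dominates, constructing the typed Dict is indeed the bottleneck rather than the lookups.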
You can easily verify that the data type matters a lot. Modify the following parts of your example:

search_np = np.array([f"{i}" for i in range(0, 2000, 100)])
search = [f"{i}" for i in range(0, 2000, 100)]


lst = List(lsttype=types.ListType(numba.from_dtype(search_np.dtype)))
names = [f"{i}" for i in range(1000)]
for nm in names:
    lst.append(nm)

Now the types are handled automatically and you will see that NumPy chooses dtype('<U4'). The discrepancy between Numba and pure Python is reduced to a factor of two. Why Numba is still slower, I can’t say. It might be that the typed Dict, or iterating over a typed List, is simply not fully optimized yet.
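If you want to see the dtype effect in isolation, a sketch like the following times only the jitted dictionary construction for both key widths (build_only is a name I made up; the warmup call compiles one specialization per key type):

import time
import numba
import numpy as np

@numba.njit
def build_only(names):
    # construct the dictionary and nothing else
    return {nm: i for i, nm in enumerate(names)}

for width in ("U4", "U128"):
    arr = np.array([f"{i}" for i in range(1000)], dtype=width)
    build_only(arr)  # warmup: compile a specialization for this key type
    t1 = time.time()
    build_only(arr)
    t2 = time.time()
    print(f"{width} keys: {(t2 - t1) * 1000}ms")

If the key width is the culprit, I would expect the U128 run to be dramatically slower even though both dictionaries hold the same 1000 entries.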

I’m sorry I can’t give you a full explanation, but I hope it helps you anyway.

Thanks for looking at this @sschaer. It doesn’t directly solve my problem, but it at least confirms that I’m not doing something obviously wrong 🙂