Dictionary performance

When running the following script, I’m getting significant performance differences between plain Python and Numba that I have difficulty explaining. Does anybody have any idea why Numba is 100 times slower than Python here?

import numba
import numpy as np
from numba.typed import List
from numba import types

@numba.njit
def test_nb(names, search):
    idx = {
        nm: i
        for i, nm in enumerate(names)
    }
   
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

def test_py(names, search):
    idx = {
        nm: i
        for i, nm in enumerate(names)
    }
   
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

dt = np.dtype('U128')
search_np = np.array([f"{i}" for i in range(0, 2000, 100)], dtype=dt)
search = [f"{i}" for i in range(0, 2000, 100)]


# typed List with an explicit item type of 128-character unicode sequences
lst = List(lsttype=types.ListType(numba.from_dtype(dt)))
names = [f"{i}" for i in range(1000)]
for nm in names:
    lst.append(nm)

# warmup
test_nb(lst, search_np)

import time
t1 = time.time()
test_nb(lst, search_np)
t2 = time.time()
print(f"exectime numba: {(t2 - t1) * 1000}ms")
t1 = time.time()
test_py(names, search)
t2 = time.time()
print(f"exectime python: {(t2 - t1) * 1000}ms")
print(f"numba version = {numba.__version__}")

On my system this gives:

exectime numba: 45.153141021728516ms
exectime python: 0.24938583374023438ms
numba version = 0.56.3

Hi @jolos

In the case of Numba, almost all the time is spent creating idx. This is probably a direct result of dt = np.dtype('U128') being very large. I am not sure why this takes so much extra time: hashing (which the dictionary relies on) is usually not very sensitive to input size, and allocating memory for the dictionary should account for orders of magnitude less time than what you observe.
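One way to confirm where the time goes (a minimal sketch, reusing the lst and search_np from your script; the helper names build_idx and count_hits are my own) is to split your function so that dictionary construction and membership testing can be timed separately:

import time
import numba

@numba.njit
def build_idx(names):
    # only the dictionary construction
    return {nm: i for i, nm in enumerate(names)}

@numba.njit
def count_hits(idx, search):
    # only the membership tests
    cnt = 0
    for s in search:
        if s in idx:
            cnt += 1
    return cnt

idx = build_idx(lst)        # warmup/compile
count_hits(idx, search_np)  # warmup/compile

t1 = time.time()
build_idx(lst)
t2 = time.time()
print(f"build only: {(t2 - t1) * 1000}ms")

t1 = time.time()
count_hits(idx, search_np)
t2 = time.time()
print(f"lookups only: {(t2 - t1) * 1000}ms")

If the first number dominates, constructing the typed Dict is indeed the bottleneck rather than the lookups.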
You can easily verify that the data type matters a lot. Modify the following parts of your example:

search_np = np.array([f"{i}" for i in range(0, 2000, 100)])
search = [f"{i}" for i in range(0, 2000, 100)]


lst = List(lsttype=types.ListType(numba.from_dtype(search_np.dtype)))
names = [f"{i}" for i in range(1000)]
for nm in names:
    lst.append(nm)

Now the types are handled automatically and you will see that NumPy chooses dtype('<U4'). The discrepancy between Numba and pure Python is reduced to a factor of two. Why Numba is still slower, I can’t say. It might be that the typed Dict, or iterating over a typed List, is simply not fully optimized yet.
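If you want to see the dtype effect in isolation, a sketch like the following times only the jitted dictionary construction for both key widths (build_only is a name I made up; the warmup call compiles one specialization per key type):

import time
import numba
import numpy as np

@numba.njit
def build_only(names):
    # construct the dictionary and nothing else
    return {nm: i for i, nm in enumerate(names)}

for width in ("U4", "U128"):
    arr = np.array([f"{i}" for i in range(1000)], dtype=width)
    build_only(arr)  # warmup: compile a specialization for this key type
    t1 = time.time()
    build_only(arr)
    t2 = time.time()
    print(f"{width} keys: {(t2 - t1) * 1000}ms")

If the key width is the culprit, I would expect the U128 run to be dramatically slower even though both dictionaries hold the same 1000 entries.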

I’m sorry I can’t give you a full explanation, but I hope it helps you anyway.

Thanks for looking at this @sschaer. It doesn’t directly solve my problem, but it at least confirms that I’m not doing something obviously wrong 🙂