How to convert a non numba dictionary to a nb.typed.Dict?

Hi,

Using Numba 0.54, I would like to convert a normal Python dictionary to a Numba-compatible dictionary.

My code is the following:

import time

import numba as nb
# @nb.jit(nopython=True, cache=True)
def getNumbaDictFromDict(myDict):
    returnDict = nb.typed.Dict.empty(
        key_type=nb.types.unicode_type,
        value_type=nb.types.float64, )
    for key, value in myDict.items():
        returnDict[key] = value
    return returnDict

myDict = {'a': 14., 'b': 15., 'c': 16.}
my_time = time.time()
abc = getNumbaDictFromDict(myDict)
print('time:', time.time() - my_time)

print("abc=", abc)

It takes roughly 2 seconds to execute. If I enable the commented-out @nb.jit decorator, I get an error message:

This error may have been caused by the following argument(s):
- argument 0: Cannot determine Numba type of <class 'dict'>

Is there anything I can do to avoid the 2 seconds I have to wait for every execution?

Thanks in advance!


Hi @alatif-alatif,
is your goal to use a typed.Dict in regular (non-jitted) Python code, or will you pass returnDict to another jitted function later?

Hi @luk-f-a

I want to use returnDict (or rather abc, the value it returns) later on in another jitted function (@nb.jit(nopython=True, cache=True)).

Thanks, I understand. The dictionary can be filled in jitted code when the source is an array, list, tuple, or anything else that can be passed into a jitted function. Python dictionaries cannot be passed to a jitted function, so filling the typed.Dict from a Python dict can only be done in a normal function (that is, without the decorator, as you found out).
The 2 seconds you mention come from compiling the typed.Dict methods for your key and value types. You pay that cost only on the first call within a Python session, not on every call. Look at the example below:

import time
import numba as nb
# @nb.jit(nopython=True, cache=True)
def getNumbaDictFromDict(myDict):
    returnDict = nb.typed.Dict.empty(
        key_type=nb.types.unicode_type,
        value_type=nb.types.float64, )
    for key, value in myDict.items():
        returnDict[key] = value
    return returnDict
myDict = {'a': 14., 'b': 15., 'c': 16.}
# running first time
my_time = time.time()
abc = getNumbaDictFromDict(myDict)
print('time:', time.time() - my_time)
# running a second time
my_time = time.time()
abc = getNumbaDictFromDict(myDict)
print('time:', time.time() - my_time)

Output:

time: 2.0478579998016357
time: 0.0002455711364746094

hope this helps,
Luk


Hi @luk-f-a ,

My issue comes up during development.

There it is a bit annoying to wait an extra two seconds on every run before I can test the code I am actually working on.

With the decorator @nb.jit(nopython=True, cache=True) I can usually avoid waiting times like this.

It would be convenient if that were possible here too.

I think this open issue is the problem you are facing: numba/numba issue #5713, "List() and Dict() always recompile _getitem, _setitem, _length, etc. every time; maybe should cache?"

Now that Numba 0.54 supports dictionary comprehensions, you can also write something like this:

import numba as nb

@nb.njit
def create_dict(items):
    # items is a tuple of (key, value) pairs
    return {k: v for k, v in items}

py_dict = {'a': 14., 'b': 15., 'c': 16.}
nb_dict = create_dict(tuple(py_dict.items()))

It’s not identical of course, because the key and value datatypes are inferred rather than specified. That could be a pro or a con.

nb_dict being:

DictType[unicode_type,float64]<iv=None>({a: 14.0, b: 15.0, c: 16.0})

That’s a nice solution. I’m curious whether it is cached, which was one of the requirements.

Yes, it seems to be cached. At least it is now faster than my initial setup, and acceptable for me.

Thanks all for your good input!