Using annotations with Numba (gu)vectorize functions

Hey,

I’m having some issues using Numba in combination with Python’s type annotations, specifically the mechanism for attaching additional metadata (on top of a datatype) to an annotation, as specified in PEP 593.

This allows me to annotate the inputs and outputs of my functions with arbitrary custom metadata, which I can then use to automatically build a computation graph from a large set of functions, some available inputs, and some desired outputs. In pure Python this works fine with the tools available in the typing module.
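To make that concrete, here is a minimal sketch of the pure-Python side (the function and the string labels are just placeholders for my actual metadata):

```python
from typing import Annotated, get_type_hints

import numpy as np


def saturation_pressure(
    temperature: Annotated[np.float64, "air_temperature"],
) -> Annotated[np.float64, "saturation_pressure"]:
    """Tetens approximation of saturation vapour pressure (hPa)."""
    return 6.1078 * np.exp(17.27 * temperature / (temperature + 237.3))


# include_extras=True keeps the PEP 593 metadata attached to each hint
hints = get_type_hints(saturation_pressure, include_extras=True)
print(hints["temperature"].__metadata__)  # ('air_temperature',)
print(hints["return"].__metadata__)       # ('saturation_pressure',)
```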

Ideally I would use numba.vectorize for the most part, and perhaps numba.guvectorize when functions have multiple return values. The problem is that neither preserves the type hints.
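For the multiple-return-value case, guvectorize expresses the extra outputs as array arguments that the function fills in. A rough sketch with a made-up example function (not my real code):

```python
import numba
import numpy as np


# outputs are the arguments declared after "->" in the layout string;
# the function writes into them instead of returning values
@numba.guvectorize(
    ["void(float64[:], float64[:], float64[:], float64[:])"],
    "(n),(n)->(n),(n)",
)
def wind_components(speed, direction, u, v):
    for i in range(speed.shape[0]):
        rad = direction[i] * np.pi / 180.0
        u[i] = -speed[i] * np.sin(rad)
        v[i] = -speed[i] * np.cos(rad)


# usage: u, v = wind_components(speed_array, direction_array)
```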

A quick investigation, including potential workarounds like using inspect.getfullargspec and the docstring, resulted in the table shown below. getfullargspec would only provide information on the inputs, not the outputs, so that wouldn’t be great. Using the docstring is more flexible (e.g. numpydoc and friends), but requires encoding and parsing strings, which seems a lot worse than type hints.

| function type | typing.get_type_hints | inspect.getfullargspec | func.__doc__ |
|---------------|-----------------------|------------------------|--------------|
| pure Python   | yes                   | yes                    | yes          |
| njit.py_func  | yes                   | yes                    | yes          |
| njit          | no                    | no                     | yes          |
| guvectorize   | empty dict            | incorrect              | None         |
| vectorize     | no                    | no                     | yes          |

It looks like guvectorize exposes some sort of Python function, since all options “work”, but it’s not the function that’s being decorated.

Since njit keeps the original Python function available at the .py_func attribute, that would work well. However, I would rather have all the goodness of automatic broadcasting and the compilation targets (parallel) that (gu)vectorize provide. All inputs can be scalars, 1D, 2D, 3D arrays, etc., which works seamlessly with vectorize, but is a lot more involved when using njit.
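For completeness, this is roughly what the njit route looks like (a sketch; the function and labels are again just placeholders):

```python
from typing import Annotated, get_type_hints

import numba
import numpy as np


@numba.njit
def relative_humidity(
    e: Annotated[np.float64, "vapour_pressure"],
    es: Annotated[np.float64, "saturation_pressure"],
) -> Annotated[np.float64, "relative_humidity"]:
    return e / es


# the dispatcher keeps the original, undecorated function around,
# with its annotations (and the PEP 593 metadata) intact
hints = get_type_hints(relative_humidity.py_func, include_extras=True)
print(hints["return"].__metadata__)  # ('relative_humidity',)
```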

If I only had a few functions, wrapping them in a separate Python function would be fine. But I’m aiming to use this with perhaps hundreds of functions, the set of functions is quite dynamic, and it also includes functions provided by users. Having to wrap each one would mean a lot of overhead, extra code to manage, and a barrier for users to provide their own, so I would really like to avoid that.

One of the most elegant workarounds I can think of is to create my own decorator which parses whatever metadata I want before Numba has a go at it. This introduces the problem that my own decorator interacts with the plain Python function, while the Numba decorator returns a different object, so the connection between the two somehow needs to be made. Since Numba seems to keep the same name as Python, using the func.__name__ attribute could work, and is actually quite a nice solution. Can I however rely on Numba never changing the __name__ attribute to anything different? Given that it’s a dunder attribute, the conventional answer is of course “no”, but how bad would it be? :sweat_smile:
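Roughly what I have in mind, as a sketch; the registry and decorator names are made up for illustration:

```python
from typing import get_type_hints

import numba


# hypothetical central storage of hints, keyed by function name
TYPE_HINT_REGISTRY = {}


def register_and_vectorize(*vec_args, **vec_kwargs):
    """Stash the PEP 593 hints of func before numba.vectorize replaces it."""
    def decorator(func):
        TYPE_HINT_REGISTRY[func.__name__] = get_type_hints(func, include_extras=True)
        compiled = numba.vectorize(*vec_args, **vec_kwargs)(func)
        # relies on the object returned by numba.vectorize carrying
        # the same __name__ as the original function
        return compiled
    return decorator


# usage, e.g.: @register_and_vectorize(["float64(float64, float64)"], target="parallel")
```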

I would be curious to hear from people what they think of this. Has anyone ever done something similar? Are there other workarounds that I haven’t thought of?

Here is a notebook replicating what I’ve been trying so far. It’s simplified a lot, but I think it captures the gist of it. My extra annotation is just a string in this example, but PEP 593 allows it to be any object; I’m assuming things like that don’t affect how it all relates to Numba.

Hi Rutger,

I cannot really speak for (gu)vectorize specifically, but in the past I noticed that most decorators in Numba do NOT use Python’s functools.update_wrapper / functools.wraps, which has the task of propagating annotations, docstrings, etc. from the decorated function to the returned wrapper.
This showed up in a number of issues with Sphinx not documenting Numba-decorated functions correctly, or at all. You can read up on that a little bit here: @vectorize-decorated functions are not found by sphinx · Issue #5755 · numba/numba · GitHub
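(For anyone not familiar with it, the standard pattern I mean looks roughly like this in a hand-written decorator:)

```python
import functools


def my_decorator(func):
    # copies __name__, __doc__, __annotations__, ... onto the wrapper
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper
```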

Maybe your use case is another good argument for actually updating the wrappers properly wherever possible. IIRC, simply applying the update worked for some Numba decorators but not others. Maybe this would make for a worthwhile PR?

Cheers


Hey Hannes,

Thank you for your suggestions. They made me realize that I can also wrap Numba’s vectorize decorator and use functools.update_wrapper to assign the necessary attributes. That’s already a much better workaround than my initial one of stacking an additional decorator, which required a “central” storage of the type hints.

I’ve added this functools.update_wrapper workaround to the notebook posted above.
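In essence it boils down to something like this (a sketch with an illustrative name; I’ve only tried the lazy form of numba.vectorize, i.e. without explicit signatures):

```python
import functools

import numba


def vectorize_keep_hints(*args, **kwargs):
    """numba.vectorize, but the returned object also carries the original
    function's annotations and docstring."""
    def decorator(func):
        compiled = numba.vectorize(*args, **kwargs)(func)
        # copies __annotations__, __doc__, __name__, ... from func and
        # sets compiled.__wrapped__ = func as a bonus
        functools.update_wrapper(compiled, func)
        return compiled
    return decorator
```

A nice side effect is that update_wrapper sets __wrapped__, so the original Python function stays reachable from the decorated object, similar to njit’s .py_func.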

I tried looking at the Numba source code to see how the decorators work, but the dispatch mechanism used is quite intimidating to be honest. It doesn’t seem to be as simple as adding a “@functools.wraps(func)” here and there.

Regards,
Rutger