Using annotations with Numba (gu)vectorize functions

Rutger · November 8, 2021, 2:48pm

Hey,

I’m having some issues using Numba in combination with Python’s type annotation, specifically the method for adding additional metadata (in addition to a datatype) as specified in PEP 593.

This allows me to annotate the inputs and outputs of my functions with any custom metadata, which I can then use to automatically create a computation graph based on a large set of functions, some available inputs, and some desired outputs. Using pure Python this is possible by using the available functions from the typing module.

Ideally I would use numba.vectorize for the most part, and perhaps numba.guvectorize when functions have multiple return values. The problem is that neither preserves the type hints.

A quick investigation, including potential workarounds like using inspect.getfullargspec and the docstring resulted in the table shown below. Using getfullargspec would only provide information on the inputs, not outputs, so that wouldn’t be great. Using the docstring is more flexible (eg Numpy-doc etc), but requires encoding and parsing strings, which also seems a lot worse compared to type hints.

function type	typing.get_type_hints	inspect.getfullargspec	func.doc
pure python	yes	yes	yes
njit.py_func	yes	yes	yes
njit	no	no	yes
guvectorize	empty dict	incorrect	None
vectorize	no	no	yes

It looks like guvectorize exposes some sort of Python function, since all options “work”, but it’s not the function that’s being decorated.

Since njit keeps the original Python function available at the .py_func attribute, that would work well. I do however rather have all the goodness of automatic broadcasting and the compilation targets (parallel) that (gu)vectorize provide. All inputs can be scalar, 1D, 2D, 3D arrays etc, which works seamlessly with vectorize, but is a lot more involved when using njit.

If I only had a few functions, wrapping them in a separate Python function would be fine. But I’m aiming to use this with perhaps hundreds of functions, and the set of functions is quite dynamic. And also includes functions provided by users. So having to wrap each would be a lot of overhead, extra code management, and a barrier for users to provide their own. Therefore I would really like to avoid that.

One of the most elegant workarounds I can think of is to create my own decorator which parses whatever metadata I want prior to Numba having a go at it. This introduces the problem that my own decorator would interact with a normal Python function, but the Numba decorator returns a different function, so somehow the connection between both functions needs to be made. Since Numba seems to use the same name as Python, using the func.__name__ attribute could work, and is actually quite a nice solution. Can I however rely on Numba never changing the __name__ attribute to anything different? Given that it’s protected, the conventional answer is of course “no”, but how bad would it be?

I would be curious to hear from people what they think of this. Has anyone ever done something similar? Are there other workarounds that I haven’t thought of?

Here is a notebook replicating what I’ve been trying so far. It’s simplified a lot, but captures the gist of it I think. My extra annotation is just a string in this example, but PEP 593 allows it to be any object. I’m assuming things like that don’t affect how it all relates to Numba.

Hannes · November 9, 2021, 2:25pm

Hi Rutger,

I cannot really speak for (gu)vectorize specifically, but in the past I noticed that most decorators in Numba do NOT use python’s functools.update_wrapper / functools.wraps, which has the task of propagating annotations docstrings etc from the decorated function to the returned wrapper.
This showed up in a number of issues with Sphinx not documenting numba-decorated functions correctly or at all, you can read up on that a little bit here: @vectorize-decorated functions are not found by sphinx · Issue #5755 · numba/numba · GitHub

Maybe your use case is another good argument for actually updating the wrappers properly wherever possible. IIRC simply applying the update worked for some numba-decorators but not others. Maybe this would make for a worthwhile PR?

Cheers

Rutger · November 10, 2021, 3:29pm

Hey Hannes,

Thank you for your suggestions. It made me realize that I can also wrap the vectorize decorator from Numba, and use functools.update_wrapper to assign necessary attributes. That’s already a much better workaround compared to my initial one stacking an additional decorator, which required a “central” storage of the type hints.

I’ve added this functools.update_wrapper workaround to the notebook posted above:

I tried looking at the Numba source code to see how the decorators work, but the dispatch mechanism used is quite intimidating to be honest. It doesn’t seem to be as simple as adding a “@functools.wraps(func)” here and there.

Regards,
Rutger

Topic		Replies	Views
Improve @guvectorize capabilities Development	2	683	July 7, 2020
Using guvectorize inside a jitted function Support: How do I do ...?	11	1344	June 17, 2024
Lets use Python type annotations Development	8	2420	June 29, 2020
Enhancing jit signature inference using Python type annotations Community Support	4	790	October 3, 2023
Numba User Survey 2024 Announcements	9	1071	March 28, 2024

Using annotations with Numba (gu)vectorize functions

Related topics