My GTC 2022 talk explains how to provide an API for User-Defined Functions in Python that are compiled into CUDA code using Numba. Much of the talk focuses more generically on writing Numba extensions (as these are the vehicle for supporting an application’s data structures), so it may also interest those who want to learn more about Numba’s internals, or about extending Numba.
Abstract: “Many applications provide a Python API so that users can script their execution and extend their functionality — well-known examples include Blender, FreeCAD, and QGIS. Accelerated applications can also provide Python APIs; although this provides extra power and flexibility to the end user, these APIs are typically restricted to plumbing together calls to preexisting kernels provided by the application developer — it’s generally not possible for users to write their own CUDA kernels for the application in Python. Existing solutions to this problem entail writing CUDA C kernels — however, Python programmer productivity falls drastically when they need to write kernels in another language. Numba is a compiler that enables users to write their own CUDA kernels in Python. Learn how to integrate and extend Numba within an accelerated application so that users can implement high-performance extensions and workflows as user-defined functions within the accelerated application using only Python.”
Recording: https://events.rainfocus.com/widget/nvidia/gtcspring2022/sessioncatalog/session/16339878397050012ADx (note that registration for NVIDIA’s GPU Technology Conference (GTC) is required; registration is free)
Docker: to run the example code, run

docker run -p 8888:8888 gmarkall/filigree:v1

and open the “Filigree Demo” notebook.