Communicating NumPy 2.0 Changes to Numba Users

Author: @stuartarchibald
Editor: @sklam

tl;dr:

NumPy 2.0 is coming soon (NumPy 2.0 development status & announcements · Issue #24300 · numpy/numpy · GitHub). Numba is making changes to accommodate this in several phases, including the introduction of a new type system that will eventually become the default. Most people won’t need to do anything to their code, but some might. NumPy 2.0 may also change the output of your non-JIT compiled code, so be ready to test and verify your code base against both changes!

Introduction

As many of you will already know, NumPy 2.0 (NP2) (Tracking issue: NumPy 2.0 development status & announcements · Issue #24300 · numpy/numpy · GitHub) is on the horizon and expected to land sometime in the next few months, with a first release candidate expected in the next few weeks. It is binary incompatible with NumPy 1.x and comprises a large number of other changes, including the adoption of NEP-50 (NEP: NEP 50 — Promotion rules for Python scalars — NumPy Enhancement Proposals, main PR: API: Switch to NEP 50 behavior by default by seberg · Pull Request #23912 · numpy/numpy · GitHub). Unsurprisingly, Numba is one of the projects most heavily impacted by the coming changes, in large part because of the way Numba’s type system works and because Numba tries to replicate NumPy bug-for-bug. The NumPy folks have taken this breaking-change opportunity to fix a lot of issues, and also to rectify problems in NumPy itself for the benefit of Numba and other compilers with a NumPy-like API. It’s been a lot of work for them, and their willingness to listen to feedback, adjust, and explain decisions has been much appreciated by the Numba maintainers.

What’s the plan for Numba?

To help Numba users schedule adapting their code to use NP2, the provisional plan for Numba is as follows.

  • Numba 0.60 (will probably be released around June 2024, but realistically it’ll be the release after NP2 ships) will aim to be binary compatible with NP2. It will not mirror NP2’s execution semantics, types, behaviours etc but your existing Numba-compiled code should continue to work as it does now. I.e. you can use NP2 alongside Numba but don’t expect Numba to match NP2’s output (though it’s expected that it’ll be fine for most cases!).
  • Numba 0.61 (probably around September 2024) will have two type systems available, the “old” one which exists in Numba as of today, and a “new” one which is designed to make it possible to support NP2 (see the footnote). It is proposed that users will be able to choose which type system to use by setting an environment variable. This release will contain updated algorithms etc to match NP2 execution and as a result NP2 will be officially supported, but you will need to use the new type system which may mean updating your code. The “old” type system will continue to be available along with NumPy 1.x algorithms as part of a transition period.
  • Numba 0.62 (December 2024/January 2025) will continue with updates for NP2 as needed. The old type system will be deprecated.
  • Numba 0.63 (May 2025) will include a backport of the new type system to the NumPy 1.x algorithms, offering a migration path for those pinned to NumPy 1.x for a while.
  • Numba 0.64 (September 2025) will remove the old type system (this would coincide with SPEC-0000 based removal of NumPy 1.x support, Scientific Python - SPEC 0 — Minimum Supported Dependencies).

The uncertainty in the timings above is because a) NumPy isn’t sure about timings and b) the amount of code change required for Numba to support NP2 is enormous. In practice this code change amounts to rewriting or adjusting most of Numba’s implementations of CPython and NumPy, and then assessing and fixing Numba’s 10000+ unit tests against these changes. Obviously this is a massive task and the Numba maintainers would like to thank you all in advance for your support and patience whilst this work is undertaken!

As a side note, the new type system models Python and NumPy types more accurately, so transitioning to it should simplify support for using typing annotations/hints as a way of declaring a function signature.

FAQs

This section aims to cover the impact of the above changes based on how your project uses Numba.

What does this mean for my code/project?

First note that there are two updates going on:

  1. NP2 has various semantic changes which may alter the output of your code that uses NumPy, even if Numba is not involved.
  2. Numba’s new type system changes and algorithm updates are going to emulate NP2 semantics. As a result the output of your code that uses NP2 in Numba JIT regions could also change.
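As a rough illustration of point 1, the sketch below shows the kind of NEP-50 promotion difference that can change results without Numba being involved at all. It uses plain NumPy only. The helper name result_dtype_name and the variable names are made up for this example, and the version check simply assumes that a major version of 2 or more means NEP-50 promotion is active.

```python
import numpy as np

# Assumption: a major version >= 2 means NEP-50 promotion is the default.
IS_NP2 = int(np.__version__.split(".")[0]) >= 2


def result_dtype_name(value):
    """Return the dtype name of a NumPy scalar or array (illustrative helper)."""
    return np.asarray(value).dtype.name


# NumPy float32 scalar + Python float.
# NumPy 1.x promotes to float64; NumPy 2.x keeps float32.
mixed_float = result_dtype_name(np.float32(3) + 3.0)

# NumPy uint8 scalar + Python int.
# NumPy 1.x gives the platform default integer; NumPy 2.x keeps uint8.
mixed_int = result_dtype_name(np.uint8(1) + 2)

# uint8 array + small Python int keeps uint8 under both major versions.
array_case = result_dtype_name(np.array([1, 2], dtype=np.uint8) + 2)

print("NumPy", np.__version__)
print("float32 scalar + Python float ->", mixed_float)
print("uint8 scalar + Python int     ->", mixed_int)
print("uint8 array + Python int      ->", array_case)
```

Running this under both a NumPy 1.x and a NumPy 2.x environment is a quick way to see whether scalar-heavy code in your own project might be affected.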

For the vast majority of users, both when upgrading to NP2 and when moving to Numba’s new type system, it’s hoped that most code will continue to “just work”. It’s expected that some Numba users will need to make changes to accommodate the new type system, but these are, for the most part, likely to be small. If you just use @numba.jit, there’s a good chance your existing code will work the same as it would if executed by NP2.

Places where it is already known that changes will be required are:

  • Anywhere a scalar type instance is used, e.g. types.float64. This won’t exist in the new type system as Numba simply won’t “know” what sort of float it is (NumPy or Python). In cases like this a specific NumPy or Python float type will need to be chosen.
  • Anywhere your code is not type stable. The new type system is more strict about NumPy and Python scalars, so it is possible that changing to it will identify existing code as being type unstable. Guidance will be produced on how to rectify such a situation.
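To make the type stability point more concrete, here is a minimal sketch of a function whose return type depends on which branch is taken. It assumes, as the footnote describes, that the new type system treats a Python float and a NumPy float64 as different types. Whether Numba will flag this exact pattern is an assumption, and the function names are hypothetical. The code is plain Python and NumPy, so it runs without Numba.

```python
import numpy as np


def clipped_mean(values, limit):
    """Hypothetical example: returns different scalar types per branch."""
    mean = np.sum(values) / len(values)  # NumPy float64 scalar
    if mean > limit:
        return limit  # Python float when called with a Python float limit
    return mean  # NumPy float64


def clipped_mean_stable(values, limit):
    """Same logic, but every branch returns a NumPy float64."""
    mean = np.sum(values) / len(values)
    if mean > limit:
        return np.float64(limit)  # explicit choice of the NumPy type
    return mean


values = np.array([1.0, 2.0, 3.0])

# The unstable version returns two different types depending on the data.
unstable_types = {type(clipped_mean(values, 1.0)), type(clipped_mean(values, 10.0))}

# The stable version returns a single type.
stable_types = {
    type(clipped_mean_stable(values, 1.0)),
    type(clipped_mean_stable(values, 10.0)),
}

print(unstable_types)
print(stable_types)
```

The fix shown here picks the NumPy type on every branch; picking the Python type everywhere would be an equally valid choice, depending on what the caller expects.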

What does this mean for my Numba extension library?

If you are a Numba extension writer (you have code using functions from numba.extending), it is reasonably likely that your code does type specific dispatch, and it is therefore possible that you will need to change your code to accommodate the new type system. With the arrival of the new type system, Numba will be doing its best to keep as much existing code working as possible. Common idioms like checking if a type is an instance of numba.types.Integer will still work, as Numba will carry aliases to make sure they do.

What does this mean for my Numba compiler extension?

If you have customised the Numba compiler through either numba.core.compiler or numba.core.compiler_machinery to define your own compiler pipeline, or through numba.core.target_extensions to define your own hardware target, then it is likely that some adjustments will be needed. These changes are likely to take the form of exchanging one of the old type system’s more generic types for something more specific in the new type system. If there are unresolvable problems please do open an issue on the issue tracker and a maintainer can provide assistance.

I have code that is heavily reliant on the old/existing type system, how long have I got to upgrade?

Until at least September 2025, and potentially longer based on community feedback. A change to NumPy and Numba on this scale hasn’t happened before and as a result it is hard to predict the magnitude of the impact. For users that do not manage to migrate in time, pinning to the last version of Numba with old type system support is an option; this is similar to the Python 2->3 transition.

Are there any resources available to help with upgrading to the new type system?

A migration guide will be written to help users with doing the upgrade. It’s expected that most of the time it will just be changing a few signatures or type casts, and sometimes adjusting algorithms to be more strictly type stable.

I follow the Numba code base… what’s going on?!

Practically, within the Numba code base, you will see a new type system with a lot more types, and these types are considerably more specific to their domain of application than in the old type system. You’ll see that a lot of modules are duplicated into an old_<modulename> and new_<modulename> variant, with a redirect in <modulename> depending on which type system is in use. The old type system will use the old_<modulename> code paths and will work just like Numba always has; only bug fixes and incremental functionality additions will be made to the old_<modulename> code paths. Under the new type system the new_<modulename> code paths will be used. It is these modules that will be undergoing the necessary changes to make them a) work with NP2 and b) not confuse Python, NumPy and machine type behaviours. If you submit PRs to Numba that impact one of these “split” modules then it’s possible that two PRs will be needed or a maintainer will carry the changes across into the other type system. NumPy 2.0 related issues and patches are tracked using the NumPy 2.0 label on GitHub (Issues · numba/numba · GitHub).

What should Numba maintainers be aware of following the duplication of modules?

Following on from the above, when reviewing PRs, Numba maintainers will need to consider whether a PR impacts code in one of the duplicated modules. If an old_ module is being updated, consider why this is the case (it could be a genuine bug) and whether the update also needs to be carried into the new_ module. The inverse also applies. The same sort of question applies to updates to cpython/ or np/ modules: does the change perhaps apply to both?

Footnote: Details on the new type system/why Numba is making this change.

The primary change for the “new” type system is that Python, NumPy and Machine scalars are all their own separate types, and types only exist if they exist in reality (i.e. there’s no float32 in Python!). The “new” type system therefore permits a more accurate representation of programs, as it can differentiate between e.g. a Python float and a NumPy float64 being returned. It also makes low level programming easier for maintainers, as a new and very strict “machine” type system will exist that behaves similarly to type interaction in the C language. The reason for needing a new type system is mainly that NP2 adopts NEP-50, which spells out the interaction rules between various NumPy and Python types. If Numba cannot differentiate between these types then it cannot support NEP-50 and as a result cannot support NP2. It also follows that NP2 is offering a “clean break” in some respects, and with that comes the opportunity for projects like Numba to also make some changes of a similar scale.
