Hello,

The PyMC group has forked Theano and started a new project called Aesara. I’m leading the development of Aesara, and I’ve made it a top priority that we replace the old C code generation and compilation backend with Numba.

In the following, I’ll briefly introduce our project and describe how it relates to Numba, then I’ll describe what we’ve already done to support Numba in Aesara and one of the more important things we’re working on at the moment.

# Background

Anyone who’s familiar with the old Theano might already have a good idea of what a Numba “backend” means, but, for those who aren’t familiar: Aesara build graphs of symbolic NumPy operations, performs domain-specific optimizations, and either converts those to functions that execute the underlying NumPy functions in pure Python, or converts them to C (i.e. creates and exposes custom Python extensions and orchestrates the use of those).

This latter step–the one that creates computable functions from symbolic graphs–is almost exactly what Numba does with its object and no-python modes (albeit with LLVM instead of C).

The main difference is that Numba is much better for all this, and–from what I’ve seen–Aesara could easily be much better off investing its time and effort on “low-level” things in Numba instead of its existing framework.

At a “high-level”, our intentions with Aesara are to make it an approachable Python platform for domain-specific optimizations, especially ones that exists within the spaces between symbolic math and efficient numerical computation. For more background, see the experiments in `symbolic-pymc`

, the Python DSL work in the GitHub organization `pythological`

, and this summary paper explaining the two.

With that in mind, I believe it’s easier to see how Aesara and Numba could fit together.

I can provide more details upon request, but that’s the gist of it. We have some old documentation that should explain more, but it’s still basically just the old Theano documentation.

# Current Efforts

As of now, we have a simple dispatch-based foundation for converting Aesara graphs to Numba JITed functions: `aesara.link.numba.dispatch`

.

It’s essentially a dispatch from Aesara `Op`

s to `numba.njit`

ed functions that take the same types of inputs as the `Op`

s–except that these inputs are non-symbolic NumPy arrays. In the most basic cases, the JITed functions are just wrappers for NumPy functions. These conversions are very easy to make, because Aesara is largely a symbolic version of NumPy/SciPy.

There have been some challenges, though, and they’re all things of which the Numba community is well aware. For example, an attempt at my `slice`

boxing feature request is in there, as well as a hack for some `vectorize`

varargs-related issues that construct functions from source (using `exec`

).

## Work in Progress

There’s one thing I’m working on that’s particularly important, and I would like to get some low-level guidance on how to implement it in Numba.

It’s an Aesara `Op`

called `Scan`

, which represents a somewhat generalized loop. It’s one of the more complicated and least performant parts of Aesara, but it’s also a critical `Op`

, so I’m working on fixing it ASAP, and a Numba-backed implementation could be a simple solution to the bulk of its performance problems.

`Scan`

can be simplified–and somewhat generalized–as follows:

```
import numpy as np
def scan(carry, *in_sequences):
global inner_fn
out_sequences = []
for in_seq in np.nditer(in_sequences):
carry, out_seq = inner_fn(*(carry + tuple(in_seq)))
out_sequences.append(out_seq)
return np.stack(out_sequences)
```

Here’s an example of how it’s used:

```
import numba
@numba.njit
def inner_fn(x_tm1, a_t, b_t):
x_t = x_tm1 * a_t + b_t
return (x_t,), x_t
res = scan((np.array(-1.0),), np.arange(10), np.arange(10) * 1e-1)
print(res)
```

```
[0.00000e+00 1.00000e-01 4.00000e-01 1.50000e+00 6.40000e+00 3.25000e+01
1.95600e+02 1.36990e+03 1.09600e+04 9.86409e+04]
```

As you can guess, there are some issues with `njit`

ing this naive version of `scan`

; however, I think I understand the errors, so I’ll spare you that.

I did notice that the `numba.stencil`

feature could cover a lot of the same ground as our `scan`

, so I’m currently looking at its source for inspiration.

# Epilogue

At this point, the code in `StencilFunc`

seems to imply that the `exec`

-based approach I’ve been using in Aesara might not be as off-base as I initially thought. (If I’m mistaken, please tell me!) In the meantime, I’ll attempt to mix the source-generation + `exec`

approach with some other things I pick up from the `StencilFunc`

implementation and report any blocking issues here.

Given the extent to which I would like Aesara to use Numba, I am personally interested in becoming involved in Numba development at whatever level the issues we run into ultimately need to be resolved. In other words, if there are more efficient and direct routes that make better use of Numba *and* improve it (e.g. through bug fixes, new features, etc.), then those are the paths I would like this work to take.

Otherwise, I just wanted to introduce our project and its new Numba-related efforts. If anyone is interested and wants to know more, don’t hesitate to ask. Better yet, if anyone has general advice related to how we should interact with Numba via Aesara, or would like to point us toward any other relevant information or similar endeavors, please do.

(Sorry about the lack of relevant links; there’s a limit set on new Discourse users.)