Extending Numba with a "convertible to" type?

Using Numba’s extension API, is it possible to make every function and builtin operator that expects type Y also accept type X by inserting an X → Y conversion function?

Here’s the use-case: hundreds of functions taking NumPy arrays have been defined in core Numba. I have written an extension array type that is sometimes convertible to NumPy arrays (I can raise exceptions in the conversion function if it is not). I would like to support all of these functions, and any that may be added in the future, by registering an implicit conversion, similar to Scala’s implicit conversions.

For example, I can implement sums over contents of Awkward Arrays like this:

>>> import numpy as np
>>> import numba as nb
>>> import awkward1 as ak

>>> @nb.njit
... def f(input):
...     output = np.zeros(len(input), np.float64)
...     for i, x in enumerate(input):
...         for y in x:
...             output[i] += y
...     return output

>>> f(ak.Array([[0, 1, 2], [], [3, 4], [5], [6, 7, 8, 9]]))
array([ 3.,  0.,  7.,  5., 30.])

because input is an iterable Awkward Array yielding Awkward Arrays x and x is an iterable Awkward Array yielding numbers y, and output[i] += y knows how to add numbers to items of an array.

But suppose I want to write it like

>>> @nb.njit
... def f(input):
...     output = np.zeros(len(input), np.float64)
...     for i, x in enumerate(input):
...         output[i] = np.sum(x)
...     return output
>>> f(ak.Array([[0, 1, 2], [], [3, 4], [5], [6, 7, 8, 9]]))

As before, x is an Awkward Array. I can lower a conversion function to_numpy that converts the Awkward Array into a (lowered) NumPy array or raises an exception if it can't. However, I need

  • Numba’s typing pass to recognize all functions, not just np.sum, that take a NumPy array as also being open to a signature with an Awkward Array in place of the NumPy array, and
  • Numba’s lowering pass to insert my conversion function in those places.
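To make the request concrete, here is a toy model in plain Python of the behavior I'm asking for. It is only a sketch: none of these names (RaggedRow, register_conversion, call_with_conversions, flat_sum) exist in Numba or Awkward, and the standard library's array.array stands in for a NumPy array so the example has no dependencies. The lookup plays the role of the typing pass and the call to the conversion plays the role of the lowering pass.

```python
# Toy model in plain Python; none of these names are Numba APIs.
from array import array


class RaggedRow:
    """Hypothetical stand-in for one row of an Awkward Array."""

    def __init__(self, items):
        self.items = items


# (from_type, to_type) -> conversion function, filled in by extension authors.
IMPLICIT_CONVERSIONS = {}


def register_conversion(from_type, to_type):
    """Register func as the implicit from_type -> to_type conversion."""
    def decorator(func):
        IMPLICIT_CONVERSIONS[(from_type, to_type)] = func
        return func
    return decorator


@register_conversion(RaggedRow, array)
def ragged_to_flat(row):
    # array("d", ...) raises TypeError if an item is not a number,
    # which plays the role of "not convertible".
    return array("d", row.items)


def call_with_conversions(func, expected_types, *args):
    """Call func, converting any argument that is not already the expected type."""
    converted = []
    for arg, expected in zip(args, expected_types):
        if isinstance(arg, expected):
            converted.append(arg)  # exact match, nothing to insert
            continue
        # "Typing": is there a registered conversion for this pair?
        conv = IMPLICIT_CONVERSIONS.get((type(arg), expected))
        if conv is None:
            raise TypeError(
                f"no implicit conversion {type(arg).__name__} -> {expected.__name__}"
            )
        # "Lowering": insert the conversion before the call.
        converted.append(conv(arg))
    return func(*converted)


def flat_sum(values):
    """Stand-in for a library function written only for flat arrays."""
    return sum(values)


total = call_with_conversions(flat_sum, (array,), RaggedRow([6, 7, 8, 9]))
```

The point of the sketch is that flat_sum never has to know about RaggedRow: the extension author registers one conversion and every function expecting the target type picks it up.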

Is there anything I can do about this? The use-case described above is scikit-hep/awkward-1.0#509, but this would also enable scikit-hep/awkward-1.0#174, implementing union-typed Awkward Arrays, because that also has to overload an open-ended set of functions. (If the union array can have values of type float or bool, then it should be registered as convertible to both float and bool, but if a value wants to resolve to bool and the particular datum happens to be float, the conversion function would raise an exception. That implements dynamic type-checking for a fixed set of types.)
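The union case in the parenthetical above can be sketched in plain Python. This is a toy under stated assumptions: UnionValue, convert_to, call_unary, half and negate are hypothetical names for illustration, not Awkward or Numba APIs. The union is registered as convertible to both float and bool, and the conversion raises when the particular datum has the other type.

```python
class UnionValue:
    """Hypothetical tagged union whose datum is either a float or a bool."""

    ALLOWED = (float, bool)

    def __init__(self, value):
        if type(value) not in self.ALLOWED:
            raise TypeError(f"unsupported member type {type(value).__name__}")
        self.value = value

    def convert_to(self, target):
        # Static check: the union as a whole must be convertible to target.
        if target not in self.ALLOWED:
            raise TypeError(f"union is not convertible to {target.__name__}")
        # Dynamic check: this particular datum must actually be a target.
        if type(self.value) is not target:
            raise TypeError(
                f"datum is {type(self.value).__name__}, not {target.__name__}"
            )
        return self.value


def half(x):
    """Stand-in for a function that expects a float."""
    return x / 2.0


def negate(flag):
    """Stand-in for a function that expects a bool."""
    return not flag


def call_unary(func, expected, arg):
    """Insert the union -> expected conversion before calling func."""
    return func(arg.convert_to(expected))


halved = call_unary(half, float, UnionValue(3.0))
negated = call_unary(negate, bool, UnionValue(True))
```

Calling call_unary(negate, bool, UnionValue(2.5)) raises TypeError, which is the dynamic type-check for a fixed set of types.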

The convertible-to-array case (issue #509) would be simpler to implement and is a more immediate need than union arrays (issue #174), but they both depend on this capability.

Is this already possible in Numba or would something have to change?

I forgot, I was going to give the exception raised by the second code example. However, it’s not a mystery why it fails—array_sum is implemented for types.Array, not Awkward Array’s ArrayView.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/miniconda3/lib/python3.8/site-packages/numba/core/dispatcher.py", line 415, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/jpivarski/miniconda3/lib/python3.8/site-packages/numba/core/dispatcher.py", line 358, in error_rewrite
    reraise(type(e), e, None)
  File "/home/jpivarski/miniconda3/lib/python3.8/site-packages/numba/core/utils.py", line 80, in reraise
    raise value.with_traceback(tb)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<function sum at 0x7fd7ffa6e430>) found for signature:
 >>> sum(awkward1.ArrayView(awkward1.NumpyArrayType(array(int64, 1d, A), none, {}), None, ()))
There are 2 candidate implementations:
  - Of which 2 did not match due to:
  Overload in function 'Numpy_method_redirection.generic': File: numba/core/typing/npydecl.py: Line 370.
    With argument(s): '(awkward1.ArrayView(awkward1.NumpyArrayType(array(int64, 1d, A), none, {}), None, ()))':
   Rejected as the implementation raised a specific error:
     TypeError: array does not have a field with key 'sum'
  raised from /home/jpivarski/irishep/awkward-1.0/awkward1/_connect/_numba/layout.py:323

During: resolving callee type: Function(<function sum at 0x7fd7ffa6e430>)
During: typing of call at <stdin> (5)

File "<stdin>", line 5:
<source missing, REPL/exec in use?>

Ah, one last thing: it looks like typeconv does this, but for scalar types. I wonder if the mechanism applies/can be applied more generally?

Also, there’s an ArrayCompatible abstract type that I could make Awkward Arrays inherit from. That can solve the convertible-to-array problem (my issue #509) but not the union type problem (issue #174), but that’s okay because the first is more immediately relevant.

However, functions in arraymath.py, for example, have concrete Array in their signatures, not ArrayCompatible. So this wouldn’t work, would it?

I also found a few notes about abstracting an “ArrayLike” (numba/numba#3855 and the Jan 15, 2019 minutes).

If this is purely a typing problem, your custom type can have a can_convert_to method that would allow the type conversion. I used it to implement subtyping here https://github.com/numba/numba/pull/5579 and here https://github.com/numba/numba/pull/5560

If the lowering should be different (i.e., np.sum would have to be re-compiled for the custom type) then I don’t know how to do it.

Thanks! I just read through those PRs and I see that they’re about the subtyping that was talked about at last Tuesday’s meeting. What I need involves both typing and lowering:

  • Typing to recognize an Awkward Array (ak._connect._numba.arrayview.ArrayViewType) everywhere that a NumPy array is in a signature (nb.types.Array);
  • Lowering to insert the Awkward Array → NumPy conversion function (ak._connect._numba.arrayview.ArrayViewModel → nb.core.datamodel.models.ArrayModel) as a first step in evaluating a function with a NumPy array in its signature but an Awkward Array passed in.

I started by thinking about the general solution, Scala-like implicit conversions, which would let extension developers make their objects masquerade as any type—concrete or abstract—without modifications to the core library. If that isn’t available, there are other options.

For my most immediate need, scikit-hep/awkward-1.0#509, only two types are involved: Awkward Arrays and NumPy arrays. There is a mechanism for this particular target: nb.core.types.abstract.ArrayCompatible. Just as making my arrays inherit from nb.types.IterableType was sufficient to get enumerate and zip for free, inheriting from nb.core.types.abstract.ArrayCompatible could be enough to get a lot of functions in arraymath.py if they were modified to take ArrayCompatible as argument types instead of Array and then call the conversion function as a first step. In fact, it looks like all the ufuncs in npydecl.py already accept ArrayCompatible instead of Array, so without any changes to Numba, I might at least get ufuncs for free.
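As a plain-Python analogy for this abstract-type route (not Numba code; ToyArrayCompatible, ToyFlat, ToyRagged, to_flat_array and toy_sum are hypothetical names, and array.array stands in for a NumPy array), the library function is written once against the abstract type and calls the conversion as its first step:

```python
from abc import ABC, abstractmethod
from array import array


class ToyArrayCompatible(ABC):
    """Hypothetical abstract type: anything that can produce a flat array."""

    @abstractmethod
    def to_flat_array(self):
        """Return an array.array of doubles, or raise if not convertible."""


class ToyFlat(ToyArrayCompatible):
    """Already a flat array, so the conversion is the identity."""

    def __init__(self, data):
        self.data = data

    def to_flat_array(self):
        return self.data


class ToyRagged(ToyArrayCompatible):
    """Stand-in for an extension array that has to be converted."""

    def __init__(self, items):
        self.items = items

    def to_flat_array(self):
        # Raises TypeError if the items are not plain numbers.
        return array("d", self.items)


def toy_sum(x):
    """Library function written against the abstract type, not the concrete one."""
    if not isinstance(x, ToyArrayCompatible):
        raise TypeError(f"expected ToyArrayCompatible, got {type(x).__name__}")
    data = x.to_flat_array()  # conversion as the first step
    return sum(data)


ragged_total = toy_sum(ToyRagged([3, 4]))
flat_total = toy_sum(ToyFlat(array("d", [5.0])))
```

Unlike an implicit conversion, this route requires each library function to be written (or rewritten) against the abstract type, which is the modification to arraymath.py described above.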

So as a first step, perhaps I should try making Awkward Arrays inherit from ArrayCompatible and see if I can get all the ufuncs. (The original motivation was a non-ufunc, but one thing at a time.)

Looking more closely at ArrayCompatible, it is an abstract type that I can subclass (good) as long as I implement as_array, which returns the nb.core.types.Buffer type that this corresponds to (easy). But that’s just typing. Where do I implement the lowering that converts a concrete ak._connect._numba.arrayview.ArrayViewModel into a concrete nb.core.datamodel.models.ArrayModel? The lowered ufuncs need an ArrayModel to run and I can write the lowered conversion that turns my model into your model, but I need Numba to insert it in the right places.

How do I do this? Or am I wrong and is ArrayCompatible a promise that my concrete objects have the same memory layout as ArrayModel (in which case, it’s not really abstract)?

What about lower_cast? Is this the general, implicit “convertible to” that I was asking about above?