Numba for microcontrollers such as Cortex-M

Has anyone already tried to use Numba to generate LLVM code for a microcontroller?

It is possible using a cfunc callback function: we were able to integrate the generated LLVM IR into a C++ project in a Linux environment.

However, the libraries involved (Numba, NumPy, …) are not set up to be compiled for Cortex-M.

Has anyone tried it already?

We were able to generate a simple function (addition of 32-bit values) and link the generated LLVM IR file into an executable. We also tried a function that initializes a jitclass object. To do that, we wrote a dummy python-nrt framework.

The simple addition runs wonderfully on the Cortex-M4. However, the code is generated on a 64-bit PC and the target (Cortex-M4) is 32-bit, so when we try to use a jitclass, structure sizes and pointer sizes differ from what the target expects.
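To make the mismatch concrete, here is a minimal sketch that models a record holding one pointer and one float32 with ctypes, once with a 4-byte pointer and once with an 8-byte pointer. The struct and its field names are hypothetical and are not Numba's actual jitclass layout; fixed-width integers stand in for pointers so the result does not depend on the machine running the script.

```python
import ctypes


def make_record(pointer_type):
    """Build a struct with one pointer-sized field followed by a float32."""
    class Record(ctypes.Structure):
        _fields_ = [
            ("meminfo", pointer_type),  # stand-in for a pointer (hypothetical name)
            ("a", ctypes.c_float),      # the float32 member
        ]
    return Record


Record32 = make_record(ctypes.c_uint32)  # pointer width on a 32-bit target
Record64 = make_record(ctypes.c_uint64)  # pointer width on a 64-bit host

for name, rec in (("32-bit", Record32), ("64-bit", Record64)):
    print(name, "size:", ctypes.sizeof(rec), "offset of a:", rec.a.offset)
```

The float member moves from offset 4 to offset 8, so code generated for one layout reads the wrong bytes when it is handed a record built with the other.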

Numba is not prepared for cross-compiling or cross-code-generating. Support for it would be a major improvement for people who, like us, would like to generate code for embedded devices.

Thanks for your thoughts / feedback, and for reporting your experiences.

I see how supporting cross-compilation for 32-bit targets would be great for your use case. However, we haven’t supported 32-bit targets for quite some time, and it might be a challenge to ensure functionality and correctness on par with 64-bit targets for Numba.

Whilst I think it would be possible to augment Numba with cross-compilation support and to re-add (or re-validate) support for 32-bit targets, I think it would be a major effort that doesn’t fit into the resource constraints of the project right now. I therefore expect that an effort like that would need to come from outside the core Numba project / team.

Another thought - I wonder if LPython could be more adaptable for this use case.

We made some progress with Numba (v0.61.0). We were able to make it run on a Raspberry Pi 2 (armv7 armhf). The datalayout is the same as for the Cortex-M4:
e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64
However, there is a bug when trying to pass a jitclass instance as a function argument on this architecture.

import numba as nb
import numpy as np

@nb.experimental.jitclass(spec=[("a", nb.float32)])
class TestData:
    def __init__(self):
        self.a = np.float32(0.0)

    def add(self, b: np.float32) -> np.float32:
        self.a += b
        return self.a

@nb.cfunc(TestData.class_type.instance_type(),
          nopython=True,
          nogil=True,
          error_model='numpy',
          fastmath=True,
          no_cfunc_wrapper=True,
          no_cpython_wrapper=True,
          inline='always'
          )
def test_data_create():
    return TestData()

@nb.cfunc(nb.float32(TestData.class_type.instance_type, nb.float32),
          nopython=True,
          nogil=True,
          error_model='numpy',
          fastmath=True,
          no_cfunc_wrapper=True,
          no_cpython_wrapper=True,
          inline='always'
          )
def test_data_add(test_data, b: np.float32) -> np.float32:
    return test_data.add(b)


def main():
    td = test_data_create()
    b = 0
    print(f"td.a : {td.a}")
    for i in range(10):
        # add() returns self.a, so both printed values are expected to be equal
        b = test_data_add(td, np.float32(i))
        print(f"{b} == {td.a}")

The results are as follows:

nrt_allocate_meminfo_and_data (nil)
NRT_Allocate_External bytes=28 ptr=0x24a3e30
NRT_MemInfo_alloc_dtor 0x24a3e48 4
NRT_MemInfo_init mi=0x24a3e30 external_allocator=(nil)
td.a : 0.0
td.a : 0.0
0.0 == -5.486129280331905e+303
1.0 == -5.486129280331905e+303
2.0 == -5.486129280331905e+303
3.0 == -5.486129280331905e+303
4.0 == -5.486129280331905e+303
5.0 == -5.486129280331905e+303
6.0 == -5.486129280331905e+303
7.0 == -5.486129280331905e+303
8.0 == -5.486129280331905e+303
9.0 == -5.486129280331905e+303
NRT_MemInfo_release 0x24a3e30 refct=1
NRT_MemInfo_call_dtor 0x24a3e30
nrt_internal_custom_dtor 0x24a3e48, 0x6ec49050
NRT_dealloc meminfo: 0x24a3e30 external_allocator: (nil)
NRT_Free 0x24a3e30
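The addresses in the log can be sanity-checked with a little arithmetic. The sketch below reads the two values after NRT_MemInfo_alloc_dtor as the data pointer and the payload size, and assumes the meminfo header consists of six pointer-sized words. Both are assumptions made from reading the numbers, not something the log states.

```python
# Values copied from the log above.
meminfo_ptr = 0x24A3E30  # ptr reported by NRT_Allocate_External
data_ptr = 0x24A3E48     # first value after NRT_MemInfo_alloc_dtor (assumed: data pointer)
alloc_bytes = 28         # bytes reported by NRT_Allocate_External
payload_bytes = 4        # second value after NRT_MemInfo_alloc_dtor (assumed: payload size)

header_bytes = data_ptr - meminfo_ptr
print("header bytes:", header_bytes)                      # 24
print("header + payload:", header_bytes + payload_bytes)  # 28

# Assumption: the header holds six pointer-sized words.
ASSUMED_HEADER_WORDS = 6
for word_size in (4, 8):
    expected = ASSUMED_HEADER_WORDS * word_size
    verdict = "matches the log" if expected == header_bytes else "does not match the log"
    print(f"{word_size}-byte words -> {expected}-byte header, {verdict}")
```

Under those assumptions the allocation itself looks consistent with 4-byte pointers, which may suggest that the problem lies in how the data is read back and not in how it is allocated.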

Without looking into it yet, one possibility would be that we’re making an assumption somewhere about pointers being 64 bits.
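For anyone hunting for such an assumption, a quick way to see which pointer width a target declares is to read the "p" entry of its LLVM data layout string. This is a minimal sketch with a hypothetical helper, pointer_bits; the x86-64 string is a typical layout added here for comparison and does not come from this thread.

```python
import re


def pointer_bits(datalayout: str, addrspace: int = 0) -> int:
    """Return the pointer size in bits declared for the given address space."""
    for spec in datalayout.split("-"):
        # Pointer entries look like p[n]:<size>:<abi>[:...]; n defaults to 0.
        m = re.fullmatch(r"p(\d*):(\d+)(?::.*)?", spec)
        if m and int(m.group(1) or 0) == addrspace:
            return int(m.group(2))
    return 64  # LLVM's default when the layout has no matching "p" entry


# Layout quoted earlier in the thread (armv7 / Cortex-M4).
ARMV7 = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
# Assumption: a typical x86-64 Linux layout, included only for comparison.
X86_64 = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

print("armv7 pointer bits:", pointer_bits(ARMV7))
print("x86-64 pointer bits:", pointer_bits(X86_64))
```

Any place in the code generator that hard-codes a 64-bit integer where a pointer-sized one is meant would disagree with the first layout but go unnoticed on the second.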

Out of curiosity, we ran two tests from the Numba test suite on the Raspberry Pi 2. The jitclass test passes.
However, the cfunc test fails:

(.venv) guillaume@raspberrypi:~ $ python3 -m numba.tests.test_cfunc
..EE.s....sss
======================================================================
ERROR: test_numba_carray (__main__.TestCArray.test_numba_carray)
Test Numba-compiled carray() against pure Python carray()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/guillaume/numba/numba/core/lowering.py", line 511, in lower_inst
    impl = self.context.get_function('static_setitem', signature)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 556, in get_function
    return self.get_function(fn, sig, _firstcall=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 558, in get_function
    raise NotImplementedError("No definition for lowering %s%s" % (key, sig))
NotImplementedError: No definition for lowering static_setitem(Array(float32, 1, 'C', False, aligned=True), slice<a:b>, UniTuple(int32, 2)) -> none

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/guillaume/numba/numba/tests/test_cfunc.py", line 294, in test_numba_carray
    self.check_numba_carray_farray(carray_usecase, carray_dtype_usecase)
  File "/home/guillaume/numba/numba/tests/test_cfunc.py", line 270, in check_numba_carray_farray
    f = cfunc(sig)(pyfunc)
        ^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/decorators.py", line 275, in wrapper
    res.compile()
  File "/home/guillaume/numba/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/ccallback.py", line 68, in compile
    cres = self._compile_uncached()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/ccallback.py", line 82, in _compile_uncached
    return self._compiler.compile(sig.args, sig.return_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 80, in compile
    status, retval = self._compile_cached(args, return_type)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 94, in _compile_cached
    retval = self._compile_core(args, return_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 107, in _compile_core
    cres = compiler.compile_extra(self.targetdescr.typing_context,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 739, in compile_extra
    return pipeline.compile_extra(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 439, in compile_extra
    return self._compile_bytecode()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 505, in _compile_bytecode
    return self._compile_core()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 481, in _compile_core
    raise e
  File "/home/guillaume/numba/numba/core/compiler.py", line 473, in _compile_core
    pm.run(self.state)
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 363, in run
    raise e
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 356, in run
    self._runPass(idx, pass_inst, state)
  File "/home/guillaume/numba/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 311, in _runPass
    mutated |= check(pss.run_pass, internal_state)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 272, in check
    mangled = func(compiler_state)
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/typed_passes.py", line 468, in run_pass
    lower.lower()
  File "/home/guillaume/numba/numba/core/lowering.py", line 193, in lower
    self.lower_normal_function(self.fndesc)
  File "/home/guillaume/numba/numba/core/lowering.py", line 232, in lower_normal_function
    entry_block_tail = self.lower_function_body()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/lowering.py", line 262, in lower_function_body
    self.lower_block(block)
  File "/home/guillaume/numba/numba/core/lowering.py", line 276, in lower_block
    self.lower_inst(inst)
  File "/home/guillaume/numba/numba/core/lowering.py", line 513, in lower_inst
    return self.lower_setitem(inst.target, inst.index_var,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/lowering.py", line 624, in lower_setitem
    return impl(self.builder, (target, index, value))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 1190, in __call__
    res = self._imp(self._context, builder, self._sig, args, loc=loc)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 1220, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/np/arrayobj.py", line 556, in setitem_array
    return fancy_setslice(context, builder, sig, args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/np/arrayobj.py", line 1715, in fancy_setslice
    res = context.compile_internal(builder, raise_impl, sig, tup)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 882, in compile_internal
    return self.call_internal(builder, cres.fndesc, sig, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 889, in call_internal
    status, res = self.call_internal_no_propagate(builder, fndesc, sig, args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 903, in call_internal_no_propagate
    status, res = self.call_conv.call_function(builder, fn, sig.return_type,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/callconv.py", line 907, in call_function
    code = builder.call(callee, realargs, attrs=_attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/llvmlite/llvmlite/ir/builder.py", line 881, in call
    inst = instructions.CallInstr(self.block, fn, args, name=name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/llvmlite/llvmlite/ir/instructions.py", line 105, in __init__
    raise TypeError(msg)
TypeError: Type of #3 arg mismatch: i64 != i32

======================================================================
ERROR: test_numba_farray (__main__.TestCArray.test_numba_farray)
Test Numba-compiled farray() against pure Python farray()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/guillaume/numba/numba/core/lowering.py", line 511, in lower_inst
    impl = self.context.get_function('static_setitem', signature)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 556, in get_function
    return self.get_function(fn, sig, _firstcall=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 558, in get_function
    raise NotImplementedError("No definition for lowering %s%s" % (key, sig))
NotImplementedError: No definition for lowering static_setitem(Array(float32, 1, 'F', False, aligned=True), slice<a:b>, UniTuple(int32, 2)) -> none

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/guillaume/numba/numba/tests/test_cfunc.py", line 300, in test_numba_farray
    self.check_numba_carray_farray(farray_usecase, farray_dtype_usecase)
  File "/home/guillaume/numba/numba/tests/test_cfunc.py", line 270, in check_numba_carray_farray
    f = cfunc(sig)(pyfunc)
        ^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/decorators.py", line 275, in wrapper
    res.compile()
  File "/home/guillaume/numba/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/ccallback.py", line 68, in compile
    cres = self._compile_uncached()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/ccallback.py", line 82, in _compile_uncached
    return self._compiler.compile(sig.args, sig.return_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 80, in compile
    status, retval = self._compile_cached(args, return_type)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 94, in _compile_cached
    retval = self._compile_core(args, return_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/dispatcher.py", line 107, in _compile_core
    cres = compiler.compile_extra(self.targetdescr.typing_context,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 739, in compile_extra
    return pipeline.compile_extra(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 439, in compile_extra
    return self._compile_bytecode()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 505, in _compile_bytecode
    return self._compile_core()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler.py", line 481, in _compile_core
    raise e
  File "/home/guillaume/numba/numba/core/compiler.py", line 473, in _compile_core
    pm.run(self.state)
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 363, in run
    raise e
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 356, in run
    self._runPass(idx, pass_inst, state)
  File "/home/guillaume/numba/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 311, in _runPass
    mutated |= check(pss.run_pass, internal_state)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/compiler_machinery.py", line 272, in check
    mangled = func(compiler_state)
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/typed_passes.py", line 468, in run_pass
    lower.lower()
  File "/home/guillaume/numba/numba/core/lowering.py", line 193, in lower
    self.lower_normal_function(self.fndesc)
  File "/home/guillaume/numba/numba/core/lowering.py", line 232, in lower_normal_function
    entry_block_tail = self.lower_function_body()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/lowering.py", line 262, in lower_function_body
    self.lower_block(block)
  File "/home/guillaume/numba/numba/core/lowering.py", line 276, in lower_block
    self.lower_inst(inst)
  File "/home/guillaume/numba/numba/core/lowering.py", line 513, in lower_inst
    return self.lower_setitem(inst.target, inst.index_var,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/lowering.py", line 624, in lower_setitem
    return impl(self.builder, (target, index, value))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 1190, in __call__
    res = self._imp(self._context, builder, self._sig, args, loc=loc)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 1220, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/np/arrayobj.py", line 556, in setitem_array
    return fancy_setslice(context, builder, sig, args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/np/arrayobj.py", line 1715, in fancy_setslice
    res = context.compile_internal(builder, raise_impl, sig, tup)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 882, in compile_internal
    return self.call_internal(builder, cres.fndesc, sig, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 889, in call_internal
    status, res = self.call_internal_no_propagate(builder, fndesc, sig, args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/base.py", line 903, in call_internal_no_propagate
    status, res = self.call_conv.call_function(builder, fn, sig.return_type,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/numba/numba/core/callconv.py", line 907, in call_function
    code = builder.call(callee, realargs, attrs=_attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/llvmlite/llvmlite/ir/builder.py", line 881, in call
    inst = instructions.CallInstr(self.block, fn, args, name=name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guillaume/llvmlite/llvmlite/ir/instructions.py", line 105, in __init__
    raise TypeError(msg)
TypeError: Type of #3 arg mismatch: i64 != i32

----------------------------------------------------------------------
Ran 13 tests in 17.371s

FAILED (errors=2, skipped=4)