Possible to generate "standalone" C callback LLVM IR?

Hello, I’m new to Numba’s compiler infrastructure and am wondering if it’s possible to define a C callback function (via @cfunc) and use the low-level API to generate “standalone” LLVM IR. “Standalone” being something that could be compiled and linked to independently by another C program without the need for CPython or the NRT.

The use case would be to use Python to define simple native “lambda” functions that are sent to a remote server for compilation and execution.

Thanks!

Hi @scienceplease

I used to work on a project (RBC) that did something quite similar. The project involved using Numba to generate LLVM IR that could be executed elsewhere in a system without a Numba installation. The project even had a simpler version of the NRT implemented directly in LLVM IR.

It might be possible to achieve the same with just @cfunc, but it needs additional work: not everything works out of the box, and some features will not work at all.

NRT

You either need to disable the NRT when using the cfunc decorator or make sure your code is linked against the Numba runtime on the server. Without the NRT, you are basically left with primitive types, as containers need the runtime to manage memory.
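One way to check whether a piece of generated IR is really free of runtime dependencies is to list the functions it declares but does not define. The sketch below is only a text scan under some assumptions: the helper names are hypothetical, the regular expression covers simple unquoted symbol names only, and treating an `NRT_` prefix as the marker for Numba runtime calls is my assumption rather than something guaranteed by Numba.

```python
import re

# Matches the symbol in lines such as: declare void @NRT_incref(i8*)
_DECLARE_RE = re.compile(r'^declare\b[^@]*@"?([-\w$.]+)"?\s*\(', re.MULTILINE)


def external_symbols(llvm_ir):
    """Return the functions the IR declares but does not define.

    LLVM intrinsics (llvm.*) are skipped, since the compiler usually
    lowers them itself instead of leaving them for the linker.
    """
    names = set(_DECLARE_RE.findall(llvm_ir))
    return sorted(n for n in names if not n.startswith("llvm."))


def nrt_symbols(llvm_ir):
    """Return external symbols that look like Numba runtime calls.

    Assumption: NRT entry points carry an "NRT_" prefix.
    """
    return [n for n in external_symbols(llvm_ir) if n.startswith("NRT_")]


# Hand-written sample IR, not real Numba output.
sample_ir = """\
declare void @NRT_incref(i8*)
declare double @llvm.fabs.f64(double)
define i32 @f(i32 %x) {
entry:
  ret i32 %x
}
"""

standalone = not external_symbols(sample_ir)
```

If `external_symbols` returns an empty list, the module has no link-time dependencies as far as this scan can tell. Anything it does return has to be provided by the server.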

LLVM version mismatch

Make sure the server compiles the IR with the same LLVM version that Numba used to generate it. Otherwise, there might be cases where the server's LLVM won’t be able to parse the intermediate representation generated by Numba.
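A client could send its LLVM version along with the IR so the server can refuse a mismatch early. This is a minimal sketch: both function names are hypothetical, and comparing only the leading version components is my assumption about what counts as "compatible". Requiring an exact match (`parts=3`) is the conservative reading of the advice above.

```python
def local_llvm_version():
    """Return the LLVM version llvmlite was built against, or None.

    llvmlite is the LLVM binding Numba uses; it exposes the version
    as the tuple `llvmlite.binding.llvm_version_info`.
    """
    try:
        from llvmlite import binding as llvm
    except ImportError:
        return None
    return tuple(llvm.llvm_version_info)


def llvm_versions_compatible(client, server, parts=1):
    """Compare the first `parts` components of two version tuples.

    parts=1 checks the major version only; parts=3 demands an exact
    (major, minor, patch) match.
    """
    return tuple(client[:parts]) == tuple(server[:parts])


# Example with made-up version tuples.
same_major = llvm_versions_compatible((14, 0, 6), (14, 0, 0))
exact = llvm_versions_compatible((14, 0, 6), (14, 0, 0), parts=3)
```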

Exceptions

I don’t think exceptions will work at all.


Below is a simple example of the LLVM IR generated when using this decorator. Note that Numba generates two LLVM IR functions, one of which is prefixed with @cfunc. The unprefixed function follows the Numba CPU calling convention, while the other (prefixed with @cfunc.) is a wrapper that obeys the C calling convention.

from numba import cfunc

@cfunc('int32(int32)', _nrt=False)
def incr(x):
    return x + 1

print(incr.inspect_llvm())

Output:

; ModuleID = 'incr'
source_filename = "<string>"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-darwin21.4.0"

@_ZN08NumbaEnv8__main__4incrB2v1B48c8tJTIcFKzyF2ILShI4CrgQElUakCCQB1FgGS2gCAA_3d_3dEi = common local_unnamed_addr global i8* null

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn writeonly
define i32 @_ZN8__main__4incrB2v1B48c8tJTIcFKzyF2ILShI4CrgQElUakCCQB1FgGS2gCAA_3d_3dEi(i32* noalias nocapture writeonly %retptr, { i8*, i32, i8*, i8*, i32 }** noalias nocapture readnone %excinfo, i32 %arg.x) local_unnamed_addr #0 {
entry:
  %.6 = add i32 %arg.x, 1
  store i32 %.6, i32* %retptr, align 4
  ret i32 0
}

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn writeonly
define i32 @cfunc._ZN8__main__4incrB2v1B48c8tJTIcFKzyF2ILShI4CrgQElUakCCQB1FgGS2gCAA_3d_3dEi(i32 %.1) local_unnamed_addr #0 {
entry:
  %.3 = alloca i32, align 4
  store i32 0, i32* %.3, align 4
  %.7 = call i32 @_ZN8__main__4incrB2v1B48c8tJTIcFKzyF2ILShI4CrgQElUakCCQB1FgGS2gCAA_3d_3dEi(i32* nonnull %.3, { i8*, i32, i8*, i8*, i32 }** nonnull undef, i32 %.1) #1
  %.17 = load i32, i32* %.3, align 4
  ret i32 %.17
}

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn writeonly }
attributes #1 = { noinline }

I bet that LLVM is smart enough to inline the first function (@_ZN8__main__4incr...) into the second one and remove the unused arguments.
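Since the server has to know which symbol to call and which target the IR was generated for, both can be read back from the IR text. The following is a rough sketch that relies on the `cfunc.` prefix and the `target triple` line visible in the output above. The function name is hypothetical, and the sample IR uses shortened, made-up mangled names rather than real Numba output.

```python
import re

_TRIPLE_RE = re.compile(r'^target triple = "([^"]*)"', re.MULTILINE)
# Only function definitions whose name starts with "cfunc." are C-callable wrappers.
_CFUNC_RE = re.compile(r'^define\b[^@]*@"?(cfunc\.[-\w$.]+)"?\s*\(', re.MULTILINE)


def describe_module(llvm_ir):
    """Extract the target triple and the C-callable entry points from IR text."""
    triple = _TRIPLE_RE.search(llvm_ir)
    return {
        "triple": triple.group(1) if triple else None,
        "entry_points": _CFUNC_RE.findall(llvm_ir),
    }


# Abbreviated sample with hypothetical, shortened symbol names.
sample_ir = """\
target triple = "arm64-apple-darwin21.4.0"

define i32 @_ZN8__main__4incrEi(i32* %retptr, i8** %excinfo, i32 %arg.x) {
entry:
  ret i32 0
}

define i32 @cfunc._ZN8__main__4incrEi(i32 %.1) {
entry:
  ret i32 %.1
}
"""

info = describe_module(sample_ir)
```

On the server side, the triple also matters: IR generated on an arm64 macOS client, as in the output above, would need its triple and data layout adjusted or regenerated before being compiled for a different machine.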

Forgot to mention but the RBC project ships with a decorator @remotejit that you can use to do this.