Understanding Numba optimization


For a university project, I am trying to generate LLVM IR from various programming languages so that I can extract data dependencies from this representation and detect parallelism. For this I need unoptimized LLVM IR: it is important that all the load and store instructions are present and not optimized away by the compiler.

I looked at the Numba example from this site:

Numba Guide arch

and followed through with this example:

from numba import njit

@njit  # without a jit decorator, Numba never compiles the function
def add(a, b):
    return a + b

add(5, 10)

I set the environment variable NUMBA_DUMP_LLVM to 1.
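For reference, this is how the dump can be triggered from the command line (assuming the snippet above is saved as add_example.py, a hypothetical filename):

```shell
# Ask Numba to dump the LLVM IR it generates during compilation
NUMBA_DUMP_LLVM=1 python add_example.py
```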

I expected output like in the first box of section 7a, "nopython" and unoptimized:

     %"a" = alloca i64
     %"b" = alloca i64
     %"$0.3" = alloca i64
     %"$0.4" = alloca i64
     br label %"B0"
     store i64 %"arg.a", i64* %"a"
     store i64 %"arg.b", i64* %"b"
     %".8" = load i64* %"a"
     %".9" = load i64* %"b"
     %".10" = add i64 %".8", %".9"
     store i64 %".10", i64* %"$0.3"
     %".12" = load i64* %"$0.3"
     store i64 %".12", i64* %"$0.4"
     %".14" = load i64* %"$0.4"
     store i64 %".14", i64* %"retptr"
     ret i32 0

However, unlike the example, this is what I got:

  br label %"B0"
  %".6" = add nsw i64 %"arg.a", %"arg.b"
  store i64 %".6", i64* %"retptr"
  ret i32 0

So it seems that, despite setting the environment variable to request the unoptimized dump, optimization still occurred?

Thanks for bringing this up - I’ve recorded this as LLVM dump appears to be optimized when it shouldn't be · Issue #8149 · numba/numba · GitHub

Also, note that the docs you link to are the old ones on pydata.org - the current documentation is at: Numba architecture — Numba 0.55.2+0.g2298ad618.dirty-py3.7-linux-x86_64.egg documentation

I’m actually not sure this is a bug (see details on the issue). @Program1 if you create a slightly more complex example do you still see IR that is obviously unoptimized?