Fallback object mode is much faster than explicit object mode?

Volbla · July 9, 2022, 1:26pm

This may be a duplicate of thread 571, but that thread didn’t clear up my confusion, and I see a much larger discrepancy between the apparent compile modes than the poster there did.

I have a function for listing (as indices) all unique, unordered combinations given a certain set size and combination size. It’s useful for iterating through all possible pairs, triples, etc. of list members.

import math
import numpy as np

def combinations(out_of: int, choose: int):
    # One row per combination; math.comb gives the total number of rows
    combinations_list = np.empty((math.comb(out_of, choose), choose), dtype=np.int64)
    current_combination = np.arange(choose)

    for index in range(len(combinations_list)):
        combinations_list[index] = current_combination
        current_combination[-1] += 1

        # Iterate backwards through the list, carrying any overflow
        no_of_carries = 0
        for a in range(1, choose):
            if current_combination[-a] > out_of - a:
                current_combination[-a - 1] += 1
                no_of_carries += 1
            else:
                break

        # Reset any overflown entries to their smallest valid value
        for a in range(choose - no_of_carries, choose):
            current_combination[a] = current_combination[a-1] + 1

    return combinations_list
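
As a sanity check on what the function is meant to return, here is a minimal sketch that builds the same listing with itertools.combinations from the standard library. The helper name reference_combinations is made up for illustration and is not part of the timed code; it is only meant for comparing small cases.

```python
from itertools import combinations as iter_combinations
from math import comb


def reference_combinations(out_of, choose):
    # itertools.combinations yields index tuples in lexicographic order,
    # which is the same order the carry-based loop produces.
    return [list(c) for c in iter_combinations(range(out_of), choose)]


rows = reference_combinations(4, 2)
print(rows)  # → [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]]
print(len(rows) == comb(4, 2))  # → True
```

For small inputs, the rows of the array returned by combinations() should match this list one for one.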

This supposedly can’t compile in nopython mode because math.comb is not supported. I could calculate that in a Python function and pass it as an argument to a jitted combinations(), but that would be such an ugly hack. I want a more straightforward solution without Numba complaining, if possible.

I tried adding a plain @jit decorator and calling the function with arguments (20, 10). I get the long warning message, but it runs fine. After compilation, subsequent calls consistently take ~8 ms.
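
For concreteness, a minimal sketch of that workaround might look like the following. The split into _fill_combinations and a plain-Python wrapper, and the parameter name n_rows, are assumptions made up for illustration. The try/except is only there so the sketch still runs (uncompiled) where Numba is not installed.

```python
import math

import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs, uncompiled, without Numba.
    def njit(func):
        return func


@njit
def _fill_combinations(out_of, choose, n_rows):
    # Same loops as the original function; n_rows replaces the math.comb call.
    combinations_list = np.empty((n_rows, choose), dtype=np.int64)
    current_combination = np.arange(choose)

    for index in range(n_rows):
        combinations_list[index] = current_combination
        current_combination[-1] += 1

        # Iterate backwards through the list, carrying any overflow
        no_of_carries = 0
        for a in range(1, choose):
            if current_combination[-a] > out_of - a:
                current_combination[-a - 1] += 1
                no_of_carries += 1
            else:
                break

        # Reset any overflown entries to their smallest valid value
        for a in range(choose - no_of_carries, choose):
            current_combination[a] = current_combination[a - 1] + 1

    return combinations_list


def combinations(out_of, choose):
    # math.comb runs in the interpreter; only the loops are compiled.
    return _fill_combinations(out_of, choose, math.comb(out_of, choose))


print(combinations(4, 2).tolist())
# → [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]]
```

This keeps the unsupported call out of the compiled function, at the cost of the extra wrapper that makes it feel like a hack.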

from time import perf_counter as p
s = p()
combinations(6,3)
print(f"Compilation: {p() - s}")
for x in range(3):
    s = p()
    test = combinations(20,10)
    print(p() - s)
print('\n', test.shape)
print(test[0])
print(test[50000])
print(test[-1])

…/iterate.py:5: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "combinations" failed type inference due to: Unknown attribute 'comb' of type Module(<module 'math' (built-in)>)
[etc]
Compilation: 1.40045765
0.00787641399999961
0.008276775000000125
0.007770777000000173

(184756, 10)
[0 1 2 3 4 5 6 7 8 9]
[ 0 2 3 5 6 12 13 14 16 19]
[10 11 12 13 14 15 16 17 18 19]

I assumed loop lifting was very useful here, since there’s a pretty long loop (20 choose 10 = 184,756 iterations) with a couple of small loops inside it. But if I instead decorate it with @jit(forceobj=True, looplift=True), post-compilation calls take a whole second!

Compilation: 1.5251205799999998
0.997549486
1.003974469
1.009878551

(184756, 10)
[0 1 2 3 4 5 6 7 8 9]
[ 0 2 3 5 6 12 13 14 16 19]
[10 11 12 13 14 15 16 17 18 19]

Is looplift even a real parameter? It’s not mentioned in any of the documentation. Does jit actually use object mode when it says it does, or can it still compile most of the function? Does it only use object mode for the one unsupported operation? If so, is there a neat way to specify that manually so I can compile without warnings?
