Numba very slow with default arguments depending on which arguments are provided

Cross-posting my Stack Overflow question. I'm implementing a Numba function in which some arguments are compulsory and others have default values. However, I get very different timings depending on the default values, the order in which the different types appear, and which arguments are actually provided:

import numba as nb

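# Baseline: float defaults for d, e, f and a None default for g.
@nb.njit
def function(a, b, c, d=1.49012e-8, e=1.49012000000001e-8, f=0.0, g=None):
    ...

# Same as function, but without g.
@nb.njit
def function2(a, b, c, d=1.49012e-8, e=1.49012000000001e-8, f=0.0):
    ...

# Same as function, but with the defaults of f and g swapped.
@nb.njit
def function3(a, b, c, d=1.49012e-8, e=1.49012000000001e-8, f=None, g=0.0):
    ...

# Same as function, but e has exactly the same default as d.
@nb.njit
def function4(a, b, c, d=1.49012e-8, e=1.49012e-8, f=0.0, g=None):
    ...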

I then time each function twice over: first with an increasing number of positional arguments, then with all keyword arguments except one:

d = 1.49012e-8
e = 1.49012000000001e-8
f = 0.0
g = 1000

args = (d, e, f, g)
kwargs = {'d': d, 'e': e, 'f': f, 'g': g}

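def time_func(func, args, kwargs):
    # Compile the all-defaults call up front.
    func(1, 2, 3)

    print(func.__name__)
    print("time *args")
    # args[:i] for i = 0 .. len(args) - 1, so the last positional
    # argument is never passed in this loop.
    for i, _ in enumerate(args):
        func(1, 2, 3, *args[:i])  # warm-up call, keeps compilation out of the timing
        %timeit -n 1000 func(1, 2, 3, *args[:i])
    print("time **kwargs")
    # Leave out one keyword argument at a time, in dict order.
    for i in kwargs:
        _kwargs = {k: v for k, v in kwargs.items() if k != i}
        func(1, 2, 3, **_kwargs)  # warm-up call
        %timeit -n 1000 func(1, 2, 3, **_kwargs)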

time_func(function, args, kwargs)
time_func(function2, args[:-1], {k: v for k, v in kwargs.items() if k != 'g'})
time_func(function3, args, kwargs)
time_func(function4, args, kwargs)

Output:

function
time *args
26.3 µs ± 425 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
25.4 µs ± 266 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
24 µs ± 175 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
241 ns ± 4.94 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
time **kwargs
235 ns ± 2.03 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
23.7 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
23.3 µs ± 203 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
241 ns ± 5.25 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
function2
time *args
24.1 µs ± 115 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
23.3 µs ± 172 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
22.1 µs ± 428 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
time **kwargs
210 ns ± 1.31 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
22.6 µs ± 97.4 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
21.9 µs ± 98.5 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
function3
time *args
26.3 µs ± 149 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
25.2 µs ± 81.4 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
24 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
23.3 µs ± 416 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
time **kwargs
237 ns ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
25 µs ± 290 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
255 ns ± 12.5 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
24.2 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
function4
time *args
26.2 µs ± 238 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
25.1 µs ± 95.6 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
24.1 µs ± 250 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
240 ns ± 5.87 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
time **kwargs
231 ns ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
233 ns ± 3.1 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
23.4 µs ± 132 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
230 ns ± 3.43 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
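
To map each timing line above to the call that produced it, the loop structure of time_func can be replayed without Numba. This is only a sketch: call_variants is a hypothetical helper that is not part of the benchmark itself, and it just lists the positional and keyword arguments in the order time_func passes them.

```python
def call_variants(args, kwargs):
    """Return (label, positional, keyword) tuples in the order time_func uses them."""
    variants = []
    # Positional phase: args[:0], args[:1], ... The full tuple is never passed.
    for i in range(len(args)):
        variants.append((f"*args[:{i}]", tuple(args[:i]), {}))
    # Keyword phase: every keyword except one, in dict order.
    for omitted in kwargs:
        kept = {k: v for k, v in kwargs.items() if k != omitted}
        variants.append((f"**kwargs without {omitted!r}", (), kept))
    return variants


d = 1.49012e-8
e = 1.49012000000001e-8
f = 0.0
g = 1000
args = (d, e, f, g)
kwargs = {"d": d, "e": e, "f": f, "g": g}

for label, positional, keyword in call_variants(args, kwargs):
    print(label, "| positional:", positional, "| keywords:", sorted(keyword))
```

For the four-argument functions this gives eight variants, one per timing line: four positional prefixes followed by four keyword calls that omit d, e, f and g in turn.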

I've tested with Numba 0.58.0 on Python 3.11.5 and Numba 0.57.1 on Python 3.10.12, and get similar results with both.
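
Since %timeit only works inside IPython or Jupyter, here is a rough sketch of the same harness using the standard-library timeit module, for anyone who wants to reproduce this from a plain script. The names time_func_portable and plain are hypothetical. plain is a pure-Python stand-in so the snippet runs without Numba; the njit-compiled functions above would be passed in its place. Unlike %timeit, this reports the best of the repeats rather than mean ± std. dev., and the lambda wrapper adds a small per-call overhead, so the absolute numbers will not match the ones above.

```python
import timeit


def time_func_portable(func, args, kwargs, number=1000, repeat=7):
    """Time func(1, 2, 3, ...) for each call variant.

    Returns a list of (label, best seconds per call) tuples.
    """
    results = []
    # Positional phase: same prefixes as the original loop (args[:0] .. args[:-1]).
    for i in range(len(args)):
        call = lambda i=i: func(1, 2, 3, *args[:i])
        call()  # warm-up call so any compilation is not timed
        best = min(timeit.repeat(call, number=number, repeat=repeat)) / number
        results.append((f"*args[:{i}]", best))
    # Keyword phase: leave out one keyword argument at a time.
    for omitted in kwargs:
        kept = {k: v for k, v in kwargs.items() if k != omitted}
        call = lambda kept=kept: func(1, 2, 3, **kept)
        call()  # warm-up call
        best = min(timeit.repeat(call, number=number, repeat=repeat)) / number
        results.append((f"**kwargs without {omitted}", best))
    return results


# Hypothetical pure-Python stand-in with the same signature as `function`.
def plain(a, b, c, d=1.49012e-8, e=1.49012000000001e-8, f=0.0, g=None):
    return a + b + c


args = (1.49012e-8, 1.49012000000001e-8, 0.0, 1000)
kwargs = {"d": args[0], "e": args[1], "f": args[2], "g": args[3]}

for label, seconds in time_func_portable(plain, args, kwargs, number=100, repeat=3):
    print(f"{label}: {seconds * 1e9:.0f} ns per call")
```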