Numba not working in AWS Sagemaker

I’m trying to migrate some code from Google Colab to AWS Sagemaker. I have a training script, and at the end of training I evaluate the results using some functions. This is where the problem arises: I’m having trouble using Numba. I have this function for cosine similarity:

import numpy as np
from numba import jit

@jit(nopython=True)
def fast_cosine(u, v):
    m = u.shape[0]
    udotv = 0
    u_norm = 0
    v_norm = 0
    for i in range(m):
        if (np.isnan(u[i])) or (np.isnan(v[i])):
            continue

        udotv += u[i] * v[i]
        u_norm += u[i] * u[i]
        v_norm += v[i] * v[i]

    u_norm = np.sqrt(u_norm)
    v_norm = np.sqrt(v_norm)

    if (u_norm == 0) or (v_norm == 0):
        ratio = 1.0
    else:
        ratio = udotv / (u_norm * v_norm)
    return ratio
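
One way to narrow this down is to time the first call (which includes JIT compilation) separately from later calls (which run the already compiled code). The following is only a minimal sketch: `time_calls` is a hypothetical helper that is not part of the original script, the accumulators are initialised as floats purely for clarity, and the `try/except` fallback is an assumption added so the sketch still runs, uncompiled, where Numba is not installed.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Fallback for this sketch only (assumption): run uncompiled without Numba.
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True)
def fast_cosine(u, v):
    # Same logic as the function above; NaN entries in either vector are skipped.
    udotv = 0.0
    u_norm = 0.0
    v_norm = 0.0
    for i in range(u.shape[0]):
        if np.isnan(u[i]) or np.isnan(v[i]):
            continue
        udotv += u[i] * v[i]
        u_norm += u[i] * u[i]
        v_norm += v[i] * v[i]
    u_norm = np.sqrt(u_norm)
    v_norm = np.sqrt(v_norm)
    if u_norm == 0 or v_norm == 0:
        return 1.0
    return udotv / (u_norm * v_norm)


def time_calls(func, u, v, repeats=3):
    """Hypothetical helper: return the elapsed seconds of each call.

    With Numba, the first entry includes compilation (or cache loading);
    the remaining entries measure only the compiled code.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(u, v)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.random(1000)
    v = rng.random(1000)
    for n, seconds in enumerate(time_calls(fast_cosine, u, v), start=1):
        print(f"call {n}: {seconds:.6f} s")
```

If only the first call is slow, the time is going into compilation; if every call is slow, the compiled code itself (or the way it is called) is the place to look.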

That function works flawlessly in Google Colab, taking a couple of minutes to complete. When I use the same code in Sagemaker, it takes hours. I also tried adding cache=True and setting NUMBA_CACHE_DIR to a path that is definitely writable, in my case /opt/ml/model. I also set NUMBA_DEBUG_CACHE, but the only output I get is this:

[cache] index saved to '/opt/ml/model/code_ec9d94cfa840991722b08139d5ab141658f37feb/train.fast_cosine-199.py38.nbi'
[cache] data saved to '/opt/ml/model/code_ec9d94cfa840991722b08139d5ab141658f37feb/train.fast_cosine-199.py38.1.nbc'
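
For reference, the cache settings described above can be applied from Python roughly as follows. This is a sketch under assumptions: `configure_numba_cache` is a hypothetical helper, and the temporary directory used in the example stands in for the /opt/ml/model path from the post. The environment variables are set before Numba is imported, so that Numba sees them when it reads its configuration.

```python
import os
import tempfile


def configure_numba_cache(cache_dir):
    """Hypothetical helper: point Numba's on-disk cache at cache_dir.

    Call this before importing numba. Returns True if the directory
    exists and is writable by the current process.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["NUMBA_CACHE_DIR"] = cache_dir
    os.environ["NUMBA_DEBUG_CACHE"] = "1"  # print cache load/save messages
    return os.access(cache_dir, os.W_OK)


if __name__ == "__main__":
    # Placeholder location for this sketch; the post uses /opt/ml/model.
    demo_dir = os.path.join(tempfile.gettempdir(), "numba_cache_demo")
    print("cache dir writable:", configure_numba_cache(demo_dir))
```

The function itself still needs `cache=True` in its `@jit` decorator for anything to be written to this directory.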

In short, the function takes hours on AWS Sagemaker but only a couple of minutes in Google Colab (where it is much slower without Numba).

Note for anyone else interested in this problem: there is also a GitHub issue, numba/numba #7970, "Numba not working in containerized environment, like Sagemaker".