Seeing different behavior with python 3.8.2 vs. python 3.7.5

I have this code that works just fine on one of my computers (MacBook Pro with Python 3.7.5, NumPy 1.19.4, and Numba 0.52.0) but produces a different result on my other computer (MacBook Pro with Python 3.8.2, NumPy 1.20.1, and Numba 0.52.0). With the former setup, the code produces a NumPy array with 8 elements; with the latter setup, it produces an array with one element. The 8-element version is what I consider correct.
To make the situation even more strange, by just including a print-statement at the end of the function (commented out below) I get the code to produce the result I want on both computers. As far as I understand, that print statement should not override the @jit(nopython=True).

-Any pointers?

For those interested, the code is part of a metric for a type of causal inference (uplift modeling).

import numpy as np
from numba import jit

@jit(nopython=True)
def _qini_points(data_class,
                 data_score,
                 data_group):
    """Auxiliary function for qini_coefficient(). Returns the
    points on the qini-curve.

    Args:
    data_class (numpy.array([bool]))
    data_score (numpy.array([float]))
    data_group (numpy.array([bool])): True indicates that sample
     belongs to the treatment-group.
    """
    # Order data in descending order:
    data_idx = np.argsort(data_score)[::-1]
    data_class = data_class[data_idx]
    data_score = data_score[data_idx]
    data_group = data_group[data_idx]

    # Set initial values for counters etc:
    qini_points = []
    # Normalization factor (N_t / N_c):
    n_factor = np.sum(data_group) / np.sum(~data_group)
    control_goals = 0
    treatment_goals = 0
    score_previous = np.finfo(np.float32).min
    tmp_n_samples = 1  # Set to one to allow division in first iteration
    tmp_treatment_goals = 0
    tmp_control_goals = 0
    for item_class, item_score, item_group in\
            zip(data_class, data_score, data_group):
        if score_previous != item_score:
            # If we have a 'new score', handle the samples
            # currently stored as counts...
            for i in range(1, tmp_n_samples + 1):
                # Turns out adding the zeroth item is pointless.
                # Oh, well... it does not affect a thing.
                tmp_qini_point = (treatment_goals + i * tmp_treatment_goals /
                                  tmp_n_samples) -\
                    (control_goals + i * tmp_control_goals /
                     tmp_n_samples) * n_factor
                qini_points.append(tmp_qini_point)
            # Add tmp items to vectors before resetting them
            treatment_goals += tmp_treatment_goals
            control_goals += tmp_control_goals
            # Reset counters
            tmp_n_samples = 0
            tmp_treatment_goals = 0
            tmp_control_goals = 0
            score_previous = item_score
        # Add item to counters:
        tmp_n_samples += 1
        tmp_treatment_goals += int(item_group) * item_class
        tmp_control_goals += int(~item_group) * item_class

    # Handle remaining samples:
    for i in range(1, tmp_n_samples + 1):
        tmp_qini_point = (treatment_goals + i * tmp_treatment_goals /
                          tmp_n_samples) -\
            (control_goals + i * tmp_control_goals /
             tmp_n_samples) * n_factor
        qini_points.append(tmp_qini_point)

    # Make list into np.array:
    # Toggling the print function here will make the code work again.
    # print(len(qini_points))
    qini_points = np.array(qini_points)
    return qini_points

# Test function:
data_class = np.array([True, False, False, True, True, False, True])
data_score = np.array([0.1, 0.2, 0.2, 0.2, 0.5, 0.6, 0.7])
data_group = np.array([True, False, True, False, False, True, True])

tmp = _qini_points(data_class, data_score, data_group)
print(tmp)

@notto thank you for submitting this on the Numba discourse. We recently noticed some issues with Numba and NumPy 1.20 (see "NumPy 1.20 numerical regressions", Issue #6812 in numba/numba on GitHub). Can you try with Python 3.8.2 and NumPy 1.19.4 to check whether this might be NumPy or Python version related?

I created a separate virtualenv with the same versions and packages (Python 3.8.2, Numba 0.52.0), except for NumPy, where I installed version 1.19.4 as you suggested. I still see the same erroneous behavior. I also tried the above with Numba 0.53.0, and the problem persists. So the problem seems to be related to the Python version rather than the NumPy version.
-Hope this helps!

@notto thank you for following up on this and thank you for also testing 0.53.0. This may very well be a bug. The next step will be to condense the example to what is known as a “minimum reproducer”, i.e. an as-short-as-possible snippet, without all the domain-specific computations, that triggers only this behaviour. If you have the time and inclination, please do feel free to attempt this. Otherwise, one of our developers will probably take a closer look next week.

All right. I think I got it:

import numpy as np
from numba import jit

@jit(nopython=True)
def test():
    tmp = []
    for i in range(3):
        tmp.append(i)
    for i in range(1):
        tmp.append(i)
    tmp = np.array(tmp)
    return tmp


tmp = test()
print(tmp)

Commenting out and uncommenting the line “@jit(nopython=True)” results in different behavior. It seems that with Numba, the second for-loop empties the tmp variable and creates a list with just one item rather than appending the item to the existing list. This is with Python 3.8.2, NumPy 1.20.1, and Numba 0.52.0. To be precise, I am not seeing this with Python 3.7.5.

I am thinking this is a bug.
-If you agree and decide to fix it, how do I get the news of this issue having been resolved?

Thank you for the pointers!

@notto excellent! Great work! I can confirm your findings and that the reproducer works.

On Python 3.7 this outputs:

[0 1 2 0]

Whereas on 3.8 it outputs:

[0]

Thank you for finding this and isolating it. The next step will be to transfer this to the Numba issue tracker where it will receive further scrutiny and be labeled as a bug. This is also the ticket which you can then watch and follow to be notified once the issue is resolved. Obviously the workaround is simple in the reproducer case, but I would tend to argue that this is something that should be fixed.

Imported to the GitHub issue tracker here: "reflected list behaves differently on Python 3.7 and 3.8", Issue #6825 in numba/numba.

Thanks for the link!
This is a structure I would often use to deal with remaining samples in some temporary variables after exiting some loop, so I would hope that this gets fixed. But that is up to you.
-Good luck!

@notto you could try using numba.typed.List as an alternative if you require this specific pattern:

https://numba.readthedocs.io/en/stable/reference/pysupported.html#typed-list