Random sampling parity with NumPy

We’ve added naive support for the size parameters to the numpy.random functions supported by Aesara, but, after setting the RNG seed and generating multiple samples with size, the results do not match NumPy’s when parameter broadcasting is involved. Since the results do match when broadcasting isn’t involved, it seems like the iteration order appears to be the reason for this disparity–in at least the two cases I’ve tested: standard_cauchy and gamma.

In the latter case, the logic can be traced to cont_broadcast_2 and the use of PyArray_MultiIterNew2 (or essentially just PyArray_MultiIterNew). Does anyone know the exact order in which this iterator object produces its elements, and/or how this can be reproduced in Numba?

Here’s an illustration for np.random.gamma that uses a Numba function similar to what’s produced by Aesara:

import numpy as np
import numba as nb
from functools import reduce


@nb.njit
def gamma(a, b, size):
    np.random.seed(232)

    size_len = reduce(lambda x, y: x * y, size)

    a_bcast = np.broadcast_to(a, size)
    b_bcast = np.broadcast_to(b, size)

    res = np.empty(size, dtype=np.float64)
    for n in range(size_len):
        res.flat[n] = np.random.gamma(a_bcast.flat[n], b_bcast.flat[n])

    return res


a_val = np.array([1.0, 20.0], dtype=np.float64)
b_val = np.array(1.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[ 0.39615309, 18.06329295],
#        [ 1.05653084, 17.60267079],
#        [ 1.88297358, 20.82522561]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[ 0.39615309, 15.82979862],
#        [ 0.37191629, 17.9364272 ],
#        [ 1.88297358, 16.43705365]])

My initial assumptions were not entirely correct: apparently the results may not match even when parameters aren’t being broadcasted:

# These parameters result in matching samples
a_val = np.array(1.0, dtype=np.float64)
b_val = np.array(20.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[ 7.92306179,  8.49720122],
#        [ 3.12853278, 21.13061682],
#        [ 7.43832585,  6.68419228]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[ 7.92306179,  8.49720122],
#        [ 3.12853278, 21.13061682],
#        [ 7.43832585,  6.68419228]])

# These don't
a_val = np.array(20.0, dtype=np.float64)
b_val = np.array(1.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[17.81826817, 15.04965406],
#        [17.60267079, 26.33298681],
#        [13.3839478 , 22.84712425]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[14.93257957, 14.40587564],
#        [15.10836299, 15.62507865],
#        [20.25415298, 16.48784293]])

Edit: np.random.gamma isn’t a good example, because its Numba implementation appears to take a different approach than NumPy altogether.