Random sampling parity with NumPy

brandonwillard · March 4, 2022, 7:40pm

We’ve added naive support for the size parameters to the numpy.random functions supported by Aesara, but, after setting the RNG seed and generating multiple samples with size, the results do not match NumPy’s when parameter broadcasting is involved. Since the results do match when broadcasting isn’t involved, it seems like the iteration order appears to be the reason for this disparity–in at least the two cases I’ve tested: standard_cauchy and gamma.

In the latter case, the logic can be traced to cont_broadcast_2 and the use of PyArray_MultiIterNew2 (or essentially just PyArray_MultiIterNew). Does anyone know the exact order in which this iterator object produces its elements, and/or how this can be reproduced in Numba?

Here’s an illustration for np.random.gamma that uses a Numba function similar to what’s produced by Aesara:

import numpy as np
import numba as nb
from functools import reduce


@nb.njit
def gamma(a, b, size):
    np.random.seed(232)

    size_len = reduce(lambda x, y: x * y, size)

    a_bcast = np.broadcast_to(a, size)
    b_bcast = np.broadcast_to(b, size)

    res = np.empty(size, dtype=np.float64)
    for n in range(size_len):
        res.flat[n] = np.random.gamma(a_bcast.flat[n], b_bcast.flat[n])

    return res


a_val = np.array([1.0, 20.0], dtype=np.float64)
b_val = np.array(1.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[ 0.39615309, 18.06329295],
#        [ 1.05653084, 17.60267079],
#        [ 1.88297358, 20.82522561]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[ 0.39615309, 15.82979862],
#        [ 0.37191629, 17.9364272 ],
#        [ 1.88297358, 16.43705365]])

brandonwillard · March 4, 2022, 9:48pm

My initial assumptions were not entirely correct: apparently the results may not match even when parameters aren’t being broadcasted:

# These parameters result in matching samples
a_val = np.array(1.0, dtype=np.float64)
b_val = np.array(20.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[ 7.92306179,  8.49720122],
#        [ 3.12853278, 21.13061682],
#        [ 7.43832585,  6.68419228]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[ 7.92306179,  8.49720122],
#        [ 3.12853278, 21.13061682],
#        [ 7.43832585,  6.68419228]])

# These don't
a_val = np.array(20.0, dtype=np.float64)
b_val = np.array(1.0, dtype=np.float64)

res = gamma(a_val, b_val, (3, 2))
res
# array([[17.81826817, 15.04965406],
#        [17.60267079, 26.33298681],
#        [13.3839478 , 22.84712425]])

np.random.seed(232)
exp_res = np.random.gamma(a_val, b_val, size=(3, 2))
exp_res
# array([[14.93257957, 14.40587564],
#        [15.10836299, 15.62507865],
#        [20.25415298, 16.48784293]])

Edit: np.random.gamma isn’t a good example, because its Numba implementation appears to take a different approach than NumPy altogether.

Topic		Replies	Views
Functions involving random numbers Community Support	2	1072	December 21, 2020
Random Number Generation on the GPU Support: What is this error message?	9	3489	November 6, 2020
Numba parallel loops does not execute by ascending index number unlike CUDA Community Support	3	344	June 2, 2023
Speeding up random array generation compared to numpy Community Support	2	745	February 26, 2024
Random array generation : numba cuda slower than cupy? Support: How do I do ...?	3	1942	July 23, 2021

Random sampling parity with NumPy

Related topics