Strange behaviour of parallel execution of function in which one result is correct and the other not

vcantarella · September 7, 2023, 2:17pm

Hi. I have a jit decorated function which tracks particle in a numerical model. The function is this one below and it outputs the travel time of that particle and the travel time * reactivity, which is a scalar value mapped in the grid cell.

@numba.jit(nopython=True)
def travel_time_cum_reactivity(initial_position, initial_cell, face_velocities, gvs, xedges, yedges, z_lf, z_uf,
                               termination, reactivity):
    """
    Calculates the travel_time and cumulative reactivity of a particle assuming steady-state flow conditions
    ----------
    initial_position
    initial_cell
    face_velocities
    gvs
    xedges
    yedges
    z_lf
    z_uf
    termination
    reactivity: array with relative reactivity values per cell

    return
    t: (sum of dt)
    r: (sum of dr*dt)
    -------

    """
    # initializing:
    layer = initial_cell[0]
    row = initial_cell[1]
    col = initial_cell[2]

    continue_tracking = True
    coords = initial_position.copy()

    # Initializing the variables:
    dts = []
    reacts = []

    while continue_tracking:

        # coordinates at lower and upper faces:
        left_x = xedges[col]
        right_x = xedges[col + 1]
        low_y = yedges[row + 1]
        top_y = yedges[row]
        bt_z = z_lf[layer, row, col]
        up_z = z_uf[layer, row, col]

        coords_0 = np.array([left_x, low_y, bt_z])
        coords_1 = np.array([right_x, top_y, up_z])
        # gradients for the cell
        gvp = gvs[:, layer, row, col]

        # relative reactivity of the cell
        relative_react = reactivity[layer, row, col]

        # velocities at lower coordinate faces:
        v0 = face_velocities[np.array([0, 2, 4]), layer, row, col]
        # velocities at upper coordinate faces:
        v1 = face_velocities[np.array([1, 3, 5]), layer, row, col]

        # current velocities
        v = velocity(coords, v0, gvp, coords_0)

        # Where is it going:
        velocity_gradient_index = np.abs(v0 - v1) > 1e-10 * numba_max_abs(np.column_stack((v0, v1)))
        exit_direction_index = exit_direction(v0, v1, v)

        # Time to reach each end
        dt_array = reach_time(exit_direction_index, velocity_gradient_index, v0, v1, v, gvp, coords, coords_0, coords_1)

        # actual travel time:
        dt = np.min(dt_array)
        exit_point_loc = np.argmin(dt_array)

        # TODO: check if this statement is still necessary!!
        if dt == np.inf:
            break

        exit_point = exit_location(exit_direction_index, velocity_gradient_index, dt, v0, v1, v, gvp, coords, coords_0,
                                   coords_1)

        if exit_point_loc == 0:
            col = col + exit_direction_index[0]
        if exit_point_loc == 1:
            row = row + (-exit_direction_index[1])
        if exit_point_loc == 2:
            layer = layer + (-exit_direction_index[2])

        dts.append(dt)  # traveltime calculation
        reacts.append(relative_react * dt)  # relative reactivity

        # termination criteria evaluation: whether the particle has reached a termination layer or out of the system
        assert dt == relative_react * dt
        has_negative_index = negative_index(np.array([layer, row, col]))
        over_index = larger_index(np.array([layer, row, col]), np.array(termination.shape))
        term_value = termination[layer, row, col]

        if (term_value == 1) | has_negative_index | over_index:
            continue_tracking = False
        # new loop:
        coords = exit_point

    final_dt = np.sum(np.array(dt))
    final_react = np.sum(np.array(reacts))
    return final_dt, final_react

Then I tried to parallelize with prange to many particles, using reactivitity = np.ones(grid.shape), which should give me the same result for both outputs:

@numba.jit(nopython=True, parallel=True)
def cumulate_react(particle_coords, particle_cells, face_velocities, gvs, xedges, yedges, z_lf, z_uf,
                   termination, reactivity):
    dts = np.empty((particle_coords.shape[0]))
    reacts = np.empty((particle_coords.shape[0]))
    for i in numba.prange(particle_coords.shape[0]):
        initial_position = particle_coords[i]
        initial_cell = particle_cells[i]
        dt, react = travel_time_cum_reactivity(initial_position, initial_cell, face_velocities,
                                               gvs, xedges, yedges, z_lf, z_uf, termination, reactivity)
        dts[i] = dt
        reacts[i] = react
    return dts, reacts

However, for my test set, the result is correct for the reactivity, which is the second output, but incorrect for the first one. Any idea how can this happen? The difference between each result is minimal, the second result has an additional multiplication and thats it.

vcantarella · September 8, 2023, 3:01pm

Hi,

Sorry,

There is a typo in the first function : final_dt = np.sum(np.array(dt)) instead of final_dt = np.sum(np.array(dts))`

Sorry, it took me one week to figure it out …

Topic		Replies	Views
JIT decorated function fails compiling with "parallel=True" Support: What is this error message?	2	544	July 23, 2021
Advice in parallelizing Support: How do I do ...?	2	1326	September 5, 2022
Numba Prange Not Working as Expected Community Support	3	826	May 30, 2022
Parallelized code yield different results, what to be careful for? Support: How do I do ...?	5	537	January 29, 2021
Is numba suitable to map a list of array into an array in parallel? Support: How do I do ...?	9	2131	February 16, 2021

Strange behaviour of parallel execution of function in which one result is correct and the other not

Related Topics