Filtering array elements and computing accumulated results

hgrecco · December 1, 2021, 1:20pm

I am trying to decouple the filtering from the cumulative calculation that I have to perform over large arrays.

As there are different filtering strategies and also a variety of calculations that can be made, decoupling is important to simplify testing and allow composability.

I don’t want to generate an intermediate filtered array, as it can be very large. I would like to filter and calculate on the fly, but I have not been able to do this in a way that performs well.
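For contrast, here is a minimal sketch of the array-based approach that the paragraph above is trying to avoid. The function name `masked_acum` is made up for illustration. The boolean mask and the two filtered copies are temporaries whose size grows with the input, which is the cost in question.

```python
import numpy as np


def masked_acum(elements1, elements2, el1_max, el2_max):
    # Hypothetical name, for illustration only.
    # The mask is a temporary boolean array as long as the inputs.
    mask = (elements1 < el1_max) & (elements2 < el2_max)
    # Boolean indexing copies the selected elements into two more temporaries.
    filtered1 = elements1[mask]
    filtered2 = elements2[mask]
    # The element-wise product allocates yet another temporary before the sum.
    return (filtered1 * filtered2).sum()
```

The filter (building `mask`) and the calculation (product and sum) are cleanly separated here, but only at the price of those intermediate arrays.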

This is a very simplified example, with the filter and the accumulation fused in a single loop:

import numba as nb

@nb.njit
def myacum(elements1, elements2, el1_max, el2_max):
    out = 0
    # Filter and accumulate in one pass; no intermediate array is created.
    for el1, el2 in zip(elements1, elements2):
        if el1 < el1_max and el2 < el2_max:
            out += el1 * el2
    return out

I then rewrote it using generators, so that the filter lives in its own function:

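def filter_by_max(elements1, elements2, el1_max, el2_max):
    # The arrays and the limits are captured from the enclosing scope.
    @nb.njit
    def func():
        for el1, el2 in zip(elements1, elements2):
            if el1 < el1_max and el2 < el2_max:
                # Yield only the pairs that pass the filter.
                yield el1, el2
    return func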

@nb.njit
def myacum2(it):
    out = 0
    for el1, el2 in it():
        out += el1 * el2
    return out

Is there a robust and performant way to achieve this decoupling?
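One direction that might be worth benchmarking is to pass the filter and the per-element calculation as separate jitted functions into a generic jitted loop, instead of using a generator. The sketch below is only an assumption about what such a design could look like: `below_max`, `product` and `filtered_acum` are hypothetical names, and I have not measured whether Numba optimizes the calls as well as the hand-fused `myacum`. The `try`/`except` fallback is there only so the sketch also runs as plain Python when Numba is not installed.

```python
import numpy as np

try:
    import numba as nb
    njit = nb.njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without Numba.
    def njit(func):
        return func


@njit
def below_max(el1, el2, el1_max, el2_max):
    # Hypothetical filter: keep pairs where both values are under their limits.
    return el1 < el1_max and el2 < el2_max


@njit
def product(el1, el2):
    # Hypothetical per-element term that gets accumulated.
    return el1 * el2


@njit
def filtered_acum(keep, term, elements1, elements2, el1_max, el2_max):
    # Generic loop: the filter and the term are passed in as jitted functions,
    # and the arrays are passed as arguments rather than captured in a closure.
    out = 0.0
    for i in range(elements1.shape[0]):
        el1 = elements1[i]
        el2 = elements2[i]
        if keep(el1, el2, el1_max, el2_max):
            out += term(el1, el2)
    return out
```

Called as `filtered_acum(below_max, product, elements1, elements2, el1_max, el2_max)`, this keeps each filter and each calculation testable on its own. Every new combination of `keep` and `term` is expected to trigger a separate compilation of `filtered_acum`, so the first call for each combination will be slower.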
