Help with Optimisation

Hi there,

I am new to using Numba, but I feel that when I implemented it, it doesn’t really speed up my code.

Perhaps have a look at my implementation and let me know if it is correct (this is the function that I call in my main script):

@njit(nogil=True)
def NB8thNumbaPC (time_array,time_steps,dis,vel,acc,F,C,K,M,a0,a1,a2,a3,a4,a5,a6,a7,Keff,FinalDis,FinalVel,FinalAcc,v,
train_start_time,X0,n_modes,mr,L,Rail_Z,IntermediateRail_Z,FinalRail_Z,FinalRail_Z_dtt,a,
Omega,Ch,gravity,rail_wheel_location_z,FinalPForce,FinalZ0,wheel_location_x,FinaldeltaZ,
rail_accel_location_1,rail_accel_location_2,rail_accel_location_3,rail_accel_location_4,
rail_accel_location_5,progress_proxy):
for i in (range(0, int(math.ceil(time_steps)))):
time = time_array[i]

    Feff = F + np.dot(C, (a1 * dis + a4 * vel + a5 * acc)) + np.dot(M, (
                a0 * dis + a2 * vel + a3 * acc))  
    disdt = (np.linalg.pinv(Keff) @ (Feff))  
    accdt = (a0 * (disdt - dis)) - a2 * vel - a3 * acc  
    veldt = vel + (a6 * acc) + a7 * accdt  # velocity
    # storing current values as next time step values
    dis = disdt
    vel = veldt
    acc = accdt
    # storing current values in final matrix
    FinalDis[:, i:i + 1] = dis  
    FinalVel[:, i:i + 1] = vel
    FinalAcc[:, i:i + 1] = acc

    # choosing when to start moving train

    # =============================================================================
    #     if time < train_start_time:
    #         X=X0
    #     elif time >= train_start_time:
    #         X=v*(time-train_start_time)+X0
    # =============================================================================
    X = X0#v * (time - train_start_time) + X0
    z_rail_atwheel1 = 0  # essential
    for k in (range(1, n_modes + 1)):
        z_rail_atwheel1 = z_rail_atwheel1 + (dis[k - 1][0] * np.sin(k * np.pi * X / L) * np.sqrt(2 / (mr * L)))

    z_rail_at_50m = 0
    for k in range(1, n_modes + 1):
        z_rail_at_50m = z_rail_at_50m + (acc[k - 1][0] * np.sin(k * np.pi * 50 / L) * np.sqrt(2 / (mr * L)))

    z_rail_at_50_005m = 0
    for k in range(1, n_modes + 1):
        z_rail_at_50_005m = z_rail_at_50_005m + (
                    acc[k - 1][0] * np.sin(k * np.pi * 50.005 / L) * np.sqrt(2 / (mr * L)))

    z_rail_at_50_010m = 0
    for k in range(1, n_modes + 1):
        z_rail_at_50_010m = z_rail_at_50_010m + (
                    acc[k - 1][0] * np.sin(k * np.pi * 50.01 / L) * np.sqrt(2 / (mr * L)))

    z_rail_at_49_995m = 0
    for k in range(1, n_modes + 1):
        z_rail_at_49_995m = z_rail_at_49_995m + (
                    acc[k - 1][0] * np.sin(k * np.pi * 49.995 / L) * np.sqrt(2 / (mr * L)))

    z_rail_at_49_990m = 0
    for k in range(1, n_modes + 1):
        z_rail_at_49_990m = z_rail_at_49_990m + (
                    acc[k - 1][0] * np.sin(k * np.pi * 49.990 / L) * np.sqrt(2 / (mr * L)))

    # populating Final Rail Displacement
    z_rail_at_x = 0
    for column, x in enumerate(Rail_Z):
        z_rail_at_x = 0
        for k in (range(1, n_modes + 1)):
            z_rail_at_x = z_rail_at_x + (dis[k - 1][0] * np.sin(k * np.pi * x / L) * np.sqrt(2 / (mr * L)))
        IntermediateRail_Z[0][column] = z_rail_at_x

    FinalRail_Z[i, :] = (IntermediateRail_Z)

    # populating Final Rail Z Acceleration
    z_rail_at_x = 0
    for column, x in enumerate(Rail_Z):
        z_rail_at_x = 0
        for k in (range(1, n_modes + 1)):
            z_rail_at_x = z_rail_at_x + (acc[k - 1][0] * np.sin(k * np.pi * x / L) * np.sqrt(2 / (mr * L)))
        IntermediateRail_Z[0][column] = z_rail_at_x

    FinalRail_Z_dtt[i, :] = (IntermediateRail_Z)

    # calculate deltaZ
    wheel1_displacement = dis[-2][0]


    z0 =0# a * (1 - np.cos(Omega * time)) / 2  #

    deltaZ = (wheel1_displacement - z_rail_atwheel1 - z0)
    # print("DeltaZ: ",deltaZ)

    # calculate P1
    if deltaZ < 0:
        P1 = 0
        # print("Lost Contact", P1)
    elif deltaZ > 0:
        P1 = Ch * (deltaZ ** (3 / 2))  # Hertzian Contact


    # update force vector for WHEEL FORCE
    F[-2] = gravity[-2] - P1
    # update force vector for RAIL FORCE #gravity[k-1] is usually 0
    for k in range(1, n_modes + 1):
        F[k - 1] = gravity[k - 1] + P1 * np.sin(k * np.pi * X / L) * np.sqrt(2 / (mr * L))

    rail_wheel_location_z[0][i] = z_rail_atwheel1

    FinalPForce[0][i] = P1
    FinalZ0[0][i] = z0
    wheel_location_x[0][i] = X

    FinaldeltaZ[0][i] = deltaZ
    rail_accel_location_1[0][i] = z_rail_at_50m
    rail_accel_location_2[0][i] = z_rail_at_50_005m
    rail_accel_location_3[0][i] = z_rail_at_50_010m
    rail_accel_location_4[0][i] = z_rail_at_49_995m
    rail_accel_location_5[0][i] = z_rail_at_49_990m

    progress_proxy.update(1)
return Rail_Z,time_steps,FinalRail_Z,wheel_location_x,FinalDis,FinalRail_Z_dtt,FinalPForce,FinaldeltaZ

Welcome!

I’ll be difficult to say anything meaningful about it without knowing how the function is called. The datatypes of the arguments, their size/shape etc.

You can probably start by simplifying things as much as possible. Remove arguments that aren’t used. It looks like computations as np.sqrt(2 / (mr * L)) don’t have to be done several times within a loop.

If some of the arguments are arrays, avoiding nested indexing like wheel_location_x[0][i] would help. If those aren’t arrays yet, that might improve things.

But before doing anything, it should start with creating a baseline of a minimal reproducible example that can be benchmarks to evaluate the effect of any changes.