How do I use Numba to run Trading Back Testing?

I programmed a trading algorithm to get me 50 Lambos. I have also acquired years of market data for back testing the algorithm. The algorithm has a few basic parameters that I modify, back test, and then get some basic statistics on how the algorithm performed historically. This allows me to fine-tune the parameters for max Lambos. :smiley:

The problem is that testing all these combinations in my single-threaded Python app takes FOREVER. I want to get my video card to do it in parallel to speed things up significantly.

So, I’ve got the static, read-only historical market data broken down into 4x 1-D float arrays (open, low, high, close). There is also a single 1-D int32 array with a unix timestamp. The indices of these 5 arrays correspond to the same historical data point. To me, this should live in shared memory.

Then I have the parameters: a start and end time to look up in the above arrays, as well as a few other float values that I need to make decisions while traversing the market data (the 4 arrays above) between the start and end times.

As output, I have several statistics as floats (could be put into a float array if needed) that are captured to help shed light on how a particular parameter combination performed. The number of these statistics is fixed for any iteration and has no relation to the size of any input array.
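
Roughly, this is the shape of what I'm describing (names and sizes here are just for illustration):

import numpy as np

n = 1_000_000                               # number of historical bars (example)
open_ = np.empty(n, dtype=np.float64)       # 4x parallel price arrays
low   = np.empty(n, dtype=np.float64)
high  = np.empty(n, dtype=np.float64)
close = np.empty(n, dtype=np.float64)
ts    = np.empty(n, dtype=np.int32)         # unix timestamps, same indexing

# per-combination parameters: a start/end window plus a few decision floats
start_idx, end_idx = 0, n
param_a, param_b = 0.5, 0.1                 # placeholder values

# fixed number of output statistics, unrelated to the input length
stats = np.zeros(8, dtype=np.float64)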

I took a look at @guvectorize, but it seems like the output array size needs to be related to one of the input sizes. I also tried iterating through the historical data arrays, but it didn’t seem to like not starting at the beginning.

Am I looking in the right place? @guvectorize? If not, how can I get my back tester running on a GPU core?

I am new to all this (but am an experienced developer) so I would appreciate any pointers anyone has.

@guvectorize might just be too restrictive for your use case, especially if you want to look into GPU shared memory to optimize things. I would suggest writing a @cuda.jit kernel for full freedom.
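
For instance, one pattern is a kernel with one thread per parameter combination, each thread scanning its own window of the shared price arrays and writing a fixed row of statistics. Untested sketch; all names and the toy decision rule are made up:

import numpy as np
from numba import cuda

@cuda.jit
def backtest_kernel(open_, high, low, close, ts, starts, ends, params, stats):
    i = cuda.grid(1)                        # one thread per parameter combination
    if i >= params.shape[0]:
        return
    threshold = params[i, 0]                # example tunable parameter
    trades = 0.0
    pnl = 0.0
    for j in range(starts[i], ends[i]):     # walk this combination's data window
        move = close[j] - open_[j]
        if abs(move) > threshold:           # toy decision rule, placeholder only
            trades += 1.0
            pnl += move
    stats[i, 0] = trades                    # fixed-size per-combination outputs
    stats[i, 1] = pnl

# host side: copy the read-only market data once, launch one thread per combo
n_bars, n_combos = 100_000, 1024
d_open  = cuda.to_device(np.random.rand(n_bars))
d_high  = cuda.to_device(np.random.rand(n_bars))
d_low   = cuda.to_device(np.random.rand(n_bars))
d_close = cuda.to_device(np.random.rand(n_bars))
d_ts    = cuda.to_device(np.arange(n_bars, dtype=np.int32))
starts  = cuda.to_device(np.zeros(n_combos, dtype=np.int64))
ends    = cuda.to_device(np.full(n_combos, n_bars, dtype=np.int64))
params  = cuda.to_device(np.random.rand(n_combos, 3))
stats   = cuda.device_array((n_combos, 2))

threads = 128
blocks = (n_combos + threads - 1) // threads
backtest_kernel[blocks, threads](d_open, d_high, d_low, d_close, d_ts,
                                 starts, ends, params, stats)
results = stats.copy_to_host()              # one row of statistics per combination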

I’d also suggest using cupy and cudf as high-level APIs for GPU arrays and GPU dataframes, respectively. You can write CUDA kernels in Numba that work with data stored in these libraries.
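
For example, an array created with cupy can be passed straight into a Numba kernel (it exposes __cuda_array_interface__), so the data never leaves the GPU. A rough sketch:

import cupy as cp
from numba import cuda

@cuda.jit
def scale(arr, factor):
    i = cuda.grid(1)
    if i < arr.shape[0]:
        arr[i] *= factor

close = cp.random.rand(1_000_000)            # GPU array built with cupy
threads = 256
blocks = (close.shape[0] + threads - 1) // threads
scale[blocks, threads](close, 2.0)           # cupy array handed straight to the kernel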


Thanks @sklam! This is perfect. I had a feeling that guvectorize wasn’t where I needed to be. I’ll check out your suggestions! Thanks again!

I’m writing a bit late, but I wanted to share my experience.

The main computation loop is implemented using Numba, which, compared to plain Python, offers a speed improvement of 100-1000 times. Here’s a snippet of what the code looks like:

import numba as nb
from numba import objmode
from numba.types import int64, float64, unicode_type, NPDatetime

@nb.njit(int64(nb_record_type[:], unicode_type, float64, float64, NPDatetime('ns')))
def RunTrack(...):
    for i in range(N):
        ...
        with objmode():   # temporarily drop back into object (Python) mode
            ...

The key data structure in use, nb_record_type[:], is a shared array (gigabytes) consisting of structures such as:

record_type = np.dtype([
    ("var1", np.float64), ("wo_spy", np.float64), ...,
    ("b_tRngRto", np.float32), ("e_tRngRto", np.float32),
    ("b_tRngRto100", np.float32), ("e_tRngRto100", np.float32),
    ("oneetf_coded", np.uint64), ("dt", 'datetime64[ns]'),
], align=True)
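
For completeness, the nb_record_type used in the RunTrack signature is just the Numba version of this dtype (sketch, reusing the record_type definition above):

import numpy as np
import numba as nb

nb_record_type = nb.from_dtype(record_type)         # Numba record type, used as nb_record_type[:]
records = np.zeros(100_000_000, dtype=record_type)  # the gigabyte-scale structured array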

All of this runs on a cluster managed by Ray, and inside each 32-processor node the nb_record_type[:] array is used by all processes without memory copying.
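
The zero-copy sharing works roughly like this (sketch; it reuses the record_type above, and the chunking scheme is made up): Ray's object store keeps the NumPy structured array in shared memory, so workers on the same node can read it without copying.

import numpy as np
import ray

ray.init()

records = np.zeros(10_000_000, dtype=record_type)   # the big structured array
records_ref = ray.put(records)                      # stored once in the node's object store

@ray.remote
def track_window(recs, start, stop):
    # recs arrives as a read-only NumPy view on this node, no copy;
    # in the real setup it would be handed to the @njit RunTrack
    return float(recs["var1"][start:stop].sum())

futures = [track_window.remote(records_ref, i, i + 1_000_000)
           for i in range(0, 10_000_000, 1_000_000)]
results = ray.get(futures)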

If not for Numba (i.e., if I had to stick to Python), I would have had to rent not 100 nodes, but 10,000 nodes, which would be financially untenable.

A small trick: modern Numba allows using with objmode() to run Python code from inside the compiled loop, which I use to conveniently load additional data for the computation (via pandas, json, databases, numpy, the filesystem) in chunks as I progress through the loop.
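
As a concrete (simplified) example of that pattern, with a made-up loader and file names:

import pandas as pd
import numba as nb
from numba import objmode

def load_chunk(i):
    # plain-Python loader; path and column name are placeholders
    return pd.read_parquet("chunk_%d.parquet" % i)["close"].to_numpy()

@nb.njit
def process_all(n_chunks):
    total = 0.0
    for i in range(n_chunks):
        with objmode(chunk='float64[:]'):   # back in object mode just for the I/O
            chunk = load_chunk(i)
        total += chunk.sum()
    return total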