I programmed a trading algorithm to get me 50 Lambos. I've also acquired years of market data for back-testing it. The algorithm has a few basic parameters that I modify, then back-test to get some basic statistics on how it performed historically. This lets me fine-tune the parameters for max Lambos.
The problem is that testing all these combos in my single-threaded Python app takes FOREVER. I want to get my video card to do it in parallel and speed things up significantly.
So, I've got the static, read-only historical market data broken down into 4x 1-D float arrays (open, low, high, close). There is also a single 1-D int32 array with Unix timestamps. The indices of these 5 arrays correspond to the same historical data point. To me, this data should live in some kind of shared memory.
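For concreteness, here's a minimal sketch of how the data is laid out (array names, sizes and dtypes are just illustrative):

```python
import numpy as np

n = 1_000_000  # number of historical data points (illustrative)

# Static, read-only market data: one entry per historical data point.
open_ = np.empty(n, dtype=np.float32)
low   = np.empty(n, dtype=np.float32)
high  = np.empty(n, dtype=np.float32)
close = np.empty(n, dtype=np.float32)
ts    = np.empty(n, dtype=np.int32)   # Unix timestamps

# Index i into any of these five arrays refers to the same data point.
```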
Then I have the parameters: a start and end time that bound the region of the above arrays to check, as well as a few other float values that I need to make decisions while traversing the market data (the 4 arrays above) between those start and end times.
As output, I capture several statistics as floats (they could be put into a float array if needed) that help shed light on how a particular parameter combination performed. The number of these statistics is fixed for every iteration and has no relation to the size of any input array.
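In other words, every run maps a small fixed-size parameter vector to a small fixed-size statistics vector. Something along these lines, where the counts and packing are made up:

```python
import numpy as np

N_COMBOS = 10_000  # parameter combinations to test (illustrative)
N_PARAMS = 6       # start time, end time, plus a few decision floats (illustrative)
N_STATS  = 8       # fixed number of output statistics per combination (illustrative)

# One row of parameters per combination going in...
params = np.empty((N_COMBOS, N_PARAMS), dtype=np.float32)

# ...and one fixed-size row of statistics per combination coming out.
stats = np.zeros((N_COMBOS, N_STATS), dtype=np.float32)
```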
I took a look at guvectorize, but it seems like the output array's dimensions need to be tied to one of the input array sizes. I also tried iterating through the historical data arrays inside the function, but it didn't seem to like not starting at the beginning. Roughly what I tried is sketched below.
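This is the shape of what I ended up with (simplified, untested, and the parameter packing is made up); note the dummy input whose only job is to give the stats output its size:

```python
import numpy as np
from numba import guvectorize

@guvectorize(
    ['void(float32[:], float32[:], float32[:], float32[:], int32[:], '
     'float32[:], float32[:], float32[:])'],
    '(n),(n),(n),(n),(n),(p),(s)->(s)',
    target='cuda',
)
def backtest(open_, low, high, close, ts, params, dummy, stats):
    # params[0] / params[1] are the start/end timestamps,
    # the rest are the decision floats (illustrative packing).
    start_ts = params[0]
    end_ts = params[1]
    for i in range(ts.shape[0]):
        if ts[i] < start_ts or ts[i] > end_ts:
            continue
        # ... trading decisions using open_[i], low[i], high[i], close[i] ...
    # Fixed-size statistics go here; no relation to the input sizes.
    stats[0] = 0.0  # placeholder
```

Calling it with `params` of shape `(N_COMBOS, N_PARAMS)` and a dummy of shape `(N_STATS,)` broadcasts the market data across every combination, but needing that dummy input at all is what makes me think I'm holding the tool wrong.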
Am I even looking in the right place with @guvectorize? If not, how can I get my back-tester running on the GPU?
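The other direction I can imagine is a raw numba.cuda.jit kernel with one thread per parameter combination, reusing the arrays from the sketches above (again, completely untested and the names are made up):

```python
import numpy as np
from numba import cuda

@cuda.jit
def backtest_kernel(open_, low, high, close, ts, params, stats):
    # One thread per parameter combination.
    combo = cuda.grid(1)
    if combo >= params.shape[0]:
        return

    start_ts = params[combo, 0]
    end_ts = params[combo, 1]

    # Walk the read-only market data between the start and end times.
    for i in range(ts.shape[0]):
        if ts[i] < start_ts or ts[i] > end_ts:
            continue
        # ... trading decisions using open_[i], low[i], high[i], close[i]
        #     and the remaining floats in params[combo, 2:] ...

    stats[combo, 0] = 0.0  # placeholder: write the fixed-size statistics


# Copy the read-only market data to the device once, then launch
# one thread per combination (sizes are illustrative).
d_open, d_low, d_high, d_close, d_ts = (
    cuda.to_device(a) for a in (open_, low, high, close, ts))
d_params = cuda.to_device(params)
d_stats = cuda.device_array((params.shape[0], N_STATS), dtype=np.float32)

threads = 128
blocks = (params.shape[0] + threads - 1) // threads
backtest_kernel[blocks, threads](d_open, d_low, d_high, d_close, d_ts,
                                 d_params, d_stats)
stats_out = d_stats.copy_to_host()
```

Is that the better direction, or is there a way to make guvectorize do this cleanly?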
I'm new to all this (but I am an experienced developer), so I'd appreciate any pointers anyone has.