How Matlab’s GPU, Parallel Computing and Jacket make HFT algo models the fastest with lowest latency!!

(Last Updated On: November 10, 2010)

Seriously, if you are really interested in this sector, I would pay attention to what is below:

I am always afraid people will shred my logic in going with Matlab versus dynamic programming languages like Python or R. Again, this is a choice, not a debate about which is better.

For those who don’t know, Matlab brings some very powerful techniques to quant prototyping and quant research. I cannot guarantee this is a smart way to go for production code, but at this point in my own development, Matlab’s GPU capabilities and parallel computing are very powerful through these toolkits.

Let me now explain with some highlights from an article I found:

CPUs are serial computing devices and GPUs are parallel computing devices. For small or serial operations, the best performance is likely achieved on the CPU. For large or parallel operations, the best performance is likely achieved on the GPU.  In general, the CPU will beat the GPU on data with only a few hundred elements.
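To make the crossover idea concrete, here is a minimal sketch in Python. It assumes a very simple cost model that is my own and not from the article: the CPU costs a fixed amount per element, while the GPU pays a fixed overhead (transfer, launch) plus a smaller amount per element. The function name and all numbers are hypothetical placeholders, not measurements.

```python
import math

def break_even_elements(gpu_overhead, cpu_per_elem, gpu_per_elem):
    """Smallest element count at which the GPU is at least as fast as the CPU.

    Assumed toy model (not a measurement):
        cpu_time(n) = n * cpu_per_elem
        gpu_time(n) = gpu_overhead + n * gpu_per_elem
    Returns None if the GPU never catches up under this model.
    """
    if gpu_per_elem >= cpu_per_elem:
        return None
    return math.ceil(gpu_overhead / (cpu_per_elem - gpu_per_elem))

# Placeholder values in arbitrary time units, chosen only for illustration.
n = break_even_elements(gpu_overhead=100.0, cpu_per_elem=1.0, gpu_per_elem=0.5)
print(n)  # 200
```

Below that element count the fixed overhead dominates and the CPU wins; above it the cheaper per-element cost wins. To find the real crossover for your own code you would have to time both paths on your own hardware.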


The first time the GPU sees a new piece of code from you, it spends some time analyzing it and compiling various instruction sequences for faster lookup on subsequent runs. That first run will be a little slower, but for long-running computations (several minutes) there won’t be any noticeable lag. This is often referred to as “warm up”.
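If you benchmark anything with a warm up cost, it helps to time the first run separately from the rest. Here is a minimal sketch in Python. `fake_kernel` is a hypothetical stand-in that I made up: it sleeps once to imitate a one-time compile step, and it does not touch a GPU.

```python
import time

def time_runs(fn, repeats=5):
    """Call fn `repeats` times and return each call's wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings

def steady_state(timings):
    """Average of the timings after dropping the first (warm up) run."""
    rest = timings[1:]
    return sum(rest) / len(rest)

_compiled = set()

def fake_kernel():
    """Hypothetical workload: pays a one-time setup cost on its first call only."""
    if "kernel" not in _compiled:
        time.sleep(0.05)          # pretend this is the one-time compile
        _compiled.add("kernel")
    return sum(i * i for i in range(1000))

timings = time_runs(fake_kernel)
print("first run:    %.4f s" % timings[0])
print("steady state: %.4f s" % steady_state(timings))
```

Reporting the steady-state number next to the first-run number keeps the warm up from skewing your results in either direction.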


Once data has been cast to the GPU, subsequent MATLAB operations performed on those data types are automatically executed on the GPU. You can bring the results back to the CPU for any other processing by simply casting them back to the appropriate MATLAB data types, e.g. single, double, etc. For most data-parallel computations this is all that is necessary to get a performance boost from the GPU and Jacket.


When using for-loops, some GPU-based computations must pass through Jacket’s compile on-the-fly system, which invokes NVIDIA’s compiler, nvcc. Compiling kernels is computationally expensive when done with nvcc. Jacket minimizes this expense by employing a lazy execution design and trace saving. However, in loops, it is important to watch out for iterating parameters which may be forcing an nvcc compilation with every loop iteration. Random numbers, too, can force compilations in certain cases and slow down performance.
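The loop problem is easier to see with a toy model. The Python sketch below is not Jacket’s actual implementation, and the article does not say how Jacket decides when to recompile. It simply assumes a cache keyed by the kernel’s source text, with Python’s `eval` standing in for nvcc. `KernelCache` is a name I made up for the illustration.

```python
class KernelCache:
    """Toy compile-on-the-fly cache keyed by kernel source text (assumption)."""

    def __init__(self):
        self._cache = {}
        self.compile_count = 0

    def run(self, source, *args):
        fn = self._cache.get(source)
        if fn is None:
            # Cache miss: "compile" the kernel. eval() stands in for nvcc here.
            self.compile_count += 1
            fn = eval(source)
            self._cache[source] = fn
        return fn(*args)

# Loop parameter baked into the kernel text: every iteration is a new kernel.
baked = KernelCache()
for k in range(10):
    baked.run("lambda x: x * %d" % k, 2.0)

# Loop parameter passed as an argument: one kernel, compiled once and reused.
passed = KernelCache()
for k in range(10):
    passed.run("lambda x, k: x * k", 2.0, k)

print(baked.compile_count, passed.compile_count)  # 10 1
```

Both loops compute the same values, but the first one compiles ten times and the second only once. If a GPU loop of yours is mysteriously slow, a value that changes on every iteration is a good first suspect.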


Looking at other third-party packages like Open Quant or Marketcetera, I do not see them delivering the advantages that Matlab has. Also, your performance will depend on how you code up your C++ and Matlab script files. I would suggest paying attention to the points above to get incredible performance out of your High Frequency Trading application.

Also, I must mention this is just theory. In practice, I am hoping to prove Matlab can deliver the lowest possible latency for any high frequency trading solution!!

