How do you compare two systems in high-frequency trading (HFT) and quant analytics?
If you have two different systems that generate two different sets of return numbers, how do you usually decide which one is better? I am assuming that it is not just about taking the absolute value at the end of the testing period. Is there something like a common “quality test” that takes the variance etc. into account?
Well, the classical measure is the Sharpe ratio: mean % return minus the risk-free rate, divided by the standard deviation of returns. Be careful to use the same time frame for the returns and for the standard deviation. Basically, what matters is how much return you made, over what time frame, and with how much risk.
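A minimal sketch of that calculation in Python, assuming simple per-period returns and a risk-free rate quoted for the same period. The function name `sharpe_ratio`, the 252-trading-day annualisation factor, and the sample numbers are illustrative assumptions, not taken from any particular library.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from per-period returns.

    `returns` and `risk_free_rate` must refer to the same period
    (for example, both daily).
    """
    excess = [r - risk_free_rate for r in returns]
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("zero variance: Sharpe ratio is undefined")
    return statistics.mean(excess) / stdev * math.sqrt(periods_per_year)

# Two made-up systems with the same total return but different variability.
daily_a = [0.010, -0.005, 0.007, 0.002, -0.003, 0.006]
daily_b = [0.030, -0.025, 0.020, -0.015, 0.018, -0.011]

print(sharpe_ratio(daily_a), sharpe_ratio(daily_b))
```

Both samples add up to the same return (up to floating-point rounding), so the difference in the two ratios comes entirely from how bumpy the ride was.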
There are other ways, of course. But the Sharpe ratio has good theory behind it and is optimal under some conditions. For example, if somebody claims that they made an 80% return on something, ask them for the Sharpe ratio. Such returns are often made by taking excessive leverage. The Sharpe ratio brushes all that away and gives you an honest, deleveraged picture.
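To illustrate the leverage point, here is a small sketch. It assumes the returns are already excess returns (risk-free rate subtracted) and ignores financing costs and compounding; the helper `per_period_sharpe` and the leverage factor of 5 are hypothetical.

```python
import statistics

def per_period_sharpe(excess_returns):
    """Mean excess return divided by its sample standard deviation."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)

unlevered = [0.004, -0.002, 0.003, 0.001, -0.001]
leverage = 5  # hypothetical leverage factor
levered = [leverage * r for r in unlevered]

# Leverage multiplies the headline return...
print(sum(unlevered), sum(levered))
# ...but scales mean and standard deviation equally, so the ratio is unchanged.
print(per_period_sharpe(unlevered), per_period_sharpe(levered))
```

Under these simplifying assumptions the levered series shows five times the return with the same Sharpe ratio, which is why the ratio is a fairer basis for comparison than the raw return.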
The Sortino ratio (based on the semideviation of returns) is a better metric than Sharpe. However, both can be completely misleading. For example, Berkshire Hathaway's Sharpe ratio since 1990 is below 0.6, much less than LTCM used to have. Where is LTCM? 🙂 Comparing two systems is not a straightforward task.
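For reference, a sketch of one common way to compute a Sortino ratio: mean return above a target, divided by the downside deviation measured over all observations. The function name, the default target of zero, and the convention of dividing by the full sample size are assumptions; other conventions exist.

```python
import math

def sortino_ratio(returns, target=0.0):
    """Per-period Sortino ratio: only returns below `target` count as risk."""
    excess = [r - target for r in returns]
    # Squared shortfalls; periods at or above target contribute zero.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0:
        raise ValueError("no returns below target: Sortino ratio is undefined")
    return (sum(excess) / len(excess)) / downside_dev

print(sortino_ratio([0.02, -0.01, 0.015, -0.005, 0.01]))
```

Unlike the Sharpe ratio, this does not penalise large gains, only shortfalls below the target.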
Well, if you calculate LTCM’s Sharpe up to the time they vanished, it’d probably be quite low as well. Further, I think it is not exactly correct to compare a hedge fund to Berkshire Hathaway.
Apart from ratios, there are many other things you can take into consideration. I feel that the comparison makes sense largely if the two systems are correlated. If there is no correlation, a Sharpe/Sortino comparison would probably be fine. If there is a correlation, then depending on your requirements you go for the one that suits your risk appetite. I am sure some backtesting will have been done, so look for the days/trades where you lost a significant amount of money. The general Value at Risk metrics can be used. See if the trading system is sustainable if such a loss occurs. Choose the one with the risk that you can afford to tolerate, while making sure that the risk-return trade-off is correct.
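A rough sketch of both checks, pulling the worst periods out of a backtest and reading off a historical Value at Risk. The function names and sample returns are made up, and the quantile rule used here is just one simple empirical convention among several.

```python
def worst_periods(returns, n=3):
    """The n most negative returns, worst first."""
    return sorted(returns)[:n]

def historical_var(returns, confidence=0.95):
    """Empirical VaR: the loss at the `confidence` quantile of observed losses.

    Returned as a positive number (0.03 means a 3% loss).
    """
    if not returns:
        raise ValueError("need at least one return")
    losses = sorted(-r for r in returns)  # ascending, positive = loss
    index = min(int(confidence * len(losses)), len(losses) - 1)
    return losses[index]

backtest = [0.01, -0.02, 0.015, -0.08, 0.005, 0.02, -0.03, 0.01, -0.005, 0.012]
print(worst_periods(backtest, 2))
print(historical_var(backtest, confidence=0.9))
```

With only ten observations the tail estimate is obviously crude; the point is to compare the size of such losses against the capital you can afford to lose.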
Further, for two systems giving similar results, ALWAYS choose the simpler model. Complicated models with heavy calculations are more likely to be prone to overfitting, i.e. on historical data they give amazing results, but on unseen data they are terrible.
To guard against this, we generally use cross validation. For example, we get our optimized parameters/ratio for the trading based on two-thirds of the dates, and the returns are then computed for the remaining one-third. The system with better returns on that held-out one-third of the dates is the better system.
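The two-thirds / one-third procedure could be sketched like this. Everything in it is an illustrative assumption: the toy momentum rule, the candidate lookbacks, and the synthetic price series. The split is chronological, so the held-out third lies entirely after the data used for tuning.

```python
def momentum_returns(prices, lookback):
    """Toy rule: be long for the next step if price rose over `lookback` steps."""
    out = []
    for t in range(lookback, len(prices) - 1):
        position = 1 if prices[t] > prices[t - lookback] else 0
        out.append(position * (prices[t + 1] / prices[t] - 1))
    return out

def split_two_thirds(series):
    """Chronological split: first 2/3 for tuning, last 1/3 held out."""
    cut = (2 * len(series)) // 3
    return series[:cut], series[cut:]

def pick_lookback(prices, candidates):
    """Choose the lookback with the highest summed return on the given data."""
    return max(candidates, key=lambda lb: sum(momentum_returns(prices, lb)))

# Synthetic prices: upward drift plus a repeating wiggle.
prices = [100 + 0.5 * t + 3 * ((t % 7) - 3) for t in range(90)]

in_sample, out_sample = split_two_thirds(prices)
best = pick_lookback(in_sample, candidates=[2, 5, 10])   # tuned on 2/3 only
oos_total = sum(momentum_returns(out_sample, best))      # judged on unseen 1/3
print(best, oos_total)
```

To compare two systems, each would be tuned on `in_sample` only and then judged on its `out_sample` figure.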
LTCM had a decent Sharpe ratio for some time. There is nothing wrong in comparing the two, as both employ(ed) systems.
Yes, there are a lot of things to consider. Correlation numbers between the two systems (or the lack thereof) can be misleading. In real life, correlation tends to let you down when you need it most (the beauty of tail risk). It is important to make a thorough assessment of the robustness of the systems being compared. Robustness is often more important than the observed quality of returns expressed by Sharpe/Sortino.
Yes, it is worth applying Occam’s razor if the models are similar.
Cross validation is a common technique, but quite often the sample is not big enough. There are other alternatives available.
If you have 2 uncorrelated systems both making money, then trading both is better than trading any one just by itself.
Replace 2 by 10, and you have found the holy grail.
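A small worked example of why this holds, using two made-up return series built to have the same mean, the same variance, and exactly zero sample correlation. Under those idealised assumptions an equal-weight mix of N such systems has a Sharpe ratio sqrt(N) times that of a single one; real systems are rarely this well-behaved.

```python
import statistics

def per_period_sharpe(returns):
    return statistics.mean(returns) / statistics.stdev(returns)

# Constructed so the deviations from the mean are orthogonal (zero correlation).
system_a = [0.03, -0.01, 0.03, -0.01]
system_b = [0.03, 0.03, -0.01, -0.01]

# Trade both with half the capital in each.
combined = [0.5 * a + 0.5 * b for a, b in zip(system_a, system_b)]

print(per_period_sharpe(system_a))
print(per_period_sharpe(system_b))
print(per_period_sharpe(combined))
```

The combined series keeps the same mean return while its standard deviation shrinks by a factor of sqrt(2), so the ratio improves by that factor.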
I am pretty much a beginner, so I won’t know much. Could you elaborate more on the alternative techniques for cross validation?
Also, I did not understand why correlation can be misleading? Or is it in a way saying historical data need not necessarily be a good predictor of the future?
I would appreciate your comments.
Bootstrapping can be a good alternative, given that your sample size is likely to be limited.
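One way this might look in practice: resample the same days for both systems with replacement and count how often system A comes out ahead on Sharpe. The function names, the sample data, and the choice of a plain i.i.d. bootstrap are assumptions; a plain bootstrap ignores any autocorrelation in the returns, which a block bootstrap would try to preserve.

```python
import math
import random
import statistics

def sharpe(returns):
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else float("nan")

def bootstrap_prob_a_beats_b(returns_a, returns_b, n_resamples=2000, seed=0):
    """Fraction of paired resamples in which A has the higher Sharpe ratio."""
    if len(returns_a) != len(returns_b) or len(returns_a) < 2:
        raise ValueError("need two aligned series of at least two returns")
    rng = random.Random(seed)
    n = len(returns_a)
    wins = valid = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # same days for both systems
        sa = sharpe([returns_a[i] for i in idx])
        sb = sharpe([returns_b[i] for i in idx])
        if math.isnan(sa) or math.isnan(sb):
            continue  # degenerate resample with zero variance
        valid += 1
        wins += sa > sb
    if valid == 0:
        raise ValueError("no usable resamples")
    return wins / valid

a = [0.010, -0.005, 0.007, 0.002, -0.003, 0.006, 0.004, -0.001]
b = [0.030, -0.025, 0.020, -0.015, 0.018, -0.011, 0.009, -0.012]
print(bootstrap_prob_a_beats_b(a, b))
```

A value near 0.5 suggests the data cannot tell the two systems apart; a value near 0 or 1 suggests the ranking is at least stable under resampling.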
The correlation can be misleading when the sample of returns is not representative. Consider a robust strategy and an overfitted one: both will exhibit similar behaviours (correlate) in stationary market conditions (which the latter is optimised for), but diverge when a regime shift occurs.
The holy grail or not, it is definitely the way to go. Two things that can never be overrated are stop losses and diversification.
The problem when comparing two systems is that it is easy to end up comparing apples and oranges.
For example, any trading strategy can be represented using a payoff matrix: Σ(H.*ΔP) where a strategy H is applied to a price differential matrix over the whole portfolio history. If you compare 2 strategies H1 and H2 over the same portfolio (same stock selection), then you are really comparing two strategies. However, if you make changes to the stock selection process, you are not testing only 2 strategies anymore, you are also testing the stock selection process.
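A minimal sketch of the payoff expression Σ(H.*ΔP) in plain Python, comparing two holdings matrices over the same stocks. The layout (rows are time steps, columns are stocks, `holdings[t][j]` is the number of shares held over step t to t+1) and all numbers are assumptions made for illustration.

```python
def price_changes(prices):
    """ΔP: row t holds P[t+1] - P[t] for every stock."""
    return [[nxt - cur for cur, nxt in zip(row, next_row)]
            for row, next_row in zip(prices, prices[1:])]

def total_payoff(holdings, prices):
    """Σ(H .* ΔP): element-wise product of holdings and price changes, summed."""
    delta = price_changes(prices)
    if len(holdings) != len(delta):
        raise ValueError("holdings need one row per price step")
    return sum(h * d
               for h_row, d_row in zip(holdings, delta)
               for h, d in zip(h_row, d_row))

# Same two stocks for both strategies, so only H differs.
prices = [[10, 20], [11, 19], [12, 21]]
h1 = [[100, 0], [100, 0]]   # buy and hold the first stock
h2 = [[0, 50], [50, 50]]    # a different allocation over the same stocks

print(total_payoff(h1, prices), total_payoff(h2, prices))
```

Because `prices` is shared, any difference between the two totals is attributable to H alone; changing the columns of `prices` as well would mix the stock selection into the comparison.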
Picking 100 stocks from the S&P 500 will represent only one selection, with its own signature, out of many googols (a googol is 10 to the power 100) of possibilities. So, really the question should be: what is being compared? The trading method or the stock selection process?
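The size of that selection space is easy to check, treating the index as exactly 500 names (an approximation) and ignoring order:

```python
import math

n_selections = math.comb(500, 100)   # ways to choose 100 stocks out of 500
googol = 10 ** 100

print(len(str(n_selections)))        # number of decimal digits
print(n_selections // googol)        # how many googols that is
```

The count is on the order of 10^107, so "many googols" is, if anything, an understatement.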
Even comparing two systems over past market data can only give an indication, a method of analysis of what was; not of what is to be!