Covariance matrix PCA decomposition in quant analytics
In order to use principal component analysis on a covariance matrix, is it better to perform eigenvalue decomposition on the covariance matrix directly or on the correlation matrix?
Whether to apply PCA to the covariance or the correlation matrix is an empirical question. If, say, everything is well behaved under a Normal (Gaussian) assumption on the joint distribution and all marginal distributions, then the results should not differ much statistically whether you use the covariance or the correlation matrix.
However, if the problem is ill-conditioned, you may want to relax the Gaussian assumption. And if you want to stress correlations, or need to build the correlation matrix from different sources, then you would apply PCA to the correlation matrix.
The same holds if you want to use a whole set of different assumptions on the joint and marginal distributions by applying a copula. See Sklar's theorem on applying copulas, where in most cases the PCA is on the correlation matrix.
Also, R. Rebonato and P. Jäckel (1999) have a paper on stressing and fixing correlation matrices. Regards
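To make the "fixing" point concrete: a correlation matrix pieced together from different sources can end up with negative eigenvalues, so it is not a valid correlation matrix. Below is a minimal sketch of one common repair (clip the negative eigenvalues, then rescale to a unit diagonal). The function name `repair_correlation` and the example matrix are made up for illustration, and this is not claimed to be the exact procedure in the paper mentioned above.

```python
import numpy as np

def repair_correlation(c, floor=0.0):
    """Hypothetical helper: clip negative eigenvalues, then rescale
    so the diagonal is 1 again. An illustration, not a reference method."""
    c = (c + c.T) / 2.0                     # enforce symmetry
    vals, vecs = np.linalg.eigh(c)          # eigenvalues in ascending order
    vals = np.clip(vals, floor, None)       # remove negative eigenvalues
    b = vecs * np.sqrt(vals)                # scale each eigenvector column
    b /= np.linalg.norm(b, axis=1, keepdims=True)  # unit-length rows -> unit diagonal
    return b @ b.T

# Illustrative pairwise correlations that are mutually inconsistent.
bad = np.array([[ 1.0, 0.9, -0.9],
                [ 0.9, 1.0,  0.9],
                [-0.9, 0.9,  1.0]])
fixed = repair_correlation(bad)

print("eigenvalues before:", np.round(np.linalg.eigvalsh(bad), 3))
print("eigenvalues after :", np.round(np.linalg.eigvalsh(fixed), 3))
print("repaired matrix:\n", np.round(fixed, 3))
```

The repaired matrix is positive semi-definite with ones on the diagonal, so PCA on it gives non-negative eigenvalues. A row whose clipped eigen-components are all zero would need separate handling; the sketch does not guard against that.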
I’d look at correlation and break volatility out separately.
I suspect the answer is that it is better to use the correlation, and that is what the ‘factor.model.stat’ function in the BurStFin R package does http://www.portfolioprobe.com/2012/02/16/the-burstfin-r-package/
Correlations are dynamic, but covariances are really dynamic. Hence I’m guessing that leaving the volatility on the side, as William suggests, is the better thing to do. However, I’ve never seen a comparison; it would be interesting to see how different they are.
What I am understanding from the discussion so far is that correlation matrices are more stable than covariance matrices. If anyone is aware of any empirical research on this, especially in the context of equity markets, I would be very interested in any pointers to some papers on the topic.
Most of the research on Random Matrix Theory is in terms of correlation matrices. This is because using the correlation matrix effectively scales the variance of each asset to be 1. So there are lots of papers using the correlation matrix. I don’t recall off the top of my head any papers that compare the different methods empirically.
Using the correlation matrix becomes more important when you’re dealing with different kinds of assets. For instance, if you’re looking at log changes in equities and changes in bond yields, then the bond yields will be much less volatile than the equities. So if you take the first n factors of the covariance matrix, the bond yields will likely barely register in them. However, if you look at the first n factors of the correlation matrix, the bond yields will carry much more weight.
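The scale effect described above is easy to see on simulated data. This is only a sketch: the volatilities, the single common driver and the four series are invented for illustration and are not calibrated to any market.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: one common driver, two "equity" series with roughly
# 1% volatility and two "bond yield change" series with roughly 0.05%.
common = rng.standard_normal(n)
noise = rng.standard_normal((n, 4))
vols = np.array([0.01, 0.012, 0.0005, 0.0006])
x = (0.7 * common[:, None] + 0.7 * noise) * vols

cov = np.cov(x, rowvar=False)
corr = np.corrcoef(x, rowvar=False)

def first_pc(m):
    """Eigenvector belonging to the largest eigenvalue of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(m)   # eigenvalues come back in ascending order
    v = vecs[:, -1]
    return v * np.sign(v.sum())      # fix the arbitrary sign

pc_cov = first_pc(cov)
pc_corr = first_pc(corr)
print("first PC of covariance :", np.round(pc_cov, 3))
print("first PC of correlation:", np.round(pc_corr, 3))
```

With the covariance matrix the first component loads almost entirely on the two high-volatility series, while with the correlation matrix all four series get comparable loadings.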
The problem with using the correlation matrix is that if you generate factors from the initial returns and the matrix of eigenvectors from the correlation matrix, then the factors are not uncorrelated. However, if you use Z-scores instead, they will be uncorrelated. So basically, you have to remove the first two unconditional moments from everything and assume those are estimated correctly before dealing with the factors.
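Here is a small check of that point, as a sketch on simulated returns (the correlation matrix and volatilities are made-up numbers). Factors are built twice with the same correlation-matrix eigenvectors, once from raw returns and once from z-scores.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Hypothetical returns: three correlated assets with very different vols.
true_corr = np.array([[1.0, 0.6, 0.3],
                      [0.6, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])
vols = np.array([0.02, 0.005, 0.0008])
chol = np.linalg.cholesky(true_corr)
returns = (rng.standard_normal((n, 3)) @ chol.T) * vols

corr = np.corrcoef(returns, rowvar=False)
_, eigvecs = np.linalg.eigh(corr)

# Factors from raw returns and correlation eigenvectors.
f_raw = returns @ eigvecs

# Factors from z-scores: remove the sample mean, divide by the sample std.
z = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
f_z = z @ eigvecs

def max_offdiag_corr(f):
    """Largest absolute off-diagonal entry of the factor correlation matrix."""
    c = np.corrcoef(f, rowvar=False)
    return np.abs(c - np.diag(np.diag(c))).max()

print("max |corr| between raw-return factors:", max_offdiag_corr(f_raw))
print("max |corr| between z-score factors   :", max_offdiag_corr(f_z))
```

The z-score factors are uncorrelated in sample up to floating-point error, because the sample covariance of the z-scores is exactly the correlation matrix being diagonalized. The raw-return factors are not, since their covariance is the eigenvector matrix applied to the covariance matrix rather than to the correlation matrix.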
Another way to think about it is to imagine that you have a copula where the fat tails and the dynamic adjustment of the means and variances have already been accounted for in all the assets. The copula's margins are then all uniformly distributed, and if you apply the inverse normal CDF to them, the copula is transformed into a joint distribution in which each marginal distribution is normal with constant mean 0 and standard deviation 1. Hence, the covariance matrix of this distribution equals its correlation matrix. So in this sense, this dimension reduction is only about analyzing the copula (a separate one would be required if you want to reduce the number of assets over which you estimate fat tails, dynamic means/variances, etc).
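A rough sketch of that transformation follows. One loud assumption: instead of fitting a model for the marginals (fat tails, dynamic means and variances), it uses empirical ranks as a stand-in to get the uniforms, and the fat-tailed data is simulated from a multivariate Student-t with invented parameters.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
n = 2000

# Hypothetical fat-tailed data: multivariate Student-t (3 degrees of
# freedom) with different scales per asset. Numbers are illustrative.
true_corr = np.array([[1.0, 0.6, 0.3],
                      [0.6, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])
chol = np.linalg.cholesky(true_corr)
g = rng.standard_normal((n, 3)) @ chol.T
w = rng.chisquare(df=3, size=n) / 3.0
x = g / np.sqrt(w)[:, None] * np.array([0.02, 0.005, 0.0008])

# Step 1: map each column to (0, 1) using ranks (a stand-in for the
# marginal CDFs that a fitted model would provide).
ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
u = ranks / (n + 1)

# Step 2: apply the inverse normal CDF to get normal scores.
z = np.vectorize(NormalDist().inv_cdf)(u)

cov_z = np.cov(z, rowvar=False)
corr_z = np.corrcoef(z, rowvar=False)
print("max |cov - corr| of normal scores:", np.abs(cov_z - corr_z).max())
print("correlation of normal scores:\n", np.round(corr_z, 3))
```

After the transformation each column has mean close to 0 and variance close to 1, so the covariance and correlation matrices nearly coincide and PCA on either gives essentially the same answer. The variance is slightly below 1 in a finite sample because the rank-based uniforms never reach the extreme tails.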
With regard to Patrick’s point, I’m not sure that’s the best reason to prefer correlation over covariance. For instance, if you perform PCA on the covariance matrix, then estimate GARCH models on the top n factors and scale the factors by their fitted volatilities, you’re left with scaled residuals that can be analyzed with a copula. The scaled residuals may not be uncorrelated in this copula, and the copula may be time-varying. So if the problem is that the covariance matrix is sort of double-dynamic (i.e. time-varying variances and time-varying correlations), then I think it is still possible to analyze it.