Tag Archives: PCA

Does This Spark an Idea? Principal Component Analysis (PCA) of Yesterday and Today


http://www.geniq.net/res/Principal-Component-Analysis-Yesterday-and-Today.html


 

=

We've just covered PCA in my Data Mining class, so I'd be interested in reading more of your article. Thanks!

 

==
The article is actually a chapter in my new book:
http://www.geniq.net/statistical-machine-data-mining-Bruce-Ratner-book.html
However, if you would rather not buy it, I will be glad to chat with you.

==

 

Interesting. I am definitely interested in highly correlated eigengenes as a way to find expression networks.

 

==

PCA has long been a key part of my toolbox for dealing with high-dimensional data. At the start, it's a good way to check whether samples cluster by treatment in a PCA biplot. Then I look at which variables have the highest loading on each principal component, which often gives an initial idea of what the main biological trends in the samples' response might be. At later stages I may filter variables by ANOVA and then do PCA using just that subset of variables.
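A rough sketch in R of the workflow described above, assuming a hypothetical numeric matrix or data frame expr (samples in rows, variables in columns) and a factor treatment giving each sample's group:

# `expr` and `treatment` are placeholders; substitute your own data
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Do samples cluster by treatment? A quick score plot of the first two PCs:
plot(pca$x[, 1:2], col = as.integer(treatment), pch = 19)

# Which variables load most heavily on PC1?
head(sort(abs(pca$rotation[, 1]), decreasing = TRUE), 10)

# Filter variables by a per-variable ANOVA, then redo PCA on that subset
pvals <- apply(expr, 2, function(v) anova(lm(v ~ treatment))[["Pr(>F)"]][1])
pca2  <- prcomp(expr[, pvals < 0.05], center = TRUE, scale. = TRUE)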

 


Quant analytics: Can anyone suggest a good starting point for trying out some open source tools for PCA / PLS / OPLS?


 

==

I assume that PCA means Principal Components Analysis? If so, the base installation of R comes with at least two options: princomp (see ?princomp) for eigenvalue decomposition, and prcomp (see ?prcomp) for singular value decomposition. I've also had a reasonably good experience using the "principal" function in the psych package (see ?psych::principal). If you're not familiar with R, you can look here for an introduction (http://www.statmethods.net/), or check out the R group here on LinkedIn (http://www.linkedin.com/groups/R-Project-Statistical-Computing-77616). R would also be able to handle your PLS and OPLS needs; I just don't have any specific recommendations for those procedures since I don't use them in my own work.
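For anyone who wants to try the options mentioned above side by side, here is a small sketch using R's built-in USArrests data set (the last call assumes the psych package is installed):

data(USArrests)

# prcomp: PCA via singular value decomposition
p1 <- prcomp(USArrests, scale. = TRUE)
summary(p1)            # proportion of variance explained by each component

# princomp: PCA via eigenvalue decomposition (cor = TRUE uses the correlation matrix)
p2 <- princomp(USArrests, cor = TRUE)
loadings(p2)

# psych::principal
# install.packages("psych")
library(psych)
p3 <- principal(USArrests, nfactors = 2, rotate = "none")
p3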

 

==

http://www.numericaldynamics.com/DownLoad.html

 

www.knime.org is an open source platform for data analysis. It has PCA (assuming PCA here means Principal Component Analysis).

 


Quant analytics: does anyone do covariance matrix PCA decomposition?

Who does this? Is this empirical? Can it be applied to a covariance matrix or to a correlation matrix? Are they dynamic? Are correlation matrices more stable? This may be an interesting avenue for research in optimizing matrix manipulation with these techniques.


Covariance matrix PCA decomposition in quant analytics


In order to use principal component analysis on a covariance matrix, is it better to perform eigenvalue decomposition on the covariance matrix directly or on the correlation matrix?

 

==

Whether to apply PCA to the covariance or the correlation matrix is an empirical question. If everything is well behaved under the Normal (Gaussian) assumption on the joint distribution and all marginal distributions, then the results should not differ much statistically whether you use covariance or correlation.

However, if the problem is ill-conditioned, you may want to relax the Gaussian assumption. And if you want to stress correlations, or need to build the correlation matrix from different sources, then you would apply PCA to the correlation matrix.

Alternatively, if you want to use a whole set of different assumptions on the joint and marginal distributions, you can apply a copula. See Sklar's theorem on copulas; in most of those cases the PCA is done on the correlation matrix.

Also, R. Rebonato and P. Jäckel (1999) have a paper on stressing and fixing the correlation matrix. Regards
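As a small illustration of the covariance-versus-correlation choice, here is a sketch in R; the returns matrix is made up for the example, so substitute your own data. With prcomp, scale. = FALSE works from the covariance matrix and scale. = TRUE from the correlation matrix:

# Made-up return matrix: four assets with very different volatilities
set.seed(1)
returns <- matrix(rnorm(500 * 4), ncol = 4) %*% diag(c(0.02, 0.01, 0.001, 0.0005))

# PCA on the covariance matrix: high-volatility assets dominate the first components
pca_cov <- prcomp(returns, center = TRUE, scale. = FALSE)

# PCA on the correlation matrix: every asset is rescaled to unit variance first
pca_cor <- prcomp(returns, center = TRUE, scale. = TRUE)

# Compare how quickly the explained variance accumulates in the two cases
summary(pca_cov)$importance["Cumulative Proportion", ]
summary(pca_cor)$importance["Cumulative Proportion", ]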

 

=

I’d look at correlation and break volatility out separately.

 

==

I suspect the answer is that it is better to use the correlation matrix, and that is what the 'factor.model.stat' function in the BurStFin R package does http://www.portfolioprobe.com/2012/02/16/the-burstfin-r-package/

Correlations are dynamic, but covariances are even more dynamic. Hence I'm guessing that leaving the volatility to the side, as William suggests, is the better thing to do. However, I've never seen a comparison; it would be interesting to see how different they are.

 

==

What I am understanding from the discussion so far is that correlation matrices are more stable than covariance matrices. If anyone is aware of any empirical research on this, especially in the context of equity markets, I would be very interested in any pointers to some papers on the topic.

 

==

Most of the research on Random Matrix Theory is framed in terms of correlation matrices, because using correlations effectively scales the variance of each series to 1. So there are lots of papers using the correlation matrix. I don't recall off the top of my head any papers that compare the different methods empirically.

Using the correlation matrix becomes more important when you're dealing with different kinds of assets. For instance, if you're looking at log changes in equities and changes in bond yields, the bond yields will be much less volatile than the equities. So if you take the first n factors of the covariance matrix, the bond yields will likely be nowhere near the top. However, if you look at the first n factors of the correlation matrix, the bond yields will be closer to the top.

The problem with using the correlation matrix is that if you generate factors from the initial returns and the matrix of eigenvectors from the correlation matrix, then the factors are not uncorrelated. However, if you use Z-scores instead, they will be uncorrelated. So basically, you have to remove the first two unconditional moments from everything and assume those are estimated correctly before dealing with the factors.

Another way to think about it is to imagine that you have a copula that has all the fat tails and dynamic adjustment of the means and variances accounted for in all the assets. So these are all uniformly distributed and if you take the inverse of the normal on them, then the copula will be transformed into a joint distribution such that each marginal distribution is normal with constant mean 0 and standard deviation 1. Hence, the covariance matrix of this equals the correlation matrix. So in this sense, this dimension reduction is only about analyzing the copula (a separate one would be required if you want to reduce the number of assets over which you estimate fat tails, dynamic means/variances, etc).

With regards to Patrick’s point, I’m not sure that’s the best reason to prefer correlation vs. covariance. For instance, if you perform PCA on the covariance matrix and then estimate Garch models on the top n factors and scale them, then you’re left with scaled residuals that can be analyzed with a copula. The scaled residuals may not be uncorrelated in this copula and the copula may be time-varying. So if the problem is that the covariance matrix is sort of double-dynamic (i.e. time-varying variance and time-varying correlations), then I think it is possible to analyze it.
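Returning to the point above about z-scores: a quick numerical check in R (with a made-up matrix X standing in for returns) that factors built from the correlation matrix's eigenvectors are only uncorrelated when the data are standardized first:

# Made-up data: correlated columns with unequal variances
set.seed(2)
X <- matrix(rnorm(1000 * 3), ncol = 3)
X[, 2] <- 5 * X[, 2] + X[, 1]
X[, 3] <- 0.1 * X[, 3]

V <- eigen(cor(X))$vectors               # eigenvectors of the correlation matrix

F_raw <- scale(X, scale = FALSE) %*% V   # centred but not scaled: factors stay correlated
F_z   <- scale(X) %*% V                  # z-scored first: factors are uncorrelated

round(cor(F_raw), 3)   # off-diagonal entries are generally non-zero
round(cor(F_z), 3)     # identity matrix, up to rounding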

 


Quant analytics: Principal Component Analysis and Linear Discriminant Analysis, and tools to visualize scree plots and PCA?


Hi, can anyone share information about Principal Component Analysis and Linear Discriminant Analysis? What kind of software can I use to visualize a scree plot or a PCA plot? Many thanks.

 

==

What exactly do you want to know about Principal Component Analysis (PCA)? To understand PCA you should have at least a basic knowledge of matrix operations (addition, subtraction, multiplication, inversion, etc.), because PCA works with the variance-covariance matrix (or sigma matrix). It helps with data reduction and data interpretation, and it is mainly used for factoring the sigma matrix in factor analysis.
Discriminant analysis is a completely different concept: there the concern is sorting objects into two or more classes. That technique is used when the dependent variable is categorical and the independent variables are numeric.
You can use SPSS for both of these techniques.
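For those who prefer open-source tools, here is a minimal sketch of the same distinction in R on the built-in iris data (LDA via the MASS package); this is only an illustration, not a full analysis:

library(MASS)

# PCA: uses only the variance-covariance (or correlation) structure, no class labels
pca <- prcomp(iris[, 1:4], scale. = TRUE)
summary(pca)

# LDA: sorts observations into classes defined by the categorical variable Species
fit <- lda(Species ~ ., data = iris)
table(predicted = predict(fit)$class, actual = iris$Species)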

 

==

In addition, PCA does not so much reduce the data as reduce the dimensionality of the data. It is used when you want to keep all of the variables but there is collinearity among them. In my experience PCA is often an intermediate analysis before moving on to factor analysis or regression. You can use SAS, SPSS, or Minitab to conduct this analysis.
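A brief sketch of PCA as an intermediate step before regression (principal component regression) in R, assuming a hypothetical data frame df with a numeric response y and collinear predictors in the remaining columns:

# Principal component regression: PCA first, then regress on the leading components
pca    <- prcomp(df[, setdiff(names(df), "y")], center = TRUE, scale. = TRUE)
scores <- as.data.frame(pca$x[, 1:3])    # keep the first few components
fit    <- lm(df$y ~ ., data = scores)
summary(fit)

# The pls package also offers this directly as pcr(), if installed:
# library(pls); fit2 <- pcr(y ~ ., data = df, scale = TRUE, validation = "CV")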

 


==

Thank you very much for the post. I actually have results for different samples from different countries, generated by an instrument that I use. Following the literature, I believe I can assess the purity and authenticity of these samples; moreover, I can assess the origin of the samples by sorting them into similarity groups.
Much of the literature uses different approaches such as PCA, LDA, DA, PLS, and cluster analysis, and sometimes PCA and LDA are combined to assess the data.
I would like to know which approach is best for this purpose.

 

 


Quant analytics: how to check that data reduction is correct after applying PCA to a data set


==

 

You should check that the cumulative proportion of variance of the number of dimensions you decide to keep is high enough (around 80%, though it depends on the field). In R, you can see this clearly with the summary command, which shows the proportion of variance due to each component. In short, you can reduce the data if you do not lose too much information: if you decide to keep the first two principal components, their cumulative proportion of variance should be large enough to represent the original data set well. Of course the cumulative proportion reaches 100% only if you keep all of the dimensions, but very often only a couple of them are needed to explain a large part of the original data. Hope this helps!
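A concrete example of that check in R, using the built-in USArrests data; the 80% threshold is only a rule of thumb:

pca <- prcomp(USArrests, scale. = TRUE)
summary(pca)   # the "Cumulative Proportion" row shows variance retained by the first k PCs

cumvar <- summary(pca)$importance["Cumulative Proportion", ]
which(cumvar >= 0.80)[1]       # smallest number of components reaching 80%

screeplot(pca, type = "lines") # a scree plot is another common way to pick the cut-off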

 

 


Quant analytics: PCA Tutorial quantturk.com


Principal Component Analysis is widely used in the finance industry for interest rate shocks. The attached is a nice tutorial by Lindsay I… for understanding what PCA is about.
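As a hedged sketch of the interest-rate use case in R: PCA applied to daily changes of a yield curve, where the leading components are usually interpreted as level, slope, and curvature. The yields matrix here is hypothetical (one column per maturity, one row per date):

dy  <- diff(as.matrix(yields))            # daily changes in yields, one column per maturity
pca <- prcomp(dy, center = TRUE, scale. = FALSE)

summary(pca)                              # variance explained by each factor
matplot(pca$rotation[, 1:3], type = "l", lty = 1,
        xlab = "maturity index", ylab = "loading")  # typically level, slope, curvature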

==

You might be interested in Independent Component Analysis. Here is a tutorial:
http://cis.legacy.ics.tkk.fi/aapo/papers/IJCNN99_tutorialweb/
There have been several papers on its use in finance, especially for improved portfolio theory.
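For experimenting with ICA in R, one option is the fastICA package (an R implementation of the FastICA algorithm); the returns matrix below is just placeholder data:

# install.packages("fastICA")
library(fastICA)

set.seed(3)
returns <- matrix(rnorm(1000 * 4), ncol = 4)   # placeholder data only

ica <- fastICA(returns, n.comp = 4)
head(ica$S)   # estimated independent components (sources)
ica$A         # estimated mixing matrix, so that returns ~ S %*% A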

 

 
