Does This Spark an Idea? Principal Component Analysis (PCA) of Yesterday and Today

http://www.geniq.net/res/Principal-Component-Analysis-Yesterday-and-Today.html

 

=

We’ve just covered PCA in my Data Mining class, so I’d be interested in reading more of your article. Thanks!

 

==
The article is actually a chapter in my new book
http://www.geniq.net/statistical-machine-data-mining-Bruce-Ratner-book.html .
However, if you would rather not buy it, I will be glad to chat with you.

==

 

Interesting. I am definitely interested in highly correlated eigengenes as a way to find expression networks.

 

==

PCA has long been a key part of my toolbox for dealing with high-dimensional data. At the start, it’s a good way to check whether samples cluster by treatment in a PCA biplot. Then I look at which variables have the highest loading on each principal component, which often gives an initial idea of what the main biological trends in the response of the samples might be. At later stages I may filter variables by ANOVA and then do PCA using just that subset of variables.
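
As a rough illustration of the "which variables load highest" step, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for the example: the variable names (gene_a to gene_d), the simulated data, and the choice of power iteration on the covariance matrix to get the first component. In real work you would more likely call a tested routine in your statistics package, and you may prefer to standardize the variables first.

```python
import math
import random

def pca_first_component(rows, iterations=200):
    """Unit-length loading vector of the first principal component.

    Power iteration on the sample covariance matrix; a sketch, not a
    substitute for a tested library routine.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # arbitrary non-zero starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical data: 40 samples, 4 made-up variables. One hidden trend t
# drives gene_a strongly, gene_b in the opposite direction, gene_d weakly,
# and gene_c not at all.
rng = random.Random(0)
names = ["gene_a", "gene_b", "gene_c", "gene_d"]
rows = []
for _ in range(40):
    t = rng.gauss(0.0, 1.0)
    rows.append([3.0 * t + rng.gauss(0.0, 0.1),
                 -2.0 * t + rng.gauss(0.0, 0.1),
                 rng.gauss(0.0, 0.1),
                 0.5 * t + rng.gauss(0.0, 0.1)])

loadings = pca_first_component(rows)

# Rank variables by the absolute size of their loading on PC1.
ranking = sorted(zip(names, loadings), key=lambda pair: abs(pair[1]), reverse=True)
for name, loading in ranking:
    print(f"{name}: {loading:+.3f}")
```

The variables at the top of the ranking are the ones that dominate the component; opposite signs (as with gene_a and gene_b here) indicate variables that move in opposite directions along that trend.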

 


Can anyone help with implementing Principal Component Analysis for statistical data reduction in quant analysis?

 

==

There are LOTS of online resources that walk you through this… find the one that corresponds to the software you’re using.

For example, if you’re using SAS, consider:
http://support.sas.com/publishing/pubcat/chaps/55129.pdf

Or, if you’re using SPSS, consider:
http://www.unt.edu/rss/class/Jon/SPSS_SC/Module9/M9_PCA/SPSS_M9_PCA1.htm

If you’re using another statistical program, consider:
http://www.google.com 😉

 

==

Can anyone help with conjoint analysis and perceptual mapping using SAS and SPSS?

 

==

If you are not into programming, you could try some Excel add-ins, e.g., XLSTAT

XLSTAT is commercial, but maybe you can try this one:
http://sourceforge.net/projects/imdev/
PCA is also very easy to run in RapidMiner, which is free.

 

==

There is also a cookbook:
http://www.simafore.com/blog/bid/62910/How-to-run-Principal-Component-Analysis-with-RapidMiner-Part-1

 

==

As usual, Google is your friend, and you should ask it before bothering groups with trivial questions.
Besides what has already been recommended here, I really like Numerical Recipes since it combines explaining mathematical concepts with actual code that works.

 

==

Lots of people have pointed you at tools for this technique.

But be aware that PCA is not a good tool for reducing the number of variables, because every principal component carries a coefficient for every variable. So it doesn’t reduce the variables you need (unless you start playing with some arbitrary rule for deleting variables).
Also, if you have information about causality among the variables, PCA ignores it; e.g., some variables could be interrelated and others obviously not.
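
To make the "coefficient for every variable" point concrete, here is a small Python sketch. The ten (x, y) pairs are illustrative numbers only, and the closed-form eigen-decomposition used here applies to the two-variable case; it is an assumption of this example, not something from the thread.

```python
import math

# Illustrative numbers only: two strongly correlated variables, ten observations.
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
a = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
c = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix.
half_trace = (a + c) / 2
radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lambda_1, lambda_2 = half_trace + radius, half_trace - radius

# Eigenvector for lambda_1 (valid because b != 0 here), scaled to unit length.
length = math.hypot(b, lambda_1 - a)
weights = (b / length, (lambda_1 - a) / length)

# Share of total variance captured by the first component.
explained = lambda_1 / (lambda_1 + lambda_2)

# The single PC1 score per observation still needs both x and y as inputs.
scores = [weights[0] * (xi - mean_x) + weights[1] * (yi - mean_y)
          for xi, yi in zip(x, y)]

print(f"PC1 weights: x={weights[0]:.3f}, y={weights[1]:.3f}")
print(f"Variance explained by PC1: {explained:.1%}")
```

One component summarizes most of the variance, so the dimensionality drops from two to one, but both weights are far from zero: you still have to measure both variables to compute the score.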

It also assumes there are no subgroups in your sample (if there are, the PCs are confounded with group differences). In practice this is often not true, and then it is better to identify the variables that determine the groups, which is a different problem.
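
The subgroup problem can be sketched in a few lines of Python. The two groups below are made-up points, and the helper names (principal_axis_degrees, center) are hypothetical; the angle formula is the standard closed form for the major axis of a 2-D covariance matrix.

```python
import math

def principal_axis_degrees(points):
    """Angle of the first principal axis of 2-D points, in degrees from the x axis."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # tan(2*theta) = 2*sxy / (sxx - syy); the 1/(n-1) divisor cancels out.
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))

def center(points):
    """Subtract the group's own mean from every point."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return [(p[0] - mx, p[1] - my) for p in points]

# Made-up data: inside each group the variation runs along y,
# but the two groups sit far apart along x.
group_1 = [(0.1, -2.0), (-0.1, -1.0), (0.0, 0.0), (0.1, 1.0), (-0.1, 2.0)]
group_2 = [(px + 10.0, py) for px, py in group_1]

# PCA on the pooled sample: PC1 follows the gap between the groups.
pooled_angle = principal_axis_degrees(group_1 + group_2)

# PCA after removing each group's mean: PC1 follows the within-group trend.
within_angle = principal_axis_degrees(center(group_1) + center(group_2))

print(f"pooled PC1 angle: {pooled_angle:.1f} degrees")
print(f"within-group PC1 angle: {within_angle:.1f} degrees")
```

In the pooled analysis the first component points almost exactly along x, which here only encodes group membership; once the group means are removed it points almost along y, the direction of the actual within-group variation.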

 

==

John Parker’s SAS reference is a great one, very straightforward. Keep in mind that even when PCA is used successfully in models, you still have to have a clear interpretation of the variable(s) you end up keeping. In many cases, if you cannot clearly interpret the final model output, all the PCA work might have been for nothing (for example, in a policy or economics paper for a peer-reviewed journal). If you only have 20 or 30 variables, there are better ways to handle multicollinearity and data reduction.

 

 

==

For most practical situations PCA is pretty useless.

 

==

It is useful. But these days, with access to neural networks, bagged and boosted trees, and SVMs, variable reduction is redundant, unless of course your computing power is not up to the task.

 

==

There is some free, open-source code for this written in MATLAB. It is very easy to perform.

 

==

I have always used the statistical software R to perform PCA (it is free). You can simply use the command “princomp”. You can also find lots of manuals about R on the Internet.

 

 
