**How could we do predictions with data mining for quant analytics?**

==

Predictive models fall into two major groups: classification and regression models.

Both aim to build a model that predicts the value of one variable from the values of other variables. Both take a set of training data as input. Each training instance has several attributes, one of which is the variable to be predicted. In classification, this variable is categorical and is called the class variable; in regression, it is real-valued and is known as the dependent variable. Predictive models use the training data at hand to learn a mapping from the input variables to the dependent variable. The resulting model is then used to predict the value of the dependent variable for a new instance whose independent variables are all known.

Examples of classification algorithms are decision trees, neural networks, nearest neighbor, rule learners, etc.
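To make the distinction concrete, here is a minimal sketch in plain Python. It is not a recommended implementation: the function names (`nearest_neighbor_predict`, `fit_line`) and all the numbers are made up for illustration. The first function is a 1-nearest-neighbor classifier (categorical target), the second fits a straight line by least squares (real-valued target).

```python
import math

def nearest_neighbor_predict(train, query):
    """Classification: return the class label of the closest training instance.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    """
    _, label = min(train, key=lambda inst: math.dist(inst[0], query))
    return label

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: two attributes per instance plus a class variable.
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((4.0, 4.2), "high"),
    ((3.8, 4.0), "high"),
]
predicted_class = nearest_neighbor_predict(train, (3.5, 3.9))

# Hypothetical training data: one independent and one dependent variable.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
slope, intercept = fit_line(xs, ys)
predicted_value = slope * 5.0 + intercept  # prediction for a new instance, x = 5

print(predicted_class, round(predicted_value, 2))
```

In both cases the model is learned from the training instances and then applied to a new instance whose independent variables are known; only the type of the predicted variable differs.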

==

These videos may be helpful: http://www.youtube.com/user/11AntsAnalyticsTV and there is also a free trial here: http://www.11antsanalytics.com/

==

Have you considered using JMP Pro, a tool from SAS that allows users to create predictive models visually?

http://www.jmp.com/uk/software/jmp9/pro/

==

I have used PLS regression with XLSTAT.

==

I am using a hybrid PLS/NN method for regression. PLS is used for variable pruning and the NN for prediction.

==

Could you please explain the hybrid PLS/NN method to me?

=-=

This is particularly interesting when you have many variables. Prediction using an NN with too many variables can be computationally complex and time-consuming. So you first run a PLS to select the most important variables (for example, those with a VIP, i.e. variable importance in projection, score higher than 0.8) and then do your NN prediction with the reduced set of variables.
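A rough sketch of that two-step idea, assuming NumPy and synthetic data. This is not the exact method used above: the function names (`pls1_vip`, `train_mlp`), the choice of two PLS components, the network size, and the learning settings are all assumptions for illustration. Only the 0.8 VIP threshold comes from the description. VIP is computed from a single-response PLS fit (NIPALS), and the NN is a small one-hidden-layer network trained by plain gradient descent.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """VIP scores from a single-response PLS (NIPALS) fit."""
    Xr = (X - X.mean(axis=0)) / X.std(axis=0)   # autoscale the predictors
    yr = y - y.mean()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))        # unit-length weight vectors
    ss = np.zeros(n_components)                 # y sum of squares explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                              # scores
        tt = t @ t
        p = Xr.T @ t / tt                       # X loadings
        q = (yr @ t) / tt                       # y loading
        Xr = Xr - np.outer(t, p)                # deflate X and y
        yr = yr - q * t
        W[:, a] = w
        ss[a] = q ** 2 * tt
    return np.sqrt(n_vars * (W ** 2 @ ss) / ss.sum())

def train_mlp(X, y, hidden=8, lr=0.1, epochs=2000, seed=0):
    """One-hidden-layer tanh network fitted by full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        err = H @ W2 + b2 - y
        dH = np.outer(err, W2) * (1.0 - H ** 2)  # backpropagate through tanh
        gW1, gb1 = X.T @ dH / n, dH.mean(axis=0)
        gW2, gb2 = H.T @ err / n, err.mean()
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2

# Synthetic data (assumption): only the first two of ten variables drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.normal(size=500)

# Step 1: PLS-based pruning, keeping variables with VIP above 0.8.
vip = pls1_vip(X, y)
selected = np.flatnonzero(vip > 0.8)

# Step 2: NN prediction on the reduced, standardized variables.
X_sel = X[:, selected]
X_std = (X_sel - X_sel.mean(axis=0)) / X_sel.std(axis=0)
y_std = (y - y.mean()) / y.std()
predict = train_mlp(X_std, y_std)
mse = float(np.mean((predict(X_std) - y_std) ** 2))
print("selected variables:", selected, "training MSE:", round(mse, 3))
```

The error printed here is a training error on standardized y. On real data you would want to check the reduced model on held-out instances, and tune the number of PLS components and the threshold rather than take them as given.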

==

Predictive models are, in general, either ‘classification’ or ‘regression’ models. The goal is to build a model in which the value of one variable can be predicted from the values of other variables.

Classification is used for ‘categorical’ variables (e.g. Y/N, or answers on a scale such as 1–5 for “like best” to “like least”).

Regression is used for “continuous” variables (e.g., variables where the values can be any number, with decimals, between one number and another; age of a person would be an example, or blood pressure, or number of cases of a product coming off an assembly line each day).

==

Thanks for the information. Is it possible to apply PLS regression to predict, for example, binary materials in materials science? If so, how could we do it?

==

You should take a look at Krishna Rajan’s work for materials science predictions. For example:

The application of Principal Component Analysis to materials science data, Data Science Journal, Vol. 1 (2002) pp.19-26;

Materials informatics, Materials Today, Volume 8, Issue 10, October 2005, Pages 38–45

and quite a few others
