Tag Archives: Quant analytic

Quant analytics: More Q&A from a member about paid vs free market data source and time frequency with ARIMA


Questions from a member:

I reviewed the links. I see all the literature is about stocks… Are there any examples with R code for forex cointegration? Why is Lmax a candidate for forex trading? What do we look for in a broker to be a candidate for HFT?

My answer:
These should work for forex. I will be posting some links on ARIMA as a model type; this was the most popular type of modeling in a recent poll I did. Lmax does not manipulate your orders, so they stay neutral to your trades. This is what you look for in a broker. I hope this helps.

More questions from the same member:
Thanks, I will wait for your post. How many periods do you check with ARIMA? Which time frame?
Is the source feed Yahoo? Thanks

My response:
I am probably going to use 1-minute bars, but you could easily use ticks if you want. My source is IQFeed; Yahoo can really only be used for end-of-day data if you want free data.
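For anyone who wants to experiment before my ARIMA links go up, here is a minimal Python sketch (not my exact setup) of fitting an ARIMA model to 1-minute bar returns with statsmodels. The price series is simulated here; in practice it would come from a feed such as IQFeed.

```python
# Minimal sketch: ARIMA on 1-minute bar returns (simulated prices)
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
minutes = pd.date_range("2013-01-02 09:30", periods=390, freq="min")
prices = pd.Series(100 + np.cumsum(rng.normal(0, 0.02, size=390)), index=minutes)
returns = prices.pct_change().dropna()

# Small, arbitrary (p, d, q) order; in practice select it via an AIC/BIC search
fit = ARIMA(returns, order=(2, 0, 1)).fit()
print(fit.summary())
print(fit.forecast(steps=5))   # forecast the next five 1-minute returns
```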


How to compare two systems in high frequency trading HFT and quant analytics?


Hi all,
If you have two different systems that generate two different sets of return numbers, how do you usually decide which one is better? I am assuming it is not just about taking the absolute value at the end of the testing period… Is there something like a common “quality test” where you take the variance etc. into account?
Cheers,

==

 

Well, the classical way of measuring this is the Sharpe ratio: mean % return minus the risk-free rate, divided by the standard deviation of returns. Be careful that you are using equal time frames for the returns and the variance. Basically, what matters is how much return you made, in what time frame, and with how much risk.

There are other ways, of course, but the Sharpe ratio has good theory behind it and is optimal under some conditions. For example, if somebody claims they made an 80% return on something, ask them for the Sharpe ratio. Such returns are usually made by taking excessive leverage. The Sharpe ratio brushes all that away and gives you an honest, deleveraged picture.
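As a quick illustration, here is a minimal Python sketch of the annualized Sharpe ratio described above; the return series and the risk-free rate are dummy placeholders.

```python
# Minimal sketch: annualized Sharpe ratio from periodic returns (dummy data)
import numpy as np

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Mean excess return divided by standard deviation, annualized."""
    returns = np.asarray(returns)
    excess = returns - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

daily_returns = np.random.normal(0.0005, 0.01, size=252)  # dummy daily returns
print(sharpe_ratio(daily_returns))
```

The annualization factor assumes daily returns; use the factor that matches your bar frequency so both systems are measured over equal time frames.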

 

==

The Sortino ratio (based on the semideviation of returns) is a better metric than Sharpe. However, both can be completely misleading. For example, Berkshire Hathaway’s Sharpe ratio since 1990 is below 0.6, much less than what LTCM used to have. Where is LTCM? 🙂 Comparing two systems is not a straightforward task.
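For comparison, a minimal sketch of the Sortino ratio, which replaces the full standard deviation with the downside (semi-)deviation below a target return; the data are again dummy values.

```python
# Minimal sketch: Sortino ratio = excess return over downside deviation (dummy data)
import numpy as np

def sortino_ratio(returns, target=0.0, periods_per_year=252):
    returns = np.asarray(returns)
    excess = returns - target
    downside = np.minimum(excess, 0.0)            # only returns below the target
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return np.sqrt(periods_per_year) * excess.mean() / downside_dev

daily_returns = np.random.normal(0.0005, 0.01, size=252)  # dummy daily returns
print(sortino_ratio(daily_returns))
```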

 

==

Well, if you calculate LTCM’s Sharpe by the time they vanished, it would probably be quite low as well. Further, I think it is not exactly correct to compare a hedge fund to Berkshire Hathaway.

Apart from ratios, there are many other things you can take into consideration. I feel the comparison makes sense largely if the two systems are correlated; if there is no correlation, a Sharpe/Sortino comparison would probably be fine. In case of a correlation, depending on your requirements, you go for the one that suits your risk appetite. I am sure backtesting would be done, so look for days/trades where you lost a significant amount of money. The general metric of Value at Risk can be used: see whether the trading system is sustainable if such a loss occurs. Choose the one with the risk that you can afford to tolerate, while making sure that the risk-return trade-off is correct.
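As a rough illustration of the Value at Risk idea mentioned above, here is a minimal historical-simulation sketch on a dummy P&L series.

```python
# Minimal sketch: 95% one-day historical VaR from daily P&L (dummy data)
import numpy as np

pnl = np.random.normal(0, 1000, size=500)   # hypothetical daily P&L numbers
var_95 = -np.percentile(pnl, 5)             # loss exceeded on the worst 5% of days
print(f"95% one-day VaR: {var_95:.0f}")
```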

Further, for two systems giving similar results, ALWAYS choose the simpler model. Complicated models with heavy calculations are more than likely to be prone to overfitting, i.e. they would give amazing results on historical data but be terrible on unseen data.

To solve this, we generally use cross-validation: for example, we get our optimized parameters/ratios for the trading based on two-thirds of the dates, and the returns are then computed for the remaining one-third. The system with better returns on that remaining one-third of the dates is the better system.
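A minimal sketch of that 2/3 in-sample / 1/3 out-of-sample check, using a toy moving-average rule as a stand-in for whichever system is being evaluated; everything here (data, strategy, parameter grid) is hypothetical.

```python
# Minimal sketch: optimize on the first 2/3 of dates, judge on the last 1/3
import numpy as np

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0.01, 1.0, size=900))  # dummy price path
split = int(len(prices) * 2 / 3)

def backtest(p, lookback):
    """Toy system: long when price is above its moving average, flat otherwise."""
    ma = np.convolve(p, np.ones(lookback) / lookback, mode="valid")
    pos = (p[lookback - 1:-1] > ma[:-1]).astype(float)   # yesterday's signal
    return np.sum(pos * np.diff(p[lookback - 1:]))       # P&L in price points

# Optimize the lookback on the in-sample 2/3 only
lookbacks = range(5, 60, 5)
best = max(lookbacks, key=lambda lb: backtest(prices[:split], lb))

# Compare this out-of-sample number across the systems being judged
print(best, backtest(prices[split:], best))
```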

 

==

They had a decent Sharpe ratio for some time. There is nothing wrong with comparing the two, as both employ(ed) systems.

Yes, there are a lot of things to consider. Correlation numbers between the two systems (or the lack thereof) can be misleading. In real life, correlation tends to let you down when you need it most (the beauty of tail risk). It is important to make a thorough assessment of the robustness of the systems being compared. Robustness is often more important than the observed quality of returns expressed by Sharpe/Sortino.

Yes, it is worth applying Occam’s Razor principle if the models are similar.

Cross-validation is a common technique, but quite often the sample space is not big enough. There are other alternatives available.

 

==

If you have 2 uncorrelated systems that both make money, then trading both is better than trading either one by itself.

Replace 2 with 10, and you have found the holy grail.

 

==

I am pretty much a beginner, so I don’t know much. Could you elaborate more on the alternative techniques to cross-validation?

Also, I did not understand why correlation can be misleading. Or is that in a way saying that historical data need not necessarily be a good predictor of the future?

I would appreciate your comments.

 

==

Bootstrapping can be a good alternative, given that your sample size is likely to be limited.

Correlation can be misleading when the sample of returns is not representative. Consider a robust strategy and an overfitted one: both will exhibit similar behaviours (correlate) in the stationary market conditions that the latter is optimised for, but they will diverge when a regime shift occurs.
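As a rough illustration of the bootstrapping suggestion, here is a minimal sketch that resamples a small return series to put a confidence interval around a performance statistic (the Sharpe ratio here); the data are dummy values.

```python
# Minimal sketch: bootstrap a Sharpe ratio from a limited return sample (dummy data)
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=250)   # roughly one year of daily returns

def sharpe(r):
    return np.sqrt(252) * r.mean() / r.std(ddof=1)

boot = [sharpe(rng.choice(returns, size=len(returns), replace=True))
        for _ in range(2000)]                   # resample with replacement
print(np.percentile(boot, [2.5, 97.5]))         # bootstrap confidence interval
```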

 

==

The holy grail or not, it is definitely the way to go. Two things that can never be overrated are stop losses and diversification.

 

==

The problem when comparing two systems is that it is easy to end up comparing apples and oranges.

For example, any trading strategy can be represented using a payoff matrix: Σ(H .* ΔP), where a strategy H is applied to a price-differential matrix ΔP over the whole portfolio history. If you compare two strategies H1 and H2 over the same portfolio (same stock selection), then you are really comparing two strategies. However, if you make changes to the stock selection process, you are not testing only two strategies anymore; you are also testing the stock selection process.
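A minimal numerical sketch of that payoff-matrix view, with a random holdings matrix H and simulated prices standing in for a real strategy and portfolio:

```python
# Minimal sketch: total strategy payoff = sum(H .* dP) over the portfolio history
import numpy as np

rng = np.random.default_rng(1)
n_periods, n_stocks = 250, 5
prices = 100 + np.cumsum(rng.normal(0, 1, size=(n_periods, n_stocks)), axis=0)
dP = np.diff(prices, axis=0)                   # price differential matrix
H = rng.integers(-1, 2, size=dP.shape)         # toy strategy: -1/0/+1 units held

total_payoff = np.sum(H * dP)                  # element-wise product, then sum
print(total_payoff)
```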

Picking 100 stocks from the S&P 500 represents only one selection, with its own signature, out of many googols (10^100) of possibilities. So the question really should be: what is being compared, the trading method or the stock selection process?

Even comparing two systems over past market data can only give an indication; it is an analysis of what was, not of what is to be!

 


Forget genetic algorithms or evolutionary learning in my quant analytic strategy research. Tool is Matlab


Seriously. This is too experimental and immature. I will stick with the purest form of classic algos and models rather than crossbreeding them. I find it interesting, but it is too early for me to get into it. This is according to my Matlab development and testing results!


Quant analytic: Best method to transform a continuous variable to categorical variable in order to build a logistic regression model?


 

==

Which logistic regression model do you intend to use? If binary logistic: just decide on a cut-point that separates the two categories. In SPSS, you can use the recode method available on the Transform menu, or do it via syntax.
If you create more than two categories, you might look at ordinal regression, which is an extension of the binary logistic model but quite challenging to interpret. I assume your categories would be ordered, so in a multinomial logistic model you would lose part of the information contained in the data.
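For readers outside SPSS, here is a rough Python sketch of the same recode-then-fit step, assuming scikit-learn; the data and the cut-point are entirely hypothetical.

```python
# Minimal sketch: recode a continuous variable at a cut-point, then fit binary logistic regression
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                                # predictors (dummy data)
income = 40000 + 15000 * X[:, 0] + rng.normal(0, 5000, 500)  # continuous variable

y = (income > 40000).astype(int)                             # chosen cut-point: $40k
model = LogisticRegression().fit(X, y)
print(model.coef_, model.intercept_)
```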

==

 

I would urge caution and recommend you reconsider whether you really want to “bin” your continuous outcome variable.

Logistic regression is best applied when the two outcomes reflect distinct states (for example, has diabetes vs. does not have diabetes). If you took a continuous variable, like income, and binned it into “over $40k” and “$40k or less”, you really don’t have distinct states; the difference between $39,999 and $40,001 is trivial.

If you are struggling with a skewed outcome variable, I recommend you consider these two alternatives before resorting to binning it:
(1) Use a generalized linear model and select an appropriate distribution (Poisson and Gamma are quite popular); or
(2) Try transforming your outcome variable (such as a log transformation) to see if that makes it “more normal”.
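A minimal sketch of those two alternatives, assuming statsmodels: a Gamma GLM with a log link fitted to the skewed outcome directly, and an OLS fit to the log-transformed outcome; the data are simulated.

```python
# Minimal sketch: (1) Gamma GLM on a skewed outcome, (2) OLS on the log-transformed outcome
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.gamma(shape=2.0, scale=np.exp(0.5 * x))   # skewed, strictly positive outcome
X = sm.add_constant(x)

glm = sm.GLM(y, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()
ols_on_log = sm.OLS(np.log(y), X).fit()           # alternative: log transform then OLS
print(glm.params, ols_on_log.params)
```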

 

==

 

==

You can generate a sequence of cut-off points and then separate the continuous data into binary classes using each cut-off. For each resulting logistic regression, calculate the AUC. Find the highest AUC and the corresponding cut-off; that cut-off may be the optimal one for classifying your data into binary.
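A rough sketch of that cut-off search, assuming scikit-learn; the data, the candidate cut-offs, and the in-sample AUC evaluation are illustrative simplifications.

```python
# Minimal sketch: search cut-offs, fit a logistic regression per cut-off, keep the best AUC
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
outcome = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(0, 1, size=500)  # continuous outcome

best_auc, best_cut = -np.inf, None
for cut in np.quantile(outcome, np.linspace(0.1, 0.9, 17)):   # candidate cut-offs
    y = (outcome > cut).astype(int)
    p = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]
    auc = roc_auc_score(y, p)
    if auc > best_auc:
        best_auc, best_cut = auc, cut
print(best_cut, best_auc)
```

In practice the AUC should be computed on held-out data, otherwise the search simply rewards overfitting.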

 

==
I don’t understand how to use a generalized linear model such as Poisson or Gamma to bin a continuous variable. Can you give a simple example?

==

 

I was suggesting that you consider using a generalized linear model instead of binning, not as a method to create bins. Sorry for the misunderstanding.

 

==

 

I wouldn’t recommend doing that. Why would you want to lose the richness of data collected on a ratio scale by downgrading it to a categorical scale? It might be better to run a linear or non-linear regression and thus retain the robustness of the data you’ve collected. I’d recommend looking at various bivariate scatter plots (two-dimensional plots between the dependent variable and each of the independent variables, one by one) as a first step to understand the nature of the relationship, and then choosing an appropriate regression model accordingly. But if for any reason you must change the dependent variable to a categorical scale, you could follow a simple procedure: classify the data into intervals, examine the frequency distribution to look for distinct concentrations or groupings, choose cut-off points as appropriate, and combine the intervals into groups accordingly. Assign a value to each group, and what you now have is a categorical scale. Hope this helps.

 

 


Details for Top and Most Popular Quant Analytic Articles at QuantLabs.net


 

Top and Most Popular Quant Analytics Articles

This zip file contains:

 

best-tutorial-on-matrices-within-matlab-not-so-hard-with-this-one.pdf

filename.txt

quant-analytics-alternative-to-multiple-regression.pdf

quant-analytics-anyone-know-of-rapid-miner-for-open-source-visual-modelling.pdf

quant-analytics-best-definition-of-mean-reversion-.pdf

quant-analytics-black-scholes-vs-binomial-differences.pdf

quant-analytics-correlation-coedfficient-a-correlation-matrix-in-excel.pdf

quant-analytics-correlation-matrix-of-sp500-components (1).pdf

quant-analytics-correlation-matrix-of-sp500-components (2).pdf

quant-analytics-correlation-matrix-of-sp500-components.pdf

quant-analytics-fractal-volatility.pdf

quant-analytics-how-excel-can-find-implied-volatility-.pdf

quant-analytics-how-to-do-piecewise-cubic-spline-for-treasury-yield-curve-within-excel.pdf

quant-analytics-hull-white-model-explained-as-easily-as-possible.pdf

quant-analytics-intro-to-value-ar-risk-var-with-excel-.pdf

quant-analytics-is-this-the-easiest-way-to-calculate-maximum-drawdown-calculation.pdf

quant-analytics-mathematical-approach-to-order-book-modelling.pdf

quant-analytics-nonlinear-interpolation-with-excels-solver-to-construct-yield-curve.pdf

quant-analytics-polynomial-fitting-algorithm.pdf

quant-analytics-trading-a-mathematical-finance.pdf

quant-analytics-using-excel-linest-function-for-multivariate-regression.pdf

quant-analytics-value-at-risk-var-historical-simulation-for-portfolio.pdf

quant-analytics-very-simple-binomial-option-pricing-cox-ross-rubinstein-crr-model.pdf

quant-analytics-what-is-bivariate-normal-distribution.pdf

quant-analytics-what-is-the-bjerksund-stensland-model.pdf

quant-analytics-youtube-videos-on-covariance-and-co

 

http://quantlabs.net/labs/index.php?option=com_docman&task=doc_details&gid=814


What are your “must read” blogs, websites to keep up to speed on new Quant Analytic/Data Mining/AI and Machine Learning projects/trends?


Greetings, I have been recruiting in analytics for the last few years on a small scale for my wireless clients. I am now moving my practice exclusively to “Big Data” projects, as I find the stuff fascinating.

What are the must-read (online or mailed) periodicals/blogs and industry newsletters you have bookmarked or subscribed to and refer to in those rare “free moments” you have? I would also welcome any calls/emails to network about exciting projects you are working on or are interested in.


Sandro Saitta keeps a list of blogs on his site: http://www.dataminingblog.com/list-of-blogs/. KDnuggets is the gold standard. Twitter lists for infovis, rstats, and hadoop are another great resource for the latest trends.


A must-follow is Ajay Ohri’s blog http://www.decisionstats.com/, and also http://smartdatacollective.com/, particularly James Taylor’s column and Tom Fuyala.

I spent over an hour on SmartDataCollective just now. I really appreciate your helping out!


Quant analytics: Is it possible to improve a predictive model using the data that come after the model is applied?


The new data are presumably biased towards the treatment, so how can they be utilized in this case? Thanks.

That depends on the model in question. Your initial question is, unfortunately, too general to give a specific answer. You could look into Bayesian inference or hierarchical Bayesian modelling, with your initial model as your prior.
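As a toy illustration of the “initial model as prior” idea (not a recipe for the poster’s specific problem), here is a conjugate Beta-Binomial update where the old model’s estimated response rate is the prior and the new, post-deployment data update it; all numbers are hypothetical.

```python
# Minimal sketch: Beta-Binomial update of a response-rate estimate with new data
from scipy import stats

# Prior from the original model: response rate ~ Beta(a, b), roughly 20% with modest confidence
a, b = 20, 80

# New data observed after the model was applied (biased towards the treatment group)
successes, failures = 45, 155

posterior = stats.beta(a + successes, b + failures)
print(posterior.mean(), posterior.interval(0.95))
```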

 

Yes, consider a Self-Organizing Map (SOM). This separates your variables and allows you to find those that have higher correlations with the other variables. “After the model is applied,” you can formulate weights based on these results and feed them into a stronger model such as a neural network.

 

 


Quant analytics: How to Transform a distribution to normal


Apart from the Box-Cox transformation, are there other methods for transforming a distribution to normal?

I interpret the problem you posed as follows:

X = your data set (a column of numbers).

You are looking for a function f such that Y = f(X) ~ Normal(some mean, some s.d.).

Box-Cox is one possibility; however, there are many other possibilities depending on the distribution of your X.

 

For example:

1) If X ~ Lognormal, then log(X) ~ Normal.

2) If your X is like correlation coefficients (ranging between -1 and 1), Fisher’s transform will convert it to approximately normal: f(r) = 0.5 * ln( (1+r) / (1-r) ).

3) In fact, based on the theory of probability distributions, more bizarre choices of X and f(x) can be constructed.

For an applied statistician, you may need to focus on what the original distribution of X is and go from there.
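A minimal sketch of the first two examples, on dummy data:

```python
# Minimal sketch: log transform for lognormal data, Fisher's z for correlation-like values
import numpy as np

rng = np.random.default_rng(0)

x_lognormal = rng.lognormal(mean=0.0, sigma=1.0, size=1000)
y1 = np.log(x_lognormal)                       # approximately Normal(0, 1)

r = rng.uniform(-0.9, 0.9, size=1000)          # correlation-like values in (-1, 1)
y2 = 0.5 * np.log((1 + r) / (1 - r))           # Fisher's z = arctanh(r)

print(y1.mean(), y1.std(), y2.mean(), y2.std())
```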

For crude purposes you can subtract the sample mean and divide by the sample s.d. The resulting data will be approximately Normal(0,1) for a moderately large sample size.

If you’re working in one dimension, you can always, for a value drawn from your distribution, calculate the fraction of values lower than it (i.e. get the value of the cumulative distribution function) and then use the inverse of the cumulative normal distribution to look up the transformed value. The transformed values will follow a normal distribution by construction.
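A minimal sketch of that rank-based construction, assuming scipy; the input sample is simulated:

```python
# Minimal sketch: empirical CDF followed by the inverse normal CDF (quantile transform)
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1000)      # any 1-D, non-normal sample

ranks = stats.rankdata(x)                      # ranks 1..n
u = ranks / (len(x) + 1)                       # empirical CDF values in (0, 1)
z = stats.norm.ppf(u)                          # normal by construction

print(stats.shapiro(z[:500]))                  # quick sanity check for normality
```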

 

Maybe you could tell us *why* you want to transform your distribution into a normal one?

I agree with Andre’s question. Do you really need your data to follow a normal distribution, say after some transformation process? In case it is impossible to achieve this, there are a number of methods in the literature that can handle non-normal data.

Why? And I might add, it would be useless unless you could transform the results obtained under the normal model back to the proper domain.

Transformation depends on the existing distribution. A statistician performs a diagnosis before treatment, just as a doctor does. You can try a series of transformations and use the Kolmogorov-Smirnov test to check whether the transformed data are close to a normal distribution, then choose the best transformation. There are also powerful non-parametric methods for testing data whose distributions are not normal.

Suppose, for example, that you examine the days of hospitalization of in-patients. You’ll find a large group with 0 days; is the distribution of days of hospitalization then normal? In this case the solution is not a transformation but an analysis in two separate phases: model 0 vs. non-zero with a logistic function, and then model the length of stay of those hospitalized with a normal function. In short, there must first be a correct diagnosis of the underlying data before choosing the solution.

 

 


Quant analysis: Why would a quant trader not want to enter Brazil for market opportunities?


With the current economic situation in Brazil, countless opportunities are being generated in the Brazilian market, both for companies and for professionals. Companies in most sectors are growing steadily, while more and more employment opportunities are opening every day.

In this climate, J & P Emerging Enterprises enters the market bringing deep knowledge of business development and career consulting to prepare foreign companies, investors and professionals to get the most out of these opportunities, devising a strategic plan for each case and aiming to explore the best of what is in the market for each specific scenario.
Brazil is one of the fastest-growing major economies in the world, with GDP in the second quarter of 2010 up 8.8% from the same quarter in 2009 and GDP growth expected to stay around 5.9% for the next 5 years, helped by the World Cup in 2014, the Olympic Games in 2016, and PAC 2 (a government-funded growth acceleration program). The Brazilian administration is also planning to boost private-sector long-term financing, with a goal of reaching 19.1% of GDP by 2010-2011.

Brazil has abundant and well-developed agricultural, mining, manufacturing, and service sectors. In 2008, Brazil became a net external creditor and two ratings agencies awarded investment-grade status to its debt. Consumer confidence has been steadily rising due to Brazil’s per capita GDP, which increased 17.8% from R$13,931 in 2003 to R$16,414 in 2009.

Billions of dollars of foreign capital are being invested into Brazil. From now until after 2016 when the Olympics will be held in Rio de Janeiro, direct foreign investment into Brazil should increase steadily and is expected to reach a total of US$33 billion for 2010 and 2011.

In 2008, 34 Brazilian companies were listed on the Forbes Global 2000 list: Petrobras, which operates in the oil & gas sector, ranked #8 in the world; Vale, which operates in the mining sector, ranked #49 worldwide; and Brazilian banking giant Banco Bradesco placed #81 in the world, just to name a few.

The service sector is the largest component of GDP at 66.8%, followed by the industrial sector at 29.7% (2007 est.). Agriculture represents 3.5% of GDP (2008 est.). The Brazilian labor force is estimated at 100.77 million, of which 10% works in agriculture, 19% in industry and 71% in services.

All these factors make Brazil an extremely attractive market for foreign companies attempting to expand their reach, which is why direct foreign investment into Brazil has increased by billions of dollars over the past couple of years, hundreds of foreign companies are taking advantage of the tremendous market opportunities, and thousands of foreign professionals are looking for placement in the country.

http://jpemergingenterprises.com/home.html
