Monthly Archives: May 2012

Crucial and many helpful R packages and research papers for finance and HFT with quant model, algo, and strategy example

Note that none of these have been verified or validated yet, but don’t mind me, I feel like a kid in a candy factory with these!

With Interactive Brokers and R:

http://blog.fosstrading.com/2010/05/introducing-ibrokers-and-jeff-ryan.html

http://cran.r-project.org/web/packages/IBrokers/vignettes/RealTime.pdf

Implied volatility:

http://www.r-bloggers.com/the-only-thing-smiling-today-is-volatility/

For volatility forecasting using GARCH

http://www.r-bloggers.com/trading-using-garch-volatility-forecast/

Time series analysis and computational finance (tseries package), including cointegration tests

www.stat.ucl.ac.be/ISdidactique/Rhelp/library/tseries/html/00Index.html
urca R package with cointegration
http://cran.r-project.org/web/packages/urca/index.html

http://global-4-lvs-colossus.opera-mini.net/hs36-13/15877/1/-1/cran.r-project.org/urca.pdf

Limit Order Book R package

http://r-forge.r-project.org/R/?group_id=790 (not on CRAN, and there does not seem to be a download link)
Engle Granger coefficient test

http://cran.r-project.org/web/packages/tsDyn/tsDyn.pdf
CRAN – Package crawl random walk theory

http://cran.r-project.org/web/packages/crawl/index.html

Time series analysis in r (includes autocorrelation p17)

http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf
Ljung-Box test in R (includes time series)

Ljung Box part of this: http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf

http://cran.r-project.org/doc/contrib/Ricci-refcard-ts.pdf

Auto regressive estimation model
http://cran.r-project.org/web/packages/cts/vignettes/kf.pdf

Auto regressive is part of http://quantlabs.net/r-blog/2012/05/excellent-tutorial-on-using-urca-r-package-for-var-cointegration-statistical-tests-non-stationary-processes-benchmarks-and-estimating-models/
R time series pair trading with Engle and Granger cointegration
http://cran.r-project.org/web/packages/PairTrading/PairTrading.pdf
Volatility models
http://cran.r-project.org/web/packages/realized/realized.pdf
Brownian Motion
http://cran.r-project.org/web/packages/sde/sde.pdf
Non parametric regression estimation
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-nonparametric-regression.pdf
Time based arbitrage opportunities
http://www.r-bloggers.com/time-based-arbitrage-opportunities-in-tick-data/

Bid Ask spread with tick data rtaq R package
http://cran.r-project.org/web/packages/RTAQ/RTAQ.pdf
Tick data bid ask spread
http://cran.r-project.org/web/packages/FinTS/FinTS.pdf
High frequency data analysis in R with the TAQ database
http://faculty.washington.edu/ezivot/research/hfanalysis.pdf
Probability of observing k arrivals

http://cran.r-project.org/web/packages/HMM/HMM.pdf
Note Amihud reference of cran in the following research paper:

http://poseidon01.ssrn.com/delivery.php?ID=595118123002081089030087126071081068052035058029030050009002086102005018011112069076118021122027111056019097028001082100025005051092069006116118100098122075080031073081071095115105007093083028120122&EXT=pdf
Info and market impact

http://www.econ.kuleuven.be/public/n09022/RTAQ_vignette.pdf
Most profitable hedge fund strategy in r

http://www.r-bloggers.com/most-profitable-hedge-fund-style/
Econometric Analysis of Financial Market Data

http://www.math.uncc.edu/~zcai/FE-notes.pdf

PCA in R

http://www.r-bloggers.com/principal-component-analysis-use-extended-to-financial-economics-part-2/
Statistical arbitrage in r

http://www.r-bloggers.com/most-profitable-hedge-fund-style/

Dynamic modeling of mean-reverting spreads for statistical arbitrage

http://imperial.academia.edu/GiovanniMontana/Papers/1104540/Dynamic_modeling_of_mean-reverting_spreads_for_statistical_arbitrage

CAPM in R (note the PerformanceAnalytics R package may be just as effective)
http://cran.r-project.org/web/packages/BLCOP/vignettes/BLCOP.pdf
Package RTAQ liquidity arbitrage

http://cran.r-project.org/web/packages/RTAQ/index.html
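
Several of the links above deal with Engle and Granger cointegration for pair trading. To give a feel for the idea before digging into the packages, here is a minimal sketch in base R only. The two price series are simulated (hypothetical data, not from any of the links), and the second step is a plain Dickey-Fuller style regression rather than a full test. The proper critical values are more negative than the usual t-table ones, so for real work use something like the urca or tseries packages linked above.

```r
set.seed(42)
n <- 500

# Hypothetical log prices: x is a random walk, y tracks x plus stationary noise
x <- cumsum(rnorm(n))
y <- 0.8 * x + rnorm(n, sd = 0.5)

# Step 1: estimate the hedge ratio by ordinary least squares
fit <- lm(y ~ x)
beta <- coef(fit)[["x"]]
spread <- as.numeric(residuals(fit))

# Step 2: regress the change in the spread on its lagged level (no intercept).
# A strongly negative t-statistic suggests the spread is mean-reverting.
d_spread <- diff(spread)
lag_spread <- spread[-length(spread)]
df_fit <- lm(d_spread ~ 0 + lag_spread)
t_stat <- summary(df_fit)$coefficients["lag_spread", "t value"]

cat("hedge ratio:", round(beta, 3), "\n")
cat("t-statistic on lagged spread:", round(t_stat, 2), "\n")
```

The spread from step 1 is what a pair trade would actually bet on: long one leg, short beta units of the other, and wait for the spread to revert.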

The mother lode of R packages for financial trading, quant, and potential high frequency trading (HFT) needs

There seems to be an endless supply of R finance packages, and this looks like a decent list. Some of them are quant-based. This is my first day researching, so I cannot vouch for any of them yet. I know some R packages can be duds, and I am not sure whether any of these are, but they are part of CRAN, which is a positive sign. Here we go:

Extreme value analysis:

http://cran.r-project.org/web/packages/evir/evir.pdf

Refer to p 39 for parameter use in Gvt:

http://www.stat.colostate.edu/graybillconference2009/Workshop%20Files/ShortCourseGraybill.pdf

 

 

Potential fat tail analysis, which led to the ones below:

http://braverock.com/brian/R/PerformanceAnalytics/html/Return.Geltner.html

The PerformanceAnalytics package is quite amazing and easy to use given the amount of analysis it offers, e.g. VaR:

http://r.789695.n4.nabble.com/Value-at-risk-td3516991.html
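
Since VaR comes up here, this is a rough base R sketch of what a historical VaR number means, without needing any package. The returns are simulated fat-tailed draws (an assumption for illustration only), and the variable names are my own. PerformanceAnalytics has its own VaR function with more methods, which I have not compared against this.

```r
set.seed(1)

# Hypothetical daily returns with fat tails (Student-t, 4 degrees of freedom)
returns <- 0.01 * rt(2500, df = 4)

p <- 0.95

# Historical VaR: the loss level exceeded on only (1 - p) of the days
hist_var <- -quantile(returns, probs = 1 - p, names = FALSE)

# Gaussian VaR for comparison: assumes normally distributed returns
gauss_var <- -(mean(returns) + qnorm(1 - p) * sd(returns))

# Expected shortfall: average loss on the days that breach the historical VaR
es <- -mean(returns[returns <= -hist_var])

cat("historical VaR:", round(hist_var, 4), "\n")
cat("gaussian VaR:  ", round(gauss_var, 4), "\n")
cat("expected shortfall:", round(es, 4), "\n")
```

The expected shortfall is always at least as large as the historical VaR, because it averages only the losses beyond that threshold.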

Overview and demo of PerformanceAnalytics (PA):

http://cran.r-project.org/web/packages/PerformanceAnalytics/vignettes/PerformanceAnalyticsChartsPresentation-Meielisalp-2007.pdf

http://www.rinfinance.com/RinFinance2009/presentations/PA%20Workshop%20Chi%20RFinance%202009-04.pdf

How to read trade data and convert it for use with the PA package:

http://quant.stackexchange.com/questions/1536/use-trades-as-input-for-performanceanalytics

How to back test strategies with PA:

http://blog.fosstrading.com/2011/03/how-to-backtest-strategy-in-r.html

A technical package:

http://cran.r-project.org/web/packages/TTR/index.html

TradeAnalytics packages which includes quantstrat:

 

 

http://cran.r-project.org/web/packages/TTR/index.html

Intro to quantstrat:

http://blog.fosstrading.com/2011/08/introduction-to-quantstrat.html

General list of R packages for quant trading:

http://blog.fosstrading.com/2011/08/introduction-to-quantstrat.html

The mother lode of all financial trading packages in CRAN:

http://cran.wustl.edu/web/views/Finance.html

I feel like a kid in a candy factory with all this. Makes me wonder how Matlab is going to keep up. Wow! Thanks to all the contributors above for all of these. Now I have to start digging and playing with everything. I will also keep reporting through this blog for those interested.

 

Advantages of R in high frequency trading with Redis NOSQL, doRedis, dot NET C# HFT on Linux and Windows

I talk about the advantages of this stack for a high frequency trading environment. This of course includes the advantages of R over something like Matlab.

[youtube_sc url=”9QsWeqwyxa0″ playlist=”HFT with Redis NOSQL, R doRedis dot NET C Sharp trading platform on Linux and Windows ” title=”HFT%20with%20Redis%20NOSQL,%20R%20doRedis%20dot%20NET%20C%20Sharp%20trading%20platform%20on%20Linux%20and%20Windows%20″]

Here is an example doRedis R session in RStudio with a remote Redis server:

> library('doRedis')
Loading required package: rredis
Loading required package: foreach
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more.
http://www.revolutionanalytics.com
Loading required package: iterators
> registerDoRedis(queue='jobs', host='192.168.2.15', port=6379)
> removeQueue('jobs')
[1] TRUE
> startLocalWorkers(n=2, queue='jobs', host='192.168.2.15', port=6379)
> foreach(icount(10), .combine=sum, .multicombine=TRUE, .inorder=FALSE) %dopar% { 4*sum((runif(1000000)^2 + runif(1000000)^2)<1)/10000000 }
[1] 3.141388

From http://cran.r-project.org/web/packages/doRedis/vignettes/doRedis.pdf
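
For anyone wondering what that foreach line actually computes, it is a Monte Carlo estimate of pi split into 10 tasks. Here is a serial sketch in base R of the same calculation, with no Redis needed. The function name pi_task and the smaller sample size are my own choices for illustration. One thing to watch in the parallel version: in R the %dopar% operator binds more tightly than * and /, so the loop body is safest wrapped in braces.

```r
set.seed(123)

# One task: draw n random points in the unit square and count how many
# fall inside the quarter circle of radius 1
pi_task <- function(n) {
  as.numeric(sum(runif(n)^2 + runif(n)^2 < 1))
}

n_tasks <- 10       # same number of tasks as icount(10)
n_per_task <- 1e5   # smaller than the 1,000,000 used above, to run quickly

# Run the tasks one after another; doRedis would farm these out to workers
hits <- vapply(seq_len(n_tasks), function(i) pi_task(n_per_task), numeric(1))

# The fraction of points inside the quarter circle approximates pi / 4
pi_hat <- 4 * sum(hits) / (n_tasks * n_per_task)
print(pi_hat)
```

Each task is independent of the others, which is why this kind of job parallelizes so easily across Redis workers.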

Very Nice! My C# program calls R Code through the R.NET package. Dancing in the streets!

Whoa! I finally got this working with R.NET. My C# application can now call R code directly, which is very nice. There is a nasty bug with the search path of R, so note the workaround at the beginning of the code as well. Get more info about the R.NET package from http://rdotnet.codeplex.com/. Here is the C# code, but don’t forget to add the R.NET.dll reference in your Visual Studio project:

 

using System;
using System.Linq;
using RDotNet;

class Program
{
    static void Main(string[] args)
    {
        // Workaround from http://stackoverflow.com/questions/7960738/importing-mgcv-fails-because-rlapack-dll-cannot-be-found
        // R_HOME and PATH are set before the engine starts so R can find its own DLLs.
        string rhome = System.Environment.GetEnvironmentVariable("R_HOME");
        if (string.IsNullOrEmpty(rhome))
            rhome = @"C:\Program Files\R\R-2.15.0";

        System.Environment.SetEnvironmentVariable("R_HOME", rhome);
        System.Environment.SetEnvironmentVariable("PATH", System.Environment.GetEnvironmentVariable("PATH") + ";" + rhome + @"\bin\i386");

        // Set the folder in which R.dll is located.
        //REngine.SetDllDirectory(@"C:\Program Files\R\R-2.12.0\bin\i386");
        REngine.SetDllDirectory(@"C:\Program Files\R\R-2.15.0\bin\i386");
        using (REngine engine = REngine.CreateInstance("RDotNet", new[] { "-q" }))  // quiet mode
        {
            // Print the library search paths R is using.
            foreach (string path in engine.EagerEvaluate(".libPaths()").AsCharacter())
            {
               Console.WriteLine(path);
            }
            //engine.EagerEvaluate(".libPaths('C:/Program Files/R/R-2.15.0/library')");

            // .NET Framework array to R vector.
            NumericVector group1 = engine.CreateNumericVector(new double[] { 30.02, 29.99, 30.11, 29.97, 30.01, 29.99 });
            engine.SetSymbol("group1", group1);
            // Direct parsing from R script.
            NumericVector group2 = engine.EagerEvaluate("group2 <- c(29.89, 29.93, 29.72, 29.98, 30.02, 29.98)").AsNumeric();

            // Test difference of mean and get the P-value.
            GenericVector testResult = engine.EagerEvaluate("t.test(group1, group2)").AsList();
            double p = testResult["p.value"].AsNumeric().First();

            Console.WriteLine("Group1: [{0}]", string.Join(", ", group1));
            Console.WriteLine("Group2: [{0}]", string.Join(", ", group2));
            Console.WriteLine("P-value = {0:0.000}", p);
        }
    }
}

 

You can also run complete R scripts from within C# using R.NET. Here is a C# example:

using System;
using System.Linq;
using RDotNet;

class Program
{
    static void Main(string[] args)
    {
        // Workaround from http://stackoverflow.com/questions/7960738/importing-mgcv-fails-because-rlapack-dll-cannot-be-found
        // R_HOME and PATH are set before the engine starts so R can find its own DLLs.
        string rhome = System.Environment.GetEnvironmentVariable("R_HOME");
        if (string.IsNullOrEmpty(rhome))
            rhome = @"C:\Program Files\R\R-2.15.0";

        System.Environment.SetEnvironmentVariable("R_HOME", rhome);
        System.Environment.SetEnvironmentVariable("PATH", System.Environment.GetEnvironmentVariable("PATH") + ";" + rhome + @"\bin\i386");

        // Set the folder in which R.dll is located.
        //REngine.SetDllDirectory(@"C:\Program Files\R\R-2.12.0\bin\i386");
        REngine.SetDllDirectory(@"C:\Program Files\R\R-2.15.0\bin\i386");
        using (REngine engine = REngine.CreateInstance("RDotNet", new[] { "-q" }))  // quiet mode
        {
            // Print the library search paths R is using.
            foreach (string path in engine.EagerEvaluate(".libPaths()").AsCharacter())
            {
               Console.WriteLine(path);
            }
            // Single quotes inside the R code avoid escaping inside the C# string.
            engine.EagerEvaluate(".libPaths('C:/Program Files/R/R-2.15.0/library')");

            // .NET Framework array to R vector.
            //NumericVector group1 = engine.CreateNumericVector(new double[] { 30.02, 29.99, 30.11, 29.97, 30.01, 29.99 });
            //engine.SetSymbol("group1", group1);
            //// Direct parsing from R script.
            //NumericVector group2 = engine.EagerEvaluate("group2 <- c(29.89, 29.93, 29.72, 29.98, 30.02, 29.98)").AsNumeric();

            //// Test difference of mean and get the P-value.
            //GenericVector testResult = engine.EagerEvaluate("t.test(group1, group2)").AsList();
            //double p = testResult["p.value"].AsNumeric().First();

            //Console.WriteLine("Group1: [{0}]", string.Join(", ", group1));
            //Console.WriteLine("Group2: [{0}]", string.Join(", ", group2));
            //Console.WriteLine("P-value = {0:0.000}", p);

            //example to call complete R script from http://rdotnet.codeplex.com/discussions/262426
            //REngine R = REngine.GetInstanceFromID("RDotNet");
            //R.EagerEvaluate("source('MyRscript.r')");
            engine.EagerEvaluate("source('test.r')");
            Console.ReadLine();
        }
    }
}

Here is the R script test.r, which resides in the same Release directory as the C# program. I am sure it can live elsewhere, but I just wanted to test running an R script.

G<-c(1,2,3)
G
cat('hello from r', G)

 

How to get R to connect to the scalable NOSQL database Redis with the doRedis R package for parallelization

I managed to get R connecting to a NOSQL database, namely Redis. I wish I could get Cassandra going, but no luck. I have a video on how I got it going. Do note the links below that helped me accomplish this:
Discussion on various NOSQL databases including Redis:

http://stackoverflow.com/questions/4720508/redis-couchdb-or-cassandra

For installing Redis Server on CentOS Linux (I could not get this working with wget and untarring what was downloaded):

https://github.com/causes/redis-centos/blob/master/README.markdown

As I said in my video, follow the INSTALL readme guide in the downloaded package. You need to run make to build Redis from source. It ain’t that bad. You also need to run a test on the settings of Redis.
For a doRedis R package video demo and instructions on how to use it within R:

http://bigcomputing.com/doredis.html

[youtube_sc url=”http://www.youtube.com/watch?v=p0PO6IAwe-w” title=”r%20redis%20nosql%20integration”]

Rhadoop with R and Hadoop sort of works

Finally, RHadoop is running with R and Hadoop, with rmr bridging Map and Reduce, thanks to this tutorial

These links made it happen, courtesy of someone who commented on my last post, which started this whole journey. Thanks to them.

https://github.com/jeffreybreen/tutorial-201203-big-data

https://github.com/jeffreybreen/tutorial-201203-big-data/blob/master/README

http://jeffreybreen.wordpress.com/2012/03/10/big-data-step-by-step-slides/

[youtube_sc url=”http://www.youtube.com/watch?v=uCgrUU02__Q” title=”r%20rhadoop%20hadoop”]

Cassandra looks like the best NOSQL option, but too bad the connection through R is scant

No to MySQL: How to use Cassandra with R and RCassandra versus Hadoop and HBase

MySQL looked good but can get pricey. Cassandra is truly free and open source, and Twitter uses it!

http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/

HBase vs Cassandra: why we moved

[youtube_sc url=”http://www.youtube.com/watch?v=1oHvtdSvUDs” title=”R%20and%20Cassanrda%20with%20RCassandra”]

Lucky me, I get a strange issue with RCassandra. Hmmm… probably a stupid thing, but please comment if anyone has any ideas. Thanks.

[youtube_sc url=”http://www.youtube.com/watch?v=ZM4pEG4o7zI” playlist=”rcassandra issue”]

 

This could be a potential workaround, but I will leave trying it for another day:

Big Analytics with R, Cassandra, and Hive