Best alternative development stack for R with Hadoop? Forget MySQL? Cassandra and Java listeners for market tick data

After reading about the limitations of MySQL and how expensive it can get, I decided to take a chance on Cassandra. One big reason: there is an RCassandra package on CRAN, so yippee for that. Also, the install does not look hard and, better yet, you can integrate it with Hadoop. Yippee for that too! Cassandra may also be faster for writes than HBase, which was part of the RHadoop offering, so boo to that. I also plan to have some Java listeners on my market data to populate the Cassandra database. This stack may work, so let's cross our fingers. Here are the links that got me thinking this way:

http://stackoverflow.com/questions/4884967/hadoop-hbase-hdfs-vs-mysql-or-postgres-loads-of-independent-structured-d

http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved/

http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/

The Cassandra install described in the link above appears to need a few tricks, such as running as root, but it does work and installed fine.

http://www.datastax.com/docs/0.7/map_reduce/hadoop_mr

http://code.google.com/p/cassandra-java-client/

I have my doubts about the last 2 links. The Cassandra install may be worth doing, but integrating it with Hadoop could be a challenge. I also hope RCassandra works too.
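To make the Java listener idea a bit more concrete, here is a minimal sketch of how a listener might buffer incoming ticks and hand them off in batches. Everything in it is an assumption for illustration: the `Tick` fields, the `TickSink` interface, and the batch size are hypothetical names, not part of any Cassandra client. The in-memory sink only stands in for the database, and a real version would put the actual Cassandra write behind `TickSink`.

```java
import java.util.ArrayList;
import java.util.List;

public class TickListenerSketch {

    /** One market tick. The field names are illustrative assumptions. */
    static final class Tick {
        final String symbol;
        final long timestampMillis;
        final double price;
        final long size;

        Tick(String symbol, long timestampMillis, double price, long size) {
            this.symbol = symbol;
            this.timestampMillis = timestampMillis;
            this.price = price;
            this.size = size;
        }
    }

    /** Hypothetical storage abstraction; a Cassandra-backed writer would implement this. */
    interface TickSink {
        void writeBatch(List<Tick> batch);
    }

    /** Buffers incoming ticks and flushes them to the sink once the batch is full. */
    static final class BatchingTickListener {
        private final TickSink sink;
        private final int batchSize;
        private final List<Tick> buffer = new ArrayList<>();

        BatchingTickListener(TickSink sink, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.sink = sink;
            this.batchSize = batchSize;
        }

        /** Called by the market data feed for every tick. */
        synchronized void onTick(Tick tick) {
            buffer.add(tick);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        /** Writes whatever is buffered; does nothing when the buffer is empty. */
        synchronized void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            sink.writeBatch(new ArrayList<>(buffer)); // copy so the sink owns its batch
            buffer.clear();
        }
    }

    /** In-memory sink standing in for the database in this sketch. */
    static final class InMemorySink implements TickSink {
        final List<List<Tick>> batches = new ArrayList<>();

        @Override
        public void writeBatch(List<Tick> batch) {
            batches.add(batch);
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        BatchingTickListener listener = new BatchingTickListener(sink, 2);

        listener.onTick(new Tick("EURUSD", 1L, 1.3050, 100L));
        listener.onTick(new Tick("EURUSD", 2L, 1.3051, 200L));
        listener.onTick(new Tick("EURUSD", 3L, 1.3049, 150L));
        listener.flush(); // push the leftover partial batch

        System.out.println("batches written: " + sink.batches.size());
    }
}
```

Batching is only one possible design choice here. The thinking is that tick feeds are write-heavy, so grouping writes may help, but whether it actually does depends on the client and cluster setup.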

 

 

