Which RDBMS or NoSQL database do you use with R? MySQL, Cassandra, HBase, MongoDB, Oracle, PostgreSQL, CouchDB, SQLite?
This R survey is kind of important. It will show a few things:
- Which database most R users use, regardless of whether it is commercial, open source, or NoSQL.
- This will help us figure out which database is best for R in terms of scalability and speed, depending on the requirements. This includes multiple writes of market tick data from a C++ or Java application and access by various R algorithms for analytics purposes.
Go here for the poll.
Here are some reasonable options with reasons:
I would assume this to be the number one choice since it is open source (or at least so they say). It also offers sharding and clustering to cover scalability needs. Is this something that people are using for their trading platform requirements? This includes using MySQL as a tick data repository.
Is anyone actually using this open source database for their R needs?
This is easily the most popular commercial RDBMS for both Linux/Unix and Windows. As Oracle has brought R into its ecosystem with a connector, I wonder whether people are actually using this.
I am unsure if there are any R package connectors to any of these databases. I was just curious as I am really not interested in these as a real option.
There seems to be no R package support for this. I once posted something on R-Bloggers.com and it lit up the site, which made me wonder whether this is actually more popular than people think. It seems to meet the needs of both quick write and read access.
Now the doRedis R package looked really hot. It even showcased how to use it with a potential financial analytics system. I even saw Java sharding examples, which left me excited about the capabilities of this database.
Strangely, this seems to be the most popular of all. I also found various R packages which seem to support it as well.
HBase, which is part of Hadoop
Eh. No support, even according to Revolution Analytics, whose guides for installing the R packages are lacking. I gave up pretty quickly on these R packages.
All other database options seem fine, but the ones listed above seem the most viable for any R user as a repository that needs scaling and clustering.
Go here for the poll.