Monthly Archives: June 2012

Youtube video: Proper steps to integrate MongoDB NOSQL database with R on Ubuntu Linux with RStudio


Why NOSQL?

When comparing RDBMSs such as Oracle, MySQL or DB2 against NOSQL, I was one of those old geezers who thought NOSQL was just another ‘go fast nowhere’ technology trend.

Boy, was I wrong! I looked at all the options including HBase/Hadoop, Cassandra, Redis, and MongoDB. I found HBase and Hadoop difficult to install, and the R packages for them were of little use to me because there were so few installation documents. Cassandra was easy to install, but client support for R either didn’t exist or, in the case of the RCassandra package, didn’t work for me. Redis looked awesome and is still one of the best options for R thanks to the doRedis R package. The Java sharding examples looked promising too.

Then I revisited everything and came across MongoDB. It seems to have all the advantages of Cassandra and Redis with decent R package support.

NOSQL options and R

After looking at all the options, my recommended NOSQL database is Redis or MongoDB. Redis was easy to install; MongoDB was trickier. My recommendation is to install MongoDB on a clean, current version of Ubuntu Linux Desktop. All you need to do is type the following on the command line:

sudo apt-get install mongodb

… and you’re ready to go. It takes care of all the permissions, users, etc. It was so easy that I could have saved a day’s work instead of jumping through the hoops I did. A lesson learned, and the time I lost is time you can save!
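A quick way to confirm from R that the install put the MongoDB programs on your PATH is sketched below. The helper name find_mongo_binaries is made up for illustration, and the binary names mongod and mongo are an assumption about what your install provides; adjust them if yours differ.

```r
# Hypothetical helper: report where (if anywhere) the MongoDB binaries live.
# Sys.which() returns "" for any command it cannot find on the PATH.
find_mongo_binaries <- function(binaries = c("mongod", "mongo")) {
  paths <- unname(Sys.which(binaries))
  data.frame(
    binary = binaries,
    path = paths,
    found = !is.na(paths) & nzchar(paths),
    stringsAsFactors = FALSE
  )
}

print(find_mongo_binaries())
```

If the found column is FALSE for mongod, the server binary is not visible to R and the connection steps later on will not work.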

MongoDB install obstacles and tricks

Here are some current tricks to save you even more time:

Install OpenJDK 6, not 7, as Java 7 will mess up the rJava package install in R.

If this happens, keep an eye out for a message in your R console during the rJava package install telling you to check your Linux Java environment. It says something to the effect of making sure the JDK is installed and registered in R and, if in doubt, re-running "R CMD javareconf" as root. My problem came down to Java 7 versus 6; switching to Java 6 resolved it.
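To see which Java version R actually picks up once rJava is installed, something like the following sketch can help. The function name java_version_seen_by_r is hypothetical; the rJava calls .jinit() and .jcall() are the package's standard way to start the JVM and call a static Java method.

```r
# Sketch: ask the JVM that rJava starts which Java version it is running.
# Returns NA if rJava is not installed or the JVM cannot be started.
java_version_seen_by_r <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) {
    return(NA_character_)
  }
  tryCatch({
    rJava::.jinit()  # start (or attach to) the JVM
    # Equivalent to System.getProperty("java.version") in Java
    rJava::.jcall("java/lang/System", "S", "getProperty", "java.version")
  }, error = function(e) NA_character_)
}

print(java_version_seen_by_r())
```

A value beginning with 1.6 means Java 6 and one beginning with 1.7 means Java 7.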

Create your MongoDB mydb data by following either of these tutorials:

http://www.mongodb.org/display/DOCS/Tutorial
http://pseudofish.com/blog/2011/05/25/analysis-of-data-with-mongodb-and-r/

Once everything is installed, you should be able to run the following RMongo session within your R shell or RStudio:

> library("RMongo")

Loading required package: RUnit

> mongo <- mongoDbConnect("mydb")
> dbShowCollections(mongo)
[1] "system.indexes" "things"
> results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)
> results <- dbGetQuery(mongo, "things", "{}", 0, 2)
> names(results)
[1] "X_id" "name"
> dbDisconnect(mongo)

Video: http://www.youtube.com/watch?v=DbhdKBx-lK0
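The connect, query, disconnect steps from the RMongo session in this post can be wrapped in one small function, sketched here. The name fetch_documents is hypothetical, and the sketch assumes RMongo is installed and a MongoDB server is running locally with the database and collection you name.

```r
# Hypothetical helper: query one collection and always close the connection.
# Assumes the RMongo package is installed and a local MongoDB server is up.
fetch_documents <- function(db, collection, query = "{}", skip = 0, limit = 2) {
  if (!requireNamespace("RMongo", quietly = TRUE)) {
    stop("The RMongo package is not installed")
  }
  mongo <- RMongo::mongoDbConnect(db)
  # on.exit() runs even if the query below fails, so the connection is closed.
  on.exit(RMongo::dbDisconnect(mongo), add = TRUE)
  RMongo::dbGetQuery(mongo, collection, query, skip, limit)
}

# Usage with the database and collection from this post
# (uncomment once MongoDB is running):
# things <- fetch_documents("mydb", "things")
# names(things)
```

Registering the disconnect with on.exit() means a failed query does not leave a connection open, which is easy to forget when typing the steps by hand.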

My Youtube video using free RStudio for newbies and installing free R packages from CRAN or R-Forge


I have just posted a new video on the amazing combination of all these tools.

Why RStudio?

Whether or not you already use Eclipse or NetBeans as an integrated development environment, I love this free tool, RStudio, and it should be added to your arsenal for any model prototyping or development. It runs on every major operating system, including Mac OS X, Linux, and Windows. It also has a server edition you can load onto a remote server and develop against through a browser session, including from an Apple iPad. Pretty neat compared to something like Matlab Mobile. This is the primary reason why I switched from Matlab to R.

The confusing difference between CRAN and R-Forge

As a newbie, I could not work out the difference between these two R package repositories. An answer I found on Stackexchange.com explained it simply: CRAN contains the milestone releases of R packages. This basically means they are more stable, and they could be major releases as well. R-Forge, by contrast, carries the development versions. As some R packages have many developers, you could get changes every few hours, which may make the package unstable if a bad change slips in. As a result, the package administrator may release a milestone to CRAN once it is stable.

RStudio and CRAN

As you will find in this video, I feel more comfortable sticking with CRAN, and so does RStudio. This video shows you how ridiculously easy it is to install an R package from CRAN. As I will start depending on these R packages for my production environment, I have no choice but to do this. I am sure a large bank or hedge fund would feel the same way. You can still manually install R packages from R-Forge as well.
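The same install can be done from the R console instead of the RStudio menus. Below is a minimal sketch; install_if_missing is a hypothetical helper name, "somepackage" is a placeholder rather than a real package, and the repository URLs are the usual CRAN and R-Forge addresses.

```r
# Sketch: install a package only if it is not already available.
# The default repository is CRAN; pass a different `repos` for R-Forge.
install_if_missing <- function(pkg, repos = "https://cran.r-project.org") {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  install.packages(pkg, repos = repos)
  invisible(TRUE)
}

# Base packages are always present, so this call does not touch the network.
install_if_missing("stats")

# Stable release from CRAN (placeholder package name, uncomment to run):
# install_if_missing("somepackage")
# Development build from R-Forge (placeholder package name):
# install_if_missing("somepackage", repos = "http://R-Forge.R-project.org")
```

Keeping CRAN as the default and making R-Forge an explicit choice mirrors the preference for stable releases in production.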

A Thanks Goes To….

Thanks to the R community for developing these pretty amazing tools and packages. The best part is that the documentation is definitely adequate to get me started fairly quickly and with confidence. I am sure they spend a huge amount of selfless time getting these tools to the point where they are.

Video: http://www.youtube.com/watch?v=TSxS0x4PLPg