Surprise! I thought C++ from the Matlab Coder toolbox plus Hadoop Pipes would be a good option vs Java, Hadoop, and R
Another option is to drop R and replace it with C++ code generated from Matlab using the Coder toolbox. You can then run that code on Hadoop through Hadoop Pipes. This is just an idea. It could work, but everything depends on the kind of C++ code Matlab generates. It does eliminate the need to learn a language like R, which could be a huge battle given some of its undocumented packages. That seems to be the big gripe about R.
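To make the idea a bit more concrete, here is a minimal sketch of the map and reduce logic a word count job would need, written in plain standard C++. It deliberately does not use the Hadoop Pipes classes, and the function names map_line and reduce_pairs are made up for illustration. In a real Pipes job this logic would sit inside the mapper and reducer classes that the Pipes library expects, and Hadoop itself would handle the grouping between the two steps.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the map step: split one input line on
// whitespace and emit a (word, 1) pair for every token.
std::vector<std::pair<std::string, int>> map_line(const std::string& line) {
    std::vector<std::pair<std::string, int>> pairs;
    std::istringstream tokens(line);
    std::string word;
    while (tokens >> word) {
        pairs.emplace_back(word, 1);
    }
    return pairs;
}

// Hypothetical stand-in for the shuffle and reduce steps: group the
// pairs by word and sum the counts for each word.
std::map<std::string, int> reduce_pairs(
        const std::vector<std::pair<std::string, int>>& pairs) {
    std::map<std::string, int> totals;
    for (const auto& kv : pairs) {
        totals[kv.first] += kv.second;
    }
    return totals;
}
```

Calling reduce_pairs(map_line("the cat and the hat")) gives a map in which "the" has a count of 2 and every other word a count of 1. Whether Matlab-generated C++ would slot into this shape as easily is exactly the open question.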
Here are some tutorial links with C++ and Hadoop Pipes.
But this led me to this link:
The wordcount program in native Java, in Python streaming mode and in C++ pipes mode is run on 6 books from the Gutenberg project:
Damn! C++ Hadoop Pipes seems to be more than twice as slow as Java (about 2.4 times the run time). I can never win.
Method    Real time        Ratio to Java
Java      2 min 15.7 s     1.0
C++       5 min 26 s       0.416
Python    12 min 46.5 s    0.177