Quant Development

YouTube video on how to populate a highly scalable NoSQL Redis database from Java, with source code

Some links to help you out:

Usage: https://github.com/xetorthio/jedis
Download the JAR: https://github.com/xetorthio/jedis/downloads
Learn more on how I do it: http://quantlabs.net/membership.htm

Get our FREE Open Source Historical Database by answering the 2 WORLD'S FASTEST TRADER/QUANT QUESTIONS

New R blog created on how to do a model, algo, strategy for HFT High Frequency Trading

See it here: http://quantlabs.net/r-blog/

Multi-Agent Systems (MAS) for quant development and high frequency trading (HFT)

Hi, I am looking for an interesting theme for my MSc and I wish to combine MAS and algorithmic trading. Can anyone suggest some ideas, or even papers on the same topic? Thanks

==

Have a look at Altreva. It's an application of agent-based models for algorithmic trading. http://www.altreva.com

==

There is a program called NetLogo that allows developers to create multi-agent-based models. A good paper that worked on a simulation of the May 2010 market crash is located here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1932152

Many ideas were discussed at the high frequency data analysis conference; check out the abstracts: http://kolmogorov.math.stevens.edu/conference2011/index.php/abstracts-of-the-talks

==

Thank you for posting. I didn't know Altreva and I am getting to know it now. I already knew NetLogo, and I know it is used a lot in the social sciences because it is easier to program than Java (Repast, for instance). But I will dig in a bit more. If you have additional info you would like to share, I would welcome it a lot. Thanks

No to MySQL: How to use Cassandra with R and RCassandra versus Hadoop and HBase

MySQL looked good but can get pricey. Cassandra is truly free and open source, and Twitter uses it!

http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/
http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved/

Learn more on how I proceed with this in R as part of my high frequency trading: http://quantlabs.net/membership.htm

Lucky me, I hit a strange issue with RCassandra. It is probably something stupid, but please comment if anyone has any ideas. Thanks

This could be a potential workaround, but I will leave trying it for another day: http://www.datastax.com/dev/blog/big-analytics-with-r-cassandra-and-hive

How to develop with R on an Apple iPad, iPhone or Android using RStudio Server for free

Written on May 17th, 2012 by caustic
Categories: Quant Development, R
I show how RStudio Server lets you develop in R on a remote web server.

Best alternative development stack for R with Hadoop? Forget MySQL? Cassandra and Java listeners for market tick data

After reading about the limitations of MySQL and how expensive it can get, I decided to take a chance on Cassandra. One big reason is that there is an RCassandra package on CRAN, so yippee for that. Also, the install does not look hard and, better yet, you can integrate it with Hadoop. Yippee for that too! Cassandra may also be faster for writes than HBase, which was part of the RHadoop offering, so boo to that. I also plan to have some Java listeners on my market data to populate the Cassandra database. This stack may work, so fingers crossed.

Here are the links that got me thinking this way:

http://stackoverflow.com/questions/4884967/hadoop-hbase-hdfs-vs-mysql-or-postgres-loads-of-independent-structured-d
http://ria101.wordpress.com/2010/02/24/hbase-vs-cassandra-why-we-moved/
http://blog.milford.io/2010/06/installing-apache-cassandra-on-centos/

The Cassandra install above appears to work with a few tricks, such as running as root, and it installed fine.

http://www.datastax.com/docs/0.7/map_reduce/hadoop_mr
http://code.google.com/p/cassandra-java-client/

I question the last 2 links. The Cassandra install may be worth doing, but integrating with Hadoop could be a challenge. I also hope RCassandra works too.

Disaster Recovery eBook – A Guide to Modern IT Disaster Recovery

This online eBook provides insight and advice on how to build an effective disaster recovery strategy in the evolving world of virtual infrastructures, while mitigating the impact of so-called 'Black Swan' events in the datacenter. Learn more: http://bit.ly/DisasterRecoveryeBook

GlobalRisk community, globalriskcommunity.com: Join the world's premier risk forum and community for executives, service providers, and entrepreneurs, and propel your career to a new level!

--

How good is your IT Disaster Recovery ...really? Please share your stories here.

--

Thank you for downloading this document. The number of downloads so far has been very impressive. Please give us your feedback.

why r rhine and rhadoop do not work, back to windows microsoft .net and matlab

Learn how to work with these tools: http://quantlabs.net/membership.htm

UPDATE: Progress has been made, but I still wish these R package developers wrote better install guides instead of relying on others and sending users on Easter egg hunts: http://quantlabs.net/blog/2012/05/finally-rhadoop-running-with-r-and-hadoop-with-rmr-map-and-reduce-bridged-thanks-to-this-tutorial/

Does anybody have experience with dmraid + Intel Rapid Storage Technology on RedHat5 for quant development

Hi,

I'm configuring 11 Intel S2600JF servers, each with 4 x 500 GB TOSHIBA disks, that I'd like to set up as Raid0, or maybe Raid5 if I don't lose too much performance because of the parity computation.

Does somebody manage dmraid (so-called "FakeRAID" or "FirmwareRAID") on RedHat5 in production?
Do I really gain performance from Raid0?
During a Raid5 disk failure, how do I replace the broken disk and rebuild the Raid5 online?

Many thanks for reporting your experiences!

--

Hi, my usual experience with related configs:

* If you need to choose between a vendor-specific fakeraid (BIOS SW RAID with an OS-level SW RAID add-in) or generic JBOD disks with standard Linux SW RAID, the stock Linux SW RAID usually gives you simpler management, fewer dependencies on custom config, fewer constraints tied to the 'hardware', and the same performance. That being said, either route is usually functional; it is more a matter of 'preference'.

* You don't want 'it all fails if one drive fails', so you probably want a raid10 config, not a raid0 config. With 4 drives raid5 is a not-great option IMHO. You will very likely get significantly better performance with 4 drive raid10 vs raid5 or raid6. Additionally, if your chassis supports it, you would get a linear boost by going up to 6 or 8 drives in the raid10. More spindles = more parallel bandwidth, which will help throughput approach saturation on the controller rather than being constrained below the 'maximum possible throughput'. However, if you can only accommodate 4 drives, then you are stuck with 4 drives, clearly.

Hope this helps a bit,

--

I was thinking about Raid0 because my final users don't care if they lose some jobs during a broken disk event; they will simply resubmit those jobs. Because of that I believe I'll reach the best performance by using Raid0 on my 4 disks.

That said, I'll still spend hours rebuilding/validating the broken server, so I was also looking for a reliable Raid layout with little impact on performance, and Raid10 or Raid5 seem to be a good compromise to me, with Raid10 probably faster than Raid5. I'll measure both.

Yes, I can accommodate just 4 disks, no more.

About mdraid: that's the official Intel preference for this FakeRAID technology on RedHat6, but my final users need RedHat5 because of their legacy applications.

I'd like to get a comment about dmraid from someone who has been using it in production for some years; Googling for HowTos on this software, I didn't find much.

--

Raid10 is to protect your time and sanity :-) (i.e., avoid system rebuilds due to inevitable disk failures). Remember, all disks will always fail eventually; it is just a matter of how long it takes. So you have to plan for the inevitable. :-)

Re: pick of MDRaid vs DMRaid. Ultimately both will work, I think. There may be more effort involved with one than the other (i.e., Linux swraid / mdraid is easier to manage, more stable, and doesn't require kernel rebuilds etc.; dmraid will likely be more drama to get installed and potentially has more risk of failures and issues).

I did a quick google and found a relevant thread that seems to have a similar (stronger) opinion re: DMRaid.

http://www.linuxquestions.org/questions/linux-kernel-70/centos-kernel-upgrade-breaks-dmraid-on-intel-software-raid-638450/

However, this is purely a 3rd party ref, so I don't really have personal experience to comment. Ultimately it will be your pick.

Good luck!

--

I have done extensive testing and debugging on Intel's latest SAS/SATA interface found on the new SandyBridge-EP platforms like the Jefferson Pass board you are using (S2600JF). I realize using dmraid is usually a cost saving measure. Your data is important and if you can afford it I encourage you to invest in a PCIe hardware RAID card like LSI's 9265-8i. The kernel driver options for the onboard interface are not yet reliable in my opinion, especially in a RHEL 5 variant (RHEL, CentOS, ScientificLinux, etc).

With the interface in RSTe mode you use the isci kernel driver. It doesn't appear until RHEL 6.2 or until 2.6.18-234.4.1 in RHEL 5. In my opinion the driver isn't fully baked, and I have seen problems with it. Its scatter-gather requirements differ from other kernel drivers and I've seen bug fixes in other areas of the kernel break this driver.

In ESRT mode you will use the MegaSR dmraid-style driver. This is a binary-only driver and not available in source form. This will restrict you to kernel patches or upgrades that have versions released by the author (LSI/Intel).

I assume you invested in these machines to do meaningful computing. You should consider additional investment in a stable storage infrastructure, at least until the isci or MegaSR environment becomes more functional.

In addition to the LSI RAID cards, Intel has some custom hardware RAID mezzanine boards designed for the S2600JF that give true hardware RAID without taking up a PCI slot, and they use standard MegaRAID kernel drivers.

I hope this helps you,

--

I don't know enough about your use case to give highly specific advice ... general advice is that dmraid isn't good. MD software raid is reasonably good, and well written.

As for the drivers, Jeff notes that they are flaky. We've seen all sorts of interesting half-baked drivers in Linux for the various MB functions, not just for EP. Your best bet would be a well-baked card and driver. The Intel MBs all have mezzanine cards you can run in JBOD or RAID modes. They are relatively inexpensive.

4 drives isn't a whole lot ... RAID5 performance will start at 3/4 of the full RAID0 performance, and that's only if you are doing full stripe reads/writes. Do smaller IO, and performance will suffer.

My concern, based on what you are describing, is this: I hope you aren't going to run these as a parallel file system. This design (lots of small machines with a small number of disks) is a bad design pattern for such IO (yeah, even for Hadoop). Usually the motherboard controllers are connected to some pretty weak controller chip, or sharing an oversubscribed PCIe link. We've seen 4, 6, and 8 port SATA drives hung off a single/dual port PCIe connected controller ... the disks could easily overwhelm the controller (lots of SM boards have had these issues).

RAID0 if you simply don't care about data reliability. RAID10 if you do.

--

Thanks to both of you for your valuable remarks!

I'm testing dmraid and so far the only configuration that worked as expected was Raid0. On Raid5 performance was very bad, but the system survived when I unplugged 1 disk; on Raid10 performance was acceptable (of course lower than Raid0), but Linux got stuck when I unplugged 1 disk.

I'm not using these nodes as a parallel filesystem; that would be a really weak architecture full of single points of failure. They're going to be individual SGE nodes with a fast local scratch FS.

I definitely agree with Jeff's suggestion of a dedicated LSI Raid controller with 1GB cache, with which I've already had a lot of good experiences, or the "poor man's" Raid offered by Intel. Both options sound better than dmraid. I think the Intel Raid is already there on my MB, because I once modified the BIOS and got a Raid dialogue different from the RST one; but what I found sexy in RST is this interesting feature of making 2 different Raids using the same disks.

When I have something interesting to report I'll update this thread. See you soon and thanks again for your comments.

--

The ESRT2 mode allows making two RAIDs from the same disks. Linux mdraid does as well. In my opinion Linux mdraid is higher performing, more flexible and above all open (source code, community support, etc). ESRT2 uses the MegaSR driver, which is closed source (binary only), which can be limiting.

--

Also, Intel's mezzanine SAS/SATA HBA and HRA are not "poor man's". They OEM LSI's 6Gb SAS HBA and RAID chips and put them on a custom mezzanine board designed for the S2600JF. High-performance, stable and using long-standing and vetted kernel drivers.
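
To put rough numbers on the trade-off discussed in this thread, here is a minimal Python sketch that compares usable capacity and guaranteed fault tolerance for RAID0, RAID5 and RAID10. It is only textbook arithmetic, not a benchmark: the `RaidSummary` class and the `summarize` function are hypothetical names made up for this illustration, and real throughput depends on the controller, driver and IO size, as the replies above point out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidSummary:
    level: str
    usable_gb: int            # capacity left after parity or mirroring
    usable_fraction: float    # usable capacity divided by raw capacity
    survivable_failures: int  # disk failures the array is guaranteed to survive


def summarize(level: str, disks: int, disk_gb: int) -> RaidSummary:
    """Textbook capacity and fault tolerance for one array (illustrative helper)."""
    if level == "raid0":
        # Pure striping: every disk holds data, any single failure loses the array.
        data_disks, survivable = disks, 0
    elif level == "raid5":
        if disks < 3:
            raise ValueError("raid5 needs at least 3 disks")
        # One disk's worth of space goes to parity.
        data_disks, survivable = disks - 1, 1
    elif level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("raid10 needs an even number of disks, at least 4")
        # Half the disks mirror the other half; only one failure is guaranteed
        # survivable (a second one is survivable only if it hits another pair).
        data_disks, survivable = disks // 2, 1
    else:
        raise ValueError(f"unsupported level: {level}")
    return RaidSummary(level, data_disks * disk_gb, data_disks / disks, survivable)


if __name__ == "__main__":
    # The thread's hardware: 4 disks of 500 GB per server.
    for level in ("raid0", "raid5", "raid10"):
        s = summarize(level, disks=4, disk_gb=500)
        print(f"{s.level}: {s.usable_gb} GB usable, "
              f"{s.usable_fraction:.2f} of raw, "
              f"survives {s.survivable_failures} failure(s)")
```

With the 4 x 500 GB disks from the question, this gives 2000 GB for RAID0, 1500 GB for RAID5 and 1000 GB for RAID10. The RAID5 fraction of 0.75 is the same 3/4 ratio quoted above for full-stripe IO; for RAID10 the fraction describes capacity only and says nothing about read or write speed.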

Many thanks go to powerful free open source software for CentOS Enterprise Linux, Java, R, RStudio, and Hadoop with Cloudera

Learn more here on what I do: http://quantlabs.net/membership.htm
