
Quant development: HA and Low Latency on EC2 for RESTful Apps


I just added a post to my blog condensing some of my experiences and thoughts about implementing HA and low latency on EC2, specifically for RESTful apps. I’d welcome any comments or suggestions anyone may have…


Low latency on EC2?
How low can you go on a large dataset?


Depends how large, how low, and how much you’re willing to pay


EC2 is not designed for low latency if you have a non-trivial or distributed dataset. The only part of the topology with semi-consistent low-latency characteristics is the HPC nodes, which unfortunately are designed for applications where disk does not matter and offer little in the way of HA. Our engineers use the EC2 HPC nodes as a test platform for low-latency code, and that works well enough, but it would not be efficient for production.

How low is “low-latency”? Throughput does not seem to be a concern, so it probably could be done for sufficiently high values of “low latency”.


That was my point. 🙂

If you want low latency, you’re talking Fusion-io cards, 10GbE, and as little network as possible between your REST server and your cluster. Then your client has to be as close as possible to your REST server.

Sort of violates the principle of Hadoop…;-)

I would agree that “low-latency” and “Hadoop” are rarely used together in the same sentence and for good reason. 🙂

That said, the REST interface implies a moderate tolerance for latency so it might be possible to put something together that works. Our systems that run on EC2 are neither Hadoop nor REST; they are almost entirely custom but the workload necessitates it and there is no open source software that can scale the kinds of analytics we do.


Low latency means different things to different people. In the world of trading, low latency means as fast as technically possible, damn the costs.

In the Hadoop world, low latency means as fast as possible using ‘commodity’ hardware. And even here commodity hardware is sometimes tough to define.

The real question is how fast is fast enough?

You can design low-latency Hadoop systems. But look at the problem you are trying to solve and your budget for hardware, software, manpower, and so on, then ask yourself: how low do you need to go? Does it make more sense to pay a premium for faster hardware or to expand your cluster out?

EC2 is a different beast. It’s a public infrastructure and it’s not really designed for speed.


Great article. Most of the advice, including CDNs and geo-load balancing, applies in all circumstances, not only cloud computing. I am the #1 proponent of “Ground Computing”. Reading the article I get the “square peg in round hole” feeling. Technologies like Amazon SQS, Amazon CloudFront, Amazon RDS, Amazon ELB: buzzword bingo, anyone? All this just to get ALMOST as fast as real hardware doing it the “old fashioned way”.

If the primary fact is that the Amazon network has latency and noisy neighbours, no amount of technology layered above that will help. You cannot get around latency by adding more layers.

Are we talking 5 sec. or 50 ms? (Assume 40 ms ping time from client to server.)
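To make that question concrete, here is a rough sketch of the budget arithmetic in Python. It assumes the 40 ms ping is a full round trip and that each request costs exactly one round trip (no connection setup, no TLS handshake); both are simplifying assumptions, and the function name `server_budget_ms` is made up for illustration.

```python
def server_budget_ms(target_ms: float, ping_rtt_ms: float) -> float:
    """Time left for the server once one network round trip is paid for.

    Assumption: ping_rtt_ms is a full round trip and the request needs
    exactly one of them. Real requests usually cost more than that.
    """
    return target_ms - ping_rtt_ms


# The two targets from the question above, with the assumed 40 ms ping.
for target in (5000.0, 50.0):
    left = server_budget_ms(target, ping_rtt_ms=40.0)
    print(f"target {target:>6.0f} ms -> {left:.0f} ms left for the server")
```

Under those assumptions a 5 sec. target leaves nearly all of the time to the server, while a 50 ms target leaves about 10 ms for everything behind the REST endpoint.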

Nice article and post.

If someone out there (admittedly, living in a box but interested in big data / low latency) now knows what they need is a CDN or to localize access to services based on GeoIP discovery then this article has helped.

Big data analytics at an elastic scale with minimized end to end latency is hard. There is no optimal commercial or open source solution that has a one size fits all form factor. The economy afforded by scale is not equivalent to the scale of the low latency economy. The former is relatively cheap. The latter is relatively expensive.

For example: GPU folk often talk about ‘speed’ when they really mean ‘high bandwidth’. Being able to do a truck load of processing in 50 millis is great, unless you just need to process a single event in a handful of microseconds. Just like 50 odd socks is a nightmare when you just want a single pair of socks. Context is key.
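A small Python sketch of that distinction, under stated assumptions: the batch size of 1,000,000 events is a made-up number, the function names are hypothetical, and the model assumes no result is available until the whole batch has finished.

```python
def throughput_per_s(events: int, batch_ms: float) -> float:
    """Events processed per second when `events` finish in `batch_ms`."""
    return events * 1000.0 / batch_ms


def single_event_latency_us(batch_ms: float) -> float:
    """Latency seen by one event, assuming results only appear once the
    whole batch is done (a simplifying assumption)."""
    return batch_ms * 1000.0


# Hypothetical batch: 1,000,000 events processed in 50 ms.
print(throughput_per_s(1_000_000, 50.0))   # events per second
print(single_event_latency_us(50.0))       # microseconds per event
```

The throughput figure looks impressive, yet each individual event still waits the full 50 ms, which is thousands of times slower than a handful of microseconds.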

Last but not least: Latency, throughput, scale, volumetrics, computational complexity… these aren’t just numbers. We need the equivalent of a control-structures formalism (sequences, statements, selections) for big data low-latency analytic environments.

So my only constructive feedback would be to include a frame of reference and context in your postings so it’s clear to the audience (except to trolls and nitpickers, who are never happy anyway) what the non-functional characteristics are. The context.

An example with ‘low’ latency: If this is ‘human interaction time’ and it’s serialized (a single human at a time) over the web, then 5 updates a second ought to be enough, right, say to ‘select-then-click’ a button? Replace the wetware with a web-spider or web-robot. Now replace the web-robot with a legion of DDoSing zombie robots.
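A back-of-the-envelope version of that example in Python. The client counts are invented for illustration, the function names are hypothetical, and the model assumes every client issues updates at the same steady rate.

```python
def update_interval_ms(updates_per_s: float) -> float:
    """Time budget per update for a single client."""
    return 1000.0 / updates_per_s


def aggregate_rate(clients: int, updates_per_s: float) -> float:
    """Total requests per second, assuming every client sends at the
    same steady rate (a simplifying assumption)."""
    return clients * updates_per_s


print(update_interval_ms(5.0))  # budget per update for one human

# Made-up client counts: one human, a small crawler fleet, a botnet.
for clients in (1, 100, 100_000):
    print(clients, aggregate_rate(clients, 5.0))
```

The per-client budget never changes, but the load the server has to absorb scales with the number of clients, which is why the same ‘5 updates a second’ can be trivial or fatal depending on context.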


