Did you know that Wal-Mart, Walt Disney, General Electric, Nokia, and Bank of America are ALL using Hadoop?
BusinessWeek just published this article: Getting a Handle on Big Data with Hadoop: The flood of information from social media and elsewhere is propelling companies’ use of free and customizable software called Hadoop to manage it. Good info on how companies are leveraging Hadoop to turn Big Data challenges into big business opportunities. http://buswk.co/rk2nTh
Sure, what else is new? 😉
So what? They also use Windows and mainframes 🙂
You also forgot about Linux (RedHat/CentOS/etc.). 🙂
I guess I should say that anyone who wants to see who is doing what just has to look at the members of Hadoop groups on LinkedIn or see who’s a member of the local Hadoop Usergroups. 😉
Question for you: I’ve heard that Hadoop is the new Linux. What are your thoughts on this?
Not sure what you meant by your question. One’s an OS the other is a M/R Framework and associated tools.
If you meant to compare Linux and Hadoop in terms of being a disruptive tech, then yes. And it’s already proven to be true. Hadoop is definitely a disruptive tech.
Hadoop was a disruptive tech years ago. Now, Hadoop is table stakes.
Hadoop these days is bread and butter for large data analytics. It’s fashionable enough that those who don’t have large amounts of data still want to use it. In time to come you are likely to see a lot of disruptive tech evolving around distributed analytics, including real-time crunching, beyond the M/R that Hadoop provides.
Steaks? Toss in a Baked Potato and I’m in. 🙂
I would have to say that Hadoop is still disruptive because we’re only now hitting the mainstream market. If you go back to Linux, you could have been an early adopter long before it hit mainstream shops, and I think the same is true of Hadoop.
Looking at all of the discussion spam from the likes of Informatica, along with mainstream articles now mentioning Hadoop and some of the big name adopters like Nokia… I would say that we’re still in the early stages.
“Looking at all of the discussion spam from the likes of Informatica…” awesome. 🙂
I’m sorry, I didn’t realize there was an issue. I’m simply sharing information about Big Data (Hadoop) and hoping to learn something in exchange about this hot topic. I thought that was the purpose of LinkedIn Groups. I didn’t mean to offend anyone.
You didn’t offend anyone. It’s just that for the most part, Hadoop is old news. And it doesn’t come close to low latency. The folks who gained competitive advantage years ago via Hadoop, or Map/Reduce, are certainly still using it but have also moved on to new solutions.
I believe that Hadoop, or Map/Reduce, will be disruptive but not for what you might think. It’s made the public start thinking in distributed terms: moving code to data, breaking problems down into smaller pieces, etc. It’s also re-introduced a focus on big data and the types of algorithms you need to deal with big data. The world is going real-time (I prefer event-driven) and Hadoop just isn’t going to cut it. The revolution is already over for Hadoop, I’m afraid.
Comparing Hadoop to Linux was done in terms of describing Hadoop as the new operating system for data. So in that regard, it’s a valid comparison even though they’re two very different technologies.
The Civil Rights Act was signed in 1964 but the first black president assumed office only in 2009. There is still a lot of Hadoop and Hadoop-powered disruption to come, I’m afraid. Of course there will be more in distributed computing, algorithms, event-driven processing, and even the way we understand and analyze data.
Shashank, and expectations for both were overblown and, thus far, found lacking. Time will tell, my friend, but either way, discussions about Hadoop don’t belong in this forum because of the phrase “Low Latency.”
This is a new topic area for me and I am genuinely interested. In the majority of articles I’ve read, Big Data and Hadoop are typically discussed together: big data (large volumes and different types of data, such as transaction data and interaction data) needs to be processed in Hadoop clusters. I’ve read about government agencies using low latency databases built on top of Hadoop for fuzzy matching of fingerprints, for example. So yes, I’m curious why you don’t care about Hadoop.
As far as I am aware, Disney is only looking into using Hadoop 🙂 and the same goes for other major studios. They are certainly no Google, which uses map/reduce only to take data in and then brings in more suitable tools for analyzing it, and no Amazon. Hadoop’s most useful feature is instant scalability and reliable availability of the data, with multiple segmented replication at almost the OS level; this is the one to be fascinated with. Map/reduce allows parallel processing of unstructured data into structured data and lets you store the result on disk. Then, there are better tools for processing structured data, to tell you the truth. All this “news” in architectures is old; however, Hadoop allowed certain things that were not possible before…
But for a specialized niche like low latency, you may want to contact Volkmar Uhlig of HStreaming. We just had a meeting where we had Ted Dunning from MapR and Volkmar from HStreaming present on their products. (I’ll try to have their presentations up on our meetup site this weekend.) HStreaming fits the niche of real-time M/R on events.
The point is that, yes, I agree with your points; however, I think that Hadoop is evolving and taking some interesting twists with both MapR and HStreaming.
I’m familiar with both MapR and HStreaming and while they might have Hadoop-friendly APIs, they’re not Hadoop. Quite frankly, they’re much more like CEP. Either way, the Hadoop you describe for the ‘mainstream community’ isn’t low latency.
Low latency isn’t niche – when you consider that going below a 5 minute batch time on Hadoop presents diminishing returns, most people are interested in processing transactions in somewhat less time. But again, Hadoop isn’t for that (but this forum is).
In regard to missing out on the evolution of Hadoop, my team has been doing event-driven (streaming) map/reduce now *for a couple of years* on Wall St. So, as Yevgeniya points out above, this new Hadoop architecture is actually quite old. Wall St has been using grid for 20 years (scatter/gather, map/reduce, whatever you want to call it), so I’ve actually been right in the thick of it the whole time.
If you can post links to the presentations you mention, that would be great. But please do it in a separate thread.