What are the key characteristics of a ‘Big Data’ problem?
When can you say that a problem cannot be solved using traditional RDBMS and/or BI tools?
I don’t think there is a straightforward answer. It all depends on what you are trying to solve. Most typical scenarios involve marrying the traditional, structured data stored in your RDBMS or data marts with the unstructured or semi-structured data stored in logs or NoSQL databases. But if I have to answer your question, it is mainly data that is generated in bulk, GBs or TBs every day or week, which you need to analyse to get insights, either to use for competitive differentiation or to define new business
Big Data as such is not a problem. It is in fact a solution to many unsolved problems, if you know how to process it in a cost-effective and timely manner. Volume, velocity and variety are the three key characteristics of Big Data, not of the Big Data problem though. Traditional RDBMSs have their own strengths, but it may be difficult to convert millions of scanned copies of newspapers into PDF using an RDBMS (just a use case example). Moreover, you may not need all the source data to be persisted, only the final aggregated output, so another challenge you have there is high-velocity streaming data. There can be many other examples and good use cases.
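To make the streaming point a bit more concrete, here is a minimal Python sketch of keeping only the aggregated output while discarding the raw events. Everything in it is an assumption for illustration: `fake_event_source` is a hypothetical stand-in for a real high-velocity feed, and the `(key, value)` event shape and the sensor names are made up. A real system would spread this work across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def aggregate_stream(events: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Consume (key, value) events one at a time, keeping only running totals."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for key, value in events:
        bucket = totals[key]
        bucket["count"] += 1
        bucket["sum"] += value
        # The raw event is not stored anywhere; only the totals survive.
    # Derive the mean per key from the running count and sum.
    return {k: {**v, "mean": v["sum"] / v["count"]} for k, v in totals.items()}


def fake_event_source() -> Iterator[Tuple[str, float]]:
    """Hypothetical stand-in for a high-velocity feed (assumption, not a real API)."""
    for i in range(1000):
        yield ("sensor_a" if i % 2 == 0 else "sensor_b", float(i))


summary = aggregate_stream(fake_event_source())
print(summary)
```

Memory use here grows with the number of distinct keys, not with the number of events, which is the property you want when the source data does not need to be persisted.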
Thanks to both Partha and Vishal. The community has solved these kinds of problems before using clustering, partitioning and various other combinations of tools. So is this a new buzzword for distributed computing? I am just trying to probe a bit deeper and see if you gurus have any solid cases.