“Statistical and Machine-Learning Data Mining” – Is the topic a mixed metaphor?
Welcome to the Department of Redundancy Department… sort of.
One could argue that statistical learning, machine learning, and data mining are three ways of looking at the same thing. It is true that machine learning technically includes logic programming AI, but when most people talk about the three subjects, they’re usually talking about the same thing.
As a metaphor…? Since machines technically cannot learn the way humans do, by changing not only “software” (thoughts) but also “hardware” (neural connections), it may or may not be a metaphor, depending on what you are considering…
I would like to start with a few personal experiences. I began research in the early 2000s and attended a statistics conference. I was told about the debate between Cox and Breiman, but I was not familiar with its exact details. I remember a vintage statistician (I mean an immature one) declaring that topics like Data Mining and Machine Learning should not be allowed to grow and that “Statistics” must dominate everything. At that moment, as a beginner, I wondered why “Domination” was necessary. More recently, I attended a conference that an eminent statistician from India opened and, as expected, he launched an attack on the Machine Learning community, declaring that they violate the principles of randomization, etc. Using “Statistical Learning” concepts such as dimensional convergence, he convinced the audience that “Statistics”, and not “Machine Learning”, is the way forward. The consequence of such rhetoric is that beginners get intimidated by eminent figures and never bother to learn what “Machine Learning” is all about. In response to the rhetoric, I had prepared a presentation slide urging the audience to take a sane look at other disciplines, read Hastie, Tibshirani, and Friedman (2008), and then arrive at an appropriate conclusion. I was not aware of the book “Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data” by Prof. Ratner (who started this discussion). Of course, I was not allowed to present it. At that very same conference there was a presentation on some work related to “Data Mining”. Though the slides were beautiful and covered almost every term used in the subject, the problem was the complete absence of any “Mathematical detailing”. I mean that the concepts were presented in “Black Box” form and not as a scientific tool.
In this subcontinent, data mining methods are mostly presented as “Software Utilities” and not as scientific tools. Coincidentally, like Prof. Ratner, I am also greatly impressed by the late Tukey’s 1962 article “The Future of Data Analysis” and his EDA work. I will join the discussion later. Thanks.