I recently read an article in Analytics Magazine entitled “Managing fundamental tradeoffs” by Mu Zhu. It is interesting because it explains what prevents data mining tools from being easily automated. According to the author:
“Algorithms used to uncover such a relationship – for example, neural networks and support vector machines – must be sufficiently flexible. This is because only flexible algorithms can be adapted to the vastly different situations that we encounter in practice.”
He illustrates his ideas using the K-nearest neighbors (KNN) algorithm. He explains that any flexible algorithm has “knobs” that must be tuned:
“Blind applications of predictive algorithms without carefully turning these “knobs” are sure to produce bad or even disastrous results.”
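To make the “knob” idea concrete, here is a minimal sketch of KNN in plain Python where K is tried at several settings and scored on held-out data. It is my own illustration, not code from the article: the toy dataset (two classes split at x = 0.5 with 20% of labels flipped), the helper names and the list of K values are all assumptions chosen for the example.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    # train is a list of ((x, y), label) pairs; squared Euclidean distance.
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: class is 1 when x > 0.5, with some labels flipped."""
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append(((x, y), label))
    return data

def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(knn_predict(train, point, k) == label for point, label in test)
    return hits / len(test)

rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(200, rng)
valid = make_data(200, rng)     # held-out data used only for scoring

# Turn the knob: odd values of K avoid tied votes between the two classes.
scores = {k: accuracy(train, valid, k) for k in (1, 5, 15, 51, 199)}
for k, acc in scores.items():
    print(f"k={k:3d}  validation accuracy={acc:.2f}")

best_k = max(scores, key=scores.get)
print("best k on this validation set:", best_k)
```

A very small K follows the noisy labels closely, while a K close to the size of the training set ignores the query point almost entirely. Where the best setting falls depends on the data, which is exactly why it cannot be fixed in advance.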
He gives another example of “knobs” with decision trees:
“The size of the decision tree is an important “knob” and, like the KNN, it is necessary to control this parameter carefully in order for the decision tree to be effective.”
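The same experiment can be sketched for tree size. The code below is a deliberately simple one-feature decision tree of my own, not the author's method: it splits greedily on misclassification count, and `max_depth` plays the role of the size “knob”. The toy data, the function names and the depth values are assumptions made for illustration.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def leaf_errors(labels):
    """Number of points a majority-vote leaf would misclassify."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def build_tree(points, max_depth):
    """points: list of (x, label). Returns a leaf label or a dict node."""
    labels = [label for _, label in points]
    xs = sorted({x for x, _ in points})
    if max_depth == 0 or len(set(labels)) == 1 or len(xs) == 1:
        return majority(labels)
    best_t, best_err = None, None
    for t in xs[:-1]:  # splitting at "x <= t" keeps both sides non-empty
        err = (leaf_errors([l for x, l in points if x <= t])
               + leaf_errors([l for x, l in points if x > t]))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return {
        "threshold": best_t,
        "left": build_tree([p for p in points if p[0] <= best_t], max_depth - 1),
        "right": build_tree([p for p in points if p[0] > best_t], max_depth - 1),
    }

def predict(tree, x):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree

def count_leaves(tree):
    if not isinstance(tree, dict):
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

def accuracy(tree, points):
    return sum(predict(tree, x) == label for x, label in points) / len(points)

def make_data(n, rng, noise=0.2):
    """Hypothetical toy data: class is 1 when x > 0.5, with some labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

rng = random.Random(0)
train, valid = make_data(200, rng), make_data(200, rng)

results = {}
for depth in (1, 2, 4, 8, 16):  # turn the size knob
    tree = build_tree(train, depth)
    results[depth] = (count_leaves(tree), accuracy(tree, train), accuracy(tree, valid))
    leaves, train_acc, valid_acc = results[depth]
    print(f"max_depth={depth:2d}  leaves={leaves:3d}  "
          f"train={train_acc:.2f}  validation={valid_acc:.2f}")
```

In this sketch the training accuracy can only stay level or rise as the tree grows, so it says nothing about when to stop. The validation column is the one to watch when setting the knob.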
Finally, a quote that I find very important:
“Predictive analytics and data mining are about finding information from data. They are search operations. As with all search operations, there are always two questions: where do we search, and how do we search? The algorithms are concerned with how to search, but we must tell them where to search, that is, we must feed the algorithms with data.”
One of the author's main conclusions is that “one-click” applications cannot solve all problems. What’s your opinion about that? If this is correct, then the next question is: in which situations can we use “one-click” applications? Feel free to share your thoughts on this issue.
Read the full article.