2006 trends on Data Mining Research
Welcome back to Data Mining Research! I hope you enjoyed your holidays. Although I will not make predictions about the future of data mining, I want to highlight three topics that emerged from last year's posts on this blog.
The first one is the data mining software or language used by people in research and industry. It is clear that several possibilities exist (examples can be found in this post). I think that the diversity of the people using them, as well as of their aims, makes it difficult to have a universal language for data mining.
The second topic is data mining pitfalls and the difficulties they cause for beginners who use data mining as a tool. After the discussions on the posts about data mining pitfalls and about garbage in, garbage out, it is clear that many different pitfalls and traps lie on the road to knowledge.
The last one, related to the previous one, concerns the automation of the data mining task. One of the main issues is how to manage the pitfalls mentioned above. How can clustering be automated when the number of clusters is unknown? How can neural networks be automated while avoiding both underfitting and overfitting? How should the right data mining method be chosen? Some of these questions may be answered by following a methodology from a book. In addition, companies such as KXEN may be helpful.
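To make the clustering question above more concrete, here is a minimal sketch of one common heuristic: run k-means for several candidate values of k and keep the one with the highest mean silhouette score. The silhouette criterion, the farthest-point initialisation, the function names (kmeans, mean_silhouette, pick_k) and the toy data are all assumptions chosen for illustration, not something taken from the posts mentioned here. It only uses the Python standard library (math.dist needs Python 3.8 or later).

```python
import math

def kmeans(points, k, iters=50):
    """Plain k-means on tuples of coordinates; returns one label per point."""
    # Assumption: deterministic farthest-point initialisation, so the sketch
    # gives the same result on every run (real tools often use random restarts).
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels

def mean_silhouette(points, labels):
    """Average silhouette over all points (between -1 and 1, higher is better)."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, grouped by cluster label.
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or single cluster: score 0
            continue
        a = sum(own) / len(own)  # cohesion: mean distance inside own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def pick_k(points, max_k=6):
    """Try k = 2..max_k and return the k with the best mean silhouette."""
    scores = {
        k: mean_silhouette(points, kmeans(points, k)) for k in range(2, max_k + 1)
    }
    return max(scores, key=scores.get), scores

# Toy data (made up for this example): three well-separated groups of four points.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
best_k, scores = pick_k(points)
print(best_k, round(scores[best_k], 2))
```

On clean, well-separated data like this the heuristic recovers the obvious grouping, but it is itself a good example of the pitfalls discussed above: with noisy data, clusters of very different sizes or non-spherical shapes, the silhouette score can point to a misleading k, so the result still needs a human check.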