Top Five Articles in Data Mining

April 19, 2011 by Sandro Saitta
Filed under: Uncategorized 


Over the last few years, I’ve read many data mining articles. Here is a list of my top five. For each article, I give the title, the authors and part of the abstract. Feel free to suggest your own favorites.

An Introduction to Variable and Feature Selection

Isabelle Guyon and André Elisseeff

Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available. These areas include text processing of internet documents, gene expression array analysis, and combinatorial chemistry. The objective of variable selection is three-fold: improving the prediction performance of the predictors, providing faster and more cost-effective predictors, and providing a better understanding of the underlying process that generated the data.
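To make the idea of variable selection a bit more concrete, here is a minimal sketch of one simple approach: ranking variables by the absolute value of their Pearson correlation with the target. This is only an illustration, not code from the paper. The function names (`pearson`, `rank_features`) and the toy data are made up for this example, and a correlation filter only detects linear relationships with the target.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target):
    """Return column indices sorted by decreasing |correlation| with target."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest score first; ties broken by column index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores]


# Toy data (invented for illustration): column 0 tracks the target,
# column 1 is constant, column 2 is mostly noise.
rows = [
    [1.0, 5.0, 3.0],
    [2.0, 5.0, 1.0],
    [3.0, 5.0, 4.0],
    [4.0, 5.0, 1.0],
]
target = [1.1, 1.9, 3.2, 3.9]

ranking = rank_features(rows, target)
print(ranking)
```

On this toy data the informative column comes first and the constant column last. Keeping only the top-ranked columns is the simplest way to get the "faster and more cost-effective predictors" the abstract mentions, although a filter like this scores each variable on its own and can miss variables that are only useful in combination.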

Data Clustering: A Review

A.K. Jain, M.N. Murty and P.J. Flynn

Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities has made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
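As a small illustration of what grouping patterns into clusters looks like in practice, here is a minimal sketch of k-means (Lloyd's algorithm), one of the classic methods in this area. The code is my own simplified version, not taken from the paper. It assumes 2-D points, squared Euclidean distance, a fixed number of iterations, and a naive initialization that uses the first k points as starting centroids.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    # Naive initialization (an assumption of this sketch): first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        for i, (x, y) in enumerate(points):
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


# Toy data (invented for illustration): two well-separated groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

The result depends on the starting centroids and on the choice of k, which is one reason the abstract calls clustering a difficult problem. Real applications usually try several initializations and compare the results.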

From Data Mining to Knowledge Discovery in Databases

Usama Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth

Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases.

Nine Laws of Data Mining

Tom Khabaza

In its current form, data mining as a field of practise came into existence in the 1990s, aided by the emergence of data mining algorithms packaged within workbenches so as to be suitable for business analysts.  Perhaps because of its origins in practice rather than in theory, relatively little attention has been paid to understanding the nature of the data mining process.  The development of the CRISP-DM methodology in the late 1990s was a substantial step towards a standardised description of the process that had already been found successful and was (and is) followed by most practising data miners.

Statistical Modeling: The Two Cultures

Leo Breiman

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics.


Comments

30 Comments on Top Five Articles in Data Mining

  1. whathappenedto on Wed, 20th Apr 2011 2:31 am
    might also want to check out the best papers in KDD (the top data mining conference)

    http://jeffhuang.com/best_paper_awards.html#kdd

  2. Dream on Tue, 10th May 2011 7:13 am
    thanks to Sandro Saitta for the five articles.
    i’m planning do my Ph.D for DM.
    but no ideas obtained now.

  3. madhu on Fri, 3rd Jun 2011 10:27 am
    also want latest best papers in data mining on various functionality

  4. poongodi mahindran on Tue, 12th Jul 2011 6:00 am
    suggest topics in datamining for phd research work

  5. M.Jeyasutha on Fri, 15th Jul 2011 9:44 am
    Tell me some Ph.D research topics in Data Mining.

  6. pankaj p. on Sat, 23rd Jul 2011 11:40 pm
    I am also doing my phd in data mining contact to me on pankajpathak101@gmail.com

  7. Anandhi on Fri, 19th Aug 2011 11:05 am
    suggest some topics(area) in datamining for phd research work

  8. ravi.g on Mon, 29th Aug 2011 10:11 am
    please any one suggest some topics in data mining for P.HD
    contact to me: g.raviraja@gmail.com

  9. Frank on Tue, 13th Sep 2011 12:38 pm
    Why are people who’s name is ending with -i -j- or -y desperately looking for a phd topic? :)

  10. Sandro Saitta on Wed, 14th Sep 2011 1:53 pm
    @Frank: good question! Anyway, I can’t answer to everybody separately, so please, have a look at this page on Data Mining Research: http://www.dataminingblog.com/new-to-data-mining/

  11. sivakumar on Thu, 22nd Dec 2011 11:02 am
    I’m doing research in DM. Can any one tell me, where i can get answers for all my questions regarding DM?

  12. deepa on Mon, 26th Dec 2011 9:54 am
    can u help me find a good paper on data mining for my research work

  13. subha on Wed, 11th Jan 2012 7:39 am
    I want to do my PHD in the data mining. Can you please suggest me some research problems in this area and some research paper that would help me to write PHD Thesis proposal.

  14. VEERASWAMY on Wed, 11th Jan 2012 10:30 am
    please send paper on feature selection in datamining

  15. VEERASWAMY on Wed, 11th Jan 2012 10:31 am
    please send paper on feature selection in datamining

    my mail id ammisetty.veeraswamy@gmail.com

  16. Phil on Sun, 11th Mar 2012 4:19 am
    For all those looking for a phd topic in data mining, a good way to find a topic is to look at the papers published in the top data mining conferences (KDD, PAKDD, ICDM, PKDD, …). Some hot topics now are social networks, big data, mobile data mining…

  17. Data Research services on Thu, 12th Jul 2012 2:48 pm
    Hi Sandro,

    Thanks for sharing the information. I am into the same business, and what to say i have downloaded each of the pdf’s and looking forward to read them all.

  18. Girish on Sat, 6th Oct 2012 7:08 am
    All those desperadoes looking for topic for Phd… How are you going to do PHD if you do not even know what is latest and unexplored in data mining? And Isn’t your PHD guide suppose to guide you? Or they are also equally dxxx?
    PHD is more than the certificate guys. Real scholarship is expected. Not a tool to get the next job or promotion in your department.

  19. srinivas on Sat, 20th Jul 2013 7:46 am
    i need to get a problem linked with cellular automata with datamining

  20. Sagar Kapadiya on Thu, 1st Aug 2013 10:24 am
    Please send me all research paper based on Usage data mining on my email address : kapadiya.sagar@gmail.com

  21. sujit kumar bahdani on Tue, 6th Aug 2013 7:26 pm
    Please send me all research paper based on Usage data mining on my email address :

  22. sujit kumar bahdani on Tue, 6th Aug 2013 7:28 pm
    Please send me all research paper based on Usage data mining on my email address : sujit9925@yahoo.in

  23. guru on Wed, 1st Jan 2014 6:34 pm
    any body from bangalore, who can help in getting research proposal

  24. Rajakumar Duraimurugan on Sun, 16th Mar 2014 2:43 pm
    Its good to see the great work happening in Data Mining, but would be more useful and appropriate if a commercial dimension is given to the research…Rajakumar Duraimurugan, Bangalore.

  25. deepa on Sat, 25th Oct 2014 6:16 am
    hai sir…i am doing m.phill..my research is starting in dm.plse send me ideas how to start the research…

  26. subha on Fri, 31st Oct 2014 2:24 pm
    Hello sir.. Iam planned to do ph.d in data mining.So pls send me some topics for doing ph.d in data mining.

  27. suganya on Tue, 18th Nov 2014 7:28 am
    i m doing my mphil my research is about data mining can u please send me topics.

  28. samira bayat on Mon, 8th Dec 2014 10:49 pm
    hi
    I can not select title for my thesis in SVM.
    please,please help me.
    I want find 5 articles in this field.
    thanks a lot.

  29. Sourav Datta on Thu, 18th Dec 2014 11:22 am
    Hello Guys this is Sourav, a new guy in data mining world. A big respect from my side to all erudite guys present here. I am not as knowledgable as you guys. I want to learn a lot about data mining. Please can anybody send me a good code on how to sentiment analysis. I am unable to classify the the tweets into positive and negative. Please people post here how to do the code

  30. SOFIA on Fri, 19th Dec 2014 4:26 pm
    PLEASE I KINDLY REQUEST TO SEND SOME IDEAS ABOUT DATA MINING FOR M.Phil research
