Data mining definitions
Filed under: data mining, exploratory data analysis, knowledge discovery, machine learning, pattern recognition, terminology
In the literature, the field of data mining goes by several other names. Below are examples of definitions related to the field:
- Machine Learning: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” (Mitchell, 1997)
- Exploratory Data Analysis: “A philosophy of data analysis where the researcher examines the data without any pre-conceived ideas in order to discover what the data can tell him about the phenomena being studied.” (Martinez and Martinez, 2004)
- Pattern Recognition: “Statistical pattern recognition is a term used to cover all stages of an investigation from problem formulation and data collection through to discrimination and classification, assessment of results and interpretation.” (Webb, 2002)
- Data Mining: “Data mining is a technology that blends traditional data analysis methods with sophisticated algorithms for processing large volumes of data.” (Tan et al., 2006)
- Knowledge Discovery: “[...] a new generation of techniques and tools with the ability to intelligently and automatically assist humans in analyzing the mountains of data for nuggets of useful knowledge.” (Fayyad et al., 1996)
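Mitchell's definition is the most formal of the list, so a toy illustration may help make E, T and P concrete. The sketch below is only an assumption-laden example, not taken from any of the cited books: the task T is deciding whether an integer between 0 and 99 is "large" (the hidden rule `x >= 50` is invented here), the performance measure P is accuracy over all 100 integers, and the experience E is a hand-picked list of labelled examples. The names `learn_threshold`, `accuracy`, `performance_curve` and `EXPERIENCE` are hypothetical.

```python
# Toy illustration of Mitchell's (E, T, P) definition of learning.
# Task T: decide whether an integer in 0..99 is "large".
# Performance P: accuracy over all 100 integers.
# Experience E: a growing list of labelled examples.


def true_label(x):
    # Hidden concept, an assumption made up for this example.
    return x >= 50


def learn_threshold(examples):
    """Put the threshold midway between the largest negative
    and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold):
    """Performance measure P: fraction of 0..99 classified correctly."""
    test_points = range(100)
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)


# Hand-picked experience; each pair is (value, is_large).
EXPERIENCE = [
    (10, False), (60, True),
    (30, False), (80, True),
    (48, False), (52, True),
]


def performance_curve(experience=EXPERIENCE, steps=(2, 4, 6)):
    """Accuracy after seeing the first n examples, for each n in steps."""
    return [accuracy(learn_threshold(experience[:n])) for n in steps]


if __name__ == "__main__":
    for n, p in zip((2, 4, 6), performance_curve()):
        print(f"E = {n} examples -> P = {p:.2f}")
```

With this particular experience the accuracy goes from 0.85 to 0.95 to 1.00 as more examples are seen, which is what "improves with experience E" means in the definition. A different choice of examples would give a different curve.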
Some of these definitions are technical while others are intuitive. Do you have other examples of such definitions, or remarks on these? Feel free to comment.
Comments
5 Comments on Data mining definitions
- Daniele on Wed, 25th Apr 2007 8:15 am
The most important:
“Machine learning is statistics minus any checking of models and assumptions” (Prof. Ripley, Oxford University, UK).
- Anonymous on Thu, 26th Apr 2007 8:25 pm
Jeff Jonas has covered this, as well.
- Will Dwinnell on Fri, 11th May 2007 1:03 am
Breiman’s essay Statistical Modeling: The Two Cultures explores some of these issues.
- Sandro Saitta on Tue, 15th May 2007 9:08 am
Daniele, thanks for the nice definition. Will, thanks for the link. It seems interesting, so I will read it soon.
- James Taylor on Thu, 31st May 2007 12:15 am
Here’s a post I wrote on the topic of definitions to add to the mix. My favorite definition, though, was one I saw in the R&D department here:
Analytics simplify data to amplify its value.
JT
http://www.edmblog.com