Math Stats and Data Mining
I recently found a new data mining blog named "Math Stats and Data Mining", written by Rachel Graham. It is a very nice blog with a particular focus on statistics and making sense of data. I really like the way the posts are written: readable and entertaining, with a personal viewpoint. Certain posts are particularly interesting, such as the one on the Pythagorean Theorem or the one entitled "Why is Statistics… Continue reading...
Poll on DIKW hierarchy
If you work in data mining, you are confronted every day with terms such as data, information and knowledge. As explained in a previous post on Data Mining Research, there is a hierarchy among these terms. It is usually represented as shown in the following picture.
My question, regarding this terminology is: What do
The two cultures according to Breiman
In a recent post on Data Mining Research, Will mentioned a paper entitled Statistical Modeling: The Two Cultures. This paper, written by Leo Breiman (the father of decision trees) and published in 2001 in Statistical Science, is intended for both statisticians and data miners. As indicated in the title, Breiman compares two different cultures: the statistical culture, which assumes data models, and the data mining culture, which uses algorithmic models. The
Small book review: Web Dragons
November 16, 2007 by Sandro Saitta
Filed under: data mining books, search engine, web dragons
Data mining is a field closely related to information extraction and search engines. Web Dragons: Inside the Myths of Search Engine Technology explains everything you want to know about search engines (the so-called "web dragons") and how they work. Before reading the book, you may wonder why Witten and co-authors called… Continue reading...
RSS Feed of Data Mining Research
Some readers reported that the RSS feed of Data Mining Research sometimes returns FeedBurner error reports. This may happen if you use the old RSS feed from Blogger:
http://dataminingresearch.blogspot.com/atom.xml (old feed)
This feed is no longer valid. If you are still using it, please update to the following one:
http://feeds.feedburner.com/dataminingblog (new feed)
Thanks to Shane for noticing the problem.
Data mining and statistics
November 8, 2007 by Sandro Saitta
Filed under: business, data mining for companies, statistics
I recently found an interesting paper about the connection between data mining and statistics. It is written by Diego Kuonen, who is now working at Statoo Consulting in Switzerland. The basic question driving his paper is whether data mining is statistical déjà vu. After explaining what statistics is and why it is needed, he explains data mining using several definitions. He points out an interesting fact by… Continue reading...












