If you want to know what is
trendy these days (or years!), have a look at
Google Trends. It is easy to use and gives you a good idea of the information people are looking for on the web (or at least on Google). Here is Google's definition of this functionality: "
Google Trends analyzes a portion of Google web searches to compute how many searches have been done
You surely know Google, and you certainly know Gmail, Google Earth and Picasa. But do you know Google Spell Checker? Probably not, since it does not really exist: it is just a different use of the Google search engine. As an example, let's take the (plural) expression
models predictions. It is a very common mistake for French-speaking people, such as me, to write…
Analyzing data from blogs is now possible with tools such as
BlogPulse. A good example of such analysis is given on
Matthew's blog. For example, we can compare the use (in blogs) of the two words
Israel and
Lebanon. It is interesting to notice how closely they are related, even more so when comparing them with the term
Iran
In the data mining (or machine learning) community, as you might expect, the term
learning is often used. As a consequence, it is explained in many books, usually with different definitions. It is therefore often difficult to find a clear and understandable definition of the term
learning. In my reading, I found the following definition to be the clearest: "
Learning is the improvement of performance in some
Here is a very interesting blog about data mining:
http://datamining.typepad.com/
The author, Matthew Hurst, writes about many subjects, mainly related to statistics and the web. Several posts are related to data mining. Moreover, he regularly posts messages that are of general interest…
How many times have you read a paper about data mining which does not illustrate its results using a common database available on the web? This is a provocative question, of course. However, based on my personal reading experience in data mining, I estimate that 7 out of 10 papers use common data available on the web, for example:
These data are clearly intended for the precise purpose of testing…
Here is a non-exhaustive list of general or introductory books about data mining and machine learning:
- Cristianini N. and Shawe-Taylor J., An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, 2000.
- Hand D., Mannila H. and Smyth P., Principles of Data Mining, MIT Press, 2001.
- Langley P., Elements of Machine Learning, Morgan Kaufmann Publishers, 1996.
- Larose D.T., Discovering Knowledge in Data: An Introduction to Data Mining, Wiley-Interscience, 2005.
- Mitchell