In certain situations, the data miner has to sample the dataset before applying any algorithm, mainly because there is too much data to mine. In such a case, one possible technique is random sampling. If classes are uniformly distributed, one may use random sampling before supervised learning.
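As a rough illustration of random sampling before supervised learning, here is a minimal Python sketch. The toy dataset, the helper names `random_sample` and `class_proportions`, and the 10% sampling fraction are all assumptions made for this example, not something taken from the post. It draws a simple random sample without replacement and compares class proportions before and after sampling.

```python
import random
from collections import Counter


def random_sample(rows, fraction, seed=0):
    """Draw a simple random sample (without replacement) of about `fraction` of rows."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = max(1, round(len(rows) * fraction))
    return rng.sample(rows, k)


def class_proportions(rows):
    """Return the share of each class label among (feature, label) rows."""
    counts = Counter(label for _, label in rows)
    total = len(rows)
    return {label: n / total for label, n in counts.items()}


# Hypothetical toy dataset: (feature, label) pairs with two balanced classes.
data = [(i, "A" if i % 2 == 0 else "B") for i in range(10_000)]
sample = random_sample(data, fraction=0.1)

print(class_proportions(data))    # proportions in the full dataset
print(class_proportions(sample))  # proportions in the 10% sample
```

With balanced classes, the sample proportions should stay close to those of the full dataset, which is the situation in which plain random sampling is a reasonable first choice.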
But what about association rule mining? If you use random sampling before an association rule algorithm, you may end up finding no…

I was at PAW Gov in Washington D.C. on September 12th and 13th and it was just great! Let's start with the people. It was a pleasure for me to meet so many data mining experts. That was one great aspect of this PAW conference: experts are very accessible compared to other events. I had the opportunity to meet great…
BAQMaR, a network of analytics people, is organizing its annual event on December 8th in Ghent, Belgium. I have been invited to give a talk during the data mining session. I will present the work I did when I was a consultant for FinScore. The talk is entitled "Personalized online advertising using data mining". If you are interested, feel free to register for the BAQMaR conference. For more details, look…

The French also have their data miners, and more particularly their data mining authors. Stéphane Tufféry is one of them. In 2008 he authored the book "Data Mining et Statistique Décisionnelle", which was translated into English in 2011 (a huge job given the size of the book - more than 680 pages - done by Rod Riesco). The…
I recently discovered Cross Validated, a Q&A platform for statisticians and data miners. Together with Seth Rogers, Community Developer of Cross Validated, I interviewed Rob Hyndman, Professor of Statistics, who proposed forming this particular community. Thanks to both Seth and Rob for their collaboration.

Data Mining Research: Could you introduce yourself and explain your relationship to Analytics?
Rob Hyndman…

After the article How to use Twitter for data miners, let me offer some advice on using LinkedIn. First, you may already know that your LinkedIn account can be linked to display your tweets (see this link). Continue by adding the right keywords to your summary, so that other data miners can find you easily. Examples of terms are…

At first, one can think of data mining as a way to answer questions. This is one way of using data mining. Below are examples of questions:
- Which of my customers are most likely to churn next month?
- Which parameters are most important for predicting tomorrow's weather?
- Are there groups (clusters) among my clients?
- What is the probability of