# Auto-correlation for time series analysis

Recently, I was reading the EPFL magazine and was surprised to see an article interviewing my master's thesis adviser, Francois Fleuret. He explained data mining and gave an example from our project.

The goal of my master's project was to predict the pollen concentration in the air for the following days. For that, we used various kinds of weather data (temperature, wind, sun, rain, etc.) available as daily measurements. The most interesting part was not the data mining techniques used, but rather the results obtained.

We used various techniques such as linear regression and decision trees. At the end, we also tried auto-correlation to study the effect of one day's pollen quantity on the following days. As the auto-correlation was quite high, we plotted our predictions and saw that they were very close to the previous day's pollen concentration. We could thus conclude with a sentence such as “tomorrow's pollen concentration is very likely to be close to today's”.

The lesson that I learned from this project is to first try simpler methods (such as cross- and auto-correlation in this case) before turning to more complex data mining techniques. This idea is related to Occam's razor, which can be summarized by the well-known quote “Entities must not be multiplied beyond necessity”. Of course, as is recommended in data mining, you should always try more than one technique to make predictions.
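To make the idea concrete, here is a minimal sketch of the simple approach described above; the pollen values are made up for illustration (the actual thesis data is not shown here). It computes the lag-1 auto-correlation of a daily series and evaluates the naive “tomorrow equals today” persistence baseline:

```python
import numpy as np

def lag_autocorrelation(series, lag=1):
    """Pearson correlation between a series and itself shifted by `lag` days."""
    x = np.asarray(series, dtype=float)
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

# Toy daily pollen concentrations (invented for illustration only).
pollen = np.array([12.0, 14.0, 13.5, 15.0, 16.2, 15.8, 17.0, 16.5])

r1 = lag_autocorrelation(pollen, lag=1)

# Persistence baseline: predict tomorrow's value as today's value.
persistence_forecast = pollen[:-1]
actual = pollen[1:]
mae = np.mean(np.abs(persistence_forecast - actual))

print(f"lag-1 auto-correlation: {r1:.2f}")
print(f"persistence baseline MAE: {mae:.2f}")
```

A high lag-1 auto-correlation is exactly what makes the persistence baseline hard to beat, so it is worth computing before training anything more complex.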

## 6 comments found on “Auto-correlation for time series analysis”

1. secret says:

Nice.
We find this autocorrelation in many phenomena with a “momentum” property, in particular physical ones. I guess pollen molecules can't simply disappear; they have to move continuously. I guess it all depends on the time constants of the influential actions (gravity, wind, rain…).
Usually, with such problems, forecasting “tomorrow will be almost the same as today” is a first step, but it doesn't add much information.
The real difficulty is to forecast the difference from today, which is much harder.
A typical example is the prices of financial securities. What you want to forecast is the difference in prices (i.e., the return). Some people say it's impossible (Efficient Market Hypothesis).
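A quick illustration of this point, using synthetic data (no real price series is involved): a random-walk “price” level is highly autocorrelated, while its day-to-day differences carry essentially no linear structure.

```python
import numpy as np

def lag1_corr(x):
    """Correlation between a series and itself shifted by one step."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 1.0, size=200)  # i.i.d. "returns": no structure by construction
prices = 100.0 + np.cumsum(returns)       # the corresponding random-walk "price" level

print(f"lag-1 correlation of prices:  {lag1_corr(prices):.2f}")   # close to 1
print(f"lag-1 correlation of returns: {lag1_corr(returns):.2f}")  # close to 0
```

The levels look very predictable (persistence works), but the quantity you actually care about, the change, is where all the difficulty hides.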

Einstein: “Make things as simple as possible, but not simpler.”

Some issues I have with the Occam principle:
Do you think it's a truth of nature? Do you think it's just a nice technical methodology? Why does it actually work in practice? Why should the world be “simple”? What is your definition of “simplicity”? Does it depend on our human way of thinking?
I like the Bayesian philosophy of science a lot, as it gives a rational scientific approach to experiments and model updates. But there is no such thing as Occam's razor in the Bayesian foundations. Sometimes it appears during marginalization (an automatic razor), but sometimes it doesn't! I would be happy to understand that stuff more deeply.
Regards

2. @secret: Thanks for this very interesting comment, particularly about the Occam's razor issue. We should never take things for granted but rather put everything into question. I hope your comment will launch a discussion on this topic…

3. When we have ambitious projects in mind, we tend to forget basic, simpler methods. Thanks for the reminder.