A note on correlation
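Correlation is often used as a preliminary technique to discover relationships between variables. More precisely, correlation measures the strength of the linear relationship between two variables. Pearson’s correlation coefficient is defined as:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

where cov(X, Y) is the covariance of X and Y, and σ_X and σ_Y are their standard deviations. The coefficient lies between -1 and 1.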
As noted above, the main drawback of correlation is that it captures only linear relationships. If the correlation between two variables is zero, they may still be related non-linearly. As noted in Tan et al. (2006), x and x^2 can have a correlation of zero (for instance when the values of x are symmetric around zero) even though one is a deterministic function of the other. Recall that non-linear does not mean polynomial. Consider for example x and cos(x): although their correlation is close to zero, they are related.
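The point can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sample grids and the helper name `pearson` are illustrative choices, not taken from Tan et al. It computes Pearson's coefficient for a linear pair, for x and x^2, and for x and cos(x) on a grid symmetric around zero, then repeats the x^2 case on a grid that is not symmetric.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays (hypothetical helper)."""
    return float(np.corrcoef(a, b)[0, 1])

# Assumed sample: 1001 evenly spaced points on [-10, 10], symmetric around zero.
x = np.linspace(-10.0, 10.0, 1001)

r_linear = pearson(x, 2.0 * x + 1.0)  # exact linear relationship
r_square = pearson(x, x ** 2)         # deterministic, non-linear, polynomial
r_cosine = pearson(x, np.cos(x))      # deterministic, non-linear, non-polynomial

# Same quadratic relationship, but on [0, 10], which is not symmetric around zero.
x_pos = np.linspace(0.0, 10.0, 1001)
r_square_pos = pearson(x_pos, x_pos ** 2)

print(f"x vs 2x+1           : {r_linear:.3f}")
print(f"x vs x^2 on [-10,10]: {r_square:.3f}")
print(f"x vs cos(x)         : {r_cosine:.3f}")
print(f"x vs x^2 on [0,10]  : {r_square_pos:.3f}")
```

On the symmetric grid the coefficients for x^2 and cos(x) vanish up to floating-point error, because both are even functions of x. On [0, 10] the coefficient for x^2 is large, which shows that the zero-correlation result depends on how x is distributed and not only on the functional form.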
P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison Wesley, 2006.