Data Mining Book Review: Freakonomics

I recently read Freakonomics, by Steven Levitt and Stephen Dubner, a very interesting read. It’s the same kind of book as Super Crunchers (Ian Ayres) and The Numerati (Stephen Baker). The concept is to explain selected case studies using data mining and statistics. The focus is not on the techniques used (such as randomized experiments, linear regression, correlation, etc.) but rather on the description of the problem itself and the way to solve it.

Levitt and Dubner do a good job of presenting these case studies, most of which are very interesting. They proceed by asking questions such as “Why do drug dealers still live with their moms?” and “What makes a perfect parent?”. Each chapter is led by such a question, which is then answered by analysing facts and data. One could summarize their book as “The data can tell the story”. Their aim is to answer questions they consider interesting using available (usually public) data.

At the technical level, anybody can read this book, data miner / statistician or not. The good point is that beginners will learn what is feasible with data using statistics. Advanced data miners will find interesting examples and a reminder of basic principles. For example, the authors carefully explain the difference between causality and correlation. The book is easy to read and well written. The only weak point is the “expanded” part of the book. Most of the information it contains (from the authors’ blog) is already summarized elsewhere in the book, so one may feel like reading the book twice. To conclude, anyone interested in data (not only data miners) should read this insightful book.

Freakonomics on Amazon.

