Data Science Book: Everybody Lies

Seth Stephens-Davidowitz has written a very entertaining book on big data and how it can be used to understand humankind. Seth's main idea is that Google searches are the most powerful source of information for understanding what people really think. He argues that the main advantage of Big Data is the ability to zoom in on the data. When referring to Big Data, Seth deals only with the notion of Volume (not the other Vs, such as Velocity and Variety). He defines Big Data through use cases involving new kinds of data, in large volumes, that were not available before.

On Facebook and other social media sites, people tend to lie and present a flattering story about themselves. They want to look good and to be seen as happy individuals. With the Google search engine, the story is different: the anonymous nature of the tool leads people to ask about their real concerns. I would have appreciated a chapter on the limitations of such approaches. In particular, not everyone uses Google search, and those who do are a biased sample of the world population.

While plenty of other topics are covered in the book (e.g. A/B tests), Seth focuses primarily on examples related to sexuality. According to him, this is the most interesting subject, and one about which people are especially likely to lie. Seth provides plenty of meaningful examples of how people lie.

The book often mentions Data Science and Machine Learning, although in practice Seth mainly performs data analysis on huge volumes of data. Everybody Lies is definitely an interesting read, with the same kind of freshness as Weapons of Math Destruction, which I also recommend.

