I recently received a copy of The Master Algorithm, written by Pedro Domingos, to review. While reading the back cover, my first impression was skepticism. Indeed, Domingos’ main idea in this book is the (future) existence of a so-called “Master Algorithm” that will outperform all other algorithms on any kind of task.
I can see you smiling, and so did I when I started reading the book. Having reached the end, I have a different, let’s say more enlightened, view of the Master Algorithm concept. Although Domingos offers several arguments for why such an algorithm should exist, I’m still not convinced, and I will tell you why at the end of this post.
Nevertheless, The Master Algorithm is a must-read for several reasons. First, the book is a journey through the field of data science algorithms, grouping data scientists into “tribes”:
- Symbolists (e.g. decision trees)
- Connectionists (e.g. neural networks)
- Evolutionaries (e.g. genetic algorithms)
- Bayesians (e.g. naive Bayes)
- Analogizers (e.g. SVM)
Each chapter describes the principles and history behind one of these tribes. This gives the reader a broad and comprehensive view of data mining approaches and the differences between them. The chapter about combining all methods is full of metaphors and very well written. A vision of the future is provided in the last chapter.
The second reason Domingos’ book is a must-read: its idea of a single algorithm to tune (one that beats all others) is interesting and well described. For Domingos, we will need to combine different learning paradigms to reach the Master Algorithm. According to him, it is the role of the reader to discover it, the book being only a starting point. This is a very nice way of motivating non-experts to read the book. Finally, the book is provocative (at least if you don’t believe in a single algorithm to solve all problems).
Let me now explain why I don’t believe that a Master Algorithm will soon replace the variety of existing ones. First, the very existence of a variety of algorithms is due to the various ways we can solve a given problem. As long as there is more than one person using Predictive Analytics, there will be more than one algorithm used. People use different algorithms because they think differently and are better at solving a problem with a specific algorithm they know better.
Second, the many different applications of Predictive Analytics are far too heterogeneous to be solved by a single algorithm. Want to understand your model? Use linear regression. Need stability? Just try SVM. Limited by a low-memory device implementation? Rules obtained using decision trees may do the trick. In conclusion, the variety of algorithms in use is needed because of the different skills of the people using them, the various applications to solve, and deployment particularities, for example.
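To make the “want to understand your model?” point concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the `fit_line` helper and the toy `nights` and `price` values are made up for this example and do not come from the book.

```python
def fit_line(xs, ys):
    """Ordinary least squares with a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: number of nights booked vs. price paid
nights = [1, 2, 3, 4, 5]
price = [120, 210, 330, 400, 510]

slope, intercept = fit_line(nights, price)
print(f"Each extra night adds about {slope:.0f} to the price (baseline {intercept:.0f}).")
```

The fitted slope can be read directly as “the effect of one more night”, which is the kind of explanation that is much harder to extract from, say, a neural network.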
Whether you agree with Domingos or not, this book is a must-have for learning machine learning without equations. It will help you get the big picture of the main learning paradigms. Finally, the provocative idea is not only intriguing, but also very well argued.
This is a guest post by Khushbu Shah from DeZyre.com.
The Internet today, as a collective agency, is creating 2.5 quintillion bytes of data on a daily basis, and nearly 90% of all of our global data has emerged in the past 2 years.
From an outsider’s perspective, Big Data sounds like a good thing, which it is if we can manage to handle it effectively. For…
If you need an ML book as a teacher, Machine Learning – The Art and Science of Algorithms that Make Sense of Data is definitely the one you need. It covers most ML algorithms, divided by genre (tree, rule, ensemble, etc.). From a teaching point of view, the book is quite comprehensive. From a practical point of view, some chapters…
Data Mining with Salesforce Customer Relationship Management
The world is now a global marketplace thanks to the Internet, and customer relationship management is a critical part of enforcing business efficiency to enable achievement of business objectives and set a company apart from the competition. Salesforce CRM tools and strategies will only be effective if you utilize the customer information generated to anticipate and fulfill their needs, eventually leading…
I will give a talk about one of our data science projects at Expedia at the events below. Please come by so we can chat if you are attending one of these events:
Competitive Intelligence (CI) is the process of collecting, aggregating and analyzing external data for the benefit of a company. A good introduction to the subject can be found in Competitive Intelligence Advantage: How to Minimize Risk, Avoid Surprises, and Grow Your Business in a Changing World. I particularly appreciate the use cases showing the distinction between competitive intelligence and competitor analysis. The below image provides a good view…
We recently started a Meetup group for the Swiss Association for Analytics: http://www.meetup.com/swiss-analytics
Feel free to join our Meetup to be informed about our free events in Switzerland. Recent event topics included analytics for CRM, fraud detection and text analytics.
We are always looking for speakers and sponsors, so feel free to contact us at firstname.lastname@example.org