I’m a big fan of decision trees and linear regression. If that’s not enough, I use a hand-made mix of genetic algorithms and ensemble classifiers.
I still don’t get SVMs, especially how you find the kernel to use. I love the idea of avoiding overfitting, but what’s the point if you then introduce a soft margin and a space transformation that destroy that nice feature? Judging by the number of illustrations that use 4 points as support vectors, I think few people understand the basics of SVMs (on Wikipedia, the French version uses 4 points, the German probably 4 as well; only the English one is correct with 3 points).
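For what it’s worth, in practice the kernel and the soft-margin penalty C are rarely derived from theory; they are picked empirically by cross-validated grid search. A minimal sketch in Python with scikit-learn (the toy dataset and grid values are illustrative assumptions, not anything from this thread):

```python
# Hedged sketch: C (soft-margin penalty) and gamma (RBF kernel width) are
# tuned by cross-validated grid search rather than "found" analytically.
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
grid = GridSearchCV(
    SVC(kernel="rbf"),                      # RBF is a common default kernel
    param_grid={"C": [0.1, 1, 10],          # soft-margin penalty
                "gamma": [0.1, 1, 10]},     # kernel width
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_)
# With noisy data, the fitted model typically keeps far more support
# vectors than the 3-4 points shown in textbook figures:
print(len(grid.best_estimator_.support_))
```

The number of support vectors printed at the end is the point: on overlapping, noisy classes a soft-margin SVM retains every margin violator as a support vector, which is why the 3-point textbook picture is only the idealized separable case.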
My favorite “hammer” for classification and regression problems is gradient boosted regression trees. Seamless handling of missing values, mixed-type and potentially correlated predictors, high accuracy, variable importance measures, partial dependence plots to understand the average marginal effects of the inputs, etc. make this my first go-to algorithm in the toolbox.
Generally, I also like ensembles of different classifier types (hybrid ensembles).
Another vote for gradient boosted tree ensembles as the first call for many problems, for the reasons @Jeff mentioned. (I most often use the gbm package in R.)
After that, mixed ensembles. I often use the caret package, which provides a reasonably consistent interface to more than 140 different models, with all the bagging, cross-validation/bootstrapping, and parallel computing already taken care of.
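A rough Python analogue of that caret workflow, one uniform interface with cross-validation handled for you, might look like this (caret itself is R; the model choices and data here are illustrative assumptions):

```python
# Hedged sketch: comparing several model families through one consistent
# fit/score interface, the way the comment describes using caret in R.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
models = {
    "gbm":    GradientBoostingClassifier(random_state=0),
    "rf":     RandomForestClassifier(random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
}
# Same call for every model: this uniformity is the appeal of caret's train()
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, s in scores.items():
    print(name, round(s, 3))
```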
Sounds like Jeff and I should set up an echo chamber somewhere.
Great post, Sandro
Copyright © 2008 · Revolution Code Blue theme by Brian Gardner