Will Dwinnell is a data mining practitioner with long experience, as well as a blogger at Data Mining in MATLAB and Abbott Analytics. He kindly agreed to answer questions from Data Mining Research (DMR) about his everyday work.
DMR: Who are you and what is your job?
Will Dwinnell (WD): I am Will Dwinnell and I build predictive mathematical models. At the moment, I work for a credit card company, predicting customer behavior. Prior to holding this position, I built models of telecommunications customer churn, medical patients (cancer diagnosis), microeconomic forecasting, industrial part quality prediction and mutual fund customer defection, among other things.
DMR: What are your everyday data mining challenges?
WD: Probably the same as anyone else’s. On the technical side: data which is difficult to access or which is of poor quality (missing values, weak predictors, poor documentation, etc.). On the business side: dealing with non-data miners.
DMR: Can you give an example of a recurrent issue you face when you are in the “data preparation” step?
WD: Data quality is nearly always an issue. Getting appropriate samples is a challenge, especially when a non-statistician pulls the data. Many difficulties are the same as those faced by any consumer of organizational data: poor or nonexistent documentation, inconsistent variable meanings, frequent missing values, missing value flags which vary from field to field, and the inability to link vital tables are typical. I have worked hard to automate solutions to some of these problems.
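Editor's illustration: the answer above mentions missing value flags that vary from field to field. The Python sketch below shows one way such cleanup might be automated. It is not Will Dwinnell's own code; the field names, the flag values in `MISSING_FLAGS`, and the helper names `normalize_missing` and `missing_rate` are all hypothetical, chosen only for the example.

```python
# Hypothetical per-field missing-value flags. In practice these would be
# collected from whatever documentation exists for each source table.
MISSING_FLAGS = {
    "age": {"-1", "999", ""},
    "income": {"NA", "0000", ""},
    "zip_code": {"00000", "unknown", ""},
}


def normalize_missing(records, flags=MISSING_FLAGS):
    """Return copies of the records with field-specific flags replaced by None."""
    cleaned = []
    for record in records:
        row = {}
        for field, value in record.items():
            text = "" if value is None else str(value).strip()
            # Fields without a known flag set only treat empty values as missing.
            row[field] = None if text in flags.get(field, {""}) else value
        cleaned.append(row)
    return cleaned


def missing_rate(records):
    """Fraction of missing (None) values per field, as a quick quality report."""
    if not records:
        return {}
    fields = {field for record in records for field in record}
    return {
        field: sum(r.get(field) is None for r in records) / len(records)
        for field in sorted(fields)
    }
```

Keeping the flags in one table per field, rather than scattering special cases through the code, means a value such as "999" can be treated as missing for one field and as a legitimate value for another.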
DMR: As an experienced data miner, do you have any general advice for other practitioners in this field?
WD: My primary technical advice would be: Never stop learning. I learn from books and papers (including student project reports and even, on occasion, reports by high school students), from conversations with other analysts, and through experimentation.
Thanks to Will Dwinnell for his answers.