What have you mined?

During the last five years, I have mined different types of data. For my master's project, I mined pollen data in order to predict the pollen concentration for the following week. In my PhD, I mine engineering data, more precisely data used for diagnosis. A structure can be in different states, and each of these states is an example described by a set of parameters (dimensions). My aim is to mine these “states” to discover useful knowledge for engineers performing diagnosis.

Beyond the “standard” data mined in most of the data mining literature, I have also heard of data mining being applied to:

  • Wine data (Super Crunchers book)
  • Baseball data (Data Mining and Predictive Analytics blog)
  • Cat faces (a colleague of mine)
  • Casino games data (an episode of Numb3rs)

I’m sure many other kinds of data have been mined. I’d be interested to know what types of data you have mined. Feel free to share your personal experience (or knowledge).


3 comments found on “What have you mined?”

  1. Professionally, most of my predictive modeling work has been pretty conventional, with things like:

    -customer attrition (telecommunications and mutual fund customers)
    -credit risk
    -business forecasting
    -industrial part quality prediction
    -medical diagnosis (cancer)

    Recreationally, I’ll predict anything for fun. A good example is my Pixel Classification Project, in which individual pixels in images are classified as “foliage” or “not foliage”.

    I have met people or read about the following predictive modeling applications:

    -classifying popcorn kernels as to whether or not they will pop
    -estimating lumber value of plots of land from aerial photographs
    -assessing drug toxicity from images of deformed webs of spiders given said drugs

    Of course, there are myriad examples, both on-line and in more formal settings, of people attempting to predict things like the stock market, horse races, etc., though most of the ones I’ve examined closely were not sufficiently rigorous to be very inspiring.

  2. On a more puerile note, there have been a number of research projects which attempted to, in turn: 1. discern human skin among pixels in images, 2. identify unclothed human bodies in images (a “pornography detector”), and 3. assess human facial beauty. Some examples are:

    Skin Detection, by Jennifer Wortman

    Skin Patch Detection in Real-World Images, by Hannes Kruppa, Martin A. Bauer and Bernt Schiele

    Are You HOT or NOT?, by Jim Hefner and Roddy Lindsay

    Whatever their other attractions, these problems are technically interesting. Human skin, for instance, can be difficult to discern from things like wood grain.

  3. Thanks for your relevant comments, Will. Regarding images, Google is certainly using data mining algorithms for its SafeSearch filtering. Up to now, my favorite is definitely the “popcorn” one!
