Yet Another Learning Environment

October 31, 2006 by
Filed under: software, weka, yale 

I recently had a short discussion on my blog about data mining languages with Ralf Klinkenberg, a data mining researcher. We briefly discussed using Matlab, WEKA or YALE for data mining. As a researcher, I use Matlab for data mining. The problem with software such as WEKA, to my mind, is how difficult it is to implement your own functions. WEKA, I think, is very good for applying data mining techniques to your data set. However, though I may be wrong, I think that with Matlab it is easier to directly reuse existing functions, modify them, or even create your own functions and combine them with existing ones.

Ralf, as one of the developers of YALE, seems to prefer it to Matlab. Since I’m curious about preferences concerning data mining languages, feel free to reply to this post and tell me which kind of language you use for data mining, as well as your domain (industry/research/teaching). Examples of categories are:

  • Matlab, Python, etc.
  • WEKA, YALE, etc.
  • C, C++, etc.
  • Another category

Comments

7 Comments on Yet Another Learning Environment

  1. George Tziralis on Wed, 1st Nov 2006 8:04 am
    I’m in general a Matlab fan, but I think WEKA is highly appropriate for data mining tasks, and I use it without a second thought on all my relevant projects (I’m not yet familiar enough with YALE).

  2. Sandro Saitta on Wed, 1st Nov 2006 6:20 pm
    I’m happy to see I’m not the only Matlab fan :-)

    By the way, when I write that Matlab is more research oriented, I mean that people doing research in data mining will perhaps prefer Matlab. However, people doing research with data mining, or simply using it, will certainly choose WEKA or YALE.

  3. Anonymous on Thu, 2nd Nov 2006 9:47 pm
    I am not using R myself, but I am surprised that it hasn’t been mentioned.
    http://www.r-project.org/

  4. Anonymous on Thu, 2nd Nov 2006 10:04 pm
    I am a scientific researcher in data mining (at a university) and a practitioner applying data mining as a consultant and software developer (as a freelancer). In both roles, I use YALE, WEKA, and Java to implement my solutions and new methods. YALE is easily extensible. You can write your own operators or plugins in Java. The YALE tutorial, which is available online, describes how to do this.
    Reusability of existing methods, ease of combining existing and new methods, and rapid prototyping are really strong reasons for using YALE.

  5. Sandro Saitta on Sun, 5th Nov 2006 5:59 pm
    The post about Java data mining has some comments about the R language, if needed.

  6. Anonymous on Mon, 6th Nov 2006 8:41 am
    I’m a big supporter of open-source software, and so I’m personally anti-Matlab.

    I prefer to roll my own in C and C++.

    Data mining on large databases is runtime intensive, and I personally can’t tolerate anything slower.

  7. Sandro Saitta on Mon, 6th Nov 2006 8:59 am
    I definitely agree with you about runtime efficiency. I’m working with datasets in the range of 1000 entries. For this, Matlab is usually alright. When it takes too long with Matlab, I write functions in C (and use the interface with Matlab). Now, if you work with huge databases, an alternative to Matlab should perhaps be used.
