Data Mining with R

September 17, 2008 by
Filed under: R data mining, R language, S language 

There are many options for implementing your data mining algorithms: C/C++, Java, Matlab, R, etc. Each of these languages has its advantages and drawbacks. In this post, I will consider R, since it is the one I use at my job. Here is a list of reasons why R is a good choice for data mining:

  • It is fast (at least faster than Matlab)
  • It is completely free
  • Several data mining algorithms are available as packages (such as rpart)
  • It contains several libraries for graphics
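To give an idea of what using such a package looks like, here is a minimal sketch that fits a classification tree with rpart. The choice of the built-in iris data set and the variable names (fit, pred, acc) are assumptions made for illustration only; they do not come from any particular study.

```r
# Minimal sketch: a classification tree with the rpart package.
# The iris data set and the variable names are illustrative assumptions.
library(rpart)

# Predict the species from the four flower measurements.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict the class of each training observation.
pred <- predict(fit, iris, type = "class")

# Share of training observations classified correctly
# (an optimistic figure, since no data was held out).
acc <- mean(pred == iris$Species)
print(acc)
```

Calling print(fit) shows the splits as text, and plot(fit) followed by text(fit) draws the tree with base graphics.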

A comprehensive document about using R for data mining has been written by Luis Torgo at the University of Porto, Portugal. The 125-page document contains two case studies, one of which is in finance. It is thus a very good introduction for people using R for data mining.

R also has several drawbacks. Apart from its uninformative error messages, the most important drawbacks are certainly those concerning the design of its basic structures, such as vectors/matrices, incremental loops, etc. You can find details about R design flaws on Radford Neal’s blog.
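Two pitfalls of this kind are often mentioned and can be reproduced in a few lines. This is only a sketch: the examples are chosen for illustration and are not necessarily the exact ones discussed on that blog, and the variable names are hypothetical.

```r
# Pitfall 1: the incremental loop. When n is 0, 1:n is the
# sequence 1, 0, so the loop body runs twice instead of never.
n <- 0
visited_colon <- c()
for (i in 1:n) visited_colon <- c(visited_colon, i)

# seq_len(n) is empty when n is 0, so this loop body never runs.
visited_safe <- c()
for (i in seq_len(n)) visited_safe <- c(visited_safe, i)

# Pitfall 2: vectors and matrices. Selecting one row of a matrix
# silently drops the dimensions and returns a plain vector.
m <- matrix(1:6, nrow = 2)
row_dropped <- m[1, ]             # no longer a matrix
row_kept <- m[1, , drop = FALSE]  # still a 1 x 3 matrix

print(visited_colon)
print(dim(row_dropped))
print(dim(row_kept))
```

Using seq_len() or seq_along() in loops and drop = FALSE when indexing avoids both problems, at the cost of more verbose code.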



8 Comments on Data Mining with R

  1. romain on Thu, 18th Sep 2008 8:47 pm
  2. I used R not so long ago for a data-mining-oriented study and found Rattle.

    Even though I prefer the interface of Orange, Rattle combines the power of R with a simple interface, which helps with diving into the data…

  3. Sandro Saitta on Tue, 23rd Sep 2008 8:22 am
  4. Thanks for sharing Romain. Personally, I write my R code with Tinn-R and execute it directly in the terminal (either Windows or Linux).

  5. Anonymous on Tue, 23rd Sep 2008 1:53 pm
  6. I’m glad to see you’ve found R 🙂


  7. Sandro Saitta on Wed, 24th Sep 2008 10:55 am
  8. After years of research using Matlab and Java, I eventually found R 🙂

  9. Anonymous on Thu, 25th Sep 2008 11:37 pm
  10. Is there any way to use R with another object-oriented programming language? That would be fun…

  11. Shane Butler on Sun, 28th Sep 2008 11:13 pm
  12. Sandro Saitta on Mon, 29th Sep 2008 12:56 pm
  13. Thanks for the link Shane!

  14. Steffen on Thu, 6th Nov 2008 1:03 pm
  15. Thanks for the links, Sandro. I am using R mainly for prototyping, and then prefer more “reliable” environments like RapidMiner to “freeze” my results.

    @objectorientedprogramming: You can code in an object-oriented style in R by using S4 classes. Since R is a scripting language, too many coders tend to define their own standards within their code, which makes the interaction of functions from different packages a real mess. S4 can reduce this mess.

    just a few remarks
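
The S4 suggestion in the last comment can be sketched as follows. The class name Dataset, its slots, and the generic describe are hypothetical names introduced for illustration; only setClass, setGeneric, setMethod and new are actual functions from the methods package.

```r
# Minimal sketch of object-oriented style with S4 classes.
# Dataset, its slots and describe() are hypothetical names.
library(methods)

# A class with typed slots: wrongly typed values are rejected by new().
setClass("Dataset",
         representation(name = "character", values = "numeric"))

# A generic function defines the interface...
setGeneric("describe", function(object, ...) standardGeneric("describe"))

# ...and a method implements it for one class.
setMethod("describe", "Dataset", function(object, ...) {
  c(mean = mean(object@values), sd = sd(object@values))
})

d <- new("Dataset", name = "demo", values = c(1, 2, 3, 4))
s <- describe(d)
print(s)
```

Because the slots are typed and the generic fixes the function signature, code from different authors has to agree on an explicit interface, which is the kind of standard the comment asks for.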

