Data Mining Tools: From SAS to R/Java

August 26, 2009

After a few months using SAS, I find it a powerful and interesting tool to use. It has its own programming language (SAS Base), which allows you to be very specific. On the other hand, you have a strong GUI (Enterprise Guide), which lets you perform several tasks using drag and drop. One of the main issues with SAS is, of course, its price. Not every company can afford a SAS license.
The question then is: when you have SAS code, how can you convert it to a free programming language such as Java or R? After some searching on the web, I found the following solutions:

  • From SAS to Java: At first glance, it seems difficult to compare SAS with Java code, since they are very different (Java is web oriented). However, it seems that DullesOpen has a tool, named Carolina, to convert SAS into Java code. You can test their tool online, and I have contacted them to get a trial version. I will let you know when I have more information about this tool.
  • From SAS to R: The SAS system and the R language (the free version of the S programming language) are quite close since they are both statistics oriented. Of course, the syntax is completely different. Up to now, I haven’t found any tool for automatic SAS to R code conversion. I will comment on this post as soon as I find an interesting tool.

Ajay from DecisionStats told me about MineQuest. It seems that they have a tool to convert SAS code to WPS (World Programming System), a SAS Base clone. If you have suggestions or know a way of converting SAS to any other language, you’re welcome to comment on this post.

17 Comments on Data Mining Tools: From SAS to R/Java

  1. Statguy on Wed, 26th Aug 2009 9:25 pm
    Another good choice would be to use Clojure, which has an R-like domain-specific language called Incanter. You get the expressiveness of a LISP, the statistical environment of R, and the enterprise-readiness of the JVM. The price is also right. Give it a try.

    http://incanter.org/

  2. Sébastien Derivaux on Wed, 26th Aug 2009 11:32 pm
    This is interesting. Are you saying you are more efficient in SAS than in a Java framework or R? If yes, can you explain why, please?

    I disagree that Java is a web language; it’s a general-purpose language, and most of the web stuff isn’t even in the core framework. I think the bigger issue is not so much converting one language to another as converting one framework to another. SAS has many useful high-level functions that have no equivalent in Java (Weka and RapidMiner are really far from SAS). A decision tree can be implemented in many ways, so reproducing the undocumented SAS one should be quite challenging.

    What are your feelings about PMML, which is another solution (for models at least)?

  3. Sandro Saitta on Thu, 27th Aug 2009 11:36 am
    Hi there,

    First, thanks for your comments.

    @Statguy: thanks for the link. I will have a look!

    @Sébastien: I’m not saying that I’m more efficient in SAS, just that SAS may not be the right solution for every client. Thus, one may have to translate a SAS program into another programming language.

    I understand your point about Java. I say it is web oriented because I think most of its success comes from the web. Of course, it is much more than a programming language for applets.

    You’re right, it must be very challenging. Regarding PMML, it would be nice if it were used by every data mining tool/programming language :-)

  4. Haltux on Mon, 31st Aug 2009 10:15 am
    Hello,

    Java is not at all web oriented. Its success partially comes from its use on web servers, but that does not make it “web oriented”.

    In my opinion, the question is: which Java library could be used to replace SAS? Reimplementing every SAS feature in Java does not seem very realistic. However, implementations of DM tools in Java already exist: Weka, RapidMiner…

  5. Sandro Saitta on Tue, 1st Sep 2009 8:26 pm
    @Haltux: Thanks for sharing your opinion.

    I believe that the current success of Java is in large part due to applets (but I may be wrong).

    Regarding the reimplementation, I agree, one should perhaps use an existing tool such as WEKA.

  6. Jay B.Simha on Thu, 10th Sep 2009 3:15 pm
    Sandro is right. Even though SAS is a powerful platform for analytics, not every client can afford it or needs it. Also, many practical analytics problems can be solved with freely available tools (open source or otherwise). It is just a matter of selecting the right tool for the budget and requirements. As for the conversion from SAS to Java, I have not faced such problems, but I prefer to write the application on the Java platform rather than convert. It is just my preference.

  7. Sandro Saitta on Fri, 11th Sep 2009 9:01 am
    @Jay: Thanks for your input. I agree with you that the platform/language should be chosen in advance. But it is often fixed by the client. And, in the consulting area, the client changes very often.

  8. Shane on Mon, 5th Oct 2009 11:26 pm
    I don’t think most people would replace SAS with Java programming… PostgreSQL/MySQL in combination with R or RapidMiner would be OK though.

  9. Sandro Saitta on Tue, 6th Oct 2009 7:14 am
    @Shane: Thanks for your comment. I’m now trying WPS, which can read and interpret SAS code (with a much cheaper license price). I will blog about that soon.

  10. Michael Zeller on Wed, 18th Nov 2009 10:23 pm
    I highly recommend taking a closer look at the Predictive Model Markup Language (PMML) standard, which is supported by most commercial vendors as well as open source tools. Open standards are the way to go if you want to keep your options open and have more flexibility to choose a best-of-breed solution. PMML bridges the gap to production deployment and integration of predictive models, e.g., you can deploy your models on the Amazon EC2 cloud with Zementis: http://www.zementis.com

    For SAS, only Enterprise Miner currently supports PMML export. However, there seems to be a free tool that converts Base SAS to PMML, available for download at http://www.latentview.com/sas-2-pmml.html

    To learn more about PMML, please join the PMML discussion group on LinkedIn with links to various resources, tools that support PMML, and more:
    http://www.linkedin.com/groupRegistration?gid=2328634

  11. John Pico on Fri, 20th Nov 2009 4:26 pm
    Are there any open source stats programs that will interpret my old SAS/Base code? Also, is it possible to convert SAS language to PMML to S language? Or just use PMML in R?

  12. Sandro Saitta on Sat, 21st Nov 2009 5:47 pm
    @Michael: Thanks for proposing the PMML approach. I will have a look at the SAS-to-PMML tool. Thanks for all the links.

    @John: For interpreting SAS/Base code, the only tool I know is WPS. It is not free but much cheaper than a SAS license. From SAS to PMML you can look at the link from Michael. I have never used S, so I can’t help you with that.

  13. Quant on Mon, 21st Nov 2011 12:30 pm
    SAS code is ugly. I have also been looking for software to automatically convert SAS to R, but haven’t found anything. So far I have had to convert the code manually, which is a pain. Please share with us if you find something, thanks.

  14. OPREING on Sat, 7th Jan 2012 8:17 pm

  15. Stacy on Tue, 25th Jun 2013 2:00 am
    OPREING: Thank you! I read through your presentation, but (unless I am overlooking something) it did not seem to address the problem at hand, for 2 reasons: 1) The problem under consideration is converting SAS CODE to R, not SAS DATA (i.e., the code that reads in the data), and 2) It seemed to assume that the user has access to SAS, which many of us cannot afford.

    Thanks for sharing your presentation, though! Let me know if I am not understanding.

    Has anyone else made any progress on this in the last 4 years since this conversation was started? I am fairly new to R but have some SAS code I need to convert, so I am trying to scope out what it will require.

    Thanks!

  16. Mike Petrovic on Fri, 28th Feb 2014 5:56 pm
    Dulles Research (www.DullesResearch.com) has a patented automated SAS to Java conversion solution (Carolina), which has been in production in Fortune 50 companies for over 4 years. The technology has been proven and is continuing to evolve into some really great tools, like Carolina for Hadoop (runs SAS programs in-Hadoop).

    If you’re rewriting SAS programs in Java or C++ (to run SAS programs in-Hadoop, run SAS programs in-database, or integrate SAS programs with existing operational systems), then Carolina can save you a lot of time and money.

    Also, take a look at James Taylor’s (of Decision Management Solutions) recent update on Carolina:
    http://jtonedm.com/2014/02/10/first-look-dulles-research-carolina-update/

    Hope this helps!

  17. priya on Sun, 13th Dec 2015 5:06 am
    Thanks for your valuable information.
