SAS for Data Mining

May 19, 2009 by

After having used Matlab and R for data mining, I am now using the SAS (Statistical Analysis System) solution. The software was chosen to match our client's internal practice: SAS was already used in the company (a telecommunication company in Switzerland) and there was no reason to change. The first surprise with SAS comes when you install it. Or should I say “them”. Indeed, SAS consists of dozens of software products, and several of them can be used for the same purpose. For example, when it comes to data preprocessing and data aggregation, you could use Data Integration Studio (DI Studio), Enterprise Guide (EG) or SAS Base. One of the first challenges with SAS is to find the right tool for the tasks you have to perform.

DI Studio is a drag and drop interface that allows you to preprocess and query your data. The user interface is rather dated, and programming structures such as loops are quite complicated to build. It is also not straightforward to perform actions that are not offered by the drag and drop interface. For more flexibility, you can use SAS Base. No advanced GUI. Just a programming language based mainly on DATA and PROC steps. You can also write programs that generate their own code through MACRO. Thus SAS Base is a powerful programming language that allows you to perform any task you need. EG is a GUI to SAS. Most preprocessing, data visualization and basic statistics can be done in a drag and drop mode with EG. The interesting aspect of EG is that you can add SAS code from SAS Base as well as from DI Studio. For more advanced data mining functionalities (neural networks, SVM, etc.) you need Enterprise Miner (EM). EM is also a drag and drop tool in which you build your data mining projects. Usually, input data sets in EM will be output data sets from DI Studio, EG or SAS Base.
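To make the DATA step, PROC step and MACRO distinction concrete, here is a minimal SAS Base sketch. The data set name work.calls, its columns and the macro name summarize are made up for illustration; they are assumptions and do not come from the client project.

```sas
/* DATA step: build a small, hypothetical data set of call durations */
data work.calls;
    input customer_id duration;
    datalines;
1 120
1 60
2 300
;
run;

/* PROC step: aggregate the durations per customer */
proc means data=work.calls sum mean;
    class customer_id;
    var duration;
run;

/* MACRO: a reusable template that generates a PROC step */
/* for any data set (ds) and numeric variable (var)       */
%macro summarize(ds, var);
    proc means data=&ds mean;
        var &var;
    run;
%mend summarize;

/* The macro call expands into the PROC MEANS code above */
%summarize(work.calls, duration);
```

The same aggregation could be built graphically in DI Studio or EG; the point of the sketch is only that in SAS Base everything, including the code generation, is written by hand.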

The second challenge with SAS is the installation and configuration. If you work in a company that has its own IT department, then everything is fine. If you have to install the SAS tools yourself on your own PC or laptop, then it’s a bit more complicated. First, SAS is based on a client/server approach. For example, even if you only need DI Studio locally, you will have to install servers, create new users, and set up the management console and several fixes (available on the SAS support website). By the way, the SAS support team is of excellent quality, really. I interact with them one or two times a week. Their answers are fast and they are very professional. It’s really a pleasure to have professionals answering your questions so fast (which is usually not the case with Matlab or R). So, of course, free solutions such as R and Java have their own advantages, but once installed and configured correctly, SAS is an easy-to-learn and powerful tool for data mining.

For more information about SAS:

The official SAS website
The SAS support page where you can submit a problem
The discussion group comp.soft-sys.sas
The SUGI list of technical papers about SAS
The Little SAS Book (I will review it soon)
The SASCOM magazine (free)

If you have other interesting resources or if you would like to give your opinion about SAS, feel free to post a comment.


Comments

10 Comments on SAS for Data Mining

  Ashutosh on Tue, 19th May 2009 2:07 pm

    You said: “By the way, the SAS support team is of excellent quality, really. I interact with them one or two times a week.”

    Wow! One or two times a week? Is there something wrong with the product?

    I agree on the difficulty of installing it. It is indeed difficult. You need to spend hours and hours getting it on your machine.

  Sandro Saitta on Tue, 19th May 2009 2:45 pm

    Thanks for your comment Ashutosh. In my case, several interactions with the support team are due to the installation and configuration of the product. After that, most questions are very specific and concern either errors (yes, there are a lot of possible errors) or functionalities (for example, how can I add some text to an EM schema? This is in fact not yet possible). It is also true that SAS is not the easiest tool to use for data mining, but it’s definitely powerful.

    I also have to admit that sometimes it is easier to ask questions instead of looking for hours in the SAS help or on the Web 😉

  Shane Butler on Wed, 20th May 2009 11:53 am

    You forgot to mention cost 🙂

    If I am not mistaken, to do any data mining you will have to buy the SAS/STAT add-on… this only gets you stats and a few types of regression. For decision trees, neural nets, etc. you need EM, which has a hefty price tag! I guess another alternative might be JMP if you want to keep it in the SAS family… not 100% sure of its features though.

  Tim Manns on Thu, 21st May 2009 8:24 am

    Cost :0

    Also bear in mind that only one man in the world owns SAS software: Mr Jim Goodnight. All the customers rent it for a hefty price *every* year. Don’t pay your license, and you can’t even run existing analyses. Everything stops. If you have a solution you have purchased, then you simply don’t buy the new features and can use the ‘old’ software as long as you want. Of course that often has little importance to us data analysts who use it in large organisations, but it’s worth bearing in mind.

    Most of my work is done using Clementine, but actually it is the data warehouse (Teradata) that stores the data and does the data processing. Clementine turns everything into SQL. When I talk to SAS users, they usually describe a system of data extraction out of a data mart or warehouse and import into SAS. In my mind this is a nasty overhead. I know that more recently SAS has better data connectivity and SAS-to-SQL transforms, but they rarely seem to be used by anyone (or maybe they don’t like to boast as much 🙂)

    I’d love to know what kind of set-up you have regarding data storage and scoring etc.

    Cheers

    Tim

  Steffen on Sat, 23rd May 2009 10:12 am

    “Open Source”?

    A (maybe dumb) question:
    Are you able to see the source code of the algorithms (e.g. decision tree) already delivered with the product?

    kind regards,

    Steffen

  Sandro Saitta on Sun, 24th May 2009 6:06 pm

    @Shane: You’re right! Costs! Well, in fact I was, like Tim, taking the “data analyst” point of view 🙂
    With SAS Base and EM you can do the usual DM tasks. You don’t need SAS/STAT (at least with the SAS 9.1 version). Or was it installed by default with SAS Base?

    @Tim: Good point about the license. Regarding SAS, the good point is precisely that it is SAS. It is so widespread that nearly any data-related software will have an “import SAS data set” option. For example, I can pre-process my data using SAS and then use Insightful Miner – to mine – or Tibco Spotfire – to play with – my data.

    @Steffen: I have never tried, but I don’t think so. If you want to tune the code itself, I guess it’s better to use YALE, WEKA or R 🙂

  HP Laptop on Thu, 25th Jun 2015 7:12 am

    How does SAS compare to SPSS? I’m sorry, I tried searching for it on the SAS website but couldn’t find it anywhere.

  huawei mobiles on Wed, 8th Jul 2015 5:19 am

    I also use SAS for my data mining work. It’s a great tool and I recommend that everybody use it. Thanks for sharing such great info.

