SAS/Tableau integration



The integration of SAS and Tableau can have significant business benefits. SAS and Tableau are ‘best of breed’ in their own areas: SAS in the area of Analytics and ‘Analytical Data Preparation’; Tableau in the area of data discovery, visualization and intuitive, interactive dashboarding. Consequently, it makes sense to find ways to combine these technologies to deliver an Integrated Information Framework which leverages the strengths of both solutions.

Published in: Technology


  1. SAS/Tableau Integration: 10 Steps for a Seamless SAS/Tableau Experience

     Patrick Spedding, Strategic Advisor, Business Intelligence & Analytics, See-Change Solutions Ltd
     US: (949) 528-6665 | Australia: (02) 8005-6148 | @spedding | http://www.
     Copyright © 2012 See-Change Solutions
  2. 10 Steps for a Seamless SAS/Tableau Experience

     For the many organizations that have both SAS and Tableau, it makes sense to find ways to integrate these technologies to provide an Integrated Information Framework which leverages the strengths of both solutions. This presentation covers 10 techniques for integrating SAS and Tableau, using SAS as a data source and data preparation environment. SAS routines developed to feed the Tableau solution will also be demonstrated.

     Taking advantage of SAS and Tableau capabilities in this manner can provide a way to ‘rapid prototype’ business reporting requirements, without the costs and delays typically seen when attempting to model emerging business requirements in the Data Warehouse via traditional ETL methods. In addition, this approach suggests a way to protect existing investments in analytical reporting developed by your SAS team, by providing a platform to publish those reports for easy consumption, plus easy re-formatting and ‘slice & dice’ of these reports in the Tableau environment.

     Techniques covered include commonly requested topics such as data and currency formatting, relative date calculations, longitudinal data analysis, integrating SAS Web Stored Processes, and considerations for the use of SAS/ACCESS and SAS ODBC/OLE-DB.
  3. Agenda

     • SAS/Tableau Integration:
        1. Extract formats
        2. Integrating SAS Web Stored Processes
        3. Use of SAS ODBC/OLE-DB
        4. Date Formats
        5. Extract useful elements of date fields/relative date calculations
        6. Currency Formats
        7. Rename Column Names/columns used for levels in hierarchies
        8. Add descriptors for coded fields
        9. Use SAS formats for more complex formats
        10. Merge Disparate Data Sources
     • Q&A
  4. SAS Data as a Source for Tableau: Approaches

     • SAS Dataset -> CSV -> Tableau
     • SAS Report -> CSV -> Tableau
     • SAS Stored Process -> Tableau
     • SAS Data -> Datasource (RDBMS Connection) -> Tableau
     • SAS Data -> Datasource (via ODBC) -> Tableau
     • SAS Dataset -> OLE-DB -> Excel -> Tableau

     Note: CSV is typically around 10X smaller than SAS7BDAT format.
  5. SAS Dataset -> CSV -> Tableau

     This method uses a SAS dataset to feed the Tableau environment. For example, complex business logic can be built into a SAS (Enterprise Guide) process, and value can then be added in Tableau Desktop (e.g. drill-downs, relative time calculations, ratios) before the results are displayed via Tableau Server. This is a good approach both for prototyping BI requirements and for ‘Analytical Data Preparation’.

         /* Export the dataset as a pipe-delimited text file for Tableau */
         proc export data=WORK.COMPARATIVE_PERFORMANCE
             outfile="\\corpdfs\SASOutput\Comparative_Performance.txt"
             dbms=dlm replace;
             delimiter='|';
         run;
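Before pointing Tableau at a delimited export, it can help to confirm that every row has the same number of fields as the header, since a stray delimiter inside a text value would otherwise shift columns. The following is a minimal Python sketch of such a check, not part of the original presentation; the sample data and the function name `check_delimited` are hypothetical.

```python
import csv
import io


def check_delimited(text, delimiter="|"):
    """Return (header, data_row_count); raise ValueError on a ragged row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    count = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        count += 1
    return header, count


# Hypothetical sample standing in for the exported pipe-delimited file.
sample = (
    "insurer|claims|payments\n"
    "Insurer 1|120|45000.50\n"
    "Insurer 2|95|38250.00\n"
)
header, rows = check_delimited(sample)
```

For a real export, the file contents would be read from disk and passed in place of `sample`.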
  6. SAS Report -> CSV -> Tableau

     This method takes the output of a SAS report (e.g. an Enterprise Guide report) and ‘pivots’ the data in such a way as to provide a data input into Tableau.
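The slide does not say which direction the pivot takes. One common reading is that a wide, cross-tab style report (one column per period) is reshaped into one row per identifier and period, which is the layout Tableau generally handles most easily. The Python sketch below illustrates that reading only; the field names and figures are hypothetical.

```python
def unpivot(rows, id_field, column_name="measure", value_name="value"):
    """Turn wide rows (one column per measure) into long rows."""
    long_rows = []
    for row in rows:
        for key, val in row.items():
            if key == id_field:
                continue
            long_rows.append(
                {id_field: row[id_field], column_name: key, value_name: val}
            )
    return long_rows


# Hypothetical report output: claim counts by insurer, one column per year.
report = [
    {"insurer": "Insurer 1", "2010": 120, "2011": 135},
    {"insurer": "Insurer 2", "2010": 95, "2011": 88},
]
long_rows = unpivot(report, "insurer", column_name="year", value_name="claims")
```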
  7. SAS Stored Process -> Tableau

     This method takes a SAS report and enables it as a SAS Web Stored Process, which can then be linked and run within Tableau. Security can be integrated via single sign-on if required. (Note: SAS Integration Technologies is required.)

         http://<SAS Server>:8080/SASStoredProcess/do?_program=<Report Name>&_action=properties

     Note: For SAS Web Stored Processes with prompts, add &_action=properties to the URL.
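Report names often contain spaces or slashes, so the URL needs encoding before it is pasted into a Tableau dashboard. The sketch below builds the URL pattern shown on the slide; the `_program` and `_action` parameters and port 8080 come from the slide, while the server name, report path, and function name are hypothetical.

```python
from urllib.parse import urlencode


def stored_process_url(server, program, has_prompts=False, port=8080):
    """Build a SAS Web Stored Process URL following the slide's pattern."""
    params = {"_program": program}
    if has_prompts:
        # Stored processes with prompts need _action=properties (per the slide).
        params["_action"] = "properties"
    return f"http://{server}:{port}/SASStoredProcess/do?{urlencode(params)}"


# Hypothetical server and report path, for illustration only.
url = stored_process_url(
    "sas.example.com", "/Reports/Claims Summary", has_prompts=True
)
```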
  8. SAS Data -> RDBMS -> Tableau

     This method uses PROC SQL to write SAS results directly to a relational table, for example a table within the Data Warehouse. With the SAS/ACCESS interface, you reference database objects directly in a DATA step or SAS procedure using the SAS LIBNAME statement. PROC SQL can be used to update, delete or insert data in a relational table, for example via Bulk Load.
  9. SAS Data -> ODBC -> Tableau

     This method uses an ODBC connection to allow any SAS dataset to be a source for Tableau. The SAS ODBC driver is used to create an ODBC connection, from which a data source connection can be defined within Tableau to point to the SAS dataset.

     Note: A SAS profile is required to access product downloads (400 MB download).
     Note: Date fields are not properly interpreted unless the data source is a Tableau extract.
  10. SAS Data -> OLE-DB -> Excel -> Tableau

      This method uses an OLE-DB connection to allow any SAS dataset to be a source for Tableau. The SAS OLE-DB provider is used to create an OLE-DB connection, from which a data source connection can be defined within Excel. Tableau can then point to the Excel file to retrieve the SAS data.
  11. Dealing with SAS Dates

      In this example, we have a number of dates in our SAS dataset:

          t1.rsmwrkdt FORMAT=DDMMYYS8. LABEL="Resumed Work Date" AS 'Resumed Work Date'n

          PROC SQL;
              CREATE TABLE WORK.QUERY_FOR_POLICY1 AS
              SELECT t1.trandate FORMAT=DDMMYYS8.,
                     t1.polexpdt FORMAT=DDMMYYS8.,
                     t1.commdate FORMAT=DDMMYYS8.
              FROM WORK.QUERY_FOR_POLICY t1;
          QUIT;
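Formats matter here because a SAS date is stored as a number: the count of days since 1 January 1960. If a date column reaches a downstream tool without its format, it shows up as a bare integer. The Python sketch below shows the conversion in both directions; the function names are hypothetical.

```python
from datetime import date, timedelta

# SAS date values count days from this epoch.
SAS_EPOCH = date(1960, 1, 1)


def sas_date_to_iso(days):
    """Convert a raw SAS date value to an ISO-8601 (YYYY-MM-DD) string."""
    return (SAS_EPOCH + timedelta(days=days)).isoformat()


def iso_to_sas_date(text):
    """Convert an ISO-8601 date string back to a raw SAS date value."""
    return (date.fromisoformat(text) - SAS_EPOCH).days
```

For example, `sas_date_to_iso(0)` gives the epoch itself, and `iso_to_sas_date(sas_date_to_iso(n))` returns `n` for any integer `n` within Python's supported date range.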
  12. Dealing with SAS Dates - Notes

      If you're going to work with a date as a string type, it's better to use the ISO-8601 format of YYYY-MM-DD. This is locale insensitive, so you don't need to worry about DD/MM/YYYY vs. MM/DD/YYYY. Your formula would then read:

          DATE(LEFT([Period],4) + "-" + MID([Period],5,2) + "-" + RIGHT([Period],2))

      This is an improvement, but string logic is much slower than numeric logic, so it would be even better to work with this as numbers. Convert the [Period] field to be a number instead of a string, then use the following:

          DATEADD('day', [YYYYMMDD]%100-1,
          DATEADD('month', INT(([YYYYMMDD]%10000)/100)-1,
          DATEADD('year', INT([YYYYMMDD]/10000)-1900, #1900-01-01#)))

      Note that the performance gains can be remarkable with large data sets. In a test conducted over a 1 billion record sample, the first calculation took over 4 hours to complete, while the second took about a minute.
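The arithmetic in the numeric version can be checked outside Tableau. The Python sketch below mirrors both calculations: integer division and remainder split a YYYYMMDD number into year, month and day, while the string version slices the same digits. It illustrates the logic only and is not Tableau syntax; the function names are hypothetical.

```python
from datetime import date


def yyyymmdd_to_date(n):
    """Numeric route: split an integer such as 20120315 with // and %."""
    year = n // 10000            # 20120315 // 10000 -> 2012
    month = (n % 10000) // 100   # 20120315 % 10000 -> 315, then // 100 -> 3
    day = n % 100                # 20120315 % 100 -> 15
    return date(year, month, day)


def yyyymmdd_str_to_date(s):
    """String route: slice the digits and assemble an ISO-8601 date."""
    return date.fromisoformat(f"{s[:4]}-{s[4:6]}-{s[-2:]}")
```

Both routes return the same date for the same input, which is a useful sanity check before switching a large workbook from the string formula to the numeric one.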
  13. Dealing with SAS Dates - Examples

          /* Day of Injury */
          (CASE
              WHEN(WEEKDAY(t1.injdate)) = 1 THEN 'Sun'
              WHEN(WEEKDAY(t1.injdate)) = 2 THEN 'Mon'
              WHEN(WEEKDAY(t1.injdate)) = 3 THEN 'Tue'
              WHEN(WEEKDAY(t1.injdate)) = 4 THEN 'Wed'
              WHEN(WEEKDAY(t1.injdate)) = 5 THEN 'Thu'
              WHEN(WEEKDAY(t1.injdate)) = 6 THEN 'Fri'
              WHEN(WEEKDAY(t1.injdate)) = 7 THEN 'Sat'
          END) AS 'Day of Injury'n,
  14. Dealing with SAS Dates - Examples

          /* Elapsed Weeks */
          intck('week',t1.csdwrkdt,t1.rsmwrkdt) LABEL="Elapsed Weeks" AS 'Elapsed Weeks'n,

          /* Claim Age (Months) */
          intck('month',t1.clamdate,t1.sysdate) LABEL="Claim Age (Months)" AS 'Claim Age (Months)'n,

          /* Reporting Delay (days) */
          intck('day',t1.injdate,t1.clamdate) AS 'Reporting Delay (days)'n,

      SAS Lag functions can also be very useful (see “Longitudinal Data Techniques”, /library/nesug00/ad1002.pdf).
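One point worth knowing when reading these measures in Tableau: by default, INTCK counts the interval boundaries crossed between two dates, not the number of complete intervals elapsed. A claim lodged on 31 January and reviewed on 1 February therefore has a claim age of one month. The Python sketch below reproduces that default counting rule for days, weeks (which begin on Sunday in SAS) and months; the function names are hypothetical and the sketch does not cover INTCK's other options.

```python
from datetime import date


def intck_day(start, end):
    """Day boundaries crossed: the plain difference in days."""
    return (end - start).days


def intck_week(start, end):
    """Week boundaries crossed, with weeks starting on Sunday.

    Python date ordinals divisible by 7 fall on Sundays, so integer
    division by 7 gives a week index that changes on each Sunday.
    """
    return end.toordinal() // 7 - start.toordinal() // 7


def intck_month(start, end):
    """Month boundaries crossed (first-of-month transitions)."""
    return (end.year - start.year) * 12 + (end.month - start.month)
```

If whole elapsed intervals are wanted instead, the difference in days divided by the interval length is usually the safer measure to publish.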
  15. Currency Formats

          t1.totpayc FORMAT=BEST12. LABEL="Claim Payments to Date" AS 'Claim Payments to Date'n

      Use BEST12. to avoid issues when importing/displaying SAS currency data in Tableau.
  16. Rename Column Names

      Rename cryptic SAS field names:

          data claims ;
              set ;
              keep insurer claim teed deis injdate injdsnat injnatc injresc injdislc
                   clmclosf clmclodt workpc csdwrkdt rsmwrkdt hrswrkwk hrstincc;
          run ;

          t1.emplnam1 LABEL="Employer Name" AS 'Employer Name'n,

      Label hierarchy levels appropriately:

          /* Industry - Level 1*/
          t1.indgroup LABEL="Industry - Level 1"
          /* Industry - Level 2*/
          t1.indsubgrp LABEL="Industry - Level 2"
  17. Add descriptors for coded SAS fields

          /* Insurer Name */
          (CASE
              WHEN t1.insurer = 1 THEN 'Insurer 1'
              WHEN t1.insurer = 2 THEN 'Insurer 2'
              WHEN t1.insurer = 3 THEN 'Insurer 3'
              WHEN t1.insurer = 4 THEN 'Insurer 4'
              ELSE 'Not Known'
          END) AS 'Insurer Name'n

          /* Liability Status */
          (CASE
              WHEN t1.clmliab = 1 THEN 'Notification of work related injury'
              WHEN t1.clmliab = 2 THEN 'Liability accepted'
              WHEN t1.clmliab = 5 THEN 'Liability not yet determined'
              WHEN t1.clmliab = 6 THEN 'Administration error'
              WHEN t1.clmliab = 7 THEN 'Liability denied'
              WHEN t1.clmliab = 8 THEN 'Provisional liability accepted weekly and medical payments'
              WHEN t1.clmliab = 9 THEN 'Reasonable excuse'
              WHEN t1.clmliab = 10 THEN 'Provisional liability discontinued'
              WHEN t1.clmliab = 11 THEN 'Provisional liability accepted medical only, weekly payments not applicable'
              WHEN t1.clmliab = 12 THEN 'No action after notification'
              ELSE 'Not Known'
          END) AS 'Liability Status'n,
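A CASE expression with an ELSE branch silently maps any unexpected code to 'Not Known', so it can be worth listing which codes actually fall through before publishing to Tableau. The Python sketch below expresses the same lookup-with-default idea and reports unmapped codes; the mapping is a shortened excerpt of the liability codes on the slide, and the sample data and function names are hypothetical.

```python
# Excerpt of the liability status codes shown on the slide.
LIABILITY_STATUS = {
    1: "Notification of work related injury",
    2: "Liability accepted",
    5: "Liability not yet determined",
    7: "Liability denied",
}


def describe(code, lookup, default="Not Known"):
    """Return the descriptor for a code, or the default if it is unmapped."""
    return lookup.get(code, default)


def unmapped_codes(codes, lookup):
    """Return the sorted distinct codes that would fall through to the default."""
    return sorted({code for code in codes if code not in lookup})


# Hypothetical column of codes from a claims extract.
codes = [1, 2, 2, 3, 7, 99]
labels = [describe(code, LIABILITY_STATUS) for code in codes]
missing = unmapped_codes(codes, LIABILITY_STATUS)
```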
  18. Use SAS formats for more complex formats

          /* Occupation - Level 1*/
          CASE
              WHEN t1.deis le '30jun2011'd THEN put(t1.occncode,asc21dgn.)
              ELSE put(t1.occncode,anzsco1n.)
          END AS 'Occupation - Level 1'n,

          /* Occupation - Level 2*/
          CASE
              WHEN t1.deis le '30jun2011'd THEN put(t1.occncode,asc22dgn.)
              ELSE put(t1.occncode,anzsco2n.)
          END AS 'Occupation - Level 2'n,

      Often in SAS, a single field will be set up with several formats, relating to different levels of a hierarchy. Connecting to the SAS dataset via SAS ODBC would lose this information. It is therefore advisable to apply each SAS format to create multiple fields in the SAS extract, prior to importing into Tableau.
  19. Merge Disparate Data Sources

      This is particularly useful when rows may not all match across sources. This approach also avoids having to join all sources in real time in one or several outer join SQL statements (as would be the approach in traditional BI tools such as Cognos).
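The slide text does not show the merge itself. As an illustration of the idea only, the Python sketch below combines two sources on a shared key ahead of time and keeps rows that appear in just one of them, filling the missing fields with `None`. It assumes each key appears at most once per source; the field names, sample data and function name are hypothetical.

```python
def full_outer_merge(left, right, key):
    """Merge two lists of dict rows on `key`, keeping unmatched rows."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    # Collect every column name, preserving first-seen order.
    columns = []
    for row in list(left) + list(right):
        for name in row:
            if name not in columns:
                columns.append(name)

    merged = []
    for k in sorted(set(left_by_key) | set(right_by_key)):
        combined = {**left_by_key.get(k, {}), **right_by_key.get(k, {})}
        # Fields absent from one side are filled with None.
        merged.append({name: combined.get(name) for name in columns})
    return merged


# Hypothetical sources: claim 1 has no payment yet, claim 3 has no claim record.
claims = [
    {"claim": 1, "insurer": "Insurer 1"},
    {"claim": 2, "insurer": "Insurer 2"},
]
payments = [
    {"claim": 2, "paid": 500.0},
    {"claim": 3, "paid": 125.0},
]
merged = full_outer_merge(claims, payments, "claim")
```

Doing this once in the preparation step means the dashboard reads a single, already-combined table instead of resolving the joins on every query.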
  20. THANK YOU.

      See-Change Solutions
      @spedding