SAS/Tableau Integration
10 Steps for a Seamless SAS/Tableau Experience

Patrick Spedding
Strategic Advisor, Business Intelligence & Analytics
See-Change Solutions Ltd
patrick@see-change.com.au
US: (949) 528-6665
Australia: (02) 8005-6148
au.linkedin.com/in/spedding

Copyright @ 2012 See-Change Solutions

@spedding

http://www.see-change.com.au
10 Steps for a Seamless SAS/Tableau Experience
For the many organizations that have both SAS and Tableau, it makes sense to find ways to
integrate these technologies to provide an Integrated Information Framework which
leverages the strengths of both solutions.
This presentation covers 10 techniques for integrating SAS and Tableau, using SAS as a data
source and data preparation environment. SAS routines developed to feed the Tableau
solution will also be demonstrated.
Using SAS and Tableau together in this manner provides a way to ‘rapid
prototype’ business reporting requirements, without the costs and delays typically seen
when attempting to model emerging business requirements in the Data Warehouse via
traditional ETL methods. In addition, this approach protects existing
investments in analytical reporting developed by your SAS team, by providing a platform
to publish those reports for easy consumption, plus easy re-formatting and ‘slice & dice’ of
these reports in the Tableau environment.
Techniques covered include commonly requested topics such as date and currency
formatting, relative date calculations, longitudinal data analysis, integrating SAS Web Stored
Processes, and considerations for the use of SAS/ACCESS and SAS ODBC/OLE-DB.
Agenda
• SAS/Tableau Integration:
1. Extract formats
2. Integrating SAS Web Stored Processes
3. Use of SAS ODBC/OLE-DB
4. Date Formats
5. Extract useful elements of date fields/relative date calculations
6. Currency Formats
7. Rename Column Names/columns used for levels in hierarchies
8. Add descriptors for coded fields
9. Use SAS formats for more complex formats
10. Merge Disparate Data Sources

• Q&A
SAS Data as a Source for Tableau: Approaches
• SAS Dataset -> CSV -> Tableau
• SAS Report -> CSV -> Tableau
• SAS Stored Process -> Tableau
• SAS Data -> Datasource (RDBMS Connection) -> Tableau
• SAS Data -> Datasource (via ODBC) -> Tableau
• SAS Dataset -> OLE-DB -> Excel -> Tableau

Note: CSV is typically around 10X smaller than SAS7BDAT format

SAS Dataset -> CSV -> Tableau
This method uses a SAS dataset to feed the Tableau environment. For
example, complex business logic can be built into a SAS (Enterprise Guide) process,
then value can be added in Tableau Desktop (e.g. drill-downs, relative time
calculations, ratios) before displaying via Tableau Server. This is a good
approach both for prototyping BI requirements and for ‘Analytical Data
Preparation’.

proc export
  data=WORK.COMPARATIVE_PERFORMANCE
  outfile="corpdfsSASOutputComparative_Performance.txt"
  dbms=dlm replace ;
  delimiter = '|' ;
run ;
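
The same export can be tried end to end without access to the original data. The sketch below is only an illustration: the dataset work.sample_extract, its columns and the output file name are hypothetical, and it writes a comma-delimited file into the WORK library's folder rather than to a shared drive.

```sas
/* Hypothetical sample data standing in for the real extract */
data work.sample_extract;
  input insurer claim_count total_paid;
  datalines;
1 120 45000.50
2 85 31000.00
;
run;

/* Resolve the physical folder behind the WORK library */
%let outdir = %sysfunc(pathname(work));

/* Write a comma-delimited file that Tableau can open as a text file */
proc export data=work.sample_extract
  outfile="&outdir./sample_extract.csv"
  dbms=csv replace;
run;
```

Using dbms=csv avoids the separate delimiter statement; dbms=dlm with delimiter='|' is the safer choice when text fields may themselves contain commas.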

SAS Report -> CSV -> Tableau
This method takes the output of a SAS report (e.g. an Enterprise Guide report)
and ‘pivots’ the data so that it can serve as a data input to Tableau.
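
The slide does not show how the pivot is done. One possible way, sketched here with PROC TRANSPOSE, is to turn a report laid out with one column per month into one row per region and month, which is the shape Tableau generally works best with. The dataset and column names (report_wide, region, jan, feb, mar, claims) are hypothetical.

```sas
/* Hypothetical report output: one row per region, one column per month */
data work.report_wide;
  input region $ jan feb mar;
  datalines;
North 10 12 9
South 7 8 11
;
run;

/* BY-group processing requires the data to be sorted by the BY variable */
proc sort data=work.report_wide;
  by region;
run;

/* Pivot the month columns into rows: one row per region and month */
proc transpose data=work.report_wide
               out=work.report_long (rename=(_name_=month col1=claims));
  by region;
  var jan feb mar;
run;
```

The result, work.report_long, has the columns region, month and claims, and can then be exported to a delimited file for Tableau.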

SAS Stored Process -> Tableau
This method takes a SAS report and enables it as a SAS Web Stored Process,
which can then be linked and run within Tableau. Security can be integrated
via single sign-on if required. (Note: SAS Integration Technologies is required.)

http://<SAS Server>:8080/SASStoredProcess/do?_program=<Report Name>&_action=properties

Note: For SAS Web Stored Processes with prompts, add
&_action=properties to the URL.
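
Report names often contain spaces, which need to be percent-encoded before the URL is pasted into Tableau. The sketch below assembles the URL pattern shown above in a DATA step and writes it to the SAS log; the host name and stored process path are hypothetical placeholders, not values from the presentation.

```sas
%let sas_server = sasserver.example.com;          /* hypothetical host */
%let stp_path   = /Shared/Reports/Claims Summary; /* hypothetical stored process path */

data _null_;
  length url $400;
  /* URLENCODE percent-encodes characters such as spaces in the path.      */
  /* The last piece is in single quotes so that &_action is not treated    */
  /* as a macro variable reference.                                        */
  url = cats("http://&sas_server.:8080/SASStoredProcess/do?_program=",
             urlencode("&stp_path"),
             '&_action=properties');
  put url=;
run;
```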
SAS Data -> RDBMS -> Tableau
This method uses PROC SQL to output SAS results directly to
a relational table, for example a table within the Data Warehouse. With the
SAS/ACCESS interface, you reference database objects directly in a DATA step
or SAS procedure using the SAS LIBNAME statement. PROC SQL can then be used
to update, delete or insert data in a relational table, for example via bulk
load.
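
A minimal sketch of the pattern follows. So that it runs without a database, the libref dw here simply points at the WORK folder; with SAS/ACCESS the LIBNAME statement would instead name a database engine and site-specific connection options, which are not shown. The table and column names are hypothetical.

```sas
/* Hypothetical claims data */
data work.claims;
  input insurer paid;
  datalines;
1 100
1 250
2 75
;
run;

/* Stand-in libref: a real setup would use a SAS/ACCESS engine here */
libname dw "%sysfunc(pathname(work))";

/* Write the summarised result to the target library as a table */
proc sql;
  create table dw.claims_by_insurer as
  select insurer,
         sum(paid) as total_paid
  from work.claims
  group by insurer;
quit;
```

Once the libref points at the warehouse, Tableau connects to the resulting table through its normal database connection.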

SAS Data -> ODBC -> Tableau
This method uses an ODBC connection to make any SAS dataset available as a
source for Tableau. Use the SAS ODBC driver to create an ODBC
connection, then define a data source connection within
Tableau that points to the SAS dataset.
http://support.sas.com/demosdownloads/setupcat.jsp?cat=ODBC%20Drivers (400Mb download)

Note: A SAS profile is required to access product downloads.

Note: Date fields are not interpreted properly unless the data source is a Tableau extract.
SAS Data -> OLE-DB -> Excel -> Tableau
This method uses an OLE-DB connection to make any SAS dataset available as a
source for Tableau. Use the SAS OLE-DB provider to create an OLE-DB
connection, from which a data source connection can be defined within Excel. Tableau
can then point to the Excel file to retrieve the SAS data.

Dealing with SAS Dates
In this example, we have a number of dates in our SAS dataset:

t1.rsmwrkdt FORMAT=DDMMYYS8. LABEL="Resumed Work Date" AS 'Resumed Work Date'n

PROC SQL;
CREATE TABLE WORK.QUERY_FOR_POLICY1 AS
SELECT t1.trandate FORMAT=DDMMYYS8.,
       t1.polexpdt FORMAT=DDMMYYS8.,
       t1.commdate FORMAT=DDMMYYS8.
FROM WORK.QUERY_FOR_POLICY t1;
QUIT;

Dealing with SAS Dates - Notes
If you are going to work with a date as a string type, it is better to use the ISO-8601 format of
YYYY-MM-DD. This is locale insensitive, so you don't need to worry about
DD/MM/YYYY vs. MM/DD/YYYY. For a [Period] string holding YYYYMMDD, your formula would then read:
DATE(LEFT([Period],4)
+ "-" + MID([Period],5,2)
+ "-" + RIGHT([Period],2))
This is an improvement, but string logic is much slower than numeric logic, so it would
be even better to work with this as numbers. Convert the [Period] field to a number
instead of a string (called [YYYYMMDD] here), then use the following:

DATEADD('day', [YYYYMMDD]%100-1,
DATEADD('month', INT(([YYYYMMDD]%10000)/100)-1,
DATEADD('year', INT([YYYYMMDD]/10000)-1900, #1900-01-01#)))
Note that the performance gains can be remarkable with large data sets. In a test
conducted over a 1 billion record sample, the first calculation took over 4 hours to
complete, while the second took about a minute.
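
The same year/month/day arithmetic can also be applied on the SAS side, so that Tableau receives a true date and needs no calculation at all. This is only a sketch of that idea: the dataset name and the numeric period column are hypothetical, and the string-based conversion is included purely for comparison.

```sas
data work.period_dates;
  input period;
  /* Numeric route: split YYYYMMDD into month, day and year with INT and MOD */
  date_num = mdy(mod(int(period / 100), 100),  /* month */
                 mod(period, 100),             /* day   */
                 int(period / 10000));         /* year  */
  /* String route: write the number as text, then read it back as a date */
  date_str = input(put(period, 8.), yymmdd8.);
  /* Display both as ISO-8601 style YYYY-MM-DD */
  format date_num date_str yymmdd10.;
  datalines;
20110630
20120229
;
run;
```

Both columns hold the same SAS date values; the YYMMDD10. format displays them in the locale-insensitive form recommended above.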

Dealing with SAS Dates - Examples
/* Day of Injury */
(CASE
  WHEN WEEKDAY(t1.injdate) = 1 THEN 'Sun'
  WHEN WEEKDAY(t1.injdate) = 2 THEN 'Mon'
  WHEN WEEKDAY(t1.injdate) = 3 THEN 'Tue'
  WHEN WEEKDAY(t1.injdate) = 4 THEN 'Wed'
  WHEN WEEKDAY(t1.injdate) = 5 THEN 'Thu'
  WHEN WEEKDAY(t1.injdate) = 6 THEN 'Fri'
  WHEN WEEKDAY(t1.injdate) = 7 THEN 'Sat'
END) AS 'Day of Injury'n,
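
As a shorter alternative to the CASE expression (not taken from the presentation), the built-in DOWNAME format can produce the day name directly; a width of 3 truncates it to the first three letters. The dataset and column names below are hypothetical.

```sas
/* Hypothetical injury dates */
data work.injuries;
  input injdate :date9.;
  format injdate date9.;
  datalines;
01JAN2012
04JUL2012
;
run;

proc sql;
  create table work.injury_days as
  select injdate,
         /* DOWNAME3. writes the first three letters of the weekday name */
         put(injdate, downame3.) as day_of_injury
  from work.injuries;
quit;
```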

Dealing with SAS Dates - Examples
/* Elapsed Weeks */
intck('week',t1.csdwrkdt,t1.rsmwrkdt)
LABEL="Elapsed Weeks" AS 'Elapsed Weeks'n,
/* Claim Age (Months) */
intck('month',t1.clamdate,t1.sysdate)
LABEL="Claim Age (Months)" AS 'Claim Age (Months)'n,
/* Reporting Delay (days) */
intck('day',t1.injdate,t1.clamdate)
AS 'Reporting Delay (days)'n,

SAS Lag functions can also be very useful
(See “Longitudinal Data Techniques”
http://www.ats.ucla.edu/stat/sas/library/nesug00/ad1002.pdf)
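
As an illustration of the LAG function on longitudinal data, the sketch below computes the number of days since the previous payment on the same claim. The data, dataset names and column names are hypothetical. LAG is called on every row and then reset at the start of each claim, because calling it conditionally gives misleading results.

```sas
/* Hypothetical payment history: several rows per claim */
data work.payments;
  input claim $ paydate :date9. amount;
  format paydate date9.;
  datalines;
A 01JAN2012 100
A 15JAN2012 150
B 03FEB2012 200
B 20FEB2012 50
;
run;

proc sort data=work.payments;
  by claim paydate;
run;

data work.payment_gaps;
  set work.payments;
  by claim;
  /* Value of paydate from the previous row */
  prev_paydate = lag(paydate);
  /* The first row of each claim has no previous payment */
  if first.claim then prev_paydate = .;
  if not first.claim then days_since_prev = paydate - prev_paydate;
  format prev_paydate date9.;
run;
```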

Currency Formats
t1.totpayc FORMAT=BEST12. LABEL="Claim Payments to Date" AS 'Claim Payments to Date'n

Use BEST12. to avoid issues when importing/displaying SAS currency data in Tableau.
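
A small runnable sketch of the same idea: a column stored with a currency display format is re-selected with BEST12., so that an exported file carries plain numbers rather than text with currency symbols and thousands separators. The sample dataset and its DOLLAR12.2 format are assumptions made for illustration.

```sas
/* Hypothetical source column carrying a currency display format */
data work.pay;
  totpayc = 1234.5;
  format totpayc dollar12.2;
run;

/* Override the display format so downstream tools see a plain number */
proc sql;
  create table work.pay_plain as
  select totpayc format=best12. label="Claim Payments to Date"
  from work.pay;
quit;
```

Currency formatting can then be applied in Tableau, where it is a display setting on the measure.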

Rename Column Names
Rename cryptic SAS field names:
data claims ;
  set mth.claims ;
  keep insurer claim teed deis injdate injdsnat
       injnatc injresc injdislc clmclosf clmclodt
       workpc csdwrkdt rsmwrkdt hrswrkwk hrstincc ;
run ;

t1.emplnam1 LABEL="Employer Name" AS 'Employer Name'n,

Label hierarchy levels appropriately:
/* Industry - Level 1*/
t1.indgroup LABEL="Industry - Level 1"
/* Industry - Level 2*/
t1.indsubgrp LABEL="Industry - Level 2"
Add descriptors for coded SAS fields
/* Insurer Name */
(CASE
  WHEN t1.insurer = 1 THEN 'Insurer 1'
  WHEN t1.insurer = 2 THEN 'Insurer 2'
  WHEN t1.insurer = 3 THEN 'Insurer 3'
  WHEN t1.insurer = 4 THEN 'Insurer 4'
  ELSE 'Not Known'
END) AS 'Insurer Name'n

/* Liability Status */
(CASE
  WHEN t1.clmliab = 1 THEN 'Notification of work related injury'
  WHEN t1.clmliab = 2 THEN 'Liability accepted'
  WHEN t1.clmliab = 5 THEN 'Liability not yet determined'
  WHEN t1.clmliab = 6 THEN 'Administration error'
  WHEN t1.clmliab = 7 THEN 'Liability denied'
  WHEN t1.clmliab = 8 THEN 'Provisional liability accepted weekly and medical payments'
  WHEN t1.clmliab = 9 THEN 'Reasonable excuse'
  WHEN t1.clmliab = 10 THEN 'Provisional liability discontinued'
  WHEN t1.clmliab = 11 THEN 'Provisional liability accepted medical only, weekly payments not applicable'
  WHEN t1.clmliab = 12 THEN 'No action after notification'
  ELSE 'Not Known'
END) AS 'Liability Status'n,
Use SAS formats for more complex formats
/* Occupation - Level 1*/
CASE WHEN t1.deis le '30jun2011'd
  THEN put(t1.occncode,asc21dgn.)
  ELSE put(t1.occncode,anzsco1n.)
END AS 'Occupation - Level 1'n,
/* Occupation - Level 2*/
CASE WHEN t1.deis le '30jun2011'd
  THEN put(t1.occncode,asc22dgn.)
  ELSE put(t1.occncode,anzsco2n.)
END AS 'Occupation - Level 2'n,

Often in SAS, a single coded field is set up with several formats, each relating
to a different level of a hierarchy.
Connecting to the SAS dataset via SAS ODBC would lose this
information. It is therefore advisable to apply each SAS format to
create multiple fields in the SAS extract, prior to importing into Tableau.
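
To make the idea concrete, here is a self-contained sketch with two user-defined formats applied to the same code to produce two hierarchy-level columns. The format names (occlvl1f, occlvl2f), the code ranges and the labels are invented for illustration and are not the asc21dgn or anzsco formats used above.

```sas
/* Two hypothetical formats describing the same code at different levels */
proc format;
  value occlvl1f 1000-1999 = 'Managers'
                 2000-2999 = 'Professionals'
                 other     = 'Not Known';
  value occlvl2f 1100-1199 = 'Chief Executives'
                 2200-2299 = 'Business Professionals'
                 other     = 'Not Known';
run;

/* Hypothetical coded data */
data work.occupations;
  input occncode;
  datalines;
1111
2211
;
run;

/* Apply each format with PUT to create one text column per level */
proc sql;
  create table work.occupation_levels as
  select occncode,
         put(occncode, occlvl1f.) as occupation_level_1,
         put(occncode, occlvl2f.) as occupation_level_2
  from work.occupations;
quit;
```

Because the descriptions are stored as ordinary character columns, they survive any export route, including ODBC, and can be used directly as hierarchy levels in Tableau.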
Merge Disparate Data Sources
This is particularly useful when rows may not all match across sources.

Also, this approach avoids having to try to join all sources in real time in one or several outer
join SQL statements (as would be the approach in traditional BI tools such as Cognos).
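
The slide does not show the merge itself. One way it might be done in SAS is a DATA step MERGE, which keeps rows from every source whether or not they match and can flag where each row came from. The datasets, key and column names below are hypothetical.

```sas
/* Two hypothetical sources that only partly overlap on claim_id */
data work.claims;
  input claim_id paid;
  datalines;
1 100
2 250
3 75
;
run;

data work.policies;
  input claim_id policy $;
  datalines;
2 P-20
3 P-30
4 P-40
;
run;

proc sort data=work.claims;   by claim_id; run;
proc sort data=work.policies; by claim_id; run;

/* MERGE keeps unmatched rows from both sides, like a full outer join */
data work.merged;
  merge work.claims   (in=in_claims)
        work.policies (in=in_policies);
  by claim_id;
  /* IN= variables are temporary, so copy them to keep them in the output */
  has_claim  = in_claims;
  has_policy = in_policies;
run;
```

The flags has_claim and has_policy let Tableau users filter to matched or unmatched rows without any join logic at query time.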

THANK YOU.

See-Change Solutions
patrick@see-change.com.au

au.linkedin.com/in/spedding
@spedding

www.see-change.com.au
