Copyright © 2013 Quintiles
An Alternative Way to Import Multiple Excel Files with Multiple Worksheets into SAS
Saurabh Patel
All opinions expressed in this presentation are the author’s personal views,
and may not reflect the opinions or views of Quintiles.
We will see…
• What is it?
• Why is an alternative method required?
• How will the alternative way work?
• Example
• Benefits
• Drawbacks
What is it?
• In the pharmaceutical industry, data transfers often arrive as Microsoft
Excel files (such as XLS, XLSX, or XLSM).
• This is an alternative way to import multiple Excel files with multiple
worksheets into SAS datasets.
• It lets users import Excel data easily without specifying the file
name, worksheet name, variable lengths, or formats.
Workflow: Input → VBScript Processing → SAS Macro Processing → Output
Why is an alternative method required?
Why is one needed when we already have conventional methods such as:
• LIBNAME statement with the Excel engine
• Import Wizard
• PROC IMPORT
• DDE method
Limitations of the Regular Process
Three criteria can be used to choose the "optimum" method:
1. Precision:
• Can we be sure the input data are read correctly? Automated SAS
methods such as PROC IMPORT or the LIBNAME statement may classify an
entire column as numeric, which often causes loss of decimal precision
or of the original formatting of values.
[Figure: Excel input file compared with the SAS dataset imported using the LIBNAME statement]
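The precision concern can be illustrated with a small sketch. It is written in Python rather than SAS and does not reproduce the type-guessing logic of PROC IMPORT or the LIBNAME engine; it only shows, under that stated simplification, what kind of information disappears when a text cell is coerced to a number. The sample values and the function name `as_numeric_text` are hypothetical.

```python
# Illustration only (Python, not SAS): what is lost when text cells are
# coerced to numbers instead of being kept as character values.
raw_cells = ["0.10", "007", "1.50"]  # hypothetical cell contents


def as_numeric_text(cell: str) -> str:
    """Convert a cell to a number and back to text, as an automatic numeric guess might."""
    return repr(float(cell))


for cell in raw_cells:
    # Left: the value as typed in the spreadsheet; right: after numeric coercion.
    print(f"{cell!r:>8} -> numeric -> {as_numeric_text(cell)!r}")
```

Trailing zeros and leading zeros do not survive the round trip, which is why keeping every value as character preserves what was actually entered.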
Limitations of the Regular Process
2. Flexibility:
1. LIBNAME statement with the Excel engine
• Offers more practical advantages than PROC IMPORT and DDE.
• Does not provide the flexibility to define variable formats and lengths.
• Does not work for file formats such as .csv or other delimited files.
2. Import Wizard and PROC IMPORT
• Can be used for both Excel spreadsheets and delimited files.
• Can process only one spreadsheet at a time, which is time
consuming, and does not provide the flexibility to define variable
formats and lengths.
3. DDE method
• Offers more flexibility to define variable formats and lengths.
• Always requires more input parameters, such as file name, worksheet
name, worksheet ranges, and variable lengths.
Limitations of the Regular Process
3. Automation:
• When handling multiple Excel files with multiple worksheets across
numerous data transfers, the most important criterion is saving
time.
• All conventional methods require only a few input
parameters, but they demand close attention to
worksheet names, worksheet ranges, variable formats, and
variable lengths.
• If the programmer first needs to save each Excel spreadsheet as
a CSV or TXT file, or run an import wizard, the process is not automated.
Put simply, everything should be done within a single SAS
program.
How will the alternative way work?
1. VBScript Processing
Workflow: Input → VBScript Processing → Output
• What is VBScript?
• VBScript (Visual Basic Scripting Edition) is an Active Scripting
language developed by Microsoft that is modeled on Visual Basic.
• How is VBScript helpful here?
• VBScript is modeled on Visual Basic, and Excel's own scripting
language (VBA) is also based on Visual Basic.
• It helps in two important ways:
1. It converts multiple Excel files and their worksheets into separate
.CSV or delimited text files, given only a path
name or file name.
2. It unmerges cells in the Excel worksheets and fills the duplicated
value into the remaining cells. It also removes carriage returns
(Alt+Enter) so that the data line up in the proper order.
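The cleanup in step 2 can be sketched as follows. This is a Python illustration of the logic only, not the VBScript from the presentation, and it works on an in-memory grid instead of driving Excel. The function names, the sample sheet, and the `(top, left, bottom, right)` range format are all assumptions made for the example.

```python
import csv
import io


def fill_merged(grid, merged_ranges):
    """Copy the top-left value of each merged range into every cell of that range.

    grid is a list of rows (lists of str). merged_ranges holds hypothetical
    (top, left, bottom, right) coordinates, zero-based and inclusive.
    """
    filled = [row[:] for row in grid]  # work on a copy, leave the input untouched
    for top, left, bottom, right in merged_ranges:
        value = filled[top][left]
        for r in range(top, bottom + 1):
            for c in range(left, right + 1):
                filled[r][c] = value
    return filled


def strip_line_breaks(grid):
    """Replace in-cell line breaks (Alt+Enter in Excel) with a single space."""
    def clean(cell):
        return cell.replace("\r\n", " ").replace("\n", " ").replace("\r", " ")
    return [[clean(cell) for cell in row] for row in grid]


def to_csv_text(grid):
    """Serialize the cleaned grid as CSV text, one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


# Hypothetical worksheet: "101" was a cell merged over rows 1-2 of column 0,
# and one comment contains an in-cell line break.
sheet = [
    ["Site", "Subject", "Comment"],
    ["101", "A-001", "dose\nadjusted"],
    ["", "A-002", "none"],
]
clean = strip_line_breaks(fill_merged(sheet, [(1, 0, 2, 0)]))
print(to_csv_text(clean))
```

Filling is driven by explicit merged ranges rather than by blank cells, so genuinely empty cells are left empty.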
VB Script Code
How will the alternative way work?
2. SAS Macro Processing
Workflow: Input → SAS Macro Processing → Output
2. SAS %CSV Macro Processing
1. Get the list of all CSV file names in the input directory to convert into
SAS datasets.
2. Read the variable names from the first row of each CSV file
and convert them into valid SAS variable names.
3. Using the INFILE statement and those variable names, import all data as
character variables with maximum lengths.
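The three steps above can be sketched as follows. This is a Python illustration of the logic, not the %CSV macro itself, and every function name and the sample file are hypothetical. The naming rule assumed here is the standard SAS one: at most 32 characters, starting with a letter or underscore, and containing only letters, digits, and underscores.

```python
import csv
import re
import tempfile
from pathlib import Path


def to_sas_name(header, used):
    """Turn a CSV header into a valid, unique SAS-style variable name."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", header.strip())
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name  # names may not be empty or start with a digit
    name = name[:32]
    candidate, n = name, 1
    while candidate.upper() in used:  # SAS names are case-insensitive
        suffix = str(n)
        candidate = name[:32 - len(suffix)] + suffix
        n += 1
    used.add(candidate.upper())
    return candidate


def read_as_character(csv_path):
    """Return (names, lengths, rows), keeping every value as text."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return [], [], []
    used = set()
    names = [to_sas_name(h, used) for h in rows[0]]
    data = rows[1:]
    # Maximum observed length per column, with a minimum length of 1.
    lengths = [
        max(1, max((len(r[i]) for r in data if i < len(r)), default=0))
        for i in range(len(names))
    ]
    return names, lengths, data


summary = {}
with tempfile.TemporaryDirectory() as input_dir:
    # Hypothetical input file standing in for the VBScript output.
    Path(input_dir, "labs.csv").write_text(
        "Subject ID,2nd Visit,Result\nA-001,2013-01-05,0.10\n", encoding="utf-8"
    )
    # Step 1: list every CSV file in the input directory.
    for csv_path in sorted(Path(input_dir).glob("*.csv")):
        # Steps 2 and 3: derive names, then read all values as character.
        names, lengths, data = read_as_character(csv_path)
        summary[csv_path.name] = list(zip(names, lengths))
print(summary)
```

In SAS, the resulting names and lengths would feed the variable declarations ahead of the INFILE/INPUT step; here they are only collected in a dictionary.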
Example
Benefits
• The process is dynamic (only the input directory path or file name is supplied),
which saves time.
• Converting multiple Excel files to CSV, unmerging cells, and removing
carriage returns with VBScript is more convenient and user-friendly.
• Reading every variable as character with its maximum length
makes it easy to spot differences between frequent data transfers.
• For QC purposes, it provides better options for checking that the data were read
correctly, with their native formats and values.
Drawbacks
• It creates datasets in which all variables are character.
When numeric variables outnumber character
variables, this is less user-friendly.
• It also imports data from hidden worksheets, so the programmer needs to
define more input parameters to import only selected worksheets.
References
• An Optimal Way to Import Excel Worksheets into PC SAS
http://analytics.ncsu.edu/sesug/2008/SBC-134.pdf
• So, Your Data are in Excel!
http://www2.sas.com/proceedings/sugi31/020-31.pdf
• CSV: A MACRO WHICH WRITES SAS® PROGRAMS TO READ CSV FILES
http://www.lexjansen.com/nesug/nesug03/ps/ps019.pdf
Contact: saurabh.patel@quintiles.com
