Understanding SAS Data Step Processing
Ravi Mandal
Reading Raw Data
• Using the following SAS program:
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32;
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
Ravi M., sasindia@outlook.com
Overview of SAS Data Step
Compile Phase
(Look at Syntax)
Execution Phase
(Read data, Calculate)
Output Phase
(Create Data Set)
Compile Phase
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32;
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
SAS checks the syntax of the program:
• Identifies the type and length of each variable
• Determines whether any variable needs a type conversion
If everything is okay, SAS proceeds to the next step.
If errors are discovered, SAS attempts to interpret what you mean. If SAS can't correct the error, it prints an error message to the log.
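One way to see the compile-time attributes is to declare one yourself. The sketch below is a variation on the slide program, not part of the original deck: the data set name NEW2 and the LENGTH statement are illustrative assumptions. Because LENGTH appears before INPUT, the compiler should record ID as a 4-byte character variable instead of the 8-byte default that list input assigns.

```sas
/* Sketch only: NEW2 and the LENGTH statement are illustrative additions */
data new2;
  length id $ 4;                 /* attribute is fixed during compilation */
  input id $ age tempc;
  tempf = tempc*(9/5) + 32;
  datalines;
0001 24 37.3
0002 35 38.2
;
run;

proc contents data=new2;         /* ID should be listed as Char with length 4 */
run;
```

Comparing this listing with the one for NEW shows that the length comes from the first statement that defines the variable, not from the data values.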
Create Input Buffer
• SAS creates an input buffer, an area of memory that holds one raw data record at a time
• The input buffer contains each record exactly as it is read in
DATALINES;
0001 24 37.3
0002 35 38.2
;
Column:    1  2  3  4  5  6  7  8  9 10 11 12
Character: 0  0  0  1     2  4     3  7  .  3
INPUT BUFFER (first record; columns 5 and 8 hold blanks)
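The contents of the input buffer can be inspected from inside the DATA step. The following is a minimal sketch, assuming the same two data lines as the slide program; it uses DATA _NULL_ so that no data set is created, and the automatic variable _INFILE_ to write the current buffer to the log after each INPUT.

```sas
/* Sketch: echo the input buffer to the log, one line per record read */
data _null_;
  input id $ age tempc;
  put _infile_;                  /* _INFILE_ refers to the current input buffer */
  datalines;
0001 24 37.3
0002 35 38.2
;
run;
```

Each log line should match the raw record, blanks included, which is the same text shown column by column in the picture above.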
Execution Phase
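• The PROGRAM DATA VECTOR (PDV) holds the current value of every variable while one observation is being built. The PDV itself is set up at the end of the compile phase; execution fills it in.
• It contains two automatic variables, _N_ and _ERROR_, and a position for each of the four variables in the DATA step.
• At the start of execution, SAS sets _N_ = 1 and _ERROR_ = 0 (no error so far) and sets the remaining variables to missing.
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0              .    .      .
(ID is a character variable, so its missing value is a blank; numeric missing values display as a period.)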
Buffer to PDV
Reads 1st record (Buffer):
Column:    1  2  3  4  5  6  7  8  9 10 11 12
Character: 0  0  0  1     2  4     3  7  .  3
PDV after the record is read (TEMPF is initially missing):
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   .
If there is an executable statement, SAS processes the code TEMPF=TEMPC*(9/5)+32;
PDV with the calculated value:
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   99.14
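The two PDV snapshots above can be reproduced in the log. This is a sketch under the assumption that tracing with PUT is acceptable in your session; the label text in the PUT statements is invented for illustration. PUT _ALL_ writes every variable in the PDV, including _N_ and _ERROR_.

```sas
/* Sketch: trace the PDV after INPUT and again after the assignment */
data new;
  input id $ age tempc;
  put 'After INPUT:      ' _all_;   /* TEMPF is still missing here */
  tempf = tempc*(9/5) + 32;
  put 'After assignment: ' _all_;   /* TEMPF now holds the calculated value */
  datalines;
0001 24 37.3
0002 35 38.2
;
run;
```

For the first record, the first trace line should show TEMPF as missing and the second should show 99.14, matching the slide.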
Output Phase
• The values in the PDV are written to the
output data set (NEW) as the first
observation:
From PDV:
_N_  _ERROR_  ID    AGE  TEMPC  TEMPF
1    0        0001  24   37.3   99.14
Write data to data set:
ID    AGE  TEMPC  TEMPF
0001  24   37.3   99.14
This is the first record in the output data set named "NEW." Note that _N_ and _ERROR_ are dropped.
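Because _N_ and _ERROR_ are never written to the output data set, an iteration counter has to be copied into an ordinary variable if it is wanted later. The sketch below assumes hypothetical names NEW_N and ITERATION, which do not appear in the original deck.

```sas
/* Sketch: NEW_N and ITERATION are hypothetical names */
data new_n;
  input id $ age tempc;
  tempf = tempc*(9/5) + 32;
  iteration = _n_;               /* ordinary variable, so it is written out */
  datalines;
0001 24 37.3
0002 35 38.2
;
run;

proc print data=new_n;
run;
```

The printed data set should contain ITERATION alongside the four original variables, while _N_ and _ERROR_ remain absent.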
Exceptions to Missing in PDV
• Some variables are not reset to missing in the PDV at the start of each iteration:
  • variables named in a RETAIN statement
  • accumulator variables created in a sum statement
  • data elements in a _TEMPORARY_ array
  • variables created with options in the FILE or INFILE statements
• These exceptions are covered later.
_N_ _ERROR_ ID AGE TEMPC TEMPF
1 0 . . .
Initial values usually
set to missing in PDV
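Two of these exceptions can be seen in a short sketch. The names RUNNING, COUNT, and MAX_TEMPC are assumptions made for illustration. COUNT is created by a sum statement, so it starts at 0 and keeps its value across iterations; MAX_TEMPC is named in RETAIN, so it starts as missing but is not reset afterwards.

```sas
/* Sketch: RUNNING, COUNT and MAX_TEMPC are hypothetical names */
data running;
  input id $ age tempc;
  count + 1;                          /* sum statement: initialized to 0, retained */
  retain max_tempc;                   /* retained: not reset at each iteration */
  max_tempc = max(max_tempc, tempc);  /* MAX ignores the initial missing value */
  datalines;
0001 24 37.3
0002 35 38.2
;
run;

proc print data=running;
run;
```

Without the RETAIN statement, MAX_TEMPC would be reset to missing at the top of every iteration and would simply equal TEMPC in each observation.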
Next data record read
• Once SAS has finished processing the first data record, it returns to the top of the DATA step and repeats the same process for the second record, writing the result to the output data set (named NEW in this case).
• …and so on for all records.
ID AGE TEMPC TEMPF
0001 24 37.3 99.14
0002 35 38.2 100.76
Descriptor Information
• For each SAS data set, SAS creates and maintains descriptor information:
  • data set attributes
  • variable attributes
• This includes the name of the data set, its member type, the date and time that the data set was created, and the number, names, and data types (character or numeric) of the variables.
Data Set Description
proc datasets ;
contents data=new;
run;
Contents output… (abbreviated)
#  Name  Member Type  File Size  Last Modified
1  NEW   DATA         5120       20Nov13:08:59:32
Alternate program
proc contents data= new;
run;
Description output continued…
Data Set Name        WORK.NEW                         Observations          2
Member Type          DATA                             Variables             4
Engine               V9                               Indexes               0
Created              Wed, Nov 20, 2013 08:59:32 AM    Observation Length    32
Last Modified        Wed, Nov 20, 2013 08:59:32 AM    Deleted Observations  0
Protection                                            Compressed            NO
Data Set Type                                         Sorted                NO
Label
Data Representation  WINDOWS_64
Encoding             wlatin1 Western (Windows)
Description output continued…
Alphabetic List of Variables and Attributes
#  Variable  Type  Len
2  AGE       Num   8
1  ID        Char  8
3  TEMPC     Num   8
4  TEMPF     Num   8
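The same variable attributes can also be queried as data rather than read from a report. The following is a sketch that assumes the data set NEW is in the WORK library, as the output above indicates; it reads the DICTIONARY.COLUMNS table through PROC SQL.

```sas
/* Sketch: list descriptor information for WORK.NEW in creation order */
proc sql;
  select name, type, length, varnum
    from dictionary.columns
    where libname = 'WORK' and memname = 'NEW'   /* stored in uppercase */
    order by varnum;
quit;
```

Ordering by VARNUM lists the variables in the order they entered the PDV (ID, AGE, TEMPC, TEMPF) rather than alphabetically.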
Original Program
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32;
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
Program output:
Obs  ID    AGE  TEMPC  TEMPF
1    0001  24   37.3   99.14
2    0002  35   38.2   100.76
Example of Error
DATA NEW;
INPUT ID $ AGE TEMPC;
TEMPF=TEMPC*(9/5)+32
DATALINES;
0001 24 37.3
0002 35 38.2
;
run;
proc print;run;
proc datasets ;
contents data=new;
run;
Missing semicolon: the assignment statement TEMPF=TEMPC*(9/5)+32 is not terminated, so SAS reads DATALINES as part of it.
76 DATA NEW;
77 INPUT ID $ AGE TEMPC;
78 TEMPF=TEMPC*(9/5)+32
79 DATALINES;
---------
22
80 0001 24 37.3
----
180
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -
, /, <, <=, <>, =, >, ><, >=, AND, EQ, GE,
GT, IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, ^=, |, ||, ~=.
ERROR 180-322: Statement is not valid or it is used out of proper order.
81 0002 35 38.2
82 ;
83 run;
ERROR: No DATALINES or INFILE statement.
Error found during compilation
Summary - Compilation Phase
• During compilation, SAS:
  • checks syntax
  • identifies the type and length of each new variable (is a data type conversion needed?)
  • creates the input buffer if there is an INPUT statement that reads raw data (an external file or in-stream DATALINES)
  • creates the Program Data Vector (PDV)
  • creates descriptor information for data sets and variable attributes
• Other statements and options not discussed here: DROP; KEEP; RENAME; RETAIN; WHERE; LABEL; LENGTH; FORMAT; ARRAY; BY; ATTRIB; END=, IN=, FIRST., LAST., POINT=
Summary – Execution Phase
1. The DATA step iterates once for each observation being created.
2. Each time the DATA statement executes, _N_ is incremented by 1.
3. Newly created variables are set to missing in the PDV.
4. SAS reads a data record from a raw data file into the input buffer (there are other possibilities not discussed here).
5. SAS executes any other programming statements for the current record.
6. At the end of the DATA step statements (the step boundary, such as RUN; or DATALINES;), SAS writes an observation to the SAS data set (OUTPUT PHASE).
7. SAS returns to the top of the DATA step (step 2 above).
8. The DATA step terminates when there is no more data.
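The iteration sequence above can be traced in the log. This is a minimal sketch, assuming the same two data lines; the message text is invented for illustration. It prints _N_ at the top of each iteration and again after a record has been read.

```sas
/* Sketch: show when each iteration starts and whether a record was read */
data _null_;
  put 'Top of iteration ' _n_=;
  input id $ age tempc;
  put '  read record for ' id=;
  datalines;
0001 24 37.3
0002 35 38.2
;
run;
```

The log should show the "Top of iteration" message three times but the "read record" message only twice: on the third pass INPUT finds no more data, so the step stops at that statement and writes nothing further.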
End