Introduction to SAS Data Set Options


Data set options make Base SAS considerably more powerful. This presentation previews helpful uses of this declarative feature.

Published in: Technology, Economy & Finance


  1. Introduction to Data Set Options
      Mark Tabladillo, Ph.D.
      Software Developer, MarkTab Consulting
      Associate Faculty, University of Phoenix
      January 30, 2007
  2. Introduction
      • Data set options enable additional features during data set processing
      • Most SAS data set options can apply to either input or output SAS data sets in DATA steps or procedure (PROC) steps
      • Data set options allow the DATA step to control variables, observations, security, and data set attributes
  3. Outline
      • Define data set options
      • Provide examples in four categories
      • Discuss data set processing rules
  5. Definition
      • Data set options specify actions that apply only to the SAS data set with which they appear.
  6. Syntax
      • Specify a data set option in parentheses after a SAS data set name. To specify several data set options, separate them with spaces.
          (option-1=value-1 <...option-n=value-n>)
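      The syntax above can be sketched with a short step. This is an illustrative example, not part of the original slides; it assumes the SASHELP.CLASS sample data set that ships with SAS is available, and the output name work.classSubset is hypothetical.

      ```sas
      /* Assumption: the SASHELP sample library is installed.          */
      /* Input options (on SET):   read 5 observations, 3 variables.   */
      /* Output options (on DATA): drop one variable and add a label.  */
      data work.classSubset (drop=weight label='First five students');
        set sashelp.class (obs=5 keep=name age weight);
      run;
      ```

      Each parenthesized list belongs only to the data set name it follows, and the options inside it are separated by spaces.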
  7. Outline
      • Define data set options
      • Provide examples in four categories
      • Discuss data set processing rules
  8. Quick Examples
      • Data set options enable us to perform operations such as these:
          – Renaming variables
          – Selecting only the first or last n observations for processing
          – Dropping variables from processing or from the output data set
          – Specifying a password for a data set
          – Adding data set labels
  9. Common Option Categories
      • Variable Control
      • Observation Control
      • Security
      • Data Set Attributes
  10. Example Dataset
          data work.sales (drop=i randomState);
            length state $2 sales 8 randomState 3;
            do i = 1 to 2500;
              randomState = round(rand('gaussian',3,1)+0.5);
              if randomState in (1,2,3,4,5) then do;
                select(randomState);
                  when(1) state='TN';
                  when(2) state='AL';
                  when(3) state='GA';
                  when(4) state='FL';
                  when(5) state='MS';
                end;
                sales = int(rand('gaussian',1000000,500000));
                output work.sales;
              end;
            end;
          run;
  11. List of Common Options: Variable Control
          Data Set Option   Description
          DROP=             Excludes variables from processing or from output SAS data sets
          KEEP=             Specifies variables for processing or for writing to output SAS data sets
          RENAME=           Changes the name of a variable
  12. Examples: Variable Control
          data work.salesReformat;
            set work.sales (drop=sales);
          run;

          data work.salesReformat2;
            set work.sales (keep=state);
          run;

          proc sort data=work.sales (rename=(state=salesState))
                    out=work.salesReformat3 (drop=sales);
            by salesState;
          run;
  13. List of Common Options: Observation Control
          Data Set Option   Description
          FIRSTOBS=         Specifies which observation SAS processes first
          IN=               Creates a variable that indicates whether the data set contributed data to the current observation
          OBS=              Specifies when to stop processing observations
          WHERE=            Selects observations that meet the specified condition
  14. Examples: Observation Control
          * (obs - firstobs) + 1 = number of observations read;
          data work.selectObs1;
            set work.sales (firstobs=1 obs=200);
          run;

          data work.selectObs2;
            set work.sales (firstobs=200 obs=400);
          run;

          proc print data=work.sales (obs=25);
          run;

          proc freq data=work.sales (firstobs=1);
            tables state;
          run;

          proc means data=work.sales (obs=max);
            class state;
            var sales;
          run;
  15. Examples: Observation Control
          data work.combineObs1;
            set work.selectObs1 (in=in1) work.selectObs2 (in=in2);
            length source $12;
            if in1 then source = 'Dataset One';
            else if in2 then source = 'Dataset Two';
          run;

          * When data sets are concatenated, each observation comes from
            exactly one of them, so in1 and in2 are never both true here;
          data work.combineObs2;
            set work.selectObs1 (in=in1) work.selectObs2 (in=in2);
            if in1 and in2 then output;
          run;
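      WHERE= appears in the observation-control list but not in the slide examples. A minimal sketch, assuming the work.sales data set created on slide 10; the output name work.tnLargeSales and the chosen thresholds are hypothetical:

      ```sas
      /* Read only Tennessee observations with sales above one million. */
      data work.tnLargeSales;
        set work.sales (where=(state = 'TN' and sales > 1000000));
      run;

      /* WHERE= can also filter the input data set of a procedure. */
      proc means data=work.sales (where=(state in ('GA', 'FL')));
        class state;
        var sales;
      run;
      ```

      Because the condition is attached to the input data set, only matching observations reach the step.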
  16. List of Common Options: Security
          Data Set Option   Description
          ALTER=            Assigns an alter password to a SAS file and enables access to a password-protected SAS file
          ENCRYPT=          Encrypts SAS data files
          PW=               Assigns a read, write, or alter password to a SAS file and enables access to a password-protected SAS file
          READ=             Assigns a read password to a SAS file and enables access to a read-protected SAS file
          WRITE=            Assigns a write password to a SAS file and enables access to a write-protected SAS file
  17. Examples: Security
          data work.secure1 (alter=NoErrors);
            set work.sales;
          run;

          data work.secure2;
            set work.sales (alter=NoErrors);
          run;

          * Note: A SAS password does not control access to a SAS file
            beyond the SAS system. You should use the operating
            system-supplied utilities and file-system security controls
            in order to control access to SAS files outside of SAS.;

          data work.secure3 (encrypt=yes pw=Scramble);
            set work.sales;
          run;

          proc sort data=work.secure3 (pw=scramble) out=work.secure4;
            by state sales;
          run;
  18. List of Common Options: Data Set Attributes
          Data Set Option   Description
          COMPRESS=         Controls the compression of observations in an output SAS data set
          GENMAX=           Requests generations for a data set and specifies the maximum number of versions
          INDEX=            Defines indexes when a SAS data set is created
          LABEL=            Specifies a label for the SAS data set
  19. Examples: Data Set Attributes
          data work.compress1 (compress=yes label="Attempt at Compression");
            set work.sales;
          run;

          data work.masterSalesDataset (genmax=3);
            set work.sales;
          run;

          data work.masterSalesDataset;
            set work.masterSalesDataset work.selectObs1;
          run;

          data work.masterSalesDataset;
            set work.sales work.selectObs1;
          run;
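      INDEX= is listed among the data set attributes but has no example in the slides. A minimal sketch, assuming the work.sales data set from slide 10; the output names and the composite index name stateSales are hypothetical:

      ```sas
      /* Simple index on a single variable. */
      data work.salesIndexed (index=(state));
        set work.sales;
      run;

      /* Composite index named stateSales, built from two variables. */
      data work.salesIndexed2 (index=(stateSales=(state sales)));
        set work.sales;
      run;

      /* PROC CONTENTS lists the indexes defined on a data set. */
      proc contents data=work.salesIndexed2;
      run;
      ```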
  20. Outline
      • Define data set options
      • Provide examples in four categories
      • Discuss data set processing rules
  21. Input and Output Datasets
      • If a data set option is associated with an input data set, the action applies to the data set that is being read.
      • If the option appears in the DATA statement or after an output data set specification in a PROC step, SAS applies the action to the output data set.
  22. Input and Output Datasets
          data _null_;
          run;

          data;
          run;

          data _null_;
            set _null_;
            if _n_ ge 0 then put 'hello';
          run;

          data _null_;
            if _n_ ge 0 then put 'hello';
            set _null_;
          run;
  23. Order of Execution
      • When data set options appear on both input and output data sets in the same DATA or PROC step, SAS applies data set options to input data sets before it evaluates programming statements or before it applies data set options to output data sets.
      • Likewise, data set options that are specified for the data set being created are applied after programming statements are processed.
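  24. Order of Execution
          * Output option: RENAME= is applied after the statements run,
            so the statements still use the original name, sales;
          data work.salesReformat4 (rename=(sales=monthlySales));
            set work.sales;
            sales = sales/12;
          run;

          * Input option: RENAME= is applied as the data set is read,
            so the statements must use the new name, monthlySales;
          data work.salesReformat5;
            set work.sales (rename=(sales=monthlySales));
            monthlySales = monthlySales/12;
          run;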
  25. Specification Conflicts
      • In some instances, data set options conflict when they are used in the same statement. For example, you cannot specify both the DROP= and KEEP= options for the same variable in the same statement.
  26. Statement Definition
      • A SAS statement is a series of items that may include keywords, SAS names, special characters, and operators.
      • All SAS statements end with a semicolon.
      • A SAS statement either requests SAS to perform an operation or gives information to the system.
  27. Timing Conflicts
      • Timing can also be an issue in some cases. For example, if using KEEP= and RENAME= on a data set specified in the SET statement, KEEP= needs to use the original variable names, because SAS will process KEEP= before the data set is read. The new names specified in RENAME= will apply to the programming statements that follow the SET statement.
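  28. Timing Conflicts
          * In both steps KEEP= uses the original name (sales) and BY uses
            the new name (monthlySales); SAS applies KEEP= before RENAME=
            regardless of the order in which the options are written;
          proc sort data=work.sales (keep=sales state rename=(sales=monthlySales))
                    out=work.salesReformat6;
            by state monthlySales;
          run;

          proc sort data=work.sales (rename=(sales=monthlySales) keep=sales state)
                    out=work.salesReformat7;
            by state monthlySales;
          run;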
  29. Overriding System Options
      • Many system options and data set options share the same name and have the same function.
      • The data set option overrides the system option for the data set in the step in which it appears.
      • System options remain in effect for all DATA and PROC steps in a SAS job or session, unless they are respecified.
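      The override rule can be sketched with OBS=, which exists both as a system option and as a data set option. This is an illustrative example, not from the original slides; it assumes the work.sales data set from slide 10, and the limits 100 and 10 are arbitrary.

      ```sas
      /* System option: applies to every step until it is respecified. */
      options obs=100;

      /* The data set option overrides the system option for this
         data set in this step only: 10 observations are printed.   */
      proc print data=work.sales (obs=10);
      run;

      /* No data set option here, so the system option applies and
         only the first 100 observations are processed.             */
      proc means data=work.sales;
        var sales;
      run;

      /* Respecify the system option to restore the default. */
      options obs=max;
      ```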
  30. Conclusion
      • Data set options enable additional features during DATA step and PROC step processing
      • The SAS System Documentation provides specific details on the syntax
  31. Contact Information
      • Mark Tabladillo, MarkTab Consulting
        http://www.marktab.com/