- 1. Introduction to WHERE Expressions. Mark Tabladillo, Ph.D., Software Developer, MarkTab Consulting; Associate Faculty, University of Phoenix. January 30, 2007
- 2. Introduction • WHERE expressions allow for processing subsets of observations • WHERE expressions can be used in the DATA step or with PROC (procedure) statements • This presentation contains a series of features and examples of the WHERE expression • We end with some intensive macros
- 3. WHERE-expression Processing • A WHERE expression enables us to conditionally select a subset of observations, so that SAS processes only the observations that meet a set of specified conditions. http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999253.htm
- 4. Work.Sales Dataset data work.sales (drop=i randomState); length state $2 sales 8 randomState 3; do i = 1 to 2500; randomState = round(rand('gaussian',3,1)+0.5); if randomState in (1,2,3,4,5) then do; select(randomState); when(1) state='TN'; when(2) state='AL'; when(3) state='GA'; when(4) state='FL'; when(5) state='MS'; end; sales = int(rand('gaussian',1000000,500000)); output work.sales; end; end; run;
- 5. Data Set Option or Statement data work.highSales; set work.sales (where=(sales>1500000)); run; data work.highSales; set work.sales; where sales>1500000; run; proc means data=work.sales; where sales>1500000; run;
- 6. Data Set Option or Statement data work.lowSales; set work.sales (where=(sales<0)); run; data work.lowSales; set work.sales; where sales<0; run; proc means data=work.sales (where=(sales<0)); run;
- 7. Multiple Comparisons data work.highFloridaSales; set work.sales (where=(sales>1500000 and state = 'FL')); run; data work.highFloridaSales; set work.sales; where sales>1500000 and state = 'FL'; run; proc freq data=work.sales; tables state; where sales>1500000 and state = 'FL'; run;
- 8. SAS Functions data work.highFloridaSales; set work.sales (where=(sales>1500000 and substr(state,1,1) = 'F')); run; data work.highFloridaSales; set work.sales; where sales>1500000 and substr(state,1,1) = 'F'; run; proc means data=work.sales; where sales>1500000 and substr(state,1,1) = 'F'; run;
- 9. Comparison Operators • Priority Group I, evaluated right to left. Symbols (mnemonic equivalent): **, + (prefix), - (prefix), ^ ¬ ~ (NOT), >< (MIN), <> (MAX) http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
- 10. Comparison Operators • Priority Group II, evaluated left to right. Symbols: *, / • Priority Group III, evaluated left to right. Symbols: +, - • Priority Group IV, evaluated left to right. Symbols: ||, ¦¦, !! http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
- 11. Comparison Operators • Priority Group V, evaluated left to right. Symbols (mnemonic equivalent): < (LT), <= (LE), = (EQ), ¬= (NE), >= (GE), > (GT), IN http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
- 12. Comparison Operators • Priority Group VI, evaluated left to right. Symbols (mnemonic equivalent): & (AND) • Priority Group VII, evaluated left to right. Symbols (mnemonic equivalent): |, ¦, ! (OR) http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
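Because AND (Group VI) is evaluated before OR (Group VII), a WHERE expression that mixes the two without parentheses can select a different subset than intended. The following is a minimal sketch, assuming the work.sales data set generated on slide 4; the two output data set names are hypothetical and chosen only for illustration.

```sas
/* Without parentheses, AND binds first, so this condition reads as */
/*   sales<0  OR  (sales>1500000 AND state='FL')                    */
data work.unparenthesized;   /* hypothetical data set name */
  set work.sales;
  where sales<0 or sales>1500000 and state='FL';
run;

/* Parentheses force the OR to be evaluated before the AND */
data work.parenthesized;     /* hypothetical data set name */
  set work.sales;
  where (sales<0 or sales>1500000) and state='FL';
run;
```

The first step keeps negative sales from every state plus high Florida sales; the second keeps only Florida observations with extreme sales.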
- 13. Comparison Operators data work.extremeNonGeorgia; set work.sales (where=((sales<0 | sales>1500000) and state in ('TN','AL','FL','MS'))); run; data work.extremeNonGeorgia; set work.sales; where (sales<0 | sales>1500000) and state in ('TN','AL','FL','MS'); run; data work.extremeNonGeorgia; set work.sales; where ^ (0 <= sales <= 1500000) & state ne 'GA'; run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 14. “Between And” data work.boundedNonGeorgia; set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS'))); run; data work.boundedNonGeorgia; set work.sales; where (sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS'); run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 15. Contains ? data work.LStates; set work.sales (where=(state contains 'L')); run; data work.LStates; set work.sales; where state contains 'L'; run; data work.LStates; set work.sales; where state ? 'L'; run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 16. Is Null/Is Missing data work.nullStates; set work.sales (where=(state is null)); run; data work.missingStates; set work.sales (where=(state is missing)); run; data work.nullSales; set work.sales; where sales is missing; run; data work.nonNullSales; set work.sales; where sales is not missing; run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
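In SAS, a missing numeric value compares lower than any number, so a condition such as `sales<0` also selects observations where sales is missing. The generated work.sales data set contains no missing values, but on other data the two tests are often combined. A minimal sketch, assuming work.sales from slide 4; the output data set name is hypothetical.

```sas
/* Keep only genuinely negative sales, excluding missing values */
data work.trueNegativeSales;   /* hypothetical data set name */
  set work.sales;
  where sales is not missing and sales < 0;
run;
```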
- 17. Like data work.likeL; set work.sales (where=(state like '%L')); run; data work.likeL; set work.sales (where=(state like "%L")); run; data work.likeL; set work.sales (where=(state like "%%L")); run; data work.notLikeG; set work.sales; where state not like 'G_'; run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 18. Sounds Like (Soundex) data work.soundsLikeFill; set work.sales (where=(state =* 'fill')); run; data work.notSoundsLikeTin; set work.sales; where state not =* 'tin'; run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 19. “Same And” data work.boundedNonGeorgia; set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS'))); run; data work.boundedNonGeorgia; set work.sales; where (sales between 1000000 and 1500000); where same and state in ('TN','AL','FL','MS'); run; data work.boundedNonGeorgia; set work.sales; where same and (sales between 1000000 and 1500000); where same and state in ('TN','AL','FL','MS'); run; http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
- 20. WHERE vs. Subsetting IF • Task: Make the selection in a procedure without using a preceding DATA step. Method: WHERE expression • Task: Take advantage of the efficiency available with an indexed data set. Method: WHERE expression • Task: Use one of a group of special operators, such as BETWEEN-AND, CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like. Method: WHERE expression • Task: Base the selection on anything other than a variable value that already exists in a SAS data set. For example, you can select a value that is read from raw data, or a value that is calculated or assigned during the course of the DATA step. Method: subsetting IF • Task: Make the selection at some point during a DATA step rather than at the beginning. Method: subsetting IF • Task: Execute the selection conditionally. Method: subsetting IF http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a001000521.htm
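The distinction between a variable that already exists in the input data set and a value calculated during the DATA step can be shown in a single step. This is a minimal sketch, assuming work.sales from slide 4; the variable `commission`, the 5% rate, the 60000 threshold, and the output data set name are hypothetical.

```sas
data work.highCommission;        /* hypothetical data set name */
  set work.sales;
  /* WHERE filters on a variable that already exists in work.sales, */
  /* before the observation is brought into the DATA step           */
  where state = 'FL';
  /* hypothetical calculated variable, created during this step */
  commission = sales * 0.05;
  /* subsetting IF can test a value created in this step */
  if commission > 60000;
run;
```

A WHERE statement that tests `commission` would fail here, because that variable does not exist in work.sales.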
- 21. Intensive Dataset Generation %macro OurCentury(); %local year interest; %do year = 2001 %to 2100; %let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001))); data work.sales&year. (drop=i randomState index=(state sales)); length state $2 stateName $20 sales 8 randomState 3; do i = 1 to 2500; randomState = round(56*rand('uniform')+0.5); if randomState <= 56 and randomState not in (3,7,14,43,52) then do; state = fipstate(randomState); stateName = fipnameL(randomState); sales = int(rand('gaussian',1000000*&interest.,500000*&interest.)); output work.sales&year.; end; end; run; %end; %mend OurCentury; %OurCentury;
- 22. Year/State Datasets %macro SalesByYearState(); %local year stateCode state; %do year = 2001 %to 2100; %do stateCode = 1 %to 56; %if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 & &stateCode. ne 43 & &stateCode. ne 52 %then %do; %let state = %sysfunc(fipstate(&stateCode.)); data work.sales&year.&state.; set work.sales&year.; where state = "&state."; run; %end; %end; %end; %mend SalesByYearState; %SalesByYearState;
- 23. Year/State High Sales Datasets %macro HighSalesByYearState(); %local year stateCode state interest keepDataset; %do year = 2001 %to 2100; %let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001))); %do stateCode = 1 %to 56; %if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 & &stateCode. ne 43 & &stateCode. ne 52 %then %do; %let state = %sysfunc(fipstate(&stateCode.)); %let keepDataset = 0; data work.sales&year.&state.high; set work.sales&year.; where state = "&state." and sales > 2000000*&interest.; call symput('keepDataset',left('1')); run; %if not(&keepDataset.) %then %do; proc datasets lib=work nolist; delete sales&year.&state.high; run; quit; %end; %end; %end; %end; %mend HighSalesByYearState; %HighSalesByYearState;
- 24. Conclusion • The WHERE expression allows for efficient observation processing in the DATA step and the PROC statements • The SAS System Documentation provides specific details on the syntax • Using macros increases the processing power of WHERE expressions
- 25. Contact Information • Mark Tabladillo, MarkTab Consulting http://www.marktab.com/

