Introduction to SAS System Where Expressions

The WHERE expression declares filters on SAS System datasets. This presentation illustrates some uses in the DATA step and the SAS Macro Language.

Published in: Technology, Business

  1. Introduction to Where Expressions
     Mark Tabladillo, Ph.D.
     Software Developer, MarkTab Consulting
     Associate Faculty, University of Phoenix
     January 30, 2007
  2. Introduction
     • WHERE expressions allow for processing subsets of observations
     • WHERE expressions can be used in the DATA step or with PROC (procedure) statements
     • This presentation will contain a series of features and examples of the WHERE expression
     • We end with some intensive macros
  3. WHERE-expression Processing
     • Enables us to conditionally select a subset of observations, so that SAS processes only the observations that meet a set of specified conditions.
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999253.htm
  4. Work Sales Dataset
         data work.sales (drop=i randomState);
             length state $2 sales 8 randomState 3;
             do i = 1 to 2500;
                 randomState = round(rand('gaussian',3,1)+0.5);
                 if randomState in (1,2,3,4,5) then do;
                     select(randomState);
                         when(1) state='TN';
                         when(2) state='AL';
                         when(3) state='GA';
                         when(4) state='FL';
                         when(5) state='MS';
                     end;
                     sales = int(rand('gaussian',1000000,500000));
                     output work.sales;
                 end;
             end;
         run;
  5. Data Set Option or Statement
         data work.highSales;
             set work.sales (where=(sales>1500000));
         run;

         data work.highSales;
             set work.sales;
             where sales>1500000;
         run;

         proc means data=work.sales;
             where sales>1500000;
         run;
  6. Data Set Option or Statement
         data work.lowSales;
             set work.sales (where=(sales<0));
         run;

         data work.lowSales;
             set work.sales;
             where sales<0;
         run;

         proc means data=work.sales (where=(sales<0));
         run;
  7. Multiple Comparisons
         data work.highFloridaSales;
             set work.sales (where=(sales>1500000 and state = 'FL'));
         run;

         data work.highFloridaSales;
             set work.sales;
             where sales>1500000 and state = 'FL';
         run;

         proc freq data=work.sales;
             tables state;
             where sales>1500000 and state = 'FL';
         run;
  8. SAS Functions
         data work.highFloridaSales;
             set work.sales (where=(sales>1500000 and substr(state,1,1) = 'F'));
         run;

         data work.highFloridaSales;
             set work.sales;
             where sales>1500000 and substr(state,1,1) = 'F';
         run;

         proc means data=work.sales;
             where sales>1500000 and substr(state,1,1) = 'F';
         run;
  9. Comparison Operators
     Priority    Order of Evaluation    Symbols     Mnemonic Equivalent
     Group I     right to left          **
                                        +
                                        -
                                        ^ ¬ ~       NOT
                                        ><          MIN
                                        <>          MAX
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
  10. Comparison Operators
     Priority    Order of Evaluation    Symbols     Mnemonic Equivalent
     Group II    left to right          *
                                        /
     Group III   left to right          +
                                        -
     Group IV    left to right          || ¦¦ !!
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
  11. Comparison Operators
     Priority    Order of Evaluation    Symbols     Mnemonic Equivalent
     Group V     left to right          <           LT
                                        <=          LE
                                        =           EQ
                                        ¬=          NE
                                        >=          GE
                                        >           GT
                                                    IN
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
  12. Comparison Operators
     Priority    Order of Evaluation    Symbols     Mnemonic Equivalent
     Group VI    left to right          &           AND
     Group VII   left to right          | ¦ !       OR
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm
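A minimal sketch of why the Group VI / Group VII ordering matters in a WHERE expression, assuming the work.sales dataset built earlier in the deck. The output dataset names withoutParentheses and withParentheses are hypothetical and used only for illustration.

```sas
/* AND (Group VI) is evaluated before OR (Group VII), so this reads as */
/* sales<0 | (sales>1500000 & state = 'FL'): every negative sale from  */
/* any state, plus high sales from Florida only.                       */
data work.withoutParentheses;
    set work.sales;
    where sales<0 | sales>1500000 & state = 'FL';
run;

/* Parentheses force the OR to be evaluated first: extreme sales       */
/* (negative or high) from Florida only.                               */
data work.withParentheses;
    set work.sales;
    where (sales<0 | sales>1500000) & state = 'FL';
run;
```

The two steps will generally produce different row counts, because the first also keeps negative sales from states other than Florida.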
  13. Comparison Operators
         data work.extremeNonGeorgia;
             set work.sales (where=((sales<0 | sales>1500000) and state in ('TN','AL','FL','MS')));
         run;

         data work.extremeNonGeorgia;
             set work.sales;
             where (sales<0 | sales>1500000) and state in ('TN','AL','FL','MS');
         run;

         data work.extremeNonGeorgia;
             set work.sales;
             where ^ (0 <= sales <= 1500000) & state ne 'GA';
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  14. “Between And”
         data work.boundedNonGeorgia;
             set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));
         run;

         data work.boundedNonGeorgia;
             set work.sales;
             where (sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS');
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  15. Contains ?
         data work.LStates;
             set work.sales (where=(state contains 'L'));
         run;

         data work.LStates;
             set work.sales;
             where state contains 'L';
         run;

         data work.LStates;
             set work.sales;
             where state ? 'L';
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  16. Is Null/Is Missing
         data work.nullStates;
             set work.sales (where=(state is null));
         run;

         data work.missingStates;
             set work.sales (where=(state is missing));
         run;

         data work.nullSales;
             set work.sales;
             where sales is missing;
         run;

         data work.nonNullSales;
             set work.sales;
             where sales is not missing;
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  17. Like
         data work.likeL;
             set work.sales (where=(state like '%L'));
         run;

         data work.likeL;
             set work.sales (where=(state like "%L"));
         run;

         data work.likeL;
             set work.sales (where=(state like "%%L"));
         run;

         data work.notLikeG;
             set work.sales;
             where state not like 'G_';
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  18. Sounds Like (Soundex)
         data work.soundsLikeFill;
             set work.sales (where=(state =* 'fill'));
         run;

         data work.notSoundsLikeTin;
             set work.sales;
             where state not =* 'tin';
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  19. “Same And”
         data work.boundedNonGeorgia;
             set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));
         run;

         data work.boundedNonGeorgia;
             set work.sales;
             where (sales between 1000000 and 1500000);
             where same and state in ('TN','AL','FL','MS');
         run;

         data work.boundedNonGeorgia;
             set work.sales;
             where same and (sales between 1000000 and 1500000);
             where same and state in ('TN','AL','FL','MS');
         run;
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm
  20. WHERE vs. Subsetting IF
     Task                                                            Method
     Make the selection in a procedure without using a preceding     WHERE expression
       DATA step
     Take advantage of the efficiency available with an indexed      WHERE expression
       data set
     Use one of a group of special operators, such as BETWEEN-AND,   WHERE expression
       CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and
       Sounds-Like
     Base the selection on anything other than a variable value      subsetting IF
       that already exists in a SAS data set. For example, you can
       select a value that is read from raw data, or a value that is
       calculated or assigned during the course of the DATA step
     Make the selection at some point during a DATA step rather      subsetting IF
       than at the beginning
     Execute the selection conditionally                             subsetting IF
     http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a001000521.htm
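To make the WHERE versus subsetting IF distinction concrete, here is a small sketch assuming the work.sales dataset from earlier in the deck. The variable salesWithBonus, the 10% rate, and the output dataset names are hypothetical and appear nowhere in the original slides.

```sas
/* WHERE filters on a variable that already exists in work.sales, */
/* so observations are screened before they enter the DATA step.  */
data work.highSalesWhere;
    set work.sales;
    where sales > 1500000;
run;

/* Subsetting IF can test a value computed during the DATA step.  */
/* salesWithBonus (hypothetical) is not stored in work.sales, so  */
/* a WHERE statement could not reference it here.                 */
data work.highSalesIf;
    set work.sales;
    salesWithBonus = sales * 1.10;
    if salesWithBonus > 1500000;
run;
```

The first step reads only the qualifying observations, while the second reads every observation, computes the new value, and then discards those that fail the test.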
  21. Intensive Dataset Generation
         %macro OurCentury();
         %local year interest;
         %do year = 2001 %to 2100;
             %let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));
             data work.sales&year. (drop=i randomState index=(state sales));
                 length state $2 stateName $20 sales 8 randomState 3;
                 do i = 1 to 2500;
                     randomState = round(56*rand('uniform')+0.5);
                     if randomState <= 56 and randomState not in (3,7,14,43,52) then do;
                         state = fipstate(randomState);
                         stateName = fipnameL(randomState);
                         sales = int(rand('gaussian',1000000*&interest.,500000*&interest.));
                         output work.sales&year.;
                     end;
                 end;
             run;
         %end;
         %mend OurCentury;
         %OurCentury;
  22. Year/State Datasets
         %macro SalesByYearState();
         %local year stateCode state;
         %do year = 2001 %to 2100;
             %do stateCode = 1 %to 56;
                 %if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 &
                     &stateCode. ne 43 & &stateCode. ne 52 %then %do;
                     %let state = %sysfunc(fipstate(&stateCode.));
                     data work.sales&year.&state.;
                         set work.sales&year.;
                         where state = "&state.";
                     run;
                 %end;
             %end;
         %end;
         %mend SalesByYearState;
         %SalesByYearState;
  23. Year/State High Sales Datasets
         %macro HighSalesByYearState();
         %local year stateCode state interest keepDataset;
         %do year = 2001 %to 2100;
             %let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));
             %do stateCode = 1 %to 56;
                 %if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 &
                     &stateCode. ne 43 & &stateCode. ne 52 %then %do;
                     %let state = %sysfunc(fipstate(&stateCode.));
                     %let keepDataset = 0;
                     data work.sales&year.&state.high;
                         set work.sales&year.;
                         where state = "&state." and sales > 2000000*&interest.;
                         call symput('keepDataset',left('1'));
                     run;
                     %if not(&keepDataset.) %then %do;
                         proc datasets lib=work nolist;
                             delete sales&year.&state.high;
                         run;
                         quit;
                     %end;
                 %end;
             %end;
         %end;
         %mend HighSalesByYearState;
         %HighSalesByYearState;
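The keepDataset flag in the macro relies on CALL SYMPUT running only when at least one observation passes the WHERE filter. The sketch below isolates that idea in open code, assuming the work.sales dataset from earlier in the deck; the dataset name floridaHigh and the fixed 2000000 threshold are illustrative assumptions.

```sas
/* Start with the flag off. */
%let keepDataset = 0;

data work.floridaHigh;
    set work.sales;
    where state = 'FL' and sales > 2000000;
    /* Reached only for observations that pass the WHERE filter.   */
    /* If none pass, the step ends at the SET statement and the    */
    /* flag keeps its initial value of 0.                          */
    call symput('keepDataset','1');
run;

/* Write the flag to the log after the step has finished. */
%put keepDataset=&keepDataset.;
```

Because the step still creates an empty dataset when nothing qualifies, the macro follows up with PROC DATASETS to delete it whenever the flag is still 0.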
  24. Conclusion
     • The WHERE expression allows for efficient observation processing in the DATA step and the PROC statements
     • The SAS System Documentation provides specific details on the syntax
     • Using macros increases the processing power of WHERE expressions
  25. Contact Information
     • Mark Tabladillo
       MarkTab Consulting
       http://www.marktab.com/
