The document discusses the differences between using WHERE and IF statements in SAS, explaining that WHERE statements apply conditions before data enters the program data vector while IF statements apply conditions after, so WHERE is generally used with procedures and data set options and IF is used for other programming tasks like creating new variables or using automatic variables. Examples are provided to illustrate when each statement should be used based on the specific programming task.
Where Vs If Statement
1. WHERE vs. IF Statements: Examples in usage and differences
Sunil Gupta
Sunil@GuptaProgramming.com
www.SASSavvy.com
2. WHERE vs. IF: Examples in how and when to apply
In SAS, there are multiple approaches to subsetting data
You may initially assume there is no difference
SAS has different requirements for WHERE & IF
Shows you how and when to apply each to get correct and reliable results
3. WHERE vs. IF: Examples in how and when to apply
Not focusing on efficiency in approach
Not focusing on conditional execution of statements
Ex. if name = 'Tim' then gender = 'Male';
Not using PROC SQL
Focus is on conditions used for subsetting a data set
Important to understand the similarities and differences between WHERE & IF
4. WHERE versus IF Issues: Time/Location
[Diagram: data flow in the DATA step]
Raw data file -> Input Buffer (for text file) -> Program Data Vector (PDV)
Input data set -> WHERE condition -> Program Data Vector (PDV)
Program Data Vector (PDV) -> IF condition -> Output Buffer -> Output data set
PDV holds: variables for data set, automatic variables, new variables
IF condition: compatible with OBS=, POINT=, FIRSTOBS= data set options
Understand how SAS works!
5. Where vs. If: Time/Location
WHERE conditions:
Applied before data enters the Program Data Vector (PDV)
Variables must exist in the input data set
Process is faster because it limits the observations read
Can take advantage of indexes
Introduced in version 6.12
IF conditions:
Applied after data enters the Program Data Vector
Any variables can be used in the condition
Ex. _N_, FIRST.BY, LAST.BY
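The timing difference can be made visible with the automatic variable _N_. The sketch below is illustrative only: the data set names WHERE_FIRST2 and IF_FIRST2 are hypothetical, and the EXAM data is a compact copy of the sample data set used later in the slides.

```sas
/* Compact copy of the sample EXAM data (nine records) */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

/* WHERE filters before the PDV, so _N_ counts only the Sally rows */
data where_first2;
  set exam;
  where name = 'Sally';
  if _n_ <= 2;   /* keeps the first two Sally rows */
run;

/* IF filters inside the PDV, so _N_ counts every row that is read */
data if_first2;
  set exam;
  if name = 'Sally';
  if _n_ <= 2;   /* Sally rows arrive with _N_ = 4, 5, 6, so none are kept */
run;
```

Under these assumptions WHERE_FIRST2 should end up with two observations and IF_FIRST2 with none, because the same _N_ test sees a different stream of rows in each step.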
6. Where vs. If: Syntax
Examples of WHERE conditions (in Data Step or SAS Procedures):
where name = 'Tim' or name = 'Sally';
where name =: 'T' or name contains 'ally'; * special operators;
Examples of IF conditions:
if name = 'Tim' or name = 'Sally';
if first.name; * temp variable;
if classnum = 2; * new variable;
Keywords are everything!
Functions and expressions must be valid – all numeric or character
Incorrect syntax results in ERRORS!
7. Where vs. If: Seven Examples
Using variables in data set, using SET, MERGE, or UPDATE statement if within the DATA step
Accessing raw data file using INPUT statement
Using automatic variables such as _N_, FIRST.BY, LAST.BY
Using newly created variables in data set
Using special operators such as LIKE or CONTAINS
Directly using any SAS Procedure or as a data set option
When merging data sets – Be careful when subsetting data set
8. Sample Data Set.sas
data exam;
input name $ class $ score ;
cards;
Tim math 9
Tim history 8
Tim science 7
Sally math 10
Sally science 7
Sally history 10
John math 8
John history 8
John science 9
;
run;
9 records: three test scores for each of the three students
9. WHERE vs. IF: Example
Using variables in data set, with the SET, MERGE, or UPDATE statement if within the DATA step?
(A typical simple query.)
– Use WHERE or IF statement
(Expect to get the same results; the WHERE statement has better performance)
11. Example .sas
data student1;
set exam;
where name = 'Tim' or name = 'Sally';
run;
NOTE: There were 6 observations read from the data set WORK.EXAM.
WHERE name in ('Sally', 'Tim');
NOTE: The data set WORK.STUDENT1 has 6 observations and 3 variables.
WHERE condition requires that all variables exist in the data set.
Account for spaces and case sensitivity with trim(), left() and upcase().
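A minimal sketch of the case and spacing advice above. The EXAM_MESSY data set and its values are hypothetical, made up only to show names entered with inconsistent case.

```sas
/* Hypothetical data: the same students, entered with inconsistent case */
data exam_messy;
  input name $ class $ score;
  datalines;
tim math 9
SALLY math 10
John math 8
;
run;

data student1;
  set exam_messy;
  /* UPCASE normalizes case; LEFT would strip any leading blanks */
  where upcase(left(name)) in ('TIM', 'SALLY');
run;
```

Applying the functions to the variable inside the condition leaves the stored values untouched; only the comparison is normalized.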
12. Example .sas
data student1;
set exam;
if name = 'Tim' or name = 'Sally';
run;
NOTE: There were 9 observations read from the data set WORK.EXAM.
NOTE: The data set WORK.STUDENT1 has 6 observations and 3 variables.
IF condition has almost no requirements on variables.
Basic syntax of WHERE and IF is the same!
Keyword variable = 'text';
Keyword variable = number;
13. Example Results
STUDENT1 data set: Tim or Sally
Obs name class score
1 Tim math 9
2 Tim history 8
3 Tim science 7
4 Sally math 10
5 Sally science 7
6 Sally history 10
Same result as expected!
Use WHERE or IF statement when using variables in data set, using SET, MERGE, or UPDATE statement if within the DATA step.
14. Example .sas
data student1;
set exam;
if score = '9'; * incorrect statement: score is a numeric variable;
run;
IF condition automatically converts variable types.
WHERE condition does not automatically convert variable types. It generates an ERROR message because the operator requires compatible variables.
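One way to avoid both the conversion NOTE and the WHERE error is to make the types agree explicitly. This is a sketch that assumes the EXAM sample data; STUDENT1B is a hypothetical name, and the 8. informat is an arbitrary choice for reading a short digit string.

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

/* Compare the numeric variable with a numeric constant */
data student1;
  set exam;
  where score = 9;
run;

/* If the value arrives as text, convert it explicitly with INPUT */
data student1b;
  set exam;
  where score = input('9', 8.);
run;
```

Both steps should select the same two records (the two scores of 9), without relying on automatic conversion.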
15. Example : SAS Log
IF numeric = 'character';
70 data student1;
71 set exam;
72
73 if score = '9';
74
75 run;
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
73:14
NOTE: There were 9 observations read from the data set WORK.EXAM.
NOTE: The data set WORK.STUDENT1 has 2 observations and 3 variables.
16. Example : SAS Log
WHERE numeric = 'character';
70 data student1;
71 set exam;
72
73 where score = '9';
ERROR: Where clause operator requires compatible variables.
74
75 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.STUDENT1 may be incomplete. When this step was stopped there were 0 observations and 3 variables.
17. Example : SAS Log
WHERE Condition
Single purpose statement: filter
Message in SAS Log
SAS Log: WHERE name in ('Sally', 'Tim');
IF Condition
Multiple purpose statement: input, process, output
No message in SAS Log
18. WHERE vs. IF: Example
Accessing raw data file using INPUT statement?
(Need to be more efficient.)
– Must use IF statement
19. Example .sas
data exam;
input name $ class $ score ;
<< enter condition to select Tim or Sally >>
cards;
Tim math 9
Tim history 8
Tim science 7
Sally math 10
Sally science 7
Sally history 10
John math 8
John history 8
John science 9
;
run;
Same data set: three test scores for three students each
20. Example .sas
data exam;
input name $ class $ score ;
if name = 'Tim' or name = 'Sally';
cards;
Tim math 9
Tim history 8
Tim science 7
Sally math 10
Sally science 7
Sally history 10
John math 8
John history 8
John science 9
;
run;
Same Results as Example !
21. Example : SAS Log
42 data exam;
43 input name $ class $ score ;
44 where name = 'Tim' or name = 'Sally';
ERROR: No input data sets available for WHERE
statement.
45 cards;
Semantic ERROR because Syntax is correct.
No data set available for WHERE statement.
Condition applied before input buffer.
IF statement allows for subsetting raw data!
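Since IF is the only option when reading raw data, a common way to keep that step lean is to read just the deciding field first. The sketch below uses the trailing @ line-hold specifier for this; it is a swapped-in technique rather than something shown on the slides, and EXAM_SUBSET is a hypothetical name.

```sas
data exam_subset;
  input name $ @;                      /* read NAME only and hold the line */
  if name = 'Tim' or name = 'Sally';   /* subsetting IF: other lines stop here */
  input class $ score;                 /* read the rest of the held line */
  datalines;
Tim math 9
Tim history 8
Tim science 7
Sally math 10
Sally science 7
Sally history 10
John math 8
John history 8
John science 9
;
run;
```

The remaining fields are only parsed for lines that pass the condition, which matters more for wide raw files than for this three-column sample.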
22. WHERE vs. IF: Example
Using automatic variables such as _N_, FIRST.BY, LAST.BY?
(Do more involved programming such as by group processing using automatic or temporary variables.)
– Must use IF statement
23. Example .sas
proc sort data = exam out=student2;
by name; * sort by name;
run;
data student2;
set student2;
by name;
<< enter condition to select obs with FIRST name >>
run;
24. Example .sas
proc sort data = exam out=student2;
by name; * sort by name;
run;
data student2;
set student2;
by name;
if first.name; * temporary variable;
run;
Forgot the dot? "if first name;" is a syntax error:
1. SAS tries to compare two variables
2. SAS expects an operator
25. Example : SAS Log
Syntax is incorrect!
106 data student2;
107 set student2;
108 by name;
109
NOTE: SCL source line.
110 where first.name;
----------
180
ERROR: Syntax error while parsing WHERE clause.
ERROR 180-322: Statement is not valid or it is used out of proper order.
111 run;
WHERE statement does not recognize temporary variables, only data set variables.
26. Example Results
STUDENT2 data set: If first.name
Obs name class score
1 John math 8
2 Sally math 10
3 Tim math 9
Use IF statement when conditioning on temporary
variables such as FIRST.NAME. These variables are
available in Program Data Vector.
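The same temporary variables support more than picking the first record. As an illustrative sketch (CLASS_COUNT and N_CLASSES are hypothetical names), FIRST.NAME and LAST.NAME can be combined to count records per student and keep one summary row each.

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

proc sort data=exam out=student2;
  by name;
run;

data class_count;
  set student2;
  by name;
  if first.name then n_classes = 0;   /* reset at the start of each BY group */
  n_classes + 1;                      /* sum statement: value is retained */
  if last.name;                       /* keep one row per student */
  keep name n_classes;
run;
```

None of these conditions could be written with WHERE, since FIRST.NAME and LAST.NAME exist only in the Program Data Vector.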
27. WHERE vs. IF: Example
Using newly created variables in data set?
(Multi-task: 1. Create a new variable
2. Subset based on new variable)
– Must use IF statement
28. Example .sas
data student3;
set exam;
* Create CLASSNUM variable;
if class = 'math' then classnum = 1;
else if class = 'science' then classnum = 2;
else if class = 'history' then classnum = 3;
<< enter condition to select classnum = 2 >>
run;
29. Example .sas
data student3;
set exam;
* Create CLASSNUM variable;
if class = 'math' then classnum = 1;
else if class = 'science' then classnum = 2;
else if class = 'history' then classnum = 3;
if classnum = 2 ;
run;
30. Example : SAS Log
156 data student3;
157 set exam;
158
159 * Create CLASSNUM variable;
160 if class = 'math' then classnum = 1;
161 else if class = 'science' then classnum = 2;
162 else if class = 'history' then classnum = 3;
163
164 where classnum = 2;
ERROR: Variable classnum is not on file WORK.EXAM.
165 run;
Semantic ERROR because Syntax is correct.
WHERE statement does not recognize new variables.
31. Example Results
STUDENT3 data set: classnum = 2
Obs name class score classnum
1 Tim science 7 2
2 Sally science 7 2
3 John science 9 2
Use IF statement when conditioning on new variables such as CLASSNUM. These variables are available in the Program Data Vector, not in the input data set.
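A related option, not covered on the slides, is the WHERE= data set option on the output data set. It is applied as observations are written, so it can see variables created in the step. A minimal sketch, assuming the EXAM sample data:

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

/* WHERE= on the OUTPUT data set filters rows as they are written */
data student3 (where=(classnum = 2));
  set exam;
  if class = 'math' then classnum = 1;
  else if class = 'science' then classnum = 2;
  else if class = 'history' then classnum = 3;
run;
```

This should give the same three science records as the subsetting IF; the WHERE statement itself still cannot reference CLASSNUM, because it applies to the input data set.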
32. WHERE vs. IF: Example
Using special operators such as LIKE or CONTAINS?
(Request to query data set based on incomplete information – example spelling of a name.)
– Must use WHERE statement*
* Although not an actual special operator, the Colon Modifier (:) and the IN () operator can be used with IF conditions
33. Example .sas
data student4;
set exam;
<< enter condition to select names that start with ‘T’
or names that contain ‘ally’ >>
run;
34. Example .sas
Special operators to the rescue
data student4;
set exam;
where name =: 'T' or name contains 'ally' ;
run;
Character Data Issues:
- Case sensitivity (use with upcase())
- Consider embedded blanks
35. Example : SAS Log
Syntax is incorrect!
196 data student4;
197 set exam;
198
NOTE: SCL source line.
199 if name =: 'T' or name contains 'ally';
-------- ------
388 200
ERROR 388-185: Expecting an arithmetic operator.
ERROR 200-322: The symbol is not recognized and will be ignored.
200
201 run;
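When the condition has to live in an IF statement, for example because the step also tests a new or temporary variable, a function call can stand in for CONTAINS. This sketch uses the INDEX function as that substitute; it is an alternative technique, not the slide's own code.

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

data student4;
  set exam;
  /* the colon modifier works in IF; INDEX returns 0 when the text is absent */
  if name =: 'T' or index(name, 'ally') > 0;
run;
```

For this data the step should keep the Tim and Sally records, matching the WHERE version with CONTAINS.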
36. WHERE: Selected Special Operators
Operator                 Description
COLON MODIFIER (:)*      Compares starting text of the variable with the shorter text
CONTAINS or ?            Searches for a specific text
BETWEEN … AND            Includes values defined in the range of numeric variables
IS NULL or IS MISSING    Checks for all missing values
* Colon Modifier can also be used with the IF statement.
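The operators in the table can be tried directly in a procedure. The following is a small sketch against the EXAM sample data; the LIKE pattern is an added illustration, since LIKE is named among the examples but not listed in the table.

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

proc print data=exam;
  where score between 8 and 9;   /* inclusive numeric range */
run;

proc print data=exam;
  where class ? 'hist';          /* ? is shorthand for CONTAINS */
run;

proc print data=exam;
  where name like 'S%';          /* % matches any run of characters */
run;

proc print data=exam;
  where score is not missing;    /* negated form of IS MISSING */
run;
```

Each step writes the applied WHERE condition to the SAS log, which is a convenient check that the operator was parsed as intended.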
37. WHERE vs. IF: Example
Directly using any SAS Procedure or as a data set option?
(Why create separate subset data sets?)
– Must use WHERE statement
Note that multiple WHERE statements within SAS Procedures or DATA Steps are not cumulative unless the ALSO or SAME keyword is added. By default, the most recent WHERE statement replaces previous WHERE statements.
38. Example .sas
proc print data = exam;
<< enter condition to select Tim or Sally, case-insensitive >>
run;
39. Example .sas
proc print data = exam;
where upcase(name) = 'TIM' or upcase(name) = 'SALLY';
run;
Same condition, same syntax as in the Data Step!
proc print data = exam (where =(name = 'Tim' or name = 'Sally'));
run;
Data set option: note the syntax difference
40. Example : SAS Log
209 proc print data = exam;
NOTE: SCL source line.
210 if name = 'Tim' or name = 'Sally';
--
180
ERROR 180-322: Statement is not valid or it is used
out of proper order.
211 run;
IF statement syntax is correct.
IF statement is not valid in a SAS Procedure.
41. Example .sas
proc print data = exam;
where name = 'Tim';
where class = 'math';
run;
Which condition is applied using multiple WHERE statements?
42. Example : SAS Log
180 proc print data = exam;
181
182 where name = 'Tim';
183 where class = 'math';
NOTE: Where clause has been replaced.
184 run;
Most recent WHERE condition is applied unless the ALSO or SAME keyword is added:
where also class = 'math';
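A short sketch of the two augmenting forms mentioned above, assuming the EXAM sample data. In both steps the second statement is combined with the first instead of replacing it.

```sas
/* Compact copy of the sample EXAM data */
data exam;
  input name $ class $ score @@;
  datalines;
Tim math 9  Tim history 8  Tim science 7
Sally math 10  Sally science 7  Sally history 10
John math 8  John history 8  John science 9
;
run;

/* WHERE ALSO adds the new condition with AND */
proc print data=exam;
  where name = 'Tim';
  where also class = 'math';
run;

/* WHERE SAME AND is the equivalent longer form */
proc print data=exam;
  where name = 'Tim';
  where same and class = 'math';
run;
```

Both steps should print only the Tim/math record. Writing the combined condition in a single WHERE statement with AND is the plainer alternative when both conditions are known up front.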
43. Example : SAS Log
17 proc print data = exam (where =(name = 'Tim'));
18 where class = 'math';
NOTE: Where clause has been augmented.
19 run;
Both the WHERE data set option and the WHERE statement are accepted and applied using the 'AND' operator.
44. Example .sas
data student1;
set exam;
if name = 'Tim';
if class = 'math';
run;
Which condition is applied using multiple IF statements?
Both IF statements, for cumulative effect: selects the Tim 'AND' math record
45. Example .sas
data student1;
set exam;
if name = 'Tim';
where class = 'math';
run;
Which condition is applied using IF and WHERE statements?
Both IF and WHERE, for cumulative effect: selects the Tim 'AND' math record
Not a recommended approach.
46. Example .sas
data student1;
set exam;
if name = 'Tim';
if name = 'Sally';
run;
Which condition is applied using conflicting IF statements?
'AND', not 'OR', is the operator.
Results in 0 observations.
47. Example Results
STUDENT1 data set: Tim and math
Obs name class score
1 Tim math 9
Only the one record that meets both conditions is selected, provided the conditions do not conflict.
48. WHERE vs. IF: Example *
When merging data sets?
(One Data Step – merging and subsetting)
**** This example is a test case only. ****
* Be careful when subsetting data set: results may be different depending on the data sets being merged.
No ERRORs are generated!
In general, use the IF condition to subset the data set after merging the data sets.
49. Example Data Set.sas
Test Purpose Only: Not recommended code!
data school;
input name $ class $ score ;
cards;
A math 10
B history 10
C science 10
;
run;
data school_data;
input name $ class $ score ;
cards;
A math 10
B history 8
C science 7
;
run;
SCHOOL: all scores are 10. SCHOOL_DATA: only one score is 10.
Same data sets except for B and C scores in school_data data set.
50. Example Merge with X.sas
data school_where;
merge school school_data;
by name;
* subsets BEFORE merging;
where score = 10;
run;
data school_if;
merge school school_data;
by name;
* subsets AFTER merging;
if score = 10;
run;
Time and location of subsetting is different.
• WHERE statement subsets the school and school_data data sets before they are merged. Note that the score variable exists in both data sets.
• IF statement subsets only the merged observations written to the school_if data set.
51. Example : WHERE SAS Log
26 data school_where;
27 merge school school_data;
28 by name;
29
30 * subsets BEFORE merging;
31 where score = 10;
32 run;
NOTE: There were 3 observations read from the data set WORK.SCHOOL.
WHERE score=10;
NOTE: There were 1 observations read from the data set WORK.SCHOOL_DATA.
WHERE score=10;
NOTE: The data set WORK.SCHOOL_WHERE has 3 observations and 3 variables.
WHERE subsets each data set before merging. Results in 3 obs.
52. Example : IF SAS Log
39 data school_if;
40 merge school school_data;
41 by name;
42
43 * subsets AFTER merging;
44 if score = 10;
45 run;
NOTE: There were 3 observations read from the data set WORK.SCHOOL.
NOTE: There were 3 observations read from the data set WORK.SCHOOL_DATA.
NOTE: The data set WORK.SCHOOL_IF has 1 observations and 3 variables.
Results in 1 obs.
Consider a condition on a temporary variable created with the IN= data set option:
merge school (in=a) school_data;
if a;
53. Example Results
SCHOOL_WHERE data set
obs name class score
1 A math 10
2 B history 10
3 C science 10
SCHOOL_IF data set (correct data set)
obs name class score
1 A math 10
In both data sets, scores are all 10.
Results are different!
Important to be aware of this issue.
In general, you will want to use the IF statement to subset after merging data sets.
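The IN= suggestion from the IF log slide can be combined with the score condition so that both the source of each row and its merged value are checked after the merge. A minimal sketch, in which SCHOOL_BOTH, IN_SCHOOL and IN_DATA are hypothetical names:

```sas
data school;
  input name $ class $ score;
  datalines;
A math 10
B history 10
C science 10
;
run;

data school_data;
  input name $ class $ score;
  datalines;
A math 10
B history 8
C science 7
;
run;

data school_both;
  merge school (in=in_school) school_data (in=in_data);
  by name;
  /* IN= flags are temporary variables, so only IF can test them */
  if in_school and in_data and score = 10;
run;
```

With this data the step should keep only the A/math record, matching SCHOOL_IF, while making explicit that a row must come from both data sets.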
54. Example Without Condition
Helpful to view the intermediate data set before subsetting when merge order is school school_data.
obs name class score
1 A math 10
2 B history 8
3 C science 7
Only the first record should be selected!
BE CAREFUL when merging, updating or interleaving and subsetting data sets in the same Data Step!
55. WHERE vs. IF: Pop Quiz
___________ name = 'Tim'; (Answer: Where or If)
___________ first.name; (Answer: If)
___________ name contains 'Tim'; (Answer: Where)
56. Where vs. If: Seven Examples
Using variables in data set, using SET, MERGE, or UPDATE statement if within the DATA step: Where or If
Accessing raw data file using INPUT statement: If
Using automatic variables such as _N_, FIRST.BY, LAST.BY: If
Using newly created variables in data set: If
Using special operators such as LIKE or CONTAINS: Where
Directly using any SAS Procedure or as a data set option: Where
When merging data sets – Be careful when subsetting data set: Where or If
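57. Summary of Key Differences Between WHERE and IF
Selected Conditions to Subset Data Sets
Example                                        WHERE          IF             Use
Data set variables (SET, MERGE, UPDATE)        X              X              No Diff
Raw data file using INPUT statement                           X              If
Automatic variables (_N_, FIRST.BY, LAST.BY)                  X              If
Newly created variables                                       X              If
Special operators (LIKE, CONTAINS)             X                             Where
SAS Procedure or data set option               X                             Where
Merging data sets *                            Before Merge   After Merge
Please see paper for complete list of differences and notes.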
58. Where vs. If: Read the Fine Print
See Summary of Key Differences table in paper (14 cases)
Rules and assumptions are applied
Be aware of the fine details
Not possible to present all differences in limited time
** OBS = data set option is compatible with the WHERE statement in SAS version 8.1 and higher.
*** The Colon Modifier (:) works with the IF statement to compare shorter text with longer text.
59. WHERE vs. IF: Guidelines
Must use WHERE condition for these cases:
For special operators (CONTAINS, etc.)
When specifying indexes or for efficiency
When using SAS Procedures or data set options
Use IF condition for all other cases:
For automatic or newly created variables
For FIRST.BY or LAST.BY temporary variables
When reading data using an INPUT statement
60. Compare & Conquer
While vs. Until Do Loop
Data Step vs. Proc SQL
Proc Report vs. Proc Tabulate
Proc Summary vs. Proc Means
Only by analyzing the similarities and the differences between programming approaches can one truly understand and apply the concepts.
61. WHERE vs. IF and other SAS Tips
SAS Demo Area: SAS Publications
This and other useful SAS Tips
SAS Technology Report
SAS’s FAQ
SAS V9 Certification Exam
(//support/sas/com/certification)
Sunil Gupta
Gupta Programming
63. Example : IF SAS Log
SCHOOL_DATA data set is before SCHOOL data set
72 options msglevel = i;
73 data school_if;
74 merge school_data school;
75 by name;
76 * subsets AFTER merging;
77 if score = 10; run;
INFO: The variable class on data set WORK.SCHOOL_DATA will be overwritten by data set WORK.SCHOOL.
INFO: The variable score on data set WORK.SCHOOL_DATA will be overwritten by data set WORK.SCHOOL.
NOTE: There were 3 observations read from the data set WORK.SCHOOL_DATA.
NOTE: There were 3 observations read from the data set WORK.SCHOOL.
NOTE: The data set WORK.SCHOOL_IF has 3 observations and 3 variables.
64. Example Data Sets Switched
Different result when SCHOOL_DATA data set is before SCHOOL data set when using IF statement.
obs name class score
1 A math 10
2 B history 10
3 C science 10
Same three records are selected as that of SCHOOL_WHERE data set!
BE CAREFUL – values of common variables from last data set are kept!
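One way to stop the result from depending on merge order is to rename the common variable so that neither score overwrites the other. This is a sketch of that idea using the RENAME= data set option; SCHOOL_COMPARE, SCORE_SCHOOL and SCORE_DATA are hypothetical names.

```sas
data school;
  input name $ class $ score;
  datalines;
A math 10
B history 10
C science 10
;
run;

data school_data;
  input name $ class $ score;
  datalines;
A math 10
B history 8
C science 7
;
run;

data school_compare;
  /* each SCORE gets its own name, so both values survive the merge */
  merge school      (rename=(score=score_school))
        school_data (rename=(score=score_data));
  by name;
  if score_school = 10 and score_data = 10;
run;
```

Swapping the order of the two data sets in the MERGE statement should leave this result unchanged (only record A qualifies), because the condition no longer relies on which value was read last.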
Editor's Notes
Where vs. If Statements Copyright (C) 2006, Sunil Gupta How many people think there is no difference between WHERE and IF? How many think there is a difference? FAQ, subtle, subset data set, test your skills. Anyone planning to take the SAS certification exam? – This topic is on the test. Poll: 0 to < 2 yrs, 2 - < 5 yrs, > 5 yrs. Show book, confirm users have summary sheet. Open powerpoint file and folder. 50 minute presentation, 61 slides.
I like to think of the analogy of comparing apples to oranges. While they are both fruits, they taste very different. In the back of your mind, you probably thought there were some differences but did not pay too much attention to the details. This presentation will focus on the similarities and differences so that you will know how and when to apply these conditions. In many cases, SAS will give you an error message if you are breaking the rules, but in some special cases, you are on your own.
What this paper is not: typically when you see a paper on this topic, it deals mostly with efficiency in the approach. Now you know WHY you see the code you see.
WHERE conditions are non-executable and IF conditions are executable. Make a note about no real input buffer when reading an input data set. Input data set variables are always retained, as are the automatic variables (_n_, first.var, in=) in the PDV. Automatic variables (in=a) and temporary variables are similar – need to confirm no real difference.
These are seven typical examples in subsetting data sets. See if you can identify which approach to take.
In my HOW, I would ask the students to enter the current SAS syntax.
¼ of presentation – 12 minutes
Your boss has become more demanding by asking you to become more efficient. Fortunately, you are ready to take on his challenge.
What about applying conditions on variables that are not in the data set – temporary variables used for data manipulation within Data Steps? Especially useful for debugging purposes.
HALF POINT – 25 minutes. As SAS programmers, we can do it all, right? We can multi-task – create a variable and subset on the same new variable.
Of course this does not happen to you – your customers tell you exactly what they want.
Looks strange to the experienced SAS programmer.
Everyone uses SAS Procedures with subset conditions to process the data set before applying the SAS procedure. This greatly saves the creation of an intermediate data set from a DATA Step.
This looks strange.
¾ of presentation – 37 minutes. SAS programmers have inquisitive minds.
I like to stick with one standard – all IF or all WHERE.
I saved the most challenging example towards the end.
Please try this back in the office with caution.
What happens when the score variable does not exist in both data sets? You will get an error message when using the WHERE condition.
Based on the keyword, you should be able to get some clue as to the correct method – Where vs. If. Any data set variable – where or if. First or Last. – if. Special operators – where.
There are some cases where there is no difference between WHERE and IF, as in example 1. For many others, however, there are rules that must be applied.
Just like when you purchase a cell phone, there is always the fine print that is difficult to read but important. In this case, there are assumptions and conditions on this general table. In the paper, I have the fine details as I was doing my research.
So the next time you are hungry, don’t just grab a fruit, reach for an apple or an orange depending on how you are feeling. Be aware of the differences as you are reviewing other programs.
I would like to challenge each of you to look at other similar approaches so that you may also master them.
This concept is important to know for the base certification exam.