David B. Horvath presented on the NOBS= option of the SAS SET statement. NOBS= assigns the total number of observations in the input dataset to a DATA step variable, which can then be copied into a macro variable. However, NOBS= is set before any WHERE processing, so it does not reflect WHERE clause filtering. To get an accurate count after filtering, run a separate DATA step that applies the WHERE clause first, then use NOBS= on its output. NOBS= also ignores OBS= settings, and the XML engine does not return a meaningful value for it.
3. Abstract
This mini-session will be a short discussion of the NOBS (number of observations)
option on the SET statement.
This includes one "gotcha" that I've run into with WHERE clauses: NOBS is set before
WHERE processing. If you have a reason to know the number of observations
after the WHERE clause, another DATA step is needed.
4. My Background
• David is an IT Professional who has worked with various platforms since the 1980s with a
variety of development and analysis tools.
• He has presented at PhilaSUG, SESUG, and SGF previously and has presented workshops
and seminars in Australia, France, the US, Canada, and Oxford, England (about the British
author Nevil Shute).
• He holds an undergraduate degree in Computer and Information Sciences from Temple
University and a Master's in Organizational Dynamics from UPENN. He achieved the
Certified Computing Professional designation with honors.
• Most of his career has been in consulting (although recently he has been in-house) in the
Philadelphia, PA area. He is currently in Data Analytics "Engineering" at a regional bank.
• He has several books to his credit (none SAS related) and is an Adjunct Instructor
covering IT topics.
5. Basic NOBS
• The NOBS= option on the SET statement is a handy way of discovering how many
observations are in your SAS dataset:
data simple;
a = 42;
output;
run;
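data _null_;
put nobs=;  /* nobs is already populated at compile time, before SET executes */
stop;       /* end the step without reading any observations */
set simple nobs=nobs;
run;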
• Prints
NOBS=1
6. Macro NOBS
• Great information if you need it!
• Is available before the first row is processed
• Can be stored in macro variable for global usage:
data _null_;
call symputx('ALLOBS', nobs); /* symputx stores the number without leading blanks */
stop;
set simple nobs=nobs;
run;
data _null_;
put "number of obs are &ALLOBS.";
stop;
run;
• Prints
number of obs are 1
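Once the count is in a macro variable, it can drive conditional logic in later steps. The following is a minimal sketch of that idea, not part of the original talk: the macro name print_if_any and the macro variable n are hypothetical, and it assumes the dataset simple created earlier exists.

```sas
/* Hypothetical helper: print a dataset only when it has observations. */
%macro print_if_any(ds);
  %local n;

  data _null_;
    call symputx('n', nobs);  /* copy the compile-time count into macro variable n */
    stop;                     /* no observations need to be read */
    set &ds nobs=nobs;
  run;

  %if &n > 0 %then %do;
    proc print data=&ds;
    run;
  %end;
  %else %do;
    %put NOTE: &ds has no observations, nothing printed.;
  %end;
%mend print_if_any;

%print_if_any(simple)
```

Because the count comes from NOBS=, the same caveats described on the following slides apply to this check as well.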
7. NOBS – the catch: where
• NOBS is determined at the file level, before the WHERE clause is applied:
data _null_;
put nobs=;
stop;
set simple nobs=nobs;
run;
• And
data _null_;
put nobs=;
stop;
set simple nobs=nobs;
where a = 10;
run;
• Both print the same result:
NOBS=1
8. NOBS – the catch: where
• I found out the hard way
• I had a process that rsubmitted N jobs to process the objects within
an XML file
• Each of the N jobs processed 1/Nth of the objects to spread load
• Process worked fine until the user said "Don't bother with THESE
tables".
• I figured "Oh, this is SAS, this is an easy change: 'where TABLE
not in (THESE1, THESE2, ... THESEn)'".
• The process worked fine but runtimes went up – no longer were N
processes running; the last 2 never started up.
• Solution was to add another DATA step in front to execute the
WHERE
• Input to the rsubmit process now had the correct NOBS
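The fix described above can be sketched as follows. This is an illustration only, not the original production code: the dataset names demo and demo_filtered and the macro variable FILTOBS are made up for the example.

```sas
/* Illustrative data: three rows, two of which pass the filter. */
data demo;
  do a = 10, 42, 42;
    output;
  end;
run;

/* Step 1: apply the WHERE clause and write the survivors to a new dataset. */
data demo_filtered;
  set demo;
  where a = 42;
run;

/* Step 2: NOBS= on the filtered dataset now reflects the filter. */
data _null_;
  call symputx('FILTOBS', nobs);
  stop;
  set demo_filtered nobs=nobs;
run;

%put Filtered count: &FILTOBS.;  /* writes: Filtered count: 2 */
```

The cost is an extra pass that writes the filtered rows to disk, which is the trade-off for getting a count that matches what downstream steps will actually read.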
9. NOBS – the catch: not a new variable
• Your nobs variable is special – it will not appear in the output
dataset
data new;
set simple nobs=nobs;
run;
proc print data=new; run;
• Prints
Obs a
1 42
• Listing the nobs variable in a KEEP= dataset option is not a fix:
data new (keep=a nobs);
WARNING: The variable nobs in the DROP, KEEP, or RENAME
list has never been referenced.
• The only solution is an assignment statement that copies it to a new variable (even RETAIN does not help):
nnobs=nobs;
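Put together as a complete step, the assignment looks like the sketch below. It reuses the dataset simple and the variable name nnobs from this slide; nothing else is assumed.

```sas
data new;
  set simple nobs=nobs;
  nnobs = nobs;  /* copy the special NOBS= variable into an ordinary variable */
run;

proc print data=new;
run;
```

The printed output should now show both a and nnobs for each observation, with nnobs holding the total observation count of simple on every row.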
10. NOBS – the catch: options obs=
• NOBS is independent of the OBS= system option:
options obs=2;
data _null_;
put nobs=;
stop;
set large nobs=nobs;
run;
• And of the OBS= dataset option:
data _null_;
put nobs=;
stop;
set large(obs=2) nobs=nobs;
run;
• Both print
nobs=915803
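If what you need is the number of observations actually read, rather than the number stored in the file, one alternative is to count them while reading. This is a swapped-in technique, not something shown in the talk: the dataset large_demo and the variable rows_read are hypothetical names for the sketch.

```sas
/* Illustrative data: 1000 rows. */
data large_demo;
  do i = 1 to 1000;
    output;
  end;
run;

/* Count the rows that SET really delivers under OBS=2. */
data _null_;
  set large_demo(obs=2) end=eof;
  rows_read + 1;                  /* sum statement: retained, starts at 0 */
  if eof then put rows_read=;     /* writes rows_read=2 */
run;
```

Unlike NOBS=, this reads every delivered observation, so it costs a full pass over the selected rows. Note also that if no observations are delivered at all, the step ends at the SET statement and nothing is written to the log.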
11. NOBS – the catch: not every engine
• The XML engine does not properly implement NOBS:
filename SXLEMAP "OUR_MAP_FILE.map";
filename test2 "OUR_INPUT_FILE.xml";
libname test2 xml xmlmap=SXLEMAP access=READONLY;
NOTE: Libref TEST2 was successfully assigned as follows:
Engine: XML
Physical Name: TEST2
data _null_;
put nobs=;
stop;
set test2.application nobs=nobs;
run;
• Printing
nobs=9.0071993E15
• When the file only contained 17,383,357 bytes