2015
Praxis Business School
Vibeesh C S
Solution for Even
Numbered Problems
For Chapters 7-15 from Learning SAS by Example -
A Programmer’s Guide by Ron Cody
Chapter 7- Performing Conditional Processing
Question
/* 7.2 Using the SAS data set Hosp, use PROC PRINT to list observations for
Subject values of 5, 100, 150, and 200. Do this twice, once using OR
operators and once using the IN operator. Note: Subject is a numeric
variable */
Program
data a15031.hosp99l4;
set a15031.hosp; *USING "OR";
where Subject eq 5 or Subject eq 100 or Subject eq 150 or Subject eq
200;
run;
proc print data=a15031.hosp99l4;
run;
data a15031.hosin; *using "IN";
set a15031.hosp;
where Subject in(5,100,150,200);
run;
proc print data=a15031.hosin;
run;
Output using OR
Output using IN
Question
/* 7.4 Using the Sales data set, create a new, temporary SAS data set
containing Region and Total Sales plus a new variable called Weight with
values of 1.5 for the North Region, 1.7 for the South Region, and 2.0 for
the West and East Regions. Use a SELECT statement to do this */
Program
data a15031.sales11q;
set a15031.sales(keep = Region TotalSales);
*the KEEP= data set option reads only Region and TotalSales;
select;
when (Region = 'North') Weight = 1.5; *each WHEN assigns Weight for one Region;
when (Region = 'South') Weight = 1.7;
when (Region = 'East') Weight = 2.0;
when (Region = 'West') Weight = 2.0;
otherwise; *any other Region leaves Weight missing;
end;
run;
proc print data=a15031.sales11q;
run;
Output
Question
/* 7.6 Using the Sales data set, list all the observations where Region is
North and Quantity is less than 60. Include in this list any observations
where the customer name (Customer) is Pet's are Us */
Program
data a15031.sal55;
set a15031.sales;
*keep observations where Region is North and Quantity is less than 60,
and also any observation for the customer Pet's are Us;
where (Region eq "North" and Quantity < 60)
or Customer eq "Pet's are Us";
run;
proc print data=a15031.sal55;
run;
Learnings from this chapter
• KEEP= and DROP= are data set options (not functions) that select the
variables read into a DATA step. When a data set has many variables, they
restrict the analysis to the variables of interest.
• The WHERE statement filters observations as they are read, according to
the stated conditions.
• The Boolean operators (AND, OR, NOT) and the IN operator combine several
conditions in one expression.
• The SELECT statement, with its WHEN and OTHERWISE clauses, assigns
values case by case.
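
The points above can be sketched in one short program. This is only an illustration: the data set DEMO_SALES and its values are made up for the example and are not part of the book's Sales data set.

```sas
/* Hypothetical data, invented for this sketch */
data demo_sales;
   input Region $ Quantity TotalSales;
datalines;
North 50 500
South 70 900
East 20 150
West 90 1200
;
run;

data demo_weight;
   /* KEEP= is a data set option: only Region and TotalSales are read */
   set demo_sales(keep=Region TotalSales);
   /* WHERE with the IN operator filters observations as they are read */
   where Region in ('North','South','East');
   /* SELECT assigns Weight case by case */
   select (Region);
      when ('North') Weight = 1.5;
      when ('South') Weight = 1.7;
      otherwise Weight = 2.0;
   end;
run;

proc print data=demo_weight noobs;
run;
```

The West observation is dropped by the WHERE statement, so the listing should contain three rows with Weight values 1.5, 1.7 and 2.0.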
Chapter 8 – Performing Iterative Processing
Question
/*8.2 Run the program here to create a temporary SAS data set (MonthSales):
data monthsales;
input month sales @@;
---add your line(s) here---
datalines;
1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500
9 5100 10 5700 11 6500 12 7500
;
Modify this program so that a new variable, SumSales, representing Sales to
date, is added to the data set. Be sure that the missing value for Sales in
month 3 does not result in a missing value for SumSales */
Program
data a15031.monthsales;
input month sales @@; *the double trailing @ reads several observations
from one data line;
datalines;
1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500
9 5100 10 5700 11 6500 12 7500
;
proc print data=a15031.monthsales;
run;
data a15031.modifiedsales;
set a15031.monthsales;
sumsales+sales; *sum statement: ignores the missing Sales in month 3;
*a sum statement retains SumSales and starts it at 0, so a separate
RETAIN statement is not needed;
run;
proc print data=a15031.modifiedsales;
run;
Output
Question
/*8.4 Count the number of missing values for the variables A, B, and
C in the Missing data set. Add the cumulative number of missing
values to each observation (use variable names MissA, MissB, and
MissC). Use the MISSING function to test for the missing values */
Program
data a15031.missing1;
input G $ A B C ;
*a sum statement in each IF statement keeps a cumulative count of the
missing values;
if missing(G) then MissG+1;
if missing(A) then MissA+1;
if missing(B) then MissB+1;
if missing(C) then MissC+1;
datalines;
M 56 68 89
F 33 60 71
M 45 91 .
F 35 35 68
M . 71 81
M 50 68 71
. 23 60 46
M 65 72 103
. 35 65 67
M 15 71 75
;
proc print data=a15031.missing1 NOOBS;
run;
Output
Question
/*8.6 Repeat Problem 5, except have the range of N go from 5 to 100
by 5 */
Program
data a15031.loger2;
do n = 5 to 100 by 5;*using do loop creating values from 5 to 100 by
5;
log_of_n=log(n);
output;
end;
run;
proc print data=a15031.loger2;
run;
Output
Question
/*8.8 Use an iterative DO loop to plot the following equation:
Logit = log(p / (1 – p)). Use values of p from 0 to 1 (with a point at
every .05). Using the following GPLOT
statements will produce a very nice plot. (If you do not have
SAS/GRAPH
software, use PROC PLOT to plot your points).
goptions reset=all
ftext='arial'
htext=1.0
ftitle='arial/bo'
htitle=1.5
colors=(black);
symbol v=none i=sm;
title "Logit Plot";
proc gplot data=logitplot;
plot Logit * p;run;quit;*/
Program
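data a15031.itrative1;
do i = 0 to 20; *21 points: p = 0, 0.05, ..., 1;
p = i/20;
*the logit is undefined at p=0 and p=1, so it is left missing there;
if 0 < p < 1 then logit = log(p/(1-p));
else logit = .;
output;
end;
drop i;
run;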
goptions reset=all ftext='arial' htext=1.0 ftitle='arial/bo'
htitle=1.5 colors=(black);
symbol v=none i=sm;
title "Logit Plot";
proc gplot data=a15031.itrative1;
plot logit * p; *the PLOT statement draws logit against p;
run;quit;
proc print data=a15031.itrative1;
run;
Output
Question
/*8.10 You are testing three speed-reading methods (A, B, and C) by
randomly assigning 10 subjects to each of the three methods. You are
given the results as three lines of reading speeds, each line
representing the results from each of the three
methods, respectively. Here are the results:
250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399
Create a temporary SAS data set from these three lines of data. Each
observation should contain Method (A, B, or C), and Score. There
should be 30 observations in this data set. Use a DO loop to create
the Method variable and remember to use a single trailing @ in your
INPUT statement. Provide a listing of this data set using PROC PRINT
*/
Program
data a15031.speed;
do method = "A", "B", "C"; *one pass per method, in the order of the data lines;
do n = 1 to 10; *ten subjects per method;
input score @; *the single trailing @ holds the line for the next INPUT;
output;
end;
end;
drop n;
datalines;
250 255 256 300 244 268 301 322 256 333
267 275 256 320 250 340 345 290 280 300
350 350 340 290 377 401 380 310 299 399
;proc print data=a15031.speed;
run;
Output
Question
/* 8.12 You place money in a fund that returns a compound interest
of 4.25% annually. You
deposit $1,000 every year. How many years will it take to reach
$30,000? Do not
use compound interest formulas. Rather, use “brute force” methods
with DO WHILE
or DO UNTIL statements to solve this problem */
Program
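data a15031.money;
interest=0.0425; *4.25% annual return;
total=0;
do until (total ge 30000);
year+1;
total=total+1000; *yearly deposit, assumed to be made at the start of the year;
total=total+interest*total; *interest earned during the year;
output;
end;
run;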
proc print data=a15031.money;
format total dollar10.2;
run;
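
For comparison, the same brute-force idea can be written with DO WHILE, which tests the condition at the top of the loop instead of the bottom. This is a sketch only; it assumes, as above, that the $1,000 deposit is made at the start of each year, and it writes the result to the log rather than to a data set.

```sas
data _null_;
   Interest = 0.0425;   /* 4.25% annual return */
   Total = 0;
   Year = 0;
   /* DO WHILE checks the condition before each pass through the loop */
   do while (Total lt 30000);
      Year = Year + 1;
      Total = (Total + 1000) * (1 + Interest);  /* deposit, then add interest */
   end;
   put Year= Total= dollar12.2;
run;
```

Because Total starts below $30,000, both loop forms run the same number of times here; they differ only when the condition is already satisfied before the first iteration.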
Output
Question
/* 8.14 Generate a table of integers and squares starting at 1 and
ending when the square
value is greater than 100. Use either a DO UNTIL or DO WHILE
statement to accomplish this*/
Program
data a15031.table;
do n=1 to 100 until (square gt 100);
square= n**2;
*the UNTIL clause is tested at the bottom of each iteration, so the loop
stops after the first square that is greater than 100 has been written;
output;
end;
run;
proc print data=a15031.table ;
run;
Output
Learnings from this chapter
• The sum statement and the RETAIN statement (neither is a function)
• Using the sum statement to count missing values
• Using DO loops for iterative processing
• Using the single trailing @ and the double trailing @@ to read data
• Using DO WHILE and DO UNTIL statements
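
The difference between a sum statement and an ordinary assignment is easiest to see side by side. The sketch below uses the first four months of the MonthSales data; the variable Naive is a made-up name used only for this comparison.

```sas
data demo_running;
   input Month Sales @@;
   /* Sum statement: SumSales is retained, starts at 0, and skips missing values */
   SumSales + Sales;
   /* Ordinary assignment: a missing Sales makes Naive missing from then on */
   retain Naive 0;
   Naive = Naive + Sales;
datalines;
1 4000 2 5000 3 . 4 5500
;

proc print data=demo_running noobs;
run;
```

SumSales should read 4000, 9000, 9000, 14500, while Naive becomes missing in month 3 and stays missing in month 4.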
Chapter 9 – Working with Dates
Question
/* 9.2 Using the following lines of data, create a temporary SAS
data set called Three Dates. Each line of data contains three dates,
the first two in the form mm/dd/yyyy and the last in the
form ddmmmyyyy. Name the three date variables Date1, Date2, and
Date3. Format all three using the MMDDYY10. format. Include in your
data set the number of years from Date1 to Date2 (Year12) and the
number of years from Date2 to Date3 (Year23). Round these values to
the nearest year. Here are the lines of data (note that the columns
do not line up):
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005 */
Program
data a15031.threedate;
*list input with informats, because the columns do not line up;
input date1 : mmddyy10.
date2 : mmddyy10.
date3 : date9.;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
*YRDIF returns the number of years between two dates and ROUND rounds
the result to the nearest year;
Year12=round(yrdif(date1,date2,"actual"));
Year23=round(yrdif(date2,date3,"actual"));
datalines;
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
;
proc print data=a15031.threedate noobs ;
run;
Output
Question
/* 9.4 Using the Hosp data set, compute the subject’s ages two ways:
as of January 1, 2006 (call it AgeJan1), and as of today’s date (call
it AgeToday). The variable DOB represents the date of birth. Take
the integer portion of both ages. List the first 10
observations */
hint :
*using yrdif to find the difference between DOB and today’s date and
int to get only integer value of the difference
Program
data a15031.hospp;
set a15031.hosp;
*INT keeps the integer portion of each age, as the problem asks;
AgeToday=int(yrdif(DOB,today(),"actual"));
AgeJan1=int(yrdif(DOB,"01jan2006"d,"actual"));
run;
proc print data=a15031.hospp(OBS=10 );
run;
Output
Question
/* 9.6 Using the Medical data set, compute frequencies for the days
of the week for the date of the visit (VisitDate). Supply a format
for the days of the week and months of the year */
Program
data a15031.medical;
input @1 VisitDate mmddyy10. @12 patno $3.;
format visitdate date9.;
day_of_week=weekday(visitdate); *WEEKDAY returns 1 (Sunday) through 7 (Saturday);
month_of_year=month(visitdate); *MONTH returns 1 through 12;
datalines;
11/29/2003 879
11/30/2003 880
09/04/2003 883
08/28/2003 884
09/04/2003 885
08/26/2003 886
08/31/2003 887
08/25/2003 888
11/16/2003 913
11/15/2003 914
;
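proc format; *formats that label the day and month numbers;
value dayfmt 1='Sunday' 2='Monday' 3='Tuesday' 4='Wednesday'
5='Thursday' 6='Friday' 7='Saturday';
value monfmt 1='January' 2='February' 3='March' 4='April' 5='May'
6='June' 7='July' 8='August' 9='September' 10='October'
11='November' 12='December';
run;
proc freq data= a15031.medical;
tables day_of_week month_of_year;
format day_of_week dayfmt. month_of_year monfmt.;
run;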
proc print data=a15031.medical;
run;
Output
Question
/* 9.8 Using the values for Day, Month, and Year in the raw data
below, create a temporary SAS data set containing a SAS date based
on these values (call it Date) and format this value using the
MMDDYY10. format. Here are the Day, Month, and Year values:
25 12 2005
1 1 1960
21 10 1946 */
Program
data a15031.date_it;
input Day Month Year;
datalines;
25 12 2005
1 1 1960
21 10 1946
;
data a15031.date_it1;
set a15031.date_it; *the SET statement reads the data set created above;
Date = mdy(Month,Day,Year); *MDY builds a SAS date from the month, day
and year values;
format Date mmddyy10.; *display the date in MMDDYY10. format;
run;
proc print data=a15031.date_it1;
run;
Output
Question
/* 9.10 Using the Hosp data set, compute the number of months from
the admission date (AdmitDate) and December 31, 2007 (call it
MonthsDec). Also, compute the number of months from the admission
date to today's date (call it MonthsToday). Use a date interval
function to solve this problem. List the first 20 observations for
your solution */
Program
data a15031.monthdec;
set a15031.hosp;
*set hosp data into this data from permanent library;
*you can find hosp dataset in the blog folder uploaded in the
dropbox;
MonthsDec =intck('month',AdmitDate,'31dec2007'd) ;
*INTCK counts the month boundaries between AdmitDate and 31Dec2007;
MonthsToday =intck('month',AdmitDate,today());
run;
proc print data= a15031.monthdec(obs=20); *list the first 20 observations;
run;
Output
Question
/* 9.12 You want to see each patient in the Medical data set on the
same day of the week 5 weeks after they visited the clinic (the
variable name is VisitDate). Provide a listing of the patient number
(Patno), the visit date, and the date for the return visit */
Program
data a15031.med;
set a15031.medical;
Followdate=intnx('week',VisitDate,5,'sameday');
*INTNX advances VisitDate by 5 weeks and SAMEDAY keeps the same day of
the week;
run;
proc print data=a15031.med;
format Followdate VisitDate date9.;
run;
Output
Learnings from this chapter
• The ways to read date variables
• The ways to store date variables
• The ways to extract the day of the week or the month from a date
• Providing formats to dates
• The usage of the date interval functions INTCK and INTNX
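
INTCK and INTNX are easy to confuse, so here is a minimal sketch that uses both on one date. The date literal is one of the visit dates from the Medical data above; the variable names are made up for the example.

```sas
data _null_;
   Visit = '29NOV2003'd;
   /* INTCK counts how many month boundaries lie between two dates */
   MonthsTo2007 = intck('month', Visit, '31DEC2007'd);
   /* INTNX moves a date forward by a number of intervals;
      SAMEDAY keeps the same day of the week when the interval is WEEK */
   ReturnVisit = intnx('week', Visit, 5, 'sameday');
   put MonthsTo2007= ReturnVisit= date9.;
run;
```

The log should show MonthsTo2007=49 and ReturnVisit=03JAN2004, that is, 35 days after the visit.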
Chapter 10 – Subsetting and Combining SAS
Datasets
Question
/*10.2 Using the SAS data set Hosp, create a temporary SAS data set
called Monday2002,consisting of observations from Hosp where the
admission date (AdmitDate) falls on a Monday and the year is 2002.
Include in this new data set a variable called Age,computed as the
person’s age as of the admission date, rounded to the nearest
year;*/
Program
data monday20122;
set a15031.hosp;
Admit_day= weekday(AdmitDate); *week day of admit;
admit_year=year(admitdate); *year OF admit;
admit_month=month(AdmitDate);*month of admit;
day_of_admit=day(admitdate); *date of admit;
run;
proc print data=monday20122;
run;
Output
Program
data monday2002;
set monday20122; *the temporary data set created in the previous step;
where Admit_day = 2 and admit_year = 2002; *WEEKDAY returns 2 for Monday;
Age = round(yrdif(DOB,AdmitDate,'Actual')); *age at admission, rounded to
the nearest year;
run;
proc print data=monday2002;
format DOB date9. AdmitDate date9.;
run;
Output
Question
/*10.4 Using the SAS data set Bicycles, create two temporary SAS
data sets as follows:
Mountain_USA consists of all observations from Bicycles where State
is Uttar Pradesh and Model is Mountain. Road_France consists of all
observations from Bicycles where State is Maharastra and Model is
Road Bike. Print these two data sets */
Program
title bicycle;
data a15031.bicycle;*create a data set;
set a15031.bicycles;
run;
proc contents data=a15031.bicycle;*check the content of data set;
run;
proc print data=a15031.bicycle;
run;
data a15031.mountain_usa a15031.road_france;
set a15031.bicycle;
*the IF-THEN/ELSE statements route each observation to one of the two
output data sets named in the DATA statement;
if state = "Uttar Pradesh" and model = "mountain bike" then output
a15031.mountain_usa;
else if state = "Maharastra" and model = "road bike" then output
a15031.road_france;
run;
proc print data=a15031.mountain_usa;
run;
proc print data=a15031.road_france;
run;
Output
Question
/*10.6 Repeat Problem 5, except this time sort Inventory and
NewProducts first (create two temporary SAS data sets for the sorted
observations). Next, create a new, temporary SAS data set (Updated)
by interleaving the two temporary, sorted SAS data sets. Print out
the result.*/
Program
proc sort data=a15031.inventory out=a15031.inventory;
by Model;*both data sets must be sorted by Model before they are interleaved;
run;
proc sort data=a15031.newproducts out=a15031.newproducts;
by Model;run;
data a15031.updated;
set a15031.inventory a15031.newproducts;
*the SET statement followed by a BY statement interleaves the two sorted
data sets;
by Model;run;
title "Listing of UPDATED";
proc print data=a15031.updated;
run;
Output
Question
/* 10.8 Run the program here to create a SAS data set called Markup:
data markup;
input manuf : $10. Markup;
datalines;
Cannondale 1.05
Trek 1.07
;
Combine this data set with the Bicycles data set so that each
observation in the Bicycles data set now has a markup value of 1.05
or 1.07, depending on whether the bicycle is made by Cannondale or
Trek. In this new data set(call it Markup_Prices),create a new
variable (NewTotal) computed as TotalCost times Markup */
Program
data a15031.markup;
input manuf : $10. Markup;
datalines;
Cannondale 1.05
Trek 1.07
;
proc print data = a15031.markup;
run;
proc contents data= a15031.markup;
run;
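*both data sets must be sorted by manuf before the merge;
proc sort data = a15031.bicycle out = bicycle_sorted;
by manuf;run;
proc sort data = a15031.markup out = markup_sorted;
by manuf;run;
data a15031.markup_prices;
*the MERGE statement with BY manuf adds the matching Markup value to
each bicycle observation;
merge bicycle_sorted markup_sorted;
by manuf;
newtotal=TotalCost*Markup; *TotalCost is the variable named in the problem;
run;
proc print data = a15031.markup_prices;*merged data set;run;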
Output
Question
/*10.10 Using the Purchase and Inventory data sets, provide a list
of all Models (and the Price) that were not purchased*/
Program
proc sort data=a15031.inventory out=a15031.inventory;
by Model;*sort the inventory data by model and pass it to inventory;
run;
proc sort data=a15031.purchase out=a15031.purchase;
by Model; *sort the purchase data by model and pass it to purchase;
run;
data a15031.not_bought;
*merge the two sorted data sets; the IN= variables flag which data set
contributed to each observation;
merge a15031.inventory(in=InInventory)
a15031.purchase(in=InPurchase);
by Model;
if InInventory and not InPurchase;
keep Model Price;
*keep only model and price and eliminate the other variable;
run;
title "Listing of NOT_BOUGHT";
proc print data=a15031.not_bought noobs;
run;
Output
Question
/*10.12 You want to merge two SAS data sets, Demographic and
Survey1, based on an identifier. In Demographic, this identifier is
called ID; in Survey1, the identifier is called Subj. Both are
character variables.*/
Program
data a15031.demographic;
input ID : $3.
DOB : mmddyy10.
Gender : $1.;
format DOB mmddyy10.;
datalines;
012 10/10/37 M
535 7/12/87 F
723 1/5/2000 M
007 6/4/1966 F
;
*Data set SURVEY1;
data a15031.survey1;
input Subj : $3.
(Q1-Q5)($1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
*Data set SURVEY2;
data a15031.survey2;
input ID
(Q1-Q5)(1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
proc sort data=a15031.demographic out=demographic;
by ID;
run;
proc sort data=a15031.survey1 out=survey1;
by Subj;
run;
data a15031.combinech10;
merge demographic /* the sorted copy created by PROC SORT above */
survey1 (rename=(Subj = ID));
by ID;
run;
title "Listing of COMBINE";
proc print data=a15031.combinech10 noobs;
run;
Output
Question
/*10.14 Data set Inventory contains two variables: Model (an 8-byte
character variable) and Price (a numeric value). The price of Model
M567 has changed to 25.95 and the price of Model X999 has changed to
35.99. Create a temporary SAS data set (call it NewPrices) by
updating the prices in the Inventory data set*/
Program
data a15031.modelnew;
input Model $ Price;
datalines;
M567 25.95
X999 35.99
;
*sorting inventory data by model variable;
proc sort data=a15031.inventory out=inventory;
by Model;
run;
*updating inventory data with modelnew for price for the models;
data a15031.updatedprices;
update inventory a15031.modelnew; *inventory is the sorted copy created above;
by Model;
run;
title "Listing of NEWPRICES";
proc print data=a15031.updatedprices noobs;
run;
Output
Learnings from this chapter
• The ways to subset a dataset based on the requirements
• The ways to generate multiple subsets from the data in a single data step
• The ways to manipulate the data
• Adding observations, moving observations from datasets
• How to produce a summary of variables
• Merging two datasets by performing one to one, one to many and
many to many joins
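
The IN= technique used in Problem 10.10 can be tried on a few rows of made-up data. The data sets DEMO_INV and DEMO_PUR and their values are hypothetical and stand in for Inventory and Purchase.

```sas
/* Hypothetical inventory and purchase data, invented for this sketch */
data demo_inv;
   input Model $ Price;
datalines;
A100 10.50
B200 20.00
C300 30.25
;

data demo_pur;
   input Model $ Units;
datalines;
A100 2
C300 1
;

/* Both inputs must be sorted by the BY variable before a match-merge */
proc sort data=demo_inv; by Model; run;
proc sort data=demo_pur; by Model; run;

data demo_not_bought;
   merge demo_inv(in=InInv) demo_pur(in=InPur);
   by Model;
   /* Keep models that are in the inventory but have no purchase record */
   if InInv and not InPur;
   keep Model Price;
run;

proc print data=demo_not_bought noobs;
run;
```

Only model B200 has no matching purchase, so the listing should contain that single row.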
Chapter 11- Working with Numeric
Functions
Question
/* 11.1 Using the SAS data set Health, compute the body mass index
(BMI) defined as the weight in kilograms divided by the height (in
meters) squared. Create four other variables based on BMI: 1)
BMIRound is the BMI rounded to the nearest integer, 2) BMITenth is
the BMI rounded to the nearest tenth, 3) BMIGroup is the BMI rounded
to the nearest 5, and 4) BMITrunc is the BMI with a fractional
amount truncated. Conversion factors you will need are: 1 Kg equals
2.2 Lbs and 1 inch = .0254 meters */
Program
data a15031.health;
set a15031.health;
BMI = (Weight / 2.2) / (Height*.0254)**2;
BMIRound=round(BMI);
BMITenth=round(BMI,.1);
BMIGroup=round(BMI,5);
BMITrunc=int(BMI);
run;
proc print data=a15031.health;
run;
Output
Question
/* 11.2 Count the number of missing values for WBC, RBC, and Chol in
the Blood data set.
Use the MISSING function to detect missing values */
Program
data a15031.hel;
set a15031.blood;
*blood dataset is present in the blog folder uploaded in dropbox
folder;
if missing(Gender) then MissG+1;
if missing(WBC) then MissWBC+1;
if missing(RBC) then MissRBC+1;
if missing(Chol) then MissChol+1;
*a sum statement in each IF statement counts the missing values of each
variable;
run;
proc print data=a15031.hel;
run;
Output
Question
/* 11.4 The SAS data set Psych contains an ID variable, 10 question
responses (Ques1–Ques10), and 5 scores (Score1–Score5). You want to
create a new, temporary SASdata set (Evaluate) containing the
following:
a. A variable called QuesAve computed as the mean of Ques1–Ques10.
Perform this computation only if there are seven or more non-missing
question values.
b. If there are no missing Score values, compute the minimum
score(MinScore),the maximum score (MaxScore), and the second highest
score (SecondHighest) */
Program
data a15031.evaluate;
set a15031.psych;
*the Psych data set is in the blog folder uploaded to the Dropbox
folder;
if n(of Ques1-Ques10) ge 7 then QuesAve=mean(of Ques1-Ques10);
*compute the score statistics only when all five scores are present;
if n(of Score1-Score5) eq 5 then do;
MaxScore=max(of Score1-Score5);
MinScore=min(of Score1-Score5);
SecondHighest=largest(2,of Score1-Score5);
end;
run;
proc print data=a15031.evaluate;
run;
Output
Question
/* 11.6 Write a short DATA _NULL_ step to determine the largest
integer you can store on
your computer in 3, 4, 5, 6, and 7 bytes */
Program
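data _null_;
*CONSTANT('EXACTINT',n) returns the largest integer that can be stored
exactly in n bytes on the current machine;
do bytes = 3 to 7;
largest_int = constant('exactint',bytes);
put bytes= largest_int= comma20.;
end;
run;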
Output of log window
Question
/*11.8 Create a temporary SAS data set (Random) consisting of 1,000
observations, each with a random integer from 1 to 5. Make sure that
all integers in the range are equally likely. Run PROC FREQ to test
this assumption */
Program
data a15031.random;
do i=1 to 1000;
x=int(rand('uniform')*5)+1; *alternative: x=int(ranuni(0)*5+1);
output;
end;
*RAND('uniform') returns a value between 0 and 1, so x is an integer
from 1 to 5;
run;
proc freq data=a15031.random;
tables x/missing;run;
Output
Question
/* 11.10 Data set Char_Num contains character variables Age and
Weight and numeric variables SS and Zip. Create a new, temporary SAS
data set called Convert with new variables NumAge and NumWeight that
are numeric values of Age and Weight, respectively, and CharSS and
CharZip that are character variables created from SS and Zip. CharSS
should contain leading 0s and dashes in the appropriate places for
Social Security numbers and CharZip should contain leading 0s
Hint: The Z5. format includes leading 0s for the ZIP code */
Program
data a15031.convert;
set a15031.char_num;
NumAge = input(Age,8.);
NumWeight = input(weight,8.);
*converting character variables weight and age into numeric
variables;
CharSS = put(SS,ssn11.);
CharZip = put(Zip,z5.);
*converting numeric variables SS and Zip into character variables;
run;
proc print data=a15031.convert;
run;
Output
Question
/* 11.12 Using the Stocks data set (containing variables Date and
Price), compute daily changes in the prices. Use the statements here
to create the plot. Note: If you do not have SAS/GRAPH installed,
use PROC PLOT and omit the GOPTIONS and SYMBOL statements. goptions
reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot of Daily Price Differences";
proc gplot data=difference;
plot Diff*Date;
run;
quit; */
Program
data a15031.price_difference;
set a15031.stocks;
Diff = Dif(Price);
*using dif function to calculate the difference in the price
compared to the previous price ;
run;
goptions reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot for Price Differences";
proc gplot data=a15031.price_difference;
plot Diff * Date;
run;
quit;
Output
Learnings from this chapter
• The ways of rounding and truncating numerical values
• The ways to detect missing values
• The ways to treat missing values
• The ways to assign data types to missing values
• The usage of random numbers and the ways to generate random numbers
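
As a quick illustration of the rounding and truncating functions from Problem 11.1, the sketch below applies them to a single made-up BMI value and writes the results to the log.

```sas
data _null_;
   BMI = 23.78;                 /* hypothetical value, for illustration only */
   BMIRound = round(BMI);       /* nearest integer */
   BMITenth = round(BMI, .1);   /* nearest tenth */
   BMIGroup = round(BMI, 5);    /* nearest multiple of 5 */
   BMITrunc = int(BMI);         /* fractional part dropped */
   put BMIRound= BMITenth= BMIGroup= BMITrunc=;
run;
```

For this value the log should show 24, 23.8, 25 and 23. Note that ROUND and INT differ whenever the fractional part is .5 or more.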
Chapter 12- Working with character
functions
Question
/*12.2 Using the data set Mixed, create a temporary SAS data set
(also called Mixed) with the following new variables:
a. NameLow – Name in lowercase
b. NameProp – Name in proper case
c. (Bonus – difficult) NameHard – Name in proper case without using
the
PROPCASE function*/
Program
data a15031.mixed;
set a15031.mixed;
length First Last $ 15 NameHard $ 20;
NameLow = lowcase(Name);
*LOWCASE converts the whole name to lowercase;
NameProp = propcase(Name);
*PROPCASE capitalizes the first letter of each word;
First = lowcase(scan(Name,1,' '));
*SCAN extracts the first word, which LOWCASE converts to lowercase;
Last = lowcase(scan(Name,2,' '));
*SCAN extracts the second word, which LOWCASE converts to lowercase;
substr(First,1,1) = upcase(substr(First,1,1));
*SUBSTR on the left of the equal sign replaces only the first letter
with its uppercase form;
substr(Last,1,1) = upcase(substr(Last,1,1));
*the same replacement for the first letter of the last name;
NameHard = catx(' ',First,Last);
drop First Last;
run;
proc print data=a15031.mixed;
run;
Output
Question
/*12.4 Data set Names_And_More contains a character variable called
Height. As you can see in the listing in Problem 3, the heights are
in feet and inches. Assume that these units can be in upper- or
lowercase and there may or may not be a period following the units.
Create a temporary SAS data set (Height) that contains a numeric
variable (HtInches) that is the height in inches.*/
*Data set NAMES_AND_MORE;
Program
data a15031.height;
set a15031.names_and_more(keep = Height);
Height = compress(Height,'INFT.','i');
/* Alternative
Height = compress(Height,' ','kd');
*keep digits and blanks;
*/
Feet = input(scan(Height,1,' '),8.);
Inches = input(scan(Height,2,' '),?? 8.);
*SCAN extracts the first word (feet) and the second word (inches) of
Height, using a blank as the delimiter;
if missing(Inches) then HtInches = 12*Feet;
else HtInches = 12*Feet + Inches;
drop Feet Inches;
run;
title "Listing of HEIGHT";
proc print data=a15031.height noobs;
run;
Output
Question
/*12.6 Data set Study (shown here) contains the character variables
Group and Dose. Create a new, temporary SAS data set (Study) with a
variable called GroupDose by putting these two values together,
separated by a dash. The length of the resulting variable should be
6 (test this using PROC CONTENTS or the SAS Explorer). Make sure
that there are no blanks (except trailing blanks) in this value. Try
this problem two ways: first using one of the CAT functions, and
second without using any CAT functions*/
*Using CAT functions;
Program
data a15031.study;
set a15031.study;
length GroupDose $ 6;
GroupDose = catx('-',Group,Dose);
*CATX joins the two values with a "-" separator and removes leading and
trailing blanks;
run;
title "Listing of STUDY";
proc print data=a15031.study noobs;run;
*Without using CAT functions;
data a15031.study;
set a15031.study;
length GroupDose $ 6;
GroupDose = trim(Group) || '-' || Dose;
*TRIM removes the trailing blanks from Group, and the concatenation
operator || joins the pieces;
run;
title "Listing of STUDY";
proc print data=a15031.study noobs;
run;
Output
Question
/*12.8 Notice in the listing of data set Study in Problem 6 that the
variable called Weight contains units (either lbs or kgs). These
units are not always consistent in case and may or may not contain a
period. Assume an upper- or lowercase LB indicates pounds and an
upper- or lowercase KG indicates kilograms. Create a new, temporary
SAS data set (Study) with a numeric variable also called Weight
(careful here) that represents weight in pounds, rounded to the
nearest 10th of a pound. Note: 1 kilogram = 2.2 pounds*/
Program
data a15031.study;
set a15031.study(keep=Weight rename=(Weight = WeightUnits));
*COMPRESS with the 'kd' modifiers keeps only the digits of WeightUnits,
and INPUT converts the resulting string to a number;
Weight = input(compress(WeightUnits,,'kd'),8.);
if find(WeightUnits,'KG','i') then Weight = round(2.2*Weight,.1);
else if find(WeightUnits,'LB','i') then Weight = round(Weight,.1);
*the 'i' modifier makes FIND ignore case when it looks for the units;
run;
title "Listing of STUDY";
proc print data=a15031.study noobs;
run;
Output
Question
/*12.10 Data set Errors contains character variables Subj (3 bytes)
and PartNumber (8 bytes). (See the partial listing here.) Create a
temporary SAS data set (Check1) with any observation in Errors that
violates either of the following two rules: first, Subj should
contain only digits, and second, PartNumber should contain only the
uppercase letters L and S and digits.
Here is a partial listing of Errors:*/
Program
data a15031.violates_rules;
set a15031.errors;
*NOTDIGIT returns the position of the first character that is not a
digit, and VERIFY returns the position of the first character that is
not in the list. TRIM is needed because, without it, both functions
would return the position of the first trailing blank;
where notdigit(trim(Subj)) or
verify(trim(PartNumber),'0123456789LS');
run;
title "Listing of VIOLATES_RULES";
proc print data=a15031.violates_rules noobs;
run;
Output
Question
/*12.12 List the subject number (Subj) for any observations in
Errors where PartNumber contains an upper- or lowercase X or D.*/
Program
title "Subjects with X or D in PartNumber";
proc print data=a15031.errors noobs;
*FINDC with the 'i' modifier finds an X or a D in PartNumber in either
case;
where findc(PartNumber,'XD','i');
var Subj PartNumber;
run;
Output
Question
/*12.14 List all patients in the Medical data set where the word
antibiotics is in the comment field (Comment).*/
Program
proc print data=a15031.medical;
*INDEXW searches the variable Comment for the whole word antibiotics
(the search is case-sensitive);
where indexw(Comment,'antibiotics');
run;
Output
Question
/*12.16 Provide a list, in alphabetical order by last name, of the
observations in the Names_And_More data set. Set the length of the
last name to 15 and remove multiple blanks from Name.
Note: The variable Name contains a first name, one or more spaces,
and then a last name.*/
Program
data a15031.names;
set a15031.names_and_more;
length Last $ 15;
Name = compbl(Name);*COMPBL reduces multiple blanks to a single blank;
Last = scan(Name,2,' ');
*SCAN extracts the second word of Name and stores it in the variable
Last;
run;
proc sort data=a15031.names;
by Last;
run;
title "Observations in NAMES_AND_MORE in "
"Alphabetical Order";
proc print data=a15031.names;
id Name;
var Phone Height Mixed;
run;
Output
Learnings from this chapter
• The ways to perform concatenation of strings
• The ways to calculate the length of the string
• The ways to remove leading and trailing blanks from a string
• Using COMPRESS and the NOT functions such as NOTDIGIT
• Using INDEXW to find a word in a variable
• Using NOTDIGIT and VERIFY to detect invalid characters in a value
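
The effect of TRIM on the validation functions can be checked with a short DATA _NULL_ step. The two values below are made up; they only mimic the 3-byte Subj and 8-byte PartNumber variables of the Errors data set.

```sas
data _null_;
   length Subj $ 3 PartNumber $ 8;
   Subj = '12';            /* stored as '12' plus one trailing blank */
   PartNumber = 'L12x';    /* lowercase x breaks the "L, S and digits" rule */
   NoTrim   = notdigit(Subj);          /* finds the trailing blank */
   WithTrim = notdigit(trim(Subj));    /* 0: every character is a digit */
   BadPos   = verify(trim(PartNumber), '0123456789LS');
   HasXorD  = findc(PartNumber, 'XD', 'i');
   put NoTrim= WithTrim= BadPos= HasXorD=;
run;
```

The log should show NoTrim=3, WithTrim=0, BadPos=4 and HasXorD=4. The nonzero NoTrim value is the false alarm that TRIM avoids.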
Chapter 13- Working with arrays
Question
/*13.1 Using the SAS data set Survey1, create a new, temporary SAS data
set (Survey1) where the values of the variables Ques1–Ques5 are
reversed as follows: 1 → 5; 2 → 4; 3 → 3; 4 → 2; 5 → 1.
Note: Ques1–Ques5 are character variables. Accomplish this using an
array.*/
*Data set SURVEY;
Program
data a15031.survey;
infile 'c:\books\learning\survey.txt' pad;
input ID : $3.
Gender : $1.
Age
Salary
(Ques1-Ques5)(1.);
run;
proc format library=a15031;
value $gender 'M' = 'Male'
'F' = 'Female'
' ' = 'Not entered'
other = 'Miscoded';
value age low-29 = 'Less than 30'
30-50 = '30 to 50'
51-high = '51+';
value $likert '1' = 'Strongly disagree'
'2' = 'Disagree'
'3' = 'No opinion'
'4' = 'Agree'
'5' = 'Strongly agree';
run;
data a15031.survey1;
set a15031.survey1;
array Ques{5} $ Q1-Q5;
*the array Ques refers to the character variables Q1 to Q5;
do i = 1 to 5;
Ques{i} = translate(Ques{i},'54321','12345');
*the DO loop visits each question in the array, and TRANSLATE maps
1 to 5, 2 to 4, 3 to 3, 4 to 2 and 5 to 1;
end;
drop i;
run;
title "List of SURVEY1 ";
proc print data=a15031.survey1;
run;
Output
Question
/*13.2 Redo Problem 1, except use data set Survey2.
Note: Ques1–Ques5 are numeric variables.*/
Program
data a15031.survey2;
set a15031.survey2;
array Ques{5} Q1-Q5;
do i = 1 to 5;
Ques{i} = 6 - Ques{i};
end;
drop i;
run;
title "List of SURVEY2 ";
proc print data=a15031.survey2;
run;
Output
Question
/*13.4 Data set Survey2 has five numeric variables (Q1–Q5), each
with values of 1, 2, 3, 4, or 5. You want to determine for each
subject (observation) if they responded with a 5 on any of the five
questions. This is easily done using the OR or the IN
operators. However, for this question, use an array to check each of
the five questions. Set variable (ANY5) equal to Yes if any of the
five questions is a 5 and No otherwise.*/
Program
data a15031.any5;
set a15031.survey2;
array Ques{5} Q1-Q5;
Any5 = 'No ';
do i = 1 to 5;
if Ques{i} = 5 then do;
Any5 = 'Yes';
leave;
end;
end;
drop i;
run;
title "Listing of ANY5";
proc print data=a15031.any5 noobs;
run;
Output
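As a cross-check on the loop above, the same flag can be derived in one step with the WHICHN function, which returns the position of the first matching argument or 0 when there is none. This is a sketch on hypothetical inline data (work.any5_demo), not the method the question asks for.

```sas
/* Hypothetical demo data: five numeric responses per subject */
data work.any5_demo;
   input ID $ Q1-Q5;
   array Ques{5} Q1-Q5;
   /* WHICHN searches the array elements for the value 5 */
   if whichn(5, of Ques{*}) > 0 then Any5 = 'Yes';
   else Any5 = 'No';
datalines;
001 1 2 3 4 5
002 1 1 2 2 3
;
run;

proc print data=work.any5_demo noobs;
run;
```

Subject 001 should be flagged Yes and subject 002 No, matching what the DO loop with LEAVE would produce for the same rows.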
Learnings from this chapter
• Ways to create arrays
• Ways of using arrays to create new variables
• Converting values to missing character values and missing numeric values
• Importance of temporary arrays
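A temporary array holds values that are needed during the DATA step but should not become variables in the output data set. The sketch below scores a short quiz against an answer key. The key, the data set name work.score_demo, and the responses are all invented for illustration.

```sas
/* Hypothetical demo: score five answers against a key held in a temporary array */
data work.score_demo;
   /* _TEMPORARY_ elements are retained and are not written to the data set */
   array Key{5} $ 1 _temporary_ ('A' 'C' 'B' 'D' 'A');
   array Ans{5} $ 1 Ans1-Ans5;
   input @1 ID $3. @5 (Ans1-Ans5) ($1.);
   Score = 0;
   do i = 1 to dim(Key);
      /* a true comparison evaluates to 1, a false one to 0 */
      Score = Score + (Key{i} = Ans{i});
   end;
   drop i;
datalines;
001 ACBDA
002 ACBAA
;
run;

proc print data=work.score_demo noobs;
run;
```

Subject 001 matches the key on all five items and subject 002 on four, so Score should be 5 and 4. The key values never appear as columns in work.score_demo.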
Chapter 14 - Displaying your Data
Question
/*14.2 Using the data set Sales, create the report shown here:*/
Program
proc sort data=a15031.sales out=a15031.sales;
by Region;
run;
title "Sales ";
proc print data=a15031.sales;
by Region;
id Region;
var Quantity TotalSales;
sumby Region;
run;
Output
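The BY and ID combination can be tried on a small inline data set. This sketch uses a hypothetical data set (work.sales_demo) with made-up figures, and it uses a SUM statement to request the totals rather than the SUMBY statement used above.

```sas
/* Hypothetical demo data: a few sales rows in two regions */
data work.sales_demo;
   input Region $ Quantity TotalSales;
datalines;
North 10 100
South 5 80
North 20 250
South 15 300
;
run;

/* BY-group processing in PROC PRINT needs sorted input */
proc sort data=work.sales_demo;
   by Region;
run;

title "Sales by Region (demo)";
proc print data=work.sales_demo;
   by Region;
   id Region;
   var Quantity TotalSales;
   /* SUM prints a subtotal per BY group and a grand total */
   sum Quantity TotalSales;
run;
```

Listing the same variable in both BY and ID makes PROC PRINT show the region once per group instead of on every row.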
Question
/*14.1 List the first 10 observations in data set Blood. Include only the
variables Subject, WBC (white blood cell), RBC (red blood cell), and Chol.
Label the last three variables “White Blood Cells,” “Red Blood Cells,” and
“Cholesterol,” respectively. Omit the Obs column, and place Subject in the
first column. Be sure the column headings are the variable labels, not the
variable names.*/
Program
title " The First 10 Observations in BLOOD data";
proc print data=a15031.blood(obs=10) label;
id Subject;
var WBC RBC Chol;
label WBC = 'White Blood Cells'
RBC = 'Red Blood Cells'
Chol = 'Cholesterol';
run;
Output
Question
/* 14.3 Use PROC PRINT (without any DATA steps) to create a listing like
the one here. Note: The variables in the Hosp data set are Subject,
AdmitDate (Admission Date), DischrDate (Discharge Date), and DOB (Date of
Birth).*/
Program
proc print data=a15031.hosp
n='Number of Patients = '
label
double;
where Year(AdmitDate) eq 2004 and
Month(AdmitDate) eq 9 and
yrdif(DOB,AdmitDate,'Actual') ge 83;
id Subject;
var DOB AdmitDate DischrDate;
label AdmitDate = 'Admission Date'
DischrDate = 'Discharge Date'
DOB = 'Date of Birth';
run;
Output
Question
/*14.4 List the first five observations from data set Blood. Print only
variables Subject, Gender, and BloodType. Omit the Obs column.*/
Program
title "First 5 Observations";
proc print data=a15031.blood(obs=5) noobs;
var Subject Gender BloodType;
run;
Output
Learnings from this chapter
• Ways to view a summary of the data
• Listing observations
• Changing the appearance of the listing
• Sorting by multiple variables
• Computing totals across variables
Chapter 15 – Creating Customized Reports
Question
/*15.2 Using the Blood data set, produce a summary report showing the
average WBC and RBC count for each value of Gender as well as an overall
average. Your report should look like this:*/
Program
proc report data=a15031.blood nowd headline;
column Gender WBC RBC;
define Gender / group width=6;
define WBC / analysis mean "Average WBC"
width=7 format=comma6.0;
define RBC / analysis mean "Average RBC"
width=7 format=5.2;
rbreak after / dol summarize;
run;
quit;
Output
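The group means and the overall summary line can be verified by hand on a tiny data set. The sketch below assumes a hypothetical data set (work.blood_demo) with four invented rows, so the expected averages are easy to compute.

```sas
/* Hypothetical demo data: two females and two males */
data work.blood_demo;
   input Gender $ WBC RBC;
datalines;
Female 7000 4.5
Female 6000 5.0
Male 8000 5.5
Male 9000 6.0
;
run;

proc report data=work.blood_demo nowd;
   column Gender WBC RBC;
   /* GROUP collapses the rows to one line per Gender value */
   define Gender / group "Gender";
   define WBC / analysis mean "Average WBC" format=comma8.0;
   define RBC / analysis mean "Average RBC" format=8.2;
   /* RBREAK AFTER with SUMMARIZE adds the overall mean as a final line */
   rbreak after / summarize;
run;
```

For these rows the report should show 6,500 and 4.75 for Female, 8,500 and 5.75 for Male, and 7,500 and 5.25 on the summary line.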
Question
/*15.4 Using the SAS data set BloodPressure, compute a new variable in
your report. This variable (Hypertensive) is defined as Yes for females
(Gender=F) if the SBP is greater than 138 or the DBP is greater than 88 and
No otherwise. For males (Gender=M), Hypertensive is defined as Yes if the
SBP is over 140 or the DBP is over 90 and No otherwise. Your report should
look like this:*/
Program
proc report data=a15031.bloodpressure nowd;
column Gender SBP DBP Hypertensive;
define Gender / Group width=6;
define SBP / display width=5;
define DBP / display width=5;
define Hypertensive / computed "Hypertensive?" width=13;
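compute Hypertensive / character length=3;
/* Apply the female thresholds first, then the male thresholds;
every other case is No. Chaining with ELSE IF keeps the second
test from overwriting a Yes already assigned to a female. */
if Gender = 'F' and (SBP gt 138 or DBP gt 88)
then Hypertensive = 'Yes';
else if Gender = 'M' and (SBP gt 140 or DBP gt 90)
then Hypertensive = 'Yes';
else Hypertensive = 'No';
endcomp;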
run;
quit;
Output
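The thresholds in the question can be checked outside PROC REPORT with a plain DATA step, which makes the boundary cases easy to inspect. This is a sketch on hypothetical rows (work.bp_demo) chosen to sit on and just above each cutoff.

```sas
/* Hypothetical demo data: values at and just above the cutoffs */
data work.bp_demo;
   input Gender $ SBP DBP;
   length Hypertensive $ 3;
   /* female rule: SBP above 138 or DBP above 88 */
   if Gender = 'F' and (SBP gt 138 or DBP gt 88) then Hypertensive = 'Yes';
   /* male rule: SBP above 140 or DBP above 90 */
   else if Gender = 'M' and (SBP gt 140 or DBP gt 90) then Hypertensive = 'Yes';
   else Hypertensive = 'No';
datalines;
F 139 80
F 138 88
M 139 89
M 141 70
;
run;

proc print data=work.bp_demo noobs;
run;
```

The four rows should come out as Yes, No, No, Yes. A single IF / ELSE IF / ELSE chain is used so that a Yes assigned by the female rule cannot be reset by the male rule.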
Question
/*15.6 Using the SAS data set BloodPressure, produce a report showing
Gender, Age, SBP, and DBP. Order the report in Gender and Age order as shown
here:*/
Program
proc report data=a15031.bloodpressure nowd;
column Gender Age SBP DBP;
define Gender / order width=6;
define Age / order width=5;
define SBP / display "Systolic Blood Pressure" width=8;
define DBP / display "Diastolic Blood Pressure" width=9;
run;
quit;
Output
Question
/*15.8 Using the data set Blood, produce a report like the one here. The
numbers in the table are the average WBC and RBC counts for each
combination of blood type and gender.*/
Program
proc report data=a15031.blood nowd headline;
column BloodType Gender,WBC Gender,RBC;
define BloodType / group 'Blood Type' width=5;
define Gender / across width=8 '-Gender-';
define WBC / analysis mean format=comma8.;
define RBC / analysis mean format=8.2;
run;
quit;
Output
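An ACROSS variable turns each of its values into a column heading. The sketch below shows the idea on a hypothetical four-row data set (work.blood_demo2). It nests both statistics under each gender with Gender,(WBC RBC), which is a different column layout from the one used above.

```sas
/* Hypothetical demo data: one row per blood type and gender */
data work.blood_demo2;
   input BloodType $ Gender $ WBC RBC;
datalines;
A Female 7000 4.5
A Male 8000 5.5
B Female 6000 5.0
B Male 9000 6.0
;
run;

proc report data=work.blood_demo2 nowd;
   /* the comma nests WBC and RBC beneath each value of Gender */
   column BloodType Gender,(WBC RBC);
   define BloodType / group "Blood Type";
   define Gender / across "Gender";
   define WBC / analysis mean format=comma8.;
   define RBC / analysis mean format=8.2;
run;
```

Writing Gender,WBC Gender,RBC instead, as in the solution, groups the columns by statistic first and by gender second.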
Learnings from this chapter
• The importance and usage of PROC REPORT
• Customising the report using the options available in PROC REPORT
• Changing the order of the variables in the COLUMN statement

Learning SAS by Example -A Programmer’s Guide by Ron CodySolution

  • 1. 1 2015 Praxis Business School Vibeesh C S Solution for Even Numbered Problems For Chapters 7-15 from Learning SAS by Example - A Programmer’s Guide by Ron Cody
  • 2. 2 Chapter 7- Performing Conditional Processing Question /* 7.2 Using the SAS data set Hosp, use PROC PRINT to list observations for Subject values of 5, 100, 150, and 200. Do this twice, once using OR operators and once using the IN operator. Note: Subject is a numeric variable */ Program data a15031.hosp99l4; set a15031.hosp; *USING "OR"; where Subject eq 5 or Subject eq 100 or Subject eq 150 or Subject eq 200; run; proc print data=a15031.hosp99l4; run; data a15031.hosin; *using "IN"; set a15031.hosp; where Subject in(5,100,150,200); run; proc print data=a15031.hosin; run; Output using OR Output using OR
  • 3. 3 Question /*4. Using the Sales data set, create a new, temporary SAS data set containing Region and Total Sales plus a new variable called Weight with values of 1.5 for the North Region, 1.7 for the South Region, and 2.0 for the West and East Regions. Use a SELECT statement to do this */ Program data a15031.sales11q; set a15031.sales(keep = Region Totalsales); *keep function use to keep totlsales and eliminate the other variables; select; when (Region = 'North') Weight = 1.5;*WHEN function subset the data; when (Region = 'South') Weight = 1.7; when (Region = 'East') Weight = 2.0; when (Region = 'West') Weight = 2.0; otherwise; end; proc print data=a15031.sales11q; run; Output
  • 4. 4 Question /*6. Using the Sales data set, list all the observations where Region is North and Quantity is less than 60. Include in this list any observations where the customer name (Customer) is Pet's are Us */ Program data a15031.sal55; set a15031.sales; where Region eq "North" and Quantity < 60; * Region is North and Quantity is less than 60 using where function; run; proc print data=a15031.sal55; run; Learnings from this chapter  The importance of using Keep and drop functions in data step which allows to select the required variables for doing analysis. If dataset has large number of variables we can study only the variables of interest by using these functions  The importance of where statement in data step that allows us to execute the filters in the dataset in accordance with the requirements  The importance of using Boolean functions that allows us to execute the conditions  The importance of Select statement that allows us to implement the customized selection of both variables associated with conditions
  • 5. 5 Chapter 8 – Performing Iterative Processing Question /*8.2 Run the program here to create a temporary SAS data set (MonthSales): data monthsales; input month sales @@; ---add your line(s) here--- datalines; 1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500 9 5100 10 5700 11 6500 12 7500 ; Modify this program so that a new variable, SumSales, representing Sales to date, is added to the data set. Be sure that the missing value for Sales in month 3 does not result in a missing value for SumSales */ Program data a15031.monthsales; input month sales @@; *DOUBLE Trailing procedure to read the data set ; datalines; 1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500 9 5100 10 5700 11 6500 12 7500 ; proc print data=a15031.monthsales; run; data a15031.modifiedsales; set a15031.monthsales; sumsales+sales; *sum function; *RETAIN function for initiate and return value ; retain sumsales 0; run; proc print data=a15031.modifiedsales; run; Output
  • 6. 6 Question /*8.4 Count the number of missing values for the variables A, B, and C in the Missing data set. Add the cumulative number of missing values to each observation (use variable names MissA, MissB, and MissC). Use the MISSING function to test for the missing values */ Program data a15031.missing1; input G $ A B C ; *using sum function in if statement to calculate num of missing value; if missing(G) then COUNTG+1; if missing(A) then COUNTA+1; if missing(B) then COUNTB+1; if missing(C) then COUNTC+1; datalines; M 56 68 89 F 33 60 71 M 45 91 . F 35 35 68 M . 71 81 M 50 68 71 . 23 60 46 M 65 72 103 . 35 65 67 M 15 71 75 ; proc print data=a15031.missing1 NOOBS; run; Output
  • 7. 7 Question /*8.6 Repeat Problem 5, except have the range of N go from 5 to 100 by 5 */ Program data a15031.loger2; do n = 5 to 100 by 5;*using do loop creating values from 5 to 100 by 5; log_of_n=log(n); output; end; run; proc print data=a15031.loger2; run; Output
  • 8. 8 Question /*8.8 Use an iterative DO loop to plot the following equation: Logit = log(p / (1 – p))Use values of p from 0 to 1 (with a point at every .05). Using the following GPLOT statements will produce a very nice plot. (If you do not have SAS/GRAPH software, use PROC PLOT to plot your points). goptions reset=all ftext='arial' htext=1.0 ftitle='arial/bo' htitle=1.5 colors=(black); symbol v=none i=sm; title "Logit Plot"; proc gplot data=logitplot; plot Logit * p;run;quit;*/ Program data a15031.itrative1; do p= 0 to 1 by 0.05;*using DO loop creating values from 0 to 1 by 0.05;logit=log(p/(1-p)); output; end;run; goptions reset=all ftext='arial' htext=1.0 ftitle='arial/bo' htitle=1.5 colors=(black); symbol v=none i=sm; title "Logit Plot"; proc gplot data=a15031.itrative1; plot logit * p; *plot function to draw a graph; run;quit; proc print data=a15031.itrative1; run; Output
  • 9. 9 Question /*8.10 You are testing three speed-reading methods (A, B, and C) by randomly assigning10 subjects to each of the three methods. You are given the results as three lines of reading speeds, each line representing the results from each of the three methods,respectively. Here are the results: 250 255 256 300 244 268 301 322 256 333 267 275 256 320 250 340 345 290 280 300 350 350 340 290 377 401 380 310 299 399 Create a temporary SAS data set from these three lines of data. Each observation should contain Method (A, B, or C), and Score. There should be 30 observations inthis data set. Use a DO loop to create the Method variable and remember to use asingle trailing @ in your INPUT statement. Provide a listing of this data set using PROC PRINT */ Program data a15031.speed; do method = "method_a" ,"method_b", "method_c" ; do n= 1 to 10;*creating values using do loop; input score@;*single trail function read the data; output; end;end; datalines; 250 255 256 300 244 268 301 322 256 333 267 275 256 320 250 340 345 290 280 300 350 350 340 290 377 401 380 310 299 399 ;proc print data=a15031.speed; run; Output
  • 10. 10 Question /* 8.12 You place money in a fund that returns a compound interest of 4.25% annually. You deposit $1,000 every year. How many years will it take to reach $30,000? Do not use compound interest formulas. Rather, use “brute force” methods with DO WHILE or DO UNTIL statements to solve this problem */ Program data a15031.money; interest=0.0424; total=1000; do until (total gt 30000) ; year+1; total=total+interest*total; output;end; run; proc print data=a15031.money; format total dollar10.2; run; Output
  • 11. 11 Question /*14. Generate a table of integers and squares starting at 1 and ending when the square value is greater than 100. Use either a DO UNTIL or DO WHILE statement to accomplish this*/ Program data a15031.table; do n=1 to 100 until (square ge 100); square= n**2; *using do until taking values from 1 to 100 and specifying the condition for squares variable to stop the loop when it reaches 100; output; end; run; proc print data=a15031.table ; run; Output Learnings from this chapter  The importance of Sum and Retain functions  Using Sum function to find the number of missing values  The importance of do loop in executing iterative conditions  Using single trial functions and double trial functions to read the data  Using Do While and Do Until Statements
  • 12. Chapter 9 – Working with Dates
Question
/* 9.2 Using the following lines of data, create a temporary SAS data set called ThreeDates. Each line of data contains three dates, the first two in the form mm/dd/yyyy and the last in the form ddmmmyyyy. Name the three date variables Date1, Date2, and Date3. Format all three using the MMDDYY10. format. Include in your data set the number of years from Date1 to Date2 (Year12) and the number of years from Date2 to Date3 (Year23). Round these values to the nearest year. Here are the lines of data (note that the columns do not line up):
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005 */
Program
data a15031.threedate;
*list input with the colon modifier, because the columns do not line up;
input date1 : mmddyy10.
date2 : mmddyy10.
date3 : date9.;
format date1 date2 date3 mmddyy10.;
*yrdif returns the number of years between two dates and round takes it to the nearest year;
Year12 = round(yrdif(date1,date2,"actual"));
Year23 = round(yrdif(date2,date3,"actual"));
datalines;
01/03/1950 01/03/1960 03Jan1970
05/15/2000 05/15/2002 15May2003
10/10/1998 11/12/2000 25Dec2005
;
proc print data=a15031.threedate noobs;
run;
Output
  • 13. Question
/* 9.4 Using the Hosp data set, compute the subject’s ages two ways: as of January 1, 2006 (call it AgeJan1), and as of today’s date (call it AgeToday). The variable DOB represents the date of birth. Take the integer portion of both ages. List the first 10 observations */
Hint: use yrdif to find the difference between DOB and the target date, and int to keep only the integer part of the difference.
Program
data a15031.hospp;
set a15031.hosp;
AgeToday = int(yrdif(DOB,today(),"actual"));
AgeJan1 = int(yrdif(DOB,"01jan2006"d,"actual"));
run;
proc print data=a15031.hospp(obs=10);
run;
Output
Question
/* 9.6 Using the Medical data set, compute frequencies for the days of the week for the date of the visit (VisitDate). Supply a format for the days of the week and months of the year */
Program
data a15031.medical;
input @1 VisitDate mmddyy10.
@12 patno $3.;
format visitdate date9.;
day_of_week = weekday(visitdate); *weekday number of the visit, 1 = Sunday through 7 = Saturday;
month_of_year = month(visitdate); *month number of the visit, 1 through 12;
  • 14.
datalines;
11/29/2003 879
11/30/2003 880
09/04/2003 883
08/28/2003 884
09/04/2003 885
08/26/2003 886
08/31/2003 887
08/25/2003 888
11/16/2003 913
11/15/2003 914
;
*user-defined formats that label the weekday and month numbers;
proc format;
value dayfmt 1='Sunday' 2='Monday' 3='Tuesday' 4='Wednesday' 5='Thursday' 6='Friday' 7='Saturday';
value monfmt 1='January' 2='February' 3='March' 4='April' 5='May' 6='June' 7='July' 8='August' 9='September' 10='October' 11='November' 12='December';
run;
proc freq data=a15031.medical;
tables day_of_week month_of_year;
format day_of_week dayfmt. month_of_year monfmt.;
run;
proc print data=a15031.medical;
run;
Output
Question
/* 9.8 Using the values for Day, Month, and Year in the raw data below, create a temporary SAS data set containing a SAS date based on these values (call it Date) and format this value using the MMDDYY10. format. Here are the Day, Month, and Year values:
25 12 2005
1 1 1960
21 10 1946 */
Program
data a15031.date_it;
input Day Month Year;
datalines;
25 12 2005
1 1 1960
21 10 1946
;
data a15031.date_it1;
set a15031.date_it; *the SET statement reads the data into the new data set;
  • 15.
Date = mdy(Month,Day,Year); *mdy builds a SAS date from the month, day and year values;
format Date mmddyy10.; *display the date as mm/dd/yyyy;
run;
proc print data=a15031.date_it1;
run;
Output
Question
/* 9.10 Using the Hosp data set, compute the number of months from the admission date (AdmitDate) and December 31, 2007 (call it MonthsDec). Also, compute the number of months from the admission date to today's date (call it MonthsToday). Use a date interval function to solve this problem. List the first 20 observations for your solution */
Program
data a15031.monthdec;
set a15031.hosp; *read the hosp data from the permanent library;
*the hosp data set is in the blog folder uploaded to the dropbox;
MonthsDec = intck('month',AdmitDate,'31dec2007'd); *intck counts the month boundaries between the admission date and 31Dec2007;
MonthsToday = intck('month',AdmitDate,today());
run;
proc print data=a15031.monthdec(obs=20);
run;
Output
  • 16. Question
/* 9.12 You want to see each patient in the Medical data set on the same day of the week 5 weeks after they visited the clinic (the variable name is VisitDate). Provide a listing of the patient number (Patno), the visit date, and the date for the return visit */
Program
data a15031.med;
set a15031.medical;
Followdate = intnx('week',VisitDate,5,'sameday'); *intnx moves the visit date forward 5 weeks and sameday keeps the same day of the week;
run;
proc print data=a15031.med;
var patno VisitDate Followdate;
format Followdate VisitDate date9.;
run;
Output
Learnings from this chapter
• The ways to read date variables
• The ways to store date variables
• The ways to extract the day of the week or the month
• Providing formats to dates
• The importance and usage of the intck and intnx functions
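The chapter uses three date functions that are easy to confuse: YRDIF, INTCK and INTNX. The sketch below contrasts them on two made-up dates (they are not taken from Hosp or Medical, and the variable names are hypothetical). It assumes the default behaviour of INTCK, which counts interval boundaries crossed, not elapsed time.

```sas
/* Illustration only: the dates and variable names below are made up */
data _null_;
    start = '31dec2006'd;
    stop  = '01jan2007'd;
    months_crossed = intck('month', start, stop);        /* month boundaries crossed */
    years_elapsed  = yrdif(start, stop, 'actual');        /* elapsed time in years */
    return_visit   = intnx('week', start, 5, 'sameday');  /* same weekday, 5 weeks later */
    put months_crossed= years_elapsed= return_visit= date9.;
run;
```

months_crossed is 1 even though the two dates are one day apart, because a month boundary lies between them. return_visit is 04FEB2007, which is 35 days after the start date. This is why 9.10 can report a month count that looks larger than the elapsed time suggests.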
  • 17. Chapter 10 – Subsetting and Combining SAS Datasets
Question
/*10.2 Using the SAS data set Hosp, create a temporary SAS data set called Monday2002, consisting of observations from Hosp where the admission date (AdmitDate) falls on a Monday and the year is 2002. Include in this new data set a variable called Age, computed as the person’s age as of the admission date, rounded to the nearest year;*/
Program
data a15031.monday20122;
set a15031.hosp;
Admit_day = weekday(AdmitDate); *weekday of admission, 1 = Sunday;
admit_year = year(AdmitDate); *year of admission;
admit_month = month(AdmitDate); *month of admission;
day_of_admit = day(AdmitDate); *day of the month of admission;
run;
proc print data=a15031.monday20122;
run;
Output
Program
data a15031.monday2002;
set a15031.monday20122;
where Admit_day = 2 and admit_year = 2002; *2 = Monday;
Age = round(yrdif(DOB,AdmitDate,'Actual')); *age at admission, rounded to the nearest year;
run;
proc print data=a15031.monday2002;
format DOB AdmitDate date9.;
run;
  • 18. Output
Question
/*10.4 Using the SAS data set Bicycles, create two temporary SAS data sets as follows: Mountain_USA consists of all observations from Bicycles where State is Uttar Pradesh and Model is Mountain. Road_France consists of all observations from Bicycles where State is Maharastra and Model is Road Bike. Print these two data sets */
Program
title "bicycle";
data a15031.bicycle; *create a working copy of the data set;
set a15031.bicycles;
run;
proc contents data=a15031.bicycle; *check the contents of the data set;
run;
proc print data=a15031.bicycle;
run;
data a15031.mountain_usa a15031.road_france;
set a15031.bicycle;
*the IF statements subset on state and model and send each observation to one of the two data sets named above;
*lowcase makes the model comparison independent of case;
if state = "Uttar Pradesh" and lowcase(model) = "mountain bike" then output a15031.mountain_usa;
else if state = "Maharastra" and lowcase(model) = "road bike" then output a15031.road_france;
run;
proc print data=a15031.mountain_usa;
run;
proc print data=a15031.road_france;
run;
  • 19. Output
Question
/*10.6 Repeat Problem 5, except this time sort Inventory and NewProducts first (create two temporary SAS data sets for the sorted observations). Next, create a new, temporary SAS data set (Updated) by interleaving the two temporary, sorted SAS data sets. Print out the result.*/
Program
proc sort data=a15031.inventory out=a15031.inventory;
by Model; *both data sets must be sorted by the common variable before interleaving;
run;
proc sort data=a15031.newproducts out=a15031.newproducts;
by Model;
run;
data a15031.updated;
set a15031.inventory a15031.newproducts; *a SET statement followed by BY interleaves the two sorted data sets;
by Model;
run;
title "Listing of UPDATED";
proc print data=a15031.updated;
run;
Output
  • 20. Question
/* 10.8 Run the program here to create a SAS data set called Markup:
data markup;
input manuf : $10. Markup;
datalines;
Cannondale 1.05
Trek 1.07
;
Combine this data set with the Bicycles data set so that each observation in the Bicycles data set now has a markup value of 1.05 or 1.07, depending on whether the bicycle is made by Cannondale or Trek. In this new data set (call it Markup_Prices), create a new variable (NewTotal) computed as TotalCost times Markup */
Program
data a15031.markup;
input manuf : $10. Markup;
datalines;
Cannondale 1.05
Trek 1.07
;
proc print data=a15031.markup;
run;
proc contents data=a15031.markup;
run;
*both data sets must be sorted by manuf before the merge;
proc sort data=a15031.bicycle;
by manuf;
run;
proc sort data=a15031.markup;
by manuf;
run;
data a15031.markup_prices; *combine the markup data set with bicycle using the MERGE statement;
merge a15031.bicycle a15031.markup;
by manuf;
NewTotal = TotalCost*Markup; *TotalCost is the variable named in the question;
run;
proc print data=a15031.markup_prices; *merged data set;
run;
Output
  • 21. Question
/*10.10 Using the Purchase and Inventory data sets, provide a list of all Models (and the Price) that were not purchased*/
Program
proc sort data=a15031.inventory out=a15031.inventory;
by Model; *sort the inventory data by model;
run;
proc sort data=a15031.purchase out=a15031.purchase;
by Model; *sort the purchase data by model;
run;
data a15031.not_bought;
*merge the sorted data sets and flag which data set each observation comes from;
merge a15031.inventory(in=InInventory)
a15031.purchase(in=InPurchase);
by Model;
if InInventory and not InPurchase;
keep Model Price; *keep only model and price and drop the other variables;
run;
title "Listing of NOT_BOUGHT";
proc print data=a15031.not_bought noobs;
run;
Output
Question
/*10.12 You want to merge two SAS data sets, Demographic and Survey1, based on an identifier. In Demographic, this identifier is called ID; in Survey1, the identifier is called Subj. Both are character variables.*/
Program
data a15031.demographic;
input ID : $3. DOB : mmddyy10. Gender : $1.;
format DOB mmddyy10.;
datalines;
012 10/10/37 M
  • 22.
535 7/12/87 F
723 1/5/2000 M
007 6/4/1966 F
;
*Data set SURVEY1;
data a15031.survey1;
input Subj : $3. (Q1-Q5)($1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
*Data set SURVEY2;
data a15031.survey2;
input ID (Q1-Q5)(1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;
proc sort data=a15031.demographic out=demographic;
by ID;
run;
proc sort data=a15031.survey1 out=survey1;
by Subj;
run;
data a15031.combinech10;
*merge the sorted copies and rename Subj so both data sets share the BY variable;
merge demographic survey1(rename=(Subj = ID));
by ID;
run;
title "Listing of COMBINE";
proc print data=a15031.combinech10 noobs;
run;
Output
  • 23. Question
/*10.14 Data set Inventory contains two variables: Model (an 8-byte character variable) and Price (a numeric value). The price of Model M567 has changed to 25.95 and the price of Model X999 has changed to 35.99. Create a temporary SAS data set (call it NewPrices) by updating the prices in the Inventory data set*/
Program
data a15031.modelnew;
input Model $ Price;
datalines;
M567 25.95
X999 35.99
;
*sorting inventory data by model variable;
proc sort data=a15031.inventory out=inventory;
by Model;
run;
*updating the sorted inventory data with the new prices in modelnew;
data a15031.updatedprices;
update inventory a15031.modelnew;
by Model;
run;
title "Listing of NEWPRICES";
proc print data=a15031.updatedprices noobs;
run;
Output
Learnings from this chapter
• The ways to subset a dataset based on the requirements
• The ways to generate multiple subsets from the data in a single data step
• The ways to manipulate the data
• Adding observations, moving observations from datasets
• How to produce a summary of variables
• Merging two datasets by performing one-to-one, one-to-many and many-to-many joins
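Problem 10.14 uses UPDATE where the earlier problems use MERGE. The difference shows up when the transaction data set has missing values. The sketch below uses two made-up data sets, work.master and work.trans, with hypothetical variables, and assumes the default UPDATE behaviour in which missing transaction values do not overwrite the master.

```sas
/* Hypothetical data: work.master and work.trans are illustration names only */
data work.master;
    input Model $ Price Stock;
datalines;
A100 10.00 5
B200 20.00 7
;

data work.trans;
    input Model $ Price Stock;
datalines;
A100 12.50 .
;

data work.merged;            /* MERGE: the missing Stock replaces the master value */
    merge work.master work.trans;
    by Model;
run;

data work.updated;           /* UPDATE: the missing Stock is ignored */
    update work.master work.trans;
    by Model;
run;

proc print data=work.merged noobs;
run;
proc print data=work.updated noobs;
run;
```

For model A100, work.merged ends up with a missing Stock, while work.updated keeps Stock = 5 and only picks up the new price. That is the reason UPDATE is the safer choice for a price change file that lists only the changed variables.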
  • 24. Chapter 11 – Working with Numeric Functions
Question
/* 11.1 Using the SAS data set Health, compute the body mass index (BMI) defined as the weight in kilograms divided by the height (in meters) squared. Create four other variables based on BMI: 1) BMIRound is the BMI rounded to the nearest integer, 2) BMITenth is the BMI rounded to the nearest tenth, 3) BMIGroup is the BMI rounded to the nearest 5, and 4) BMITrunc is the BMI with a fractional amount truncated. Conversion factors you will need are: 1 Kg equals 2.2 Lbs and 1 inch = .0254 meters */
Program
data a15031.health;
set a15031.health;
BMI = (Weight / 2.2) / (Height*.0254)**2;
BMIRound = round(BMI);
BMITenth = round(BMI,.1);
BMIGroup = round(BMI,5);
BMITrunc = int(BMI);
run;
proc print data=a15031.health;
run;
Output
Question
/* 11.2 Count the number of missing values for WBC, RBC, and Chol in the Blood data set. Use the MISSING function to detect missing values */
  • 25. Program
data a15031.hel;
set a15031.blood; *the blood data set is in the blog folder uploaded to the dropbox folder;
if missing(Gender) then MissG+1;
if missing(WBC) then MissWBC+1;
if missing(RBC) then MissRBC+1;
if missing(Chol) then MissChol+1;
*each sum statement keeps a running count of the missing values of one variable;
run;
proc print data=a15031.hel;
run;
Output
Question
/* 11.4 The SAS data set Psych contains an ID variable, 10 question responses (Ques1–Ques10), and 5 scores (Score1–Score5). You want to create a new, temporary SAS data set (Evaluate) containing the following:
a. A variable called QuesAve computed as the mean of Ques1–Ques10. Perform this computation only if there are seven or more non-missing question values.
b. If there are no missing Score values, compute the minimum score (MinScore), the maximum score (MaxScore), and the second highest score (SecondHighest) */
Program
data a15031.evaluate;
set a15031.psych; *the psych data set is in the blog folder uploaded to the dropbox folder;
if n(of Ques1-Ques10) ge 7 then QuesAve = mean(of Ques1-Ques10);
if n(of Score1-Score5) eq 5 then maxscore = max(of Score1-Score5);
if n(of Score1-Score5) eq 5 then Minscore = min(of Score1-Score5);
  • 26.
if n(of Score1-Score5) eq 5 then SecondHighest = largest(2,of Score1-Score5);
*the IF-THEN statements compute the maximum, minimum and second highest score only when all five scores are present;
run;
proc print data=a15031.evaluate;
run;
Output
Question
/* 11.6 Write a short DATA _NULL_ step to determine the largest integer you can store on your computer in 3, 4, 5, 6, and 7 bytes */
Program
data _null_;
*constant with the exactint argument returns the largest integer stored exactly in the given number of bytes;
int3 = constant('exactint',3);
int4 = constant('exactint',4);
int5 = constant('exactint',5);
int6 = constant('exactint',6);
int7 = constant('exactint',7);
put int3= int4= int5= int6= int7=;
run;
Output of log window
Question
/*11.8 Create a temporary SAS data set (Random) consisting of 1,000 observations, each with a random integer from 1 to 5. Make sure that all integers in the range are equally likely. Run PROC FREQ to test this assumption */
  • 27. Program
data a15031.random;
do i = 1 to 1000;
x = int(rand('uniform')*5) + 1; /* OR x = int(ranuni(0)*5 + 1) */
*rand returns a uniform value between 0 and 1, which is scaled to an integer from 1 to 5;
output;
end;
run;
proc freq data=a15031.random;
tables x / missing;
run;
Output
Question
/* 11.10 Data set Char_Num contains character variables Age and Weight and numeric variables SS and Zip. Create a new, temporary SAS data set called Convert with new variables NumAge and NumWeight that are numeric values of Age and Weight, respectively, and CharSS and CharZip that are character variables created from SS and Zip. CharSS should contain leading 0s and dashes in the appropriate places for Social Security numbers and CharZip should contain leading 0s. Hint: The Z5. format includes leading 0s for the ZIP code */
Program
data a15031.convert;
set a15031.char_num;
NumAge = input(Age,8.);
NumWeight = input(weight,8.);
*the INPUT function converts the character variables weight and age into numeric variables;
CharSS = put(SS,ssn11.);
CharZip = put(Zip,z5.);
*the PUT function converts the numeric variables SS and Zip into character variables;
run;
proc print data=a15031.convert;
run;
  • 28. Output
Question
/* 11.12 Using the Stocks data set (containing variables Date and Price), compute daily changes in the prices. Use the statements here to create the plot. Note: If you do not have SAS/GRAPH installed, use PROC PLOT and omit the GOPTIONS and SYMBOL statements.
goptions reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot of Daily Price Differences";
proc gplot data=difference;
plot Diff*Date;
run;
quit; */
Program
data a15031.price_difference;
set a15031.stocks;
Diff = Dif(Price); *the dif function returns the price minus the price of the previous observation;
run;
goptions reset=all colors=(black) ftext=swiss htitle=1.5;
symbol1 v=dot i=smooth;
title "Plot for Price Differences";
proc gplot data=a15031.price_difference;
plot Diff * Date;
run;
quit;
  • 29. Output
Learnings from this chapter
• The ways of rounding and truncating numerical values
• The ways to detect missing values
• The ways to treat missing values
• The ways to assign data types to missing values
• The usage of random numbers and the ways to generate random numbers
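The difference between rounding and truncating, listed in the learnings above, is easiest to see on a negative value. A minimal sketch with a made-up value (the variable names are hypothetical and not part of the Health data set):

```sas
/* Illustration only: x and the result names are made up */
data _null_;
    x = -2.7;
    rounded   = round(x);   /* nearest integer: -3 */
    truncated = int(x);     /* fractional part dropped: -2 */
    down      = floor(x);   /* largest integer not above x: -3 */
    up        = ceil(x);    /* smallest integer not below x: -2 */
    put x= rounded= truncated= down= up=;
run;
```

INT truncates toward zero, so for negative values it agrees with CEIL and not with FLOOR. For the positive BMI values in 11.1 the distinction does not matter, which is why INT is enough for BMITrunc.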
  • 30. Chapter 12 – Working with Character Functions
Question
/*12.2 Using the data set Mixed, create a temporary SAS data set (also called Mixed) with the following new variables:
a. NameLow – Name in lowercase
b. NameProp – Name in proper case
c. (Bonus – difficult) NameHard – Name in proper case without using the PROPCASE function*/
Program
data a15031.mixed;
set a15031.mixed;
length First Last $ 15 NameHard $ 20;
NameLow = lowcase(Name); *converting the entire name into lower case;
NameProp = propcase(Name); *making the first letter of each word uppercase;
First = lowcase(scan(Name,1,' ')); *first word of the name in lower case;
Last = lowcase(scan(Name,2,' ')); *second word of the name in lower case;
substr(First,1,1) = upcase(substr(First,1,1)); *converting only the first letter into upper case;
substr(Last,1,1) = upcase(substr(Last,1,1)); *converting only the first letter into upper case;
NameHard = catx(' ',First,Last);
drop First Last;
run;
proc print data=a15031.mixed;
run;
Output
  • 31. Question
/*12.4 Data set Names_And_More contains a character variable called Height. As you can see in the listing in Problem 3, the heights are in feet and inches. Assume that these units can be in upper- or lowercase and there may or may not be a period following the units. Create a temporary SAS data set (Height) that contains a numeric variable (HtInches) that is the height in inches.*/
*Data set NAMES_AND_MORE;
Program
data a15031.height;
set a15031.names_and_more(keep = Height);
Height = compress(Height,'INFT.','i');
/* Alternative
Height = compress(Height,' ','kd');
*keep digits and blanks;
*/
Feet = input(scan(Height,1,' '),8.);
Inches = input(scan(Height,2,' '),?? 8.);
*scan extracts the first word (feet) and the second word (inches) and ?? suppresses the error message when the inches part is absent;
if missing(Inches) then HtInches = 12*Feet;
else HtInches = 12*Feet + Inches;
drop Feet Inches;
run;
title "Listing of HEIGHT";
proc print data=a15031.height noobs;
run;
Output
Question
/*12.6 Data set Study (shown here) contains the character variables Group and Dose. Create a new, temporary SAS data set (Study) with a variable called GroupDose by putting these two values together, separated by a dash. The length of the resulting variable should be 6 (test this using PROC CONTENTS or the SAS Explorer). Make sure that there are no blanks (except trailing blanks) in this value. Try this problem two ways: first using one of the CAT functions, and second without using any CAT functions*/
*Using CAT functions;
  • 32. Program
data a15031.study;
set a15031.study;
length GroupDose $ 6;
GroupDose = catx('-',Group,Dose); *catx strips blanks and joins the two values with a dash;
run;
title "Listing of STUDY";
proc print data=a15031.study noobs;
run;
*Without using CAT functions;
data a15031.study;
set a15031.study;
length GroupDose $ 6;
GroupDose = trim(Group) || '-' || Dose;
*trim removes the trailing blanks of Group before the values are concatenated;
run;
title "Listing of STUDY";
proc print data=a15031.study noobs;
run;
Output
Question
/*12.8 Notice in the listing of data set Study in Problem 6 that the variable called Weight contains units (either lbs or kgs). These units are not always consistent in case and may or may not contain a period. Assume an upper- or lowercase LB indicates pounds and an upper- or lowercase KG indicates kilograms. Create a new, temporary SAS data set (Study) with a numeric variable also called Weight (careful here) that represents weight in pounds, rounded to the nearest 10th of a pound. Note: 1 kilogram = 2.2 pounds*/
Program
data a15031.study;
set a15031.study(keep=Weight rename=(Weight = WeightUnits));
*compress with the kd modifiers keeps only the digits of the string and input converts the result to a number;
Weight = input(compress(WeightUnits,,'kd'),8.);
if find(WeightUnits,'KG','i') then Weight = round(2.2*Weight,.1);
else if find(WeightUnits,'LB','i') then Weight = round(Weight,.1);
*find with the i modifier locates the unit while ignoring case;
run;
title "Listing of STUDY";
  • 33.
proc print data=a15031.study noobs;
run;
Output
Question
/*12.10 Data set Errors contains character variables Subj (3 bytes) and PartNumber (8 bytes). (See the partial listing here.) Create a temporary SAS data set (Check1) with any observation in Errors that violates either of the following two rules: first, Subj should contain only digits, and second, PartNumber should contain only the uppercase letters L and S and digits. Here is a partial listing of Errors:*/
Program
data a15031.violates_rules;
set a15031.errors;
*notdigit returns the position of the first character that is not a digit and verify returns the position of the first character that is not in the list;
*trim is needed because, without it, both functions would return the position of the first trailing blank in each value;
where notdigit(trim(Subj)) or
verify(trim(PartNumber),'0123456789LS');
run;
title "Listing of VIOLATES_RULES";
proc print data=a15031.violates_rules noobs;
run;
Output
  • 34. Question
/*12.12 List the subject number (Subj) for any observations in Errors where PartNumber contains an upper- or lowercase X or D.*/
Program
title "Subjects with X or D in PartNumber";
proc print data=a15031.errors noobs;
*findc with the i modifier finds either character in upper or lower case;
where findc(PartNumber,'XD','i');
var Subj PartNumber;
run;
Output
Question
/*12.14 List all patients in the Medical data set where the word antibiotics is in the comment field (Comment).*/
Program
proc print data=a15031.medical;
*indexw finds a whole word in the variable Comment;
where indexw(Comment,'antibiotics');
run;
Output
Question
/*12.16 Provide a list, in alphabetical order by last name, of the observations in the Names_And_More data set. Set the length of the last name to 15 and remove multiple blanks from Name.
  • 35.
Note: The variable Name contains a first name, one or more spaces, and then a last name.*/
Program
data a15031.names;
set a15031.names_and_more;
length Last $ 15;
Name = compbl(Name); *compbl reduces multiple blanks to a single blank;
Last = scan(Name,2,' '); *scan takes the second word of the name and stores it in Last;
run;
proc sort data=a15031.names;
by Last;
run;
title "Observations in NAMES_AND_MORE in " "Alphabetical Order";
proc print data=a15031.names;
id Name;
var Phone Height Mixed;
run;
Output
Learnings from this chapter
• The ways to perform concatenation of strings
• The ways to calculate the length of a string
• The ways to remove leading and trailing blanks from a string
• Using the compress, notdigit and verify functions
• Using the indexw function to find a word in a variable
• Using notdigit to check for invalid characters in a dataset
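The solution to 12.10 wraps Subj in TRIM before calling NOTDIGIT. A minimal sketch of why, using a made-up 3-byte value (the variable names untrimmed and trimmed are hypothetical):

```sas
/* Illustration only: the value of Subj is made up */
data _null_;
    length Subj $ 3;
    Subj = '12';                          /* stored as 12 plus one trailing blank */
    untrimmed = notdigit(Subj);           /* 3: position of the trailing blank */
    trimmed   = notdigit(trim(Subj));     /* 0: every character is a digit */
    put untrimmed= trimmed=;
run;
```

Without TRIM, a perfectly valid two-digit subject number would be reported as a rule violation, because the padding blank counts as a non-digit character. The same reasoning applies to VERIFY on PartNumber.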
  • 36. Chapter 13 – Working with Arrays
Question
/*Using the SAS data set Survey1, create a new, temporary SAS data set (Survey1) where the values of the variables Ques1–Ques5 are reversed as follows: 1 → 5; 2 → 4; 3 → 3; 4 → 2; 5 → 1. Note: Ques1–Ques5 are character variables. Accomplish this using an array.*/
*Data set SURVEY;
Program
data a15031.survey;
infile 'c:bookslearningsurvey.txt' pad;
input ID : $3. Gender : $1. Age Salary (Ques1-Ques5)(1.);
run;
proc format library=a15031;
value $gender 'M' = 'Male'
'F' = 'Female'
' ' = 'Not entered'
other = 'Miscoded';
value age low-29 = 'Less than 30'
30-50 = '30 to 50'
51-high = '51+';
value $likert '1' = 'Strongly disagree'
'2' = 'Disagree'
'3' = 'No opinion'
'4' = 'Agree'
'5' = 'Strongly agree';
run;
data a15031.survey1;
set a15031.survey1;
array Ques{5} $ Q1-Q5; *the array groups the variables Q1 to Q5;
do i = 1 to 5;
Ques{i} = translate(Ques{i},'54321','12345');
*the loop index i runs from 1 to 5 and translate swaps each digit for its reversed value;
end;
drop i;
run;
title "List of SURVEY1 ";
proc print data=a15031.survey1;
run;
  • 37. Output
Question
/*13.2 Redo Problem 1, except use data set Survey2. Note: Ques1–Ques5 are numeric variables.*/
Program
data a15031.survey2;
set a15031.survey2;
array Ques{5} Q1-Q5;
do i = 1 to 5;
Ques{i} = 6 - Ques{i};
end;
drop i;
run;
title "List of SURVEY2 ";
proc print data=a15031.survey2;
run;
Output
Question
/*13.4 Data set Survey2 has five numeric variables (Q1–Q5), each with values of 1, 2, 3, 4, or 5. You want to determine for each subject (observation) if they responded with a 5 on any of the five questions. This is easily done using the OR or the IN operators. However, for this question, use an array to check each of the five questions. Set variable (ANY5) equal to Yes if any of the five questions is a 5 and No otherwise.*/
  • 38. Program
data a15031.any5;
set a15031.survey2;
array Ques{5} Q1-Q5;
Any5 = 'No ';
do i = 1 to 5;
if Ques{i} = 5 then do;
Any5 = 'Yes';
leave;
end;
end;
drop i;
run;
title "Listing of ANY5";
proc print data=a15031.any5 noobs;
run;
Output
Learnings from this chapter
• The ways to create arrays
• The ways of using arrays in creating new variables
• Setting values to missing character values and missing numeric values
• Importance of temporary arrays
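The learnings above mention temporary arrays, which none of the Chapter 13 solutions use. A minimal sketch, assuming a made-up multiple-choice quiz: the data set work.scored, the answer key and the variables Ans1-Ans5 are all hypothetical. A _TEMPORARY_ array holds values for the whole DATA step without creating variables in the output data set.

```sas
/* Hypothetical quiz data: the answer key and the responses are made up */
data work.scored;
    array anskey{5} $ 1 _temporary_ ('A','C','B','D','A');  /* not written to the data set */
    array ans{5} $ 1 Ans1-Ans5;
    input ID $ (Ans1-Ans5) ($1.);
    Score = 0;
    do i = 1 to 5;
        if ans{i} = anskey{i} then Score = Score + 1;   /* one point per matching answer */
    end;
    drop i;
datalines;
001 ACBDA
002 ACBDB
;
run;

proc print data=work.scored noobs;
run;
```

The first subject matches the key on all five answers and the second on four. The output data set contains ID, Ans1-Ans5 and Score only; the key never appears as a variable, so there is nothing to drop.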
  • 39. Chapter 14 – Displaying Your Data
Question
/*14.2 Using the data set Sales, create the report shown here:*/
Program
proc sort data=a15031.sales out=a15031.sales;
by Region;
run;
title "Sales ";
proc print data=a15031.sales;
by Region;
id Region;
var Quantity TotalSales;
sumby Region;
run;
Output
Question
/*14.1 List the first 10 observations in data set Blood. Include only the variables Subject, WBC (white blood cell), RBC (red blood cell), and Chol. Label the last three variables “White Blood Cells,” “Red Blood Cells,” and “Cholesterol,” respectively. Omit the Obs column, and place Subject in the first column. Be sure the column headings are the variable labels, not the variable names.*/
Program
title " The First 10 Observations in BLOOD data";
proc print data=a15031.blood(obs=10) label;
id Subject;
var WBC RBC Chol;
label WBC = 'White Blood Cells'
RBC = 'Red Blood Cells'
Chol = 'Cholesterol';
run;
  • 40. Output
Question
/* 14.3 Use PROC PRINT (without any DATA steps) to create a listing like the one here. Note: The variables in the Hosp data set are Subject, AdmitDate (Admission Date), DischrDate (Discharge Date), and DOB (Date of Birth).*/
Program
proc print data=a15031.hosp n='Number of Patients = ' label double;
where Year(AdmitDate) eq 2004 and
Month(AdmitDate) eq 9 and
yrdif(DOB,AdmitDate,'Actual') ge 83;
id Subject;
var DOB AdmitDate DischrDate;
label AdmitDate = 'Admission Date'
DischrDate = 'Discharge Date'
DOB = 'Date of Birth';
run;
Output
  • 41. Question
/*14.4 List the first five observations from data set Blood. Print only variables Subject, Gender, and BloodType. Omit the Obs column.*/
Program
title "First 5 Observations";
proc print data=a15031.blood(obs=5) noobs;
var Subject Gender BloodType;
run;
Output
Learnings from this chapter
• The ways to view the summary of the data
• Listing the observations
• Changing the appearance of the listing
• Sorting by multiple variables
• Computing totals across variables
  • 42. Chapter 15 – Creating Customized Reports
Question
/*15.2 Using the Blood data set, produce a summary report showing the average WBC and RBC count for each value of Gender as well as an overall average. Your report should look like this:*/
Program
proc report data=a15031.blood nowd headline;
column Gender WBC RBC;
define Gender / group width=6;
define WBC / analysis mean "Average WBC" width=7 format=comma6.0;
define RBC / analysis mean "Average RBC" width=7 format=5.2;
rbreak after / dol summarize;
run;
quit;
Output
Question
/*15.4 Using the SAS data set BloodPressure, compute a new variable in your report. This variable (Hypertensive) is defined as Yes for females (Gender=F) if the SBP is greater than 138 or the DBP is greater than 88 and No otherwise. For males (Gender=M), Hypertensive is defined as Yes if the SBP is over 140 or the DBP is over 90 and No otherwise. Your report should look like this:*/
Program
proc report data=a15031.bloodpressure nowd;
column Gender SBP DBP Hypertensive;
define Gender / Group width=6;
define SBP / display width=5;
define DBP / display width=5;
define Hypertensive / computed "Hypertensive?" width=13;
compute Hypertensive / character length=3;
if Gender = 'F' and (SBP gt 138 or DBP gt 88) then Hypertensive = 'Yes';
else Hypertensive='No';
  • 43.
if Gender = 'M' and (SBP gt 140 or DBP gt 90) then Hypertensive = 'Yes';
else if Gender = 'M' then Hypertensive = 'No';
*the second ELSE applies to males only, so it does not overwrite the value already set for females;
endcomp;
run;
quit;
Output
Question
/*15.6 Using the SAS data set BloodPressure, produce a report showing Gender, Age, SBP, and DBP. Order the report in Gender and Age order as shown here:*/
Program
proc report data=a15031.bloodpressure nowd;
column Gender Age SBP DBP;
define Gender / order width=6;
define Age / order width=5;
define SBP / display "Systolic Blood Pressure" width=8;
define DBP / display "Diastolic Blood Pressure" width=9;
run;
quit;
Output
  • 44. Question
/*15.8 Using the data set Blood, produce a report like the one here. The numbers in the table are the average WBC and RBC counts for each combination of blood type and gender.*/
Program
proc report data=a15031.blood nowd headline;
column BloodType Gender,WBC Gender,RBC;
define BloodType / group 'Blood Type' width=5;
define Gender / across width=8 '-Gender-';
define WBC / analysis mean format=comma8.;
define RBC / analysis mean format=8.2;
run;
quit;
Output
Learnings from this chapter
• The importance and usage of PROC REPORT
• Customising the report using the options available under PROC REPORT
• Changing the order of the variables in the column statement