SAS Internal Training
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Turning data into information
[Diagram: Data -> DATA step -> SAS data sets -> PROC steps -> Information]
Design of SAS System

MultiVendor Architecture
[Diagram: 90% independent, 10% dependent; platforms shown: PC, workstation, servers/midrange, mainframe, supercomputer]
Design of SAS System

      MultiEngine Architecture™
SAS Program
A SAS program is a sequence of steps that the
user submits for execution.
DATA steps are typically used to create
SAS data sets.
PROC steps are typically used to process
SAS data sets (that is, generate reports
and graphs, edit data, sort data).
[Diagram: Raw data or SAS data set -> DATA step -> SAS data set -> PROC step -> Report]
SAS Syntax Rules
SAS statements
• usually begin with an identifying keyword
• always end with a semicolon.
data work.staff;
   infile 'emplist.dat';
   input LastName $ 1-20 FirstName $ 21-30
         JobTitle $ 36-43 Salary 54-59;
run;

proc print data=work.staff;
run;

proc means data=work.staff mean max;
   class JobTitle;
   var Salary;
run;
SAS Syntax Rules
SAS statements are free-format.
  They can begin and end in any column.
  One or more blanks or special characters can be used to
  separate words.
  A single statement can span multiple lines.
  Several statements can be on the same line.
Unconventional Spacing
data work.staff;
infile 'emplist.dat';
input LastName $ 1-20 FirstName $ 21-30
JobTitle $ 36-43 Salary 54-59;
run;
   proc means data=work.staff      mean max;
class JobTitle;    var Salary;run;
SAS Comments
  Use /* comment */ for a comment that can span
  multiple lines.
  Use * comment; for a single-statement comment: it
  starts with an asterisk and ends with a semicolon.

/* create dataset work.staff */
data work.staff;
     infile 'emplist.dat';
     input LastName $ 1-20 FirstName $ 21-30
           JobTitle $ 36-43 Salary 54-59;
run;
SAS Data Library
When you invoke SAS, you automatically
have access to a temporary and a permanent
SAS data library.
  WORK - temporary library
  SASUSER - permanent library
You can create and access your
own permanent libraries.
  IA - permanent library
Create SAS Data Library

By Statement
libname IA 'C:\SAS Institute';
By wizard
Agenda

Introduction to SAS
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
What is the Import Wizard ?

The Import Wizard is a point-and-click graphical
interface that enables you to create a SAS data
set from several types of external files including
 – dBASE files (*.DBF)
 – Excel spreadsheets (*.XLS)
 – Microsoft Access tables
 – delimited files (*.*)
 – comma-separated values (*.CSV).
The Import Procedure


PROC IMPORT OUT=SAS-data-set
            DATAFILE='external-file-name'
            DBMS=file-type;
RUN;
The Export Procedure


PROC EXPORT DATA=SAS-data-set
            OUTFILE='file-name'
            DBMS=DLM;
   DELIMITER='delimiter';
RUN;
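As a hedged illustration of the two procedures, the sketch below reads a comma-separated file into a SAS data set and writes it back out with a different delimiter. The file names and the data set name work.staff_csv are placeholders chosen for this example, not names from the training material.

```sas
/* Read a CSV file into a SAS data set (file name is a placeholder) */
proc import out=work.staff_csv
            datafile='staff.csv'
            dbms=csv
            replace;
   getnames=yes;          /* first row holds the variable names */
run;

/* Write the same data set out as a pipe-delimited text file */
proc export data=work.staff_csv
            outfile='staff_out.txt'
            dbms=dlm
            replace;
   delimiter='|';         /* DELIMITER is a statement, used with DBMS=DLM */
run;
```

REPLACE lets each step overwrite an existing data set or file; leave it out if an accidental overwrite would be a problem.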
Writing the data step (1)

data work.empdata (DROP = …);
   infile 'employee.dat';
   input EmpID     $ 1-4
         LastName $ 5-17
         FirstName $ 18-30
         JobCode   $ 31-36
         Salary      37-45;
run;


Writing the data step (2)

data work.aircraft (DROP =..);
   infile 'aircraft.dat';
   input @1 Model $16.
         @18 AircraftID $6.
         @25 InService mmddyy10.
         @36 LastMaint mmddyy10.;
run;
Reading delimited Raw data file

data work.aircraft (KEEP = …);
   infile 'aircraft.dat' DLM=',' DSD;
   input Model :$16.
         AircraftID :$6.
         InService :mmddyy10.
         LastMaint :mmddyy10.;
run;
Testing the data step

data work.aircraft;
   input @1 Model $16.
         @18 AircraftID $6.
         @25 InService mmddyy10.
         @36 LastMaint mmddyy10.;
datalines;
JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise LF5200 030005 02/15/1999 07/05/2001
;
run;
Retain the Variable(s)

The RETAIN statement is used to:
    Prevent initialization of variables to missing each time the data
    step executes
    Give an initial value to a retained variable


       RETAIN variable(s) <initial-value>;
Sample :
data work.grand_salary(KEEP = emp_id emp_salary tot_sal);
   set ia.employee_data;
   retain tot_sal 0;
   tot_sal = tot_sal + emp_salary;
run;
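A minimal self-contained sketch of the same running-total idea, using inline data instead of ia.employee_data. The data set name work.running_total and the three data lines are made up for illustration. It uses the SUM function rather than the + operator, because with + a single missing salary would make the retained total missing for every later row.

```sas
data work.running_total;
   input emp_id $ emp_salary;
   retain tot_sal 0;                      /* keep the value across iterations */
   tot_sal = sum(tot_sal, emp_salary);    /* SUM ignores a missing salary     */
datalines;
E001 1000
E002 .
E003 2500
;
run;

proc print data=work.running_total;      /* tot_sal is 1000, 1000, 3500 */
run;
```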
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Formatting data values

SAS Format/Informat
User-defined Format
SAS Formats/Informats

Selected SAS   formats:
w.d            standard numeric format
$w.            standard character format
COMMAw.d       commas in a number: 12,234.21
DOLLARw.d      dollar signs and commas in a
               number: $12,234.41
SAS Formats/Informats

Stored Value      Format    Displayed Value

27134.2864     COMMA12.2          27,134.29

27134.2864     12.2                27134.29

27134.2864     DOLLAR12.2        $27,134.29

27134.2864     DOLLAR9.2          $27134.29

27134.2864     DOLLAR8.2           27134.29
SAS Formats/Informats

Selected SAS date formats:
MMDDYYw.           101692 (MMDDYY6.)
              10/16/92 (MMDDYY8.)
              10/16/1992 (MMDDYY10.)

DATEw.       16OCT92 (DATE7.)
             16OCT1992 (DATE9.)
SAS Formats/Informats

Stored Value     Format       Displayed Value
0              MMDDYY8.                 01/01/60
0              MMDDYY10.              01/01/1960
0              DATE9.                  01JAN1960
0              DDMMYY10.              01/01/1960
0              WORDDATE.          January 1, 1960
0              WEEKDATE.   Friday, January 1, 1960
User defined Formats

PROC FORMAT;
      VALUE format-name range1='label'
                        range2='label'
                       …;
 RUN;
Creating User defined Format
Format-name
  names the format you are creating
  for character values, must have a dollar sign
  ($) as the first character and no more than
  seven additional characters, numbers, and
  underscores
  for numeric values, can be up to eight
  characters, numbers, and underscores
  cannot end in a number

Creating User defined Format

Format-name
  cannot be the name of a SAS System format
  does not end with a period in the VALUE
  statement.
Labels must be
  200 characters or fewer in length
  enclosed in quotes.
Creating User defined Formats
Assign labels to single numbers.
  proc format;
     value gender 1='Female'
                  2='Male'
                  other='Miscoded';
  run;
Here gender is the numeric format name, 1 and 2 are numeric data
values, OTHER is a keyword, and the quoted strings are the
formatted values.
Creating User defined Formats
Assign labels to ranges of numbers.
proc format;
   value boardfmt low-49='Below'
                  50-99='Average'
                  100-high='Above Average';
run;
LOW and HIGH are keywords; low-49, 50-99, and 100-high are
numeric data ranges.
Creating User defined Format

Assign labels to character values and ranges
of character values.
proc format;
   value $grade 'A'='Good'
                'B'-'D'='Fair'
                'F'='Poor'
                'I','U'='See Instructor'
                other='Miscoded';
run;
$grade is the character format name, 'B'-'D' is a character value
range, 'I','U' are discrete character values, and OTHER is a keyword.
Creating User defined Format
proc format;
   value money low-<25000 ='< 25,000'
               25000-50000='25,000 - 50,000'
               50000<-high='> 50,000';
run;





 proc print data=work.empdata;
    format Salary money.;
 run;
User defined Informat

 PROC FORMAT;
       INVALUE informat-name range1=informatted-value-1
                             range2=informatted-value-2
                             …;
  RUN;
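A user-defined informat converts raw text into stored values as it is read. The sketch below is one possible use, with a hypothetical informat name (gradept) and made-up data: it maps letter grades to numeric points and applies the informat through the INPUT function.

```sas
proc format;
   invalue gradept 'A'=4
                   'B'=3
                   'C'=2
                   'D'=1
                   'F'=0
                   other=.;              /* anything else becomes missing */
run;

data work.points;
   input Name $ Grade $;
   Points = input(Grade, gradept.);      /* apply the user-defined informat */
datalines;
Ann A
Bob C
Cy  X
;
run;
```

Points is 4 for Ann, 2 for Bob, and missing for Cy, whose grade is not in the list.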
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Creating Multiple SAS Dataset

data work.north_america work.europe work.other;
      set ia.employee_data;
      select(emp_country);
            when ('USA','CANADA')
                  output work.north_america;
            when ('DENMARK','SWEDEN','ITALY',
                  'SPAIN','FRANCE')
                  output work.europe;
            otherwise
                  output work.other;
      end;
run;
Create normal Variable

data ia.comparison;
   merge ia.sales(rename=(SaleMon=Month))
         work.goals;
   by Month Region;
   FClass=FSales-FGoal;
run;
Create Variable through conditional
processing

   TotPassCap                 Size
          100                 Small
          207                 Large
           98                 Small
          188                 Mediu

 if TotPassCap<=150 then Size='Small';
 else if 150<TotPassCap<=200 then
      Size='Medium';
    else if 200<TotPassCap then
         Size='Large';
The LENGTH Statement

You can use the LENGTH statement to define the
length of a variable explicitly.
General form of the LENGTH statement:
   LENGTH variable(s) $ length;

Example:
      length Size $ 6;
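The truncated value Mediu in the earlier table arises because the length of Size is fixed by the first assignment ('Small', 5 characters). A sketch of the fix, using the four TotPassCap values from that table and a hypothetical data set name work.sizes:

```sas
data work.sizes;
   length Size $ 6;                 /* declare before the first assignment */
   input TotPassCap;
   if TotPassCap <= 150 then Size = 'Small';
   else if TotPassCap <= 200 then Size = 'Medium';
   else Size = 'Large';
datalines;
100
207
98
188
;
run;
```

With the LENGTH statement in place, the fourth observation is stored as Medium rather than Mediu.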
Conditionally Executing Multiple
Statements
You can use DO and END statements to execute a group
of statements based on a condition.
General form of DO and END statements:

   IF expression THEN
       DO;
           executable statements
       END;
   ELSE
       DO;
           executable statements
       END;
String Operations

SUBSTR
SCAN
CONCATENATION OPERATOR
TRIM
SUBSTR Function
The SUBSTR function extracts a portion of the character
data value based on how many characters are
designated for retrieval
   var1 = SUBSTR(var, start <, number-of-characters>);

Example :

                name1=substr(name, 1, 3);

BEFORE (NAME)                                     AFTER (NAME1)
 Dorothy E                                        Dor

Sample application: retrieving the first initial.
Retrieve Middle Initial

Problems :
  Not all middle initials are in the same location,
  so you can’t use the SUBSTR function
  Not all people have a middle initial
SCAN Function
The SCAN function extracts a portion of the character
data value based on what word-number to retrieve.
    var1 = SCAN(var, word-number <, delimiter(s)>);

Example :

                name1 = scan(name, 2, ' ');

BEFORE (NAME)                                AFTER (NAME1)
Dorothy Edgar                                 Edgar
Concatenation Operator

The concatenation operator joins character data
values together

             var = var1 !! var2;

Besides !!, the concatenation operator can also be written as
two vertical bars (||) or two broken vertical bars (¦¦).
Concatenation Operator

Example : newname = name1 !! name2;

Compilation :
            NAME1     +   NAME2     =   NEWNAME
Type        $             $             $
Length      9             6             15
Value       Dorothy       E             Dorothy  E

NAME1 is padded to its length of 9, so 2 spaces separate the two values.
TRIM Function
The TRIM function removes trailing blanks from a
character data value during execution


              var = TRIM(var1) !! var2;
Example :

            NAME1     +   NAME2     =   NEWNAME
Type        $             $             $
Length      9             6             15
Value       Dorothy       E             DorothyE

The trailing blanks of NAME1 are removed, so 0 spaces separate the two values.
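The padding behaviour can be checked with a short DATA _NULL_ step. This is only a sketch: the variable names padded, trimmed, and spaced are invented for the example, and || is used as the concatenation operator (equivalent to !!).

```sas
data _null_;
   length name1 $ 9 name2 $ 6;
   name1 = 'Dorothy';
   name2 = 'E';
   padded  = name1 || name2;                /* 'Dorothy  E': padding of name1 kept */
   trimmed = trim(name1) || name2;          /* 'DorothyE': trailing blanks removed */
   spaced  = trim(name1) || ' ' || name2;   /* 'Dorothy E': one blank added back   */
   put padded= trimmed= spaced=;
run;
```

The third assignment is the usual pattern in practice: trim first, then add exactly the separator you want.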
Numeric Operations

SUM
MEAN
ROUND
INT
SUM function

The SUM function adds the values of the arguments
and ignores missing values.
General form of the SUM function to create a new
variable:
   variable = SUM(argument1, argument2);
variable    variable you want to create
argument    variables, literals, or expressions to
            be summed.
SUM function

When you pass a numbered variable list such as var1-varN, put the
keyword OF in front of the first variable name. Without OF, SAS
treats the hyphen as subtraction.

   variable = SUM(OF var1-varN);
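A small sketch of why OF matters and how the SUM function differs from the + operator when a value is missing. The variables q1-q3 and their values are made up for illustration.

```sas
data _null_;
   q1 = 10; q2 = .; q3 = 30;
   total_fn  = sum(of q1-q3);    /* 40: the missing q2 is ignored            */
   total_sub = sum(q1-q3);       /* -20: without OF the hyphen is a minus    */
   total_op  = q1 + q2 + q3;     /* missing: + propagates missing values     */
   average   = mean(of q1-q3);   /* 20: mean of the two nonmissing values    */
   put total_fn= total_sub= total_op= average=;
run;
```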
MEAN function

The MEAN function returns the arithmetic mean
(average) and ignores missing values.

   variable = MEAN(argument1, argument2);


variable    variable you want to create
argument    variables, literals, or expressions to
            be averaged.
ROUND function

The ROUND function returns a value rounded to the
nearest rounded-off unit. If round-off unit is not
provided, the variable is rounded to the nearest
integer.

variable = ROUND(var1 <, round-off-unit>);

Any number or fractional value can be used as a
     round-off unit
INT function

The INT function returns the integer portion of an
argument.

                var1 = INT(var);
Char-to-Num conversion

You can perform explicit character-to-numeric conversion with
the INPUT function.


      var = INPUT(var1, informat-name);


Note : you can’t accomplish the type conversion by assigning the
converted value back to a variable with the same name. The following
does not change the type of Emp_salary :

  Emp_salary = INPUT(Emp_salary, informat-name);
Num-to-Char conversion

You can perform explicit numeric-to-character conversion with
the PUT function.


       var = PUT(var1, format-name);


Note : you can’t accomplish the type conversion by assigning the
converted value back to a variable with the same name. The following
does not change the type of Emp_salary :

  Emp_salary = PUT(Emp_salary, format-name);
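Because a variable's type cannot change once it is set, the usual workaround is to write the converted value to a new variable and drop the old one. A sketch under assumed names (salary_char, id_num, and work.converted are hypothetical):

```sas
data work.converted (drop=salary_char id_num);
   /* hypothetical input: salary held as text, id held as a number */
   salary_char = '25,000';
   id_num      = 82;
   emp_salary = input(salary_char, comma8.);   /* character -> numeric 25000  */
   emp_id     = put(id_num, z4.);              /* numeric -> character '0082' */
run;
```

When the source is an existing data set, the same pattern is often combined with the RENAME= data set option so that the new variable can take over the original name.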
Working with Date Values

Date values that are stored as SAS dates are special
numeric values.
A SAS date value is interpreted as the number
of days between January 1, 1960, and a specific date.


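Calendar date      01JAN1959    01JAN1960    01JAN1961
                   (an informat converts the date to a SAS date value)
SAS date value          -365            0          366
                   (a format displays the SAS date value as a date)
Displayed value   01/01/1959   01/01/1960   01/01/1961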
Converting Dates to SAS Date Values

 SAS uses date informats to read and convert
 dates to SAS date values, for example,

 Stored Value    Informat       Converted Value
 10/29/1999      MMDDYY10.                14546
 29OCT1999       DATE9.                   14546
 29/10/1999      DDMMYY10.                14546
Writing SAS Date Values

SAS uses date formats to write values from
columns that represent dates, for example,

Stored Value       Format      Displayed Value
               0   MMDDYY10.   01/01/1960
               0   DATE9.      01JAN1960
               0   DDMMYY10.   01/01/1960
               0   WEEKDATE.   Friday, January 1, 1960
SAS Time Values
SAS time informats and formats can be used
to read and write SAS time values.

Time               12:00 AM      9:30 AM
Date              05JUN1989    05JUN1989
SAS time value            0        34200
SAS Datetime Values

SAS datetimes are a combination of dates and times,
and are measured as the number of seconds since
midnight, January 1, 1960
          'ddmmmyyyy:hh:mm<:ss.s>'DT

SAS datetime informats and formats can be used to
read and write SAS datetime values.

Time                    12:00 AM      9:30 AM
Date                   01JAN1960    05JUN1989
SAS datetime value             0    928661400
SAS Times

Just as SAS has a starting point of dates, it also
has a starting point of times

Time is measured as the number of seconds
since midnight


               'hh:mm<:ss.s>'T
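The three kinds of constants can be compared side by side in a DATA _NULL_ step. This is a minimal sketch that reuses the 05JUN1989, 9:30 AM example; the variable names d, t, and dt are arbitrary.

```sas
data _null_;
   d  = '05JUN1989'd;              /* days since 01JAN1960             */
   t  = '09:30't;                  /* seconds since midnight           */
   dt = '05JUN1989:09:30:00'dt;    /* seconds since 01JAN1960 00:00:00 */
   put d= t= dt=;                  /* d=10748 t=34200 dt=928661400     */
   put d= date9. t= time5. dt= datetime18.;   /* same values, formatted */
run;
```

The first PUT shows the stored numbers; the second applies a date, time, and datetime format to the same values.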
INTNX Function
The INTNX function advances a date, time, or datetime
value by a given interval, and returns a date, time or a
datetime value.

     var = INTNX('interval', start-from, increment);
Example :

     var = intnx('year', var1, 1);

   VAR1 (before)                                  VAR (after)
    17787                                          17898
SAS date for 12SEP2008                SAS date for 01JAN2009
TODAY () Function

The TODAY() function returns the current date
as SAS date

           var = TODAY ();
INTCK Function

The INTCK function returns the number of time
intervals in a given time span

      var = intck('interval', from, to);
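INTCK counts interval boundaries crossed, not elapsed time, which is easy to misread. The sketch below uses made-up dates one day apart that straddle a year boundary, and also repeats the INTNX example from the slides; the variable names are hypothetical.

```sas
data _null_;
   date_from = '31DEC2008'd;
   date_to   = '01JAN2009'd;
   years_crossed = intck('year', date_from, date_to);   /* 1: one 01JAN boundary crossed */
   days_between  = date_to - date_from;                 /* 1: the dates are a day apart  */
   next_year     = intnx('year', '12SEP2008'd, 1);      /* 17898, that is 01JAN2009      */
   put years_crossed= days_between= next_year= date9.;
run;
```

So an INTCK result of 1 year does not mean that twelve months have elapsed.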
Other DATE Function

var = YEAR (var1);
var = MONTH (var1);
var = DAY (var1);
var = QTR (var1);
var = WEEKDAY (var1); (1-7, 1=Sunday)
var = DATEPART (var1); (date portion of a datetime value)
MDY (month, day, year);
JULIAN date

To convert a Julian date to SAS date-value

      sas_date = DATEJUL (julian_date);

To convert a SAS date to a Julian date-value :

      jul_date = JULDATE(sas_date);
Creating a SAS date
General Form of the MDY function

      MDY (month, day, year)

Example :

      emp_hire_date = MDY(mon, day, year);
User-defined date formats

PROC FORMAT;
   PICTURE name
      value-range-set-1
  (DATATYPE=DATE|TIME|DATETIME);
run;
The permitted directives

%a   Abbreviated weekday name
%A   Full weekday name
%b   Abbreviated month name
%B   Full month name
%d   Day of the month as a decimal number (1-31), with no leading zero
%H   Hour (24-hour clock) as a decimal number (0-23) with no leading zero
%I   Hour (12-hour clock) as a decimal number (1-12) with no leading zero
%j   Day of the year as a decimal number (1-366), with no leading
     zero
%m   Month as a decimal number (1-12) with no leading zero
%M   Minute as a decimal number (0-59) with no leading zero
%p   AM or PM
%S   Second as a decimal number (0-59) with no leading zero
%U   Week number of the year (Sunday as the first day of the week) as a
     decimal number (0-53) with no leading zero
%Y   Year with century as a decimal number
Program sample
PROC FORMAT;
   PICTURE myfmt
      low-high = '%0d-%b%Y   ' (datatype=date);
RUN;
Cleansing Techniques

Modify the data using the functions :
    UPCASE
    COMPBL
    TRANWRD
UPCASE Function
The UPCASE function converts all letters in the data
value into uppercase
             var = UPCASE(var);
Example :

              country=upcase(country);

BEFORE (COUNTRY)                              AFTER (COUNTRY)
   france                                       FRANCE

Use the LOWCASE function to convert data values to lowercase
COMPBL Function
The COMPBL function compresses multiple consecutive blanks in a
data value into one blank. Since the length of a variable is set at
compilation, the resulting data value is padded with blanks.
                var = COMPBL(var);
Example :

                   name = compbl(name);

BEFORE (NAME)                                     AFTER (NAME)
 DE    PABLOS                                     DE PABLOS
TRANWRD Function
The TRANWRD function replaces all occurrences of a pattern of
characters in a data value with another pattern of characters.
     var = TRANWRD(var, target, replacement);
Example :

              name = tranwrd(name, 'Miss', 'Ms');

BEFORE (NAME)                                         AFTER (NAME)
 Miss. Joy Ho                                          Ms. Joy Ho
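The three cleansing functions are often applied one after another. A minimal sketch with made-up values; the data set name work.clean is hypothetical.

```sas
data work.clean;
   length country $ 10 name $ 20;
   country = 'france';
   name    = 'Miss.  Joy    Ho';
   country = upcase(country);               /* FRANCE                           */
   name    = compbl(name);                  /* Miss. Joy Ho (single blanks)     */
   name    = tranwrd(name, 'Miss', 'Ms');   /* Ms. Joy Ho                       */
run;
```

The LENGTH statement sets each variable's length up front, so the results are stored without truncation.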
Calculating Summary Statistics

Model    AircraftID   InService   TotPassCap
MF4000   010012           10890          267
LF5200   030006           10300          207
LF5200   030008           11389          207

proc means data=ia.aircraftcap maxdec=2;
   var TotPassCap;
   class Model;
run;
Calculating Summary Statistics
By default, PROC MEANS displays the classification
variables and the following statistics:

 N
 Mean
 Std Dev
 Minimum
 Maximum
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Concatenating SAS Data Sets
[Diagram: IA.SALES1 (Jan-Jun) + IA.SALES2 (Jul-Dec) = one data set with all observations from IA.SALES1 (Jan-Jun) followed by all observations from IA.SALES2 (Jul-Dec)]
Concatenating SAS Data Sets

Example program to concatenate two
SAS data sets:
data ia.personnel;
    set ia.employees ia.departments;
run;
Example program to concatenate several
SAS data sets:
data ia.airlines;
    set ia.airport ia.aircraft ia.schedule
         ia.budget ia.sales ia.personnel;
run;
Interleaving SAS data sets

Example program to interleave two
SAS data sets:

data ia.personnel;
   set ia.employees (RENAME=(old=new))
       ia.departments;
     by id;
run;
Preparing Data for Merging

Often you must manipulate data before you
can perform a merge. You might have to
  rename variables
  sort the data.
Sorting the data

PROC SORT DATA=sas-data-set <OUT=sas-data-set>;
   BY <DESCENDING> variable(s);
RUN;
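As a sketch of the syntax, the step below sorts ia.sales by month and, within each month, by sales from highest to lowest. The output name work.sales_sorted is hypothetical. DESCENDING comes before the variable it applies to.

```sas
proc sort data=ia.sales out=work.sales_sorted;
   by SaleMon descending FSales;   /* SaleMon ascending, FSales descending */
run;
```

With OUT=, the original data set is left in its existing order; without it, PROC SORT replaces the input data set.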


  Performing a Match MERGE
              IA.SALES                            WORK.GOALS
SaleMon   Region            FSales   Month   Region             FGoal
      1   Europe        2118222.62       1   Europe        2127742.48
      1   North America 3135765.34       1   North America 2934441.72
      2   Europe        1960034.47       2   Europe        1920751.20
                              DATA STEP
          data ia.comparison;
             merge ia.sales(rename=(SaleMon=Month))
                   work.goals;
             by Month Region;
             FClass=FSales-FGoal;
          run;
                            IA.COMPARISON

Month Region                     FSales          FGoal      FClass
     1 Europe        2118222.62 2127742.48 -9519.86
     1 North America 3135765.34 2934441.72 201323.62
     2 Europe        1960034.47 1920751.20 39283.27
Other Merges (Self-study)

The DATA step merge works with many other kinds of
data combinations:
One-to-many        Unique BY values are in one
                   data set and duplicate
                   matching BY values are in the
                   other data set.
Many-to-many       Duplicate matching BY values
                   are in both data sets.
Non-matches        Some BY values in one data
                   set have no matching BY
                   values in the other data set.
One-To-Many Merging
       WORK.ONE                     WORK.TWO
   X       Y                    X       Z
   1       A                    1       A1
   2       B                    1       A2
   3       C                    2       B1
                                3       C1
                                3       C2
  data work.three;
     merge work.one work.two;
     by X;
  run;
          WORK.THREE
      X       Y       Z
      1       A       A1
      1       A       A2
      2       B       B1
      3       C       C1
      3       C       C2
One-To-Many Merging
             IA.ALLSALES                   IA.ALLGOALS
 Month   Region               Sales     Month       Goal
     1   Europe          2118222.62         1 2127742.48
     1   North America   3135765.34         2 1920751.20
     2   Europe          1960034.47         3 2125112.75
     2   North America   2926929.91

               data ia.allcompare;
                  merge ia.allsales
                        ia.allgoals;
                  by Month;
                  Difference=Sales-Goal;
               run;
Month    Region                 Sales        Goal Difference
    1    Europe            2118222.62  2127742.48   -9519.86
    1    North America     3135765.34  2127742.48 1008022.86
    2    Europe            1960034.47  1920751.20   39283.27
    2    North America     2926929.91  1920751.20 1006178.71
Many-To-Many Merging
      WORK.ONE                      WORK.TWO
  X       Y                     X       Z
  1       A1                    1       AA1
  1       A2                    1       AA2
  2       B1                    1       AA3
  2       B2                    2       BB1
                                2       BB2
  data work.three;
     merge work.one work.two;
     by X;
  run;
          WORK.THREE
      X       Y       Z
      1       A1      AA1
      1       A2      AA2
      1       A2      AA3
      2       B1      BB1
      2       B2      BB2
Many-To-Many Merging
   IA.ALLSALES2                            IA.ALLGOALS2
  Month        Sales                    Month          Goal
      1   2118222.62                        1   2127742.48
      1   3135765.34                        1   2934441.72
      2   1960034.47                        2   1920751.20
      2   2926929.91                        2   2747787.49
          data ia.allcompare2;
             merge ia.allsales2
                   ia.allgoals2;
             by Month;
             Difference=Sales-Goal;
          run;
    Month         Sales         Goal    Difference
          1   2118222.62   2127742.48    -9519.86
          1   3135765.34   2934441.72    201323.62
          2   1960034.47   1920751.20    39283.27
          2   2926929.91   2747787.49    179142.42
Merging With Non-matches
    WORK.ONE                    WORK.TWO

   X    Y                   X       Z
   1    A                   1       A1
   2    B                   3       C1
   3    C                   4       D1

 data work.three;
    merge work.one work.two;
    by X;
 run;
          WORK.THREE
      X       Y       Z
      1       A       A1
      2       B
      3       C       C1
      4               D1
Merging With Non-matches
     IA.ESALES                     IA.EGOALS2
 Month        Sales            Month            Goal
    1    2118222.62                1   2127742.48
    2    1960034.47                3   2125112.75
    3    2094220.35                4   2058397.00
           data ia.ecompare2;
              merge ia.esales
                    ia.egoals2;
              by Month;
              Difference=Sales-Goal;
           run;
   Month      Sales        Goal Difference
       1 2118222.62 2127742.48    -9519.86
       2 1960034.47          .           .
       3 2094220.35 2125112.75   -30892.40
       4          . 2058397.00           .
Review the Match-merge (1)

data work.tot_sales;
     merge ia.sales (in=a)
           ia.transaction (in=b);
     by sales_id;
     if a and b;
run;
Review the Match-merge (2)

data not_in_a not_in_b;
     merge work.a (in=a) work.b (in=b);
     by num;
     IF a and not b THEN output not_in_b;
     ELSE IF b and not a THEN output not_in_a;
run;
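A self-contained sketch of the IN= technique that splits a match-merge into matches and the two kinds of non-matches. The data sets, variable names, and values are made up for illustration; both inputs are already in order by the BY variable.

```sas
data work.a;
   input num x $;
datalines;
1 A
2 B
3 C
;
run;

data work.b;
   input num z $;
datalines;
1 A1
3 C1
4 D1
;
run;

data work.both work.only_a work.only_b;
   merge work.a (in=in_a) work.b (in=in_b);
   by num;
   if in_a and in_b then output work.both;    /* num 1 and 3 */
   else if in_a then output work.only_a;      /* num 2       */
   else output work.only_b;                   /* num 4       */
run;
```

Each OUTPUT statement names its destination explicitly, so every observation lands in exactly one of the three data sets.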
Reading a subset of Raw Data

Use the DATA step that was written earlier.
Add a subsetting IF statement to process only the subset
in which the value of AGE is at least 15.
data work.aircraft;
    set ia.aircraft (firstobs=5 obs=10);
    YrInService=year(InService);
    Age=year(today())-YrInService;
    if Age>=15;
run;
Subsetting Your Data with the
WHERE Statement
The WHERE statement enables you to select observations
that meet a certain condition before SAS brings the
observation into the PROC REPORT step.
     Date   FlightID TotPassCap TotPass      TotRev
04JAN1999   IA00300         207     186 $140,170.00
26NOV1999   IA00300         207     176 $133,704.00
31DEC1999   IA00401         207     171 $129,491.00

         proc report data=ia.sales1999 nowd;
            where Date between '24nov1999'd
                  and '03jan2000'd;
         run;
     Date FlightID TotPassCap TotPass      TotRev
26NOV1999 IA00300         207     176 $133,704.00
31DEC1999 IA00401         207     171 $129,491.00
WHERE or IF

Step and Usage                        WHERE Statement   IF Statement
PROC step                             Yes               No
DATA step (source of variable)
  INPUT statement                     No                Yes
  Assignment statement                No                Yes
  SET statement (single data set)     Yes               Yes
  SET/MERGE (multiple data sets)
    Variable in ALL data sets         Yes               Yes
    Variable not in ALL data sets     No                Yes
Operators

The WHERE statement can be used with
 – comparison operators
 – logical operators.
You can also use the WHERE statement with
special operators.
Comparison Operators

Mnemonic Symbol Definition
    EQ       =   equal to
    NE      ^=   not equal to
    GT       >   greater than
    LT       <   less than
    GE      >=   greater than or equal to
    LE      <=   less than or equal to
    IN           equal to one of a list
Comparison Operators

Examples:
where   Salary>25000;
where   EmpID='0082';
where   Salary=.;
where   LastName=' ';
Logical Operators

Examples:
where JobCode='FLTAT3' and
      Salary>50000;
where JobCode='PILOT1' or
      JobCode='PILOT2' or
      JobCode='PILOT3';
Special Operators

The following are special operators :
  LIKE selects observations by comparing character
  values to specified patterns. A percent sign (%)
  replaces any number of characters and an
  underscore (_) replaces one character.
          where Code like 'E_U%';
  (E, a single character, U, followed by any
  characters.)
Special Operators

The sounds-like (=*) operator selects observations
that contain a spelling variation
of the word or words specified.
  where Name=*'SMITH';
(Selects the names Smythe, Smitt, and so on.)
CONTAINS or ? selects observations that include the
specified substring.
  where Word ? 'LAM';
(BLAME, LAMENT, and BEDLAM are selected.)

Special Operators

IS NULL or IS MISSING selects observations in which
the value of the variable is missing.
where Flight is missing;

BETWEEN-AND selects observations in which the
value of the variable falls within a range of values.
    where Date between '01mar1999'd
        and '01apr1999'd;
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Using DO-LOOP command

DO index-variable = start   TO stop <BY increment>;
   SAS statements;
END;
DO-LOOP syntax

General form of a DO LOOP with a value list :
DO index-variable = value1, value2, value3;
   SAS statements;
END;
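A brief sketch of an iterative DO loop that writes one observation per iteration. The starting balance, the 5% rate, and the data set name work.invest are assumptions chosen for the example.

```sas
data work.invest;
   balance = 1000;                  /* hypothetical starting amount     */
   do year = 1 to 3;
      balance = balance * 1.05;     /* add 5% each year                 */
      output;                       /* one observation per iteration    */
   end;
run;
```

Without the explicit OUTPUT inside the loop, the step would write only a single observation at the end, with year equal to 4.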
Performing a calculation until the
condition is met
DO WHILE
DO UNTIL
DO WHILE syntax

DO WHILE (expression);
     SAS statements;
END;

  The statements in the loop iteratively execute while
  the expression is true
  The expression is evaluated at the top of the loop
  The statements in the loop never execute if the
  expression is initially false.
DO UNTIL syntax

DO UNTIL (expression);
     SAS statements;
END;

  The statements in the loop iteratively execute until
  the expression becomes true
  The expression is evaluated at the bottom of the
  loop
  The statements in the loop are executed at least
  once
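The difference between the two loops shows up when the condition is already satisfied before the loop starts. A minimal sketch using a hypothetical counter n; the PUT statements write the results to the SAS log.

```sas
data _null_;
   n = 10;
   do while (n < 10);     /* tested at the top: false, so the body is skipped */
      n = n + 1;
   end;
   put 'After DO WHILE: ' n=;

   n = 10;
   do until (n >= 10);    /* tested at the bottom: the body runs once first */
      n = n + 1;
   end;
   put 'After DO UNTIL: ' n=;
run;
```

The log shows n=10 after the DO WHILE loop and n=11 after the DO UNTIL loop.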
Sample of programs
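data work.retire;
  set ia.employee_data;
  /* years of service so far (difference in calendar years) */
  service_yrs = year(today()) - year(hire_date);

  /* apply a 5% raise for each remaining year of service;    */
  /* assumes hire_date is nonmissing, because a missing      */
  /* service_yrs would never end the loop                    */
  do while (service_yrs <= 30);
      emp_salary = emp_salary * 1.05;
      service_yrs = service_yrs + 1;
  end;

  /* calendar year that is 30 years after the hire year */
  year_30 = year(intnx('year', hire_date, 30));
  /* same month and day as the hire date, in that year */
  retire_date = mdy(month(hire_date),
      day(hire_date), year_30);
run;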
Sample of programs
Sampling of SAS data set (1)
data work.sample;   /* output name is an assumption; the slide omits the DATA statement */
   row_to_read = 10;
   set ia.employee_data point=row_to_read nobs=total_rows;
   output;
   stop;
run;


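Notes:
The NOBS= option creates a temporary variable that contains the
   number of observations in the input SAS data set.
This value is assigned during compilation, which means you can
   reference this variable before the SET statement executes.
The POINT= option reads observations by direct access and never
   reaches end of file, so a STOP statement is needed to end the step.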
Sampling of SAS data set (2)
sample_size = 100;
do while (sample_size >0);
  SAS statements…;
end;
stop;


Getting the random number


random_number = ranuni(0);   /* uniform value between 0 and 1; seed 0 uses the system clock */
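The pieces shown on the sampling slides (POINT=, NOBS=, a DO WHILE countdown, and RANUNI) can be combined into one step that draws a random sample. This is a sketch under stated assumptions, not the deck's own program: the output name work.sample is hypothetical, ia.employee_data is assumed to be non-empty and to have no variable named sample_size, and rows are drawn with replacement, so the same observation can appear more than once.

```sas
data work.sample;                        /* hypothetical output data set */
   sample_size = 100;
   do while (sample_size > 0);
      /* total_rows is set at compile time by NOBS=, so it can be used here; */
      /* CEIL maps the uniform value in (0,1) to a row number 1..total_rows  */
      row_to_read = ceil(ranuni(0) * total_rows);
      set ia.employee_data point=row_to_read nobs=total_rows;
      output;
      sample_size = sample_size - 1;
   end;
   stop;                                 /* required: POINT= never hits end of file */
   drop sample_size;                     /* POINT= and NOBS= variables are dropped automatically */
run;
```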
Agenda

Introduction
Read data from raw data file
Formatting the data
Data Manipulation and statistical analysis
Combining and subsetting data
Processing the data iteratively
Report Production
Creating a List Report (1)
  Model    AircraftID   InService   TotPassCap   Size
  MF4000   010012       10890       267          Large
  LF5200   030006       10300       207          Large
  LF5200   030008       11389       207          Large

proc print data=ia.aircraftcap;
   var AircraftID Size TotPassCap;
run;
                        Aircra           TotPassCa
                        ftID     Size            p
                        010012   Large         267
                        030006   Large         207
                        030008   Large         207
Creating a List Report (2)
  Model    AircraftID   InService   TotPassCap   Size
  MF4000   010012       10890       267          Large
  LF5200   030006       10300       207          Large
  LF5200   030008       11389       207          Large

proc report data=ia.aircraftcap nowd;
   column AircraftID Size TotPassCap;
run;
                        Aircra           TotPassCa
                        ftID     Size            p
                        010012   Large         267
                        030006   Large         207
                        030008   Large         207
The DEFINE Statement

General form of the DEFINE statement:

  DEFINE variable /<usage> <attribute-list>;

You can define options (usage and attributes) in the
DEFINE statement in any order.
Default usage for character variables is DISPLAY.
 – The report lists all of the variable’s values from the
   data set.
The DEFINE Statement

Default usage for numeric variables is ANALYSIS.
  If the report contains at least one display variable
  and no group variables, the report lists all of the
  values of the numeric variable.
  If the report contains only numeric variables, the
  report displays grand totals for the numeric
  variables.
  If the report contains group variables, the report
  displays the sum of the numeric variables’ values for
  each group.
The DEFINE Statement

Other available statistics include
N      number of nonmissing values
MEAN   average
MAX    maximum value
MIN    minimum value
The DEFINE Statement

Additional usage:
 ORDER   determines the order of the rows in
         the report.
         • The default order is ascending.
         • To force the order to be descending,
           include the DESCENDING option on
           the DEFINE statement.
         • Repetitious printing of values is
           suppressed.
The DEFINE Statement

Selected attributes:
 FORMAT=   assigns a format to a variable.
           •   If there is a format stored in the
               descriptor portion of the data set
               it is the default format.
 WIDTH=    controls the width of a report
           column.
           •   The default width is the variable
               length.
                                     continued...
The DEFINE Statement

CENTER identifies the justification of values
LEFT   and the header within the report
RIGHT  column.
       • The default is LEFT for character
         values and RIGHT for numeric
         values.




                                     continued...
The DEFINE Statement

'report-column-header'   defines the report
                         column header.
                         • If there is a label
                           stored in the
                           descriptor portion of
                           the data set it is the
                           default header.
Creating an Enhanced List Report

The enhanced aircraft capacity list report
includes
 – appropriate report column headings
 – formatted values for the INSERVICE variable
 – column widths wide enough for the headings
 – values and headings centered within the
   columns
 – rows of the report ordered by descending
   values of the variable SIZE.
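One way the enhanced report described above might be written is sketched below. The column headers, the widths, and the DATE9. format for InService are assumptions made for illustration (the deck does not show them), and Size is moved to the first column so that the ordering variable leads each row.

```sas
proc report data=ia.aircraftcap nowd;
   column Size AircraftID InService TotPassCap;
   /* ORDER with DESCENDING sorts the rows and suppresses repeated values */
   define Size       / order descending width=6 center 'Size';
   define AircraftID / display width=11 center 'Aircraft ID';
   /* assumed: InService holds a SAS date value */
   define InService  / display format=date9. width=10 center 'In Service';
   /* a slash in the header splits it across two lines by default */
   define TotPassCap / display width=10 center 'Passenger/Capacity';
run;
```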
Adding Options to Enhance Report
 Appearance

Selected PROC REPORT options:
HEADLINE underlines all column headers
           and the spaces between them.
HEADSKIP writes a blank line beneath all
           column headers.
Writing A PROC REPORT step
Use DEFINE statements to define the variables
as display variables.
–   Add column headers and specify column width.
–   Add formats and specify alignment. Add titles.
proc report data=ia.sales1999 nowd headline
            headskip;
   column Date FlightId TotPass TotRev;
   define Date     / display center 'Sales Date';
   define FlightId / display center 'Flight';
   define TotPass / display format=3. width=10
                      center 'Total Passengers';
   define TotRev   / display format=dollar11.2
                      center 'Total Revenue';
   title 'Sales and Passenger Data for 1999';
run;
Controlling Report Appearance
Use the HEADLINE option to underline the column headers and
the HEADSKIP option to add a blank line beneath the column
headers. Add titles.
title1 'Sales and Passenger Data, by Day of Week';
title2 'Sunday through Saturday';
proc report data=ia.sales1999 headline headskip
                nowd;
    column Day TotPass TotRev;
    define Day         / group 'Day of Week';
    define TotPass / sum format=comma5.
                         width=10 center
                         'Total Passengers';
    define TotRev / sum format=dollar13.2
                         center 'Total Revenue';
run;
PROC SUMMARY
  PROC SUMMARY DATA = sas-data-set;
     VAR analysis-variable(s);
     CLASS class-variable(s);
     OUTPUT OUT = output-data-set
            STATISTIC = variable(s);
  RUN;

Statistics in PROC SUMMARY :
• N     number of nonmissing values
• MEAN  average
• STD   standard deviation
• MIN   minimum value
• MAX   maximum value
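A concrete instance of the general form above, as a sketch. It assumes the work.empdata data set from the earlier DATA step slides (variables JobCode and Salary); the output data set and the statistic variable names are hypothetical.

```sas
proc summary data=work.empdata;
   class JobCode;
   var Salary;
   /* each statistic keyword names the output variable that receives it */
   output out=work.salary_stats
          n=N_Salary mean=Mean_Salary max=Max_Salary;
run;

/* PROC SUMMARY prints nothing by default, so list the output data set */
proc print data=work.salary_stats;
run;
```

The output data set holds one row for all observations combined (_TYPE_=0) and one row per JobCode value (_TYPE_=1).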
PROC TABULATE

PROC TABULATE DATA = sas-data-set;
   CLASS class-variable(s);
   VAR analysis-variable(s);
   TABLE class_var<*analysis var* Stat>,
       class_var<*analysis var * stat>;
RUN;
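A small table built from the general form above might look like the following sketch. It assumes the work.empdata data set from the earlier DATA step slides; the choice of statistics is illustrative.

```sas
proc tabulate data=work.empdata;
   class JobCode;
   var Salary;
   /* rows: one per JobCode plus a total row; columns: statistics of Salary */
   table JobCode all='Total',
         Salary*(n mean max);
run;
```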
PROC TABULATE (detail)
PROC TABULATE DATA=sas-dataset (WHERE=(condition))
            FORMAT=COMMA20.0;
  CLASS Card_group Type;
  VAR Balance N_account Crd_limit;
  TABLE (Type ALL='Total'*{s={foreground=#002288
      Background=white}}),
      N_account*(SUM PCTSUM)
      Balance='Current Balance'*(SUM PCTSUM)
      crd_limit='Credit Limit'*(SUM PCTSUM MEAN)
      Balance='Percentage Balance to Limit'
         *PCTSUM<Crd_limit>*f=comma25.2;
  BY Card_Group;
RUN;
The Output Delivery System
ODS statements enable you to create output in a
variety of forms.

[Diagram: a SAS report passes through ODS, which routes it to the SAS Output window or to an HTML file.]
Generating HTML Files

The ODS HTML statement opens, closes, or manages
the HTML destination.
General form of the ODS statement to create an HTML
file:

  ODS HTML FILE='HTML-file-specification'
      <options>;
  SAS code generating output
  ODS HTML CLOSE;
Creating an HTML Report

Create a report and close the HTML destination.
ods html file='listing.html';
proc report data=ia.comparison nowd;
   column Month …;
   define Month /…;
   other statements
run;
ods html close;
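For a complete step that can be submitted as written, here is a minimal sketch. It assumes the work.empdata data set from the earlier DATA step slides; the file name empdata.html is hypothetical and, being a relative path, is written to the SAS session's current directory.

```sas
ods html file='empdata.html';      /* open the HTML destination */
proc print data=work.empdata;
   var EmpID LastName JobCode Salary;
run;
ods html close;                    /* close the file so it can be viewed */
```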
©2004 Amrih Muktiaji

SAS Internal Training

  • 2. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 3. Turning data into information DATA Data Step SAS PROC Data Steps Sets Information
  • 4. Design of SAS System MultiVendor Architecture 90% 10% independent dependent Servers/ Super PC Workstation Midrange Mainframe Computer
  • 5. Design of SAS System MultiEngine Architecture™
  • 6. SAS Program A SAS program is a sequence of steps that the user submits for execution. DATA steps are typically used to create Raw SAS data sets. Data DATA SAS PROC Report Step Data Step Set SAS Data PROC steps are typically used to process Set SAS data sets (that is, generate reports and graphs, edit data, sort data).
  • 7. SAS Syntax Rules SAS statements • usually begin with an identifying keyword • always end with a semicolon. data work.staff; infile 'emplist.dat'; input LastName $ 1-20 FirstName $ 21-30 JobTitle $ 36-43 Salary 54-59; run; proc print data=work.staff; run; proc means data=work.staff mean max; class JobTitle; var Salary; run;
  • 8. SAS Syntax Rules SAS statements are free-format. They can begin and end in any column. One or more blanks or special characters can be used to separate words. A single statement can span multiple lines. Several statements can be on the same line. Unconventional Spacing data work.staff; infile 'emplist.dat'; input LastName $ 1-20 FirstName $ 21-30 JobTitle $ 36-43 Salary 54-59; run; proc means data=work.staff mean max; class JobTitle; var Salary;run;
  • 12. SAS Comments Use /* comment */ for a comment that can span multiple lines. Use * comment; for a single-statement comment: it starts with an asterisk and ends with a semicolon. /* create dataset work.staff */ data work.staff; infile 'emplist.dat'; input LastName $ 1-20 FirstName $ 21-30 JobTitle $ 36-43 Salary 54-59; run;
  • 13. SAS Data Library When you invoke SAS, you automatically have access to a temporary and a permanent SAS data library. WORK - temporary library. SASUSER - permanent library. You can create and access your own permanent libraries. IA - permanent library.
  • 14. Create SAS Data Library By Statement libname IA 'C:SAS Institute'; By wizard
  • 15. Agenda Introduction to SAS Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 16. What is the Import Wizard ? The Import Wizard is a point-and-click graphical interface that enables you to create a SAS data set from several types of external files including – dBASE files (*.DBF) – Excel spreadsheets (*.XLS) – Microsoft Access tables – delimited files (*.*) – comma-separated values (*.CSV).
  • 17. The Import Procedure PROC IMPORT OUT=SAS-data-set DATAFILE='external-file-name' DBMS=file-type; RUN;
  • 18. The Export Procedure PROC EXPORT DATA=SAS-data-set OUTFILE='file-name' DBMS=DLM; DELIMITER='delimiter'; RUN;
  • 19. Writing the data step (1) data work.empdata (DROP = …); infile 'employee.dat'; input EmpID $ 1-4 LastName $ 5-17 FirstName $ 18-30 JobCode $ 31-36 Salary 37-45; run; OUTPUT
  • 20. Writing the data step (2) data work.aircraft (DROP =..); infile 'aircraft.dat'; input @1 Model $16. @18 AircraftID $6. @25 InService mmddyy10. @36 LastMaint mmddyy10.; run;
  • 21. Reading delimited Raw data file data work.aircraft (KEEP = …); infile 'aircraft.dat' DLM=',' DSD; input Model :$16. AircraftID :$6. InService :mmddyy10. LastMaint :mmddyy10.; run;
  • 22. Testing the data step data work.aircraft; input @1 Model $16. @18 AircraftID $6. @25 InService mmddyy10. @36 LastMaint mmddyy10.; datalines; JetCruise LF5200 030003 04/05/1994 03/11/2001 JetCruise LF5200 030005 02/15/1999 07/05/2001 run;
  • 23. Retain the Variable(s) The RETAIN statement is used to: prevent initialization of variables to missing each time the DATA step executes; give an initial value to a retained variable. RETAIN variable(s) <initial-value>; Sample : data work.grand_salary(KEEP = emp_id emp_salary tot_sal); set ia.employee_data; retain tot_sal 0; tot_sal = tot_sal + emp_salary; run;
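As a possible alternative to the sample above, the running total can be sketched with a sum statement, which retains the accumulator implicitly. This assumes the same ia.employee_data data set and the emp_id and emp_salary variables used on the slide.

```sas
data work.grand_salary (keep=emp_id emp_salary tot_sal);
   set ia.employee_data;
   /* Sum statement: tot_sal is retained automatically, starts at 0,
      and a missing emp_salary is ignored instead of making the total missing. */
   tot_sal + emp_salary;
run;
```

With the explicit RETAIN form, a single missing emp_salary would turn tot_sal missing for every later observation, which is one reason to consider the sum statement.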
  • 24. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 25. Formatting data values SAS Format/Informat User-defined Format
  • 26. SAS Formats/Informats Selected SAS formats: w.d standard numeric format $w. standard character format COMMAw.d commas in a number: 12,234.21 DOLLARw.d dollar signs and commas in a number: $12,234.41
  • 27. SAS Formats/Informats Stored Value Format Displayed Value 27134.2864 COMMA12.2 27,134.29 27134.2864 12.2 27134.29 27134.2864 DOLLAR12.2 $27,134.29 27134.2864 DOLLAR9.2 $27134.29 27134.2864 DOLLAR8.2 27134.29
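The table above can be checked with a short DATA step that writes one stored value to the log with each format. This is only a sketch; the variable name amount is made up for illustration.

```sas
data _null_;
   amount = 27134.2864;      /* stored value from the table */
   put amount comma12.2;     /* commas inserted */
   put amount 12.2;          /* plain numeric */
   put amount dollar12.2;    /* dollar sign and commas */
   put amount dollar9.2;     /* too narrow for the comma */
   put amount dollar8.2;     /* too narrow for the dollar sign as well */
run;
```

The last two lines show how SAS drops the comma and then the dollar sign when the width is too small for the full displayed value.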
  • 28. SAS Formats/Informats Selected SAS date formats: MMDDYYw. 101692 (MMDDYY6.) 10/16/92 (MMDDYY8.) 10/16/1992 (MMDDYY10.) DATEw. 16OCT92 (DATE7.) 16OCT1992 (DATE9.)
  • 29. SAS Formats/Informats Stored Value Format Displayed Value 0 MMDDYY8. 01/01/60 0 MMDDYY10. 01/01/1960 0 DATE9. 01JAN1960 0 DDMMYY10. 01/01/1960 0 WORDDATE. January 1, 1960 0 WEEKDATE. Friday, January 1, 1960
  • 30. User defined Formats PROC FORMAT; VALUE format-name range1='label' range2='label' …; RUN;
  • 31. Creating User defined Format Format-name names the format you are creating. For character values, it must have a dollar sign ($) as the first character and no more than seven additional letters, numbers, and underscores. For numeric values, it can be up to eight letters, numbers, and underscores. It cannot end in a number. continued...
  • 32. Creating User defined Format Format-name cannot be the name of a SAS System format and does not end with a period in the VALUE statement. Labels must be 200 characters or fewer in length and enclosed in quotes.
  • 33. Creating User defined Formats Assign labels to single numbers. proc format; value gender 1='Female' 2='Male' other='Miscoded'; run; Here gender is the numeric format name, 1 and 2 are numeric data values, 'Female' and 'Male' are formatted values, and OTHER is a keyword.
  • 34. Creating User defined Formats Assign labels to ranges of numbers. proc format; value boardfmt low-49='Below' 50-99='Average' 100-high='Above Average'; run; LOW and HIGH are keywords; low-49, 50-99, and 100-high are numeric data ranges.
  • 35. Creating User defined Format Assign labels to character values and ranges of character values. proc format; value $grade 'A'='Good' 'B'-'D'='Fair' 'F'='Poor' 'I','U'='See Instructor' other='Miscoded'; run; Here $grade is the character format name, 'B'-'D' is a character value range, 'I','U' are discrete character values, and OTHER is a keyword.
  • 36. Creating User defined Format proc format; value money low-<25000 ='< 25,000' 25000-50000='25,000 - 50,000' 50000<-high='> 50,000'; run; proc print data=work.empdata; format Salary money.; run;
  • 37. User defined Informat PROC FORMAT; INVALUE informat-name range1=informatted-value1 range2=informatted-value2 …; RUN;
  • 38. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 39. Creating Multiple SAS Dataset data work.north_america work.europe work.other; set ia.employee_data; select(emp_country); when ('USA','CANADA') output work.north_america; when ('DENMARK','SWEDEN','ITALY','SPAIN','FRANCE') output work.europe; otherwise output work.other; end; run;
  • 40. Create normal Variable data ia.comparison; merge ia.sales(rename=(SaleMon=Month)) work.goals; by Month Region; FClass=FSales-FGoal; run;
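  • 41. Create Variable through conditional processing if TotPassCap<=150 then Size='Small'; else if 150<TotPassCap<=200 then Size='Medium'; else if 200<TotPassCap then Size='Large'; Result (TotPassCap, Size): 100 Small; 207 Large; 98 Small; 188 Mediu. The value Medium is truncated to Mediu because the length of Size (5) is set by the first assignment, Size='Small'.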
  • 42. The LENGTH Statement You can use the LENGTH statement to define the length of a variable explicitly. General form of the LENGTH statement: LENGTH variable(s) $ length; Example: length Size $ 6;
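A minimal sketch of how the LENGTH statement avoids the truncated value from the conditional-processing slide. The data set name work.aircraft_size and the inline data lines are assumptions made for illustration.

```sas
data work.aircraft_size;
   length Size $ 6;            /* declared before the first assignment */
   input TotPassCap;
   if TotPassCap <= 150 then Size = 'Small';
   else if TotPassCap <= 200 then Size = 'Medium';
   else Size = 'Large';
   datalines;
100
207
98
188
;
run;

proc print data=work.aircraft_size;
run;
```

Because the length is fixed at compilation, the LENGTH statement has to appear before the first statement that assigns a value to Size.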
  • 43. Conditionally Executing Multiple Statements You can use DO and END statements to execute a group of statements based on a condition. General form of DO and END statements: IF expression THEN DO; executable statements END; ELSE DO; executable statements END;
  • 45. SUBSTR Function The SUBSTR function extracts a portion of a character data value, starting at a given position and continuing for a given number of characters. var1 = SUBSTR(var, start <, number-of-characters>); Example : name1=substr(name, 1, 3); NAME (before): Dorothy E. NAME1 (after): Dor. Sample application: retrieving the first initial.
  • 46. Retrieve Middle Initial Problems : Not all middle initials are in the same location, so you can't use the SUBSTR function. Not all people have a middle initial.
  • 47. SCAN Function The SCAN function extracts a portion of a character data value based on which word number to retrieve. var1 = SCAN(var, word-number <, delimiter(s)>); Example : name1=scan(name, 2, ' '); NAME (before): Dorothy Edgar. NAME1 (after): Edgar.
  • 48. Concatenation Operator The concatenation operator joins character data values together. var = var1 !! var2; Besides !!, the other concatenation operators are two vertical bars (||) and two broken vertical bars (¦¦).
  • 49. Concatenation Operator Example : newname = name1 !! name2; Compilation : NAME1 ($9) + NAME2 ($6) = NEWNAME ($15). Execution : 'Dorothy' + 'E' = 'Dorothy  E' (2 spaces, because NAME1 is padded with trailing blanks to its length of 9).
  • 50. TRIM Function The TRIM function removes trailing blanks from a character data value during execution. var = TRIM(var1) !! var2; Example : NAME1 ($9) + NAME2 ($6) = NEWNAME ($15). 'Dorothy' + 'E' = 'DorothyE' (0 spaces).
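The two concatenation slides can be tried side by side with a sketch like the following. The variable names padded, trimmed, and spaced are hypothetical; name1 and name2 use the lengths shown on the slides.

```sas
data _null_;
   length name1 $ 9 name2 $ 6 padded trimmed spaced $ 15;
   name1 = 'Dorothy';
   name2 = 'E';
   padded  = name1 !! name2;                /* keeps the trailing blanks of name1 */
   trimmed = trim(name1) !! name2;          /* no blank between the two values */
   spaced  = trim(name1) !! ' ' !! name2;   /* exactly one blank between them */
   put padded= / trimmed= / spaced=;
run;
```

The third assignment is the usual pattern when joining names: trim first, then add the single separator you actually want.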
  • 52. SUM function The SUM function adds the values of the arguments and ignores missing values. General form of the SUM function to create a new variable: variable = SUM(argument1, argument2); variable variable you want to create argument variables, literals, or expressions to be summed.
  • 53. SUM function When you pass a variable list such as var1-varN, use the keyword OF in front of the first variable name; otherwise the hyphen is treated as subtraction. variable = SUM(OF var1-varN);
  • 54. MEAN function The MEAN function returns the arithmetic mean (average) and ignores missing values. variable = MEAN(argument1, argument2); variable: the variable you want to create. argument: variables, literals, or expressions to be averaged.
  • 55. ROUND function The ROUND function returns a value rounded to the nearest round-off unit. If the round-off unit is not provided, the value is rounded to the nearest integer. variable = ROUND(var1 <, round-off-unit>); Any number or fractional value can be used as a round-off unit.
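A short sketch of why the SUM function and the OF keyword matter. The variables q1 through q3 and their values are invented for this example.

```sas
data _null_;
   q1 = 100;
   q2 = .;                          /* missing value */
   q3 = 250;
   total_plus = q1 + q2 + q3;       /* missing, because + propagates missing values */
   total_sum  = sum(q1, q2, q3);    /* 350, missing value ignored */
   total_of   = sum(of q1-q3);      /* 350, q1-q3 read as a variable list */
   difference = sum(q1-q3);         /* -150, q1-q3 read as a subtraction */
   put total_plus= total_sum= total_of= difference=;
run;
```

The last two assignments differ only by the OF keyword, which is the point of the slide above.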
  • 56. INT function The INT function returns the integer portion of an argument. var1 = INT(var);
  • 57. Char-to-Num conversion You can perform explicit character-to-numeric conversion with the INPUT function. var = INPUT(var1, informat-name); Note : you cannot change a variable's type by assigning the result back to a variable with the same name, because the type is fixed once the variable is created. For example, this does not convert emp_salary: emp_salary = INPUT(emp_salary, informat-name);
  • 58. Num-to-Char conversion You can perform explicit numeric-to-character conversion with the PUT function. var = PUT(var1, format-name); Note : you cannot change a variable's type by assigning the result back to a variable with the same name, because the type is fixed once the variable is created. For example, this does not convert emp_salary: emp_salary = PUT(emp_salary, format-name);
  • 59. Working with Date Values Date values that are stored as SAS dates are special numeric values. A SAS date value is interpreted as the number of days between January 1, 1960, and a specific date. Examples (an informat converts the date to the number; a format displays the number as a date): 01JAN1959 = -365 (01/01/1959); 01JAN1960 = 0 (01/01/1960); 01JAN1961 = 366 (01/01/1961).
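One way to sketch both conversions is to write each result to a new variable, since the type of an existing variable cannot change. The variable names and the sample value below are assumptions for illustration.

```sas
data work.converted;
   salary_char = '52300';                      /* character value */
   salary_num  = input(salary_char, 8.);       /* character to numeric */
   salary_text = put(salary_num, dollar10.);   /* numeric to character */
   put salary_num= salary_text=;
run;
```

If the original name has to be kept, a common approach is to rename the old variable on the SET statement with the RENAME= data set option and then create the converted variable under the original name.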
  • 60. Converting Dates to SAS Date Values SAS uses date informats to read and convert dates to SAS date values, for example, Stored Value Informat Converted Value 10/29/1999 MMDDYY10. 14546 29OCT1999 DATE9. 14546 29/10/1999 DDMMYY10. 14546
  • 61. Writing SAS Date Values SAS uses date formats to write values from columns that represent dates, for example, Stored Value Format Displayed Value 0 MMDDYY10. 01/01/1960 0 DATE9. 01JAN1960 0 DDMMYY10. 01/01/1960 0 WEEKDATE. Friday, January 1, 1960
  • 62. SAS Time Values SAS time informats and formats can be used to read and write SAS time values. Examples: 12:00 AM = 0; 9:30 AM = 34200.
  • 63. SAS Datetime Values SAS datetimes are a combination of dates and times, and are measured as the number of seconds since midnight, January 1, 1960. 'ddmmmyyyy:hh:mm<:ss.s>'DT SAS datetime informats and formats can be used to read and write SAS datetime values. Examples: 12:00 AM on 01JAN1960 = 0; 9:30 AM on 05JUN1989 = 928661400.
  • 64. SAS Times Just as SAS has a starting point for dates, it also has a starting point for times. Time is measured as the number of seconds since midnight. 'hh:mm<:ss.s>'T
  • 65. INTNX Function The INTNX function advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value. var = INTNX('interval', start-from, increment); Example : var=intnx('year', var1, 1); VAR1: 17787 (SAS date for 12SEP2008). VAR: 17898 (SAS date for 01JAN2009).
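To see how date, time, and datetime values relate, a sketch like the following writes the same literals to the log first as raw numbers and then formatted. The variable names d, t, and dt are made up for this example.

```sas
data _null_;
   d  = '05JUN1989'd;              /* days since 01JAN1960 */
   t  = '09:30't;                  /* seconds since midnight */
   dt = '05JUN1989:09:30:00'dt;    /* seconds since 01JAN1960 00:00:00 */
   put d= t= dt=;                                    /* raw stored values */
   put d= date9. t= time5. dt= datetime18.;          /* formatted values */
run;
```

The raw values for t and dt should match the 34200 and 928661400 shown on the time and datetime slides.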
  • 66. TODAY () Function The TODAY() function returns the current date as SAS date var = TODAY ();
  • 67. INTCK Function The INTCK function returns the number of time intervals in a given time span. var = intck('interval', from, to);
  • 68. Other DATE Function var = YEAR (var1); var = MONTH (var1); var = DAY (var1); var = QTR (var1); var = WEEKDAY (var1); (1-7, 1=Sunday) var = DATEPART (var1); (date portion of a datetime value) var = MDY (month, day, year);
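A brief sketch of INTNX and INTCK together, using the 12SEP2008 date from the INTNX slide. The variable names are hypothetical, and the optional fourth INTNX argument ('same') is an addition not covered on the slides.

```sas
data _null_;
   start = '12SEP2008'd;
   /* Default alignment: the beginning of the resulting interval */
   year_start = intnx('year', start, 1);
   /* 'same' alignment: the same calendar day in the resulting interval */
   anniversary = intnx('year', start, 1, 'same');
   /* INTCK counts interval boundaries crossed, not elapsed time */
   boundaries = intck('year', '31DEC2008'd, '01JAN2009'd);
   put year_start= date9. anniversary= date9. boundaries=;
run;
```

The boundaries value is 1 even though the two dates are one day apart, because a year boundary falls between them.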
  • 69. JULIAN date To convert a Julian date to SAS date-value sas_date = DATEJUL (julian_date); To convert a SAS date to a Julian date-value : jul_date = JULDATE(sas_date);
  • 70. Creating a SAS date General Form of the MDY function MDY (month, day, year) Example : emp_hire_date = MDY(mon, day, year);
  • 71. User-defined date formats PROC FORMAT; PICTURE name value-range-set-1 (DATATYPE=DATE|TIME|DATETIME); RUN;
  • 72. The permitted directives %a Abbreviated weekday name %A Full weekday name %b Abbreviated month name %B Full month name %d Day of the month as a decimal number (1-31), with no leading zero %H Hour (24-hour clock) as a decimal number (0-23) with no leading zero %I Hour (12-hour clock) as a decimal number (1-12) with no leading zero %j Day of the year as a decimal number (1-366), with no leading zero %m Month as a decimal number (1-12) with no leading zero %M Minute as a decimal number (0-59) with no leading zero %p AM or PM %S Second as a decimal number (0-59) with no leading zero %U Week number of the year (Sunday as the first day of the week) as a decimal number (0-53) with no leading zero %Y Year with century as a decimal number
  • 73. Program sample PROC FORMAT; PICTURE myfmt low-high = '%0d-%b%Y ' (datatype=date); RUN;
  • 74. Cleansing Techniques Modify the data using the functions : UPCASE COMPBL TRANWRD
  • 75. UPCASE Function The UPCASE function converts all letters in the data value into uppercase var = UPCASE (var) Example : COUNTRY country=upcase(country); COUNTRY france FRANCE BEFORE AFTER Use the LOWCASE function to convert data values to lowercase
  • 76. COMPBL Function The COMPBL function compresses multiple consecutive blanks in a data value into one blank. Since the length of a variable is set at compilation, the resulting data value is padded with blanks. var = COMPBL (var) Example : NAME name = compbl(name); NAME DE PABLOS DE PABLOS BEFORE AFTER
  • 77. TRANWRD Function The TRANWRD function replaces all occurrences of a pattern of characters in a data value with another pattern of characters. var = TRANWRD (var, target, replacement); Example : name = tranwrd(name, 'Miss', 'Ms'); NAME (before): Miss. Joy Ho. NAME (after): Ms. Joy Ho.
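The three cleansing functions are often chained. The following is a small sketch with an invented input value; only the function calls themselves come from the slides.

```sas
data _null_;
   length name $ 20;
   name = 'miss   joy    ho';
   name = compbl(name);                   /* collapse repeated blanks to one */
   name = upcase(name);                   /* standardize the case */
   name = tranwrd(name, 'MISS', 'MS');    /* replace the title */
   put name=;
run;
```

Upcasing before TRANWRD means only one spelling of the target has to be handled, since TRANWRD is case sensitive.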
  • 78. Calculating Summary Statistics Model AircraftID InService TotPassCap MF4000 010012 10890 267 LF5200 030006 10300 207 LF5200 030008 11389 207 proc means data=ia.aircraftcap maxdec=2; var TotPassCap; class Model; run;
  • 80. Calculating Summary Statistics By default, PROC MEANS displays each classification variable and the following statistics for each analysis variable: N, Mean, Std Dev, Minimum, Maximum.
  • 81. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 82. Concatenating SAS Data Sets Data set IA.SALES1 (Jan, Feb, Mar, Apr, May, Jun) + data set IA.SALES2 (Jul, Aug, Sep, Oct, Nov, Dec) = one data set containing Jan through Dec.
  • 83. Concatenating SAS Data Sets Example program to concatenate two SAS data sets: data ia.personnel; set ia.employees ia.departments; run; Example program to concatenate several SAS data sets: data ia.airlines; set ia.airport ia.aircraft ia.schedule ia.budget ia.sales ia.personnel; run;
  • 84. Interleaving SAS data sets Example program to interleave two SAS data sets: data ia.personnel; set ia.employees (RENAME=(old=new)) ia.departments; by id; run;
  • 85. Preparing Data for Merging Often you must manipulate data before you can perform a merge. You might have to rename variables or sort the data.
  • 86. Sorting the data PROC SORT DATA = sas-data-set <OUT=sas-data-set>; BY <DESCENDING> variable(s); RUN;
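As a sketch of the preparation step, the two data sets used in the match-merge slides could be sorted by their BY variables first. The output names work.sales_sorted and work.sales_by_amount are assumptions for illustration.

```sas
/* Sort a copy of ia.sales so the permanent data set is left untouched */
proc sort data=ia.sales out=work.sales_sorted;
   by SaleMon Region;
run;

/* Sort work.goals in place by the matching BY variables */
proc sort data=work.goals;
   by Month Region;
run;

/* DESCENDING applies to the variable that follows it */
proc sort data=ia.sales out=work.sales_by_amount;
   by descending FSales;
run;
```

Without OUT=, PROC SORT replaces the input data set with the sorted version.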
  • 87. ... Performing a Match MERGE IA.SALES WORK.GOALS SaleMon Region FSales Month Region FGoal 1 Europe 2118222.62 1 Europe 2127742.48 1 North America 3135765.34 1 North America 2934441.72 2 Europe 1960034.47 2 Europe 1920751.20 DATA STEP data ia.comparison; merge ia.sales(rename=(SaleMon=Month)) work.goals; by Month Region; FClass=FSales-FGoal; run; IA.COMPARISON Month Region FSales FGoal FClass 1 Europe 2118222.62 2127742.48 -9519.86 1 North America 3135765.34 2934441.72 201323.62 2 Europe 1960034.47 1920751.20 39283.27
  • 88. Other Merges (Self-study) The DATA step merge works with many other kinds of data combinations: One-to-many Unique BY values are in one data set and duplicate matching BY values are in the other data set. Many-to-many Duplicate matching BY values are in both data sets. Non-matches Some BY values in one data set have no matching BY values in the other data set.
  • 89. One-To-Many Merging WORK.ONE WORK.TWO X Y X Z 1 A 1 A1 2 B 1 A2 3 C 2 B1 3 C1 3 C2 data work.three; merge work.one work.two; by X; X Y Z run; 1 A A1 1 A A2 2 B B1 3 C C1 3 C C2
  • 90. One-To-Many Merging IA.ALLSALES (Month Region Sales): 1 Europe 2118222.62; 1 North America 3135765.34; 2 Europe 1960034.47; 2 North America 2926929.91. IA.ALLGOALS (Month Goal): 1 2127742.48; 2 1920751.20; 3 2125112.75. data ia.allcompare; merge ia.allsales ia.allgoals; by Month; Difference=Sales-Goal; run; Month Region Sales Goal Difference 1 Europe 2118222.62 2127742.48 -9519.86 1 North America 3135765.34 2127742.48 1008022.86 2 Europe 1960034.47 1920751.20 39283.27 2 North America 2926929.91 1920751.20 1006178.71
  • 91. Many-To-Many Merging WORK.ONE WORK.TWO X Y X Z 1 A1 1 AA1 1 A2 1 AA2 2 B1 1 AA3 2 B2 2 BB1 2 BB2 data work.three; merge work.one work.two; by X; X Y Z run; 1 A1 AA1 1 A2 AA2 1 A2 AA3 2 B1 BB1 2 B2 BB2
  • 92. Many-To-Many Merging IA.ALLSALES2 (Month Sales): 1 2118222.62; 1 3135765.34; 2 1960034.47; 2 2926929.91. IA.ALLGOALS2 (Month Goal): 1 2127742.48; 1 2934441.72; 2 1920751.20; 2 2747787.49. data ia.allcompare2; merge ia.allsales2 ia.allgoals2; by Month; Difference=Sales-Goal; run; Month Sales Goal Difference 1 2118222.62 2127742.48 -9519.86 1 3135765.34 2934441.72 201323.62 2 1960034.47 1920751.20 39283.27 2 2926929.91 2747787.49 179142.42
  • 93. Merging With Non-matches WORK.ONE WORK.TWO X Y X Z 1 A 1 A1 2 B 3 C1 3 C 4 D1 data work.three; merge work.one work.two; by X; X Y Z run; 1 A A1 2 B 3 C C1 4 D1
  • 94. Merging With Non-matches IA.ESALES (Month Sales): 1 2118222.62; 2 1960034.47; 3 2094220.35. IA.EGOALS2 (Month Goal): 1 2127742.48; 3 2125112.75; 4 2058397.00. data ia.ecompare2; merge ia.esales ia.egoals2; by Month; Difference=Sales-Goal; run; Month Sales Goal Difference 1 2118222.62 2127742.48 -9519.86 2 1960034.47 . . 3 2094220.35 2125112.75 -30892.40
  • 95. Review the Match-merge (1) data work.tot_sales; merge ia.sales (in=a) ia.transaction (in=b); by sales_id; if a and b; run;
  • 96. Review the Match-merge (2) data not_in_a not_in_b; merge work.a (in=a) work.b (in=b); by num; if a and not b then output not_in_b; else if b and not a then output not_in_a; run;
  • 97. Reading a subset of Raw Data Use the DATA step that was written earlier. Add a subsetting IF statement to process only the subset in which the value of AGE is at least 15. data work.aircraft; set ia.aircraft (firstobs=5 obs=10); YrInService=year(InService); Age=year(today())-YrInService; if Age>=15; run;
  • 98. Subsetting Your Data with the WHERE Statement The WHERE statement enables you to select observations that meet a certain condition before SAS brings the observation into the PROC REPORT step. Date FlightID TotPassCap TotPass TotRev 04JAN1999 IA00300 207 186 $140,170.00 26NOV1999 IA00300 207 176 $133,704.00 31DEC1999 IA00401 207 171 $129,491.00 proc report data=ia.sales1999 nowd; where Date between '24nov1999'd and '03jan2000'd; run; Date FlightID TotPassCap TotPass TotRev 26NOV1999 IA00300 207 176 $133,704.00 31DEC1999 IA00401 207 171 $129,491.00
  • 99. WHERE or IF Which statement can be used (WHERE statement / IF statement). PROC step: Yes / No. DATA step, by source of variable: INPUT statement: No / Yes; assignment statement: No / Yes; SET statement (single data set): Yes / Yes; SET/MERGE (multiple data sets), variable in ALL data sets: Yes / Yes; SET/MERGE (multiple data sets), variable not in ALL data sets: No / Yes.
  • 100. Operators The WHERE statement can be used with – comparison operators – logical operators. You can also use the WHERE statement with special operators.
  • 101. Comparison Operators Mnemonic Symbol Definition EQ = equal to NE ^= not equal to GT > greater than LT < less than GE >= greater than or equal to LE <= less than or equal to IN equal to one of a list
  • 102. Comparison Operators Examples: where Salary>25000; where EmpID='0082'; where Salary=.; where LastName=' ';
  • 103. Logical Operators Examples: where JobCode='FLTAT3' and Salary>50000; where JobCode='PILOT1' or JobCode='PILOT2' or JobCode='PILOT3';
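The chained OR conditions above can also be written with the IN operator from the comparison operator table. This sketch assumes the work.empdata data set created earlier, which has JobCode and Salary.

```sas
proc print data=work.empdata;
   /* IN replaces the three OR comparisons; AND adds a second condition */
   where JobCode in ('PILOT1', 'PILOT2', 'PILOT3') and Salary > 50000;
run;
```

Using IN also avoids precedence surprises, since AND is evaluated before OR when the two are mixed without parentheses.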
  • 104. Special Operators The following are special operators : LIKE selects observations by comparing character values to specified patterns. A percent sign (%) replaces any number of characters and an underscore (_) replaces one character. where Code like 'E_U%'; (E, a single character, U, followed by any characters.) continued...
  • 105. Special Operators The sounds-like (=*) operator selects observations that contain a spelling variation of the word or words specified. where Name=*'SMITH'; (Selects the names Smythe, Smitt, and so on.) CONTAINS or ? selects observations that include the specified substring. where Word ? 'LAM'; (BLAME, LAMENT, and BEDLAM are selected.) continued...
  • 106. Special Operators IS NULL or IS MISSING selects observations in which the value of the variable is missing. where Flight is missing; BETWEEN-AND selects observations in which the value of the variable falls within a range of values. where Date between '01mar1999'd and '01apr1999'd;
  • 107. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 108. Using DO-LOOP command DO index-variable = start TO stop <BY increment>; SAS statements; END;
  • 109. DO-LOOP syntax General form of a DO LOOP with a value list : DO index-variable = value1, value2, value3; SAS statements; END;
  • 111. Performing a calculation until the condition is met DO WHILE DO UNTIL
  • 112. DO WHILE syntax DO WHILE (expression); SAS statements; END; The statements in the loop execute repeatedly while the expression is true. The expression is evaluated at the top of the loop. The statements in the loop are never executed if the expression is initially false.
  • 113. DO UNTIL syntax DO UNTIL (expression); SAS statements; END; The statements in the loop iteratively execute until the expression becomes true The expression is evaluated at the bottom of the loop The statements in the loop are executed at least once
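The difference between testing at the top and at the bottom of the loop can be sketched with two counters. The variable names are invented for this example.

```sas
data _null_;
   count_while = 0;
   do while (count_while > 0);     /* false at the top, so the body is skipped */
      count_while + 1;
   end;

   count_until = 0;
   do until (count_until >= 0);    /* already true, but tested at the bottom */
      count_until + 1;
   end;

   put count_while= count_until=;  /* writes count_while=0 count_until=1 */
run;
```

DO UNTIL is therefore the safer choice only when the body must run at least once.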
  • 114. Sample of programs data work.retire; set ia.employee_data; service_yrs = year(today()) - year(hire_date); do while (service_yrs <= 30); emp_salary = emp_salary * 1.05; service_yrs = service_yrs + 1; end; year_30 = year(intnx('year', hire_date, 30)); retire_date = mdy(month(hire_date), day(hire_date), year_30); run;
  • 116. Sampling of SAS data set (1) row_to_read = 10; set ia.employee_data point=row_to_read nobs=total_rows; output; stop; Notes : The NOBS= option creates a temporary variable that contains the number of observations in the SAS data set. This value is assigned during compilation, which means you can reference this variable before the SET statement.
  • 117. Sampling of SAS data set (2) sample_size = 100; do while (sample_size > 0); SAS statements…; end; stop; Notes : The NOBS= option creates a temporary variable that contains the number of observations in the SAS data set. This value is assigned during compilation, which means you can reference this variable before the SET statement.
  • 118. Getting the random number random_number = ranuni(0);
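Putting the last three slides together, a random sample might be drawn as sketched below. This is one possible arrangement rather than the deck's own program; the output name work.sample is an assumption, and rows are sampled with replacement, so the same observation can appear more than once.

```sas
data work.sample;
   sample_size = 100;
   do while (sample_size > 0);
      /* total_rows is set at compilation by NOBS=, so it can be used here */
      row_to_read = ceil(ranuni(0) * total_rows);
      set ia.employee_data point=row_to_read nobs=total_rows;
      output;
      sample_size = sample_size - 1;
   end;
   stop;                 /* required: POINT= never reaches end of file */
   drop sample_size;
run;
```

The POINT= and NOBS= variables are temporary and are not written to work.sample.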
  • 119. Agenda Introduction Read data from raw data file Formatting the data Data Manipulation and statistical analysis Combining and subsetting data Processing the data iteratively Report Production
  • 120. Creating a List Report (1) Model AircraftID InService TotPassCap Size MF4000 010012 10890 267 Large LF5200 030006 10300 207 Large LF5200 030008 11389 207 Large proc print data=ia.aircraftcap; var AircraftID Size TotPassCap; run; AircraftID Size TotPassCap 010012 Large 267 030006 Large 207 030008 Large 207
  • 121. Creating a List Report (2) Model AircraftID InService TotPassCap Size MF4000 010012 10890 267 Large LF5200 030006 10300 207 Large LF5200 030008 11389 207 Large proc report data=ia.aircraftcap nowd; column AircraftID Size TotPassCap; run; AircraftID Size TotPassCap 010012 Large 267 030006 Large 207 030008 Large 207
  • 122. The DEFINE Statement General form of the DEFINE statement: DEFINE variable /<usage> <attribute-list>; You can define options (usage and attributes) in the DEFINE statement in any order. Default usage for character variables is DISPLAY. – The report lists all of the variable’s values from the data set.
  • 123. The DEFINE Statement Default usage for numeric variables is ANALYSIS. If the report contains at least one display variable and no group variables, the report lists all of the values of the numeric variable. If the report contains only numeric variables, the report displays grand totals for the numeric variables. If the report contains group variables, the report displays the sum of the numeric variables’ values for each group.
  • 124. The DEFINE Statement Other available statistics include N number of nonmissing values MEAN average MAX maximum value MIN minimum value
  • 125. The DEFINE Statement Additional usage: ORDER determines the order of the rows in the report. • The default order is ascending. • To force the order to be descending, include the DESCENDING option on the DEFINE statement. • Repetitious printing of values is suppressed.
  • 126. The DEFINE Statement Selected attributes: FORMAT= assigns a format to a variable. • If there is a format stored in the descriptor portion of the data set it is the default format. WIDTH= controls the width of a report column. • The default width is the variable length. continued...
  • 127. The DEFINE Statement CENTER, LEFT, and RIGHT identify the justification of values and the header within the report column. • The default is LEFT for character values and RIGHT for numeric values. continued...
  • 128. The DEFINE Statement 'report-column-header' defines the report column header. • If there is a label stored in the descriptor portion of the data set it is the default header.
  • 129. Creating an Enhanced List Report The enhanced aircraft capacity list report includes – appropriate report column headings – formatted values for the INSERVICE variable – column widths wide enough for the headings – values and headings centered within the columns – rows of the report ordered by descending values of the variable SIZE.
  • 130. Adding Options to Enhance Report Appearance Selected PROC REPORT options: HEADLINE underlines all column headers and the spaces between them. HEADSKIP writes a blank line beneath all column headers.
  • 131. Writing A PROC REPORT step Use DEFINE statements to define the variables as display variables. – Add column headers and specify column width. – Add formats and specify alignment. Add titles. proc report data=ia.sales1999 nowd headline headskip; column Date FlightId TotPass TotRev; define Date / display center 'Sales Date'; define FlightId / display center 'Flight'; define TotPass / display format=3. width=10 center 'Total Passengers'; define TotRev / display format=dollar11.2 center 'Total Revenue'; title 'Sales and Passenger Data for 1999'; run;
  • 132. Controlling Report Appearance Use the HEADLINE option to underline the column headers and the HEADSKIP option to add a blank line beneath the column headers. Add titles. title1 'Sales and Passenger Data, by Day of Week'; title2 'Sunday through Saturday'; proc report data=ia.sales1999 headline headskip nowd; column Day TotPass TotRev; define Day / group 'Day of Week'; define TotPass / sum format=comma5. width=10 center 'Total Passengers'; define TotRev / sum format=dollar13.2 center 'Total Revenue'; run;
  • 133. PROC SUMMARY PROC SUMMARY DATA = sas-data-set; VAR analysis-variable(s); CLASS class-variable(s); OUTPUT OUT = output-data-set STATISTIC = variable(s); RUN; Statistics in PROC SUMMARY : • N Number of observations with no missing values • MEAN average • STD Standard Deviation • MIN Minimum value • MAX Maximum value
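A sketch of the general form applied to the aircraft capacity data used earlier in the deck. The output data set name work.capstats and the statistic variable names AvgCap and MaxCap are assumptions for illustration.

```sas
proc summary data=ia.aircraftcap;
   class Model;
   var TotPassCap;
   /* Each statistic keyword names the output variable that holds it */
   output out=work.capstats mean=AvgCap max=MaxCap;
run;

/* PROC SUMMARY prints nothing by default, so list the output data set */
proc print data=work.capstats;
run;
```

The output data set also contains the automatic variables _TYPE_ and _FREQ_, and includes an overall row in addition to one row per Model.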
  • 134. PROC TABULATE PROC TABULATE DATA = sas-data-set; CLASS class-variable(s); VAR analysis-variable(s); TABLE class_var<*analysis var* Stat>, class_var<*analysis var * stat>; RUN;
  • 135. PROC TABULATE (detail) PROC TABULATE DATA=sas-dataset (WHERE=(condition)) FORMAT=COMMA20.0; CLASS Card_group Type; VAR Balance N_account Crd_limit; TABLE (Type ALL='Total'*{s={foreground=#002288 Background=white}}), N_account*(SUM PCTSUM) Balance='Current Balance'*(SUM PCTSUM) crd_limit='Credit Limit'*(SUM PCTSUM MEAN) Balance='Percentage Balance to Limit'*PCTSUM <Crd_limit> *f=comma25.2; BY Card_Group; RUN;
  • 136. The Output Delivery System ODS statements enable you to create output in a variety of forms. ODS SAS Output Window SAS Report HTML File
  • 137. Generating HTML Files The ODS HTML statement opens, closes, or manages the HTML destination. General form of the ODS statement to create an HTML file: ODS HTML FILE='HTML-file-specification' <options>; SAS code generating output ODS HTML CLOSE;
  • 138. Creating an HTML Report Create a report and close the HTML destination. ods html file='listing.html'; proc report data=ia.comparison nowd; column Month …; define Month /…; other statements run; ods html close;