The Many Ways to Effectively Utilize Array Processing Arthur Li
Array processing allows us to reduce the amount of coding in the DATA step. In addition to showing how to create one- and multi-dimensional arrays, this paper reviews how to create an explicit loop in the DATA step, a prerequisite for working with arrays. It also shows what happens in the Program Data Vector (PDV) during array processing and covers a range of applications of loop structures with array processing, such as recoding missing values for a list of variables and transforming datasets.

Published in: Education, Technology, Business
  1. The Many Ways to Effectively Utilize Array Processing
     Arthur Li

  2. INTRODUCTION
     • Why do we need to use arrays?
         - Arrays allow us to reduce the amount of coding in the DATA step
     • What is essential for learning arrays?
         - Compilation and execution of the DATA step
         - How the Program Data Vector (PDV) works

  3. REVIEW: COMPILATION AND EXECUTION PHASES
     A DATA step is processed in two sequential phases:
     • Compilation phase: each statement is scanned for syntax errors.
     • Execution phase: if there are no syntax errors, the DATA step reads and processes the input data.

  4. REVIEW IMPLICIT AND EXPLICIT LOOPS
     REVIEW IMPLICIT LOOP
     • The DATA step works like a loop: an implicit loop
     • It repetitively executes statements
         - reads data values
         - creates observations in the PDV one at a time
     • Each pass through the loop is called an iteration
     • Suppose you have the following dataset that contains patient IDs for a clinical trial
     • You would like to assign each patient to either a drug or a placebo (a 50% chance of each)

     Patient:
         Obs  ID
         1    M2390
         2    F2390
         3    F2340
         4    M1240

  5. REVIEW IMPLICIT LOOP
     data trial1 (drop=rannum);
        set patient;
        rannum = ranuni(2);
        if rannum > 0.5 then group = 'D';
        else group = 'P';
     run;

     1st iteration:
     • _N_ is set to 1
     • _ERROR_ is set to 0
     • The rest of the variables are set to missing

     PDV:
         _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
         1        0                    .

  6. REVIEW IMPLICIT LOOP
     1st iteration:
     • The SET statement copies the 1st observation to the PDV

     PDV:
         _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
         1        0            M2390   .

  7. REVIEW IMPLICIT LOOP
     1st iteration:
     • RANNUM is generated

     PDV:
         _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
         1        0            M2390   0.36993

  8. REVIEW IMPLICIT LOOP
     1st iteration:
     • GROUP is set to 'P' since RANNUM is not > 0.5

     PDV:
         _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
         1        0            M2390   0.36993     P

  9. REVIEW IMPLICIT LOOP
     1st iteration:
     • The implicit OUTPUT statement writes the variables marked with (K) to the final dataset
     • SAS returns to the beginning of the DATA step

     Trial1:
         Obs  ID     GROUP
         1    M2390  P

  10. REVIEW IMPLICIT LOOP
      2nd iteration:
      • _N_ increases to 2
      • Variables that exist in the input dataset:
          - SAS sets each variable to missing in the PDV only before the 1st iteration of the execution
          - Variables retain their values in the PDV until they are replaced by new values

      PDV:
          _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
          2        0            M2390   .

  11. REVIEW IMPLICIT LOOP
      2nd iteration:
      • Variables being created in the DATA step:
          - SAS sets each variable to missing in the PDV at the beginning of every iteration of the execution

      PDV:
          _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
          2        0            M2390   .

  12. REVIEW IMPLICIT LOOP
      2nd iteration:
      • The SET statement copies the 2nd observation to the PDV

      PDV:
          _N_ (D)  _ERROR_ (D)  ID (K)  RANNUM (D)  GROUP (K)
          2        0            F2390   .

      The remaining iterations are skipped….
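
The difference between variables read from the input dataset (retained) and variables created in the DATA step (reset to missing) can be observed in the SAS log. The following is a minimal sketch, not part of the original slides: the `data _null_` step, the `length` declarations, and the PUT text are illustrative assumptions, and the Patient dataset is recreated here only so the example is self-contained.

```sas
/* Hypothetical copy of the Patient dataset used in the slides. */
data patient;
   length id $5;
   input id $;
   datalines;
M2390
F2390
F2340
M1240
;
run;

data _null_;
   /* Declare the variables first so the PUT statement below can reference them. */
   length id $5 group $1;

   /* Written at the top of each iteration, before SET reads a new observation. */
   put 'Top of iteration: ' _n_= id= group=;

   set patient;   /* ID comes from the input dataset, so its value is retained    */
   group = 'P';   /* GROUP is created in this step, so it is reset each iteration */
run;
```

On the second and later iterations, the log line should show the ID of the previous observation while GROUP is blank again, which matches the PDV behaviour described for slides 10 and 11.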

  13. REVIEW: OUTPUT STATEMENT
      data trial1 (drop=rannum);
         set patient;
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;
      run;

      • The explicit OUTPUT statement:
          - Writes the current observation from the PDV to the SAS dataset immediately
          - Not at the end of the DATA step

  14. REVIEW: OUTPUT STATEMENT
      data trial1 (drop=rannum);
         set patient;
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
      run;

      • The implicit OUTPUT statement:
          - It tells SAS to write observations to the dataset at the end of the DATA step
          - Without explicit OUTPUT statements, every DATA step contains an implicit OUTPUT statement at the end of the DATA step

  15. REVIEW: OUTPUT STATEMENT
      • Placing an explicit OUTPUT statement in the DATA step:
          - Overrides the implicit OUTPUT
          - SAS adds an observation to a dataset only when an explicit OUTPUT is executed
          - We can use more than one OUTPUT statement in the DATA step
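
The three points about the explicit OUTPUT statement can be seen together in one small DATA step. This is an illustrative sketch only: the dataset name `visits` and the variable `visit` are hypothetical and do not appear in the slides.

```sas
data visits;
   id = 'M2390';

   visit = 1;
   output;      /* writes the first observation immediately */

   visit = 2;
   output;      /* writes a second observation in the same iteration */

   visit = 3;   /* never written: because explicit OUTPUT statements are
                   present, there is no implicit OUTPUT at the end */
run;
```

The resulting dataset contains two observations (VISIT = 1 and VISIT = 2); the last assignment changes the PDV but is never written out.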

  16. REVIEW EXPLICIT LOOP
      • Suppose you don't have a dataset containing the patient IDs
      • You are asked to assign four patients, 'M2390', 'F2390', 'F2340', 'M1240', with a 50% chance of receiving either the drug or the placebo
      • You can create the ID and assign each ID to a group in the DATA step at the same time. For example:

  17. REVIEW EXPLICIT LOOP
      Assigning IDs in the DATA step:

      data trial2 (drop = rannum);
         id = 'M2390';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;

         id = 'F2390';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;

         id = 'F2340';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;

         id = 'M1240';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;
      run;

  18. REVIEW EXPLICIT LOOP
      The same DATA step contains 4 explicit OUTPUT statements.

  19. REVIEW EXPLICIT LOOP
      The same DATA step consists of 4 almost identical blocks:
      • Put the identical code in a loop
      • Loop along the IDs
      • Reduce the amount of coding

  20. ITERATIVE DO LOOP
      data trial2 (drop = rannum);
         id = 'M2390';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;
         id = 'F2390';
         ...
         id = 'F2340';
         ...
         id = 'M1240';
         rannum = ranuni(2);
         if rannum > 0.5 then group = 'D';
         else group = 'P';
         output;
      run;

      DO INDEX-VARIABLE = VALUE1, VALUE2, …, VALUEN;
         SAS STATEMENTS
      END;

      • INDEX-VARIABLE: ID
      • VALUE1 – VALUEN: 'M2390', 'F2390', 'F2340', 'M1240'
      • SAS STATEMENTS:
            rannum = ranuni(2);
            if rannum > 0.5 then group = 'D';
            else group = 'P';
            output;

  21. ITERATIVE DO LOOP
      The DATA step rewritten with an iterative DO loop:

      data trial2 (drop = rannum);
         do id = 'M2390', 'F2390', 'F2340', 'M1240';
            rannum = ranuni(2);
            if rannum > 0.5 then group = 'D';
            else group = 'P';
            output;
         end;
      run;
  22. 22. THE ITERATIVE DO LOOP ALONG A SEQUENCE OF INTEGERS data trial3 (drop = rannum); do id = 1 to 4 ; rannum = ranuni( 2 ); if rannum> 0.5 then group = 'D' ; else group = 'P' ; output ; end ; run ; <ul><li>Suppose you are using a sequence of numbers, say 1 to 4, as patient IDs </li></ul>DO INDEX-VARIABLE = START TO STOP < BY INCREMENT> ; SAS STATEMENTS END; <ul><li>INDEX-VARIABLE: ID </li></ul><ul><li>START: 1 </li></ul><ul><li>STOP: 4 </li></ul><ul><li>INCREMENT: 1 </li></ul>
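The optional BY INCREMENT clause in the syntax above is not exercised on the slide. A minimal sketch of it follows, assuming a hypothetical output dataset name ODD_IDS that does not appear in the original:

```sas
/* Sketch only: ODD_IDS is an illustrative dataset name. */
data odd_ids;
  do id = 1 to 7 by 2;   /* id takes the values 1, 3, 5, 7 */
    output;              /* one observation per iteration   */
  end;
run;
```

When BY is omitted, as in the trial3 step, the increment defaults to 1.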
  23. 23. PURPOSE OF USING ARRAYS <ul><li>6 measurements of SBP for each patient </li></ul><ul><li>The missing values are coded as 999 </li></ul><ul><li>Suppose you would like to recode 999 to periods (.) </li></ul>data sbp1; set sbp; if sbp1 = 999 then sbp1 = .; if sbp2 = 999 then sbp2 = .; if sbp3 = 999 then sbp3 = .; if sbp4 = 999 then sbp4 = .; if sbp5 = 999 then sbp5 = .; if sbp6 = 999 then sbp6 = .; run; <ul><li>The IF statements are almost identical </li></ul><ul><li>Only the variable names are different </li></ul><ul><li>Use a DO loop? </li></ul>Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123.
  24. 24. PURPOSE OF USING ARRAYS <ul><li>RECALL: DO LOOP </li></ul>Without a loop: data trial2(drop = rannum); id = 'M2390'; rannum = ranuni(2); if rannum > 0.5 then group = 'D'; else group = 'P'; output; id = 'F2390'; rannum = ranuni(2); if rannum > 0.5 then group = 'D'; else group = 'P'; output; id = 'F2340'; rannum = ranuni(2); if rannum > 0.5 then group = 'D'; else group = 'P'; output; id = 'M1240'; rannum = ranuni(2); if rannum > 0.5 then group = 'D'; else group = 'P'; output; run; Difference between the four blocks: the value of the ID variable. With a loop: data trial2 (drop = rannum); do id = 'M2390', 'F2390', 'F2340', 'M1240'; rannum = ranuni(2); if rannum > 0.5 then group = 'D'; else group = 'P'; output; end; run; <ul><li>The loop iterates along a sequence of values </li></ul><ul><li>The index variable holds these values </li></ul>
  25. 25. PURPOSE OF USING ARRAYS data sbp1; set sbp; if sbp1 = 999 then sbp1 = .; if sbp2 = 999 then sbp2 = .; if sbp3 = 999 then sbp3 = .; if sbp4 = 999 then sbp4 = .; if sbp5 = 999 then sbp5 = .; if sbp6 = 999 then sbp6 = .; run; Difference between the statements: the variable names. If we can group these variables into a single unit → we can loop along these variables. ARRAY: a temporary grouping of SAS variables. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. Array SBP: elements [1], [2], [3], [4], [5], [6] correspond to sbp1, sbp2, sbp3, sbp4, sbp5, sbp6.
  26. 26. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; ARRAYNAME: <ul><li>Must be a valid SAS name </li></ul><ul><li>Cannot be the name of a SAS variable in the same DATA step </li></ul><ul><li>See handouts for other rules </li></ul>
  27. 27. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; <ul><li>DIMENSION is the number of elements in the array </li></ul><ul><li>More on DIMENSION later… </li></ul>
  28. 28. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; <ul><li>$ indicates that the elements in the array are character elements </li></ul><ul><li>$ is not necessary if the elements have been previously defined as character elements </li></ul>
  29. 29. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; <ul><li>ELEMENTS are the variables to be included in the array </li></ul><ul><li>Must be either all numeric or all character </li></ul><ul><li>More on ELEMENTS later… </li></ul>
  30. 30. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; array sbparray [ 6 ] sbp1 sbp2 sbp3 sbp4 sbp5 sbp6;
  31. 31. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; array sbparray [*] sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; <ul><li>You can use an asterisk (*) as DIMENSION </li></ul><ul><li>You must include ELEMENTS </li></ul>
  32. 32. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; array sbparray ( 6 ) sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; array sbparray { 6 } sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; array sbparray [ 6 ] sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; <ul><li>DIMENSION can be enclosed in parentheses, braces, or brackets </li></ul>
  33. 33. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; <ul><li>If ELEMENTS are not specified, SAS builds the element names from ARRAYNAME and the numbers 1 to DIMENSION; for example, array sbp[6]; is equivalent to array sbp[6] sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; </li></ul>Case 1: if sbp1 – sbp6 were previously defined in the DATA step, the array groups the existing variables. Case 2: if sbp1 – sbp6 were not previously defined in the DATA step, they are created by the ARRAY statement.
  34. 34. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; array num [*] _numeric_; array char [*] _character_; array allvar [*] _all_; <ul><li>_NUMERIC_: all numeric variables </li></ul><ul><li>_CHARACTER_: all character variables </li></ul><ul><li>_ALL_: all the variables, which must be either all numeric or all character </li></ul>
  35. 35. ARRAY DEFINITION AND SYNTAX ARRAY ARRAYNAME [DIMENSION] <$> <ELEMENTS>; array sbp [ 6 ] sbp1 - sbp6; <ul><li>A single dash format can be used to specify a range of variables </li></ul>
  36. 36. ARRAY DEFINITION AND SYNTAX <ul><li>To reference an array element: ARRAYNAME [INDEX] </li></ul>INDEX: <ul><li>must be enclosed in ( ), [ ], or { } </li></ul><ul><li>is specified as an integer, a numeric variable, or a SAS expression </li></ul><ul><li>must be within the lower and upper bounds of the DIMENSION of the array </li></ul>
  37. 37. ARRAY DEFINITION AND SYNTAX Without an array: data sbp1; set sbp; if sbp1 = 999 then sbp1 = .; if sbp2 = 999 then sbp2 = .; if sbp3 = 999 then sbp3 = .; if sbp4 = 999 then sbp4 = .; if sbp5 = 999 then sbp5 = .; if sbp6 = 999 then sbp6 = .; run; With an array that lists its elements (array sbparray[6] sbp1 - sbp6; is an equivalent shorthand): data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 sbp2 sbp3 sbp4 sbp5 sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; With an array that omits its elements (array sbp[6]; is equivalent to array sbp[6] sbp1 - sbp6;): data sbp2 (drop=i); set sbp; array sbp[6]; do i = 1 to 6; if sbp[i] = 999 then sbp[i] = .; end; run; Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123.
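The three kinds of index listed above can be sketched in one small step. The variable names FIRST, SECOND, and THIRD are illustrative assumptions, and the initial values are simply the first row of the Sbp table:

```sas
/* Sketch only: FIRST, SECOND, THIRD are hypothetical names. */
data _null_;
  array sbparray[6] sbp1 - sbp6 (141 142 137 117 116 124);
  i = 2;
  first  = sbparray[1];      /* integer constant as index: sbp1 */
  second = sbparray[i];      /* numeric variable as index: sbp2 */
  third  = sbparray[i + 1];  /* SAS expression as index: sbp3   */
  put first= second= third=; /* writes the three values to the log */
run;
```

An index outside 1 to 6 here would stop the step with an "array subscript out of range" error.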
  38. 38. THE DIM FUNCTION data sbp3 (drop=i); set sbp; array sbparray [*] sbp1 - sbp6; do i = 1 to dim(sbparray); if sbparray [i] = 999 then sbparray [i] = . ; end ; run ; <ul><li>Use the DIM function to determine the number of elements in an array </li></ul><ul><li>It is convenient when you use _NUMERIC_, _CHARACTER_, _ALL_ as array ELEMENTS </li></ul>DIM (ARRAYNAME)
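As a rough illustration of why DIM is convenient with _NUMERIC_, the sketch below recodes 999 in every numeric variable without counting them first. The dataset name SBP3B and the array name NUMS are assumptions, not names from the slides:

```sas
/* Sketch only: SBP3B and NUMS are illustrative names. */
data sbp3b (drop=i);
  set sbp;
  array nums[*] _numeric_;   /* all numeric variables defined so far */
  do i = 1 to dim(nums);     /* dim() returns the element count      */
    if nums[i] = 999 then nums[i] = .;
  end;
run;
```

Because the ARRAY statement appears before I is first mentioned, I is not an element of NUMS. Note that _NUMERIC_ would also pick up any other numeric variable in Sbp, such as a numeric ID, so this form suits datasets where 999 always means missing.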
  39. 39. ASSIGNING INITIAL VALUES TO AN ARRAY <ul><li>When creating a group of variables by using the ARRAY statement, you can assign initial values to the array elements </li></ul>array num[ 3 ] n1 n2 n3 ( 1 2 3 ); array chr[ 3 ] $ ( 'A' , 'B' , 'C' );
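To see the effect of the two ARRAY statements above, a small sketch can write the resulting variables to the log. It assumes nothing beyond the statements on the slide; because CHR lists no ELEMENTS, SAS creates chr1 – chr3:

```sas
data _null_;
  array num[3] n1 n2 n3 (1 2 3);
  array chr[3] $ ('A', 'B', 'C');      /* creates chr1, chr2, chr3 */
  put n1= n2= n3= chr1= chr2= chr3=;   /* show the initial values  */
run;
```

Variables that receive initial values this way are retained across DATA step iterations.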
  40. 40. TEMPORARY ARRAYS <ul><li>Temporary arrays contain temporary data elements </li></ul><ul><li>Using temporary arrays is useful when you want to create an array only for calculation purposes </li></ul><ul><li>When referring to a temporary data element, you refer to it by the ARRAYNAME and its DIMENSION </li></ul><ul><li>You cannot use the asterisk (*) with temporary arrays </li></ul><ul><li>They are not output to the output dataset </li></ul><ul><li>They are always automatically retained </li></ul><ul><li>To create a temporary array, you need to use the keyword _TEMPORARY_ </li></ul>array num[ 3 ] _temporary_ ( 1 2 3 );
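A minimal sketch of a temporary array used purely for calculation follows. The names RUNNING, TOTAL, SUM, and SUMSQ are assumptions chosen for illustration:

```sas
/* Sketch only: all dataset and variable names are illustrative. */
data running (drop=i);
  array total[2] _temporary_ (0 0);  /* scratch storage, never written out */
  do i = 1 to 3;
    total[1] = total[1] + i;         /* running sum of i         */
    total[2] = total[2] + i * i;     /* running sum of i squared */
    sum   = total[1];
    sumsq = total[2];
    output;
  end;
run;
```

The output dataset contains only SUM and SUMSQ; no TOTAL1 or TOTAL2 variables exist, since temporary elements have no variable names.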
  41. 41. COMPILATION AND EXECUTION PHASES COMPILATION PHASE data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; <ul><li>PDV is created </li></ul><ul><li>The array name SBPARRAY and the array references are not included in the PDV </li></ul><ul><li>SBP1 – SBP6 are referenced through the array references SBPARRAY[1] – SBPARRAY[6] </li></ul><ul><li>Syntax errors in the ARRAY statement are detected during the compilation phase </li></ul>PDV (K = kept, D = dropped): _N_ [D]; I [D]; SBP1 – SBP6 [K]. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  42. 42. EXECUTION PHASE 1st iteration of the DATA step: <ul><li>_N_ → 1 </li></ul><ul><li>The rest of the variables → missing </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = .; SBP1 – SBP6 [K] = all missing. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  43. 43. EXECUTION PHASE 1st iteration of the DATA step: <ul><li>The SET statement copies the 1st obs. from Sbp to the PDV </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = .; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  44. 44. EXECUTION PHASE 1st iteration of the DATA step: <ul><li>The ARRAY statement is a compile-time-only statement; nothing is executed for it </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = .; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  45. 45. EXECUTION PHASE 1st iteration of the DATA step, 1st iteration of the DO loop: <ul><li>I → 1 </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 1; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  46. 46. EXECUTION PHASE 1st iteration of the DATA step, 1st iteration of the DO loop: <ul><li>SBPARRAY[i] → SBPARRAY[1] </li></ul><ul><li>SBPARRAY[1] → SBP1 </li></ul><ul><li>Since SBP1 ≠ 999, the THEN clause is not executed </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 1; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  47. 47. EXECUTION PHASE 1st iteration of the DATA step, 1st iteration of the DO loop: <ul><li>SAS reaches the end of the DO loop </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 1; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  48. 48. EXECUTION PHASE 1st iteration of the DATA step, 2nd iteration of the DO loop: <ul><li>I → 2 </li></ul><ul><li>Since I ≤ 6, the loop continues </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 2; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  49. 49. EXECUTION PHASE 1st iteration of the DATA step, 2nd iteration of the DO loop: <ul><li>SBPARRAY[i] → SBPARRAY[2] </li></ul><ul><li>SBPARRAY[2] → SBP2 </li></ul><ul><li>Since SBP2 ≠ 999, the THEN clause is not executed </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 2; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  50. 50. EXECUTION PHASE 1st iteration of the DATA step, 2nd iteration of the DO loop: <ul><li>SAS reaches the end of the DO loop </li></ul><ul><li>The remaining DO loop iterations (I = 3 to 6) work the same way and are skipped in this walk-through </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 2; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6.
  51. 51. EXECUTION PHASE 1st iteration of the DATA step: <ul><li>SAS reaches the end of the DATA step </li></ul><ul><li>The implicit OUTPUT executes </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 1; I [D] = 7; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  52. 52. EXECUTION PHASE 2nd iteration of the DATA step: <ul><li>_N_ ↑ 2 </li></ul><ul><li>SBP1 – SBP6 are retained </li></ul><ul><li>I → missing </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = .; SBP1 – SBP6 [K] = 141, 142, 137, 117, 116, 124. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  53. 53. EXECUTION PHASE 2nd iteration of the DATA step: <ul><li>The SET statement copies the 2nd obs. to the PDV </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = .; SBP1 – SBP6 [K] = 999, 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  54. 54. EXECUTION PHASE 2nd iteration of the DATA step, 1st iteration of the DO loop: <ul><li>I → 1 </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 1; SBP1 – SBP6 [K] = 999, 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  55. 55. EXECUTION PHASE 2nd iteration of the DATA step, 1st iteration of the DO loop: <ul><li>SBPARRAY[i] → SBPARRAY[1] </li></ul><ul><li>SBPARRAY[1] → SBP1 </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 1; SBP1 – SBP6 [K] = 999, 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  56. 56. EXECUTION PHASE 2nd iteration of the DATA step, 1st iteration of the DO loop: <ul><li>SBPARRAY[i] → SBPARRAY[1] </li></ul><ul><li>SBPARRAY[1] → SBP1 </li></ul><ul><li>Since SBP1 = 999, SBP1 → missing </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 1; SBP1 – SBP6 [K] = ., 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  57. 57. EXECUTION PHASE 2nd iteration of the DATA step, 1st iteration of the DO loop: <ul><li>SAS reaches the end of the DO loop </li></ul><ul><li>The remaining DO loop iterations (I = 2 to 6) are skipped in this walk-through </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 1; SBP1 – SBP6 [K] = ., 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  58. 58. EXECUTION PHASE 2nd iteration of the DATA step: <ul><li>SAS reaches the end of the DATA step </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 7; SBP1 – SBP6 [K] = ., 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124.
  59. 59. EXECUTION PHASE 2nd iteration of the DATA step: <ul><li>SAS reaches the end of the DATA step </li></ul><ul><li>The implicit OUTPUT executes </li></ul><ul><li>The remaining DATA step iterations are skipped in this walk-through </li></ul>data sbp2 (drop=i); set sbp; array sbparray[6] sbp1 - sbp6; do i = 1 to 6; if sbparray[i] = 999 then sbparray[i] = .; end; run; PDV: _N_ [D] = 2; I [D] = 7; SBP1 – SBP6 [K] = ., 141, 138, 119, 119, 122. Sbp (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: 999 141 138 119 119 122; 3: 142 999 139 119 120 999; 4: 136 140 142 118 121 123. SBPARRAY[1], ..., SBPARRAY[6] refer to sbp1, ..., sbp6. Sbp2 so far (same columns): 1: 141 142 137 117 116 124; 2: . 141 138 119 119 122.
  60. 60. SOME ARRAY APPLICATIONS CREATING A GROUP OF VARIABLES BY USING ARRAYS MEAN SBP: Pre-treatment 140, Post-treatment 120 data sbp4 (drop=i); set sbp2; array sbp[6]; array above[6]; array threshhold[6] _temporary_ (140 140 140 120 120 120); do i = 1 to 6; if (not missing(sbp[i])) then above[i] = sbp[i] > threshhold[i]; end; run; array sbp[6]; is used to group the existing variables: sbp1 – sbp6. Sbp2 (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: . 141 138 119 119 122; 3: 142 . 139 119 120 .; 4: 136 140 142 118 121 123. Result (columns above1 above2 above3 above4 above5 above6): 1: 1 1 0 0 0 1; 2: . 1 0 0 0 1; 3: 1 . 0 0 0 .; 4: 0 0 1 0 1 1.
  61. 61. CREATING A GROUP OF VARIABLES BY USING ARRAYS MEAN SBP: Pre-treatment 140, Post-treatment 120 data sbp4 (drop=i); set sbp2; array sbp[6]; array above[6]; array threshhold[6] _temporary_ (140 140 140 120 120 120); do i = 1 to 6; if (not missing(sbp[i])) then above[i] = sbp[i] > threshhold[i]; end; run; array above[6]; is used to create variables: above1 – above6. Sbp2 (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: . 141 138 119 119 122; 3: 142 . 139 119 120 .; 4: 136 140 142 118 121 123. Result (columns above1 above2 above3 above4 above5 above6): 1: 1 1 0 0 0 1; 2: . 1 0 0 0 1; 3: 1 . 0 0 0 .; 4: 0 0 1 0 1 1.
  62. 62. CREATING A GROUP OF VARIABLES BY USING ARRAYS MEAN SBP: Pre-treatment 140, Post-treatment 120 data sbp4 (drop=i); set sbp2; array sbp[6]; array above[6]; array threshhold[6] _temporary_ (140 140 140 120 120 120); do i = 1 to 6; if (not missing(sbp[i])) then above[i] = sbp[i] > threshhold[i]; end; run; The temporary array threshhold is for comparison purposes. Sbp2 (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: . 141 138 119 119 122; 3: 142 . 139 119 120 .; 4: 136 140 142 118 121 123. Result (columns above1 above2 above3 above4 above5 above6): 1: 1 1 0 0 0 1; 2: . 1 0 0 0 1; 3: 1 . 0 0 0 .; 4: 0 0 1 0 1 1.
  63. 63. CREATING A GROUP OF VARIABLES BY USING ARRAYS MEAN SBP: Pre-treatment 140, Post-treatment 120 data sbp4 (drop=i); set sbp2; array sbp[6]; array above[6]; array threshhold[6] _temporary_ (140 140 140 120 120 120); do i = 1 to 6; if (not missing(sbp[i])) then above[i] = sbp[i] > threshhold[i]; end; run; Sbp2 (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6): 1: 141 142 137 117 116 124; 2: . 141 138 119 119 122; 3: 142 . 139 119 120 .; 4: 136 140 142 118 121 123. Result (columns above1 above2 above3 above4 above5 above6): 1: 1 1 0 0 0 1; 2: . 1 0 0 0 1; 3: 1 . 0 0 0 .; 4: 0 0 1 0 1 1.
  64. 64. THE IN OPERATOR data sbp6; set sbp2; array sbp[6]; if . IN sbp then miss = 1; else miss = 0; run; Sbp6 (columns sbp1 sbp2 sbp3 sbp4 sbp5 sbp6 miss): 1: 141 142 137 117 116 124 0; 2: . 141 138 119 119 122 1; 3: 142 . 139 119 120 . 1; 4: 136 140 142 118 121 123 0.
  65. 65. CALCULATING PRODUCTS OF MULTIPLE VARIABLES data product (drop=i); set test; array num[4]; if missing(num[1]) then result = 1; else result = num[1]; do i = 2 to 4; if not missing(num[i]) then result = result*num[i]; end; run; <ul><li>Approach: </li></ul><ul><li>Create an array: num[4] </li></ul><ul><li>Treat missing value as 1 </li></ul><ul><li>Set result = num[1] </li></ul><ul><li>Loop: i = 2 to 4 </li></ul><ul><li>result = result * num[i] </li></ul><ul><li>End Loop </li></ul>array num[4]; is used to group the existing variables: num1 – num4. Test (columns num1 num2 num3 num4): 1: 4 . 2 3; 2: . 2 3 1.
  66. 66. CALCULATING PRODUCTS OF MULTIPLE VARIABLES data product (drop=i); set test; array num[4]; if missing(num[1]) then result = 1; else result = num[1]; do i = 2 to 4; if not missing(num[i]) then result = result*num[i]; end; run; <ul><li>Approach: </li></ul><ul><li>Create an array: num[4] </li></ul><ul><li>Treat missing value as 1 </li></ul><ul><li>Set result = num[1] </li></ul><ul><li>Loop: i = 2 to 4 </li></ul><ul><li>result = result * num[i] </li></ul><ul><li>End Loop </li></ul>Test (columns num1 num2 num3 num4): 1: 4 . 2 3; 2: . 2 3 1.
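For comparison only, the same flag can be computed with the NMISS function applied to the whole array. This is an alternative sketch, not the approach on the slide, and the dataset name SBP6B is an assumption:

```sas
/* Sketch only: SBP6B is an illustrative dataset name. */
data sbp6b;
  set sbp2;
  array sbp[6];
  miss = (nmiss(of sbp[*]) > 0);   /* 1 if any element is missing, else 0 */
run;
```

The `of sbp[*]` form passes every element of the array to the function, so no loop is needed.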
  68. 68. RESTRUCTURING DATASETS USING ARRAYS <ul><li>Restructuring datasets: converting between data with one observation per subject (the wide format) and data with multiple observations per subject (the long format) </li></ul>Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2. Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2.
  69. 69. FROM WIDE FORMAT TO LONG FORMAT (WITHOUT USING ARRAYS) <ul><li>Transform wide → long </li></ul><ul><li>2 obs. to read → 2 DATA step iterations </li></ul><ul><li>Use multiple OUTPUT statements </li></ul><ul><li>Missing values in S1 – S3 are not output to Long </li></ul>data long (drop=s1-s3); set wide; time = 1; score = s1; if not missing(score) then output; time = 2; score = s2; if not missing(score) then output; time = 3; score = s3; if not missing(score) then output; run; Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2. Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2.
  70. 70. FROM WIDE FORMAT TO LONG FORMAT (USING ARRAYS) data long (drop=s1-s3); set wide; time = 1; score = s1; if not missing(score) then output; time = 2; score = s2; if not missing(score) then output; time = 3; score = s3; if not missing(score) then output; run; Add array s[3]; to group S1 – S3 into the array S, so that s1, s2, and s3 can be written as s[1], s[2], and s[3]. Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2. Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2.
  71. 71. FROM WIDE FORMAT TO LONG FORMAT (USING ARRAYS) data long (drop=s1-s3); set wide; time = 1; score = s1; if not missing(score) then output; time = 2; score = s2; if not missing(score) then output; time = 3; score = s3; if not missing(score) then output; run; Add array s[3]; to group S1 – S3 into the array S, so that s1, s2, and s3 can be written as s[1], s[2], and s[3]. Then create a DO loop with TIME as the index variable. Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2. Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2.
  72. 72. FROM WIDE FORMAT TO LONG FORMAT (USING ARRAYS) Without an array: data long (drop=s1-s3); set wide; time = 1; score = s1; if not missing(score) then output; time = 2; score = s2; if not missing(score) then output; time = 3; score = s3; if not missing(score) then output; run; With an array: data long (drop=s1-s3); set wide; array s[3]; do time = 1 to 3; score = s[time]; if not missing(score) then output; end; run; Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2. Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2.
  73. 73. FROM LONG FORMAT TO WIDE FORMAT <ul><li>Reading 5 observations but only creating 2 observations </li></ul><ul><ul><li>You are not copying data from the PDV to the final dataset at each iteration </li></ul></ul><ul><ul><li>You only need to generate one observation once all the observations for each subject have been processed </li></ul></ul>Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2. Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2.
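The array version can be tried end to end with the two Wide observations shown on the slides. The DATALINES input and the PROC PRINT step are additions for illustration; only the middle DATA step comes from the slide:

```sas
/* Build the Wide dataset from the values shown on the slide. */
data wide;
  input id $ s1 s2 s3;
  datalines;
A01 3 4 5
A02 4 . 2
;
run;

/* Wide to long: one output observation per non-missing score. */
data long (drop=s1-s3);
  set wide;
  array s[3];                 /* groups the existing s1, s2, s3 */
  do time = 1 to 3;
    score = s[time];
    if not missing(score) then output;
  end;
run;

proc print data=long;
run;
```

Using TIME as the index variable does double duty: it selects the array element and is kept in Long as the time point.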
  74. 74. REVIEW THE RETAIN STATEMENT <ul><li>To prevent the VARIABLE from being reinitialized each time the DATA step executes, use the RETAIN statement: </li></ul>RETAIN VARIABLE <VALUE>; VARIABLE: the name of the variable that you want to retain. VALUE: <ul><li>A numeric value </li></ul><ul><li>Used to initialize the VARIABLE only at the first iteration of the DATA step execution </li></ul><ul><li>Not specifying an initial value → VARIABLE is initialized as missing </li></ul>
  75. 75. REVIEW: THE SUM STATEMENT <ul><li>The SUM statement has the following form: </li></ul>VARIABLE + EXPRESSION; VARIABLE: <ul><li>The numeric accumulator variable that is to be created </li></ul><ul><li>It is automatically set to 0 at the beginning of the first iteration of the DATA step execution </li></ul><ul><li>Retained in following iterations </li></ul>EXPRESSION: <ul><li>Any SAS expression </li></ul><ul><li>If EXPRESSION evaluates to a missing value, it is treated as 0 </li></ul>
  76. 76. REVIEW: FIRST.VARIABLE AND LAST.VARIABLE <ul><li>You only output the data after you finish reading the last observation of each subject </li></ul><ul><li>Thus, you need to identify the last observation </li></ul>Long (columns ID TIME SCORE): 1: A01 1 3; 2: A01 2 4; 3: A01 3 5; 4: A02 1 4; 5: A02 3 2. Wide (columns ID S1 S2 S3): 1: A01 3 4 5; 2: A02 4 . 2.
  77. 77. REVIEW: FIRST.VARIABLE AND LAST.VARIABLE <ul><li>BY-group processing method </li></ul>proc sort data=b; by by_variable; run; data a; set b; by by_variable; ... ... run; <ul><li>For each BY variable, SAS creates two temporary variables: </li></ul><ul><ul><li>FIRST.VARIABLE </li></ul></ul><ul><ul><li>LAST.VARIABLE </li></ul></ul><ul><li>FIRST.VARIABLE and LAST.VARIABLE are set to 1 at the beginning of the execution phase </li></ul><ul><li>They are not written to the final dataset </li></ul>
  78. 78. REVIEW: FIRST.VARIABLE AND LAST.VARIABLE <ul><li>Suppose ID is the “BY” variable: </li></ul>Grouping based on ID (columns ID SCORE FIRST.ID LAST.ID): 1: A01 3 1 0 (group 1; SAS reads the 1st observation for ID = A01); 2: A01 3 0 0 (group 1); 3: A01 2 0 1 (group 1; SAS reads the last observation for ID = A01); 4: A02 4 1 0 (group 2); 5: A02 2 0 1 (group 2).
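A small sketch of RETAIN in use: a running maximum of SCORE. It assumes the Long dataset from the restructuring slides is available; the names RUNNING_MAX and MAX_SCORE are illustrative:

```sas
/* Sketch only: RUNNING_MAX and MAX_SCORE are hypothetical names. */
data running_max;
  set long;
  retain max_score 0;   /* set to 0 once, then carried across iterations */
  if score > max_score then max_score = score;
run;
```

Without the RETAIN statement, MAX_SCORE would be reset to missing at the start of every iteration and would simply equal SCORE.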
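The SUM statement can be sketched as a running total over the Long dataset from the restructuring slides. TOTALS and TOTAL are assumed names:

```sas
/* Sketch only: TOTALS and TOTAL are hypothetical names. */
data totals;
  set long;
  total + score;   /* same effect as: retain total 0; total = sum(total, score); */
run;
```

Because a missing SCORE is treated as 0, TOTAL never becomes missing, which an ordinary assignment such as total = total + score; would not guarantee.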
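Since FIRST.ID and LAST.ID are not written to the output dataset, one way to inspect them is to copy them into ordinary variables. The sketch assumes the Long dataset; FLAGS, FIRST_ID, and LAST_ID are illustrative names:

```sas
proc sort data=long;
  by id;
run;

/* Sketch only: FLAGS, FIRST_ID, LAST_ID are hypothetical names. */
data flags;
  set long;
  by id;
  first_id = first.id;   /* 1 on the first observation of each ID */
  last_id  = last.id;    /* 1 on the last observation of each ID  */
run;
```

A subject with a single observation would have both flags equal to 1 on that observation.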
  79. 79. REVIEW SUBSETTING IF STATEMENT <ul><li>Use the IF statement to continue processing only the observations that meet the condition of the specified expression </li></ul>IF EXPRESSION; <ul><li>If the EXPRESSION is true for the observation </li></ul><ul><ul><li>SAS continues to execute statements in the DATA step and includes the current observation in the data set </li></ul></ul>
  80. 80. REVIEW SUBSETTING IF STATEMENT <ul><li>Use the IF statement to continue processing only the observations that meet the condition of the specified expression </li></ul>IF EXPRESSION; <ul><li>If the EXPRESSION is false </li></ul><ul><ul><li>no further statements are processed for that observation </li></ul></ul><ul><ul><li>the current observation is not written to the data set </li></ul></ul><ul><ul><li>the remaining program statements in the DATA step are not executed </li></ul></ul><ul><ul><li>SAS immediately returns to the beginning of the DATA step </li></ul></ul>
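A brief sketch of the subsetting IF, again assuming the Long dataset. HIGH_SCORES and FLAG are illustrative names:

```sas
/* Sketch only: HIGH_SCORES and FLAG are hypothetical names. */
data high_scores;
  set long;
  if score >= 4;     /* false (or missing score): go back to the top, no output */
  flag = 'high';     /* reached only by observations that pass the condition   */
run;
```

Statements placed before the subsetting IF still run for every observation; only the statements after it, including the implicit OUTPUT, are skipped.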
  81. 81. FROM LONG FORMAT TO WIDE FORMAT (WITHOUT USING ARRAYS) if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; <ul><li>Use BY-group processing: BY ID </li></ul><ul><li>Output to the final data when LAST.ID = 1 </li></ul><ul><li>SCORE  S1, S2 S3 </li></ul>RETAIN S3 S1 S3 S2 S1 3 1 3 2 1 TIME 2 A02 5 4 A02 4 5 A01 3 4 A01 2 3 A01 1 SCORE ID 4 3 S1 . 4 S2 2 A02 2 5 A01 1 S3 ID
  82. 82. FROM LONG FORMAT TO WIDE FORMAT (WITHOUT USING ARRAYS) RETAIN proc sort data =long; by id; data wide (drop=time score); set long; by id; retain s1 - s3; if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; if last.id; run ; S3 S1 S3 S2 S1 3 1 3 2 1 TIME 2 A02 5 4 A02 4 5 A01 3 4 A01 2 3 A01 1 SCORE ID 4 3 S1 . 4 S2 2 A02 2 5 A01 1 S3 ID
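One thing to watch for with the RETAIN approach: retained values also carry over from one subject to the next. In the Long data, A02 has no observation for TIME = 2, so S2 would still hold A01's value when A02 is written out, unless S1 – S3 are cleared at the start of each BY group. A minimal sketch of that reset follows; it is an addition for illustration and does not appear on the slide above:

```sas
proc sort data=long;
  by id;
run;

data wide (drop=time score);
  set long;
  by id;
  retain s1 - s3;
  /* Clear values carried over from the previous subject. */
  if first.id then call missing(of s1 - s3);
  if time = 1 then s1 = score;
  else if time = 2 then s2 = score;
  else s3 = score;
  if last.id;
run;
```

With the reset in place, A02 is written with S2 missing, which matches the Wide table shown on the slides.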
  83. 83. EXECUTION PHASE data wide (drop=time score); set long; by id; retain s1 - s3; if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; if last.id; run ; <ul><li>1 ST iteration: </li></ul><ul><li>_N_  1 </li></ul><ul><li>FIRST.ID  1, LAST.ID  1 </li></ul><ul><li>Other variables  missing </li></ul>Long: 3 1 3 2 1 TIME 2 A02 5 4 A02 4 5 A01 3 4 A01 2 3 A01 1 SCORE ID . . . . . 1 1 1 K S3 K S2 K S1 D SCORE D TIME K ID D LAST.ID D FIRST.ID D _N_
  84. 84. EXECUTION PHASE <ul><li>1 ST iteration: </li></ul><ul><li>The SET statement copies the 1 st observation  PDV </li></ul>data wide (drop=time score); set long; by id; retain s1 - s3; if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; if last.id; run ; . . . 3 1 A01 1 1 1 K S3 K S2 K S1 D SCORE D TIME K ID D LAST.ID D FIRST.ID D _N_ 3 1 3 2 1 TIME 2 A02 5 4 A02 4 5 A01 3 4 A01 2 3 A01 1 SCORE ID
  85. 85. EXECUTION PHASE <ul><li>1 ST iteration: </li></ul><ul><li>The SET statement copies the 1 st observation  PDV </li></ul><ul><li>FIRST.ID  1 since this is the 1 st observation for A01 </li></ul><ul><li>LAST.ID  0 since this is not the last observation for A01 </li></ul>data wide (drop=time score); set long; by id; retain s1 - s3; if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; if last.id; run ; . . . 3 1 A01 0 1 1 K S3 K S2 K S1 D SCORE D TIME K ID D LAST.ID D FIRST.ID D _N_ 3 1 3 2 1 TIME 2 A02 5 4 A02 4 5 A01 3 4 A01 2 3 A01 1 SCORE ID
  86. EXECUTION PHASE <ul><li>1st iteration: </li></ul><ul><li>Since TIME = 1, S1 is assigned the value of SCORE (3) </li></ul>PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 1 (D), SCORE = 3 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  87. EXECUTION PHASE <ul><li>1st iteration: </li></ul><ul><li>Since LAST.ID ≠ 1, SAS returns to the beginning of the DATA step to begin the 2nd iteration </li></ul>PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 1 (D), SCORE = 3 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  88. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>_N_ is incremented to 2 </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 1 (D), SCORE = 3 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  89. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>FIRST.ID and LAST.ID are retained; they are automatic variables </li></ul><ul><li>ID, TIME, and SCORE are retained; they come from the input dataset </li></ul><ul><li>S1, S2, and S3 are retained because of the RETAIN statement </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 1 (D), SCORE = 3 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  90. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>The SET statement copies the 2nd observation to the PDV </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  91. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>The SET statement copies the 2nd observation to the PDV </li></ul><ul><li>FIRST.ID is set to 0; this is not the first observation for A01 </li></ul><ul><li>LAST.ID is set to 0; this is not the last observation for A01 either </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D), S1 = 3 (K), S2 = . (K), S3 = . (K)
  92. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>Since TIME = 2, S2 is assigned the value of SCORE (4) </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D), S1 = 3 (K), S2 = 4 (K), S3 = . (K)
  93. EXECUTION PHASE <ul><li>2nd iteration: </li></ul><ul><li>Since LAST.ID ≠ 1, SAS returns to the beginning of the DATA step to begin the 3rd iteration </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D), S1 = 3 (K), S2 = 4 (K), S3 = . (K)
  94. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>_N_ is incremented to 3 </li></ul><ul><li>The rest of the variables are retained </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D), S1 = 3 (K), S2 = 4 (K), S3 = . (K)
  95. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>The SET statement copies the 3rd observation to the PDV </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = . (K)
  96. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>The SET statement copies the 3rd observation to the PDV </li></ul><ul><li>FIRST.ID is set to 0; this is not the first observation for A01 </li></ul><ul><li>LAST.ID is set to 1; this is the last observation for A01 </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = . (K)
  97. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>Since TIME = 3, S3 is assigned the value of SCORE (5) </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K)
  98. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>Since LAST.ID = 1, SAS continues to execute the remaining statements in the DATA step </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K)
  99. EXECUTION PHASE <ul><li>3rd iteration: </li></ul><ul><li>SAS reaches the end of the 3rd iteration </li></ul><ul><li>The implicit OUTPUT executes; variables marked with (K) are copied to the dataset Wide </li></ul>PDV: _N_ = 3 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  100. EXECUTION PHASE <ul><li>4th iteration: </li></ul><ul><li>_N_ is incremented to 4 </li></ul><ul><li>The rest of the variables are retained </li></ul>PDV: _N_ = 4 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A01 (K), TIME = 3 (D), SCORE = 5 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  101. EXECUTION PHASE <ul><li>4th iteration: </li></ul><ul><li>The SET statement copies the 4th observation to the PDV </li></ul>PDV: _N_ = 4 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A02 (K), TIME = 1 (D), SCORE = 4 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  102. EXECUTION PHASE <ul><li>4th iteration: </li></ul><ul><li>The SET statement copies the 4th observation to the PDV </li></ul><ul><li>FIRST.ID is set to 1; this is the first observation for A02 </li></ul><ul><li>LAST.ID is set to 0; this is not the last observation for A02 </li></ul>PDV: _N_ = 4 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A02 (K), TIME = 1 (D), SCORE = 4 (D), S1 = 3 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  103. EXECUTION PHASE <ul><li>4th iteration: </li></ul><ul><li>Since TIME = 1, S1 is assigned the value of SCORE (4) </li></ul>PDV: _N_ = 4 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A02 (K), TIME = 1 (D), SCORE = 4 (D), S1 = 4 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  104. EXECUTION PHASE <ul><li>4th iteration: </li></ul><ul><li>Since LAST.ID ≠ 1, SAS returns to the beginning of the DATA step to begin the 5th iteration </li></ul>PDV: _N_ = 4 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A02 (K), TIME = 1 (D), SCORE = 4 (D), S1 = 4 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  105. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>_N_ is incremented to 5 </li></ul><ul><li>The rest of the variables are retained </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A02 (K), TIME = 1 (D), SCORE = 4 (D), S1 = 4 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  106. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>The SET statement copies the 5th observation to the PDV </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A02 (K), TIME = 3 (D), SCORE = 2 (D), S1 = 4 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  107. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>The SET statement copies the 5th observation to the PDV </li></ul><ul><li>FIRST.ID is set to 0; this is not the first observation for A02 </li></ul><ul><li>LAST.ID is set to 1; this is the last observation for A02 </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A02 (K), TIME = 3 (D), SCORE = 2 (D), S1 = 4 (K), S2 = 4 (K), S3 = 5 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  108. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>Since TIME = 3, S3 is assigned the value of SCORE (2) </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A02 (K), TIME = 3 (D), SCORE = 2 (D), S1 = 4 (K), S2 = 4 (K), S3 = 2 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  109. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>Since LAST.ID = 1, SAS continues to execute the remaining statements in the DATA step </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A02 (K), TIME = 3 (D), SCORE = 2 (D), S1 = 4 (K), S2 = 4 (K), S3 = 2 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5).
  110. EXECUTION PHASE <ul><li>5th iteration: </li></ul><ul><li>SAS reaches the end of the 5th iteration </li></ul><ul><li>The implicit OUTPUT executes; variables marked with (K) are copied to the dataset Wide </li></ul>PDV: _N_ = 5 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = A02 (K), TIME = 3 (D), SCORE = 2 (D), S1 = 4 (K), S2 = 4 (K), S3 = 2 (K). Wide (ID, S1, S2, S3): (A01, 3, 4, 5), (A02, 4, 4, 2). A02 has no observation with TIME = 2, so S2 should be missing, but the value 4 retained from A01 is written instead. How to fix this?
  111. EXECUTION PHASE Reset S1, S2, and S3 to missing at the first observation of each BY group:
  data wide (drop=time score);
    set long;
    by id;
    retain s1 - s3;
    if first.id then do;
      s1 = .;
      s2 = .;
      s3 = .;
    end;
    if time = 1 then s1 = score;
    else if time = 2 then s2 = score;
    else s3 = score;
    if last.id;
  run;
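To try the fix end to end, the corrected step can be run against a small copy of the Long data. This is a minimal sketch rather than part of the original slides: the DATALINES block retypes the five observations shown on the earlier slides, and the closing PROC PRINT is added only so the result can be inspected.

```sas
/* Recreate the Long dataset (values retyped from the slides). */
data long;
  input id $ time score;
  datalines;
A01 1 3
A01 2 4
A01 3 5
A02 1 4
A02 3 2
;
run;

proc sort data=long;
  by id;
run;

data wide (drop=time score);
  set long;
  by id;
  retain s1 - s3;
  /* Clear the retained values at the start of each BY group. */
  if first.id then do;
    s1 = .;
    s2 = .;
    s3 = .;
  end;
  if time = 1 then s1 = score;
  else if time = 2 then s2 = score;
  else s3 = score;
  if last.id;   /* keep only the last observation of each ID */
run;

/* Added for inspection only. */
proc print data=wide;
run;
```

With the reset in place, S2 for A02 should stay missing instead of inheriting the 4 retained from A01.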
  112. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) <ul><li>Declare the array with array s[ 3 ]; so that S[1], S[2], and S[3] refer to S1, S2, and S3 </li></ul><ul><li>retain s1 - s3; becomes retain s; </li></ul><ul><li>The reset s1 = . ; s2 = . ; s3 = . ; becomes if first.id then do ; do i = 1 to 3 ; s[i] = . ; end ; end ; </li></ul>Long (ID, TIME, SCORE): (A01, 1, 3), (A01, 2, 4), (A01, 3, 5), (A02, 1, 4), (A02, 3, 2). Wide (ID, S1, S2, S3): (A01, 3, 4, 5), (A02, 4, ., 2).
  113. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) <ul><li>array s[ 3 ]; retain s; </li></ul><ul><li>The array S groups three variables in the PDV: S[1] refers to S1, S[2] to S2, and S[3] to S3 </li></ul>PDV variables: _N_ (D), FIRST.ID (D), LAST.ID (D), ID (K), TIME (D), SCORE (D), S1 (K), S2 (K), S3 (K)
  114. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) <ul><li>TIME can serve as the array subscript: when TIME = 1, S[TIME] refers to S[1], which receives SCORE (3) </li></ul>PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 1 (D), SCORE = 3 (D)
  115. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) <ul><li>When TIME = 2, S[TIME] refers to S[2], which receives SCORE (4) </li></ul>PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 0 (D), ID = A01 (K), TIME = 2 (D), SCORE = 4 (D)
  116. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) <ul><li>The chain if time = 1 then s1 = score; else if time = 2 then s2 = score; else s3 = score; becomes the single statement s[time] = score; </li></ul>
  117. FROM LONG FORMAT TO WIDE FORMAT (USING ARRAYS) The complete DATA step using an array:
  data wide (drop = time score i);
    set long;
    by id;
    array s[3];
    retain s;
    if first.id then do;
      do i = 1 to 3;
        s[i] = .;
      end;
    end;
    s[time] = score;
    if last.id;
  run;
  Long (ID, TIME, SCORE): (A01, 1, 3), (A01, 2, 4), (A01, 3, 5), (A02, 1, 4), (A02, 3, 2). Wide (ID, S1, S2, S3): (A01, 3, 4, 5), (A02, 4, ., 2).
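As a variation on the array version, the inner DO loop that resets the elements can be replaced by a single CALL MISSING call. This is a sketch and not the author's code: it assumes the Long dataset from the earlier slides already exists and is sorted by ID, the output name wide2 is made up for illustration, and the subscript check is an added precaution.

```sas
data wide2 (drop=time score);   /* wide2 is a hypothetical name */
  set long;
  by id;
  array s[3];                   /* creates S1, S2, S3 */
  retain s;
  /* Reset every element of S at the start of each BY group. */
  if first.id then call missing(of s[*]);
  /* Only assign when TIME is a valid subscript, so an unexpected
     TIME value does not trigger a subscript-out-of-range error. */
  if 1 <= time <= dim(s) then s[time] = score;
  if last.id;
run;
```

Because no loop index is needed, the DROP= list no longer has to mention I.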
  118. MULTIDIMENSIONAL ARRAYS ARRAY ARRAYNAME[R, C, …] <$> <ELEMENTS>; <ul><li>The difference between one- and multi-dimensional arrays is the number of dimensions </li></ul><ul><li>R: number of rows </li></ul><ul><li>C: number of columns </li></ul><ul><li>If there are 3 dimensions, the next number will refer to the number of pages </li></ul>
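  119. MULTIDIMENSIONAL ARRAYS array a[2,3]; is equivalent to array a[2,3] a1 - a6; <ul><li>Row 1 holds a1, a2, a3 (columns 1 to 3) </li></ul><ul><li>Row 2 holds a4, a5, a6 (columns 1 to 3) </li></ul><ul><li>For example, a[1,3] refers to a3 and a[2,2] refers to a5 </li></ul>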
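The row-by-row mapping between subscripts and variable names can be checked directly in the log. The following is a small sketch under the same array definition as the slide; the helper variables r, c, and name are introduced here for illustration only.

```sas
data _null_;
  array a[2,3] a1 - a6;
  length name $ 32;
  do r = 1 to dim(a, 1);       /* first dimension: rows */
    do c = 1 to dim(a, 2);     /* second dimension: columns */
      name = vname(a[r, c]);   /* variable behind element a[r,c] */
      put r= c= name=;
    end;
  end;
run;
```

The log lines can be compared with the slide: the element at r=1, c=3 should resolve to a3, and the element at r=2, c=2 to a5.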
  120. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY <ul><li>Create ONE observation after you finish reading ALL the observations for EACH person </li></ul><ul><li>Use BY-group processing </li></ul><ul><li>The output will be generated when LAST.ID equals 1 </li></ul>Dat1 (ID, G1, G2, G3, with FIRST.ID and LAST.ID): (1, A, B, F; FIRST.ID = 1, LAST.ID = 0), (1, B, A, C; FIRST.ID = 0, LAST.ID = 1), (2, B, A, D; FIRST.ID = 1, LAST.ID = 0), (2, C, B, C; FIRST.ID = 0, LAST.ID = 1). Dat2 (ID, M_G1, M_G2, M_G3, F_G1, F_G2, F_G3): (1, A, B, F, B, A, C), (2, B, A, D, C, B, C).
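  121. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY <ul><li>G[3]: used to group the existing variables; G[1], G[2], G[3] refer to G1, G2, G3 </li></ul><ul><li>ALL_G[2,3]: used to create the new variables; ALL_G[1,1] to ALL_G[1,3] refer to M_G1 to M_G3, and ALL_G[2,1] to ALL_G[2,3] refer to F_G1 to F_G3 </li></ul><ul><li>RETAIN the elements of ALL_G across iterations </li></ul><ul><li>i + 1; counts the observations within each ID and serves as the row subscript </li></ul>Dat1 (ID, G1, G2, G3): (1, A, B, F), (1, B, A, C), (2, B, A, D), (2, C, B, C). Dat2 (ID, M_G1, M_G2, M_G3, F_G1, F_G2, F_G3): (1, A, B, F, B, A, C), (2, B, A, D, C, B, C).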
  122. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY
  proc sort data=dat1;
    by id;
  run;
  data dat2 (drop = i j g1 - g3);
    set dat1;
    by id;
    array all_g [2, 3] $ m_g1 - m_g3 f_g1 - f_g3;
    array g[3];
    retain all_g;
    if first.id then i = 0;
    i + 1;
    do j = 1 to 3;
      all_g[i,j] = g[j];
    end;
    if last.id;
  run;
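A runnable version of this step needs the Dat1 input. The sketch below retypes the four observations shown on the slides and then applies the same logic. Two things are additions and not part of the original program: the CALL MISSING reset, which guards against an ID that has only one observation, and the final PROC PRINT.

```sas
/* Recreate Dat1 (values retyped from the slides). */
data dat1;
  input id g1 $ g2 $ g3 $;
  datalines;
1 A B F
1 B A C
2 B A D
2 C B C
;
run;

proc sort data=dat1;
  by id;
run;

data dat2 (drop = i j g1 - g3);
  set dat1;
  by id;
  array all_g [2, 3] $ m_g1 - m_g3 f_g1 - f_g3;
  array g[3];                 /* G1-G3 already exist as character */
  retain all_g;
  if first.id then do;
    i = 0;
    /* Added precaution: clear values left over from the previous ID. */
    call missing(of m_g1 - m_g3, of f_g1 - f_g3);
  end;
  i + 1;                      /* row subscript: 1 = first obs, 2 = second */
  do j = 1 to dim(g);
    all_g[i, j] = g[j];
  end;
  if last.id;
run;

/* Added for inspection only. */
proc print data=dat2;
run;
```

For the sample data every ID has exactly two observations, so the reset does not change the result; it only matters when a group is incomplete.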
  123. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY ARRAY TRACKING: ALL_G[1,1] to ALL_G[1,3] refer to M_G1 to M_G3; ALL_G[2,1] to ALL_G[2,3] refer to F_G1 to F_G3; G[1] to G[3] refer to G1 to G3. At the beginning of the 1st iteration: PDV (D = dropped, K = kept): _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 1 (D), ID = . (K), G1 to G3 missing (D), I = 0 (D), J = . (D), M_G1 to F_G3 missing (K)
  124. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration: The SET statement copies the 1st observation to the PDV. FIRST.ID = 1, so I is set to 0. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 0 (D), J = . (D), M_G1 to F_G3 missing (K)
  126. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration: The sum statement i + 1; increments I to 1. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = . (D), M_G1 to F_G3 missing (K)
  127. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (1st DO loop): J is set to 1. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 1 (D), M_G1 to F_G3 missing (K)
  128. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (1st DO loop): ALL_G[1,1] = G[1], so M_G1 receives A. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 1 (D), M_G1 = A (K), M_G2 to F_G3 missing (K)
  130. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (2nd DO loop): J is incremented to 2. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 2 (D), M_G1 = A (K), M_G2 to F_G3 missing (K)
  131. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (2nd DO loop): ALL_G[1,2] = G[2], so M_G2 receives B. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 2 (D), M_G1 = A (K), M_G2 = B (K), M_G3 to F_G3 missing (K)
  133. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (3rd DO loop): J is incremented to 3. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 3 (D), M_G1 = A (K), M_G2 = B (K), M_G3 to F_G3 missing (K)
  134. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (3rd DO loop): ALL_G[1,3] = G[3], so M_G3 receives F. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 3 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  136. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration (4th DO loop): J is incremented to 4, which exceeds 3, so the DO loop ends. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 4 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  137. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 1st iteration: LAST.ID = 0, so the subsetting IF fails; no observation is written and SAS returns to the beginning of the DATA step. PDV: _N_ = 1 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = 4 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  138. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration: _N_ is incremented to 2 and J is reset to missing; I, the elements of ALL_G, and the variables read from Dat1 are retained. PDV: _N_ = 2 (D), FIRST.ID = 1 (D), LAST.ID = 0 (D), ID = 1 (K), G1 = A (D), G2 = B (D), G3 = F (D), I = 1 (D), J = . (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  139. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration: The SET statement copies the 2nd observation to the PDV. FIRST.ID = 0, so I keeps its retained value of 1. PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = 1 (K), G1 = B (D), G2 = A (D), G3 = C (D), I = 1 (D), J = . (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  141. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration: The sum statement i + 1; increments I to 2. PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = 1 (K), G1 = B (D), G2 = A (D), G3 = C (D), I = 2 (D), J = . (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  142. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration (1st DO loop): J is set to 1. PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = 1 (K), G1 = B (D), G2 = A (D), G3 = C (D), I = 2 (D), J = 1 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 to F_G3 missing (K)
  143. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration (1st DO loop): ALL_G[2,1] = G[1], so F_G1 receives B. PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = 1 (K), G1 = B (D), G2 = A (D), G3 = C (D), I = 2 (D), J = 1 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 = B (K), F_G2 and F_G3 missing (K)
  145. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY 2nd iteration (2nd DO loop): J is incremented to 2. PDV: _N_ = 2 (D), FIRST.ID = 0 (D), LAST.ID = 1 (D), ID = 1 (K), G1 = B (D), G2 = A (D), G3 = C (D), I = 2 (D), J = 2 (D), M_G1 = A (K), M_G2 = B (K), M_G3 = F (K), F_G1 = B (K), F_G2 and F_G3 missing (K)
  146. 146. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (2 nd DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 2 0 2 D J D FIRST.ID D _N_ A F_G2 K . F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  147. 147. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (2 nd DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 2 0 2 D J D FIRST.ID D _N_ A F_G2 K . F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  148. 148. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (3 rd DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 3 0 2 D J D FIRST.ID D _N_ A F_G2 K . F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  149. 149. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (3 rd DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 3 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  150. 150. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (3 rd DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 3 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  151. 151. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration (4 th DO loop): ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 4 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  152. 152. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration: ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 4 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  153. 153. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY Dat1: 2 nd iteration: ALL_G [I,J] G [J] data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 4 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1
  154. 154. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i = 0 ; i + 1 ; do j = 1 to 3 ; all_g[i,j] = g[j]; end ; if last.id; run ; Dat1: 2 nd iteration: ALL_G [I,J] G [J] Dat2: B A A B G2 C C 2 4 D B 2 3 C B 1 2 F A 1 1 G3 G1 ID 2 I D B G1 D A G2 D C G3 D 1 ID K 1 LAST.ID D 4 0 2 D J D FIRST.ID D _N_ A F_G2 K C F_G3 K B F_G1 K F M_G3 K B A K M_G2 K M_G1 ALL_G [2,1] ALL_G [2,2] ALL_G [2,3] ALL_G [1,3] ALL_G [1,2] ALL_G [1,1] G[3] G[2] G[1] F_G3 F_G2 F_G1 2 1 M_G3 M_G2 M_G1 3 2 1 G3 G2 G1 C F_G3 B F_G1 F M_G3 B M_G2 A A 1 1 F_G2 M_G1 ID
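The walkthrough so far can be checked by running the step. The following is a minimal sketch: the DATALINES step that builds Dat1 is an assumption added for illustration (the slides show only the finished dataset, so the values are read off the Dat1 table), while the second DATA step is the code from the slides.

```sas
/* Assumed reconstruction of Dat1: two rows per ID, character G1-G3 */
data dat1;
   input id g1 $ g2 $ g3 $;
   datalines;
1 A B F
1 B A C
2 B A D
2 C B C
;
run;

/* Code from the slides: collapse the two rows of each ID into one row */
data dat2 (drop = i j g1 - g3);
   set dat1;
   by id;                        /* creates FIRST.ID and LAST.ID */
   array all_g [2, 3] $ m_g1 - m_g3 f_g1 - f_g3;
   array g[3];                   /* refers to the existing G1-G3 */
   retain all_g;                 /* keep array values across iterations */
   if first.id then i = 0;       /* restart the row counter for each ID */
   i + 1;                        /* sum statement: I is retained automatically */
   do j = 1 to 3;
      all_g[i,j] = g[j];
   end;
   if last.id;                   /* output only on the last row of each ID */
run;

proc print data = dat2 noobs;
run;
```

If Dat1 matches the slides, the first printed row should agree with the Dat2 row shown on slide 154 (ID 1: A B F B A C). Applying the same logic to observations 3 and 4 should give a second row for ID 2 with B A D C B C.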
  155-169. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY: 3rd iteration

  PDV during the 3rd iteration (K = kept, D = dropped). Unchanged on every slide: _N_ = 3 (D), F_G1 = B (K), F_G2 = A (K), F_G3 = C (K). The F_ values are left over from ID 1 because ALL_G is retained.

      Slide     Step                                  FIRST.ID  LAST.ID  ID  G1  G2  G3  I  J  M_G1  M_G2  M_G3
      155       Start of iteration: J reset to .      0         1        1   B   A   C   2  .  A     B     F
      156       SET reads observation 3               1         0        2   B   A   D   2  .  A     B     F
      157       FIRST.ID = 1, so I = 0                1         0        2   B   A   D   0  .  A     B     F
      158       I + 1                                 1         0        2   B   A   D   1  .  A     B     F
      159       1st DO loop: J is set to 1            1         0        2   B   A   D   1  1  A     B     F
      160-161   ALL_G[1,1] = G[1]                     1         0        2   B   A   D   1  1  B     B     F
      162       2nd DO loop: J is incremented to 2    1         0        2   B   A   D   1  2  B     B     F
      163-164   ALL_G[1,2] = G[2]                     1         0        2   B   A   D   1  2  B     A     F
      165       3rd DO loop: J is incremented to 3    1         0        2   B   A   D   1  3  B     A     F
      166-167   ALL_G[1,3] = G[3]                     1         0        2   B   A   D   1  3  B     A     D
      168       4th DO loop: J = 4 > 3, loop ends     1         0        2   B   A   D   1  4  B     A     D
      169       IF LAST.ID is false: no output        1         0        2   B   A   D   1  4  B     A     D

  Dat2 (unchanged during the 3rd iteration):

      ID  M_G1  M_G2  M_G3  F_G1  F_G2  F_G3
      1   A     B     F     B     A     C
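The PDV states traced on these slides can also be observed in the SAS log. The sketch below is illustrative only: it writes selected PDV values with PUTLOG at the points the slides pause on. The Dat1 DATALINES step and the text labels are assumptions, not part of the original deck, and DATA _NULL_ is used so that no output dataset is created.

```sas
/* Assumed reconstruction of Dat1 from the slides */
data dat1;
   input id g1 $ g2 $ g3 $;
   datalines;
1 A B F
1 B A C
2 B A D
2 C B C
;
run;

/* Same logic as the slides, with PUTLOG tracing added */
data _null_;
   set dat1;
   by id;
   array all_g [2, 3] $ m_g1 - m_g3 f_g1 - f_g3;
   array g[3];
   retain all_g;
   /* Values right after SET: I still holds its value from the previous row */
   putlog 'After SET: ' _n_= first.id= last.id= id= i=;
   if first.id then i = 0;
   i + 1;
   do j = 1 to 3;
      all_g[i,j] = g[j];
      /* One line per DO-loop pass, after the assignment */
      putlog '  In loop: ' i= j= m_g1= m_g2= m_g3= f_g1= f_g2= f_g3=;
   end;
   /* J is 4 here because the loop index is incremented past the stop value */
   putlog 'After loop: ' j= last.id=;
run;
```

For the third observation, the log lines should show M_G1 to M_G3 being overwritten one at a time while F_G1 to F_G3 still hold the retained values from ID 1.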
  170-180. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY: 4th iteration

  PDV during the 4th iteration (K = kept, D = dropped). Unchanged on every slide: _N_ = 4 (D), ID = 2 (K), M_G1 = B (K), M_G2 = A (K), M_G3 = D (K).

      Slide     Step                                  FIRST.ID  LAST.ID  G1  G2  G3  I  J  F_G1  F_G2  F_G3
      170       Start of iteration: J reset to .      1         0        B   A   D   1  .  B     A     C
      171-172   SET reads observation 4               0         1        C   B   C   1  .  B     A     C
      173       FIRST.ID = 0, so I + 1 gives I = 2    0         1        C   B   C   2  .  B     A     C
      174       1st DO loop: J is set to 1            0         1        C   B   C   2  1  B     A     C
      175-176   ALL_G[2,1] = G[1]                     0         1        C   B   C   2  1  C     A     C
      177       2nd DO loop: J is incremented to 2    0         1        C   B   C   2  2  C     A     C
      178-179   ALL_G[2,2] = G[2]                     0         1        C   B   C   2  2  C     B     C
      180       3rd DO loop: J is incremented to 3    0         1        C   B   C   2  3  C     B     C

  Dat2 (unchanged through slide 180):

      ID  M_G1  M_G2  M_G3  F_G1  F_G2  F_G3
      1   A     B     F     B     A     C
  181. 181. RESTRUCTURING DATASETS BY USING THE MULTIDIMENSIONAL ARRAY data dat2 (drop = i j g1 - g3); set dat1; by id; array all_g [ 2 , 3 ] $ m_g1 - m_g3 f_g1 - f_g3; array g[ 3 ]; retain all_g; if first.id then i =
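One consequence of RETAIN visible in the trace is that the F_ variables still hold the previous ID's values until they are overwritten. That is harmless when every ID has exactly two rows, as in Dat1, but an ID with only one row would inherit F_ values from the ID before it. The sketch below shows one possible guard and is not from the original slides: it clears the array at the start of each BY group with CALL MISSING. The dataset names dat1_short and dat2_safe are hypothetical.

```sas
/* Hypothetical input: ID 2 has only one row */
data dat1_short;
   input id g1 $ g2 $ g3 $;
   datalines;
1 A B F
1 B A C
2 B A D
;
run;

data dat2_safe (drop = i j g1 - g3);
   set dat1_short;
   by id;
   array all_g [2, 3] $ m_g1 - m_g3 f_g1 - f_g3;
   array g[3];
   retain all_g;
   if first.id then do;
      i = 0;
      /* Clear values retained from the previous ID */
      call missing(of m_g1-m_g3, of f_g1-f_g3);
   end;
   i + 1;
   do j = 1 to 3;
      all_g[i,j] = g[j];
   end;
   if last.id;
run;
```

With this guard, the row for ID 2 should have blank F_G1 to F_G3 rather than the B, A, C carried over from ID 1. The array is still declared with two rows, so an ID with more than two observations would fail with a subscript out of range error; the sketch does not address that case.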
