Set and Merge

Published in: Technology

  1. Data Step Programming Made Easy with the SET and MERGE Statements
     Venkata Maguluri
  2. Synopsis
     • Purpose
     • SET Statement
     • MERGE Statement
     • Advantage
     • Conclusion
  3. Purpose: why use SET and MERGE?
     • Copying a dataset
     • Reading selected records
     • Modifying a dataset
     • Combining two or more datasets
     • Creating a derived dataset
  4. SET Statement
     • Reads observations from one or more existing SAS datasets:
         data newdata;
             set olddata;
         run;
     • Commonly used to concatenate two or more datasets, that is, to combine them vertically (the observations of olddata2 follow those of olddata1):
         data newdata;
             set olddata1 olddata2;
         run;
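As a minimal sketch of the concatenation described above, the step below stacks two small datasets. The dataset contents and the variables `id` and `name` are hypothetical and are there only to make the example self-contained.

```sas
/* Hypothetical sample data: two datasets with the same variables */
data olddata1;
    input id name $;
    datalines;
1 Ann
2 Bob
;
run;

data olddata2;
    input id name $;
    datalines;
3 Cid
4 Dee
;
run;

/* Concatenate: every row of olddata1, then every row of olddata2 */
data newdata;
    set olddata1 olddata2;
run;

proc print data=newdata noobs;
run;
```

The output dataset holds the observations of `olddata1` followed by those of `olddata2`, in the order the datasets are named on the SET statement.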
  5. Points to Remember
     • A variable common to the input datasets must have the same type in each of them. Otherwise SAS reports:
         ERROR: Variable variablename has been defined as both character and numeric.
     • If a common variable has the same type (character or numeric) but different lengths, the output dataset uses the first length it encounters for that variable. To avoid truncation, declare the length before the SET statement:
         data newdata;
             length common $ 25;
             set olddata1 olddata2;
         run;
     • Interleaving two or more SAS datasets (each input dataset must already be sorted by the BY variables):
         data newdata;
             set olddata1 olddata2;
             by keyvar1 keyvar2;
         run;
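The length rule and the interleave can be tried together. The following is a sketch under stated assumptions: the data values are invented, a single BY variable `keyvar1` is used for brevity, and `common` is deliberately given different lengths in the two inputs.

```sas
/* Hypothetical inputs: "common" is $5 in olddata1 and $25 in olddata2 */
data olddata1;
    length common $ 5;
    input keyvar1 common $;
    datalines;
1 short
3 brief
;
run;

data olddata2;
    length common $ 25;
    input keyvar1 common $;
    datalines;
2 considerably-longer
4 another-long-value
;
run;

/* Interleaving requires each input to be sorted by the BY variable */
proc sort data=olddata1; by keyvar1; run;
proc sort data=olddata2; by keyvar1; run;

data newdata;
    length common $ 25;      /* declared before SET so the longer length is used */
    set olddata1 olddata2;   /* with BY, rows are interleaved in key order */
    by keyvar1;
run;

proc print data=newdata noobs;
run;
```

Without the LENGTH statement, `common` would take the length it meets first (5, from `olddata1`) and the longer values from `olddata2` would be truncated.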
  6. Other Options
     • POINT= reads a specific observation by its number. The STOP statement is required because direct access never reaches end of file, so the step would otherwise loop indefinitely:
         data result;
             a=3;
             set test point=a;
             output;
             stop;
         run;
     • END= names a temporary variable that is set to 1 when the last observation is read, so it can be used to keep only the last observation of the dataset:
         data dummy;
             set old end=eof;
             if eof then output;
         run;
     • FIRSTOBS= and OBS= restrict the range of observations read. OBS= is the number of the last observation to read, not a count, so this step reads observations 1000 through 2000:
         data newdata;
             set olddata (FIRSTOBS=1000 OBS=2000);
         run;
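The three options can be tried on a small dataset. This is a sketch only: the dataset `test` and its variables `n` and `square` are hypothetical, and the range is narrowed to observations 4 through 6 so the result is easy to check by eye.

```sas
/* Hypothetical ten-row dataset */
data test;
    do n = 1 to 10;
        square = n * n;
        output;
    end;
run;

/* POINT=: read observation 3 directly */
data third;
    a = 3;
    set test point=a;
    output;
    stop;                 /* POINT= never hits end of file, so stop explicitly */
run;

/* END=: keep only the last observation */
data last;
    set test end=eof;
    if eof then output;
run;

/* FIRSTOBS= and OBS=: read observations 4 through 6 */
data middle;
    set test (firstobs=4 obs=6);
run;
```

The POINT= variable `a` and the END= variable `eof` are temporary and are not written to the output datasets.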
  7. MERGE Statement
     • Joins two or more datasets horizontally
     • Three types of merging:
         • One-to-One MERGE
         • One-to-Many MERGE
         • Many-to-One MERGE
     • Syntax without a BY statement (observations are paired by position):
         data dummy;
             merge data1 data2;
         run;
     • Syntax with a BY statement (observations are matched on the key variable, so each dataset must be sorted or indexed by it; IN= creates temporary flags showing which dataset contributed to the current observation):
         data dummy;
             merge data1(in=a) data2(in=b);
             by key-variable;
         run;
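A short sketch of a match-merge that uses the IN= flags follows. The datasets, the key `id`, and the other variables are hypothetical; the subsetting IF keeps only keys found in both inputs.

```sas
/* Hypothetical inputs sharing the key variable id */
data data1;
    input id name $;
    datalines;
1 Ann
2 Bob
3 Cid
;
run;

data data2;
    input id score;
    datalines;
2 80
3 95
4 70
;
run;

/* Match-merging requires both inputs sorted by the BY variable */
proc sort data=data1; by id; run;
proc sort data=data2; by id; run;

data both;
    merge data1(in=a) data2(in=b);
    by id;
    if a and b;           /* keep only ids present in both datasets */
run;

proc print data=both noobs;
run;
```

Changing the condition to `if a;` would keep every row of `data1`, with `score` missing where `data2` has no match.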
  8. Advantage
     • A single SET statement can concatenate two or more datasets, whereas PROC APPEND handles only two datasets at a time (one BASE= and one DATA=).
     • MERGE is preferred over UPDATE when more than two datasets must be combined, because UPDATE works with only two (a master and a transaction dataset).
     • A MERGE step is often simpler to write than the equivalent SQL join.
     • SET and MERGE are easy to use and therefore user friendly.
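The first point can be illustrated with three datasets. The names `part1` to `part3`, `all_set`, and `all_append` are hypothetical; this is a sketch of the comparison, not a recommendation for every case.

```sas
/* Three hypothetical one-row datasets */
data part1; x = 1; run;
data part2; x = 2; run;
data part3; x = 3; run;

/* SET: one DATA step concatenates all three */
data all_set;
    set part1 part2 part3;
run;

/* PROC APPEND: one DATA= dataset is added to BASE= per call */
data all_append;
    set part1;
run;
proc append base=all_append data=part2; run;
proc append base=all_append data=part3; run;
```

Both approaches end with the same three observations; the SET version needs one step, while PROC APPEND needs one call per added dataset.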
  9. Conclusion
     • The SET and MERGE statements are powerful programming tools with many options.
     • They are handy and make tasks easier.
     • They can be tricky, so use them carefully; otherwise they produce wrong results.
