Virginia Tech. CS 4604 – Introduction to DBMS
Computer Science Spring 2015, Prakash
Amazon Web Services Setup Guidelines for Homework 5
Goals:
1. Create an AWS account (to get access to EC2, Elastic MapReduce and S3 storage).
2. Create storage buckets on S3 (to save outputs and logs of MR jobs)
3. Create a key pair (required for running MR jobs on EC2)
4. Get Access keys (also required for running MR jobs on EC2)
5. Get and redeem your free credit (worth $100) (and monitor how much you have left).
You have to contact Yao and Elaheh for this step, so please do it early.
6. Familiarize yourself with S3, EC2 and EMR (by doing a sample MR run)
1. Create an AWS account
a. Go to http://aws.amazon.com/account and sign up for an account, if you don’t have
one already.
b. Follow all the required steps and enter all the details. Assume you don’t have any
promotional credit---hence you’ll have to enter payment details (you’ll need a valid
credit/debit card). You’ll also have to validate using your phone.
c. Choose the ‘basic’ plan on the ‘AWS Support plan’ screen that is displayed after you
validate your identity.
d. Once everything has been verified and created, you should have access to the AWS
Management Console.
2. Create storage buckets on S3
In the AWS Management Console click on “S3” under Storage & Content Delivery. We need S3
for two reasons: (1) an EMR workflow requires the input data to be on S3; and (2) EMR
workflow output is always saved to S3.
Data (or objects) in S3 are stored in what we call “buckets” (essentially directories). For this
assignment, the data you will process is in a public bucket called:
s3n://cs4604-2015-vt-cs-data/eng-1m/
You will see how to reference this for EMR input later on. In the meantime, you will need some
buckets of your own (to store output, and log files if you want to debug your runs).
For creating a ‘log’ bucket:
a. In the S3 console, click on ‘Create Bucket’.
b. All S3 bucket names must be unique, so call your logging bucket
cs4604-2015-vt-cs-YOURVTID-logging. Importantly, pick ‘US Standard’ in the Region dropdown.
Click on “Create” (not on “Set Up Logging”).
c. Your new bucket will appear in the S3 console. Clicking on it will tell you it is empty.
Now we will create our main bucket:
a. Again, click on ‘Create Bucket’. Name it cs4604-2015-vt-cs-YOURVTID and again pick ‘US
Standard’. This time we need to link our logging bucket to this one, so click on ‘Set Up
Logging >>’.
b. Select “Enable Logging” and start typing the name of your logging bucket. It should appear
in the dropdown menu. Select it and click on ‘Create’.
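The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function name `course_bucket_names` and the sample ID `hokie123` are made up, and the check reflects the general S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit), which is why the ID is lowercased here.

```python
import re

# General S3 bucket-name rule: 3-63 characters, lowercase letters, digits,
# dots and hyphens, beginning and ending with a letter or digit.
BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

PREFIX = "cs4604-2015-vt-cs-"  # prefix used in this handout


def course_bucket_names(vt_id):
    """Return (main_bucket, logging_bucket) for a hypothetical VT ID."""
    main = PREFIX + vt_id.strip().lower()
    logging_bucket = main + "-logging"
    for name in (main, logging_bucket):
        if not BUCKET_RE.match(name):
            raise ValueError("not a valid S3 bucket name: %r" % name)
    return main, logging_bucket


if __name__ == "__main__":
    # 'hokie123' is a placeholder, not a real ID.
    print(course_bucket_names("hokie123"))
```

The function only builds and checks the names; the buckets themselves are still created in the S3 console as described above.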
3. Create a key pair
When you run jobs on EMR, you will need a valid public/private key pair for authentication.
To create your first key pair:
a. Click on “EC2” under the Compute and Networking section in the AWS Management
Console.
b. Select US East (or US Standard) as the region, using the menu at the top right. The page will refresh.
c. On the refreshed page you should see a link stating ‘0 Key Pairs’. Click on it.
d. You will be given an option to ‘Create Key Pair’. Name your key pair as you wish.
e. After you provide a name and click on ‘Create’, your private key (a .pem file) will
automatically begin downloading. Save it in a safe place so that you can retrieve it again.
f. If you need to access your public key, you will be able to find it in the same place where
you found your account credentials. Amazon will not keep a record of your private key
(as expected), so if you lose it, you will need to generate a new set.
Note: You do not really need to access your private key if you use the AWS Management
Console, but you will be asked to name your key pair each time you run an EMR job. If you
wish to log in to the master node running your MapReduce job, you will need your .pem file. To
log in to the master node (you can find its address on the MapReduce
dashboard), run the following:
$ ssh hadoop@<master-node-address> -i <path-to-pem-file>
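One practical detail: OpenSSH refuses to use a private key file that other users on your machine can read. The following is a minimal Python sketch, under the assumption that you want to script the login; the helper names `ssh_command` and `key_is_private` and the placeholder address are ours, not part of AWS.

```python
import os
import stat


def ssh_command(master_address, pem_path):
    """Build the argv list for logging in to the master node as user 'hadoop'."""
    return ["ssh", "-i", pem_path, "hadoop@" + master_address]


def key_is_private(pem_path):
    """Return True if the key file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(pem_path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    # Both values below are placeholders; substitute your own.
    pem = "mykey.pem"
    if os.path.exists(pem) and not key_is_private(pem):
        os.chmod(pem, 0o400)  # owner read-only, same effect as `chmod 400`
    print(" ".join(ssh_command("<master-node-address>", pem)))
```

The list returned by `ssh_command` can be passed to `subprocess.call` if you want Python to start the session for you.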
4. Get Access keys
Go to your Security Credentials from the AWS Management Console. The link to this page is in
the dropdown under your name at the top right corner of the console. Select
the ‘Continue to Security Credentials’ option in the pop-up. Under the Security Credentials
section, check your Access Keys list. Click on the ‘Create New Access Key’ link; this
creates a new access key, which you should download. Now you are ready to run a MapReduce job.
5. Get and Redeem your free credit
In order to get your credit, you will need a unique credit code. Please send an email to
yaozhang@vt.edu and elaheh@vt.edu with the subject header ‘CS4604: AWS Code’ and they will
mail you one (follow the subject header strictly). Once you have your unique credit code, go to the
‘Billing and Cost Management’ link in the dropdown under your name, at the top right corner
of the AWS Management Console, and click on ‘Credits’ in the left side menu bar. Enter your code
and click on ‘Redeem’. If it does not work, email us ASAP. You will be able to see all your
‘Credit’ information, such as usage, on this page.
Important: There is only so much credit we can give, so you should always check how much
credit you have left by clicking on the ‘Account Activity’ link from your account page.
Sometimes this takes a while to update. Always make sure to test your mappers and reducers
on some sample local data before using AWS.
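One way to do such a local test is to imitate what Hadoop Streaming does: feed lines to the mapper, sort the mapper output by key, and feed the sorted lines to the reducer. The sketch below does this in memory for a word count. The mapper and reducer here are illustrative stand-ins, not the HW5 solution, and `run_locally` is a name we made up. If your mapper and reducer are standalone scripts (say, hypothetical `mapper.py` and `reducer.py`), the shell equivalent is `cat sample.txt | ./mapper.py | sort | ./reducer.py`.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reducer(sorted_lines):
    """Sum the counts of consecutive lines that share the same key."""
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (key, sum(int(count) for _, count in group))


def run_locally(lines):
    """Imitate map -> sort by key -> reduce on an in-memory sample."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    for out in run_locally(sample):
        print(out)
```

The reducer relies on its input being sorted by key, which is the same guarantee Hadoop Streaming gives; that is why the `sorted` call sits between the two stages.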
6. Familiarize yourself with S3, EC2, and EMR
Run the sample word-count application that comes with AWS.
To do this, follow these steps:
a. Click on the Elastic MapReduce (EMR) link in the Analytics section of the AWS
Management Console. This will take you to the EMR Cluster page.
b. Click on the ‘Create cluster’ link. You will be taken to the ‘Cluster Configuration’ page.
c. Click on ‘Configure Sample Application’ at the top right corner. Follow the directions to
run the sample application ‘Word Count (Streaming)’.
Most of the directions are clear; stick to the default values (except for your output and logging
buckets, which you will need to specify to get the output). Once you create your cluster, it will
take 5-6 minutes to finish and the output will be in your S3 bucket. More step-by-step
information on how to use this sample application is given here:
http://courses.cs.vt.edu/~cs4604/Spring15/homeworks/hw5/SBS-AWS-Wordcount.pdf
When you run your own mappers and reducers for HW5, there are only a few differences:
1. Skip the ‘Configure Sample Application’ step.
2. In Step 11 of the step-by-step guide above, you need to set up a ‘step’. In our homework, we can
set the step as “Streaming program” or “Custom JAR”.
Then configure and add this step. (Note that the screenshot is only illustrative: for the input
location, you’ll instead enter the 4-gram dataset mentioned in the HW, and the output location
will use the bucket you created before, i.e. s3://cs4604-2015-vt-cs-YOURVTID/new_folder, and so on.)
Before you do this step, make sure you have uploaded your mapper and reducer code to the
corresponding S3 bucket (folder).
Also, make sure the Streaming output directory doesn’t already exist; if it does, the
running cluster will be terminated with errors.
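A simple way to avoid reusing an output directory is to put a timestamp in its name. The following is a minimal sketch of that idea, not an AWS feature: the function name `fresh_output_uri`, the job name `wordcount` and the ID `hokie123` are hypothetical. It does not check S3, and two runs started within the same second would still collide.

```python
from datetime import datetime, timezone


def fresh_output_uri(bucket, job_name, now=None):
    """Return an s3:// output location that embeds a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return "s3://%s/%s-%s/" % (bucket, job_name, stamp)


if __name__ == "__main__":
    # Bucket and job names are placeholders.
    print(fresh_output_uri("cs4604-2015-vt-cs-hokie123", "wordcount"))
```

Paste the printed location into the output field when you configure the step.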
Note that this is only for one mapper and reducer. If you want to run a series of MR jobs, set up
and run the first MR job and check that the format of its output looks OK. Then set up the second
MR job by taking the output location of the first MR job and setting it as the input of the second
MR job, and so on.
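The chaining idea can be tried locally before spending credit. In the sketch below, the two jobs are hypothetical examples (a word count, followed by a count of how many words share each frequency), and `run_job` is a small in-memory stand-in for one map, sort and reduce pass. The point it illustrates is that the second job parses exactly the lines the first job wrote, so the output format of job 1 has to match what job 2 expects.

```python
from itertools import groupby


def run_job(map_fn, reduce_fn, lines):
    """Run one map -> sort -> reduce pass over a list of text lines."""
    mapped = sorted(kv for line in lines for kv in map_fn(line))
    output = []
    for key, group in groupby(mapped, key=lambda kv: kv[0]):
        output.append(reduce_fn(key, [value for _, value in group]))
    return output


# Job 1 (hypothetical): count how often each word occurs.
def count_map(line):
    return [(word, 1) for word in line.split()]


def count_reduce(word, ones):
    return "%s\t%d" % (word, sum(ones))


# Job 2 (hypothetical): count how many words share each frequency.
def freq_map(line):
    _word, count = line.split("\t")
    return [(int(count), 1)]


def freq_reduce(count, ones):
    return "%d\t%d" % (count, sum(ones))


if __name__ == "__main__":
    first = run_job(count_map, count_reduce, ["a b a", "b c"])
    print(first)   # inspect the format before feeding it to job 2
    second = run_job(freq_map, freq_reduce, first)
    print(second)
```

On EMR the same hand-off happens through S3: the output location of the first step becomes the input location of the second.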
You can run jobs using the AWS web console as above in this assignment (the easier way), or
you are also welcome to use the Ruby-based elastic-mapreduce command-line interface.
Follow the Ruby-AWS instructions given here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-cli-reference.html
and see a sample invocation here:
http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_3.2_--_Using_Your_Own_WordCount_program
Note: To avoid extra charges, after you have finished your homework, do not forget to
remove all files in your S3 buckets: both the ones generated as output of the 4-gram and 2-gram
counts and the ones you have uploaded.
Note: AWS has excellent documentation (see http://aws.amazon.com/documentation), so
check it if you have any further questions.
Acks: Thanks to Amazon for generously providing free credits. Adapted from similar guidelines by Polo
Chau (GTech) and Diana Maclean (Stanford).	
  
