Virginia Tech. CS 4604 – Introduction to DBMS
Computer Science Spring 2015, Prakash
Amazon Web Services Setup Guidelines for Homework 5
Goals:
1. Create an AWS account (to get access to EC2, Elastic MapReduce and S3 storage).
2. Create storage buckets on S3 (to save outputs and logs of MR jobs)
3. Create a key pair (required for running MR jobs on EC2)
4. Get Access keys (also required for running MR jobs on EC2)
5. Get and redeem your free credit (worth $100) (and monitor how much you have left).
You have to contact Yao and Elaheh for this step, so please do it early.
6. Familiarize yourself with S3, EC2 and EMR (by doing a sample MR run)
1. Create an AWS account
a. Go to http://aws.amazon.com/account and sign up for an account, if you don’t have
one already.
b. Follow all the required steps and enter all the details. Assume you don’t have any
promotional credit---hence you’ll have to enter payment details (you’ll need a valid
credit/debit card). You’ll also have to validate using your phone.
c. Choose the ‘basic’ plan on the ‘AWS Support plan’ screen that is displayed after you
validate your identity.
d. Once everything has been verified and created, you should have access to the AWS
Management Console.
2. Create storage buckets on S3
In the AWS Management Console click on “S3” under Storage & Content Delivery. We need S3
for two reasons: (1) an EMR workflow requires the input data to be on S3; and (2) EMR
workflow output is always saved to S3.
Data (or objects) in S3 are stored in what we call “buckets” (essentially directories). For this
assignment, the data you will process is in a public bucket called:
s3n://cs4604-2015-vt-cs-data/eng-1m/
You will see how to reference this for EMR input later on. In the meanwhile you will need some
buckets of your own (to store output and log files if you want to debug your runs).
For creating a ‘log’ bucket:
a. In the S3 console, click on ‘Create Bucket’.
b. All S3 buckets need to have unique names, so call your logging bucket cs4604-2015-vt-cs-YOURVTID-logging. Importantly, pick ‘US Standard’ in the Region dropdown. Click on
“Create” (not on “Set Up Logging”).
c. Your new bucket will appear in the S3 console. Clicking on it will tell you it is empty.
Now we will create our main bucket:
a. Again, click on ‘Create Bucket’. Name it cs4604-2015-vt-cs-YOURVTID, and again pick
‘US Standard’ for the Region. Now we need to link our logging bucket to this one---so
click on ‘Set Up Logging >>’.
b. “Enable Logging” and start typing in the name of your logging bucket. It should appear
in the drop down menu. Select it and ‘Create’.
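If you prefer the command line, the same two buckets can be set up with the AWS CLI. This is a sketch, assuming the CLI is installed and configured; replace YOURVTID with your own VT ID:

```shell
# Create the logging bucket in US Standard (us-east-1).
aws s3 mb s3://cs4604-2015-vt-cs-YOURVTID-logging --region us-east-1

# Create the main bucket.
aws s3 mb s3://cs4604-2015-vt-cs-YOURVTID --region us-east-1

# Point the main bucket's server-access logs at the logging bucket.
aws s3api put-bucket-logging --bucket cs4604-2015-vt-cs-YOURVTID \
  --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "cs4604-2015-vt-cs-YOURVTID-logging", "TargetPrefix": "logs/"}}'
```

Note that the console’s “Enable Logging” button also grants the S3 log-delivery group permission on the target bucket for you; via the CLI you may need to grant that permission yourself.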
3. Create a key pair
When you run jobs on EMR, you will need a valid public/private key pair for authentication.
To create your first key pair:
a. Click on “EC2” under Compute and Networking section in the AWS management
console.
b. Select the region as US East on the top right (this is the EC2 counterpart of S3’s ‘US Standard’). The page will refresh.
c. On the refreshed page you should see a link stating ‘0 Key Pairs’. Click on this.
d. You will be given an option to ‘Create Key Pair’. Name your key pair as you wish.
e. Upon providing a name and clicking on ‘Create’, your private key (a .pem file), will
automatically begin downloading---choose a safe place so that you can retrieve it again.
f. If you need to access your public key, you will be able to find it in the same place where
you found your account credentials. Amazon will not keep a record of your private key
(as expected), so if you lose it, you will need to generate a new set.
Note: You would not really need to access your private key if you use the AWS Management
Console, but you will be asked to name your key pair each time you run an EMR job. If you
wish to log into the master node running your MapReduce job, you will need your .pem file. To
log on to the master node (you can find the address of the master node from the MapReduce
dashboard), you will need to do the following:
$ ssh hadoop@<master-node-address> -i <path-to-pem-file>
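SSH will typically refuse a private key that is readable by other users, so restrict the .pem file’s permissions before connecting. A sketch (substitute your own key path and the master node address from the dashboard):

```shell
# Make the private key readable only by you; ssh rejects group/world-readable keys.
chmod 400 ~/mykey.pem

# Then log in to the master node as the hadoop user.
ssh -i ~/mykey.pem hadoop@<master-node-address>
```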
4. Get Access keys
Go to your Security Credentials page from the AWS management console. The link to this page is in
the dropdown under your name on the top right corner. Select the ‘Continue to Security
Credentials’ option in the pop-up. Under the Security Credentials section, check your Access
Keys list and click on the ‘Create New Access Key’ link. Clicking it will create a new access key;
download and save it. Now you are ready to run a MapReduce job.
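If you also plan to use the AWS command-line tools, these access keys are what you feed to `aws configure`. A sketch, assuming the AWS CLI is installed (the AKIA…/secret values below are placeholders for the key you just downloaded):

```shell
# Interactive setup: prompts for the Access Key ID and Secret Access Key,
# plus a default region and output format, and stores them under ~/.aws/.
aws configure

# Or set the same values non-interactively:
aws configure set aws_access_key_id     AKIAYOURACCESSKEYID
aws configure set aws_secret_access_key YOURSECRETACCESSKEY
aws configure set region us-east-1
```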
5. Get and Redeem your free credit
In order to get your credit, you will need a unique credit code. Please send email to
yaozhang@vt.edu , elaheh@vt.edu with the subject header ‘CS4604: AWS Code’ and they will
mail you one (follow subject header strictly). Once you have your unique credit code go to
‘Billing and Cost Management’ link in the dropdown under your name, on the top right corner
of the AWS management console---click on ‘Credits’ on the left side menu bar. Enter your code
and click on ‘Redeem’---if it does not work, email us ASAP. You will be able to see all your
‘Credit’ information such as usage on this page.
Important: There is only so much credit we can give---so you should always check how much
credit you have left by clicking on the ‘Account Activity’ link from your account page.
Sometimes this takes a while to update. Always make sure to test your mappers and reducers
on some sample local data before using AWS.
6. Familiarize yourself with S3, EC2, and EMR
Run the sample word-count application that comes with AWS.
To do this, follow these steps:
a. Click on the Elastic MapReduce (EMR) link in the Analytics Section of the AWS
management console. This will take you to the EMR Cluster page.
b. Click on the Create cluster link; you will be taken to the ‘Cluster Configuration’ page.
c. Click on ‘Configure Sample Application’ on top right corner. Follow the directions to
run the sample application ‘Word Count (Streaming)’.
Most of the directions are clear; stick to the default values (except the output and logging
buckets, which you will need to specify to get the output). Once you create your cluster, it will
take 5-6 minutes to finish, and the output will be in your S3 bucket. More step-by-step
information on how to use this sample application is given here:
http://courses.cs.vt.edu/~cs4604/Spring15/homeworks/hw5/SBS-AWS-Wordcount.pdf
When you run your own mappers and reducers for HW5, there are only a few differences:
1. Skip the ‘Configure Sample Application’ step.
2. In Step 11, you need to set up a ‘step’. In our homework, the step can be either a “Streaming
program” or a “Custom JAR”.
Then configure and add this step. (Note that the screen-shot is only illustrative: e.g. in the input
location you’ll instead enter the 4-gram dataset we have mentioned in the HW, the output
location will use the bucket you created before, i.e. s3://cs4604-2015-cs-yourpid/new_folder,
and so on.)
Before you do this step, make sure you have uploaded your mapper and reducer code to the
corresponding S3 bucket (folder).
Also, make sure the Streaming output directory doesn’t already exist; if it does, the running
cluster will be terminated with errors.
Note that this is only for one mapper and reducer. If you want to run a series of MR jobs: set up
and run the first MR job, and check that its output format looks OK. Then set up the second
MR job by taking the output bucket of the first MR job and setting it as the input of the second
MR job, and so on.
You can run jobs using the AWS web-console like above in this assignment (the easier way); or
you are also welcome to use the elastic-mapreduce command-line interface based on Ruby.
Note: In order to avoid extra charges, do not forget to remove all files in your S3 buckets after
you have finished your homework: both the ones generated as output of the 4-gram and 2-gram
counts and the ones you have uploaded.
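This cleanup can also be done from the command line. A sketch, assuming the AWS CLI is configured; double-check the bucket names before running, since the delete is recursive:

```shell
# List what the bucket currently holds.
aws s3 ls s3://cs4604-2015-vt-cs-YOURVTID --recursive

# Remove everything in the bucket (job outputs, uploaded mappers/reducers, etc.).
aws s3 rm s3://cs4604-2015-vt-cs-YOURVTID --recursive

# Do the same for the logging bucket.
aws s3 rm s3://cs4604-2015-vt-cs-YOURVTID-logging --recursive
```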
Follow Ruby-AWS instructions given here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-cli-reference.html
and see a sample invocation here:
http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_3.2_--_Using_Your_Own_WordCount_program
Note: AWS has excellent documentation (see http://aws.amazon.com/documentation). So
make sure to check it out in case of any further doubts.
Acks: Thanks to Amazon for generously providing free credits. Adapted from similar guidelines by Polo
Chau (GTech) and Diana Maclean (Stanford).	