Discovery 2015 Workshop

Presentation at the Discovery 2015 Workshop on Cloud Computing at Berkeley


  1. 1. There is no magic, there is only awesome. Scientific computing with Amazon Web Services. Deepak Singh, Business Development Manager - Amazon Compute Services. Discovery 2015 Workshop, July 23 2010
  2. 2. Via Reavel under a CC-BY-NC-ND license
  3. 3. life science industry
  4. 4. Credit: Bosco Ho
  5. 5. By ~Prescott under a CC-BY-NC license
  6. 6. <1>
  7. 7. the cloud
  8. 8. has_many :definitions
  9. 9. infrastructure as a service
  10. 10. The "Living and Evolving" Cloud: AWS services and basic terminology. Most applications need: 1. Compute 2. Storage 3. Messaging 4. Payment 5. Distribution 6. Scale 7. Analytics. [Diagram, top to bottom: Your Application; Amazon RDS, Amazon Elastic MapReduce JobFlows, Payment: Amazon FPS/DevPay, Amazon SimpleDB Domains, Amazon SQS Queues, Amazon SNS Topics; Auto-Scaling, Elastic LB, CloudWatch, Amazon CloudFront; Amazon S3 Objects and Buckets, Amazon EC2 Instances (On-Demand, Reserved, Spot), EBS Volumes and Snapshots; Amazon Virtual Private Cloud; Amazon Worldwide Physical Infrastructure (Geographical Regions, Availability Zones, Edge Locations)]
  11. 11. Scalable: increase or decrease capacity in minutes; automation
  12. 12. Scalable: increase or decrease capacity in minutes; automation. Cost Effective: low rate, pay-as-you-go
  13. 13. Scalable: increase or decrease capacity in minutes; automation. Cost Effective: low rate, pay-as-you-go. Reliable: mission critical infrastructure
  14. 14. Scalable: increase or decrease capacity in minutes; automation. Cost Effective: low rate, pay-as-you-go. Reliable: mission critical infrastructure. Secure: multilayer security facilities
  15. 15. compute
  16. 16. elastic compute cloud
  17. 17. elastic
  18. 18. 3000 CPUs for one firm's risk management application [Chart: Number of EC2 Instances per day, Wednesday 4/22/2009 through Tuesday 4/28/2009; axis marks at 3000 and 300; annotation: 300 CPUs on Weekends]
  19. 19. programmable
  20. 20. // Run an instance
          $EC2 = new AmazonEC2();
          $Options = array('KeyName' => "Jeff's Keys", 'InstanceType' => "m1.small");
          $Res = $EC2->run_instances("ami-db7b9db2", 1, 1, $Options);
  21. 21. more later
  22. 22. cost effective
  23. 23. 3000 CPUs for one firm's risk management application [Chart: Number of EC2 Instances per day, Wednesday 4/22/2009 through Tuesday 4/28/2009; axis marks at 3000 and 300; annotation: 300 CPUs on Weekends]
  24. 24. % Utilization time
  25. 25. Ideal Effective Utilization % Utilization time
  26. 26. Ideal Effective Utilization % Utilization Real Utilization time
  27. 27. Ideal Effective Utilization % Utilization Real Utilization time
  28. 28. on-demand instances reserved instances spot instances
  29. 29. Amazon EC2 On-Demand price for the same instance is $0.50
  30. 30. Ideal Effective Utilization % Utilization time
  31. 31. Ideal Effective Utilization % Utilization Reserved Utilization time
  32. 32. Ideal Effective Utilization % Utilization Reserved Utilization time
  33. 33. Ideal Effective Utilization % Utilization On Demand Utilization Reserved Utilization time
  34. 34. Ideal Effective Utilization Spot Utilization % Utilization On Demand Utilization Reserved Utilization time
  35. 35. secure
  36. 36. Customer A, Customer B, … Customer Z • Guest operating system doesn't have elevated privilege level. • Instances are completely isolated. • Intrinsic network firewall. • No access to raw devices. • Virtualized disks, logically isolated, wiped clean after use. [Diagram layers: Hypervisor, Firewall, Physical Interface]
  37. 37. { "Version": "2008-10-17", "Id": "Queue1_Policy_UUID", "Statement": { "Sid":"Queue1_AnonymousAccess_ReceiveMessage_TimeLimit" , "Effect": "Allow", "Principal": { "AWS": "*" }, "Action": "SQS:ReceiveMessage", "Resource": "/987654321098/queue1", "Condition" : { "DateGreaterThan" : { "AWS:CurrentTime":"2009-01-31T12:00Z" }, "DateLessThan" : { "AWS:CurrentTime":"2009-01-31T15:00Z" } } } }
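The policy on this slide allows anonymous `SQS:ReceiveMessage` only inside a three-hour window. As a rough illustration of how the `Condition` block reads, here is a minimal Ruby sketch that checks a timestamp against the two bounds. The helper names `parse_utc` and `within_window?` are hypothetical, the policy text is trimmed to the fields used, and this is not how AWS itself evaluates policies (which also considers the principal, action, and resource).

```ruby
require "json"

# Policy trimmed from the slide; only the Condition block is used here.
POLICY = JSON.parse(<<~POLICY_JSON)
  {
    "Statement": {
      "Effect": "Allow",
      "Action": "SQS:ReceiveMessage",
      "Condition": {
        "DateGreaterThan": { "AWS:CurrentTime": "2009-01-31T12:00Z" },
        "DateLessThan":    { "AWS:CurrentTime": "2009-01-31T15:00Z" }
      }
    }
  }
POLICY_JSON

# Hypothetical helper: parse the "YYYY-MM-DDTHH:MMZ" form used on the slide.
def parse_utc(stamp)
  m = stamp.match(/\A(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})Z\z/)
  raise ArgumentError, "unexpected timestamp: #{stamp}" if m.nil?
  Time.utc(*m.captures.map(&:to_i))
end

# Hypothetical helper: true when `now` falls strictly inside the window.
def within_window?(policy, now)
  cond   = policy["Statement"]["Condition"]
  after  = parse_utc(cond["DateGreaterThan"]["AWS:CurrentTime"])
  before = parse_utc(cond["DateLessThan"]["AWS:CurrentTime"])
  now > after && now < before
end
```

For example, `within_window?(POLICY, Time.utc(2009, 1, 31, 13, 0))` is true, while any time at or after 15:00 UTC that day is rejected.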
  38. 38. Amazon Virtual Private Cloud (VPC) [Diagram: the customer's isolated AWS resources, in subnets behind a router and VPN gateway inside the Amazon Web Services cloud, linked to the customer's network by a secure VPN connection over the Internet]
  39. 39. storage
  40. 40. Amazon S3
  41. 41. highly scalable
  42. 42. highly available
  43. 43. highly durable
  44. 44. Note: Conceptual drawing only. Actual number of nodes & datacenters may vary
  45. 45. [Diagram: Node 1 ... Node n] Note: Conceptual drawing only. Actual number of nodes & datacenters may vary
  46. 46. [Diagram: a Region containing three Datacenters; Node 1 ... Node n] Note: Conceptual drawing only. Actual number of nodes & datacenters may vary
  47. 47. elastic block store
  48. 48. block device
  49. 49. resizable
  50. 50. boot device
  51. 51. one size does not fit all
  52. 52. Amazon S3: • Cost-effective blob or large object storage • Minimal relationships between objects
          Amazon EC2 + EBS: • Multiple flavors of database engine • Complete control
          Amazon SimpleDB: • Zero administrative overhead (automatic handling of geo-redundant replication, index creation, database tuning) • Automatic and elastic scaling of resources to meet request load • High availability (multiple copies of data for reliability and failover) • Flexibility (schema-less data store)
          Amazon RDS: • Native access to database engine • Easy migration path (existing code, tools, application are compatible) • Key features of a relational database, such as joins or complex transactions • Managed experience (offload common DBA tasks, lower total cost of ownership)
  53. 53. an ecosystem prospers
  54. 54. <2>
  55. 55. infrastructure as code
  56. 56. Source: Chris Dagdigian
  57. 57. • Images: RegisterImage, DescribeImages, DeregisterImage, ModifyImageAttribute, DescribeImageAttribute, ResetImageAttribute
          • Instances: RunInstances, DescribeInstances, TerminateInstances, StopInstances, GetConsoleOutput, RebootInstances, CreatePlacementGroup, DescribePlacementGroup
          • IP Addresses: AllocateAddress, ReleaseAddress, AssociateAddress, DisassociateAddress, DescribeAddresses
          • Keypairs: CreateKeyPair, DescribeKeyPairs, DeleteKeyPair
          • Security Groups: CreateSecurityGroup, DescribeSecurityGroups, DeleteSecurityGroup, AuthorizeSecurityGroupIngress, RevokeSecurityGroupIngress
          • Block Storage Volumes: CreateVolume, DeleteVolume, DescribeVolumes, AttachVolume, DetachVolume, CreateSnapshot, DescribeSnapshots, DeleteSnapshot
          • VPC: CreateCustomerGateway, DeleteCustomerGateway, DescribeCustomerGateways, AssociateDhcpOptions, CreateDhcpOptions, DeleteDhcpOptions, DescribeDhcpOptions, CreateSubnet, DeleteSubnet, DescribeSubnets, CreateVpc, DeleteVpc, DescribeVpcs, CreateVpnConnection, DeleteVpnConnection, DescribeVpnConnections, AttachVpnGateway, CreateVpnGateway, DeleteVpnGateway, DescribeVpnGateways, DetachVpnGateway
  58. 58. using libraries
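  59. 59. # Slide callout: Access credentials
          # `options` is not defined on the slide; it is assumed to come from the surrounding application.
          def access_key
            options.services['access-key']
          end
          def secret_key
            options.services['secret-key']
          end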
  60. 60. require 'right_aws'
          # Thin wrapper around the RightAws EC2 client with per-resource lookup tables.
          class EC2
            attr_accessor :ec2, :instance_index, :image_index, :elastic_ip_index, :volume_index
            def initialize(access_key, secret_key)
              @ec2 = RightAws::Ec2.new(access_key, secret_key)
              @instance_index = {}
              @image_index = {}
              @elastic_ip_index = {}
              @volume_index = {}
            end
          end
  61. 61. class Instance
            attr_accessor :aws_hash, :elastic_ip
            def initialize(hash, elastic_ip = nil)
              @aws_hash = hash
              @elastic_ip = elastic_ip
            end
            def public_dns
              @aws_hash[:dns_name] || ""
            end
            # `status` is not shown on the slide; it is assumed to be defined elsewhere in the class.
            def friendly_name
              public_dns.empty? ? status.capitalize : public_dns.split(".")[0]
            end
            def id
              @aws_hash[:aws_instance_id]
            end
          end
  62. 62. # Slide callout: Custom index
          require 'right_aws'
          class EC2
            attr_accessor :ec2, :instance_index, :image_index, :elastic_ip_index, :volume_index
            def initialize(access_key, secret_key)
              @ec2 = RightAws::Ec2.new(access_key, secret_key)
              @instance_index = {}
              @image_index = {}
              @elastic_ip_index = {}
              @volume_index = {}
            end
            # Lazily builds a hash of Instance objects keyed by instance id.
            # `get_elastic_ip_for_instance_id` is not shown on the slide.
            def instance_index
              if @instance_index.empty?
                @ec2.describe_instances.each do |i|
                  # create an Instance object & add it to the index
                  @instance_index[i[:aws_instance_id]] = Instance.new(i, get_elastic_ip_for_instance_id(i[:aws_instance_id]))
                end
              end
              return @instance_index
            end
          end
  63. 63. # Slide callout: Helper
          class Instance
            attr_accessor :aws_hash, :elastic_ip
            def initialize(hash, elastic_ip = nil)
              @aws_hash = hash
              @elastic_ip = elastic_ip
            end
            def public_dns
              @aws_hash[:dns_name] || ""
            end
            # `status` is not shown on the slide; it is assumed to be defined elsewhere in the class.
            def friendly_name
              public_dns.empty? ? status.capitalize : public_dns.split(".")[0]
            end
            def id
              @aws_hash[:aws_instance_id]
            end
            def running?
              status == "running"
            end
          end
  64. 64. configuration management
  65. 65. cfengine puppet chef
  66. 66. chef
  67. 67. dsl
  68. 68. include_recipe "packages"
          include_recipe "ruby"
          include_recipe "apache2"
          if platform?("centos","redhat")
            if dist_only?
              # just the gem, we'll install the apache module within apache2
              package "rubygem-passenger"
              return
            else
              package "httpd-devel"
            end
          else
            %w{ apache2-prefork-dev libapr1-dev }.each do |pkg|
              package pkg do
                action :upgrade
              end
            end
          end
          gem_package "passenger" do
            version node[:passenger][:version]
          end
          execute "passenger_module" do
            command 'echo -en "\n\n\n\n" | passenger-install-apache2-module'
            creates node[:passenger][:module_path]
          end
  69. 69. [Same recipe as slide 68, with the callout "Modular" next to the include_recipe lines]
  70. 70. [Same recipe as slide 68, with the callout "OS aware" next to the platform?("centos","redhat") check]
  71. 71. [Same recipe as slide 68, with the callout "Ruby syntax" next to the %w{ ... }.each do |pkg| loop]
  72. 72. [Same recipe as slide 68, with the callout "Package aware" next to the gem_package "passenger" block]
  73. 73. [Same recipe as slide 68, with the callout "Execute" next to the execute "passenger_module" block]
  74. 74. recipes
  75. 75. template "#{node[:apache][:dir]}/mods-available/passenger.conf" do
            cookbook "passenger_apache2"
            source "passenger.conf.erb"
            owner "root"
            group "root"
            mode 0755
          end
  76. 76. [Same template resource as slide 75, with the callout "Template" next to the template line]
  77. 77. [Same template resource as slide 75, with the callout "Cookbook re-use" next to the cookbook "passenger_apache2" line]
  78. 78. <3>
  79. 79. architectural lessons
  80. 80. design for failure
  81. 81. “Everything fails, all the time” -- Werner Vogels
  82. 82. “Things will crash. Deal with it” -- Jeff Dean
  83. 83. 2-4% of servers will die annually Source: Jeff Dean, LADIS 2009
  84. 84. 1-5% of disk drives will die every year Source: Jeff Dean, LADIS 2009
  85. 85. 2.3% AFR in population of 13,250 3.3% AFR in population of 22,400 4.2% AFR in population of 246,000
  86. 86. 2.3% AFR in population of 13,250 3.3% AFR in population of 22,400 4.2% AFR in population of 246,000 Source: James Hamilton (http://perspectives.mvdirona.com)
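The failure rates quoted on these slides translate directly into expected failure counts. A minimal Ruby sketch, assuming failures are independent and using a made-up fleet size of 1,000 servers (neither assumption comes from the talk; the method names are mine):

```ruby
# Expected number of failures per year, given an annual failure rate (AFR)
# expressed as a fraction (0.02 means 2%).
def expected_failures(fleet_size, afr)
  fleet_size * afr
end

# Probability that at least one of `n` units fails within a year,
# assuming each unit fails independently with probability `afr`.
def prob_any_failure(n, afr)
  1.0 - (1.0 - afr)**n
end

# Hypothetical fleet of 1,000 servers at the 2-4% range quoted from Jeff Dean.
low  = expected_failures(1_000, 0.02)
high = expected_failures(1_000, 0.04)
puts "expect roughly #{low.round} to #{high.round} dead servers per year"
puts format("chance that at least 1 of 100 servers dies: %.0f%%",
            100 * prob_any_failure(100, 0.02))
```

Even at the low end of the range, a hundred-machine deployment is far more likely than not to lose a server within a year, which is the point behind "design for failure".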
  87. 87. human errors
  88. 88. human errors ~20% admin issues have unintended consequences Source: James Hamilton (http://perspectives.mvdirona.com)
  89. 89. assume sw/hw failure
  90. 90. avoid single points of failure
  91. 91. system as a whole is resilient
  92. 92. loose coupling sets you free
  93. 93. loose coupling sets you free
  94. 94. using message queues
  95. 95. Tight Coupling: Controller A, Controller B, Controller C call each other directly. Loose Coupling using Queues: Controller A, Controller B, Controller C, each connected through a queue (Q)
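The loose-coupling idea on this slide can be sketched in a few lines of Ruby. This is only an in-process illustration: Ruby's built-in `Queue` stands in for a hosted queue such as Amazon SQS, and the names `run_pipeline`, `STOP`, and the toy "work" each controller does are all hypothetical.

```ruby
STOP = :stop # sentinel telling a worker to shut down

# Controller A is the caller: it only knows about the first queue.
# Controllers B and C each read from one queue and write to the next,
# so none of them calls another controller directly.
def run_pipeline(jobs)
  to_b = Queue.new
  to_c = Queue.new
  done = Queue.new

  controller_b = Thread.new do
    while (job = to_b.pop) != STOP
      to_c << job.upcase
    end
    to_c << STOP # pass the shutdown signal downstream
  end

  controller_c = Thread.new do
    while (job = to_c.pop) != STOP
      done << "done:#{job}"
    end
  end

  jobs.each { |job| to_b << job }
  to_b << STOP
  [controller_b, controller_c].each(&:join)

  results = []
  results << done.pop until done.empty?
  results
end

p run_pipeline(%w[a b c])
```

Because each stage depends only on its queues, a stage can be restarted, replaced, or run in several copies without the others needing to know.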
  96. 96. implement elasticity
  97. 97. no assumptions
  98. 98. resilience to reboot
  99. 99. bootstrap
  100. 100. dynamic
  101. 101. multi-layered security
  102. 102. “Web” Security Group: TCP 80 from 0.0.0.0/0; TCP 443 from 0.0.0.0/0; TCP 22 from “App”. “App” Security Group: TCP 8080 from “Web”; TCP 22 from 172.154.0.0/16; TCP 22 from “App”. “DB” Security Group: TCP 3306 from “App”; TCP 3306 from 163.128.25.32/32; TCP 22 from “App”
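To make the layering concrete, the rules on this slide can be written down as data and queried. The Ruby sketch below is a toy model, not the EC2 API: the `RULES` table is transcribed from the slide, while `allowed?` and the shape of its `from` argument are hypothetical.

```ruby
require "ipaddr"

# Rules transcribed from the slide: [protocol, port, source], where the source
# is either a CIDR block or the name of another security group.
RULES = {
  "Web" => [["tcp", 80, "0.0.0.0/0"], ["tcp", 443, "0.0.0.0/0"], ["tcp", 22, "App"]],
  "App" => [["tcp", 8080, "Web"], ["tcp", 22, "172.154.0.0/16"], ["tcp", 22, "App"]],
  "DB"  => [["tcp", 3306, "App"], ["tcp", 3306, "163.128.25.32/32"], ["tcp", 22, "App"]]
}.freeze

# Hypothetical checker. `from` is a hash with an :ip key (external caller)
# or a :group key (caller is an instance in that security group).
def allowed?(target_group, port, from)
  RULES.fetch(target_group).any? do |proto, rule_port, source|
    next false unless proto == "tcp" && rule_port == port
    if source.include?("/")
      !from[:ip].nil? && IPAddr.new(source).include?(IPAddr.new(from[:ip]))
    else
      from[:group] == source
    end
  end
end

puts allowed?("Web", 80,   { ip: "8.8.8.8" })  # anyone may reach the web tier
puts allowed?("DB",  3306, { ip: "8.8.8.8" })  # but not the database
puts allowed?("DB",  3306, { group: "App" })   # only the app tier can
```

Each tier accepts traffic only from the tier in front of it (plus a few named networks), so compromising the web tier does not by itself open the database port.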
  103. 103. embrace constraints
  104. 104. distributed memory
  105. 105. sharded DBs
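One common way to shard, shown here purely as an illustration since the slide does not say which scheme the speaker had in mind, is to hash each key and take the result modulo the number of shards. The shard names and the `shard_for` method are hypothetical.

```ruby
require "zlib"

SHARDS = %w[db0 db1 db2 db3].freeze # hypothetical shard names

# Map a key to a shard deterministically: the same key always lands
# on the same shard as long as the shard list does not change.
def shard_for(key, shards = SHARDS)
  shards[Zlib.crc32(key.to_s) % shards.size]
end

%w[user:1 user:2 user:3].each do |key|
  puts "#{key} -> #{shard_for(key)}"
end
```

The constraint being embraced is that adding or removing a shard remaps most keys under this simple scheme, so the shard count has to be planned for (or a scheme such as consistent hashing used instead).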
  106. 106. hardware failed? simply throw it away and switch to new hardware with no additional cost
  107. 107. cache
  108. 108. think parallel
  109. 109. different architectures
  110. 110. multi-threaded, concurrent requests
  111. 111. mapreduce
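For readers new to the model, here is the classic word-count example written as plain Ruby functions. It runs in one process and only mirrors the shape of a MapReduce job (map, group by key, reduce); it does not use Hadoop or Elastic MapReduce, and the method names are mine.

```ruby
# Map: emit [word, 1] pairs for one input line.
def map_words(line)
  line.downcase.scan(/[a-z']+/).map { |word| [word, 1] }
end

# Shuffle + reduce: group the pairs by word, then sum the counts per word.
def reduce_counts(pairs)
  pairs.group_by(&:first).transform_values { |group| group.sum { |_, n| n } }
end

# Each line can be mapped independently, which is what makes the
# map phase easy to spread across many machines.
def word_count(lines)
  reduce_counts(lines.flat_map { |line| map_words(line) })
end

p word_count(["the cloud", "The cloud the"])
```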
  112. 112. elastic load-balancing
  113. 113. decompose jobs into simplest form
  114. 114. leverage many storage options
  115. 115. <4>
  116. 116. computing in the cloud
  117. 117. 3 modalities
  118. 118. batch processing
  119. 119. “grids”
  120. 120. queues
  121. 121. [Diagram: queue-driven pipeline. Stages: Fetch & Store Page, Parse Page, Fetch Images, Render Images & Pages. Queues between stages: URL Queue, Parse Queue, Image Queue, Render Queue. Stages read from and write to S3] Source: Jeff Barr
  122. 122. sudo gem install cloud-crowd http://wiki.github.com/documentcloud/cloud-crowd
  123. 123. http://www.rightscale.com
  124. 124. data-intensive computing
  125. 125. Amazon Elastic MapReduce [Diagram: input data is uploaded to an input S3 bucket in Amazon S3; the application is deployed through the web console or command line tools; Elastic MapReduce runs Hadoop on Amazon EC2 instances, reading the input dataset and writing output results to an output S3 bucket; a notification marks the end; results are fetched from S3]
  126. 126. PREANNOUNCE – EXPAND/SHRINK CLUSTERS. Use Case: Increase speed of running job flows. Speed up job flow execution in response to changing requirements. Dynamically balance cost versus performance without restarting a job. [Diagram: a job flow allocates 4 instances (time remaining: 14 hours), expands to 9 instances (time remaining: 7 hours), then expands to 25 instances (time remaining: 3 hours)]
  127. 127. Use Case: Agile Data Warehouse Cluster. Customize cluster size to support varying resource needs. Leverage flexibility to reduce costs and increase cluster utilization. [Diagram: data warehouse (steady state) allocates 9 instances, expands to 25 instances for batch processing, then shrinks to 9 instances (steady state)]
  128. 128. PREANNOUNCE – Integration with Spot Instances. Cost without Spot: 4 instances * 14 hrs * $0.50 = $28. Cost with Spot: 4 instances * 7 hrs * $0.50 = $14 + 5 instances * 7 hrs * $0.25 = $8.75; Total = $22.75. Savings: ~19%. [Diagram: a job flow allocates 4 instances (time remaining: 14 hours), then expands to 9 instances (time remaining: 7 hours)]
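The cost comparison on this slide is plain arithmetic and can be recomputed from its inputs as a quick check. The sketch below uses only the figures shown (4 on-demand instances at $0.50 per hour, 5 extra Spot instances at $0.25 per hour, 14 hours versus 7 hours); the `cost` method and constant names are my own.

```ruby
ON_DEMAND_RATE = 0.50 # dollars per instance-hour, from the slide
SPOT_RATE      = 0.25 # dollars per instance-hour, from the slide

def cost(instances, hours, rate)
  instances * hours * rate
end

# Baseline: 4 on-demand instances run the whole job in 14 hours.
without_spot = cost(4, 14, ON_DEMAND_RATE)

# With Spot: the same 4 on-demand instances plus 5 Spot instances
# finish in 7 hours.
with_spot = cost(4, 7, ON_DEMAND_RATE) + cost(5, 7, SPOT_RATE)

savings = (without_spot - with_spot) / without_spot

puts format("without Spot: $%.2f", without_spot)
puts format("with Spot:    $%.2f", with_spot)
puts format("savings:      %.1f%%", savings * 100)
```

Note that the job also finishes in half the time, so the Spot capacity buys speed as well as a lower bill.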
  129. 129. high performance computing
  130. 130. Low latency high bandwidth
  131. 131. cluster compute instances
  132. 132. full bisection bandwidth
  133. 133. 10 Gbps
  134. 134. 2 * Xeon 5570 (Intel “Nehalem”); 23 GB RAM; 10 Gbps Ethernet; 1690 GB local disk; HVM-based virtualization; $1.60 / hr
  135. 135. managing compute cycles
  136. 136. http://cyclecomputing.com
  137. 137. http://web.mit.edu/stardev/cluster/
  138. 138. SQS
  139. 139. <5>
  140. 140. AWS + science = win
  141. 141. 3.7 million classifications in just over three days; ~15 million in less than a month; >2.6 million clicks in 100 hours
  142. 142. Biomarker Warehouse pre-clinical, clinical, 3rd party data and publications Estimated cost: 10 TB warehouse over 3 years
  143. 143. Protein interactions @ U. Washington Simple Python scripts automate the management of 1000s of simultaneous experiments using the EC2 API http://faculty.washington.edu/danielt/ Source: Ed Lazowska
  144. 144. 200 instances 60000 structures 4 hours http://bioteam.net/aws
  145. 145. HEAVY-ION COLLISIONS Problem: Quark matter physics conference imminent but no compute resources handy Solution: NIMBUS context broker allowed researchers to provision 300 nodes and get the simulations done
  146. 146. Image: Wikipedia
  147. 147. lots and lots and lots and lots and lots and lots of data and lots and lots of lots of data
  148. 148. Image via image editor under a CC-BY License
  149. 149. Image: NOAA
  150. 150. scale availability utilization sharing collaboration
  151. 151. we are data geeks not data center geeks
  152. 152. BLAT @ U. Penn. Map 100 million 100-base paired-end reads. Quad core with 5 GB of RAM would take 16 days. 30 high-memory instances; 32 hours; $195. Source: Angel Pizarro/John Hogenesch
  153. 153. BELLE MONTE CARLO Credit: Tom Fifield
  154. 154. MapReduce for Genomics Ben Langmead http://bowtie-bio.sourceforge.net/crossbow/index.shtml http://contrail-bio.sourceforge.net http://bowtie-bio.sourceforge.net/myrna/index.shtml
  155. 155. platform for science
  156. 156. http://www.cloudbiolinux.com/
  157. 157. http://usegalaxy.org/cloud
  158. 158. http://dnanexus.com
  159. 159. http://www.elasticr.net Elastic-R Collaborative Research Environment
  160. 160. http://aws.amazon.com/publicdatasets/
  161. 161. s3://1000genomes
  162. 162. deesingh@amazon.com. Twitter: @mndoci. Slides at http://slideshare.net/mndoci. Inspiration and material from Matt Wood, James Hamilton & Larry Lessig. By Oberazzi under a CC-BY-NC-SA license
