Cloud Computing: What it is, DOs and DON'Ts


  1. Cloud Computing: What it is, DOs and DON'Ts
     Svet Ivantchev, eFaber
     Fourth Workshop on Advanced Computing Techniques in the Microworld, April 2011
     Sunday, 1 May 2011
  2. Our plan for today
     • What is cloud computing?
     • Enabling technologies
     • Public vs private clouds
     • The idea of MapReduce, with two examples
  3. Our plan for tomorrow
     • Create an HPC cluster with:
       • 184 GB RAM
       • 13 TB local disk space and 800 GB persistent storage
       • 64 cores @ 2.9 GHz, Intel Nehalem = 268 ECUs (~268 1.2 GHz Xeons from 2007)
       • 10 Gbit/s network connection between them
  4. (Kind of) Evolution
     • Grid Computing
     • Utility Computing
     • Cloud Computing
     • Software as a Service (SaaS)
  5. Grid Computing
     Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.
     http://en.wikipedia.org/wiki/Grid_computing
  6. Utility Computing
     Utility Computing is the packaging of computing resources, such as computation, storage and services, as a metered service similar to a traditional public utility (such as electricity, water, natural gas, or telephone network).
     http://en.wikipedia.org/wiki/Utility_computing
  7. Cloud Computing (McKinsey & Co. report)
  8. Cloud Computing
     Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
     NIST
  9. Cloud Computing
     1. The illusion of infinite computing resources...
     2. The elimination of an up-front commitment...
     3. The ability to pay for use ... as needed.
     UC Berkeley RAD Lab
  10. So, what is it?
      • Pay-per-use
      • Resources are abstracted (virtualized)
      • Upscale and downscale on demand
      • Self-service interface (API included)
  11. Enabling technologies
      • Virtualisation
      • Virtualised Storage
      • Web Services
  12. Virtualisation
      • Xen
      • KVM
      • VMware
      • more...
  13. Abstracted Storage
      • Distributed File Systems; examples:
        • Amazon S3
        • RackSpace’s CloudFiles
        • HDFS
  14. Stack
      • Software as a Service (SaaS)
      • Platform as a Service (PaaS)
      • Infrastructure as a Service (IaaS)
      • Cloud Enabler(s)
      • Hardware
  15. Public Cloud Services
      • Amazon EC2
      • RackSpace
      • 100s more ...
  17. Amazon Web Services (AWS)
  18. AWS EC2 Prices
      • on-demand instances
      • reserved instances
      • spot instances
  19. AWS EC2 prices
  20. Spot Instances
  25. Private
      • Eucalyptus
      • OpenNebula
      • Nimbus
      • OpenStack
      • Hadoop & friends
  26. Public or private?
      Better mixed
  27. MapReduce
      • High level vs low level languages
      • Example: MPI/PVM vs MapReduce
  28. MR's “Hello world”, Unix-style
      “en un lugar de la Mancha de cuyo nombre no quiero acordarme no ha mucho tiempo que vivía un hidalgo ...”
      $ cat i.txt | tr ' ' '\n' | sort | uniq -c
         1 Mancha
         1 acordarme
         1 cuyo
         2 de
         ...
  30. Google Books
      • 129 000 000 books have been published so far
      • 15 000 000 books scanned (1700-2010)
      • 5 000 000 classified and with metadata
      Science, Vol. 331, no. 6014, pp. 176-182 (Jan 14, 2011)
  31. http://ngrams.googlelabs.com/
  33. MapReduce
      map:    (k1, v1)       → list(k2, v2)
      reduce: (k2, list(v2)) → list(v2)
  34. MapReduce: Mapper
      map(String key, String value):
        // key: document name
        // value: document contents
        for each word w in value:
          EmitIntermediate(w, 1);
      Input: “en un lugar de la Mancha de cuyo nombre no quiero acordarme no ha mucho tiempo que vivía un hidalgo”
      Output: “en”, 1; “un”, 1; “lugar”, 1; “de”, 1; “la”, 1; “Mancha”, 1; “de”, 1; ...
  35. MapReduce: Reducer
      reduce(String key, Iterator values):
        // key: a word
        // values: a list of counts
        result = 0;
        for each v in values:
          result += v;
        Emit(result);
      “en”, [1] → “en”, 1
      “un”, [1,1] → “un”, 2
      “lugar”, [1] → “lugar”, 1
      “de”, [1,1] → “de”, 2
      ...
  36. Dean, J. and Ghemawat, S., Comm. ACM, Vol. 51, pp. 107-113 (2008)
  37. Our input
      $ ls -l donquijote_s?.txt
      -rw-r--r-- 1 svet staff 1037413 23 abr 18:26 donquijote_s1.txt
      -rw-r--r-- 1 svet staff 1099078 23 abr 18:22 donquijote_s2.txt
      $ head -6 donquijote_s1.txt
      El ingenioso hidalgo don Quijote de la Mancha
      TASA
      Yo, Juan Gallo de Andrada, escribano de Camara del Rey nuestro senor, de los que residen en su Consejo, certifico y doy fe que, habiendo visto por los senores del un libro
  38. Python Mapper
      #!/usr/bin/python
      import re
      import sys

      def main(argv):
          # A word starts with a letter, followed by letters or digits.
          pattern = re.compile("[a-zA-Z][a-zA-Z0-9]*")
          # Iterating over stdin stops cleanly at end of file.
          for line in sys.stdin:
              for word in pattern.findall(line):
                  # The "LongValueSum:" prefix tells Hadoop's aggregate reducer to sum the values per key.
                  print("LongValueSum:" + word.lower() + "\t" + "1")

      if __name__ == "__main__":
          main(sys.argv)
  39. Test the mapper
      $ cat donquijote_s1.txt | ./wsplit.py
      LongValueSum:el 1
      LongValueSum:ingenioso 1
      LongValueSum:hidalgo 1
      LongValueSum:don 1
      LongValueSum:quijote 1
      LongValueSum:de 1
      LongValueSum:la 1
      LongValueSum:mancha 1
      LongValueSum:tasa 1
      LongValueSum:yo 1
      LongValueSum:juan 1
      LongValueSum:gallo 1
      LongValueSum:de 1
      LongValueSum:andrada 1
  40. Preparing the S3
  44. Run
  58. Final result
      $ awk '{print $2 " " $1}' part-00000 | sort -r -n
      21477 que, 18297 de, 18189 y, 10363 la, 9824 a, 9490 el, 8243 en, 6335 no,
      5079 se, 4748 los, 4202 con, 3940 por, 3468 las, 3461 lo, 3398 le, 3352 su,
      2647 don, 2623 del, 2539 como, 2345 me, 2312 si, 2284 mas, 2207 mi, 2175 quijote,
      2148 sancho, 2142 es, 2077 yo, 1938 un, 1808 dijo, 1740 al, 1463 para, 1400 porque
  59. CL alternative
      $ elastic-mapreduce --create \
          --stream \
          --input s3n://mrbg/input \
          --mapper s3://mrbg/prog/wsplit.py \
          --output s3n://mrbg/output/run2
      $ elastic-mapreduce --create
  60. MapReduce, ex 2
      Pi = 4*M/N
  61. MapReduce: Mapper
      #!/usr/bin/ruby
      # Each input line holds the number of Monte Carlo steps to run.
      ARGF.each do |line|
        mcsteps = line.strip
        unless mcsteps.length == 0
          begin
            inside = 0
            mcsteps.to_i.times do
              x, y = rand, rand
              # Count the points that fall inside the unit quarter circle.
              inside += 1 if Math.hypot(x, y) < 1.0
            end
            puts inside.to_s
          rescue
            # couldn't parse mc steps
          end
        end
      end
  62. Pi
      $ cat mcs.txt
      1000
      $ cat mcs.txt | ./mc-pi-mr.rb
      776
      ... create more mcs.txt files:
      200_000_000
      200_000_000
  63. MapReduce: Reducer
      #!/usr/bin/ruby
      # Sum the per-mapper counts of points inside the quarter circle.
      count = 0
      ARGF.each do |line|
        count += line.to_i
      end
      puts "#{count} points inside"
  64. Prepare the EMR
      • upload mcsnn.txt to mrbg/mcinput/
      • upload mc-mapper.rb to mrbg/prog/
      • upload mc-reducer.rb to mrbg/prog/
  66. Estimate: 109955955/140000000*4 = 3.14159871
  67. • Hadoop Common
      • HDFS
      • MapReduce
  68. Thank you
      Q&A
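
As a rough illustration of slides 33 to 35, the sketch below runs the word-count job inside a single Python process: a map step emits (word, 1) pairs, a shuffle step groups them by word, and a reduce step sums each group. The function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are invented for this example and are not part of Hadoop or Elastic MapReduce; a real framework spreads these phases over many machines.

```python
import re
from collections import defaultdict

# Same word pattern as the Python mapper on slide 38.
WORD = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")


def map_words(doc_name, text):
    """map: (k1, v1) -> list(k2, v2). Emits (word, 1) for every word found."""
    return [(word.lower(), 1) for word in WORD.findall(text)]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """reduce: (k2, list(v2)) -> list(v2). Here the list holds a single sum."""
    return [sum(counts)]


def word_count(documents):
    """Run map, shuffle and reduce over a dict of {document name: contents}."""
    intermediate = []
    for name, text in documents.items():
        intermediate.extend(map_words(name, text))
    return {word: reduce_counts(word, counts)[0]
            for word, counts in shuffle(intermediate).items()}


if __name__ == "__main__":
    # Hypothetical one-document input, taken from the sentence on slide 28.
    sample = {"i.txt": "en un lugar de la Mancha de cuyo nombre no quiero acordarme"}
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

Because each call to `map_words` and `reduce_counts` depends only on its own arguments, the calls can run on different nodes, which is the property the MapReduce model relies on.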
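
The second example (slides 60 to 66) rests on an area argument: a point drawn uniformly from the unit square falls inside the quarter circle of radius 1 with probability Pi/4, so with M hits out of N points, Pi is approximately 4*M/N. Below is a minimal local sketch in Python, assuming hypothetical helper names (`count_inside`, `estimate_pi`) that mirror the roles of the Ruby mapper and reducer without reproducing them.

```python
import math
import random


def count_inside(steps, rng):
    """Mapper role: draw `steps` random points, count those inside the quarter circle."""
    inside = 0
    for _ in range(steps):
        x, y = rng.random(), rng.random()
        if math.hypot(x, y) < 1.0:
            inside += 1
    return inside


def estimate_pi(inside_counts, steps_per_task):
    """Reducer role plus the final division: Pi = 4*M/N."""
    total_inside = sum(inside_counts)                   # M
    total_steps = steps_per_task * len(inside_counts)   # N
    return 4.0 * total_inside / total_steps


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed (an arbitrary choice) for repeatable runs
    # Four simulated map tasks of 100 000 steps each.
    counts = [count_inside(100_000, rng) for _ in range(4)]
    print(estimate_pi(counts, 100_000))
```

Feeding it the totals from slide 66, `estimate_pi([109955955], 140000000)`, gives about 3.14159871.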
