Implementing Cloud Computing:
          Scope and Technology
                   Lic. Jorge Guerra Guerra
Cloud computing talk, August 2010
  1. Implementing Cloud Computing: Scope and Technology
     Lic. Jorge Guerra Guerra, Universidad Nacional Mayor de San Marcos
     XVII Congreso Nacional de Estudiantes de Ingeniería de Sistemas y Computación, 6 August 2010
     http://sites.google.com/site/jguerra91/home/
  2. Agenda
     • Definitions
     • Taxonomy
     • Costs
     • Implementations
  3. What is cloud computing?
     • “It's nothing new”: “... we've redefined cloud computing to include everything that we already do ... I don't understand what we would do differently ... other than change the wording of some of our ads.” Larry Ellison, CEO, Oracle (Wall Street Journal, Sept. 26, 2008)
     • “It's a trap”: “It's worse than stupidity: it's a marketing hype campaign. Somebody is saying this is inevitable, and whenever you hear that, it's very likely to be a set of businesses campaigning to make it true.” Richard Stallman, Founder, Free Software Foundation (The Guardian, Sept. 29, 2008)
     There is no consistent answer…
  4. Everyone has a lot of data to process!
     • Wayback Machine holds 2 PB + 20 TB/month (2006)
     • Google processes 20 PB per day (2008)
     • “All the words ever spoken by human beings” ~ 5 EB
     • NOAA holds ~1 PB of climate data (2007)
     • CERN's LHC generates 15 PB per year (2008)
     Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License). Maximilien Brice, © CERN
  5. Evolution toward the cloud
     Source: http://news.cnet.com
  6. What is cloud computing?
     • Old ideas:
       – Grids, vector supercomputers
       – Software as a Service (SaaS)
         • Def: delivering applications over the Internet
     • Recently: “[Hardware, Infrastructure, Platform] as a service”
       – Poorly defined, so “X as a service” terms are best avoided
     • Utility computing: pay-as-you-go computing
       – Illusion of infinite resources
       – No up-front cost
       – Fine-grained billing (e.g. hourly)
  7. Formal definitions
     • A style of computing in which massively scalable IT-based capabilities are provided “as a service” over the network (IBM)
  8. Characteristics
     • Virtual – Physical location and details of the infrastructure are transparent to users
     • Scalable – Able to break complex workloads into pieces that are served across an incrementally expandable infrastructure
     • Efficient – Service-Oriented Architecture for dynamic provisioning of shared computing resources
     • Flexible – Can serve a variety of workload types, both consumer and enterprise
  9. User perception
  10. How users see cloud computing
     • “I only care about results, not how the IT capabilities are implemented”
     • “I want to pay for what I use, like a utility”
     • “I can access services from anywhere, from any device”
     • “I can scale capacity up or down as needed”
  11. Laird's Cloud/SaaS map
  12. Gartner's cloud evolution curve
  13. Cloud implementations
  14. Types of implementation
  15. SaaS
  16. Wolosky's SaaS map, 2008
  17. Types of cloud computing
  18. Types
  19. Enabling Technology: Virtualization
     Traditional stack: Apps → Operating System → Hardware
     Virtualized stack: Apps → guest OS (one per virtual machine) → Hypervisor → Hardware
     Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)
  20. Many types of virtualization
     • Full virtualization
       – Sensitive instructions (discovered statically or dynamically at run time) are replaced by binary translation, or trapped by hardware into the VMM for software emulation
       – Any OS can run in the VM
       – Examples: IBM's CP/CMS, Oracle (Sun) VirtualBox, VMware Workstation, QEMU
     • Hardware-assisted virtualization (IBM S/370, Intel VT, or AMD-V)
       – The CPU traps sensitive instructions, so the guest operating system runs unmodified
       – Examples: VMware Workstation, Linux Xen, Linux KVM, Microsoft Hyper-V
     • Paravirtualization
       – Presents virtual machines with a software interface that is similar but not identical to that of the underlying hardware, so guest operating systems must be adapted
       – Examples: early versions of Xen
     • Operating-system-level virtualization
       – The OS kernel allows multiple isolated user-space instances instead of a single one
       – Each instance looks and feels like a real server
       – Examples: Solaris Zones, BSD Jails, OpenVZ
  21. What about the Grid?
     Hitachi SR8000 – Leibniz-Rechenzentrum, 2 TFlop/s (2 × 10^12)
  22. Grid Computing
     • Grid Computing Criteria (Ian Foster 2004)
       – Coordination: a grid must coordinate resources that are not subject to centralized control
       – Open APIs: a grid must use standard, open, general-purpose protocols and interfaces
       – QoS: a grid must deliver nontrivial qualities of service (e.g., relating to response time, throughput, availability, and security) for co-allocating multiple resource types to meet complex user demands
     • Promise of ubiquitous grid computing (utility)
       – Reality is specialized grids
         • TeraGrid, Open Science Grid, LHC Grid
       – Grid provides “library level” service customized to HW
         • Ensuring consistent libraries across HW is hard!
  23. Cloud Computing vs. Grid Computing
  24. The datacenter is the new “server”
     • “Program” = Web search, email, map/GIS, …
     • “Computer” = 1000s of computers, storage, network
     • Warehouse-sized facilities and workloads
     • New datacenter ideas (2007-2008): truck container (Sun), floating (Google), datacenter-in-a-tent (Microsoft)
     • How can innovation in new services be enabled without first building and capitalizing a large company?
     Photos: Sun Microsystems & datacenterknowledge.com
  25. Datacenter Architectures
     • Major engineering design challenges in building datacenters
       – One of Google's biggest secrets and challenges
       – Read: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf
       – Very hard to get everything correct!
     • Some issues
       – Network access, physical security, power
       – And there's all the software…
  26. Some with very secure fiber access …
     Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking
  27. Some with less than that
     Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking
  28. Security infrastructure
     • Staffed 24x7
     • Access: biometrics, card keys
     • Video surveillance
     Image label: Sliding Glass
     Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking
  29. Some very secure… http://www.thebunker.net
     Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking
  30. Others look as if a hurricane had passed through…
     Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking
  31. Datacenter Architectures
     • Let's look at an example from telco professionals
     • Example: AT&T Miami, Florida Tier 1 datacenter
       – Redundant dual uplinks to AT&T global backbone
       – Minimum N+1 redundancy factor on all critical infrastructure systems
  32. AT&T Internet Data Center: Security
     • Hardened facilities protected by multiple security measures:
       – 24x7x365 on-premise support
       – Continuous CCTV surveillance, security breach alarms, electronic card key access, biometric palm scan and individual personal access code
       – Secured cage and cabinet environment
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  33. AT&T Internet Data Center: Power
     Diagram components: Commercial Power Supply, Transformer, Paralleling Switch Gear, Manual Switch, Diesel Fuel Tanks, Generators, UPS Systems, Batteries, Power Distribution Units, Remote Power Panels
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  34. AT&T Internet Data Center: Power (Commercial Power Feeds)
     • 2 commercial feeds, each at 13,800 V
     • Located near a substation supplied from 2 different grids
     • All cable routed underground for protection
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  35. AT&T Internet Data Center: Power (Paralleling Switch Gear)
     • Automatically powers up all generators when commercial power is interrupted for more than 7 seconds
       – Generators are shed to cover load as needed
       – Typical transition takes less than 60 seconds
     • Manual override available to ensure continuity if automatic start-up should fail
     Image label: Emergency Power Switch
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  36. AT&T Internet Data Center: Power (UPS Batteries)
     • Four (4) battery strings to support the UPS systems
     • Battery strings contain flooded cell batteries
     • A minimum of fifteen (15) minutes of battery backup available at full load
     • Hydrogen sensor monitoring
     • Remote status monitoring of battery strings
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  37. AT&T Internet Data Center: Power (Uninterruptible Power Supply, UPS)
     Eliminates spikes, sags, surges, transients, and all other over/under voltage and frequency conditions, providing clean power to connected critical loads
     • Four UPS modules connected in a ring bus configuration
     • Each module rated at 1000 kVA
     • Rotary type UPS by Piller
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  38. AT&T Internet Data Center: Power (Back-up Power – Generators and Diesel Fuel)
     • Four (4) 2,500 kW diesel generators providing standby power, capable of producing 10 MW of power
     • Two (2) 33,000 gallon aboveground diesel fuel storage tanks
     Source: AT&T Enterprise Hosting Services briefing, 10/29/2008
  39. Typical Tier-2 One Megawatt Datacenter
     Diagram labels: Main Supply, Transformer, ATS, Generator, Switch Board (1000 kW), UPS (two), STS, PDU (200 kW), Panel (50 kW), Circuit / Rack (2.5 kW)
     • Reliable power: mains + generator, dual UPS
     • Units of aggregation
       – Rack (10-80 nodes) → PDU (20-60 racks) → Facility/Datacenter
     X. Fan, W.-D. Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA'07, San Diego (June 2007).
  40. Systems & Power Density
     • Estimating DC power density is hard
       – Power is 40% of DC costs
         • Power + Mechanical: 55% of cost
       – Shell is roughly 15% of DC cost
       – Cheaper to waste floor than power
         • Typically 100 to 200 W/sq ft
         • Rarely as high as 350 to 600 W/sq ft
     • Over 20% of entire DC costs is in power redundancy
       – Batteries able to supply 13 megawatt for 12 min
       – N+2 generation (11 x 2.5 megawatt)
     James Hamilton talk, 1/17/2007
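
The redundancy figures on this slide can be turned into energy and capacity numbers with a few lines of arithmetic. The sketch below is only an illustration: it assumes that "N+2" means two generators beyond the N needed to carry the load, and the function names are hypothetical.

```python
# Figures taken from the slide.
BATTERY_POWER_MW = 13.0
BATTERY_MINUTES = 12.0
GENERATOR_COUNT = 11
GENERATOR_MW = 2.5
SPARE_GENERATORS = 2  # assumption: "N+2" = two units beyond the N required


def battery_energy_mwh(power_mw, minutes):
    """Energy the batteries must store to deliver power_mw for the given minutes."""
    return power_mw * minutes / 60.0


def generation_capacity_mw(count, unit_mw, spares):
    """Return (installed, usable) capacity, where usable excludes the spare units."""
    installed = count * unit_mw
    usable = (count - spares) * unit_mw
    return installed, usable


energy_mwh = battery_energy_mwh(BATTERY_POWER_MW, BATTERY_MINUTES)
installed_mw, usable_mw = generation_capacity_mw(
    GENERATOR_COUNT, GENERATOR_MW, SPARE_GENERATORS
)
print(f"battery energy: {energy_mwh:.1f} MWh")
print(f"generation: {installed_mw} MW installed, {usable_mw} MW with both spares idle")
```

Under these assumptions the spare generators alone account for 5 MW of installed capacity that carries no load in normal operation, which is one way to read the slide's claim that redundancy is a large share of datacenter cost.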
  41. Why now (and not before)?
     • Commoditization of HW & SW
       – x86 as universal ISA, plus fast virtualization
       – Standard software stack, largely open source (LAMP)
       – Bet: can statistically multiplex multiple instances onto a single box without interference between instances
     • Novel economic model: fine-grained billing
       – Earlier examples (Sun, Intel Computing Services): longer commitment, more $$$/hour
     • Infrastructure software: e.g. Google File System
     • Operational expertise: failover, DDoS, firewalls...
     • More pervasive broadband Internet
  42. Classifying Clouds: App Model for Utility Computing
     • Amazon EC2 (lower-level, less managed, “flexibility/portability”): something close to physical hardware; user controls most of stack; hard to auto scale and failover
     • Windows Azure: .NET and CLR…, ASP.NET support; more constraints on user stack; auto provisioning of stateless app
     • Google AppEngine (higher-level, more managed, “more built-in functionality”): app-specific traditional web app model; constrained stateless/stateful tiers; auto scaling and auto high-availability
     • New: ???
     Constraints on app model offer tradeoffs… Lots of ongoing innovation…
     • Instruction set VM (Amazon EC2, 3Tera)
     • Managed runtime VM (Microsoft Azure)
     • Framework VM (Google AppEngine, Force.com)
  43. Killer web applications
     • Mobile and web applications
     • Extensions of desktop software
       – Matlab, Mathematica
     • Batch processing / MapReduce
       – Oracle at Harvard, Hadoop at NY Times
  44. Cloud application demand
     • Many cloud applications have cyclical demand curves
       – Daily, weekly, monthly, …
     • Workload spikes are becoming more frequent and more significant
       – Death of Michael Jackson: 22% of tweets, 20% of Wikipedia traffic, Google thought it was under attack
       – Obama inauguration day: 5x increase in tweets
     Chart: demand over time (axes: resources vs. time)
  45. Cloud user economics: how to choose a capacity level?
     • Pay per use instead of provisioning for the peak
     • Recall: a DC costs > $150M and takes 24+ months to design and build
     Charts (resources vs. time): static data center, with fixed capacity above demand and unused resources in between; data center in the cloud, with capacity following demand
  46. Cloud user economics
     • Risk of over-provisioning: underutilization
       • Huge sunk cost in infrastructure
     Chart (static data center, resources vs. time): capacity, demand, unused resources
  47. Cloud user economics
     • Heavy penalty for under-provisioning
     • Risk of under-utilization if peak predictions are too optimistic: wasted CapEx
     • Very hard to provision for peak workloads
     Charts (resources vs. time in days, capacity and demand, three panels): lost revenue; lost users
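
The over- and under-provisioning arguments on the three slides above can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the demand trace is invented for illustration, and `provisioning_report` is a hypothetical helper, not something from the talk.

```python
def provisioning_report(demand, capacity):
    """Compare a fixed-capacity data center against a demand trace.

    demand: resource units requested in each time step.
    capacity: resource units the static data center can serve per time step.
    """
    unused = sum(max(capacity - d, 0) for d in demand)    # paid for, never used
    unserved = sum(max(d - capacity, 0) for d in demand)  # requests turned away
    served = sum(min(d, capacity) for d in demand)
    utilization = served / (capacity * len(demand))
    return {"unused": unused, "unserved": unserved, "utilization": utilization}


# Hypothetical demand trace with one spike (units are arbitrary).
demand = [20, 30, 80, 100, 60, 30]

provisioned_for_peak = provisioning_report(demand, capacity=100)
provisioned_too_low = provisioning_report(demand, capacity=50)

print("peak provisioning:", provisioned_for_peak)
print("low provisioning: ", provisioned_too_low)
```

With this trace, provisioning for the peak leaves 280 of 600 resource-steps idle, while halving capacity turns away 90 units of demand. A pay-per-use model, where capacity follows demand, would ideally incur neither cost.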
  48. Utility Computing Arrives
     • Amazon Elastic Compute Cloud (EC2)
     • “Compute unit” rental: $0.10-0.80 / $0.085-0.68 per hour (two price ranges shown on the slide)
       – 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Intel Xeon core
     Instance           Price/hour (as shown)   Platform   Units   Memory    Disk
     Small              $0.10 / $0.085          32-bit     1       1.7 GB    160 GB
     Large              $0.40 / $0.35           64-bit     4       7.5 GB    850 GB – 2 spindles
     X Large            $0.80 / $0.68           64-bit     8       15 GB     1690 GB – 4 spindles
     High CPU Med       $0.20 / $0.17           64-bit     5       1.7 GB    350 GB
     High CPU Large     $0.80 / $0.68           64-bit     20      7 GB      1690 GB
     High Mem X Large   $0.50                   64-bit     6.5     17.1 GB   1690 GB
     High Mem XXL       $1.20                   64-bit     13      34.2 GB   1690 GB
     High Mem XXXL      $2.40                   64-bit     26      68.4 GB   1690 GB
     Northern VA cluster
     • No up-front cost, no contract, no minimum
     • Billing rounded to nearest hour (also regional, spot pricing)
     • New paradigm(!) for deploying services? HPC?
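
To show how instance-hour billing adds up, here is a rough cost sketch built on the table above. Two things are assumptions rather than facts from the slide: it uses the second price listed for each instance type as the applicable one, and it bills any partial hour as a full hour. The dictionary keys and function names are hypothetical.

```python
import math

# USD per instance-hour: the second figure shown on the slide for each type
# (assumed here to be the applicable price).
PRICE_PER_HOUR = {
    "small": 0.085,
    "large": 0.35,
    "xlarge": 0.68,
    "high_cpu_medium": 0.17,
    "high_cpu_large": 0.68,
}

# EC2 compute units per instance, from the "Units" column.
COMPUTE_UNITS = {
    "small": 1,
    "large": 4,
    "xlarge": 8,
    "high_cpu_medium": 5,
    "high_cpu_large": 20,
}


def instance_cost(kind, count, hours):
    """Cost of running `count` instances for `hours` each.

    Assumption: a started hour is billed in full, hence the ceiling.
    """
    billed_hours = math.ceil(hours)
    return count * billed_hours * PRICE_PER_HOUR[kind]


def price_per_compute_unit(kind):
    """Hourly price divided by the number of compute units of the instance."""
    return PRICE_PER_HOUR[kind] / COMPUTE_UNITS[kind]


daily_small_fleet = instance_cost("small", count=10, hours=24)
print(f"10 small instances for one day: ${daily_small_fleet:.2f}")
for kind in PRICE_PER_HOUR:
    print(f"{kind}: ${price_per_compute_unit(kind):.4f} per compute unit-hour")
```

Dividing price by compute units makes the high-CPU types stand out as the cheapest per unit of CPU, which is the kind of comparison the table invites. Bandwidth and storage charges are left out of this sketch.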
  49. Economics of Cloud Providers
     • Microsoft and Google race to build next-gen DCs (Jan '07)
       – Microsoft announces a $550 million DC in Texas
       – Google confirms plans for a $600 million site in North Carolina
       – Google: two more DCs in South Carolina; may cost another $950 million, about 150,000 computers each
     • Power availability drives deployment decisions
  50. Hidden costs of the cloud
  51. Google Oregon Datacenter
     Source: Harper's (Feb. 2008)
  52. Containerized Datacenters
     Image captions: Nortel Steel Enclosure (containerized telecom equipment); Sun Black Box (242 systems in 20'); Rackable Systems (1,152 systems in 40'); Rackable Systems container cooling model
     James Hamilton talk, 1/7/2007
  53. Unit of Data Center Growth
     • One at a time:
       – 1 system
       – Racking & networking: 14 hrs ($1,330)
     • Rack at a time:
       – ~40 systems
       – Install & networking: .75 hrs ($60)
     • Container at a time:
       – ~1,000 systems
       – No packaging to remove
       – No floor space required
       – Power, network, & cooling only
       • Weatherproof & easy to transport
     • Data center construction takes 24+ months
       – Both new build & DC expansion require regulatory approval
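
The installation figures above are easier to compare once normalized per system. The sketch below does only that arithmetic. It assumes the quoted dollar and hour figures cover the whole unit being installed (one system, or one rack of about 40), and `per_system` is a hypothetical helper.

```python
def per_system(total, systems):
    """Spread a total cost or effort evenly over the systems installed."""
    return total / systems


# Figures from the slide: (hours, USD, systems installed).
one_at_a_time = {"hours": 14, "usd": 1330, "systems": 1}
rack_at_a_time = {"hours": 0.75, "usd": 60, "systems": 40}

single_usd = per_system(one_at_a_time["usd"], one_at_a_time["systems"])
rack_usd = per_system(rack_at_a_time["usd"], rack_at_a_time["systems"])
rack_hours = per_system(rack_at_a_time["hours"], rack_at_a_time["systems"])

print(f"one at a time:  ${single_usd:.2f} per system")
print(f"rack at a time: ${rack_usd:.2f} per system, {rack_hours * 60:.1f} minutes per system")
print(f"cost ratio: about {single_usd / rack_usd:.0f}x")
```

The slide gives no cost figures for container-at-a-time delivery, so it is left out of the comparison.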
  54. Sun Modular Datacenter “BlackBox” (GreenBox)
     • Delivered June 9th, operational in September
       – Significant challenges with cooling reliability
     • 7.5 40U racks
       – Power and cooling equivalent to all Soda machine rooms
  55. Economics of Cloud Providers
     Economies of scale for humongous datacenters (1,000s to 10,000s of commodity computers)
     • Electricity: put datacenters at cheap power
     • Network: put datacenters on main trunks
     • Operations: standardize and automate ops
     • Hardware: containerized low-cost servers
     5 to 7 times reduction in the cost of computing…
     • Economy of scale vs. provisioning a medium-sized (100s of machines) facility
       – Public (utility) vs. private clouds issue
     • Build-out driven by demand growth (more users)
  56. Power and cooling are expensive!
     Power and cooling infrastructure costs A LOT
     • Infrastructure PLUS energy > server cost since 2001
     • Infrastructure alone > server cost since 2004
     • Energy alone > server cost since 2008
     • Cost effective to discard inefficient servers, like airlines retiring fuel-guzzling airplanes
     • Willing to pay more $/server for more efficient, more powerful servers
     • Energy savings mean infrastructure savings!
     Belady, C., “In the Data Center, Power and Cooling Costs More than IT Equipment it Supports”, Electronics Cooling Magazine (Feb 2007)
  57. Public vs. Private Clouds
     • Building a very large-scale datacenter is very expensive
       – $100+ million (minimum)
     • Large Internet companies already building huge DCs
       – Google, Amazon, Microsoft…
     • Large Internet companies already building software
       – MapReduce, GoogleFS, BigTable, Dynamo
     Technology       Cost in medium-sized DC        Cost in very large DC          Ratio
     Network          $95 per Mbit/sec/month         $13 per Mbit/sec/month         7.1
     Storage          $2.20 per GByte/month          $0.40 per GByte/month          5.7
     Administration   ≈ 140 servers/administrator    > 1000 servers/administrator   7.1
     Huge DCs are 5-7X as cost effective as medium-scale DCs
     James Hamilton, Internet Scale Service Efficiency, Large-Scale Distributed Systems and Middleware (LADIS) Workshop, Sept '08
  58. Extra benefits for cloud providers
     • Amazon: uses idle capacity
     • Microsoft: sells .NET tools
     • Google: reuses existing infrastructure
  59. Platform - Amazon Web Services
     • Elastic Compute Cloud (EC2)
       – Rent computing resources by the hour
       – Basic unit of accounting = instance-hour
       – Additional costs for bandwidth
     • Simple Storage Service (S3)
       – Persistent storage
       – Charge by the GB/month
       – Additional costs for bandwidth
  60. Platform - Amazon Web Services (EC2)
     • Infrastructure as a Service provider, and current market leader
     • Data centers in USA and Europe
     • Different regions and availability zones
     • Uses Xen hypervisor
     • Users provision instances in classes, with different CPU, memory and I/O performance
  61. Platform - Amazon Web Services (EC2)
     • Users provision instances with an Amazon Machine Image (AMI), a packaged virtual machine
       – Instances ready in 10-20 seconds
       – Amazon provides a range of AMIs
     • Users can upload and share custom AMIs, preconfigured for different roles
     • Supports Windows, OpenSolaris and Linux
     • Control interface
       – HTTP REST/SOAP API
       – Command line tools
     • External monitoring and scaling can be implemented using the interface
  62. Platform - Amazon Web Services (EC2)
     • Flexible, but low-level (roll-your-own)
     • No built-in load balancing or scaling (yet)
     • Integrated with services:
       – Simple Storage Service (S3)
       – Simple Queue Service (SQS)
       – SimpleDB
     • Pricing based on instance hours
       – + bandwidth charges
       – + service charges (S3, SQS etc.)
  63. Platform – Windows Azure
     • Platform as a Service (in pre-release)
       – “Cloud OS”
       – .NET libraries for managed code like C#
       – Web and worker roles (w/queues)
     • Topology described in metadata
     • Live upgrades (w/upgrade zones)
  64. Platform – Google App Engine
     • Platform as a Service
     • Target: Web applications
     • Provides custom Python runtime environment, with a specialized version of the Django framework
     • Integrated with Google data store (Bigtable), and other “Internet-scale” infrastructure
     • Currently also supports Java
  65. Cloud Computing Infrastructure
     • Computation model: MapReduce*
     • Storage model: HDFS*
     • Other computation models: HPC/Grid Computing
     • Network structure
     *Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)
  66. Cloud Computing Computation Models
     • Finding the right level of abstraction
       – von Neumann architecture vs. cloud environment
     • Hide system-level details from the developers
       – No more race conditions, lock contention, etc.
     • Separating the what from the how
       – Developer specifies the computation that needs to be performed
       – Execution framework (“runtime”) handles actual execution
  67. “Big Ideas”
     • Scale “out”, not “up”
       – Limits of SMP and large shared-memory machines
     • Idempotent operations
       – Simplifies redo in the presence of failures
     • Move processing to the data
       – Cluster has limited bandwidth
     • Process data sequentially, avoid random access
       – Seeks are expensive, disk throughput is reasonable
     • Seamless scalability for ordinary programmers
       – From the mythical man-month to the tradable machine-hour
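
The "idempotent operations" point is the one that most directly shapes how tasks are written, so a small illustration may help. This is a sketch, not how any particular framework does it: the in-memory `store` and `log` stand in for durable storage, and all names are hypothetical.

```python
def run_task_idempotent(store, task_id, records):
    """Write the result under the task's own key.

    Running the task again overwrites the same entry, so a retry after a
    suspected failure cannot duplicate output.
    """
    store[task_id] = sum(records)


def run_task_appending(log, records):
    """Non-idempotent variant: every run appends another copy of the result."""
    log.append(sum(records))


store, log = {}, []
for _attempt in range(2):  # simulate the runtime re-executing a task
    run_task_idempotent(store, "task-7", [1, 2, 3])
    run_task_appending(log, [1, 2, 3])

print(store)  # one entry, however many times the task ran
print(log)    # one entry per attempt
```

When tasks have the first shape, the runtime can simply rerun anything it is unsure about, which is what "simplifies redo in the presence of failures" refers to.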
  68. Typical Large-Data Problem
     • Iterate over a large number of records
     • Extract something of interest from each
     • Shuffle and sort intermediate results
     • Aggregate intermediate results
     • Generate final output
     Key idea: provide a functional abstraction for these two operations – MapReduce
     (Dean and Ghemawat, OSDI 2004)
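
The five steps above can be sketched in plain Python as a single-process word count. This is only an illustration of the pattern, not the Google or Hadoop implementation. It assumes the "two operations" are the extract step (map) and the aggregate step (reduce), and every function name here is hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Iterate over the records and extract (key, value) pairs from each."""
    for record in records:
        yield from mapper(record)


def shuffle_and_sort(pairs):
    """Group intermediate values by key and order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(groups, reducer):
    """Aggregate each group of intermediate values into a final result."""
    return [reducer(key, values) for key, values in groups]


def word_mapper(line):
    """User-supplied map function: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """User-supplied reduce function: sum the counts for one word."""
    return word, sum(counts)


def word_count(lines):
    pairs = map_phase(lines, word_mapper)
    return reduce_phase(shuffle_and_sort(pairs), count_reducer)


result = word_count(["the cloud is the new server", "The grid is older"])
print(result)
```

Only `word_mapper` and `count_reducer` are specific to the problem. In a real framework the other three functions are the runtime's job, distributed over many machines.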
  69. Google MapReduce: Simplified Data Processing on Clusters/Clouds
     • http://labs.google.com/papers/mapreduce.html
     • This is a dataflow model between services, where services can perform useful document-oriented data-parallel applications, including reductions
     • The decomposition of services onto cluster engines (clouds) is automated
     • The large I/O requirements of datasets change the efficiency analysis in favor of dataflow
     • Services (count words in the example) can obviously be extended to general parallel applications
     • There are many alternative languages for expressing dataflow, parallel operations and/or workflow
  70. Roots in Functional Programming
     Diagram: Map applies f to each element independently; Fold combines the elements with g into an accumulated result
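
The map and fold pattern named on this slide exists directly in Python's standard library, which makes for a compact illustration. The particular choices of `f` (squaring) and `g` (addition) are arbitrary examples, not taken from the talk.

```python
from functools import reduce


def f(x):
    """Applied to every element independently, so it can run in parallel."""
    return x * x


def g(accumulator, y):
    """Combines the running result with the next element."""
    return accumulator + y


values = [1, 2, 3, 4, 5]
mapped = list(map(f, values))   # [1, 4, 9, 16, 25]
folded = reduce(g, mapped, 0)   # 55

print(mapped, folded)
```

Because each call to `f` depends on one element only, the map step parallelizes trivially. The fold step carries state from one element to the next, so it needs more care when distributed.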
  71. Putting everything together…
     Diagram: a namenode (running the namenode daemon), a job submission node (running the jobtracker), and several slave nodes, each running a tasktracker and a datanode daemon on top of the Linux file system
  72. MapReduce/GFS Summary
     • Simple but powerful programming model
     • Scales to handle petabyte+ workloads
       – Google: six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers
       – Yahoo!: 16.25 hours to sort 1PB on 3,800 computers
     • Performance improves incrementally with more nodes
     • Handles failures seamlessly, but possibly with performance penalties
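
The two sort results above can be compared on a per-machine basis with a little arithmetic. The sketch below assumes 1 PB = 10^15 bytes, which matches the slide's "10 trillion 100-byte records". The function name is hypothetical, and the result is end-to-end sort throughput, not raw disk or network bandwidth.

```python
PETABYTE = 10**15  # bytes; 10 trillion records x 100 bytes


def per_machine_throughput_mb_s(total_bytes, hours, machines):
    """Average bytes sorted per second per machine, in MB/s (1 MB = 10^6 bytes)."""
    seconds = hours * 3600
    return total_bytes / seconds / machines / 1e6


google = per_machine_throughput_mb_s(PETABYTE, 6 + 2 / 60, 4000)
yahoo = per_machine_throughput_mb_s(PETABYTE, 16.25, 3800)

print(f"Google: {google:.1f} MB/s per machine")
print(f"Yahoo!: {yahoo:.1f} MB/s per machine")
```

Both figures come out in the range of a few to roughly ten MB/s per machine, a reminder that at this scale the aggregate matters more than the speed of any single node.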
  73. Implementation
  74. Business strategies
     • Microsoft: Software plus Services
       – Uses .NET and Windows
     • IBM: Transformation through Customer Implementations
       – Implementation built with customer participation
     • Cisco: Evolving Interoperability
       – Provides Web 2.0-based tools
  75. Implementation methodology
  76. Define use cases
  77. Assess infrastructure
  78. Implement
  79. Issues to consider
  80. Issues to consider
  81. Best practices
  82. Criteria to consider
  83. Summary
     • Many benefits of cloud computing:
       – Shift from CapEx to OpEx, scale OpEx with demand
       – Startups and prototyping, one-off tasks (Wash. Post)
       – Cost associativity
       – Research at scale
     • Many cloud computing challenges:
       – Availability
       – Data in the cloud can be “heavy” ($$$ to move)
  84. References
     • http://en.wikipedia.org/wiki/Cloud_computing
       – Includes references to Amazon, Apple, Dell, Enomalism, Globus, Google, IBM, KnowledgeTreeLive, Nature, New York Times, Zimdesk
       – Others, such as Microsoft Windows Live Skydrive, are also important
     • http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
     • http://uc.princeton.edu/main/index.php?option=com_content&task=view&id=2589&Itemid=1 (policy issues)
     • http://www.cra.org/ccc/home.article.bigdata.html
       – Hadoop (MapReduce) and “Data Intensive Computing”
       – See the data intensive computing minitrack at HICSS-42, January 2009
     • http://ianfoster.typepad.com/blog/2008/01/theres-grid-in.html
       – OGF Thought Leadership blog
     • OGF22 talks by Charlie Catlett and Irving Wladawsky-Berger
  85. Thank you!
     jguerrag@unmsm.edu.pe
     jorgeguerra@uigv.edu.pe
