Cloud Computing: Scope and Implementation

Cloud computing talk, August 2010

Presentation Transcript

  • Implementing Cloud Computing: Scope and Technology. Lic. Jorge Guerra Guerra, Universidad Nacional Mayor de San Marcos. XVII Congreso Nacional de Estudiantes de Ingeniería de Sistemas y Computación, August 6, 2010. http://sites.google.com/site/jguerra91/home/
  • Agenda
      • Definitions
      • Taxonomy
      • Costs
      • Implementations
  • What is cloud computing?
      • “It's nothing new”: “... we have redefined cloud computing to include everything that we already do ... I don't understand what we would do differently ... other than change the wording of some of our ads.” Larry Ellison, CEO, Oracle (Wall Street Journal, Sept. 26, 2008)
      • “It's a trap”: “It's the worst stupidity: it's marketing hype. Somebody is saying this is inevitable, and whenever I hear that, it's very likely to be a campaign by some businesses to make it true.” Richard Stallman, Founder, Free Software Foundation (The Guardian, Sept. 29, 2008)
      • There is no consistent answer…
  • Everyone has a lot of data to process!
      • The Wayback Machine holds 2 PB and grows by 20 TB/month (2006)
      • Google processes 20 PB per day (2008)
      • “All the words ever spoken by human beings” ~ 5 EB
      • NOAA has ~1 PB of climate data (2007)
      • CERN's LHC generates 15 PB per year (2008)
      • Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License). Photo: Maximilien Brice, © CERN
  • Evolution toward the cloud (Source: http://news.cnet.com)
  • What is cloud computing?
      • Old ideas:
          – Grids, vector supercomputers
          – Software as a Service (SaaS): delivering applications over the Internet
      • Recently: “[Hardware, Infrastructure, Platform] as a service”
          – Poorly defined, so “X as a service” labels are best avoided
      • Utility computing: pay-as-you-go computing
          – Illusion of infinite resources
          – No up-front cost
          – Fine-grained billing (e.g. hourly)
  • Formal definitions
      • A style of computing in which massively scalable IT-enabled capabilities are provided “as a service” over the network (IBM)
  • Characteristics
      • Virtual
          – Physical location and infrastructure details are transparent to users
      • Scalable
          – Able to break complex workloads into pieces that are served across an incrementally expandable infrastructure
      • Efficient
          – Service-oriented architecture for dynamic provisioning of shared computing resources
      • Flexible
          – Can serve a variety of workload types, both consumer and enterprise
  • User perception
  • How users see cloud computing
      • “I only care about results, not how the IT capabilities are implemented”
      • “I want to pay for what I use, like any other utility”
      • “I can access services from anywhere, from any device”
      • “I can scale capacity up or down as needed”
  • Laird's Cloud/SaaS map
  • Gartner's cloud evolution curve
  • Cloud implementations
  • Types of implementation
  • SaaS
  • Wolosky's SaaS map (2008)
  • Types of cloud computing
  • Types
  • Enabling Technology: Virtualization
      • [Figure: traditional stack (apps on one operating system on hardware) vs. virtualized stack (apps on separate OSes on a hypervisor on hardware)]
      • Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)
  • Many types of virtualization
      • Full virtualization
          – Sensitive instructions (discovered statically or dynamically at run time) are replaced by binary translation or trapped by hardware into the VMM for software emulation
          – Any OS can run in the VM
          – Examples: IBM's CP/CMS, Oracle (Sun) VirtualBox, VMware Workstation
      • Hardware-assisted virtualization (IBM S/370, Intel VT, or AMD-V)
          – The CPU traps sensitive instructions, so the guest OS runs unmodified
          – Examples: VMware Workstation, Linux Xen, Linux KVM, Microsoft Hyper-V
      • Para-virtualization
          – Presents a software interface to virtual machines that is similar but not identical to that of the underlying hardware, so guest operating systems must be adapted
          – Examples: early versions of Xen
      • Operating-system-level virtualization
          – The OS kernel allows multiple isolated user-space instances instead of just one
          – Each instance looks and feels like a real server
          – Examples: Solaris Zones, QEMU, BSD Jails, OpenVZ
  • What about the Grid?
      • Hitachi SR8000, Leibniz Rechenzentrum: 2 TFlop/s (2 × 10^12)
  • Grid Computing
      • Grid computing criteria (Ian Foster 2004)
          – Coordination: a grid must coordinate resources that are not subject to centralized control
          – Open APIs: a grid must use standard, open, general-purpose protocols and interfaces
          – QoS: a grid must deliver nontrivial qualities of service (e.g., relating to response time, throughput, availability, and security) for co-allocating multiple resource types to meet complex user demands
      • Promise of ubiquitous grid computing (utility)
          – Reality is specialized grids: TeraGrid, Open Science Grid, LHC Grid
          – Grid provides “library level” service customized to HW
          – Ensuring consistent libraries across HW is hard!
  • Cloud Computing vs. Grid Computing
  • The datacenter is the new “server”
      • “Program” = Web search, email, map/GIS, …
      • “Computer” = 1000s of computers, storage, network
      • Warehouse-sized facilities and workloads
      • New datacenter ideas (2007-2008): truck container (Sun), floating (Google), datacenter-in-a-tent (Microsoft)
      • How can innovation in new services be enabled without first building and capitalizing a large company?
      • Photos: Sun Microsystems & datacenterknowledge.com
  • Datacenter Architectures
      • Major engineering design challenges in building datacenters
          – One of Google's biggest secrets and challenges
          – Read: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf
          – Very hard to get everything correct!
      • Some issues
          – Network access, physical security, power
          – And there's all the software…
  • Some have very secure fiber access… (Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking)
  • Some have less than that (Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking)
  • Security infrastructure (Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking)
      • Manned 24x7
      • Access: biometrics, card keys
      • Video surveillance
      • Sliding glass
  • Some are very secure… http://www.thebunker.net (Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking)
  • Others look as if a hurricane had passed through… (Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking)
  • Datacenter Architectures
      • Let's look at an example from telco professionals
      • Example: AT&T Miami, Florida Tier 1 datacenter
          – Redundant dual uplinks to AT&T global backbone
          – Minimum N+1 redundancy factor on all critical infrastructure systems
  • AT&T Internet Data Center Security (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • Hardened facilities protected by multiple security measures:
          – 24x7x365 on-premise support
          – Continuous CCTV surveillance, security breach alarms, electronic card key access, biometric palm scan and individual personal access code
          – Secured cage and cabinet environment
  • AT&T Internet Data Center Power (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • [Diagram labels: Commercial Power Supply, Transformer, Paralleling Switch Gear / Manual Switch, Batteries, UPS Systems, Diesel Fuel Tanks, Generators, Power Distribution Units, Remote Power Panels]
  • AT&T Internet Data Center Power: Commercial Power Feeds (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • 2 commercial feeds, each at 13,800 V
      • Located near a substation supplied from 2 different grids
      • All cable routed underground for protection
  • AT&T Internet Data Center Power: Paralleling Switch Gear (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • Automatically powers up all generators when commercial power is interrupted for more than 7 seconds
          – Generators are shed to cover load as needed
          – Typical transition takes less than 60 seconds
      • Manual override available to ensure continuity if automatic start-up should fail
      • [Photo: emergency power switch]
  • AT&T Internet Data Center Power: UPS Batteries (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • Four (4) battery strings to support the UPS systems
      • Battery strings contain flooded cell batteries
      • A minimum of fifteen (15) minutes of battery backup available at full load
      • Hydrogen sensors monitoring
      • Remote status monitoring of battery strings
  • AT&T Internet Data Center Power: Uninterruptible Power Supply (UPS) (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • Eliminates spikes, sags, surges, transients, and all other over/under voltage and frequency conditions, providing clean power to connected critical loads
      • Four UPS modules connected in a ring bus configuration
      • Each module rated at 1000 kVA
      • Rotary type UPS by Piller
  • AT&T Internet Data Center Power: Back-up Power, Generators and Diesel Fuel (Source: AT&T Enterprise Hosting Services briefing, 10/29/2008)
      • Four (4) 2,500 kW diesel generators providing standby power, capable of producing 10 MW of power
      • Two (2) 33,000 gallon aboveground diesel fuel storage tanks
  • Typical Tier-2 One Megawatt Datacenter
      • Reliable power: mains + generator, dual UPS
      • Units of aggregation
          – Rack (10-80 nodes) → PDU (20-60 racks) → Facility/Datacenter
      • [Diagram labels: Main Supply, Transformer, ATS, Generator, Switch Board (1000 kW), UPS (x2), STS, PDU (200 kW), Panel (50 kW), Circuit, Rack (2.5 kW)]
      • Source: X. Fan, W.-D. Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA'07, San Diego (June 2007)
  • Systems & Power Density (Source: James Hamilton talk, 1/17/2007)
      • Estimating DC power density is hard
          – Power is 40% of DC costs (power + mechanical: 55% of cost)
          – Shell is roughly 15% of DC cost
          – Cheaper to waste floor than power
          – Typically 100 to 200 W/sq ft
          – Rarely as high as 350 to 600 W/sq ft
      • Over 20% of entire DC costs is in power redundancy
          – Batteries able to supply 13 megawatts for 12 min
          – N+2 generation (11 x 2.5 megawatts)
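As a rough worked example of the redundancy figures on the power-density slide, the sketch below converts the battery rating into stored energy and totals the generator capacity. It assumes that "N+2" means two of the eleven generators are spares; the function names are made up for illustration and do not come from the talk.

```python
def battery_energy_mwh(power_mw, minutes):
    """Energy delivered when supplying power_mw for the given number of minutes."""
    return power_mw * minutes / 60.0


def generator_capacity_mw(units, unit_mw, spares=0):
    """Capacity still available with `spares` units held back or out of service."""
    return (units - spares) * unit_mw


battery_mwh = battery_energy_mwh(13, 12)             # 13 MW for 12 min
installed_mw = generator_capacity_mw(11, 2.5)        # all 11 units running
firm_mw = generator_capacity_mw(11, 2.5, spares=2)   # assumption: N+2 means 2 spares

print(f"battery energy: {battery_mwh:.1f} MWh")
print(f"installed generation: {installed_mw:.1f} MW")
print(f"generation with 2 units out: {firm_mw:.1f} MW")
```

Under that reading, the batteries only need to bridge the short gap until the generators take over, which is why a few megawatt-hours of storage can back tens of megawatts of load.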
  • Why now (and not before)?
      • Commoditization of HW & SW
          – x86 as universal ISA, plus fast virtualization
          – Standard software stack, largely open source (LAMP)
          – Bet: can statistically multiplex multiple instances onto a single box without interference between instances
      • Novel economic model: fine-grained billing
          – Earlier examples: Sun, Intel Computing Services (longer commitment, more $$$/hour)
      • Infrastructure software: e.g. Google File System
      • Operational expertise: failover, DDoS, firewalls...
      • More pervasive broadband Internet
  • Classifying Clouds: App Model for Utility Computing
      • Spectrum from lower-level, less managed (“flexibility/portability”) to higher-level, more managed (“more built-in functionality”):
          – Amazon EC2: something close to physical hardware; user controls most of the stack; hard to auto scale and fail over
          – Windows Azure: .NET and CLR, ASP.NET support; more constraints on user stack; auto provisioning of stateless app
          – Google AppEngine: app-specific traditional web app model; constrained stateless/stateful tiers; auto scaling and auto high-availability
          – New: ???
      • Constraints on app model offer tradeoffs… lots of ongoing innovation…
      • Instruction set VM (Amazon EC2, 3Tera)
      • Managed runtime VM (Microsoft Azure)
      • Framework VM (Google AppEngine, Force.com)
  • Killer web applications
      • Mobile and web applications
      • Extensions of desktop software
          – Matlab, Mathematica
      • Batch processing / MapReduce
          – Oracle at Harvard, Hadoop at NY Times
  • Cloud application demand
      • Many cloud applications have cyclical demand curves
          – Daily, weekly, monthly, …
      • Workload spikes are more frequent and significant
          – Death of Michael Jackson: 22% of tweets, 20% of Wikipedia traffic, Google thought it was under attack
          – Obama inauguration day: 5x increase in tweets
      • [Figure: demand curve; axes: resources vs. time]
  • Cloud user economics
      • How to choose a capacity level?
      • Pay per use instead of provisioning for the peak
      • Remember: a DC costs > $150M and takes 24+ months to design and build
      • [Figure: resources vs. time, showing capacity and demand for a static data center (with unused resources) and for a data center in the cloud]
  • Cloud user economics
      • Risk of over-provisioning: under-utilization
      • Huge sunk cost in infrastructure
      • [Figure: static data center; capacity, demand and unused resources over time]
  • Cloud user economics
      • Heavy penalty for under-provisioning
      • Risk of under-utilization if peak predictions are too optimistic: wasted CapEx
      • Very hard to provision for peak workloads
      • [Figure: three panels of resources vs. time (days 1-3) showing capacity and demand; labels: application, lost revenue, lost users]
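The over- and under-provisioning trade-off on the economics slides can be sketched numerically. The demand series and capacity levels below are invented purely for illustration, and `provisioning_summary` is a hypothetical helper, not something from the talk.

```python
def provisioning_summary(demand, capacity):
    """Compare a fixed capacity against a demand series (one value per period)."""
    served = [min(d, capacity) for d in demand]
    unused = sum(capacity - s for s in served)          # paid for but idle
    unmet = sum(d - s for d, s in zip(demand, served))  # lost revenue or users
    utilization = sum(served) / (capacity * len(demand))
    return {"utilization": utilization, "unused": unused, "unmet": unmet}


demand = [20, 30, 80, 100, 40, 30]   # hypothetical resource units per period

peak = provisioning_summary(demand, capacity=100)   # provision for the peak
low = provisioning_summary(demand, capacity=50)     # under-provision
pay_per_use_units = sum(demand)                     # elastic case: pay only for use

print("peak-provisioned:", peak)
print("under-provisioned:", low)
print("pay-per-use units:", pay_per_use_units)
```

With these made-up numbers, provisioning for the peak leaves half the capacity idle, while the smaller installation turns away demand in the two busiest periods; a pay-per-use model is billed only for the units actually consumed.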
  • Utility Computing Arrives
      • Amazon Elastic Compute Cloud (EC2)
      • “Compute unit” rental: $0.085-0.68/hour (previously $0.10-0.80)
          – 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Intel Xeon core
      • Instance types (Northern VA cluster):

        Instance          Price/hour (was)  Platform  Units  Memory  Disk
        Small             $0.085 ($0.10)    32-bit    1      1.7GB   160GB
        Large             $0.35 ($0.40)     64-bit    4      7.5GB   850GB, 2 spindles
        X Large           $0.68 ($0.80)     64-bit    8      15GB    1690GB, 4 spindles
        High CPU Med      $0.17 ($0.20)     64-bit    5      1.7GB   350GB
        High CPU Large    $0.68 ($0.80)     64-bit    20     7GB     1690GB
        High Mem X Large  $0.50             64-bit    6.5    17.1GB  1690GB
        High Mem XXL      $1.20             64-bit    13     34.2GB  1690GB
        High Mem XXXL     $2.40             64-bit    26     68.4GB  1690GB

      • No up-front cost, no contract, no minimum
      • Billing rounded to nearest hour (also regional, spot pricing)
      • New paradigm(!) for deploying services? HPC?
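To make the hourly pricing concrete, here is a minimal cost sketch using three of the per-hour prices listed on the slide. It assumes partial hours are billed as full hours, which is an assumption about how the rounding works (replace `math.ceil` with `round` if the provider rounds to the nearest hour); bandwidth and storage charges are ignored, and `instance_cost` is a hypothetical helper.

```python
import math

# Per-hour prices taken from the slide (US dollars).
HOURLY_PRICE_USD = {"small": 0.085, "large": 0.35, "xlarge": 0.68}


def instance_cost(instance_type, runtime_hours, count=1):
    """Cost of running `count` instances for `runtime_hours` each."""
    billed_hours = math.ceil(runtime_hours)   # assumption: partial hours round up
    return HOURLY_PRICE_USD[instance_type] * billed_hours * count


one_machine = instance_cost("small", runtime_hours=1000)
many_machines = instance_cost("small", runtime_hours=1, count=1000)
partial = instance_cost("large", runtime_hours=2.2)

print(f"1 small instance for 1000 h: ${one_machine:.2f}")
print(f"1000 small instances for 1 h: ${many_machines:.2f}")
print(f"1 large instance for 2.2 h:  ${partial:.2f}")
```

The first two figures come out the same: with no up-front cost and per-hour billing, a thousand machines for one hour cost as much as one machine for a thousand hours.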
  • Economics of Cloud Providers
      • Microsoft and Google race to build next-gen DCs (Jan '07)
          – Microsoft announces a $550 million DC in Texas
          – Google confirms plans for a $600 million site in North Carolina
          – Google plans two more DCs in South Carolina; may cost another $950 million; about 150,000 computers each
      • Power availability drives deployment decisions
  • Hidden costs of the cloud
  • Google Oregon Datacenter (Source: Harper's, Feb 2008)
  • Containerized Datacenters (Source: James Hamilton talk, 1/7/2007)
      • Nortel Steel Enclosure: containerized telecom equipment
      • Sun Black Box (242 systems in 20')
      • Rackable Systems (1,152 systems in 40')
      • Rackable Systems container cooling model
  • Unit of Data Center Growth
      • One at a time:
          – 1 system
          – Racking & networking: 14 hrs ($1,330)
      • Rack at a time:
          – ~40 systems
          – Install & networking: .75 hrs ($60)
      • Container at a time:
          – ~1,000 systems
          – No packaging to remove
          – No floor space required
          – Power, network, & cooling only
          – Weatherproof & easy to transport
      • Data center construction takes 24+ months
          – Both new build & DC expansion require regulatory approval
  • Sun Modular Datacenter “BlackBox” (GreenBox)
      • Delivered June 9th, operational in September
          – Significant challenges with cooling reliability
      • 7.5 40U racks
          – Power and cooling equivalent to all Soda machine rooms
  • Economics of Cloud Providers
      • Economies of scale for humongous datacenters (1,000s to 10,000s of commodity computers)
          – Electricity: put datacenters at cheap power
          – Network: put datacenters on main trunks
          – Operations: standardize and automate ops
          – Hardware: containerized low-cost servers
      • 5 to 7 times reduction in the cost of computing…
      • Economy of scale vs. provisioning a medium-sized (100s of machines) facility
          – Public (utility) vs. private clouds issue
      • Build-out driven by demand growth (more users)
  • Power and cooling are expensive!
      • Power and cooling infrastructure costs A LOT
          – Infrastructure PLUS energy > server cost since 2001
          – Infrastructure alone > server cost since 2004
          – Energy alone > server cost since 2008
      • Cost effective to discard inefficient servers (like airlines retiring fuel-guzzling airplanes)
      • Willing to pay more $/server for more efficient, more powerful servers: energy savings and infrastructure savings!
      • Source: Belady, C., “In the Data Center, Power and Cooling Costs More than IT Equipment it Supports”, Electronics Cooling Magazine (Feb 2007)
  • Public vs. Private Clouds
      • Building a very large-scale datacenter is very expensive
          – $100+ million (minimum)
      • Large Internet companies already building huge DCs
          – Google, Amazon, Microsoft…
      • Large Internet companies already building software
          – MapReduce, GoogleFS, BigTable, Dynamo

        Technology      Cost in Medium-Sized DC      Cost in Very Large DC         Ratio
        Network         $95 per Mbit/sec/month       $13 per Mbit/sec/month        7.1
        Storage         $2.20 per GByte/month        $0.40 per GByte/month         5.7
        Administration  ≈ 140 servers/administrator  > 1000 servers/administrator  7.1

      • Huge DCs are 5-7X as cost effective as medium-scale DCs
      • Source: James Hamilton, Internet Scale Service Efficiency, Large-Scale Distributed Systems and Middleware (LADIS) Workshop, Sept '08
  • Extra benefits for cloud providers
      • Amazon: uses idle capacity
      • Microsoft: sells .NET tools
      • Google: reuses existing infrastructure
  • Platform: Amazon Web Services
      • Elastic Compute Cloud (EC2)
          – Rent computing resources by the hour
          – Basic unit of accounting = instance-hour
          – Additional costs for bandwidth
      • Simple Storage Service (S3)
          – Persistent storage
          – Charge by the GB/month
          – Additional costs for bandwidth
  • Platform: Amazon Web Services (EC2)
      • Infrastructure as a Service provider, and current market leader
      • Data centers in USA and Europe
      • Different regions and availability zones
      • Uses Xen hypervisor
      • Users provision instances in classes, with different CPU, memory and I/O performance
  • Platform: Amazon Web Services (EC2)
      • Users provision instances with an Amazon Machine Image (AMI), a packaged virtual machine
          – Instances ready in 10-20 seconds
          – Amazon provides a range of AMIs
      • Users can upload and share custom AMIs, preconfigured for different roles
      • Supports Windows, OpenSolaris and Linux
      • Control interface
          – HTTP REST/SOAP API
          – Command line tools
      • External monitoring and scaling can be implemented using the interface
  • Platform: Amazon Web Services (EC2)
      • Flexible, but low-level (roll-your-own)
      • No built-in load balancing or scaling (yet)
      • Integrated with services:
          – Simple Storage Service (S3)
          – Scalable Queue Service (SQS)
          – SimpleDB
      • Pricing based on instance hours
          – + bandwidth charges
          – + service charges (S3, SQS etc.)
  • Platform: Windows Azure
      • Platform as a Service (in pre-release)
          – “Cloud OS”
          – .NET libraries for managed code like C#
          – Web and worker roles (w/queues)
      • Topology described in metadata
      • Live upgrades (w/upgrade zones)
  • Platform: Google App Engine
      • Platform as a Service
      • Target: Web applications
      • Provides custom Python runtime environment, with a specialized version of the Django framework
      • Integrated with Google data store (Bigtable), and other “Internet-scale” infrastructure
      • Now also supports Java
  • Cloud Computing Infrastructure
      • Computation model: MapReduce*
      • Storage model: HDFS*
      • Other computation models: HPC/Grid Computing
      • Network structure
      • *Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)
  • Cloud Computing Computation Models
      • Finding the right level of abstraction
          – von Neumann architecture vs. cloud environment
      • Hide system-level details from the developers
          – No more race conditions, lock contention, etc.
      • Separating the what from the how
          – Developer specifies the computation that needs to be performed
          – Execution framework (“runtime”) handles actual execution
  • “Big Ideas”
      • Scale “out”, not “up”
          – Limits of SMP and large shared-memory machines
      • Idempotent operations
          – Simplifies redo in the presence of failures
      • Move processing to the data
          – Cluster has limited bandwidth
      • Process data sequentially, avoid random access
          – Seeks are expensive, disk throughput is reasonable
      • Seamless scalability for ordinary programmers
          – From the mythical man-month to the tradable machine-hour
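The "idempotent operations" idea from the slide above can be illustrated with a small sketch: a result written under a fixed task key can be redone safely after a failure, whereas an append cannot. The helper names and the task id are hypothetical.

```python
def record_result(results, task_id, value):
    """Idempotent: keyed by task id, so a redo overwrites the same entry."""
    results[task_id] = value


def append_result(log, value):
    """Not idempotent: every redo adds another copy."""
    log.append(value)


results, log = {}, []
for attempt in range(2):   # the second pass simulates a redo after a failure
    record_result(results, "task-7", 42)
    append_result(log, 42)

print(results)  # one entry, however many retries
print(log)      # one entry per attempt
```

A framework that only relies on operations of the first kind can simply re-run a failed task without tracking how far it got.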
  • Typical Large-Data Problem
      • Iterate over a large number of records
      • Extract something of interest from each (map)
      • Shuffle and sort intermediate results
      • Aggregate intermediate results (reduce)
      • Generate final output
      • Key idea: provide a functional abstraction for these two operations: MapReduce (Dean and Ghemawat, OSDI 2004)
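The steps listed on this slide can be sketched as a single-process word count. This is a toy illustration of the programming model only, not Google's or Hadoop's implementation; all function names are invented for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_and_sort(pairs):
    """Group values by key and return the groups in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(groups, reducer):
    """Aggregate each key's list of values into a final result."""
    return {key: reducer(key, values) for key, values in groups}


def word_mapper(line):
    """Extract something of interest from a record: one (word, 1) pair per word."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(word, counts):
    return sum(counts)


lines = ["the cloud is a datacenter", "the datacenter is the computer"]
word_counts = reduce_phase(shuffle_and_sort(map_phase(lines, word_mapper)),
                           count_reducer)
print(word_counts)
```

The developer supplies only `word_mapper` and `count_reducer`; in a real framework the iteration, shuffling and grouping would be distributed across the cluster by the runtime.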
  • Google MapReduce: Simplified Data Processing on Clusters/Clouds
      • http://labs.google.com/papers/mapreduce.html
      • This is a dataflow model between services, where services can do useful document-oriented data-parallel applications including reductions
      • The decomposition of services onto cluster engines (clouds) is automated
      • The large I/O requirements of datasets change the efficiency analysis in favor of dataflow
      • Services (count words in the example) can obviously be extended to general parallel applications
      • There are many alternative languages for expressing dataflow and/or parallel operations and/or workflow
  • Roots in Functional Programming
      • [Figure: Map applies a function f to each element independently; Fold combines the elements in sequence with a function g]
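The map and fold operations named on this slide exist directly in Python as `map` and `functools.reduce`. The lambdas below stand in for the slide's f and g; the input values are arbitrary.

```python
from functools import reduce

values = [1, 2, 3, 4, 5]

# Map: apply f to each element independently (f squares its input here).
squared = list(map(lambda x: x * x, values))

# Fold: combine the elements with g, carrying an accumulator (g adds here).
total = reduce(lambda acc, x: acc + x, squared, 0)

print(squared)
print(total)
```

Because each application of f is independent, the map step can be spread over many machines, while the fold step is where results are brought together.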
  • Putting everything together…
      • [Diagram: a namenode running the namenode daemon; a job submission node running the jobtracker; several slave nodes, each running a tasktracker and a datanode daemon on top of the Linux file system]
  • MapReduce/GFS Summary
      • Simple but powerful programming model
      • Scales to handle petabyte+ workloads
          – Google: six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers
          – Yahoo!: 16.25 hours to sort 1PB on 3,800 computers
      • Performance improves incrementally as nodes are added
      • Handles failures seamlessly, though possibly with performance penalties
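As a back-of-the-envelope check on the sort figures above, the sketch below converts them into average throughput per machine. It assumes 1 PB means 10^15 bytes in both cases (the slide states this only for the Google run); `per_node_mb_per_s` is a hypothetical helper.

```python
PETABYTE = 10 * 10**12 * 100   # 10 trillion 100-byte records = 10**15 bytes


def per_node_mb_per_s(total_bytes, hours, nodes):
    """Average sort throughput per machine, in MB/s (1 MB = 10**6 bytes)."""
    seconds = hours * 3600
    return total_bytes / seconds / nodes / 1e6


google = per_node_mb_per_s(PETABYTE, 6 + 2 / 60, 4000)   # 6 h 2 min, 4,000 nodes
yahoo = per_node_mb_per_s(PETABYTE, 16.25, 3800)         # 16.25 h, 3,800 nodes

print(f"Google: {google:.1f} MB/s per machine")
print(f"Yahoo!: {yahoo:.1f} MB/s per machine")
```

Both rates are modest for a single machine, which suggests the headline result comes from the number of nodes working in parallel rather than from any one node being fast.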
  • Implementation
  • Business strategies
      • Microsoft: Software plus Services
          – Use of .NET and Windows
      • IBM: Transformation through Customer Implementations
          – Implementation built with customer participation
      • Cisco: Evolving Interoperability
          – Provides Web 2.0-based tools
  • Implementation methodology
  • Define use cases
  • Assess infrastructure
  • Implement
  • Issues to consider
  • Issues to consider (continued)
  • Best practices
  • Criteria to consider
  • Summary
      • Many benefits of cloud computing:
          – Shift from CapEx to OpEx, and scale OpEx with demand
          – Startups and prototyping, one-off tasks (Wash. Post)
          – Cost associativity
          – Research at scale
      • Many cloud computing challenges:
          – Availability
          – Data in the cloud can be “heavy” ($$$ to move)
  • References
      • http://en.wikipedia.org/wiki/Cloud_computing
          – Includes references to Amazon, Apple, Dell, Enomalism, Globus, Google, IBM, KnowledgeTreeLive, Nature, New York Times, Zimdesk
          – Others like Microsoft Windows Live Skydrive are important
      • http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
      • http://uc.princeton.edu/main/index.php?option=com_content&task=view&id=2589&Itemid=1 (policy issues)
      • http://www.cra.org/ccc/home.article.bigdata.html
          – Hadoop (MapReduce) and “Data Intensive Computing”
          – See the data intensive computing minitrack at HICSS-42, January 2009
      • http://ianfoster.typepad.com/blog/2008/01/theres-grid-in.html
          – OGF Thought Leadership blog
      • OGF22 talks by Charlie Catlett and Irving Wladawsky-Berger
  • Thank you! jguerrag@unmsm.edu.pe jorgeguerra@uigv.edu.pe