Cloud computing shim


Cloud Computing

Published in: Business

  1. Introduction to Cloud Services. Simon Shim, 7/16/2014.
     Survey Results
     Fundamentals
     • Cluster Computing
     • Sharding
     • Horizontal Scaling
     Evolution of Cloud Computing
     • Grid Computing: solving large problems with parallel computing; made mainstream by the Globus Alliance
     • Utility Computing: offering computing resources as a metered service; introduced in the late 1990s
     • SaaS Computing: network-based subscriptions to applications; gained momentum in 2001
     • Cloud Computing: next-generation Internet computing; next-generation data centers
     Source: PWC: PETROFED, "Cloud Computing - The Coming Storm"
  2. Cloud Computing Framework
     Cloud Framework System
     • Business Process as a Service
     • Application/Software as a Service
     • Platform as a Service
     • Infrastructure as a Service
     What is the landscape of Cloud Computing?
     Three primary models for Cloud Computing have emerged. SaaS and IaaS are the key cloud capabilities for 80% of our customers.
     • SaaS (Software as a Service): applications, typically available via the browser
       – Google Apps
     • PaaS (Platform as a Service): hosted application environment for building and deploying cloud applications
       – Amazon EC2
       – Microsoft Azure
       – Google App Engine
     • IaaS (Infrastructure as a Service): utility computing data center providing on-demand server resources
       – HP Adaptive Infrastructure as a Service
       – Rackspace
       – Amazon EC2 & S3
     Source: PWC: PETROFED, "Cloud Computing - The Coming Storm"
     Cloud Computing
     • Rackspace: 75,000 systems, 150,000 CPUs, 400 Gbps bandwidth
     • Amazon: 160,000 systems, 320,000 CPUs, 500 Gbps bandwidth
     • Google (including Google, Apps, YouTube): 500,000 systems, 1,000,000 CPUs, 1,500 Gbps bandwidth
     • According to IDC, 1.6 million servers with virtualization in 2011
     Yahoo Brings its Computing Coop (Yahoo, Apple, Facebook)
  3. The Microsoft Cloud: Data Center Infrastructure
     • Purpose-built data center to host containers at large scale
       – Cost $500 million; 100,000 square foot facility (10 football fields)
     • 40-foot shipping containers can house as many as 2,500 servers
       – Density of 10 times the amount of compute in equivalent space in a traditional data center
     • Delivers an average PUE of 1.22
       – Power Usage Effectiveness (PUE) is an energy-efficiency benchmark from The Green Grid™ consortium
     Windows Azure Datacenters
     The Microsoft Cloud: ~100 globally distributed data centers
     • Quincy, WA
     • Chicago, IL
     • San Antonio, TX
     • Dublin, Ireland
     Generation 4 DCs
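The PUE figure quoted above is a ratio: total facility power divided by the power that reaches the IT equipment. The sketch below works through that arithmetic. The wattages are hypothetical numbers chosen to reproduce the 1.22 on the slide, not measurements from any Microsoft facility.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_kw(it_equipment_kw: float, pue_value: float) -> float:
    """Power spent on everything except IT gear (cooling, distribution, lighting)."""
    return it_equipment_kw * (pue_value - 1.0)


# Hypothetical load: 10,000 kW of servers in a facility drawing 12,200 kW in total.
example_pue = pue(12_200, 10_000)
example_overhead = overhead_kw(10_000, 1.22)
print(round(example_pue, 2), round(example_overhead))
```

A PUE of 1.0 would mean every watt goes to computing, so at 1.22 roughly 0.22 W of overhead accompanies each watt of IT load.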
  4. Amazon S3 Growth
     Google Capex
     • 1Q2010: $239 million
     • 2Q2010: $476 million
     • 3Q2010: $757 million
     • 4Q2010: $2.55 billion
     Finland Google data center. Source: Google
     Cloudy: Gathering Storm
     • HP Buys Stratavia for Application Automation (8/2010)
     • NASA to spend over $4 billion on IT services
     • PG&E add MAID data storage systems it will pay people to install
     • In 2011, Intel reports best quarter ever
     • Juniper powers 100G Demonstration (11/2010)
     • SGI introduces Jolt technology: scale from 2,048 to over 250,000 cores, from 16 TB to 8 PB of memory
     • Amazon Buying More Land in Oregon
     • Microsoft has bought land in Longmont, Colorado (2/2011)
     • Cisco to acquire LineSider (12/2010)
     • Dell, Sabey Outline Huge Data Center Projects (12/2010)
     • Cisco launched Data Center 2011 Texas
     • HP launches Sydney data center (1/2011)
     • IBM acquires OpenPages (2010)
     • Amazon Buys Dublin Site for Data Center (2/2011)
     • The Defense Information Systems Agency (DISA) operates 14 data centers around the world with approximately 3.7 petabytes
     • Savvis to Expand Facilities in Three Markets
     • GSMA Mobile World Congress: global mobile data traffic set to increase exponentially in the coming years (2/2011)
     • Twitter Adds Data Center in Sacramento (12/2010)
     • Microsoft Starts Building Iowa Data Center (1/2011)
     • Facebook Plans North Carolina Data Center (11/2010)
     What are Governments worried about?
     [Chart: Government/Regulatory Respondents Who are “Very Concerned”, 0% to 100%]
     • Information security issues
     • Ability to meet national security requirements
     • Governance issues
     • Compliance issues
     • Fear of vendor lock-in
     • Data Privacy/Confidentiality
     • Business Continuity Issues
     Source: World Economic Forum survey, Fall 2009
  5. What cloud computing is used for
     Source: Parallels: IT track
     Cloud Computing Success Stories
     • GE: global procurement hosting 500k suppliers and 100k users in six languages on a SaaS platform to manage $55B/yr in spend
     • Bechtel: reduced infrastructure cost by 30%, in part by achieving 70% server utilization
     • Washington DC: Google Apps used by 38k employees, reducing costs to $50/user per year for email, calendaring, documents, spreadsheets, wikis, and instant messaging
     • Eli Lilly: using Amazon Web Services, can deploy a new server in 3 min vs 50 days and a 64-node Linux cluster in 5 min vs 100 days
     • NASDAQ: using Amazon Storage to store 30-80 GB/day of trading activity
     Cloud cover: other leaders include Hasbro, ESPN, Major League Baseball, New York Times and British Telecom
     Underlying Technologies
     • Web 2.0/Internet
     • SaaS
     • PaaS
     • IaaS
     • Virtualization
       – OS
       – Storage
     Cloud Framework System
     • Application/Software as a Service
     • Platform as a Service
     • Infrastructure as a Service
     Solutions and vendors are emerging daily
     • Categories: External IaaS; Utility Systems Management Tools + Utility Application Development; Software as a Service (SaaS); Platform as a Service; Internal IaaS
     • Vendors: Data Synapse, Univa UD, Elastra Cloud Server, 3tera App Logic, VMWare, IBM Tivoli, Cassatt, Parallels, HP/EDS (TBD), IBM Blue Cloud, Sun Grid, Joyent, Google Apps, Zoho Office, Workday, Microsoft Office Live, Amazon EC2, Google App Engine, Coghead, HP Adaptive Infrastructure as a Service, Oracle On Demand Apps, NetSuite ERP, SFA, Etelos, LongJump, Boomi, Microsoft Azure*, Xen, Zuora, Aria Systems, eVapt, IBM WebSphere XD, BEA Weblogic Server VE, Mule, Rackspace, Jamcracker
     Source: PWC: PETROFED, "Cloud Computing - The Coming Storm"
  6. Cloud Services: Amazon and Salesforce. Simon Shim
     CMPE 282
     5 essentials for clouds
     • R pooling
     • O self-service
     • Rapid E
     • Broad N A
     • M service
     Cloud Services: Software Stack
     [Diagram: two stacks with the layers OS Services, Operating System, Virtualized Instance, Frameworks, Application, Hardware. Source: Microsoft]
     Cloud Services: Pros and Cons
     • Software as a service
     • Platform as a service
     • Infrastructure as a service
     Criteria: flexibility, capability, vendor dependency, licensing cost, administration overhead
  7. Traditional Infrastructure Model
     [Chart: Forecasted Infrastructure Demand; axes: Time, Capital]
     [Chart: Forecasted Infrastructure Demand with Surplus and Acceptable Surplus; axes: Time, Capital]
     Predicting Resource Demands
     [Chart: Predicted Demand, Traditional Hardware, Actual Demand and Automated Elasticity; annotations: Large Capital Expenditure, Opportunity Cost, "You just lost customers"; axes: Time, Cost. Source: Amazon]
     What is a “Cloud”?
     • Cloud: on-demand, scalable, multi-tenant, self-service compute and storage resources
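The provisioning charts above contrast a fixed hardware purchase with capacity that follows demand. A minimal sketch of that comparison follows. The demand series, the fixed capacity of 70 servers, and the 10% headroom are all hypothetical values picked for illustration.

```python
from typing import List, Tuple


def surplus_and_shortfall(demand: List[int], capacity: List[int]) -> Tuple[int, int]:
    """Return (idle capacity, unmet demand) summed over all time periods."""
    surplus = sum(max(c - d, 0) for d, c in zip(demand, capacity))
    shortfall = sum(max(d - c, 0) for d, c in zip(demand, capacity))
    return surplus, shortfall


# Hypothetical demand per period, measured in servers.
demand = [20, 35, 80, 50, 120, 60]

# Traditional model: one up-front purchase sized from a forecast.
fixed = [70] * len(demand)

# Elastic model: capacity tracks demand plus 10% headroom, rounded up.
elastic = [(d * 11 + 9) // 10 for d in demand]

fixed_result = surplus_and_shortfall(demand, fixed)
elastic_result = surplus_and_shortfall(demand, elastic)
print("fixed:", fixed_result, "elastic:", elastic_result)
```

In this toy series the fixed purchase is both wasteful (idle servers in quiet periods) and insufficient (unmet demand at the peaks, the "you just lost customers" case), while the elastic series carries only its headroom.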
  8. Enterprise Cloud Solutions
     1. Hybrid Cloud: scalability of the public cloud with the control and security of a private cloud
     2. Test / Development / QA Platform: use cloud infrastructure servers as your test and development platform
     3. Disaster Recovery: keep images of your servers on cloud infrastructure, ready to go in case of a disaster
     4. Cloud File Storage: back up or archive your company data to cloud file storage
     5. Load Balancing: use cloud infrastructure for overflow management during peak usage times
     Enterprise Cloud Solutions (cont)
     6. Overhead Control: lower overhead costs and make your bids more competitive
     7. Distributed Network Control and Cost Reporting: create individual private networks for each of your subsidiaries or contracts
     8. Messaging Alternatives: replace Microsoft Exchange and SharePoint with Google Apps
     9. Rapid Deployment: turn up servers immediately to fulfill project timelines
     10. Functional IT Labor Shift: refocus your IT labor expense on revenue-producing activities
     Deployment and Migration
     Assessment and design lead to a working solutions document (published best-practice solutions guides)
     • Solutions planning
     • Investment planning & acquisition
     • Integration & test
     • Deployment, documentation, operations & maintenance
  9. Cloud Territory
     • Service capability: SaaS, PaaS, IaaS
     • Deployment option: Hybrid Cloud, Private Cloud (software business), Public Cloud (data center business)
     • Amazon, Microsoft, Oracle
     Tradeoffs
     • Use: Software as a Service
     • Build: Platform as a Service
     • Host: Infrastructure as a Service
     • Capability, Productivity, Reachability, Vendor dependency, Controllability, Flexibility, Administration overhead, Licensing cost
     Amazon Web Services (new). Source: Amazon
     Amazon Application. Source: Amazon
  10. AWS Cost
      General Purpose - Current Generation. Source: Amazon
      Instance     vCPU  ECU   Memory (GiB)  Instance Storage (GB)  Linux/UNIX Usage
      m3.medium    1     3     3.75          1 x 4 SSD              $0.113 per Hour
      m3.large     2     6.5   7.5           1 x 32 SSD             $0.225 per Hour
      m3.xlarge    4     13    15            2 x 40 SSD             $0.450 per Hour
      m3.2xlarge   8     26    30            2 x 80 SSD             $0.900 per Hour
      Key Features of EC2
      • CloudWatch
        – Monitors Amazon EC2 instances, Amazon EBS volumes, Elastic Load Balancers, and RDS database instances in real time
      • Elastic Load Balancer
        – Distributes incoming application traffic across multiple Amazon EC2 instances
        – Achieves even better fault tolerance for applications
        – Automatically scaled up/down with Auto Scaling
      • Auto Scaling
        – Automatically scales Amazon EC2 capacity up or down according to conditions
      Amazon Database Service: SimpleDB
      • Dynamo/SimpleDB (Simple Database)
        – A highly available, scalable, and flexible non-relational data store
        – Querying light-weight attribute data
        – Querying, mapping, tagging, metadata state management
        – Not for transactional systems: OLTP, data warehouse
      • Consistency options
        – Eventual consistency
        – Consistency across all copies of data is usually reached within a second
      Amazon Parallel Processing Service: Hadoop
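The hourly prices listed above translate into a monthly bill by simple multiplication. The sketch below does that arithmetic with the four m3 prices from the slide. The 730-hour month and the function name `monthly_cost` are assumptions for illustration, and the prices are the 2014 figures shown here, not current AWS pricing.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8760 hours / 12

# Linux/UNIX on-demand prices in USD per hour, as listed on the slide.
M3_HOURLY_USD = {
    "m3.medium": 0.113,
    "m3.large": 0.225,
    "m3.xlarge": 0.450,
    "m3.2xlarge": 0.900,
}


def monthly_cost(instance_type: str, count: int = 1,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of running `count` instances for `hours` hours each."""
    return round(M3_HOURLY_USD[instance_type] * count * hours, 2)


for name in M3_HOURLY_USD:
    print(f"{name:11s} ${monthly_cost(name):8.2f} per month")
```

Because billing is per instance-hour, four m3.large instances running only eight hours a day would cost a third of the always-on figure, which is the economic argument behind Auto Scaling.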
  11. Amazon SQS Use Cases
      • Fault isolation
      • HA
      • Pipelining
      • LB
      Source: Amazon
      Salesforce
      [Diagram: App/Db stack, repeated] “Can’t we create a separate stack for just this one customer? I promise it’s just this one…”
      Source: Salesforce
      Architecture
      • Polymorphic applications
      • Runtime engine
      • PaaS: single shared stack of software and hardware
      • Tenant-specific metadata
      • Common metadata
      • PaaS, SaaS, Cloud Infrastructure
      • Multiple organizations (tenants)
      Source: Salesforce
      Metadata-driven, multi-tenant, Internet application platform
      Polymorphic application. Source: Salesforce
  12. Key Architectural Principles
      • Stateless AppServers
      • Database system of record
      • No DDL
      • All tables partitioned by OrgId
      • Smart PKs, polymorphic FKs
      • Creative de-normalization and pivoting
      • Use every RDBMS feature/trick
      Metadata, data, and pivot table structures store data corresponding to virtual data structures. Source: Salesforce
      Data Table
      Everyone's data is in one table, but each customer's data can be extracted by referencing tables of metadata containing the customer’s custom fields and objects. Source: Salesforce
      Application Framework
      • Bulk data processing engine
        – Combine and execute repetitive operations in bulk to lessen overhead for transaction-sensitive applications
        – Convert a loop to array operations: create, update, delete
      • Multi-tenant-aware query optimizer
        – Estimate the number of rows that a particular query can potentially access, based on optimizer statistics (per-tenant, per-group, per-user) for each virtual object
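The "convert a loop to array operations" point can be illustrated with any SQL driver that accepts a batch. The sketch below uses Python's built-in `sqlite3` as a stand-in. The table layout, the `org_id` column and the function names are hypothetical, and nothing here reflects Salesforce's actual engine.

```python
import sqlite3
from typing import Iterable, Tuple

Row = Tuple[str, str]  # (org_id, name): a hypothetical two-column record


def make_db() -> sqlite3.Connection:
    """In-memory database with one shared table for all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (org_id TEXT NOT NULL, name TEXT)")
    return conn


def insert_in_loop(conn: sqlite3.Connection, rows: Iterable[Row]) -> None:
    """Row-at-a-time: one execute call per record."""
    for row in rows:
        conn.execute("INSERT INTO data (org_id, name) VALUES (?, ?)", row)
    conn.commit()


def insert_in_bulk(conn: sqlite3.Connection, rows: Iterable[Row]) -> None:
    """Array-style: hand the whole batch to the driver in a single call."""
    conn.executemany("INSERT INTO data (org_id, name) VALUES (?, ?)", rows)
    conn.commit()


def count_for_org(conn: sqlite3.Connection, org_id: str) -> int:
    """Every query is scoped by org_id, echoing 'partitioned by OrgId'."""
    cursor = conn.execute("SELECT COUNT(*) FROM data WHERE org_id = ?", (org_id,))
    return cursor.fetchone()[0]


rows = [("org-a", f"rec-{i}") for i in range(500)]
rows += [("org-b", f"rec-{i}") for i in range(200)]
loop_db, bulk_db = make_db(), make_db()
insert_in_loop(loop_db, rows)
insert_in_bulk(bulk_db, rows)
print(count_for_org(bulk_db, "org-a"), count_for_org(bulk_db, "org-b"))
```

Both functions leave the same data behind. The bulk form simply pays the per-call overhead once per batch instead of once per row, which is the saving the slide describes.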
  13. Metadata and data tables. Source: Salesforce
      • The Objects table stores metadata about custom objects (tables)
      • The Fields table stores metadata about custom fields (columns)
      • The Data heap table stores all structured data corresponding to custom objects
      A single slot can store various types of data that originate from different objects. Source: Salesforce
      The Web Services API
      • Provides programmatic access to your organization’s information using a simple, powerful, and secure Web Service SOAP API
      Source: Salesforce
      Apex Web Services
      • Apex Code is our on-demand, multi-tenant programming language that extends the capabilities of the platform by introducing the ability to write custom business logic that runs on our servers.
      • Apex Code is organized in Classes
      [Diagram: Apex Code, Class, WebService. Source: Salesforce]
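The Objects / Fields / Data arrangement described above can be sketched in a few lines: metadata says which generic slot holds which field, and the shared heap stores only untyped slot values. This is a toy model under stated assumptions. The class names, the string-only slots and the two supported types are illustrative choices, not the platform's real schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FieldDef:
    """One row of the hypothetical Fields table."""
    name: str
    data_type: str  # "text" or "number" in this sketch
    slot: int       # index of the generic slot that stores the value


@dataclass
class ObjectDef:
    """One row of the hypothetical Objects table, owned by one tenant."""
    org_id: str
    name: str
    fields: Dict[str, FieldDef] = field(default_factory=dict)

    def add_field(self, name: str, data_type: str) -> None:
        self.fields[name] = FieldDef(name, data_type, slot=len(self.fields))


@dataclass
class DataRow:
    """One row of the shared Data heap: tenant, object, untyped slots."""
    org_id: str
    object_name: str
    slots: List[Optional[str]]


CASTS = {"text": str, "number": float}


def store(heap: List[DataRow], obj: ObjectDef, values: Dict[str, Any]) -> None:
    slots: List[Optional[str]] = [None] * len(obj.fields)
    for name, value in values.items():
        slots[obj.fields[name].slot] = str(value)
    heap.append(DataRow(obj.org_id, obj.name, slots))


def load(heap: List[DataRow], obj: ObjectDef) -> List[Dict[str, Any]]:
    """Rebuild typed records for one tenant's object using its metadata."""
    records = []
    for row in heap:
        if row.org_id == obj.org_id and row.object_name == obj.name:
            records.append({
                f.name: None if row.slots[f.slot] is None
                else CASTS[f.data_type](row.slots[f.slot])
                for f in obj.fields.values()
            })
    return records


heap: List[DataRow] = []
invoice = ObjectDef("org-a", "Invoice")
invoice.add_field("amount", "number")
invoice.add_field("customer", "text")
pet = ObjectDef("org-b", "Pet")
pet.add_field("species", "text")
store(heap, invoice, {"amount": 99.5, "customer": "Acme"})
store(heap, pet, {"species": "cat"})
print(load(heap, invoice), load(heap, pet))
```

Slot 0 holds a number for one tenant and text for the other, which is the "single slot can store various types of data" idea. Adding a custom field only adds a metadata row, so no DDL is needed.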
  14. is a proven multi-tenant application platform that performs and scales
      [Chart: Page Response Time (ms) and Quarterly Transactions (billions) by fiscal quarter, 2005 to 2007. Source: Salesforce]
      Concluding Remarks
      • PaaS is a major architectural shift
      • PaaS is application focused, with a high level of abstraction
      • is the most mature, proven PaaS offering available today
      • Optimized for fast, secure, and reliable multi-tenant application development and deployment
      Source: Salesforce
      Evolving for the Cloud
      “It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change.” – Charles Darwin
      NETFLIX MOVED TO AMAZON CLOUD
      Source: Hien Luu, software architect, Netflix
  15. Netflix
      Make Existing Apps Better: apps powered by both cloud and on-premises resources
  16. Why Cloud?
      • 23M members
      • Focus on core competence
      Why Cloud?
      • 2008: Netflix had a single data center
      • SPOF
      • Approaching capacity limits
      • Accelerating product launches: iPhone, Wii, PS3, Xbox
      • Alternatives
      • Build more data centers
      • Outsource it to cloud providers: Amazon
      • Capacity planning
      • Maintenance
      Why Cloud?
      • High growth
      • High scalability
      • High availability
      • Elasticity
      Why Amazon Cloud?
      AWS: EC2, SDB, SQS, S3, EBS, EMR, ELB, ASG, RDS, (IAM)
  17. Why Amazon Cloud?
      "The cloud lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure" - Werner Vogels
      Tour of AWS
      Service      Service Name
      Compute      Elastic Compute Cloud (EC2), Elastic Map/Reduce (EMR), Auto Scaling (ASG)
      Database     Relational Database Service (RDS)
      Messaging    Simple Queue Service (SQS), Simple Notification Service (SNS)
      Monitoring   CloudWatch
      Networking   Elastic Load Balancing (ELB)
      Storage      SimpleDB (SDB), Simple Storage Service (S3), Elastic Block Store (EBS)
      Tour of AWS (EC2): $0.085/hr - $2.40/hr
      Type                   Computing Unit  Memory (GB)  Storage (GB)  Platform  I/O                  Name
      Small                  1               1.7          160           32        Moderate             m1.small
      Large                  4               7.5          850           64        High                 m1.large
      X-Large                8               15           1690          64        High                 m1.xlarge
      High-CPU Medium        5               1.7          350           32        Moderate             c1.medium
      High-CPU X-Large       20              7            1690          64        High                 c1.xlarge
      High-Memory X-Large    6.5             17.1         420           64        Moderate             m2.xlarge
      High-Memory 2X-Large   13              34.2         850           64        High                 m2.2xlarge
      High-Memory 4X-Large   26              68.4         1690          64        High                 m2.4xlarge
      Cluster Compute        33.5            23           1690          64        Very High (10 Gbps)  cc1.4xlarge
      Tour of AWS (EC2): High Availability
      • Netflix: us-east-1c & us-east-1d
      • Regions: US-East (Northern Virginia), US-West (Northern California), EU (Ireland), Asia Pacific (Singapore)
  18. Tour of AWS - ELB
      [Diagram: Client1, Client2 and Client3 connect over HTTP/HTTPS to an ELB (DNS name, port) in us-east; the ELB forwards to EC2 instances in two Availability Zones and probes each with a health check (URL, interval)]
      Tour of AWS – Cloud Watch
      • Visibility into resource utilization and operational performance of EC2 instances
      • Metrics: CPU, network, disk I/O
      • Covers EBS, Load Balancer, RDS
      • AWS Management Console
      Tour of AWS – S3
      99.999999999% durability and 99.99% availability; $0.055/GB -> $0.15/GB
      • Data storage infrastructure for the Internet
      • Write, read, delete objects up to 5 GB
      • Scalable, reliable, unlimited storage
      • Objects can be made publicly accessible
      • Per Account
      • ..100
      Tour of AWS - SimpleDB
      • For structured, non-relational text data
      • Highly available
      • Zero administrative overhead
      • Auto indexing
      Domains are collections of items that are described by attribute-value pairs. Each item has a primary key (itemId).
      itemId  Email  Pets
      jdoe           dog
      mjane          cat, bird
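The ELB diagram above combines two ideas: spread requests across instances, and stop sending traffic to an instance whose health check fails. A minimal in-memory sketch of that behaviour follows. The class, its round-robin policy and the instance names are assumptions for illustration and do not describe how ELB itself is implemented.

```python
from typing import Dict, List


class LoadBalancer:
    """Round-robin over instances whose most recent health check passed."""

    def __init__(self, instances: List[str]) -> None:
        self._healthy: Dict[str, bool] = {name: True for name in instances}
        self._counter = 0

    def record_health_check(self, instance: str, ok: bool) -> None:
        """Store the result of probing the instance's health check URL."""
        self._healthy[instance] = ok

    def route(self) -> str:
        """Pick the next healthy instance, or fail if none is available."""
        candidates = [name for name, ok in self._healthy.items() if ok]
        if not candidates:
            raise RuntimeError("no healthy instances")
        choice = candidates[self._counter % len(candidates)]
        self._counter += 1
        return choice


# Hypothetical instance names, two per availability zone.
lb = LoadBalancer(["1c-a", "1c-b", "1d-a", "1d-b"])
first_round = [lb.route() for _ in range(4)]
lb.record_health_check("1c-b", False)
after_failure = [lb.route() for _ in range(6)]
print(first_round, after_failure)
```

Placing instances in two zones, as Netflix does with us-east-1c and us-east-1d, means a whole zone can fail its health checks and the remaining candidates still serve traffic.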
  19. Tour of AWS - SimpleDB
      itemId  Email  Pet
      jdoe           dog
      mjane          cat, bird
      • Limits shown: 10 GB, 256 attributes, 1024 bytes
      • Query: select <attributes> from <domain> where <query expression>
      • Defaults to 100 items per select, up to a maximum of 2500 items
      Tour of AWS - SNS
      [Diagram: messages m1 to m6 are published to a topic in the notification infrastructure and delivered to Email, HTTP/HTTPS and SQS subscribers]
      • 100 topics per account
      • Message max size: 8K of text data
      Netflix Data
      • Video-centric data
        – Video encodings
        – Video metadata: actors, director, description
        – Critics’ reviews
      • User-centric data
        – Video queue
        – Video watch history
        – Video ratings
        – Video playback metadata: streaming bookmarks, activity logs
      Netflix In AWS Cloud
      • Encoding: uses ~4K EC2 instances
      • CDN: petabytes on S3
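The SimpleDB data model and limits above can be mimicked with a small in-memory class: items hold multi-valued attributes, and a select returns at most 100 items unless a higher limit (capped at 2500) is requested. This is a sketch, not the AWS API. Reading "256 attributes" as pairs per item and "1024 bytes" as the maximum value size is an assumption, and the real service pages through results rather than truncating them.

```python
from typing import Callable, Dict, List, Set

MAX_PAIRS_PER_ITEM = 256     # assumed meaning of "256 Attributes"
MAX_VALUE_BYTES = 1024       # assumed meaning of "1024 Bytes"
DEFAULT_SELECT_LIMIT = 100
MAX_SELECT_LIMIT = 2500

Attributes = Dict[str, Set[str]]


class Domain:
    """A collection of items, each described by attribute-value pairs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: Dict[str, Attributes] = {}

    def put(self, item_id: str, attribute: str, value: str) -> None:
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            raise ValueError("attribute value exceeds 1024 bytes")
        attrs = self.items.get(item_id, {})
        pairs = sum(len(values) for values in attrs.values())
        is_new = value not in attrs.get(attribute, set())
        if is_new and pairs >= MAX_PAIRS_PER_ITEM:
            raise ValueError("item already holds 256 attribute-value pairs")
        self.items.setdefault(item_id, {}).setdefault(attribute, set()).add(value)

    def select(self, where: Callable[[Attributes], bool],
               limit: int = DEFAULT_SELECT_LIMIT) -> List[str]:
        """Return matching item ids, honouring the default and maximum limits."""
        limit = min(limit, MAX_SELECT_LIMIT)
        matches = [item_id for item_id, attrs in self.items.items() if where(attrs)]
        return matches[:limit]


users = Domain("users")
users.put("jdoe", "Pets", "dog")
users.put("mjane", "Pets", "cat")
users.put("mjane", "Pets", "bird")  # a second value for the same attribute
cat_owners = users.select(lambda attrs: "cat" in attrs.get("Pets", set()))
print(cat_owners)
```

The multi-valued `Pets` attribute for `mjane` is what distinguishes this model from a relational row, where a second pet would need a second table.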
  20. Netflix In AWS Cloud
      [Diagram: an API behind an ELB calls Internal Services and the Discovery Service, backed by memcached, S3, SimpleDB and SQS; an SQS Consumer connects to an Oracle-backed service in the Netflix data center]
      Netflix In AWS Cloud
      • SimpleDB
        – Rental history: ~800M items
        – Queue: ~1B items
      • S3
        – Compressed rental history: ~17M objects
        – Streaming activity logs
        – Video encodes
      • Access through customer id or movie id or both
      Netflix In AWS Cloud
      • Missing infrastructure services
        – Discovery service
        – Middle tier load balancer
        – Encryption service
        – Key management
      • Caching
        – Wrap memcached server
        – Discoverable
        – Instrumented
      Netflix In AWS Cloud
      [Diagram: Internal Services send heartbeats to Service Discovery; the Web Application reaches them through the Middle Tier Load Balancer, which consults Service Discovery]
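The discovery service with heartbeats described above can be sketched as a registry that forgets instances which stop reporting. This is a minimal illustration under assumptions. The class name, the 30-second time-to-live and the addresses are hypothetical and are not taken from Netflix's implementation.

```python
from typing import Dict, List, Tuple


class ServiceRegistry:
    """Tracks service instances by their most recent heartbeat time."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self._ttl = ttl_seconds
        self._last_seen: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, service: str, address: str, now: float) -> None:
        """Called periodically by each instance to stay registered."""
        self._last_seen[(service, address)] = now

    def lookup(self, service: str, now: float) -> List[str]:
        """Addresses whose last heartbeat is no older than the TTL."""
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and now - seen <= self._ttl
        )


# Timestamps are passed in explicitly so the example is deterministic.
registry = ServiceRegistry(ttl_seconds=30.0)
registry.heartbeat("queue-service", "10.0.0.1:7001", now=0.0)
registry.heartbeat("queue-service", "10.0.0.2:7001", now=20.0)
alive_at_25 = registry.lookup("queue-service", now=25.0)
alive_at_40 = registry.lookup("queue-service", now=40.0)
print(alive_at_25, alive_at_40)
```

A middle-tier load balancer would call `lookup` before each request (or cache the result briefly), so an instance that dies simply ages out of rotation without any manual configuration change.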
  21. Netflix In AWS Cloud
      Big Bang Transition: iPhone Launch
      • Runs entirely in the cloud, with no fallback option
      • No control once the App Store gate is open
      • Have to scale on day one
      • EC2 elasticity
      Netflix In AWS Cloud
      Datacenter vs Cloud (copy from Adrian’s slide)
      Best Practices
      • Automate deployment process
      • Dealing with failure
        – Network latency
        – 503s, 408s, exponential backoff
        – Read/connect timeout
      • Persistence strategy
        – Rethink storage
        – SimpleDB, S3, RDS
        – Sharding
        – Eventual consistency
      Best Practices
      • Monitoring
        – Keynote: external URL monitoring
        – Amazon CloudWatch
        – AppDynamics: end-to-end transaction view, good for debugging
        – Nimbus
      • Alerting
        – Epic
      • Graphs
      • Logs
        – Log analysis: Hadoop
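The "503s, 408s, exponential backoff" practice above can be sketched as a retry wrapper that doubles its wait after each retryable response. The function name, the default delays and the jitter formula are assumptions for illustration. The `sleep` and `rng` parameters exist only so the behaviour can be exercised without real waiting.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {408, 503}  # request timeout, service unavailable

Response = Tuple[int, str]  # (HTTP status code, body)


def call_with_backoff(request: Callable[[], Response],
                      max_attempts: int = 5,
                      base_delay: float = 0.1,
                      max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> Response:
    """Retry `request` on 408/503, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = request()
    for attempt in range(max_attempts - 1):
        if response[0] not in RETRYABLE_STATUS:
            break
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Jitter: wait between 50% and 100% of the computed delay so that
        # many clients do not retry in lockstep.
        sleep(delay * (0.5 + rng() / 2))
        response = request()
    return response


# Hypothetical flaky endpoint: fails twice, then succeeds.
canned = iter([(503, ""), (408, ""), (200, "ok")])
waits = []
result = call_with_backoff(lambda: next(canned), sleep=waits.append, rng=lambda: 1.0)
print(result, waits)
```

Capping the delay with `max_delay` and the attempts with `max_attempts` keeps a persistent outage from turning into an unbounded wait, so the caller can fall back or surface the error.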
  22. Q&A