Cloud Computing
CS4262 Distributed Systems
Dilum Bandara
Dilum.Bandara@uom.lk
Cloud Computing
2
[Figure: cloud computing overview showing Clients, a Cloud Manager, a Private Cloud, a Public Cloud, Govt. Cloud Services, and Other Cloud Services]
Source: Green Cloud Computing by Dr. Rajkumar Buyya
Cloud Computing (Cont.)
 A variety of services available over the Internet that deliver compute functionality on the service provider’s infrastructure
 Umbrella term
 Computing as a utility
 Pay-as-you-go model
3
Source: www.free-power-point-templates.com/articles/best-cloud-computing-powerpoint-templates/
Cloud Computing Characteristics
 Massive scale
 Rapid elasticity
 Resource pool
 Virtualization
 On demand
 Resilient computing
 Broad network access
 Service orientation
 Geographic distribution
 Homogeneity
4
[Figure: virtualized stack, with labels Hardware, Hypervisor, OS, and App]
Cloud Computing – Pros & Cons
5
Cloud Computing – Pros & Cons (Cont.)
 Reduced cost
 5.7 times reduction in storage costs
 7.1 times reduction in administrative costs
 7.3 times reduction in networking costs
 No upfront investment
 Better performance
 Rapid scalability
 Access to latest version
 Global distribution
 Device independent
 Can be more secure than running your own server rack
6
Source: Green Cloud Computing by Dr. Rajkumar Buyya
Cloud Computing – Pros & Cons (Cont.)
 Cons
 Need high-bandwidth links
 Less control, plus security concerns
 Low performance
 Web-based applications aren’t the fastest
 Interoperability
 Deployment-specific software
7
Cloud Computing – Levels
Cloud Computing =
Software as a Service
+ Platform as a Service
+ Infrastructure as a Service
+ Data as a Service
8
Software as a Service (SaaS)
 Examples
 Google Apps, Office 365 (O365), Salesforce.com (CRM)
 Pros
 Availability
 When & where you need them
 Cost reduction
 No upfront costs
 Access to the latest version
 Cons
 Lack of control
 Lower customizability
9
Platform as a Service (PaaS)
 Examples
 Google App Engine, Windows Azure, Heroku
 Pros
 Rapid development
 Better control
 Cost reduction
 Access to latest version
 Cons
 Relatively lower customizability
10
Infrastructure as a Service (IaaS)
 Examples
 Amazon, Rackspace, Akamai, SLT
 Pros
 Better control
 High customizability
 Cons
 Administration overhead
 High upfront cost if the application is built using commercial software/OS
11
Delivered Using Warehouse-Scale
Computers (WSC)
12
www.laserfocusworld.com/articles/print/volume-48/issue-12/features/optical-technologies-scale-the-datacenter.html
http://www.slashgear.com/google-data-center-hd-photos-hit-where-the-internet-lives-gallery-17252451/
WSC (Cont.)
13
Design Factors for WSC
 Cost-performance
 Small savings add up
 Energy efficiency
 Affects power distribution & cooling
 Work per joule
 Operational costs count
 Power consumption is a primary constraint when designing a system
 Dependability via redundancy
 Many low-cost components
14
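The "dependability via redundancy" point can be made concrete with a little arithmetic. The sketch below assumes replicas fail independently and that the service stays up as long as at least one replica is up. The function name and the 0.99 per-server availability are illustrative assumptions, not figures from the slides.

```python
def availability_with_redundancy(per_replica_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent replicas is up."""
    if not 0.0 <= per_replica_availability <= 1.0:
        raise ValueError("availability must be in [0, 1]")
    if replicas < 1:
        raise ValueError("need at least one replica")
    # The service is down only when every replica is down at the same time.
    all_down = (1.0 - per_replica_availability) ** replicas
    return 1.0 - all_down


if __name__ == "__main__":
    # Illustrative only: each low-cost server is assumed to be up 99% of the time.
    for n in (1, 2, 3):
        print(n, availability_with_redundancy(0.99, n))
```

Real failures are often correlated (shared power, network, or software bugs), so this is an optimistic upper bound rather than a prediction.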
Design Factors (Cont.)
 Network I/O
 Interactive & batch processing workloads
 Web search – interactive
 Web indexing – batch
 Ample parallelism is available, so finding it isn’t a concern
 Most jobs are totally independent (“request-level parallelism”)
 Scale – its opportunities & problems
 Can afford to build customized systems, as WSCs require volume purchases
 Frequent failures
15
Programming Models & Workloads
 Batch processing framework – MapReduce
 Map
 Applies a programmer-supplied function to each logical input record
 Runs on thousands of computers
 Provides new set of (key, value) pairs as intermediate values
 Reduce
 Collapses values using another function
16
Source: www.cbsolution.net/techniques/ontarget/mapreduce_vs_data_warehouse
Divide & Conquer
17
[Figure: “Work” is partitioned into w1, w2, w3; each goes to a “worker”, producing r1, r2, r3; these are combined into the “Result”]
Source – “What is Cloud Computing? (and an intro to parallel/distributed processing)” by Jimmy Lin, The iSchool, University of Maryland
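The partition / worker / combine pattern can be sketched in a few lines. This is a minimal single-machine illustration: threads stand in for the separate machines a real system would use, and the function names and the choice of summation as the "work" are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(work, parts):
    """Split `work` into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(work), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(work[start:end])
        start = end
    return chunks


def worker(chunk):
    """Each worker computes a partial result; here, the sum of its chunk."""
    return sum(chunk)


def divide_and_conquer(work, parts=3):
    chunks = partition(work, parts)                 # "Work" -> w1, w2, w3
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(worker, chunks))   # workers produce r1, r2, r3
    return sum(partials)                            # combine -> "Result"
```

The combine step works here because addition is associative; a task whose partial results cannot be merged this way needs a different decomposition.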
Map-Reduce (Contd.)
 Map-reduce support is provided by a function like the following
 Y map-reduce(mapfn, reducefn, List<X>)
 A map-reduce implementation takes a list of inputs & does the following:
 Applies the map function to each entry in the list, which emits (key, value) pairs
 Collects the results, groups them by key, & then passes each group to the reduce function as an array
18
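A minimal in-memory sketch of such a function is shown below. It is not how Hadoop or Google's MapReduce is implemented; it only mirrors the two steps listed above on a single machine. Returning a dictionary of per-key results, and the sales data used in the example, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple, TypeVar

X = TypeVar("X")
K = TypeVar("K", bound=Hashable)
V = TypeVar("V")
Y = TypeVar("Y")


def map_reduce(
    mapfn: Callable[[X], Iterable[Tuple[K, V]]],
    reducefn: Callable[[K, List[V]], Y],
    inputs: Iterable[X],
) -> Dict[K, Y]:
    groups: Dict[K, List[V]] = defaultdict(list)
    for entry in inputs:
        # Map: apply mapfn to every input entry; each call emits (key, value) pairs.
        for key, value in mapfn(entry):
            # Group: collect the values that share a key.
            groups[key].append(value)
    # Reduce: pass each key and its list of values to reducefn.
    return {key: reducefn(key, values) for key, values in groups.items()}


# Hypothetical example: total sales per region.
sales = [("east", 10), ("west", 5), ("east", 7)]
totals = map_reduce(lambda rec: [rec], lambda region, amounts: sum(amounts), sales)
```

A distributed framework runs the map calls and the reduce calls on many machines and moves the grouped data between them; the programmer-visible contract stays the same.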
Map-Reduce (Contd.)
19
Source: www.datasciencecentral.com/profiles/blogs/practical-illustration-of-map-reduce-hadoop-style-on-real-data
Applications of Map-Reduce
 Frequency distribution of word occurrences
 Building the inverted index of a search engine
 Sorting
 Stitching imagery
 Google Maps
 Data clustering
 Data analytics & business intelligence
20
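As one illustration of the applications above, an inverted index fits the map/reduce shape naturally: map emits (term, document id) pairs and reduce collapses them into a posting list per term. The sketch below runs on one machine; the function names, the whitespace tokenizer, and the two sample documents are assumptions made for the example.

```python
from collections import defaultdict


def map_doc(doc_id, text):
    """Map: emit one (term, doc_id) pair per distinct term in the document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def reduce_postings(term, doc_ids):
    """Reduce: collapse the document ids for a term into a sorted posting list."""
    return sorted(set(doc_ids))


def build_inverted_index(docs):
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for term, emitted_id in map_doc(doc_id, text):
            grouped[term].append(emitted_id)      # group intermediate pairs by term
    return {term: reduce_postings(term, ids) for term, ids in grouped.items()}


docs = {1: "cloud computing", 2: "distributed computing"}
index = build_inverted_index(docs)
```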
Map-Reduce for Word Counting
21
Source: http://xiaochongzhang.me/blog/?p=338
How to do this for a large dataset using a distributed system?
Example – Word Count
Map(docId, text):
    for all terms t in text
        emit(t, 1);

Reduce(t, values[]):
    int sum = 0;
    for all values v
        sum += v;
    emit(t, sum);
22
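The pseudocode above can be turned into a small runnable sketch. This version runs on one machine and makes the map, shuffle/sort, and reduce phases explicit; a real framework would spread each phase across many nodes. The driver function `word_count` and the sample documents are assumptions added for illustration.

```python
from itertools import groupby
from operator import itemgetter


def map_fn(doc_id, text):
    """Map(docId, text): emit (t, 1) for every term t in the text."""
    for term in text.split():
        yield term, 1


def reduce_fn(term, values):
    """Reduce(t, values[]): sum the counts emitted for term t."""
    return term, sum(values)


def word_count(docs):
    # Map phase: run map_fn over every (doc_id, text) record.
    pairs = [kv for doc_id, text in docs.items() for kv in map_fn(doc_id, text)]
    # Shuffle/sort phase: bring pairs with the same key next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one reduce_fn call per distinct term.
    return dict(
        reduce_fn(term, [count for _, count in group])
        for term, group in groupby(pairs, key=itemgetter(0))
    )


counts = word_count({"d1": "the cat sat", "d2": "the cat ran"})
```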
In Class Activity
1. Identify missing card(s)
2. Card sorting
3. Card sorting with 2 rounds
23
Inspired by Marcio Silva's “The MapReduce Card Game” at
http://blog.marciosilva.com/2012/10/the-mapreduce-card-game.html
Why Map-Reduce?
 Implementing the same pattern in a distributed system isn’t easy
 Need to worry about communication, failures, initialization, etc.
 MapReduce frameworks take care of all of these
 You write the map & reduce functions & call the framework
 It forces you to think in parallel at design time
 It gives you a higher level of abstraction to think in
 It’s very generic, & covers a lot of use cases
 See http://wiki.apache.org/hadoop/PoweredBy
24
MapReduce Execution
25
Source: Dean & Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”, OSDI, 2004
Amazon Web Services
 Virtual machines – Xen
 Very low cost
 $0.10 per hour per instance
 Primarily relies on open-source software
 No (initial) service guarantees
 No contract required
 Amazon EC2
 Elastic Compute Cloud
 Amazon S3
 Simple Storage Service
26
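The pay-as-you-go pricing above lends itself to a quick worked example. The sketch below uses the slide's $0.10 per instance-hour as a default rate; the function name and the 20-instance, 8-hour job are hypothetical, and real bills also include storage, data transfer, and billing-granularity effects that are not modeled here.

```python
def on_demand_cost(instances: int, hours: float, rate_per_hour: float = 0.10) -> float:
    """Cost in dollars of running `instances` VMs for `hours` hours each."""
    if instances < 0 or hours < 0 or rate_per_hour < 0:
        raise ValueError("inputs must be non-negative")
    return instances * hours * rate_per_hour


# Hypothetical batch job: 20 instances for 8 hours at the slide's rate,
# i.e. 20 x 8 x $0.10 = $16 in total.
batch_cost = on_demand_cost(instances=20, hours=8)
```

Because there is no contract, the same job costs nothing once the instances are shut down, which is the point of the "no upfront investment" argument.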
Amazon Web Services – Example
27
Source: www.ryhug.com/free-art-available-on-amazon-amazon-web-services-that-is/
Cloud Computing Middleware
28
[Figure: OpenStack basic architecture]
OpenStack Architecture
29
11 components
1229 parameters
CloudStack Architecture
30
CloudStack Architecture (Cont.)
31
Source: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Original+Feature+Spec
Source: www.suse.com/documentation/sles-12/book_virt/data/sec_kvm_intro_arch.html
Hypervisors
32
Source: http://wiki.xen.org/wiki/Xen_Project_Software_Overview
Xen vs. KVM
33
Source: http://dtrace.org/blogs/brendan/2013/01/11/virtualization-performance-zones-kvm-xen/
Challenges
 Getting large volumes of data in/out
 Bandwidth aggregation
 Lack of, or lower, QoS
 SLAs are too simplistic
 Deployment times range from tens of seconds to minutes
 Distributed cloud
 Lack of control
 Security, privacy, & ownership concerns
 Policy issues
34


Editor's Notes

  • #9 DaaS examples: Urban Mapping (a geography data service); AWS data (genome data, US Census, corpus of web crawl data)
  • #27 S3: Simple Storage Service; EC2: Elastic Compute Cloud
  • #33 KVM: Kernel-based Virtual Machine; QEMU: Quick Emulator. KVM requires a processor with hardware virtualization extensions