SlideShare a Scribd company logo
1 of 52
Download to read offline
UC Berkeley
1
Above the Clouds:
A Berkeley View of Cloud Computing
Armando Fox and a cast of tens
, UC Berkeley Reliable Adaptive Distributed Systems Lab
USENIX LISA 2009
© 2009
Image: John Curley http://www.flickr.com/photos/jay_que/1834540/
2
Datacenter is new “server”
• “Program” == Web search, email, map/GIS, …
• “Computer” == 1000ʼs computers, storage, network
• Warehouse-sized facilities and workloads
• New datacenter ideas (2007-2008): truck container (Sun),
floating (Google), In Tents Computing (Microsoft)
• How to enable innovation in new services without first
building & capitalizing a large company?
photos: Sun Microsystems & datacenterknowledge.com
RAD Lab 5-year Mission
Goal: Enable 1 person to develop, deploy, operate
next -generation Internet application
• Key enabling technology: Statistical machine learning
– management, scaling, anomaly detection, performance prediction...
• interdisciplinary: 7 faculty, ~30 PhDʼs, ~6 ugrads, ~1
sysadm
• Regular engagement with industrial affiliates keeps us from
smoking our own dope too often
3
How we got into the clouds
• Theme: cutting-edge statistical machine
learning works where simple methods fail
– Resource utilization prediction
– Adding/removing storage bricks to meet SLA
– Console log analysis for problem finding
• Sponsor feedback: Great, now show that
it works on at least 1000ʼs of machines
4
Utility Computing to the
Rescue: Pay as you Go
• Amazon Elastic Compute Cloud (EC2)
• “Compute units” $0.10-0.80/hr. $0.085/hr & up
– 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Xeon core
• N
• No up-front cost, no contract, no minimum
• storage (~0.15/GB/month)
• network (~0.10-0.15/GB external; 0.00 internal)
• Everything virtualized, even concept of
independent failure
5
“Instances” Platform Cores Memory Disk
Small - $0.085 / hr 32-bit 1 1.7 GB 160 GB
Large - $0.34/ hr 64-bit 4 7.5 GB 850 GB – 2 spindles
XLarge - $0.68/ hr 64-bit 8 15.0 GB 1690 GB – 3 spindles
Options....extra memory, extra CPU, extra disk, ...
5
Cloud Computing is Hot *sigh*
“...weʼve redefined Cloud Computing to
include everything that we already
do... I donʼt understand what we
would do differently ... other than
change the wording of some of our
ads.” Sept. 2008
“Weʼve been building data center after
data center, acquiring application after
application, ...driving up the cost of
technology immensely across the
board. We need to find a more
innovative path.” Sept. 2009 6
A Berkeley View of
Cloud Computing
abovetheclouds.cs.berkeley.edu
• 2/09 White paper by RAD Lab PIʼs/students
• Goal: stimulate discussion on whatʼs new
– Clarify terminology
– Quantify comparisons
– Identify challenges & opportunities
• UC Berkeley perspective
– industry engagement but no axe to grind
– users of CC since late 2007
7
Rest of talk
1. What is it? Whatʼs new?
2. Challenges & Opportunities
3. “We should cloudify our
datacenter/cluster/whatever!”
4. Academics in the cloud
8
1. What is it? Whatʼs new?
• Old idea: Software as a Service (SaaS),
predates Multics
• New: pay-as-you-go, utility computing
– Illusion of infinite resources on demand (minutes)
– Fine-grained billing: release == donʼt pay
– No minimum commitment
– Earlier examples (Sun, Intel): longer
commitment, more $$$/hour, no storage
9
Unused resources
Cloud Economics 101
• Cloud Computing User: Static provisioning
for peak - wasteful, but necessary for SLA
“Statically provisioned”
data center
“Virtual” data center
in the cloud
Demand
Capacity
Time
Machines
Demand
Capacity
Time
$
10
Unused resources
Cloud Economics 101
• Cloud Computing Provider: Could save
energy
“Statically provisioned”
data center
Real data center
in the cloud
Demand
Capacity
Time
Machines
Demand
Capacity
Time
Energy
11
Back of the envelope
• Server utilization in datacenters: 5-20%
– peaks 2x-10x average
• C = cost/hr. to use cloud (.085 for AWS)
• B = cost/hr. to buy server
– $2K server, 3-year depreciation: $0.076
• HW savings = (peak/average util.) – (C/B)
– in this example, save $$ if peak > 1.1x average
– can also factor in network & storage costs
• Caveat: IT accounting often not so simple
12
Unused resources
Risk of Overprovisioning
• Underutilization results if “peak” predictions
are too optimistic
Static data center
Demand
Capacity
Time
Resources
13
Risks of Under Provisioning
Lost revenue
Lost users
Resources
Demand
Capacity
Time (days)
1 2 3
Resources
Demand
Capacity
Time (days)
1 2 3
Resources
Demand
Capacity
Time (days)
1 2 3
14
Risk Transfer vs. CapEx/OpEx
• Over long timescales, a dollar is a dollar
• CC is not necessarily cheaper, esp. if you
have steady, known capacity needs
• But risk transfer opens fundamentally new
opportunities.
15
Risk Transfer: new scenarios
• “Cost associativity”:
1K servers x 1 hour == 1 server x 1K hours
– Washington Post: Hillary Clintonʼs travel docs
posted to WWW <1 day after released
– RAD Lab: publish results on 1,000+ servers
• Major enabler for SaaS startups
– Animoto Facebook plugin => traffic doubled
every 12 hours for 3 days
– Scaled from 50 to >3500 servers
– ...then scaled back down 16
Why Now (not then)?
• Build-out of extremely large datacenters
(10,000s commodity PCs)
• ...and how to run them
– Infrastructure SW: e.g., Google File System
– Operational expertise: failover, DDoS, firewalls...
– economy of scale: 5-7x cheaper than provisioning
medium-sized (100s/low 1000s machines) facility
• Necessary-but-not-sufficient factors
– pervasive broadband Internet
– Commoditization of HW & Fast Virtualization
– Standardized (& free) software stacks 17
UC Berkeley
2. Challenges & Opportunities
A subset of whatʼs in the paper
Both technical & nontechnical
18
Classifying Clouds
• Instruction Set VM (Amazon EC2)
• Managed runtime VM (Microsoft Azure)
• Framework VM (Google AppEngine, Force.com)
• Tradeoff: flexibility/portability vs. “built in”
functionality
EC2 Azure AppEngine,
Force.com
Lower-level,
Less managed
Higher-level,
More managed
19
Lock-in/business continuity
Challenge Opportunity
Availability /
business continuity
Multiple providers & datacenters
Open API’s
20
• Few enterprise datacentersʼ availability is as good
• “Higher level” (AppEngine, Force.com) vs. “lower
level” (EC2) clouds include proprietary software
+ richer functionality, better built-in ops support
– structural restrictions
• FOSS reimplementations on way? (eg AppScale)
Data lock-in
Challenge Opportunity
Data lock-in Standardization
21
• FOSS implementations of storage (eg
HyperTable)
• 10/19/09: Google Data Liberation Front
Data is a Gravity Well
Challenge Opportunity
Data transfer
bottlenecks
FedEx-ing disks,
Data Backup/Archiving
22
• Amazon now provides “FedEx a disk”
service
• and hosts free public datasets to “attract”
cycles
Data is a Gravity Well
Challenge Opportunity
Scale-up/scale-down
structured storage
Major research opportunity
23
•Profileration of non-relational scalable
storage:
SQL Services (MS Azure), Hypertable,
Cassandra, HBase, Amazon SimpleDB &
S3, Voldemort, CouchDB, NoSQL
movement
Policy/Business Challenges
Challenge Opportunity
Reputation Fate Sharing Offer reputation-guarding
services like those for email
24
4/2/09: FBI raid on Dallas datacenter shuts down
legitimate businesses along with criminal suspects
10/28/09: Amazon will whitelist elastic-IP
addresses and selectively raise limit on outgoing
SMTP
Policy/Business Challenges
Challenge Opportunity
Software Licensing Pay-as-you-go licenses;
Bulk licenses
25
2/11/09: IBM pay-as-you-go Websphere,
DB2, etc. on EC2
Windows on EC2
FOSS makes this less of a problem for
some potential cloud users
UC Berkeley
3. Should I cloudify?
26
Public vs. private clouds wonʼt
see same benefits
Benefit Public Private
Economy of scale Yes No
Illusion of infinite resources on-demand Yes Unlikely
Eliminate up-front commitment by users* Yes No
True fine-grained pay-as-you-go ** Yes ??
Better utilization (workload multiplexing) Yes Depends
on size**
Better utilization & simplified operations
through virtualization
Yes Yes
27
* What about nonrecoverable engineering/capital costs?
** Implies ability to meter & incentive to release idle resources
Consider getting best of both with surge computing
So, should I cloudify?
• Why? Is cost savings expected?
– economies of scale unlikely for most shops
– beware “double paying” for bundled costs
• Internal incentive to release unused
resources?
– If not...donʼt expect improved utilization
– Implies ability to meter (technical) and charge
(nontechnical)
28
IT best practices become
critical
• Authentication, data privacy/sensitivity
–Data flows over public networks, stored in
public infrastructure
–Weakest link in security chain == ?
• Support/lifecycle costs vs. alternatives
–Strong appliance market (e.g. spam
filters)
–“Accountability gap” for support
29
Hybrid/Surge Computing
• Use cloud for separate/one-off jobs?
• Harder: Provision steady state,
overflow your app to cloud?
–implies high degree of location
independence, software modularity
–must overcome most Cloud obstacles
–FOSS reimplementations (Eucalyptus) or
commercial products (VMware vCloud)?
30
Do my apps make sense in
cloud?
• Some app types compelling
– Extend desktop apps into cloud: Matlab,
Mathematica; soon productivity apps?
– Web-like apps with reasonable database
strategy
– Batch processing to exploit cost associativity,
e.g. for business analytics
• Others cloud-challenged
– Bulk data movement expensive, slow
– Jitter-sensitive apps (long-haul latency &
virtualization-induced performance distortion)
31
UC Berkeley
4. Academics in the Cloud:
some experiences
(thanks: Jon Kuroda, Eric Fraser,
Mike Howard)
32
Clouds in the RAD Lab
• Eucalyptus on ~40-node cluster
• Lots of Amazon AWS usage
• Workload can overflow from one to the
other (same tools, VM images, ...)
• Primarily for research/experiments that
donʼt need to tie in with, eg, UCB Kerberos
• Permissions, authentication, access to
home dirs from AWS, etc.—open
problems
33
An EECS-centric view
• Higher quality research
– routinely do experiments on 100+ servers
– many results published on 1,000+ servers
– unthinkable a few years ago
• Get results faster => solve new problems
– lots of machine learning/data mining research
– eg console log analysis [Xu et al, SOSP 09 &
ICDM 09]: minutes vs. hours means can do in
near-real-time
• Save money? um...that was a non-goal 34
Obstacles to CC in Research
• Accounting models that reward cost-
effective cloud use
• Funding/grants culture hasnʼt caught up to
“CapEx vs. OpEx”
• Tools still require high sophistication
– but attractive role for software appliances
• Software licensing isnʼt “cost associative”
– typically still tied to seats or fixed #CPUs
– less problematic for us as researchers
35
Cloud Computing &
Statistical Machine Learning
• Before CC, performance optimization was
mostly focused on small-scale systems
• CC  detailed cost-performance model
– Optimization more difficult with more metrics
• CC  Everyone can use 1000+ servers
– Optimization more difficult at large scale
• Economics rewards scale up and down
– Optimization more difficult if add/drop servers
• SML as optimization difficulty increases
36
Example: “elastic” key-value store
for SCADS [Armbrust et al, CIDR 09]
Capacity on demand
+
Motivation to release unused
=
Do the least you can up front
CS education in the Cloud
• Moved Berkeley SaaS course to AWS
– expose students to realistic environment
– Watch a database fall over: would have
needed 200 servers for ~20 project teams
– End of term project demos, Lab deadlines
• VM image simplifies courseware
distribution
– Students can be root
– repair damage == reinstantiate image
Summary: Clouds in EECS
• Focus is new research/teaching
opportunities vs. cost savings
• Mileage may vary in other departments
• Tools still require sophistication
• Authentication, other “admino-technical”
issues largely unsolved
• Funding/costing models not caught up
39
UC Berkeley
Wrapping up...
40
Summary: Whatʼs new
• CC “Risk transfer” enables new scenarios
– Startups and prototyping
– One-off tasks that exploit “cost associativity”
– Research & education at scale
• Improved utilization and lower costs if
scale down as well as up
– Economic motivation to scale down
– Changes thinking about load balancing, SW
design to support scale-down
41
Summary: Obstacles
• How “dependent” can you become?
– Data expensive to move, no universal format
– Management APIʼs not yet standardized
– Doesnʼt (necessarily) eliminate reliance on
proprietary SW
• SW licensing mostly cloud-unfriendly
• Security considerations, IT best practices
• Difficulty of quantifying savings
• Locus of administration/accountability?
42
Should I cloudify?
• Expecting to save money?
– Economy of scale unlikely; savings more likely
from better utilization
– But must design for resource accounting &
offer incentive to release
– Does hybrid/surge make sense?
• Even if donʼt move to cloud...use as driver
– enforce best practices
– identify bundled costs => true cost of IT
43
Conclusion
Is cloud computing all hype?
No.
Is it a fad that will fizzle out?
We think itʼs a major sea change.
Is it for everyone?
No/not yet, but be familiar with
obstacles & opportunities .44
UC Berkeley
Thank you!
More: abovetheclouds.cs.berkeley.edu
45
BACKUP SLIDES
46
RAD Lab Prototype:
System Architecture
Drivers
Drivers
Drivers
New apps,
equipment,
global policies
(eg SLA)
Offered load,
resource
utilization, etc.
Chukwa
&
XTrace
(monitoring)
Training data
Ruby on
Rails environment
VM monitor
local OS functions
Chukwa trace coll.
web svc
APIs
Web 2.0 apps
local OS functions
Chukwa trace coll.
SCADS
Director
performance &
cost
models
Log
Mining
Automatic
Workload
Evaluation
(AWE)
47
CC Changes Demands on
Instructional Computing?
• Runs on your laptop or
class Un*x account
• Good enough for course
project
• project scrapped when
course ends
• Intra-class teams
• Courseware: custom install
• Code never leaves UCB
_____________________
• Per-student/per-course
account
• Runs in cloud, remote
management
• Your friends can use it 
*ilities matter
• Gain customers  app
outlives course
• Teams cross UCB boundary
• Courseware: VM image
• Code released open source,
résumé builder
______________________
• General, collaboration-
enabling tools & facilities
Big science in the cloud?
• Web apps restructured to “shared-nothing
friendly” thru 90s; can science do same?
– gang scheduling for clouds/virtual clouds?
– rethink storage vs. checkpointing vs. code
structure
– move to much higher level languages (leave
tuning to macroblocks/runtime, not woven into
source code)
– Data-intensive (I/O rates & volume) needs of
science apps
• Opportunity for “cost associativity”! 49
SCADS: Scalable, Consistency-
Adjustable Data Storage
• Scale Independence – as #users grows:
– No changes to application
– Cost per user doesnʼt increase
– Request latency doesnʼt change
• Key Innovations
1.Performance safe query language
2.Declarative performance/consistency
tradeoffs
3.Automatic scale up and down using
machine learning
Scale Independence Arch
• Developers provide
performance safe
queries along with
consistency
requirements
• Use ML, workload
information, and
requirements to
provision proactively
via repartitioning
keys and replicas
SCADS Performance Model
(on m1.small, all data in memory)
SLA threshold
5% writes 1% writes
99th percen6le
median

More Related Content

Similar to Cloud Computing Berkeley.pdf

Similar to Cloud Computing Berkeley.pdf (20)

L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
Introduction To Cloud Computing.ppt
Introduction To Cloud Computing.pptIntroduction To Cloud Computing.ppt
Introduction To Cloud Computing.ppt
 
cloud computing services
cloud computing servicescloud computing services
cloud computing services
 
Internet of behaviours features and documents
Internet of behaviours features and documentsInternet of behaviours features and documents
Internet of behaviours features and documents
 
L2 3.fa19
L2 3.fa19L2 3.fa19
L2 3.fa19
 
L2-3.FA19.ppt
L2-3.FA19.pptL2-3.FA19.ppt
L2-3.FA19.ppt
 
Cloud computingjun28
Cloud computingjun28Cloud computingjun28
Cloud computingjun28
 
Cloud computingjun28
Cloud computingjun28Cloud computingjun28
Cloud computingjun28
 
Cloud Computing from Academic Perspective
Cloud Computing from Academic PerspectiveCloud Computing from Academic Perspective
Cloud Computing from Academic Perspective
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Above The Clouds
Above The CloudsAbove The Clouds
Above The Clouds
 
Introduction to Cloud computing
Introduction to Cloud computingIntroduction to Cloud computing
Introduction to Cloud computing
 

Recently uploaded

VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Roomgirls4nights
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一Fs
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Roomdivyansh0kumar0
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Lucknow
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Roomishabajaj13
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girladitipandeya
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607dollysharma2066
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts servicesonalikaur4
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...aditipandeya
 
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...akbard9823
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Roomdivyansh0kumar0
 

Recently uploaded (20)

VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls KolkataVIP Call Girls Kolkata Ananya 🤌  8250192130 🚀 Vip Call Girls Kolkata
VIP Call Girls Kolkata Ananya 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Call Girls Service Dwarka @9999965857 Delhi 🫦 No Advance VVIP 🍎 SERVICE
Call Girls Service Dwarka @9999965857 Delhi 🫦 No Advance  VVIP 🍎 SERVICECall Girls Service Dwarka @9999965857 Delhi 🫦 No Advance  VVIP 🍎 SERVICE
Call Girls Service Dwarka @9999965857 Delhi 🫦 No Advance VVIP 🍎 SERVICE
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
 
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
定制(Management毕业证书)新加坡管理大学毕业证成绩单原版一比一
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130  Available With RoomVIP Kolkata Call Girl Alambazar 👉 8250192130  Available With Room
VIP Kolkata Call Girl Alambazar 👉 8250192130 Available With Room
 
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja VipCall Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
Call Girls Service Adil Nagar 7001305949 Need escorts Service Pooja Vip
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With RoomVIP Kolkata Call Girl Salt Lake 👉 8250192130  Available With Room
VIP Kolkata Call Girl Salt Lake 👉 8250192130 Available With Room
 
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
young call girls in Uttam Nagar🔝 9953056974 🔝 Delhi escort Service
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
 
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Porur Phone 🍆 8250192130 👅 celebrity escorts service
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
 
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...
Sushant Golf City / best call girls in Lucknow | Service-oriented sexy call g...
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girls
 
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With RoomVIP Kolkata Call Girl Dum Dum 👉 8250192130  Available With Room
VIP Kolkata Call Girl Dum Dum 👉 8250192130 Available With Room
 

Cloud Computing Berkeley.pdf

  • 1. UC Berkeley 1 Above the Clouds: A Berkeley View of Cloud Computing Armando Fox and a cast of tens , UC Berkeley Reliable Adaptive Distributed Systems Lab USENIX LISA 2009 © 2009 Image: John Curley http://www.flickr.com/photos/jay_que/1834540/
  • 2. 2 Datacenter is new “server” • “Program” == Web search, email, map/GIS, … • “Computer” == 1000ʼs computers, storage, network • Warehouse-sized facilities and workloads • New datacenter ideas (2007-2008): truck container (Sun), floating (Google), In Tents Computing (Microsoft) • How to enable innovation in new services without first building & capitalizing a large company? photos: Sun Microsystems & datacenterknowledge.com
  • 3. RAD Lab 5-year Mission Goal: Enable 1 person to develop, deploy, operate next -generation Internet application • Key enabling technology: Statistical machine learning – management, scaling, anomaly detection, performance prediction... • interdisciplinary: 7 faculty, ~30 PhDʼs, ~6 ugrads, ~1 sysadm • Regular engagement with industrial affiliates keeps us from smoking our own dope too often 3
  • 4. How we got into the clouds • Theme: cutting-edge statistical machine learning works where simple methods fail – Resource utilization prediction – Adding/removing storage bricks to meet SLA – Console log analysis for problem finding • Sponsor feedback: Great, now show that it works on at least 1000ʼs of machines 4
  • 5. Utility Computing to the Rescue: Pay as you Go • Amazon Elastic Compute Cloud (EC2) • “Compute units” $0.10-0.80/hr. $0.085/hr & up – 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Xeon core • N • No up-front cost, no contract, no minimum • storage (~0.15/GB/month) • network (~0.10-0.15/GB external; 0.00 internal) • Everything virtualized, even concept of independent failure 5 “Instances” Platform Cores Memory Disk Small - $0.085 / hr 32-bit 1 1.7 GB 160 GB Large - $0.34/ hr 64-bit 4 7.5 GB 850 GB – 2 spindles XLarge - $0.68/ hr 64-bit 8 15.0 GB 1690 GB – 3 spindles Options....extra memory, extra CPU, extra disk, ... 5
  • 6. Cloud Computing is Hot *sigh* “...weʼve redefined Cloud Computing to include everything that we already do... I donʼt understand what we would do differently ... other than change the wording of some of our ads.” Sept. 2008 “Weʼve been building data center after data center, acquiring application after application, ...driving up the cost of technology immensely across the board. We need to find a more innovative path.” Sept. 2009 6
  • 7. A Berkeley View of Cloud Computing abovetheclouds.cs.berkeley.edu • 2/09 White paper by RAD Lab PIʼs/students • Goal: stimulate discussion on whatʼs new – Clarify terminology – Quantify comparisons – Identify challenges & opportunities • UC Berkeley perspective – industry engagement but no axe to grind – users of CC since late 2007 7
  • 8. Rest of talk 1. What is it? Whatʼs new? 2. Challenges & Opportunities 3. “We should cloudify our datacenter/cluster/whatever!” 4. Academics in the cloud 8
  • 9. 1. What is it? Whatʼs new? • Old idea: Software as a Service (SaaS), predates Multics • New: pay-as-you-go, utility computing – Illusion of infinite resources on demand (minutes) – Fine-grained billing: release == donʼt pay – No minimum commitment – Earlier examples (Sun, Intel): longer commitment, more $$$/hour, no storage 9
  • 10. Unused resources Cloud Economics 101 • Cloud Computing User: Static provisioning for peak - wasteful, but necessary for SLA “Statically provisioned” data center “Virtual” data center in the cloud Demand Capacity Time Machines Demand Capacity Time $ 10
  • 11. Unused resources Cloud Economics 101 • Cloud Computing Provider: Could save energy “Statically provisioned” data center Real data center in the cloud Demand Capacity Time Machines Demand Capacity Time Energy 11
  • 12. Back of the envelope • Server utilization in datacenters: 5-20% – peaks 2x-10x average • C = cost/hr. to use cloud (.085 for AWS) • B = cost/hr. to buy server – $2K server, 3-year depreciation: $0.076 • HW savings = (peak/average util.) – (C/B) – in this example, save $$ if peak > 1.1x average – can also factor in network & storage costs • Caveat: IT accounting often not so simple 12
  • 13. Unused resources Risk of Overprovisioning • Underutilization results if “peak” predictions are too optimistic Static data center Demand Capacity Time Resources 13
  • 14. Risks of Under Provisioning Lost revenue Lost users Resources Demand Capacity Time (days) 1 2 3 Resources Demand Capacity Time (days) 1 2 3 Resources Demand Capacity Time (days) 1 2 3 14
  • 15. Risk Transfer vs. CapEx/OpEx • Over long timescales, a dollar is a dollar • CC is not necessarily cheaper, esp. if you have steady, known capacity needs • But risk transfer opens fundamentally new opportunities. 15
  • 16. Risk Transfer: new scenarios • “Cost associativity”: 1K servers x 1 hour == 1 server x 1K hours – Washington Post: Hillary Clintonʼs travel docs posted to WWW <1 day after released – RAD Lab: publish results on 1,000+ servers • Major enabler for SaaS startups – Animoto Facebook plugin => traffic doubled every 12 hours for 3 days – Scaled from 50 to >3500 servers – ...then scaled back down 16
  • 17. Why Now (not then)? • Build-out of extremely large datacenters (10,000s commodity PCs) • ...and how to run them – Infrastructure SW: e.g., Google File System – Operational expertise: failover, DDoS, firewalls... – economy of scale: 5-7x cheaper than provisioning medium-sized (100s/low 1000s machines) facility • Necessary-but-not-sufficient factors – pervasive broadband Internet – Commoditization of HW & Fast Virtualization – Standardized (& free) software stacks 17
  • 18. UC Berkeley 2. Challenges & Opportunities A subset of whatʼs in the paper Both technical & nontechnical 18
  • 19. Classifying Clouds • Instruction Set VM (Amazon EC2) • Managed runtime VM (Microsoft Azure) • Framework VM (Google AppEngine, Force.com) • Tradeoff: flexibility/portability vs. “built in” functionality EC2 Azure AppEngine, Force.com Lower-level, Less managed Higher-level, More managed 19
  • 20. Lock-in/business continuity Challenge Opportunity Availability / business continuity Multiple providers & datacenters Open API’s 20 • Few enterprise datacentersʼ availability is as good • “Higher level” (AppEngine, Force.com) vs. “lower level” (EC2) clouds include proprietary software + richer functionality, better built-in ops support – structural restrictions • FOSS reimplementations on way? (eg AppScale)
  • 21. Data lock-in Challenge Opportunity Data lock-in Standardization 21 • FOSS implementations of storage (eg HyperTable) • 10/19/09: Google Data Liberation Front
  • 22. Data is a Gravity Well Challenge Opportunity Data transfer bottlenecks FedEx-ing disks, Data Backup/Archiving 22 • Amazon now provides “FedEx a disk” service • and hosts free public datasets to “attract” cycles
  • 23. Data is a Gravity Well Challenge Opportunity Scale-up/scale-down structured storage Major research opportunity 23 •Profileration of non-relational scalable storage: SQL Services (MS Azure), Hypertable, Cassandra, HBase, Amazon SimpleDB & S3, Voldemort, CouchDB, NoSQL movement
  • 24. Policy/Business Challenges Challenge Opportunity Reputation Fate Sharing Offer reputation-guarding services like those for email 24 4/2/09: FBI raid on Dallas datacenter shuts down legitimate businesses along with criminal suspects 10/28/09: Amazon will whitelist elastic-IP addresses and selectively raise limit on outgoing SMTP
  • 25. Policy/Business Challenges Challenge Opportunity Software Licensing Pay-as-you-go licenses; Bulk licenses 25 2/11/09: IBM pay-as-you-go Websphere, DB2, etc. on EC2 Windows on EC2 FOSS makes this less of a problem for some potential cloud users
  • 26. UC Berkeley 3. Should I cloudify? 26
  • 27. Public vs. private clouds wonʼt see same benefits Benefit Public Private Economy of scale Yes No Illusion of infinite resources on-demand Yes Unlikely Eliminate up-front commitment by users* Yes No True fine-grained pay-as-you-go ** Yes ?? Better utilization (workload multiplexing) Yes Depends on size** Better utilization & simplified operations through virtualization Yes Yes 27 * What about nonrecoverable engineering/capital costs? ** Implies ability to meter & incentive to release idle resources Consider getting best of both with surge computing
  • 28. So, should I cloudify? • Why? Is cost savings expected? – economies of scale unlikely for most shops – beware “double paying” for bundled costs • Internal incentive to release unused resources? – If not...donʼt expect improved utilization – Implies ability to meter (technical) and charge (nontechnical) 28
  • 29. IT best practices become critical • Authentication, data privacy/sensitivity –Data flows over public networks, stored in public infrastructure –Weakest link in security chain == ? • Support/lifecycle costs vs. alternatives –Strong appliance market (e.g. spam filters) –“Accountability gap” for support 29
  • 30. Hybrid/Surge Computing • Use cloud for separate/one-off jobs? • Harder: Provision steady state, overflow your app to cloud? –implies high degree of location independence, software modularity –must overcome most Cloud obstacles –FOSS reimplementations (Eucalyptus) or commercial products (VMware vCloud)? 30
  • 31. Do my apps make sense in cloud? • Some app types compelling – Extend desktop apps into cloud: Matlab, Mathematica; soon productivity apps? – Web-like apps with reasonable database strategy – Batch processing to exploit cost associativity, e.g. for business analytics • Others cloud-challenged – Bulk data movement expensive, slow – Jitter-sensitive apps (long-haul latency & virtualization-induced performance distortion) 31
  • 32. UC Berkeley 4. Academics in the Cloud: some experiences (thanks: Jon Kuroda, Eric Fraser, Mike Howard) 32
  • 33. Clouds in the RAD Lab • Eucalyptus on ~40-node cluster • Lots of Amazon AWS usage • Workload can overflow from one to the other (same tools, VM images, ...) • Primarily for research/experiments that donʼt need to tie in with, eg, UCB Kerberos • Permissions, authentication, access to home dirs from AWS, etc.—open problems 33
  • 34. An EECS-centric view • Higher quality research – routinely do experiments on 100+ servers – many results published on 1,000+ servers – unthinkable a few years ago • Get results faster => solve new problems – lots of machine learning/data mining research – eg console log analysis [Xu et al, SOSP 09 & ICDM 09]: minutes vs. hours means can do in near-real-time • Save money? um...that was a non-goal 34
  • 35. Obstacles to CC in Research • Accounting models that reward cost- effective cloud use • Funding/grants culture hasnʼt caught up to “CapEx vs. OpEx” • Tools still require high sophistication – but attractive role for software appliances • Software licensing isnʼt “cost associative” – typically still tied to seats or fixed #CPUs – less problematic for us as researchers 35
  • 36. Cloud Computing & Statistical Machine Learning • Before CC, performance optimization was mostly focused on small-scale systems • CC  detailed cost-performance model – Optimization more difficult with more metrics • CC  Everyone can use 1000+ servers – Optimization more difficult at large scale • Economics rewards scale up and down – Optimization more difficult if add/drop servers • SML as optimization difficulty increases 36
  • 37. Example: “elastic” key-value store for SCADS [Armbrust et al, CIDR 09] Capacity on demand + Motivation to release unused = Do the least you can up front
  • 38. CS education in the Cloud • Moved Berkeley SaaS course to AWS – expose students to realistic environment – Watch a database fall over: would have needed 200 servers for ~20 project teams – End of term project demos, Lab deadlines • VM image simplifies courseware distribution – Students can be root – repair damage == reinstantiate image
  • 39. Summary: Clouds in EECS • Focus is new research/teaching opportunities vs. cost savings • Mileage may vary in other departments • Tools still require sophistication • Authentication, other “admino-technical” issues largely unsolved • Funding/costing models not caught up 39
  • 41. Summary: Whatʼs new • CC “Risk transfer” enables new scenarios – Startups and prototyping – One-off tasks that exploit “cost associativity” – Research & education at scale • Improved utilization and lower costs if scale down as well as up – Economic motivation to scale down – Changes thinking about load balancing, SW design to support scale-down 41
  • 42. Summary: Obstacles • How “dependent” can you become? – Data expensive to move, no universal format – Management APIʼs not yet standardized – Doesnʼt (necessarily) eliminate reliance on proprietary SW • SW licensing mostly cloud-unfriendly • Security considerations, IT best practices • Difficulty of quantifying savings • Locus of administration/accountability? 42
  • 43. Should I cloudify? • Expecting to save money? – Economy of scale unlikely; savings more likely from better utilization – But must design for resource accounting & offer incentive to release – Does hybrid/surge make sense? • Even if donʼt move to cloud...use as driver – enforce best practices – identify bundled costs => true cost of IT 43
  • 44. Conclusion Is cloud computing all hype? No. Is it a fad that will fizzle out? We think itʼs a major sea change. Is it for everyone? No/not yet, but be familiar with obstacles & opportunities .44
  • 45. UC Berkeley Thank you! More: abovetheclouds.cs.berkeley.edu 45
  • 47. RAD Lab Prototype: System Architecture Drivers Drivers Drivers New apps, equipment, global policies (eg SLA) Offered load, resource utilization, etc. Chukwa & XTrace (monitoring) Training data Ruby on Rails environment VM monitor local OS functions Chukwa trace coll. web svc APIs Web 2.0 apps local OS functions Chukwa trace coll. SCADS Director performance & cost models Log Mining Automatic Workload Evaluation (AWE) 47
  • 48. CC Changes Demands on Instructional Computing? • Runs on your laptop or class Un*x account • Good enough for course project • project scrapped when course ends • Intra-class teams • Courseware: custom install • Code never leaves UCB _____________________ • Per-student/per-course account • Runs in cloud, remote management • Your friends can use it  *ilities matter • Gain customers  app outlives course • Teams cross UCB boundary • Courseware: VM image • Code released open source, résumé builder ______________________ • General, collaboration- enabling tools & facilities
  • 49. Big science in the cloud? • Web apps restructured to “shared-nothing friendly” thru 90s; can science do same? – gang scheduling for clouds/virtual clouds? – rethink storage vs. checkpointing vs. code structure – move to much higher level languages (leave tuning to macroblocks/runtime, not woven into source code) – Data-intensive (I/O rates & volume) needs of science apps • Opportunity for “cost associativity”! 49
  • 50. SCADS: Scalable, Consistency- Adjustable Data Storage • Scale Independence – as #users grows: – No changes to application – Cost per user doesnʼt increase – Request latency doesnʼt change • Key Innovations 1.Performance safe query language 2.Declarative performance/consistency tradeoffs 3.Automatic scale up and down using machine learning
  • 51. Scale Independence Arch • Developers provide performance safe queries along with consistency requirements • Use ML, workload information, and requirements to provision proactively via repartitioning keys and replicas
  • 52. SCADS Performance Model (on m1.small, all data in memory) SLA threshold 5% writes 1% writes 99th percen6le median