CS162
Operating Systems and
Systems Programming
Lecture 24
Capstone: Cloud Computing
December 2, 2013
Anthony D. Joseph and John Canny
http://inst.eecs.berkeley.edu/~cs162
24.2
12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013
Goals for Today
• Distributed systems
• Cloud Computing programming paradigms
• Cloud Computing OS
Note: Some slides and/or pictures in the following are adapted from slides by Ali Ghodsi.
Background of Cloud Computing
• 1990: Heyday of parallel computing, multi-processors
– 52% growth in performance per year!
• 2002: The thermal wall
– Speed (frequency) peaks, but transistors keep shrinking
• The Multicore revolution
– 15-20 years later than predicted, we have hit the performance wall
Sources Driving Big Data
It’s All Happening On-line
Every:
Click
Ad impression
Billing event
Fast forward, pause, …
Friend request
Transaction
Network message
Fault
…
User Generated (Web & Mobile)
Internet of Things / M2M
Scientific Computing
Data Deluge
• Billions of users connected through the net
– WWW, FB, twitter, cell phones, …
– 80% of the data on FB was produced last year
• Storage getting cheaper
– Store more data!
Data Grows Faster than Moore’s Law
[Chart: projected growth relative to 2010, for 2010-2015. Overall data and particle accelerator data both grow far faster than Moore's Law.]
Solving the Impedance Mismatch
• Computers not getting faster, and we are drowning in data
– How to resolve the dilemma?
• Solution adopted by web-scale companies
– Go massively distributed and parallel
Enter the World of Distributed Systems
• Distributed Systems/Computing
– Loosely coupled set of computers, communicating through message passing, working toward a common goal
• Distributed computing is challenging
– Dealing with partial failures (examples?)
– Dealing with asynchrony (examples?)
• Distributed Computing versus Parallel Computing?
– distributed computing = parallel computing + partial failures
Dealing with Distribution
• We have seen several of the tools that help with
distributed programming
– Message Passing Interface (MPI)
– Distributed Shared Memory (DSM)
– Remote Procedure Calls (RPC)
• But, distributed programming is still very hard
– Programming for scale, fault-tolerance, consistency, …
The Datacenter is the new Computer
• “Program” == Web search, email, map/GIS, …
• “Computer” == 10,000s of computers, plus storage and network
• Warehouse-sized facilities and workloads
• Built from less reliable components than traditional datacenters
Datacenter/Cloud Computing OS
• If the datacenter/cloud is the new computer
– What is its Operating System?
– Note that we are not talking about a host OS
Classical Operating Systems
• Data sharing
– Inter-Process Communication, RPC, files, pipes, …
• Programming Abstractions
– Libraries (libc), system calls, …
• Multiplexing of resources
– Scheduling, virtual memory, file allocation/protection, …
Datacenter/Cloud Operating System
• Data sharing
– Google File System, key/value stores
• Programming Abstractions
– Google MapReduce, PIG, Hive, Spark
• Multiplexing of resources
– Apache projects: Mesos, YARN (MRv2), ZooKeeper,
BookKeeper, …
Google Cloud Infrastructure
• Google File System (GFS), 2003
– Distributed File System for entire
cluster
– Single namespace
• Google MapReduce (MR), 2004
– Runs queries/jobs on data
– Manages work distribution & fault-tolerance
– Colocated with file system
• Apache open-source versions: Hadoop DFS (HDFS) and Hadoop MR
GFS/HDFS Insights
• Petabyte-scale storage
– Files split into large blocks (128 MB) and replicated across several nodes
– Big blocks allow high-throughput sequential reads/writes
• Data striped over hundreds/thousands of servers
– Scan 100 TB on 1 node @ 50 MB/s ≈ 24 days
– Scan on a 1000-node cluster ≈ 35 minutes
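The arithmetic behind these numbers can be checked directly (a quick sketch, assuming 1 TB = 10^12 bytes):

```python
# Sanity-check the scan estimates above.
TB = 10**12
MB = 10**6

data = 100 * TB               # 100 TB to scan
per_node_rate = 50 * MB       # 50 MB/s sequential read per node

one_node_days = data / per_node_rate / 86400          # 86400 seconds per day
cluster_minutes = data / (per_node_rate * 1000) / 60  # 1000 nodes in parallel

print(round(one_node_days, 1))    # ~23 days, i.e. the "24 days" above
print(round(cluster_minutes, 1))  # ~33 minutes, i.e. the "35 minutes" above
```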
GFS/HDFS Insights (2)
• Failures will be the norm
– Mean time between failures for 1 node = 3 years
– Mean time between failures for 1000 nodes = 1 day
• Use commodity hardware
– Failures are the norm anyway, so buy cheaper hardware
• No complicated consistency models
– Single writer, append-only data
MapReduce Insights
• Restricted key-value model
– Same fine-grained operation (Map & Reduce)
repeated on big data
– Operations must be deterministic
– Operations must be idempotent/no side effects
– Only communication is through the shuffle
– Operation (Map & Reduce) output saved (on disk)
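The restricted model above can be illustrated with the canonical word-count example, sketched here in plain Python (not the Google/Hadoop API; the function names are ours):

```python
from collections import defaultdict

# Toy word count in the MapReduce style described above: deterministic,
# side-effect-free map and reduce operations that communicate only
# through the shuffle.

def map_fn(line):
    # Emit (key, value) pairs for each word in the line.
    return [(word, 1) for word in line.split()]

def reduce_fn(key, values):
    return (key, sum(values))

def mapreduce(inputs):
    # Shuffle: group intermediate pairs by key, then reduce each group.
    groups = defaultdict(list)
    for line in inputs:
        for k, v in map_fn(line):
            groups[k].append(v)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

counts = mapreduce(["the cat sat", "the cat ran"])
print(counts)  # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
```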
What is MapReduce Used For?
• At Google:
– Index building for Google Search
– Article clustering for Google News
– Statistical machine translation
• At Yahoo!:
– Index building for Yahoo! Search
– Spam detection for Yahoo! Mail
• At Facebook:
– Data mining
– Ad optimization
– Spam detection
MapReduce Pros
• Distribution is completely transparent
– Not a single line of distributed programming (ease, correctness)
• Automatic fault-tolerance
– Determinism enables running failed tasks somewhere else again
– Saved intermediate data enables just re-running failed reducers
• Automatic scaling
– As operations are side-effect free, they can be distributed to any number of machines dynamically
• Automatic load-balancing
– Move tasks and speculatively execute duplicate copies of slow
tasks (stragglers)
MapReduce Cons
• Restricted programming model
– Not always natural to express problems in this model
– Low-level coding necessary
– Little support for iterative jobs (lots of disk access)
– High-latency (batch processing)
• Addressed by follow-up research
– Pig and Hive for high-level coding
– Spark for iterative and low-latency jobs
Pig
• High-level language:
– Expresses sequences of MapReduce jobs
– Provides relational (SQL) operators
(JOIN, GROUP BY, etc)
– Easy to plug in Java functions
• Started at Yahoo! Research
– Runs about 50% of Yahoo!’s jobs
Example Problem
Given user data in one file and website data in another, find the top 5 most-visited pages by users aged 18-25.
Pipeline: Load Users and Load Pages; Filter Users by age; Join on name; Group on url; Count clicks; Order by clicks; Take top 5.
Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt
In MapReduce
[Image: the same query written directly against the MapReduce Java API, far longer and lower-level.]
In Pig Latin

Users = load 'users' as (name, age);
Filtered = filter Users by age >= 18 and age <= 25;
Pages = load 'pages' as (user, url);
Joined = join Filtered by name, Pages by user;
Grouped = group Joined by url;
Summed = foreach Grouped generate group, count(Joined) as clicks;
Sorted = order Summed by clicks desc;
Top5 = limit Sorted 5;
store Top5 into 'top5sites';
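For comparison, here is the same query over toy in-memory data in plain Python (the sample rows are hypothetical; this only illustrates what the script computes):

```python
from collections import Counter

# The Pig Latin query above, restated in plain Python over tiny
# hypothetical datasets: filter users by age, join with page visits,
# group by url, count clicks, and take the most-visited pages.
users = [("alice", 20), ("bob", 30), ("carol", 19)]
pages = [("alice", "a.com"), ("carol", "a.com"), ("carol", "b.com"),
         ("bob", "b.com")]

filtered = {name for name, age in users if 18 <= age <= 25}
joined = [(user, url) for user, url in pages if user in filtered]
clicks = Counter(url for _, url in joined)
top5 = clicks.most_common(5)
print(top5)  # [('a.com', 2), ('b.com', 1)]
```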
Translation to MapReduce
Notice how naturally the components of the job translate into Pig Latin:
Users = load …; Filtered = filter …; Pages = load …; Joined = join …; Grouped = group …; Summed = … count() …; Sorted = order …; Top5 = limit …
[Diagram: the pipeline steps (Load Users, Load Pages, Filter by age, Join on name, Group on url, Count clicks, Order by clicks, Take top 5) compile into three MapReduce jobs.]
Hive
• Relational database built on Hadoop
– Maintains table schemas
– SQL-like query language (which can also call Hadoop Streaming scripts)
– Supports table partitioning, complex data types, sampling, some query optimization
• Developed at Facebook
– Used for many Facebook jobs
Spark Motivation
Complex jobs, interactive queries, and online processing all need one thing that MR lacks: efficient primitives for data sharing.
[Diagram: an iterative job (Stage 1 → Stage 2 → Stage 3), interactive mining (Query 1, 2, 3), and stream processing (Job 1, Job 2, …) all need to share data between steps.]
Problem: in MR, the only way to share data across jobs is via stable storage (e.g. a file system) → slow!
Examples
[Diagram: an iterative job writes each iteration's result to HDFS and reads it back (HDFS read → iter. 1 → HDFS write → HDFS read → iter. 2 → …); interactive queries each re-read the same input from HDFS to produce results 1-3.]
Opportunity: DRAM is getting cheaper → use main memory for intermediate results instead of disks.
Goal: In-Memory Data Sharing
[Diagram: the same iterative job and interactive queries, but with intermediate results kept in distributed memory; the input is read once ("one-time processing").]
10-100× faster than network and disk
Solution: Resilient Distributed Datasets (RDDs)
• Partitioned collections of records that can be stored in memory across the cluster
• Manipulated through a diverse set of transformations (map, filter, join, etc.)
• Fault recovery without costly replication
– Remember the series of transformations that built an RDD (its lineage) to recompute lost data
• http://spark.incubator.apache.org/
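The lineage idea can be sketched in a few lines of Python: an RDD-like object remembers how to rebuild its data instead of replicating it (illustrative only; the real Spark API is much richer):

```python
# Minimal lineage sketch: each RDD stores a closure describing how to
# (re)compute its contents from its parent, so a lost partition can be
# rebuilt by replaying the chain of transformations.

class RDD:
    def __init__(self, compute):
        self.compute = compute  # lineage: how to (re)build the data

    def map(self, f):
        return RDD(lambda: [f(x) for x in self.compute()])

    def filter(self, p):
        return RDD(lambda: [x for x in self.compute() if p(x)])

    def collect(self):
        # In real Spark the result may be cached in memory; after a
        # failure, compute() replays the lineage to recover the data.
        return self.compute()

base = RDD(lambda: list(range(6)))
evens_squared = base.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
print(evens_squared.collect())  # [0, 4, 16]
```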
Administrivia
• Project 4
– Design Doc due today (12/2) by 11:59pm
– Code due next week Thu 12/12 by 11:59pm
• MIDTERM #2 is this Wednesday 12/4 5:30-7pm in
145 Dwinelle (A-L) and 2060 Valley LSB (M-Z)
– Covers Lectures #13-24, projects, and readings
– One sheet of notes, both sides
• Prof Joseph’s office hours extended tomorrow:
– 10-11:30 in 449 Soda
• RRR week office hours: TBA
5min Break
Datacenter Scheduling Problem
• Rapid innovation in datacenter computing frameworks (Pig, Dryad, Pregel, Percolator, CIEL, …)
• No single framework is optimal for all applications
• Want to run multiple frameworks in a single datacenter
– …to maximize utilization
– …to share data between frameworks
Where We Want to Go
[Diagram: today Hadoop, Pregel, and MPI each own a static partition of the cluster; the goal is dynamic sharing of one shared cluster.]
Solution: Apache Mesos
[Diagram: Mesos runs across all nodes, with Hadoop, Pregel, and other frameworks running on top of it instead of on static partitions.]
• Mesos is a common resource-sharing layer over which diverse frameworks can run
• Run multiple instances of the same framework
– Isolate production and experimental jobs
– Run multiple versions of the framework concurrently
• Build specialized frameworks targeting particular problem domains
– Better performance than general-purpose abstractions
Mesos Goals
• High utilization of resources
• Support diverse frameworks (current & future)
• Scalability to 10,000s of nodes
• Reliability in face of failures
http://mesos.apache.org/
Resulting design: a small, microkernel-like core that pushes scheduling logic to frameworks
Mesos Design Elements
• Fine-grained sharing:
– Allocation at the level of tasks within a job
– Improves utilization, latency, and data locality
• Resource offers:
– Simple, scalable, application-controlled scheduling mechanism
Element 1: Fine-Grained Sharing
[Diagram: under coarse-grained sharing (HPC), each of Frameworks 1-3 holds a static block of nodes; under fine-grained sharing (Mesos), short tasks from all three frameworks interleave across every node. Both run over a storage system (e.g. HDFS).]
+ Improved utilization, responsiveness, data locality
Element 2: Resource Offers
• Option: global scheduler
– Frameworks express needs in a specification language; a global scheduler matches them to resources
+ Can make optimal decisions
– Complex: the language must support all framework needs
– Difficult to scale and to make robust
– Future frameworks may have unanticipated needs
Element 2: Resource Offers
• Mesos: resource offers
– Offer available resources to frameworks; let them pick which resources to use and which tasks to launch
+ Keeps Mesos simple, lets it support future frameworks
- Decentralized decisions might not be optimal
Mesos Architecture
[Diagram: MPI and Hadoop jobs submit to their own framework schedulers; the allocation module in the Mesos master picks a framework to offer resources to; Mesos slaves run the frameworks' executors and tasks.]
Mesos Architecture
[Diagram, continued: the Mesos master sends a resource offer to a framework scheduler.]
Resource offer = list of (node, availableResources)
E.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }
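A resource offer and a framework's response can be sketched as plain data (the names, numbers, and scheduler policy here are illustrative, not the actual Mesos API):

```python
# Sketch of the offer mechanism: the master offers per-node resources,
# and each framework's scheduler decides which to accept.

offer = [("node1", {"cpus": 2, "mem_gb": 4}),
         ("node2", {"cpus": 3, "mem_gb": 2})]

def hadoop_scheduler(offer, cpus_needed=3):
    # Hypothetical framework-specific policy: accept the first node
    # with enough CPUs for the pending task.
    for node, resources in offer:
        if resources["cpus"] >= cpus_needed:
            return node
    return None  # decline the offer and wait for a better one

print(hadoop_scheduler(offer))  # node2
```

This is the key design choice: the matching logic lives in the framework, so Mesos itself stays simple.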
Mesos Architecture
[Diagram, continued: the framework scheduler applies framework-specific scheduling to the offer and replies with tasks to launch; the Mesos slave launches and isolates the chosen executors (here a Hadoop executor alongside MPI executors).]
Deployments
• 1,000s of nodes running over a dozen production services
• Genomics researchers using Hadoop and Spark on Mesos
• Spark in use by Yahoo! Research
• Spark for analytics
• Hadoop and Spark used by machine learning researchers
Summary
• Cloud computing/datacenters are the new computer
– Emerging “Datacenter/Cloud Operating System” appearing
• Pieces of the DC/Cloud OS
– High-throughput filesystems (GFS/HDFS)
– Job frameworks (MapReduce, Apache Hadoop, Apache Spark, Pregel)
– High-level query languages (Apache Pig, Apache Hive)
– Cluster scheduling (Apache Mesos)
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 

lec24-cloud-computing.ppt

  • 1. CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing December 2, 2013 Anthony D. Joseph and John Canny http://inst.eecs.berkeley.edu/~cs162
  • 2. 24.2 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Goals for Today • Distributed systems • Cloud Computing programming paradigms • Cloud Computing OS Note: Some slides and/or pictures in the following are adapted from slides Ali Ghodsi.
  • 3. 24.3 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Background of Cloud Computing • 1990: Heyday of parallel computing, multi-processors – 52% growth in performance per year! • 2002: The thermal wall – Speed (frequency) peaks, but transistors keep shrinking • The Multicore revolution – 15-20 years later than predicted, we have hit the performance wall
  • 4. 24.4 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Sources Driving Big Data It’s All Happening On-line Every: Click Ad impression Billing event Fast Forward, pause,… Friend Request Transaction Network message Fault … User Generated (Web & Mobile) … .. Internet of Things / M2M Scientific Computing
  • 5. 24.5 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Data Deluge • Billions of users connected through the net – WWW, FB, twitter, cell phones, … – 80% of the data on FB was produced last year • Storage getting cheaper – Store more data!
  • 6. 24.6 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Data Grows Faster than Moore’s Law [Chart: projected growth, as an increase over 2010, for 2010-2015, comparing Moore's Law against Overall Data and Particle Accelerator data; data growth far outpaces Moore's Law]
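A rough back-of-envelope illustration of why data outgrows compute. The doubling periods here are assumptions for illustration (transistor counts doubling every ~18 months per Moore's Law, data volume doubling yearly), not the chart's exact figures:

```python
# Illustrative growth factors over 2010 under assumed doubling periods.
for year in range(2011, 2016):
    t = year - 2010
    moore = 2 ** (t / 1.5)   # Moore's Law: 2x every 18 months
    data = 2 ** t            # assumed: data doubles every 12 months
    print(year, f"compute {moore:.0f}x", f"data {data:.0f}x")
```

Even under these conservative assumptions, by 2015 data has grown ~32x while compute per chip has grown only ~10x.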
  • 7. 24.8 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Solving the Impedance Mismatch • Computers not getting faster, and we are drowning in data – How to resolve the dilemma? • Solution adopted by web-scale companies – Go massively distributed and parallel
  • 8. 24.9 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Enter the World of Distributed Systems • Distributed Systems/Computing – Loosely coupled set of computers, communicating through message passing, solving a common goal • Distributed computing is challenging – Dealing with partial failures (examples?) – Dealing with asynchrony (examples?) • Distributed Computing versus Parallel Computing? – distributed computing=parallel computing + partial failures
  • 9. 24.10 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Dealing with Distribution • We have seen several of the tools that help with distributed programming – Message Passing Interface (MPI) – Distributed Shared Memory (DSM) – Remote Procedure Calls (RPC) • But, distributed programming is still very hard – Programming for scale, fault-tolerance, consistency, …
  • 10. 24.11 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 The Datacenter is the new Computer • “Program” == Web search, email, map/GIS, … • “Computer” == 10,000’s computers, storage, network • Warehouse-sized facilities and workloads • Built from less reliable components than traditional datacenters
  • 11. 24.12 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Datacenter/Cloud Computing OS • If the datacenter/cloud is the new computer – What is its Operating System? – Note that we are not talking about a host OS
  • 12. 24.13 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Classical Operating Systems • Data sharing – Inter-Process Communication, RPC, files, pipes, … • Programming Abstractions – Libraries (libc), system calls, … • Multiplexing of resources – Scheduling, virtual memory, file allocation/protection, …
  • 13. 24.14 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Datacenter/Cloud Operating System • Data sharing – Google File System, key/value stores • Programming Abstractions – Google MapReduce, PIG, Hive, Spark • Multiplexing of resources – Apache projects: Mesos, YARN (MRv2), ZooKeeper, BookKeeper, …
  • 14. 24.15 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Google Cloud Infrastructure • Google File System (GFS), 2003 – Distributed File System for entire cluster – Single namespace • Google MapReduce (MR), 2004 – Runs queries/jobs on data – Manages work distribution & fault-tolerance – Colocated with file system • Apache open source versions Hadoop DFS and Hadoop MR
  • 15. 24.16 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 GFS/HDFS Insights • Petabyte storage – Files split into large blocks (128 MB) and replicated across several nodes – Big blocks allow high throughput sequential reads/writes • Data striped on hundreds/thousands of servers – Scan 100 TB on 1 node @ 50 MB/s = 24 days – Scan on 1000-node cluster = 35 minutes
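The scan-time figures on the slide follow from simple arithmetic, which can be checked directly:

```python
# Back-of-envelope check of the slide's scan-time figures.
TB = 10**12
MB = 10**6

data = 100 * TB           # 100 TB to scan
rate = 50 * MB            # 50 MB/s sequential read per node

one_node_s = data / rate                                # 2,000,000 s
print(f"1 node: {one_node_s / 86400:.1f} days")         # ~23 days
print(f"1000 nodes: {one_node_s / 1000 / 60:.1f} min")  # ~33 min
```

This matches the slide's figures of roughly 24 days versus 35 minutes.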
  • 16. 24.17 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 GFS/HDFS Insights (2) • Failures will be the norm – Mean time between failures for 1 node = 3 years – Mean time between failures for 1000 nodes = 1 day • Use commodity hardware – Failures are the norm anyway, buy cheaper hardware • No complicated consistency models – Single writer, append-only data
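The cluster-wide failure rate also follows from simple division, assuming independent node failures:

```python
# If one node fails on average every 3 years, a 1000-node cluster
# sees a failure roughly every day (assuming independent failures).
node_mtbf_days = 3 * 365
cluster_mtbf_days = node_mtbf_days / 1000
print(f"cluster MTBF: {cluster_mtbf_days:.2f} days")  # ~1.1 days
```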
  • 17. 24.18 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 MapReduce Insights • Restricted key-value model – Same fine-grained operation (Map & Reduce) repeated on big data – Operations must be deterministic – Operations must be idempotent/no side effects – Only communication is through the shuffle – Operation (Map & Reduce) output saved (on disk)
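The restricted key/value model can be sketched in a few lines: a deterministic, side-effect-free Map and Reduce, with the shuffle (grouping values by key) as the only communication between them. This is an in-process illustration of the model, not the Hadoop API:

```python
# Word count in the MapReduce model, run in a single process.
from collections import defaultdict

def map_fn(line):                 # Map: emit (key, value) pairs
    return [(w, 1) for w in line.split()]

def reduce_fn(key, values):       # Reduce: fold all values for one key
    return key, sum(values)

lines = ["the cat", "the dog"]
shuffle = defaultdict(list)
for line in lines:
    for k, v in map_fn(line):
        shuffle[k].append(v)      # shuffle: group values by key

counts = dict(reduce_fn(k, vs) for k, vs in shuffle.items())
print(counts)  # {'the': 2, 'cat': 1, 'dog': 1}
```

Because `map_fn` and `reduce_fn` are deterministic and side-effect free, a real runtime can re-execute either on another machine after a failure and get the same answer.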
  • 18. 24.19 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 What is MapReduce Used For? • At Google: – Index building for Google Search – Article clustering for Google News – Statistical machine translation • At Yahoo!: – Index building for Yahoo! Search – Spam detection for Yahoo! Mail • At Facebook: – Data mining – Ad optimization – Spam detection
  • 19. 24.20 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 MapReduce Pros • Distribution is completely transparent – Not a single line of distributed programming (ease, correctness) • Automatic fault-tolerance – Determinism enables re-running failed tasks somewhere else – Saved intermediate data enables just re-running failed reducers • Automatic scaling – As operations are side-effect free, they can be distributed to any number of machines dynamically • Automatic load-balancing – Move tasks and speculatively execute duplicate copies of slow tasks (stragglers)
  • 20. 24.21 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 MapReduce Cons • Restricted programming model – Not always natural to express problems in this model – Low-level coding necessary – Little support for iterative jobs (lots of disk access) – High-latency (batch processing) • Addressed by follow-up research – Pig and Hive for high-level coding – Spark for iterative and low-latency jobs
  • 21. 24.22 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Pig • High-level language: – Expresses sequences of MapReduce jobs – Provides relational (SQL) operators (JOIN, GROUP BY, etc) – Easy to plug in Java functions • Started at Yahoo! Research – Runs about 50% of Yahoo!’s jobs
  • 22. 24.23 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Example Problem Given user data in one file, and website data in another, find the top 5 most visited pages by users aged 18-25 Load Users Load Pages Filter by age Join on name Group on url Count clicks Order by clicks Take top 5 Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt
  • 23. 24.24 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 In MapReduce Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt
  • 24. 24.25 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 In Pig Latin Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt
    Users = load 'users' as (name, age);
    Filtered = filter Users by age >= 18 and age <= 25;
    Pages = load 'pages' as (user, url);
    Joined = join Filtered by name, Pages by user;
    Grouped = group Joined by url;
    Summed = foreach Grouped generate group, count(Joined) as clicks;
    Sorted = order Summed by clicks desc;
    Top5 = limit Sorted 5;
    store Top5 into 'top5sites';
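To make the semantics of each relational step concrete, here is a plain-Python equivalent of the Pig Latin query over small hypothetical sample data (the names and URLs are made up for illustration):

```python
# filter -> join -> group -> count -> order -> limit, in plain Python.
from collections import Counter

users = [("amy", 22), ("bob", 30), ("cat", 19)]            # (name, age)
pages = [("amy", "a.com"), ("cat", "a.com"),
         ("bob", "b.com"), ("cat", "c.com")]               # (user, url)

filtered = {name for name, age in users if 18 <= age <= 25}  # filter by age
joined = [url for user, url in pages if user in filtered]    # join on name
clicks = Counter(joined)                                     # group + count
top5 = clicks.most_common(5)                                 # order + limit
print(top5)  # [('a.com', 2), ('c.com', 1)]
```

Pig compiles the same pipeline into a chain of MapReduce jobs running over files instead of in-memory lists.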
  • 25. 24.26 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Translation to MapReduce Notice how naturally the components of the job translate into Pig Latin. Users = load … Filtered = filter … Pages = load … Joined = join … Grouped = group … Summed = … count()… Sorted = order … Top5 = limit … Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09 Load Users Load Pages Filter by age Join on name Group on url Count clicks Order by clicks Take top 5 Job 1 Job 2 Job 3
  • 26. 24.27 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Hive • Relational database built on Hadoop – Maintains table schemas – SQL-like query language (which can also call Hadoop Streaming scripts) – Supports table partitioning, complex data types, sampling, some query optimization • Developed at Facebook – Used for many Facebook jobs
  • 27. 24.28 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Spark Motivation Complex jobs, interactive queries and online processing all need one thing that MR lacks: Efficient primitives for data sharing [Diagram: an iterative job (stage 1 → 2 → 3), interactive mining (queries 1-3), and stream processing (jobs 1, 2, …)]
  • 28. 24.29 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Spark Motivation (continued) Problem: in MR, the only way to share data across jobs is using stable storage (e.g. file system) → slow!
  • 29. 24.30 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Examples [Diagram: an iterative job writes to and re-reads HDFS between every iteration; interactive queries each re-read the same input from HDFS] Opportunity: DRAM is getting cheaper → use main memory for intermediate results instead of disks
  • 30. 24.31 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Goal: In-Memory Data Sharing [Diagram: iterations and queries share intermediate results through distributed memory, with one-time input processing] 10-100× faster than network and disk
  • 31. 24.32 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Solution: Resilient Distributed Datasets (RDDs) • Partitioned collections of records that can be stored in memory across the cluster • Manipulated through a diverse set of transformations (map, filter, join, etc) • Fault recovery without costly replication – Remember the series of transformations that built an RDD (its lineage) to recompute lost data • http://spark.incubator.apache.org/
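The lineage idea can be sketched with a toy class: an "RDD" remembers how it was derived from its parent, so a lost in-memory partition can be recomputed rather than restored from a replica. This is an illustration of the concept only, not the Spark API:

```python
# Toy lineage-based recovery: recompute lost data from its derivation.
class ToyRDD:
    def __init__(self, data=None, parent=None, transform=None):
        self.parent, self.transform = parent, transform
        self._cache = data            # in-memory copy; may be lost

    def map(self, f):
        # Record the lineage (parent + transformation), compute lazily.
        return ToyRDD(parent=self, transform=lambda xs: [f(x) for x in xs])

    def collect(self):
        if self._cache is None:       # lost: recompute from lineage
            self._cache = self.transform(self.parent.collect())
        return self._cache

base = ToyRDD(data=[1, 2, 3])
squared = base.map(lambda x: x * x)
print(squared.collect())   # [1, 4, 9]
squared._cache = None      # simulate losing the in-memory partition
print(squared.collect())   # recomputed via lineage: [1, 4, 9]
```

Because transformations are deterministic, recomputation yields the same data, so no costly replication of intermediate results is needed.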
  • 32. 24.33 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013
  • 33. 24.34 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Administrivia • Project 4 – Design Doc due today (12/2) by 11:59pm – Code due next week Thu 12/12 by 11:59pm • MIDTERM #2 is this Wednesday 12/4 5:30-7pm in 145 Dwinelle (A-L) and 2060 Valley LSB (M-Z) – Covers Lectures #13-24, projects, and readings – One sheet of notes, both sides • Prof Joseph’s office hours extended tomorrow: – 10-11:30 in 449 Soda • RRR week office hours: TBA
  • 34. 24.35 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 5min Break
  • 35. 24.36 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Datacenter Scheduling Problem • Rapid innovation in datacenter computing frameworks (Pig, Dryad, Pregel, Percolator, CIEL, …) • No single framework optimal for all applications • Want to run multiple frameworks in a single datacenter – …to maximize utilization – …to share data between frameworks
  • 36. 24.37 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Where We Want to Go [Diagram: today, Hadoop, Pregel, and MPI each get a static partition of the cluster; the goal is dynamic sharing of one shared cluster]
  • 37. 24.38 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Solution: Apache Mesos [Diagram: Mesos runs Hadoop, Pregel, … across a shared pool of nodes] • Mesos is a common resource sharing layer over which diverse frameworks can run • Run multiple instances of the same framework – Isolate production and experimental jobs – Run multiple versions of the framework concurrently • Build specialized frameworks targeting particular problem domains – Better performance than general-purpose abstractions
  • 38. 24.39 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Mesos Goals • High utilization of resources • Support diverse frameworks (current & future) • Scalability to 10,000’s of nodes • Reliability in face of failures http://mesos.apache.org/ Resulting design: Small microkernel-like core that pushes scheduling logic to frameworks
  • 39. 24.40 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Mesos Design Elements •Fine-grained sharing: – Allocation at the level of tasks within a job – Improves utilization, latency, and data locality •Resource offers: – Simple, scalable application-controlled scheduling mechanism
  • 40. 24.41 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Element 1: Fine-Grained Sharing [Diagram: with coarse-grained sharing (HPC), each of frameworks 1-3 holds a static block of nodes over the storage system (e.g. HDFS); with fine-grained sharing (Mesos), tasks from all three frameworks are interleaved across every node] + Improved utilization, responsiveness, data locality
  • 41. 24.42 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Element 2: Resource Offers •Option: Global scheduler – Frameworks express needs in a specification language, global scheduler matches them to resources + Can make optimal decisions – Complex: language must support all framework needs – Difficult to scale and to make robust – Future frameworks may have unanticipated needs
  • 42. 24.43 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Element 2: Resource Offers • Mesos: Resource offers – Offer available resources to frameworks, let them pick which resources to use and which tasks to launch + Keeps Mesos simple, lets it support future frameworks - Decentralized decisions might not be optimal
  • 43. 24.44 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Mesos Architecture [Diagram: MPI and Hadoop jobs submit to their own framework schedulers; the allocation module in the Mesos master picks a framework to offer resources to; Mesos slaves run MPI executors and their tasks]
  • 44. 24.45 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Mesos Architecture (2) Resource offer = list of (node, availableResources), e.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }
  • 45. 24.46 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Mesos Architecture (3) [Diagram: the framework-specific scheduler picks tasks from the offer; the master launches and isolates executors (e.g. a Hadoop executor) on the slaves]
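The two-level scheduling idea can be sketched as a tiny simulation: the master offers each framework a list of (node, resources), and the framework, not the master, decides which tasks to launch where. Node names, task names, and the first-fit policy below are hypothetical illustrations, not Mesos code:

```python
# Hypothetical mini-simulation of Mesos-style resource offers.
offers = [("node1", {"cpus": 2, "mem_gb": 4}),
          ("node2", {"cpus": 3, "mem_gb": 2})]

def framework_scheduler(offers, task_needs):
    """Framework-side logic: accept the first offer that fits each task."""
    launched = []
    for task, need in task_needs:
        for node, res in offers:
            if res["cpus"] >= need["cpus"] and res["mem_gb"] >= need["mem_gb"]:
                res["cpus"] -= need["cpus"]      # claim the resources
                res["mem_gb"] -= need["mem_gb"]
                launched.append((task, node))
                break
    return launched

tasks = [("map-1", {"cpus": 2, "mem_gb": 1}),
         ("map-2", {"cpus": 2, "mem_gb": 1})]
print(framework_scheduler(offers, tasks))
# [('map-1', 'node1'), ('map-2', 'node2')]
```

Keeping the placement policy on the framework side is what lets the Mesos core stay small: the master only decides who gets offered resources, never how they are used.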
  • 46. 24.47 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Deployments • 1,000’s of nodes running over a dozen production services • Genomics researchers using Hadoop and Spark on Mesos • Spark in use by Yahoo! Research • Spark for analytics • Hadoop and Spark used by machine learning researchers
  • 47. 24.48 12/2/2013 Anthony D. Joseph and John Canny CS162 ©UCB Fall 2013 Summary • Cloud computing/datacenters are the new computer – Emerging “Datacenter/Cloud Operating System” appearing • Pieces of the DC/Cloud OS – High-throughput filesystems (GFS/HDFS) – Job frameworks (MapReduce, Apache Hadoop, Apache Spark, Pregel) – High-level query languages (Apache Pig, Apache Hive) – Cluster scheduling (Apache Mesos)