SlideShare a Scribd company logo
1 of 44
Cloud Computing Lecture #1
What is Cloud Computing?
(and an intro to parallel/distributed processing)
Jimmy Lin
The iSchool
University of Maryland
Wednesday, September 3, 2008
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States
See http://creativecommons.org/licenses/by-nc-sa/3.0/us/ for details
Some material adapted from slides by Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet,
Google Distributed Computing Seminar, 2007 (licensed under Creation Commons Attribution 3.0 License)
Source: http://www.free-pictures-photos.com/
The iSchool
University of Maryland
What is Cloud Computing?
1. Web-scale problems
2. Large data centers
3. Different models of computing
4. Highly-interactive Web applications
The iSchool
University of Maryland
1. Web-Scale Problems
 Characteristics:
 Definitely data-intensive
 May also be processing intensive
 Examples:
 Crawling, indexing, searching, mining the Web
 “Post-genomics” life sciences research
 Other scientific data (physics, astronomers, etc.)
 Sensor networks
 Web 2.0 applications
 …
The iSchool
University of Maryland
How much data?
 Wayback Machine has 2 PB + 20 TB/month (2006)
 Google processes 20 PB a day (2008)
 “all words ever spoken by human beings” ~ 5 EB
 NOAA has ~1 PB climate data (2007)
 CERN’s LHC will generate 15 PB a year (2008)
640K ought to be
enough for anybody.
Maximilien Brice, © CERN
Maximilien Brice, © CERN
The iSchool
University of Maryland
There’s nothing like more data!
s/inspiration/data/g;
(Banko and Brill, ACL 2001)
(Brants et al., EMNLP 2007)
The iSchool
University of Maryland
What to do with more data?
 Answering factoid questions
 Pattern matching on the Web
 Works amazingly well
 Learning relations
 Start with seed instances
 Search for patterns on the Web
 Using patterns to find more instances
Who shot Abraham Lincoln? → X shot Abraham Lincoln
Birthday-of(Mozart, 1756)
Birthday-of(Einstein, 1879)
Wolfgang Amadeus Mozart (1756 - 1791)
Einstein was born in 1879
PERSON (DATE –
PERSON was born in DATE
(Brill et al., TREC 2001; Lin, ACM TOIS 2007)
(Agichtein and Gravano, DL 2000; Ravichandran and Hovy, ACL 2002; … )
The iSchool
University of Maryland
2. Large Data Centers
 Web-scale problems? Throw more machines at it!
 Clear trend: centralization of computing resources in large
data centers
 Necessary ingredients: fiber, juice, and space
 What do Oregon, Iceland, and abandoned mines have in
common?
 Important Issues:
 Redundancy
 Efficiency
 Utilization
 Management
Source: Harper’s (Feb, 2008)
Maximilien Brice, © CERN
The iSchool
University of Maryland
Key Technology: Virtualization
Hardware
Operating System
App App App
Traditional Stack
Hardware
OS
App App App
Hypervisor
OS OS
Virtualized Stack
The iSchool
University of Maryland
3. Different Computing Models
 Utility computing
 Why buy machines when you can rent cycles?
 Examples: Amazon’s EC2, GoGrid, AppNexus
 Platform as a Service (PaaS)
 Give me nice API and take care of the implementation
 Example: Google App Engine
 Software as a Service (SaaS)
 Just run it for me!
 Example: Gmail
“Why do it yourself if you can pay someone to do it for you?”
The iSchool
University of Maryland
4. Web Applications
 A mistake on top of a hack built on sand held together by
duct tape?
 What is the nature of software applications?
 From the desktop to the browser
 SaaS == Web-based applications
 Examples: Google Maps, Facebook
 How do we deliver highly-interactive Web-based
applications?
 AJAX (asynchronous JavaScript and XML)
 For better, or for worse…
The iSchool
University of Maryland
What is the course about?
 MapReduce: the “back-end” of cloud computing
 Batch-oriented processing of large datasets
 Ajax: the “front-end” of cloud computing
 Highly-interactive Web-based applications
 Computing “in the clouds”
 Amazon’s EC2/S3 as an example of utility computing
The iSchool
University of Maryland
Amazon Web Services
 Elastic Compute Cloud (EC2)
 Rent computing resources by the hour
 Basic unit of accounting = instance-hour
 Additional costs for bandwidth
 Simple Storage Service (S3)
 Persistent storage
 Charge by the GB/month
 Additional costs for bandwidth
 You’ll be using EC2/S3 for course assignments!
The iSchool
University of Maryland
This course is not for you…
 If you’re not genuinely interested in the topic
 If you’re not ready to do a lot of programming
 If you’re not open to thinking about computing in new ways
 If you can’t cope with uncertainly, unpredictability, poor
documentation, and immature software
 If you can’t put in the time
Otherwise, this will be a richly rewarding course!
Source: http://davidzinger.wordpress.com/2007/05/page/2/
The iSchool
University of Maryland
Cloud Computing Zen
 Don’t get frustrated (take a deep breath)…
 This is bleeding edge technology
 Those W$*#T@F! moments
 Be patient…
 This is the second first time I’ve taught this course
 Be flexible…
 There will be unanticipated issues along the way
 Be constructive…
 Tell me how I can make everyone’s experience better
Source: Wikipedia
Source: Wikipedia
Source: Wikipedia
Source: Wikipedia
The iSchool
University of Maryland
Things to go over…
 Course schedule
 Assignments and deliverables
 Amazon EC2/S3
The iSchool
University of Maryland
Web-Scale Problems?
 Don’t hold your breath:
 Biocomputing
 Nanocomputing
 Quantum computing
 …
 It all boils down to…
 Divide-and-conquer
 Throwing more hardware at the problem
Simple to understand… a lifetime to master…
The iSchool
University of Maryland
Divide and Conquer
“Work”
w1 w2 w3
r1 r2 r3
“Result”
“worker” “worker” “worker”
Partition
Combine
The iSchool
University of Maryland
Different Workers
 Different threads in the same core
 Different cores in the same CPU
 Different CPUs in a multi-processor system
 Different machines in a distributed system
The iSchool
University of Maryland
Choices, Choices, Choices
 Commodity vs. “exotic” hardware
 Number of machines vs. processor vs. cores
 Bandwidth of memory vs. disk vs. network
 Different programming models
The iSchool
University of Maryland
Flynn’s Taxonomy
Instructions
Single (SI) Multiple (MI)
Data
Multiple(M
SISD
Single-threaded
process
MISD
Pipeline
architecture
SIMD
Vector Processing
MIMD
Multi-threaded
Programming
Single(SD)
The iSchool
University of Maryland
SISD
D D D D D D D
Processor
Instructions
The iSchool
University of Maryland
SIMD
D0
Processor
Instructions
D0D0 D0 D0 D0
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D1
D2
D3
D4
…
Dn
D0
The iSchool
University of Maryland
MIMD
D D D D D D D
Processor
Instructions
D D D D D D D
Processor
Instructions
The iSchool
University of Maryland
Memory Typology: Shared
Memory
Processor
Processor Processor
Processor
The iSchool
University of Maryland
Memory Typology: Distributed
MemoryProcessor MemoryProcessor
MemoryProcessor MemoryProcessor
Network
The iSchool
University of Maryland
Memory Typology: Hybrid
Memory
Processor
Network
Processor
Memory
Processor
Processor
Memory
Processor
Processor
Memory
Processor
Processor
The iSchool
University of Maryland
Parallelization Problems
 How do we assign work units to workers?
 What if we have more work units than workers?
 What if workers need to share partial results?
 How do we aggregate partial results?
 How do we know all the workers have finished?
 What if workers die?
What is the common theme of all of these problems?
The iSchool
University of Maryland
General Theme?
 Parallelization problems arise from:
 Communication between workers
 Access to shared resources (e.g., data)
 Thus, we need a synchronization system!
 This is tricky:
 Finding bugs is hard
 Solving bugs is even harder
The iSchool
University of Maryland
Managing Multiple Workers
 Difficult because
 (Often) don’t know the order in which workers run
 (Often) don’t know where the workers are running
 (Often) don’t know when workers interrupt each other
 Thus, we need:
 Semaphores (lock, unlock)
 Conditional variables (wait, notify, broadcast)
 Barriers
 Still, lots of problems:
 Deadlock, livelock, race conditions, ...
 Moral of the story: be careful!
 Even trickier if the workers are on different machines
The iSchool
University of Maryland
Patterns for Parallelism
 Parallel computing has been around for decades
 Here are some “design patterns” …
The iSchool
University of Maryland
Master/Slaves
slaves
master
The iSchool
University of Maryland
Producer/Consumer Flow
CP
P
P
C
C
CP
P
P
C
C
The iSchool
University of Maryland
Work Queues
CP
P
P
C
C
shared queue
W W W W W
The iSchool
University of Maryland
Rubber Meets Road
 From patterns to implementation:
 pthreads, OpenMP for multi-threaded programming
 MPI for clustering computing
 …
 The reality:
 Lots of one-off solutions, custom code
 Write you own dedicated library, then program with it
 Burden on the programmer to explicitly manage everything
 MapReduce to the rescue!
 (for next time)

More Related Content

Similar to Session1

ccna course 2
ccna course 2ccna course 2
ccna course 2S Sridhar
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodDuncan Hull
 
MeDiCI - How to Withstand a Research Data Tsunami
MeDiCI - How to Withstand a Research Data TsunamiMeDiCI - How to Withstand a Research Data Tsunami
MeDiCI - How to Withstand a Research Data Tsunamiinside-BigData.com
 
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data TutorialESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorialeswcsummerschool
 
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...Dr. Aparna Varde
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introductionbutest
 
A New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedA New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedPaco Nathan
 
Data Science in Future Tense
Data Science in Future TenseData Science in Future Tense
Data Science in Future TensePaco Nathan
 
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseDebunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseStavros Papadopoulos
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: OverviewGeoffrey Fox
 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22marpierc
 
Above the Clouds: A View From Academia
Above the Clouds: A View From AcademiaAbove the Clouds: A View From Academia
Above the Clouds: A View From AcademiaEduserv
 
Industry Vs Curriculum Talk Mec
Industry Vs Curriculum Talk MecIndustry Vs Curriculum Talk Mec
Industry Vs Curriculum Talk Mectej_arora
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7CS, NcState
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingPaco Nathan
 
Ibm and innovation overview 20150326 v15 short
Ibm and innovation overview 20150326 v15 shortIbm and innovation overview 20150326 v15 short
Ibm and innovation overview 20150326 v15 shortISSIP
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionDarian Frajberg
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...San Diego Supercomputer Center
 

Similar to Session1 (20)

ccna course 2
ccna course 2ccna course 2
ccna course 2
 
eScience: A Transformed Scientific Method
eScience: A Transformed Scientific MethodeScience: A Transformed Scientific Method
eScience: A Transformed Scientific Method
 
MeDiCI - How to Withstand a Research Data Tsunami
MeDiCI - How to Withstand a Research Data TsunamiMeDiCI - How to Withstand a Research Data Tsunami
MeDiCI - How to Withstand a Research Data Tsunami
 
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data TutorialESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
 
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...
The Hidden Web, XML and the Semantic Web: A Scientific Data Management Perspe...
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introduction
 
A New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedA New Year in Data Science: ML Unpaused
A New Year in Data Science: ML Unpaused
 
ppt
pptppt
ppt
 
Big Data Analytics V2
Big Data Analytics V2Big Data Analytics V2
Big Data Analytics V2
 
Data Science in Future Tense
Data Science in Future TenseData Science in Future Tense
Data Science in Future Tense
 
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseDebunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: Overview
 
Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22
 
Above the Clouds: A View From Academia
Above the Clouds: A View From AcademiaAbove the Clouds: A View From Academia
Above the Clouds: A View From Academia
 
Industry Vs Curriculum Talk Mec
Industry Vs Curriculum Talk MecIndustry Vs Curriculum Talk Mec
Industry Vs Curriculum Talk Mec
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely heading
 
Ibm and innovation overview 20150326 v15 short
Ibm and innovation overview 20150326 v15 shortIbm and innovation overview 20150326 v15 short
Ibm and innovation overview 20150326 v15 short
 
Introduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolutionIntroduction to the Artificial Intelligence and Computer Vision revolution
Introduction to the Artificial Intelligence and Computer Vision revolution
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
 

More from Van Pham

Thi cong da hoa cuong o tphcm thien loc phat
Thi cong da hoa cuong o tphcm thien loc phatThi cong da hoa cuong o tphcm thien loc phat
Thi cong da hoa cuong o tphcm thien loc phatVan Pham
 
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc Decal
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc DecalCửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc Decal
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc DecalVan Pham
 
Giao trinh co so du lieu can ban
Giao trinh co so du lieu can banGiao trinh co so du lieu can ban
Giao trinh co so du lieu can banVan Pham
 
Lect15 cloud
Lect15 cloudLect15 cloud
Lect15 cloudVan Pham
 
172506 633746925739945000
172506 633746925739945000172506 633746925739945000
172506 633746925739945000Van Pham
 
Bao cao thuc tap - Điện toán đám mây
Bao cao thuc tap - Điện toán đám mâyBao cao thuc tap - Điện toán đám mây
Bao cao thuc tap - Điện toán đám mâyVan Pham
 
Bai 02 active directory
Bai 02   active directoryBai 02   active directory
Bai 02 active directoryVan Pham
 
Gioi thieu va cac lenh tren console
Gioi thieu va cac lenh tren consoleGioi thieu va cac lenh tren console
Gioi thieu va cac lenh tren consoleVan Pham
 
Bai 08 quan ly in an
Bai 08   quan ly in anBai 08   quan ly in an
Bai 08 quan ly in anVan Pham
 
Bai 07 tao quan ly thu muc
Bai 07   tao quan ly thu mucBai 07   tao quan ly thu muc
Bai 07 tao quan ly thu mucVan Pham
 
Bai 06 quan ly dia
Bai 06   quan ly diaBai 06   quan ly dia
Bai 06 quan ly diaVan Pham
 
Bai 05 chinh sach nhom
Bai 05   chinh sach nhomBai 05   chinh sach nhom
Bai 05 chinh sach nhomVan Pham
 
Bai 04 chinh sach he thong
Bai 04   chinh sach he thongBai 04   chinh sach he thong
Bai 04 chinh sach he thongVan Pham
 
Bai 03 quan ly tai khoan nguoi dung
Bai 03   quan ly tai khoan nguoi dungBai 03   quan ly tai khoan nguoi dung
Bai 03 quan ly tai khoan nguoi dungVan Pham
 
Bai 01 gioi thieu cai dat
Bai 01   gioi thieu cai datBai 01   gioi thieu cai dat
Bai 01 gioi thieu cai datVan Pham
 
Bai12 too ls-kiemtra-ktrpm@softtesting-nntu
Bai12 too ls-kiemtra-ktrpm@softtesting-nntuBai12 too ls-kiemtra-ktrpm@softtesting-nntu
Bai12 too ls-kiemtra-ktrpm@softtesting-nntuVan Pham
 
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntu
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntuBai11 quan ly-kiemtra-ktrpm@softtesting-nntu
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntuVan Pham
 
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntu
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntuBai10 lap tailieukiemtra-k-trpm@softtesting-nntu
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntuVan Pham
 

More from Van Pham (20)

Thi cong da hoa cuong o tphcm thien loc phat
Thi cong da hoa cuong o tphcm thien loc phatThi cong da hoa cuong o tphcm thien loc phat
Thi cong da hoa cuong o tphcm thien loc phat
 
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc Decal
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc DecalCửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc Decal
Cửa hàng bán đồ chơi xe máy ở TPHCM - Hoàng Phúc Decal
 
Giao trinh co so du lieu can ban
Giao trinh co so du lieu can banGiao trinh co so du lieu can ban
Giao trinh co so du lieu can ban
 
Avl tree
Avl treeAvl tree
Avl tree
 
Quy tắc
Quy tắcQuy tắc
Quy tắc
 
Lect15 cloud
Lect15 cloudLect15 cloud
Lect15 cloud
 
172506 633746925739945000
172506 633746925739945000172506 633746925739945000
172506 633746925739945000
 
Bao cao thuc tap - Điện toán đám mây
Bao cao thuc tap - Điện toán đám mâyBao cao thuc tap - Điện toán đám mây
Bao cao thuc tap - Điện toán đám mây
 
Bai 02 active directory
Bai 02   active directoryBai 02   active directory
Bai 02 active directory
 
Gioi thieu va cac lenh tren console
Gioi thieu va cac lenh tren consoleGioi thieu va cac lenh tren console
Gioi thieu va cac lenh tren console
 
Bai 08 quan ly in an
Bai 08   quan ly in anBai 08   quan ly in an
Bai 08 quan ly in an
 
Bai 07 tao quan ly thu muc
Bai 07   tao quan ly thu mucBai 07   tao quan ly thu muc
Bai 07 tao quan ly thu muc
 
Bai 06 quan ly dia
Bai 06   quan ly diaBai 06   quan ly dia
Bai 06 quan ly dia
 
Bai 05 chinh sach nhom
Bai 05   chinh sach nhomBai 05   chinh sach nhom
Bai 05 chinh sach nhom
 
Bai 04 chinh sach he thong
Bai 04   chinh sach he thongBai 04   chinh sach he thong
Bai 04 chinh sach he thong
 
Bai 03 quan ly tai khoan nguoi dung
Bai 03   quan ly tai khoan nguoi dungBai 03   quan ly tai khoan nguoi dung
Bai 03 quan ly tai khoan nguoi dung
 
Bai 01 gioi thieu cai dat
Bai 01   gioi thieu cai datBai 01   gioi thieu cai dat
Bai 01 gioi thieu cai dat
 
Bai12 too ls-kiemtra-ktrpm@softtesting-nntu
Bai12 too ls-kiemtra-ktrpm@softtesting-nntuBai12 too ls-kiemtra-ktrpm@softtesting-nntu
Bai12 too ls-kiemtra-ktrpm@softtesting-nntu
 
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntu
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntuBai11 quan ly-kiemtra-ktrpm@softtesting-nntu
Bai11 quan ly-kiemtra-ktrpm@softtesting-nntu
 
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntu
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntuBai10 lap tailieukiemtra-k-trpm@softtesting-nntu
Bai10 lap tailieukiemtra-k-trpm@softtesting-nntu
 

Recently uploaded

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 

Recently uploaded (20)

Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 

Session1

  • 1. Cloud Computing Lecture #1 What is Cloud Computing? (and an intro to parallel/distributed processing) Jimmy Lin The iSchool University of Maryland Wednesday, September 3, 2008 This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States See http://creativecommons.org/licenses/by-nc-sa/3.0/us/ for details Some material adapted from slides by Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creation Commons Attribution 3.0 License)
  • 3. The iSchool University of Maryland What is Cloud Computing? 1. Web-scale problems 2. Large data centers 3. Different models of computing 4. Highly-interactive Web applications
  • 4. The iSchool University of Maryland 1. Web-Scale Problems  Characteristics:  Definitely data-intensive  May also be processing intensive  Examples:  Crawling, indexing, searching, mining the Web  “Post-genomics” life sciences research  Other scientific data (physics, astronomers, etc.)  Sensor networks  Web 2.0 applications  …
  • 5. The iSchool University of Maryland How much data?  Wayback Machine has 2 PB + 20 TB/month (2006)  Google processes 20 PB a day (2008)  “all words ever spoken by human beings” ~ 5 EB  NOAA has ~1 PB climate data (2007)  CERN’s LHC will generate 15 PB a year (2008) 640K ought to be enough for anybody.
  • 8. The iSchool University of Maryland There’s nothing like more data! s/inspiration/data/g; (Banko and Brill, ACL 2001) (Brants et al., EMNLP 2007)
  • 9. The iSchool University of Maryland What to do with more data?  Answering factoid questions  Pattern matching on the Web  Works amazingly well  Learning relations  Start with seed instances  Search for patterns on the Web  Using patterns to find more instances Who shot Abraham Lincoln? → X shot Abraham Lincoln Birthday-of(Mozart, 1756) Birthday-of(Einstein, 1879) Wolfgang Amadeus Mozart (1756 - 1791) Einstein was born in 1879 PERSON (DATE – PERSON was born in DATE (Brill et al., TREC 2001; Lin, ACM TOIS 2007) (Agichtein and Gravano, DL 2000; Ravichandran and Hovy, ACL 2002; … )
  • 10. The iSchool University of Maryland 2. Large Data Centers  Web-scale problems? Throw more machines at it!  Clear trend: centralization of computing resources in large data centers  Necessary ingredients: fiber, juice, and space  What do Oregon, Iceland, and abandoned mines have in common?  Important Issues:  Redundancy  Efficiency  Utilization  Management
  • 13. The iSchool University of Maryland Key Technology: Virtualization Hardware Operating System App App App Traditional Stack Hardware OS App App App Hypervisor OS OS Virtualized Stack
  • 14. The iSchool University of Maryland 3. Different Computing Models  Utility computing  Why buy machines when you can rent cycles?  Examples: Amazon’s EC2, GoGrid, AppNexus  Platform as a Service (PaaS)  Give me nice API and take care of the implementation  Example: Google App Engine  Software as a Service (SaaS)  Just run it for me!  Example: Gmail “Why do it yourself if you can pay someone to do it for you?”
  • 15. The iSchool University of Maryland 4. Web Applications  A mistake on top of a hack built on sand held together by duct tape?  What is the nature of software applications?  From the desktop to the browser  SaaS == Web-based applications  Examples: Google Maps, Facebook  How do we deliver highly-interactive Web-based applications?  AJAX (asynchronous JavaScript and XML)  For better, or for worse…
  • 16. The iSchool University of Maryland What is the course about?  MapReduce: the “back-end” of cloud computing  Batch-oriented processing of large datasets  Ajax: the “front-end” of cloud computing  Highly-interactive Web-based applications  Computing “in the clouds”  Amazon’s EC2/S3 as an example of utility computing
  • 17. The iSchool University of Maryland Amazon Web Services  Elastic Compute Cloud (EC2)  Rent computing resources by the hour  Basic unit of accounting = instance-hour  Additional costs for bandwidth  Simple Storage Service (S3)  Persistent storage  Charge by the GB/month  Additional costs for bandwidth  You’ll be using EC2/S3 for course assignments!
  • 18. The iSchool University of Maryland This course is not for you…  If you’re not genuinely interested in the topic  If you’re not ready to do a lot of programming  If you’re not open to thinking about computing in new ways  If you can’t cope with uncertainly, unpredictability, poor documentation, and immature software  If you can’t put in the time Otherwise, this will be a richly rewarding course!
  • 20. The iSchool University of Maryland Cloud Computing Zen  Don’t get frustrated (take a deep breath)…  This is bleeding edge technology  Those W$*#T@F! moments  Be patient…  This is the second first time I’ve taught this course  Be flexible…  There will be unanticipated issues along the way  Be constructive…  Tell me how I can make everyone’s experience better
  • 25. The iSchool University of Maryland Things to go over…  Course schedule  Assignments and deliverables  Amazon EC2/S3
  • 26. The iSchool University of Maryland Web-Scale Problems?  Don’t hold your breath:  Biocomputing  Nanocomputing  Quantum computing  …  It all boils down to…  Divide-and-conquer  Throwing more hardware at the problem Simple to understand… a lifetime to master…
  • 27. The iSchool University of Maryland Divide and Conquer “Work” w1 w2 w3 r1 r2 r3 “Result” “worker” “worker” “worker” Partition Combine
  • 28. The iSchool University of Maryland Different Workers  Different threads in the same core  Different cores in the same CPU  Different CPUs in a multi-processor system  Different machines in a distributed system
  • 29. The iSchool University of Maryland Choices, Choices, Choices  Commodity vs. “exotic” hardware  Number of machines vs. processor vs. cores  Bandwidth of memory vs. disk vs. network  Different programming models
  • 30. The iSchool University of Maryland Flynn’s Taxonomy Instructions Single (SI) Multiple (MI) Data Multiple(M SISD Single-threaded process MISD Pipeline architecture SIMD Vector Processing MIMD Multi-threaded Programming Single(SD)
  • 31. The iSchool University of Maryland SISD D D D D D D D Processor Instructions
  • 32. The iSchool University of Maryland SIMD D0 Processor Instructions D0D0 D0 D0 D0 D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D1 D2 D3 D4 … Dn D0
  • 33. The iSchool University of Maryland MIMD D D D D D D D Processor Instructions D D D D D D D Processor Instructions
  • 34. The iSchool University of Maryland Memory Typology: Shared Memory Processor Processor Processor Processor
  • 35. The iSchool University of Maryland Memory Typology: Distributed MemoryProcessor MemoryProcessor MemoryProcessor MemoryProcessor Network
  • 36. The iSchool University of Maryland Memory Typology: Hybrid Memory Processor Network Processor Memory Processor Processor Memory Processor Processor Memory Processor Processor
  • 37. The iSchool University of Maryland Parallelization Problems  How do we assign work units to workers?  What if we have more work units than workers?  What if workers need to share partial results?  How do we aggregate partial results?  How do we know all the workers have finished?  What if workers die? What is the common theme of all of these problems?
  • 38. The iSchool University of Maryland General Theme?  Parallelization problems arise from:  Communication between workers  Access to shared resources (e.g., data)  Thus, we need a synchronization system!  This is tricky:  Finding bugs is hard  Solving bugs is even harder
  • 39. The iSchool University of Maryland Managing Multiple Workers  Difficult because  (Often) don’t know the order in which workers run  (Often) don’t know where the workers are running  (Often) don’t know when workers interrupt each other  Thus, we need:  Semaphores (lock, unlock)  Conditional variables (wait, notify, broadcast)  Barriers  Still, lots of problems:  Deadlock, livelock, race conditions, ...  Moral of the story: be careful!  Even trickier if the workers are on different machines
  • 40. The iSchool University of Maryland Patterns for Parallelism  Parallel computing has been around for decades  Here are some “design patterns” …
  • 41. The iSchool University of Maryland Master/Slaves slaves master
  • 42. The iSchool University of Maryland Producer/Consumer Flow CP P P C C CP P P C C
  • 43. The iSchool University of Maryland Work Queues CP P P C C shared queue W W W W W
  • 44. The iSchool University of Maryland Rubber Meets Road  From patterns to implementation:  pthreads, OpenMP for multi-threaded programming  MPI for clustering computing  …  The reality:  Lots of one-off solutions, custom code  Write you own dedicated library, then program with it  Burden on the programmer to explicitly manage everything  MapReduce to the rescue!  (for next time)