SlideShare a Scribd company logo
By 邓侃 Ph.D
dengkan@smartclouder.com
02/28/2012
2 / 30
1 Billion US$




                                                                              Mobile
                                                                              Internet




• 5 phases of computing growth, since 1960’s.
  1. Main-frame, 2. Minicomputer, 3. PC, 4. Internet, 5. Mobile Internet.
• Every phase, the total amount of user-time, increased 10 times.
  The sum of the top 5 companies’ market value increased 10 times every phase.
• With mobile internet, the big amount of user-times, induces big data。
  The technical challenge is how to deal with big data.
• The solution to the big data challenge, is cloud computing.


                                                                                 3 / 30
Intel Pentium4 CPU’s power is
 10,000 MIPS
 MIPS: Million Instructions Per Second.




• 1965, Moore’s Law:
  The number of transistors in IC doubles every 2 years, or even 18 months.
• Still, the power of a single CPU, cannot beat the human brain power.
  Solution: use many computers.
• Challenge, to orchestrate many computers working together.


                                                                              4 / 30
Google’s initial cloud

• Cloud computing can be built
 with commodity PC servers.
• The most successful cloud so far, was by two graduate students.
  Larry Page from University of Maryland, (北航 in the US).
  Sergey Brin from UIUC, (北邮 in the US).

                                                                    5 / 30
Sergey Brin & Larry Page                               Andy Bechtolsheim


• Sergey and Larry wanted to build a search engine.
  Need the power of super-computer,
  to store every webpage, of every website, globally, every historic version.
  And to process the big data, to build search index.
• Raised fund from Andy Bechtolsheim, in 1997.
  Andy, CMU alumni, cofounder of Sun Microsystems, very rich.
• But Andy only gave them 100K US$.
 The most successful investment, but also the most stupid one.
                                                                                        6 / 30
• Why was Andy not positive on Google?
  4 technical difficulties.
  The two boys might not have the skillset.
• Scalability:
  Big storage space for big data, Googol (10^100) scale!
  Big paralleled computing to process them.
  Never succeeded in human’s history.
  And the data is increased every second.
                                                                  Andy Bechtolsheim
• Reliability:
  Using commodity machines,
  One single machine’s failure should not break down the entire system.
• Elasticity:
  The load fluctuation on different modules are different,
  Schedule the same machines, to work for different modules at different time.
• Security:
  Dynamically separate the machines into clusters, mutually inaccessible.



                                                                                      7 / 30
• Sergey and Larry’s answer was,
 “O, yah, our company’s name is Google!
  We deal with big data.”
• Google runs the world’s largest cloud,
  for 15 years continuously, reliably.




                                           8 / 30
9 / 30
• Scalability: add more machines, without modify the current system.
• Twitter was launched in May 2006.
  Dec 2007, Twitter users increased to 66K.
  Dec 2008, Twitter users grew to 5 millions.
  April 2009, over 100 million.
• Weibo was launched in Sept 2009.
  Nov 2009, Weibo users increased to 1 million.
  April 2010, over 10 million.
  Aug 2010, over 30 million.
  Oct 2010, over 50 million.
• China’s population
  makes itself the best test-bed
  of cloud computing technology.




                                                                       10 / 30
• Reliability: one single machine’s failure, don’t break down the entire system.
• Oct 29, 2009, T-mall kicked-off 50% discount.
• Half hour after the event started,
 支付宝 slowed down significantly.
  Another half hour later, the service shut down.
  One hour later, the service recovered.
• During the one hour that service was down,
  billion yuan’s business was lost.




                                                                                   11 / 30
• Elasticity: use the same machines, for different business, at different time.
• Does 支付宝 need to keep the huge amount of machines,
  only to prepare for the annual sales? NO!
• Superbowl is the most popular sport event in the US.
  During the game, Twitter’s load is 40% higher than the usual one.
  During the exciting moment,
  Twitter’s load is 150% higher than usual .
• But unlike 支付宝,
  Twitter doesn’t keep a lot of machines.
  Twitter borrowed machines
  temporarily from a third-party.
• A lesson learned from Twitter,
  to dynamically allocate machines,
  among different business,
  automatically,
  in real-time.

                                                                                  12 / 30
• Security: prevent data leak.
• Cloud can contain multiple business.
• Each business runs in its own LAN.
  Mutually inaccessible.




                                         13 / 30
14 / 30
• Data flow and control:
 push the cloud to run faster.
• Anatomy of Twitter.
• Cache for fast read.
• Queue for async tasks.
• Pub/Sub for messaging.




                                 15 / 30
• Distributed File System:
  Scalable file storage.
• Google File System.
  (Hadoop HDFS)
• Master and Namespace.
• Chunk vs. File
• Replica vs. Fragmentations




                               16 / 30
• Distributed Database:
  Scalable database.
• Google Bigtable.
  (Hadoop HBase)
• Distributed Index.
• Distributed ACID Transaction.
• Distributed lock.




                                  17 / 30
• Distributed Lock:
 Guarantee multiple read single write
 in distributed system.
• With replica,
 each data one lock or plural.
• How to deal with inconsistency?
• How to raise master,
 by Paxos protocol?




                                        18 / 30
• No-SQL Database:
  Make database more efficient.
• No relational, but only key-value.
• No index, but algorithm.
• No SQL language.
• Easier to add machine.




                                       19 / 30
• Paralleled computing:
  Process big data by divide and conquer
• Google’s MapReduce
• Not a panacea, case study.




                                           20 / 30
• Virtual Machine:
 Run multiple OSes on single machine.
• Separate modules,
 Present bugs and virus from infecting.
• Dynamically allocate resource.




                                          21 / 30
• VLAN:
 Regardless physical locations,
 multiple machine operate as if in the same network domain.
• VLAN vs. VPN
• Group machines in different regions
 as in one LAN.
• Separate machines in the same LAN,
 into different groups,
 mutually inaccessible.




                                                              22 / 30
• Traffic Monitoring and Network Topology.
  Construct the entire cloud system.




                                             23 / 30
• Future trends:
 smaller, bigger, faster, easier.
• One chip with 48 CPUs.
• Data-center TCP.
• Cloud in RAM.
• Erlang, PigLatin:
 languages for cloud computing.




                                    24 / 30
25 / 30
1. 2/28, 18:00pm - 20:00pm, Tuesday,
Introduction to clouding computer? Why cloud, what to do, and how to do?
Homework: Construct a simple 3-tier website.

2. 3/6, 18:00pm - 20:00pm, Tuesday,
Cluster-based scalable network services, SOA.
Homework: Learn to use THRIFT and MemCached to implement a messaging system.

3. 3/13, 18:00pm - 20:00pm, Tuesday,
Scalable file system, Google file system.
Homework: Learn to use SWIFT file system.

4. 3/20, 18:00pm - 20:00pm, Tuesday,                                   • Syllabus.
Distributed RDBMS database, Google Bigtable.
Homework: Learn to use Hadoop HBase.                                   • Core cloud techniques.
5. 3/27, 18:00pm - 20:00pm, Tuesday,                                       Understand principles,
Invited seminar: Baidu.
                                                                       • Learn how to use,
6. 4/3, 18:00pm - 20:00pm, Tuesday,
                                                                           but not re-implement.
Distributed Locking system, Paxos and Google Chubby.
Homework: Learn to use Hadoop ZooKeeper                                    (that is for advanced courses)
7. 4/10, 18:00pm - 20:00pm, Tuesday,
Distributed NO-SQL Database.
Homework: Learn to use Facebook Cassandra.

8. 4/17, 18:00pm - 20:00pm, Tuesday,
Paralleled computation, Google MapReduce.
Homework: Learn to use Hadoop MapReduce.

                                                                                                      26 / 30
9. 4/24, 18:00pm - 20:00pm, Tuesday,                                    • Syllabus.
Invited Seminar: Taobao.
                                                                        • Core cloud techniques.
10. 5/1, 18:00pm - 20:00pm, Tuesday,
Virtual Machine for dynamic resource allocation.                             Understand principles,
Homework: Learn to use KVM.
                                                                        • Learn how to use,
11. 5/8, 18:00pm - 20:00pm, Tuesday,
Cloud security and VLAN.                                                     but not re-implement.
Homework: TBD
                                                                             (that is for advanced courses)
12. 5/15, 18:00pm - 20:00pm, Tuesday,
Invited seminar: EMC/VMWare.

13. 5/22, 18:00pm - 20:00pm, Tuesday,
Datacenter network topology and traffic management.
Homework: Learn to use Zookeeper.

14. 5/29, 18:00pm - 20:00pm, Tuesday,
Invited seminar: Google.

15. 6/5, 18:00pm - 20:00pm, Tuesday,
Future Trend:
 Bigger: Datacenter as a warehouse-scale computer, Datacenter needs an OS.
 Smaller: Multicore CPU and GPUs.
 Faster: In-Memory Framework, Piccolo an Spark.
 Easier: Erlang, PigLatin language.

16. 6/12, 18:00pm - 20:00pm, Tuesday,
Invited seminar: CloudValley.


                                                                                                       27 / 30
• Invited seminars.
• Top cloud players will be your teachers.
• Diverse opinions, also deviated from theory,
  and why?
• Scheduled for mid-term & final exam periods,
  and no homework!




                                                 28 / 30
• Homework.
• Homework: 50%
 Mid-term exam: 20%
 Final exam: 30%
• You will be able to build a cloud!
  Not just Hadoop, and beyond.




                                       29 / 30
• No stupid questions, but it is stupid if not ask!
• Ask a good question, and impress your professor and classmates!
• 新浪微博: @北航云计算公开课


                                                                    30 / 30

More Related Content

Similar to 01 introduction to cloud computing technology

Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
Ankit Gupta
 
vssutcloud computing.pptx
vssutcloud computing.pptxvssutcloud computing.pptx
vssutcloud computing.pptx
MunmunSaha7
 
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWSppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
vij7027
 
What ya gonna do?
What ya gonna do?What ya gonna do?
What ya gonna do?
CQD
 
Big Data in Action : Operations, Analytics and more
Big Data in Action : Operations, Analytics and moreBig Data in Action : Operations, Analytics and more
Big Data in Action : Operations, Analytics and more
Softweb Solutions
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoopMohit Tare
 
Adoption of Cloud Computing in Scientific Research
Adoption of Cloud Computing in Scientific ResearchAdoption of Cloud Computing in Scientific Research
Adoption of Cloud Computing in Scientific Research
Yehia El-khatib
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
Nagarjuna D.N
 
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
DataCentred
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB Launch
Tom Connor
 
node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute EngineArun Nagarajan
 
Cloud computingjun28
Cloud computingjun28Cloud computingjun28
Cloud computingjun28
Aravindharamanan S
 
cloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdfcloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdf
ArchanaPandiyan
 
Cloud Computing & Distributed Computing
Cloud Computing & Distributed ComputingCloud Computing & Distributed Computing
Information system architecture
Information system architectureInformation system architecture
Information system architecture
Dr. Vardhan choubey
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop Introduction
Jayant Mukherjee
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Larry Smarr
 

Similar to 01 introduction to cloud computing technology (20)

Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
 
vssutcloud computing.pptx
vssutcloud computing.pptxvssutcloud computing.pptx
vssutcloud computing.pptx
 
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWSppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
ppt vdo stream cloud comp.ppt Cloud computing with the help of AWS
 
What ya gonna do?
What ya gonna do?What ya gonna do?
What ya gonna do?
 
Big Data in Action : Operations, Analytics and more
Big Data in Action : Operations, Analytics and moreBig Data in Action : Operations, Analytics and more
Big Data in Action : Operations, Analytics and more
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Adoption of Cloud Computing in Scientific Research
Adoption of Cloud Computing in Scientific ResearchAdoption of Cloud Computing in Scientific Research
Adoption of Cloud Computing in Scientific Research
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
 
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB Launch
 
node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute Engine
 
Cloud computingjun28
Cloud computingjun28Cloud computingjun28
Cloud computingjun28
 
Cloud computingjun28
Cloud computingjun28Cloud computingjun28
Cloud computingjun28
 
cloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdfcloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdf
 
Cloud Computing & Distributed Computing
Cloud Computing & Distributed ComputingCloud Computing & Distributed Computing
Cloud Computing & Distributed Computing
 
Information system architecture
Information system architectureInformation system architecture
Information system architecture
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop Introduction
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
 
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
Panel: Open Infrastructure for an Open Society: OSG, Commercial Clouds, and B...
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 

01 introduction to cloud computing technology

  • 3. 1 Billion US$ Mobile Internet • 5 phases of computing growth, since 1960’s. 1. Main-frame, 2. Minicomputer, 3. PC, 4. Internet, 5. Mobile Internet. • Every phase, the total amount of user-time, increased 10 times. The sum of the top 5 companies’ market value increased 10 times every phase. • With mobile internet, the big amount of user-times, induces big data。 The technical challenge is how to deal with big data. • The solution to the big data challenge, is cloud computing. 3 / 30
  • 4. Intel Pentium4 CPU’s power is 10,000 MIPS MIPS: Million Instructions Per Second. • 1965, Moore’s Law: The number of transistors in IC doubles every 2 years, or even 18 months. • Still, the power of a single CPU, cannot beat the human brain power. Solution: use many computers. • Challenge, to orchestrate many computers working together. 4 / 30
  • 5. Google’s initial cloud • Cloud computing can be built with commodity PC servers. • The most successful cloud so far, was by two graduate students. Larry Page from University of Maryland, (北航 in the US). Sergey Brin from UIUC, (北邮 in the US). 5 / 30
  • 6. Sergey Brin & Larry Page Andy Bechtolsheim • Sergey and Larry wanted to build a search engine. Need the power of super-computer, to store every webpage, of every website, globally, every historic version. And to process the big data, to build search index. • Raised fund from Andy Bechtolsheim, in 1997. Andy, CMU alumni, cofounder of Sun Microsystems, very rich. • But Andy only gave them 100K US$. The most successful investment, but also the most stupid one. 6 / 30
  • 7. • Why was Andy not positive on Google? 4 technical difficulties. The two boys might not have the skillset. • Scalability: Big storage space for big data, Googol (10^100) scale! Big paralleled computing to process them. Never succeeded in human’s history. And the data is increased every second. Andy Bechtolsheim • Reliability: Using commodity machines, One single machine’s failure should not break down the entire system. • Elasticity: The load fluctuation on different modules are different, Schedule the same machines, to work for different modules at different time. • Security: Dynamically separate the machines into clusters, mutually inaccessible. 7 / 30
  • 8. • Sergey and Larry’s answer was, “O, yah, our company’s name is Google! We deal with big data.” • Google runs the world’s largest cloud, for 15 years continuously, reliably. 8 / 30
  • 10. • Scalability: add more machines, without modify the current system. • Twitter was launched in May 2006. Dec 2007, Twitter users increased to 66K. Dec 2008, Twitter users grew to 5 millions. April 2009, over 100 million. • Weibo was launched in Sept 2009. Nov 2009, Weibo users increased to 1 million. April 2010, over 10 million. Aug 2010, over 30 million. Oct 2010, over 50 million. • China’s population makes itself the best test-bed of cloud computing technology. 10 / 30
  • 11. • Reliability: one single machine’s failure, don’t break down the entire system. • Oct 29, 2009, T-mall kicked-off 50% discount. • Half hour after the event started, 支付宝 slowed down significantly. Another half hour later, the service shut down. One hour later, the service recovered. • During the one hour that service was down, billion yuan’s business was lost. 11 / 30
  • 12. • Elasticity: use the same machines, for different business, at different time. • Does 支付宝 need to keep the huge amount of machines, only to prepare for the annual sales? NO! • Superbowl is the most popular sport event in the US. During the game, Twitter’s load is 40% higher than the usual one. During the exciting moment, Twitter’s load is 150% higher than usual . • But unlike 支付宝, Twitter doesn’t keep a lot of machines. Twitter borrowed machines temporarily from a third-party. • A lesson learned from Twitter, to dynamically allocate machines, among different business, automatically, in real-time. 12 / 30
  • 13. • Security: prevent data leak. • Cloud can contain multiple business. • Each business runs in its own LAN. Mutually inaccessible. 13 / 30
  • 15. • Data flow and control: push the cloud to run faster. • Anatomy of Twitter. • Cache for fast read. • Queue for async tasks. • Pub/Sub for messaging. 15 / 30
  • 16. • Distributed File System: Scalable file storage. • Google File System. (Hadoop HDFS) • Master and Namespace. • Chunk vs. File • Replica vs. Fragmentations 16 / 30
  • 17. • Distributed Database: Scalable database. • Google Bigtable. (Hadoop HBase) • Distributed Index. • Distributed ACID Transaction. • Distributed lock. 17 / 30
  • 18. • Distributed Lock: Guarantee multiple read single write in distributed system. • With replica, each data one lock or plural. • How to deal with inconsistency? • How to raise master, by Paxos protocol? 18 / 30
  • 19. • No-SQL Database: Make database more efficient. • No relational, but only key-value. • No index, but algorithm. • No SQL language. • Easier to add machine. 19 / 30
  • 20. • Paralleled computing: Process big data by divide and conquer • Google’s MapReduce • Not a panacea, case study. 20 / 30
  • 21. • Virtual Machine: Run multiple OSes on single machine. • Separate modules, Present bugs and virus from infecting. • Dynamically allocate resource. 21 / 30
  • 22. • VLAN: Regardless physical locations, multiple machine operate as if in the same network domain. • VLAN vs. VPN • Group machines in different regions as in one LAN. • Separate machines in the same LAN, into different groups, mutually inaccessible. 22 / 30
  • 23. • Traffic Monitoring and Network Topology. Construct the entire cloud system. 23 / 30
  • 24. • Future trends: smaller, bigger, faster, easier. • One chip with 48 CPUs. • Data-center TCP. • Cloud in RAM. • Erlang, PigLatin: languages for cloud computing. 24 / 30
  • 26. 1. 2/28, 18:00pm - 20:00pm, Tuesday, Introduction to clouding computer? Why cloud, what to do, and how to do? Homework: Construct a simple 3-tier website. 2. 3/6, 18:00pm - 20:00pm, Tuesday, Cluster-based scalable network services, SOA. Homework: Learn to use THRIFT and MemCached to implement a messaging system. 3. 3/13, 18:00pm - 20:00pm, Tuesday, Scalable file system, Google file system. Homework: Learn to use SWIFT file system. 4. 3/20, 18:00pm - 20:00pm, Tuesday, • Syllabus. Distributed RDBMS database, Google Bigtable. Homework: Learn to use Hadoop HBase. • Core cloud techniques. 5. 3/27, 18:00pm - 20:00pm, Tuesday, Understand principles, Invited seminar: Baidu. • Learn how to use, 6. 4/3, 18:00pm - 20:00pm, Tuesday, but not re-implement. Distributed Locking system, Paxos and Google Chubby. Homework: Learn to use Hadoop ZooKeeper (that is for advanced courses) 7. 4/10, 18:00pm - 20:00pm, Tuesday, Distributed NO-SQL Database. Homework: Learn to use Facebook Cassandra. 8. 4/17, 18:00pm - 20:00pm, Tuesday, Paralleled computation, Google MapReduce. Homework: Learn to use Hadoop MapReduce. 26 / 30
  • 27. 9. 4/24, 18:00pm - 20:00pm, Tuesday, • Syllabus. Invited Seminar: Taobao. • Core cloud techniques. 10. 5/1, 18:00pm - 20:00pm, Tuesday, Virtual Machine for dynamic resource allocation. Understand principles, Homework: Learn to use KVM. • Learn how to use, 11. 5/8, 18:00pm - 20:00pm, Tuesday, Cloud security and VLAN. but not re-implement. Homework: TBD (that is for advanced courses) 12. 5/15, 18:00pm - 20:00pm, Tuesday, Invited seminar: EMC/VMWare. 13. 5/22, 18:00pm - 20:00pm, Tuesday, Datacenter network topology and traffic management. Homework: Learn to use Zookeeper. 14. 5/29, 18:00pm - 20:00pm, Tuesday, Invited seminar: Google. 15. 6/5, 18:00pm - 20:00pm, Tuesday, Future Trend: Bigger: Datacenter as a warehouse-scale computer, Datacenter needs an OS. Smaller: Multicore CPU and GPUs. Faster: In-Memory Framework, Piccolo an Spark. Easier: Erlang, PigLatin language. 16. 6/12, 18:00pm - 20:00pm, Tuesday, Invited seminar: CloudValley. 27 / 30
  • 28. • Invited seminars. • Top cloud players will be your teachers. • Diverse opinions, also deviated from theory, and why? • Scheduled for mid-term & final exam periods, and no homework! 28 / 30
  • 29. • Homework. • Homework: 50% Mid-term exam: 20% Final exam: 30% • You will be able to build a cloud! Not just Hadoop, and beyond. 29 / 30
  • 30. • No stupid questions, but it is stupid if not ask! • Ask a good question, and impress your professor and classmates! • 新浪微博: @北航云计算公开课 30 / 30