Data Centers Architecture
Presented By Ali Al Ogaili
Google data centers


  1. Data Centers Architecture. Presented by Ali Al Ogaili.
  2. Agenda
     - About Google
     - Google Products
     - Background
     - Distributed Computing
     - Layered Architecture & Abstraction
     - Google Architecture
     - Computing Infrastructure
     - Software Infrastructure
     - App Engine: Google platform for your Enterprise
  3. Google at a glance
     - Google mission: to organize the world’s information and make it universally accessible and useful
     - Established on September 4, 1998
     - Today Google runs over one million servers in data centers around the world
     - Processes over one billion search requests and twenty petabytes (1 PB = 10^15 bytes) of user-generated data every day
  4. Google products
     - Some of the products Google provides: Google Search, Gmail, Maps, YouTube, Google Docs, Google Calendar, App Engine, and many more
     - Most of their products are web based
     - They serve millions of people and store users’ data in the “cloud”
     - How do they do that? What is under the hood?
  5. Agenda
     - About Google
     - Google Products
     - Background
     - Distributed Parallel Computing
     - Layered Architecture & Abstraction
     - Google Architecture
     - Computing Infrastructure
     - Software Infrastructure
     - App Engine: Google platform for your Enterprise
  6. Background: Distributed Parallel Computing
     - One “smart” computer sums the cells of the array sequentially.
     - Figure: a 4x4 array
         1 2 0 3
         3 1 2 2
         5 1 3 3
         4 5 3 6
       Compute: row sums 6, 8, 12, 18; total 44.
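  7. Background: Distributed Parallel Computing
     - Five “dumb” distributed computers do the same task in parallel: a master hands the rows to worker servers and combines their results.
     - Figure: the same 4x4 array
         1 2 0 3
         3 1 2 2
         5 1 3 3
         4 5 3 6
       Compute: row sums 6, 8, 12, 18 (worker servers); total 44 (master).
     - Distribute computation power and memory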
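The master/worker split on slides 6 and 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads in one process stand in for the worker servers, and the names `worker_sum` and `master_sum` are hypothetical, not taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor

# The 4x4 array shown on the slide.
GRID = [
    [1, 2, 0, 3],
    [3, 1, 2, 2],
    [5, 1, 3, 3],
    [4, 5, 3, 6],
]


def worker_sum(row):
    """Work done by one worker: sum a single row."""
    return sum(row)


def master_sum(grid, num_workers=4):
    """Master: hand one row to each worker, then add up the partial sums."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map preserves the order of the input rows.
        partial_sums = list(pool.map(worker_sum, grid))
    return partial_sums, sum(partial_sums)


partial_sums, total = master_sum(GRID)
print(partial_sums, total)  # [6, 8, 12, 18] 44
```

Each worker only needs to hold its own row, which is the "distribute computation power and memory" point of the slide.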
  8. Background: Layers & Abstraction
     - Division of concerns: structure the system in layers such that each layer has a set of problems, tasks and processes decoupled from the other layers.
     - Abstraction: each layer abstracts a set of functions and concerns for the layer above it.
     - Flexibility: replace an implementation while maintaining the interface.
     - “The trouble with layers of computer software is that sooner or later you lose touch with reality.”
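The "replace an implementation while maintaining the interface" point can be shown with a small Python sketch. All names here (`Storage`, `InMemoryStorage`, `ShardedStorage`, `NoteService`) are hypothetical and exist only for illustration; the upper layer talks to the interface and never learns which lower layer it got.

```python
import zlib
from typing import Protocol


class Storage(Protocol):
    """Interface the upper layer depends on."""

    def read(self, key: str) -> str: ...
    def write(self, key: str, value: str) -> None: ...


class InMemoryStorage:
    """First implementation: a single dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def read(self, key: str) -> str:
        return self._data[key]

    def write(self, key: str, value: str) -> None:
        self._data[key] = value


class ShardedStorage:
    """Second implementation: keys spread over several dictionaries."""

    def __init__(self, shards: int) -> None:
        self._shards = [{} for _ in range(shards)]

    def _shard_for(self, key: str) -> dict:
        # crc32 gives a stable hash, so a key always maps to the same shard.
        return self._shards[zlib.crc32(key.encode()) % len(self._shards)]

    def read(self, key: str) -> str:
        return self._shard_for(key)[key]

    def write(self, key: str, value: str) -> None:
        self._shard_for(key)[key] = value


class NoteService:
    """Upper layer: knows only the Storage interface."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def save_note(self, title: str, text: str) -> None:
        self._storage.write(title, text)

    def load_note(self, title: str) -> str:
        return self._storage.read(title)


for backend in (InMemoryStorage(), ShardedStorage(shards=3)):
    service = NoteService(backend)
    service.save_note("todo", "read the GFS paper")
    print(type(backend).__name__, service.load_note("todo"))
```

Swapping `InMemoryStorage` for `ShardedStorage` requires no change to `NoteService`, which is the flexibility the slide describes.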
  9. Agenda
     - About Google
     - Google Products
     - Background
     - Distributed Parallel Computing
     - Layered Architecture & Abstraction
     - Google Architecture
     - Computing Infrastructure
     - Software Infrastructure
     - App Engine: Google platform for your Enterprise
  10. Architecture (General Overview)
      - Computing Platform: cost efficiency, server design, networking, datacenter technologies. Clusters of thousands of commodity-class PCs:
        - Reliable (fault tolerance)
        - Scalable
        - Cost efficient (low-end servers)
      - System Infrastructure: Google File System (GFS), MapReduce, BigTable. A layer of software that abstracts the hardware complexity from the developers; it provides features such as:
        - Scheduling
        - File access
        - Fault management
        - And many more
      - Google Services: the set of services provided for the users:
        - Usability/user friendliness
        - Simplicity
        - Performance
        - Innovation & solving people’s problems
  11. Computing Platform
      - Google datacenters evolved over time…
      - google.stanford.edu (circa 1997)
      - Larry & Sergey (Google founders) volunteered to receive shipments of machines that other research groups had ordered, and held on to them for some time.
  12. Computing Platform
      - Google datacenters evolved over time…
  13. Computing Platform
      - Google’s software architecture arises from two basic insights*:
        - Reliability in software rather than in server-class hardware (thus they can use commodity PCs)
        - Tailor the design for best aggregate request throughput, not peak server response time (manage response time by parallelizing individual requests)
      * “Web Search for a Planet: The Google Cluster Architecture” by Luiz André Barroso, Jeffrey Dean & Urs Hölzle
  14. Computing Platform
      - Server photo labels: dual SATA disks, RAM, 12 VDC sealed lead-acid battery, dual CPUs, power supply
      - Google’s custom-made servers use consumer products to get the best economic value for performance.
  15. Computing Platform
      - The servers are placed in racks in a shipping container (modular design)
      - Plug & play (or serve)
      - The servers interconnect via a 100-Mbps Ethernet switch that has one or two gigabit uplinks to a core gigabit switch that connects all racks together.
      - Each shipping container can hold up to 1,160 servers
      - “Power above, water below”
      - The Google facility features a “container hangar” filled with 45 containers
  16. Computing Platform
      - Some key challenges with datacenter design:
        - Powering: Google has a backup battery for each server, as opposed to a centralized UPS
        - Cooling: low-end PCs generate more heat, thus the datacenter requires more aggressive cooling
        - Cabling and modularity: low-end PCs are more prone to failure and have a shorter life span; thus, those machines need to be easy to replace
        - And much more
  17. Computing Platform
      - What could go wrong? Many things*:
        - Overheating (power down most machines)
        - PDU failure (machines suddenly disappear)
        - Rack move (plenty of warning)
        - Rack failures (40-80 machines instantly disappear)
        - Racks go wonky (40-80 machines see 50% packet loss)
        - Network maintenance (~30 min random connectivity loss)
        - Individual machine failures
        - Thousands of hard drive failures
        - And much more (slow disks, bad memory, misconfigured machines, etc.)
      - Thousands of low-end machines clustered together are a maintenance nightmare!
      * Google Seattle Conference on Scalability
  18. Computing Platform
      - Google datacenters are more like a single upgradable machine: Warehouse Scale Machines (WSM).
  19. Computing Platform
      - “Cloud” computing or back to mainframe computing?
      - 1960s: mainframe machines serving thin clients
      - 2005: Google datacenters hosting web applications and serving thin clients
  20. Software Platform
      - A software layer on top of the computing platform
      - If one thinks of a Google datacenter as one single machine (WSM) composed of thousands of individual machines, then the software platform managing those machines can be thought of as an operating system for this machine
      - Some of the main custom tools created by Google:
        - Google File System (GFS)
        - MapReduce
        - BigTable
  21. Software Platform (GFS)
      - Google File System (GFS): “It is designed to provide efficient, reliable access to data using large clusters of commodity hardware.” (from Wikipedia)
      - Abstracts the storage on distributed unreliable hardware
      - Master machines deal with metadata (file names, mapping from file name to chunk locations)
      - 64 MB chunks (compared with the 8K blocks of the operating system’s on-disk file system)
      - Every chunk is replicated 3 times on different racks
      - Responsible for managing failures (if a machine dies, its data is replicated on another machine)
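The chunking and replication rules on the GFS slide can be sketched as follows. This is a toy model, not GFS code: the round-robin placement, the rack names, the file name, and the function names are all assumptions made for illustration. Only the 64 MB chunk size, the three replicas on different racks, and re-replication after a failure come from the slide.

```python
import math

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as on the slide
REPLICAS = 3                   # every chunk is stored on 3 different racks


def chunk_count(file_size_bytes):
    """Number of chunks needed to hold a file of the given size."""
    return math.ceil(file_size_bytes / CHUNK_SIZE)


def place_chunks(file_size_bytes, racks):
    """Return {chunk_index: [rack, rack, rack]} with distinct racks per chunk.

    Round-robin placement is an illustrative assumption.
    """
    if len(racks) < REPLICAS:
        raise ValueError("need at least as many racks as replicas")
    return {
        index: [racks[(index + r) % len(racks)] for r in range(REPLICAS)]
        for index in range(chunk_count(file_size_bytes))
    }


def re_replicate(placement, failed_rack, racks):
    """Replace every replica on the failed rack with a copy on a healthy rack."""
    for holders in placement.values():
        if failed_rack in holders:
            candidates = [r for r in racks if r != failed_rack and r not in holders]
            if not candidates:
                raise RuntimeError("no healthy rack available for re-replication")
            holders[holders.index(failed_rack)] = candidates[0]
    return placement


racks = ["rack-A", "rack-B", "rack-C", "rack-D"]  # hypothetical rack names
# The master's metadata: file name -> chunk locations (hypothetical file name).
metadata = {"/logs/web.log": place_chunks(200 * 1024 * 1024, racks)}
print(metadata["/logs/web.log"])
re_replicate(metadata["/logs/web.log"], "rack-B", racks)
print(metadata["/logs/web.log"])
```

A 200 MB file needs four chunks, and after `rack-B` fails every chunk still has three replicas on three distinct healthy racks.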
  22. Software Platform (MapReduce)
      - MapReduce: “Introduced by Google to support distributed computing on large data sets on clusters of computers.” (from Wikipedia)
      - Abstracts the computation on distributed unreliable hardware
      - The user has to write two functions (Map & Reduce), and the library takes care of all the hardware-related issues (assigning tasks to machines, managing machine failures, etc.)
      - The library tries to make the computation faster by pushing the logic closer to where the chunk data is located
      - Deals with scalability
  23. Software Platform (MapReduce)
      - Split the data set into N parts (mapping), where N is equal to the number of available workers
      - Wait until all the workers finish their tasks (some processing is done on intermediate results)
      - Compute the final result with the reduce function
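The three steps above (split, wait for all workers, reduce) can be sketched as a single-process word count in Python. This is an illustration of the programming model only, under stated assumptions: threads stand in for worker machines, there is no fault handling or data locality, and every function name is hypothetical rather than part of Google's library.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_fn(line):
    """User-written Map: emit (word, 1) for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """User-written Reduce: combine all counts emitted for one word."""
    return word, sum(counts)


def split(records, n):
    """Split the data set into n parts, one per worker."""
    return [records[i::n] for i in range(n)]


def run_map_task(part):
    """One worker's task: apply map_fn to every record in its part."""
    pairs = []
    for record in part:
        pairs.extend(map_fn(record))
    return pairs


def map_reduce(records, num_workers=3):
    parts = split(records, num_workers)
    # Map phase: list() blocks until every worker has finished its part.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        mapped = list(pool.map(run_map_task, parts))
    # Intermediate processing: group the emitted values by key.
    grouped = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            grouped[key].append(value)
    # Reduce phase: compute the final result for each key.
    return dict(reduce_fn(key, values) for key, values in grouped.items())


lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = map_reduce(lines)
print(counts["the"], counts["fox"], counts["dog"])  # 3 2 1
```

The user supplies only `map_fn` and `reduce_fn`; everything else plays the role of the library.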
  24. Software Platform (BigTable)
      - BigTable: “A compressed, high performance, and proprietary database system built on Google File System (GFS), Chubby Lock Service, and a few other Google programs” (from Wikipedia)
      - Non-relational distributed database created by Google
      - Built on top of GFS and provides a higher level of abstraction
      - Implements a subset of a typical DBMS (database management system)
      - Used by Google Analytics, Google Earth, Personalized Search, App Engine and many more
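The slide does not describe BigTable's data model. As a rough illustration of what "non-relational" can mean here, the sketch below assumes the model from Google's published description of Bigtable: a sparse, sorted map from (row key, column key, timestamp) to a value. The class `TinyTable` and its methods are hypothetical and in-memory only; they are not BigTable's API.

```python
class TinyTable:
    """Toy sparse map: (row, column, timestamp) -> value."""

    def __init__(self):
        # (row, column) -> list of (timestamp, value), newest first.
        self._cells = {}

    def put(self, row, column, value, timestamp):
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp, or None if absent."""
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def scan(self, start_row, end_row):
        """Yield (row, column, newest value) for start_row <= row < end_row, in sorted order."""
        for row, column in sorted(self._cells):
            if start_row <= row < end_row:
                yield row, column, self._cells[(row, column)][0][1]


table = TinyTable()
table.put("com.example.www", "contents:", "<html>v1</html>", timestamp=1)
table.put("com.example.www", "contents:", "<html>v2</html>", timestamp=2)
table.put("com.example.mail", "contents:", "<html>mail</html>", timestamp=1)
print(table.get("com.example.www", "contents:"))
print(table.get("com.example.www", "contents:", timestamp=1))
print(list(table.scan("com.example.a", "com.example.z")))
```

There is no schema, join, or SQL layer here; rows are simply kept in key order and cells keep multiple timestamped versions, which is one way to read "a subset of a typical DBMS".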
  25. Agenda
      - About Google
      - Google Products
      - Background
      - Distributed Parallel Computing
      - Layered Architecture & Abstraction
      - Google Architecture
      - Computing Infrastructure
      - Software Infrastructure
      - App Engine: Google platform for your Enterprise
  26. App Engine
      * From http://code.google.com/appengine
