Application-Driven Datacenter Computing
Shiding Lin
EDCS-HPCA, Shenzhen, 2013/2/24
Application Driven Datacenter Computing

  1. Application-Driven Datacenter Computing
       Shiding Lin, EDCS-HPCA, Shenzhen, 2013/2/24
  2. Let’s Start from the Search Engine…
       [Diagram: Web; Web Pages; Central Repository of Web Pages; Building Inverted Index; Index; Data Mining]
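The "Building Inverted Index" step named on this slide can be illustrated with a minimal sketch. The repository contents, the whitespace tokenizer, and the function name `build_inverted_index` are assumptions made for illustration; the slides do not describe the actual indexing pipeline.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each term to the sorted list of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(page_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical central repository: page id -> page text.
repository = {
    1: "datacenter storage system",
    2: "storage engine for search",
    3: "search engine index",
}

inverted = build_inverted_index(repository)
print(inverted["storage"])  # [1, 2]
print(inverted["search"])   # [2, 3]
```

A lookup is then a single dictionary access per query term, which is why the index, rather than the raw page repository, serves queries.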
  3. To Build a High-Throughput Storage System
       Log-based Structure
         • Block I/O
         • Batch Commit
         • Stream R/W
       [Diagram: Update and Query on In-Memory Records {<key, data>} (Memory); Dump; Log Block-1 … Log Block-N (Disk); Commit; Stream and New stream, each drawn as Block 0, Block 1, … (ending in Block X and Block Y)]
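One way to read the log-based structure on this slide is sketched below: updates land in memory, memory is dumped to immutable log blocks, and a batch commit folds the log blocks into a new stream. The class name, the `dump_threshold` parameter, and the merge rule (newer blocks override older ones) are assumptions for illustration, and Python lists stand in for disk blocks.

```python
class LogStructuredStore:
    """Toy model: in-memory records dumped to append-only log blocks."""

    def __init__(self, dump_threshold=2):
        self.memory = {}    # in-memory records {key: data}
        self.pending = []   # dumped log blocks, not yet committed
        self.stream = []    # committed blocks, oldest first
        self.dump_threshold = dump_threshold  # hypothetical tuning knob

    def update(self, key, data):
        self.memory[key] = data
        if len(self.memory) >= self.dump_threshold:
            self.dump()

    def dump(self):
        """Write the in-memory records out as one immutable log block."""
        if self.memory:
            self.pending.append(dict(self.memory))
            self.memory.clear()

    def commit(self):
        """Batch commit: merge the stream and all log blocks into a new stream."""
        merged = {}
        for block in self.stream + self.pending:  # later blocks win
            merged.update(block)
        self.stream = [merged] if merged else []
        self.pending = []

    def query(self, key):
        """Check the newest data first: memory, log blocks, then the stream."""
        if key in self.memory:
            return self.memory[key]
        for block in reversed(self.pending + []):
            if key in block:
                return block[key]
        for block in reversed(self.stream):
            if key in block:
                return block[key]
        return None


store = LogStructuredStore(dump_threshold=2)
store.update("a", 1)
store.update("b", 2)   # threshold reached: memory is dumped to a log block
store.update("a", 3)   # the newer value stays in memory for now
print(store.query("a"), store.query("b"))   # 3 2
store.dump()
store.commit()
print(len(store.stream), store.query("a"))  # 1 3
```

In this model every write to "disk" is a whole block appended or rewritten in bulk, which is the property that lets a real system avoid small random writes.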
  4. To Build a High-Throughput Storage System
       Maximize Parallelism
         • NO RAID, Raw Disk
         • Direct I/O
         • Independent of FS
       [Diagram: <key, data> blocks in Memory; memory dump to Disk as Block @disk0, Block @disk1, …, Block @diskN; caption: A Big Virtual File by Blocks]
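The "big virtual file by blocks" idea can be sketched as an address translation from a virtual-file offset to a disk and a block slot. Round-robin placement, the tiny `BLOCK_SIZE`, and the function names are assumptions for illustration; the slide only shows blocks spread over disk0 … diskN. Lists stand in for raw disks, and real direct I/O (for example `O_DIRECT` on Linux) is not modeled.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the layout is easy to read


def locate(offset, num_disks, block_size=BLOCK_SIZE):
    """Translate a virtual-file offset to (disk, slot on that disk, offset in block)."""
    block_no = offset // block_size
    return block_no % num_disks, block_no // num_disks, offset % block_size


def write_virtual_file(data, num_disks, block_size=BLOCK_SIZE):
    """Spread consecutive blocks over the disks round-robin (assumed policy)."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _slot, _ = locate(start, num_disks, block_size)
        disks[disk].append(data[start:start + block_size])
    return disks


def read_virtual_file(disks, length, block_size=BLOCK_SIZE):
    """Reassemble the virtual file by visiting blocks in virtual order."""
    out = bytearray()
    for start in range(0, length, block_size):
        disk, slot, _ = locate(start, len(disks), block_size)
        out += disks[disk][slot]
    return bytes(out)


data = b"abcdefghijklmnop"
disks = write_virtual_file(data, num_disks=3)
print(disks)          # [[b'abcd', b'mnop'], [b'efgh'], [b'ijkl']]
print(locate(9, 3))   # (2, 0, 1)
print(read_virtual_file(disks, len(data)) == data)  # True
```

Because consecutive blocks sit on different disks, a sequential scan of the virtual file can keep all disks busy at once without a RAID layer or a file system in between.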
  5. 3-Layer Architecture of a Typical Storage System
       Table
       Base Stream, Mod Stream, Index Stream, Patch Stream
       Block, Block, Block, …, Block, Block, Block
  6. To Make It Large-Scale
       Which Layer to Partition, and the Replication Granularity?
         • Complexity
         • Data Exchange Traffic
         • Reliability
  7. Replication Scheme 1
       [Diagram: Replica 1, Replica 2, Replica 3, each a full Table with its own Base Stream, Mod Stream, Index Stream, Patch Stream and Blocks]
         • 3x Commit Cost
         • Local I/O Only
  8. Replication Scheme 2
       [Diagram: Base Stream, Mod Stream, Index Stream and Patch Stream, each with its Blocks stored on Replica 1, Replica 2, Replica 3]
         • 1x Commit Cost
         • Network & Disk I/O
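The trade-off stated on the two replication slides can be put into a rough cost model. This is only one possible reading: it assumes that table-level replicas each run their own commit against local disks, while block-level replication commits once and ships the resulting blocks to the other replicas over the network. The function names and the byte accounting are hypothetical.

```python
def scheme1_table_level(num_replicas, commit_bytes):
    """Scheme 1 (assumed): every replica commits independently, using local I/O only."""
    return {
        "commits": num_replicas,
        "network_bytes": 0,
        "disk_bytes": num_replicas * commit_bytes,
    }


def scheme2_block_level(num_replicas, commit_bytes):
    """Scheme 2 (assumed): one commit, then blocks are copied to the other replicas."""
    return {
        "commits": 1,
        "network_bytes": (num_replicas - 1) * commit_bytes,
        "disk_bytes": num_replicas * commit_bytes,
    }


s1 = scheme1_table_level(num_replicas=3, commit_bytes=100)
s2 = scheme2_block_level(num_replicas=3, commit_bytes=100)
print(s1)  # {'commits': 3, 'network_bytes': 0, 'disk_bytes': 300}
print(s2)  # {'commits': 1, 'network_bytes': 200, 'disk_bytes': 300}
```

Under these assumptions both schemes write the same number of bytes to disk; they differ in whether the commit work is repeated three times or the committed blocks cross the network.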
  9. Map to Physical Architecture
       Logical Layer: Table, Stream, Block
       Physical Boundary: Datacenter, Cluster, Rack, Node
       Physical Layer: Memory, Flash, Disk
  10. What Has Changed?
       • Single-User Multi-Task → Multi-User Single-Task
       • Scale & Cost
       • Speed of Delivery
  11. Software Architecture Principles in Datacenter
       Layered → Vertical
       Out-of-the-Box
         • Datacenter as a Computer
         • To Tolerate Component Failure
  12. Hardware Architecture Principles in Datacenter
       Dummy
         • Control Logic Goes Software
         • Replication/Checksum/Buffer Goes Global
       Programmable
         • Expose All Interfaces
         • Collect All Data
  13. Hardware Architecture Principles in Datacenter
       • Modularized and Configurable
       • Reduce All the Unnecessary
       • Share All the Possible
  14. Practice 1: Baidu SSD
       • Raw Channels
       • No Shadow Buffer
       • No Wear Leveling
  15. Practice 2: Smart Disk Replacement
       • Failure and Repair
       • Predict Failure
       [Diagram: Logs; Failure Model; Reduce False-Alarm]
  16. Practice 3: ARM Server
       • 2U, 6 Nodes, 12 HDD/U
       • Internal Network Switch
