Data center computing trends: a survey


State of research question in data center computation


  1. Datacenter Computing Trends and Problems: A Survey
      Partha Kundu, Sr. Distinguished Engineer, Corporate CTO Office
      Special Session, NOCS 2011, May 3, Pittsburgh, PA, USA
  2. Data center computing is a new paradigm!
  3. Outline of talk
      • Power & Energy in Data Centers
      • Network Architecture
      • Protocol Interactions
      • Conclusions
  4. Power & Energy in the Data Center
  5. Data Center Energy Breakdown
      Data center energy breakdown (source: ASHRAE):
      • Power delivery and cooling overheads are quantified in the PUE metric
      • Cooling is the most significant source of energy inefficiency
      Server peak power usage profile (source: Google 2007):
      • CPU power contribution is less than 1/3 of server power
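Slide 5 names the PUE metric without defining it. PUE (Power Usage Effectiveness) is the ratio of total facility power to the power delivered to IT equipment. The sketch below shows the calculation; the facility figures are hypothetical and are not taken from the slides.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical facility (illustrative numbers only, not from the talk):
# 1000 kW of IT load, 500 kW of cooling, 200 kW of power-delivery losses.
it_kw = 1000.0
overhead_kw = 500.0 + 200.0
print(f"PUE = {pue(it_kw + overhead_kw, it_kw):.2f}")
```

A PUE of 1.0 would mean that every watt entering the facility reaches the IT equipment; cooling and power-delivery overheads push the value above 1.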
  6. Energy Efficiency
      • Servers are never completely idle
      • Most of the time server load is around 30%
      • But the server is least energy efficient in its most common operating region!
      Source: Barroso, Hölzle: The Datacenter as a Computer, Morgan & Claypool (publishers), 2009
  7. Dynamic Power Range
      CPU power component (peak & idle) in servers has reduced over the years.
      Dynamic power range:
      • CPU power range is 3x for servers
      • DRAM range is 2x
      • Disk and networking is < 1.2x
      Disk and network switches need to learn from the CPU's power proportionality gains.
      Source: Barroso, Hölzle: The Datacenter as a Computer, Morgan & Claypool (publishers), 2009
  8. Energy Proportionality
      Goal: achieve best energy efficiency (~80%) in the common operating regions (20-30% load)
      Challenges to proportionality:
      • Most proportionality tricks in embedded/mobile devices are not usable in the DC due to huge activation penalties
      • Distributed structure of data and application doesn't allow powering down during low use
      • Disk drives spin >50% of the time even when there is no activity
        [Sankar et al, ISCA '08]: smaller rotational speeds, multiple heads
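To see why a small dynamic power range hurts efficiency at the 20-30% load point, here is a minimal sketch. It assumes power rises linearly from idle to peak with utilization, which is a simplifying assumption and not a claim from the slides. The function name and the definition of relative efficiency (work delivered per unit of power, normalized to the value at full load) are illustrative choices.

```python
def relative_efficiency(utilization: float, dynamic_range: float) -> float:
    """Efficiency at a given load, relative to efficiency at 100% load.

    dynamic_range is peak power / idle power (e.g. 3.0 for "3x").
    Assumes power scales linearly between idle and peak (an assumption).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    if dynamic_range < 1.0:
        raise ValueError("dynamic_range must be >= 1")
    idle_fraction = 1.0 / dynamic_range
    # Power drawn at this load, as a fraction of peak power.
    power_fraction = idle_fraction + (1.0 - idle_fraction) * utilization
    return utilization / power_fraction


# Dynamic ranges quoted on slide 7, evaluated at 30% load.
for name, rng in [("CPU (3x)", 3.0), ("DRAM (2x)", 2.0), ("Disk/network (1.2x)", 1.2)]:
    print(f"{name}: {relative_efficiency(0.3, rng):.0%} of peak efficiency")
```

Under this linear model even the 3x CPU range falls well short of the ~80% goal at 30% load, and a component with a 1.2x range delivers only about a third of its peak efficiency.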
  9. Application Behavior in Data Centers
      • Cosmos is similar to a data mining workload
      • Bing preloads the web index in memory
      • But peak disk bandwidth can be high
      Significant variation in disk, memory and network capacity and bandwidth usage across apps.
      Source: Kozyrakis et al, IEEE Micro 2010
  10. Dynamic Resource Requirements in the Data Center
      [Figure: two panels, "Intra-server variation (TPC-H, log scale)" and "Inter-server variation (rendering farm)"; axis labels: server memory allocation (0.1 MB to 100 GB), query (Q1 to Q12), time]
      Huge variations even within a single application running in a large cluster.
  11. Motivating Disaggregated Memory*
      [Diagram: conventional blade systems; each blade holds CPUs with their own DIMMs, all attached to a backplane]
      *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  12. Disaggregated Memory*
      [Diagram: blade systems with disaggregated memory; CPU blades with fewer DIMMs plus a memory blade on the backplane]
      • Leverage fast, shared communication fabrics
      • Break CPU-memory co-location
      *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  13. Disaggregated Memory*
      [Diagram: blade systems with disaggregated memory; CPU blades plus a memory blade on the backplane]
      Authors claim:
      • 8x improvement on memory constrained environments
      • 80+% improvement in performance per $
      • 3x consolidation
      *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  14. Disaggregated Server
      Servers with consolidated power supply, fabric connectivity, DRAM and disk drives.
      High density, low power SeaMicro SM10000 servers*:
      • Designed to replace 40 1 RU servers in a single 10 RU system
      • 512 1.66 GHz 64-bit x86 Intel Atom cores in 10 RU; 2,048 CPUs/rack
      • 1.28 Terabit interconnect fabric
      • Up to 64 1 Gbps or 16 10 Gbps uplinks
      • 0-64 SATA SSD/hard disk
      • Integrated load balancing, Ethernet switching, and server management
      • Uses less than 2.5 kW of power
      Claim: achieves 4x space & power consolidation
      *Source: SeaMicro, http://www.seamicro.com/?q=node/102
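The SM10000 figures on slide 14 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted on the slide; the function and key names are made up for illustration. The per-core power figure is an upper bound that divides whole-system power (fabric, disks and all) across the cores, so it is not a per-CPU rating.

```python
def sm10000_summary() -> dict:
    """Derive a few ratios from the figures quoted on the slide."""
    cores_per_system = 512       # Atom cores in one 10 RU system
    cpus_per_rack = 2048         # quoted rack density
    system_height_ru = 10        # height of one system
    replaced_height_ru = 40 * 1  # forty 1 RU servers
    max_power_w = 2500.0         # "less than 2.5 kW"
    return {
        "systems_per_rack": cpus_per_rack // cores_per_system,
        "space_consolidation": replaced_height_ru / system_height_ru,
        "max_watts_per_core": max_power_w / cores_per_system,
    }


for key, value in sm10000_summary().items():
    print(key, value)
```

The space ratio of 40 RU to 10 RU matches the 4x space consolidation claim; the power half of the claim cannot be checked from the slide alone, because the power draw of the replaced servers is not given.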
  15. Network Architecture
  16. Requirements of a Cloud-enabled Data Center
      Economic & technical motivations:
      • Use commodity hardware & components
      • Dynamically distribute compute resources
      [Diagram labels: capacity re-allocation, economies of scale]
  17. Status Quo: Conventional DC Network
      [Diagram: Internet at the top; core routers (CR) and access routers (AR) form DC Layer 3; Ethernet switches (S) form DC Layer 2; racks of servers at the bottom]
      Key:
      • CR = Core Router (L3)
      • AR = Access Router (L3)
      • S = Ethernet Switch (L2)
      • A = Rack of app. servers
      ~1,000 servers/pod == IP subnet
      Ref: "Data Center: Load balancing Data Center Services", Cisco 2004
  18. Conventional DC Network Problems
      [Diagram: conventional CR / AR / S tree annotated with ratios of ~200:1, ~40:1 and ~5:1 from the core down to the servers]
      • Cost of network equipment is prohibitive
      • Limited server-to-server capacity
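The ratios on slide 18 explain the "limited server-to-server capacity" bullet. Reading them as oversubscription ratios is an interpretation, since the slide text does not label them. The sketch below, which also assumes a hypothetical 1 Gbps server NIC, shows how little bandwidth a server can count on when every server behind an oversubscribed link transmits at once.

```python
def worst_case_share_mbps(nic_mbps: float, oversubscription: float) -> float:
    """Bandwidth available per server when all servers send across a link
    that is oversubscribed by the given ratio (e.g. 200 for 200:1)."""
    if oversubscription < 1:
        raise ValueError("oversubscription ratio must be >= 1")
    return nic_mbps / oversubscription


NIC_MBPS = 1000.0  # assumed 1 Gbps server NIC (not stated on the slide)
for ratio in (5, 40, 200):
    share = worst_case_share_mbps(NIC_MBPS, ratio)
    print(f"~{ratio}:1 -> {share:.0f} Mbps per server")
```

At ~200:1, traffic that has to cross the top of the tree is limited to a few Mbps per server under full load, which is why where a server is placed matters so much in the conventional design.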
  19. And More Problems ...
      [Diagram: same tree, ~200:1 at the core; servers split between IP subnet (VLAN) #1 and IP subnet (VLAN) #2]
      • Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)
  20. And More Problems ...
      [Diagram: same tree, ~200:1 at the core; moving servers between IP subnet (VLAN) #1 and IP subnet (VLAN) #2 requires complicated manual L2/L3 re-configuration]
      • Server IP address assignments are topological
      • IP movement from a contained VLAN is hard
  21. What We Need is ...
      1. L2 semantics
      2. Uniform high capacity
      3. Performance isolation
  22. Achieve Uniform High Capacity: Clos Network Topology*
      [Diagram: intermediate (Int) switches at the top; K aggregation (Aggr) switches with D ports; top-of-rack (TOR) switches with 20 servers each; 20*(DK/4) servers in total]
      • Large bisection BW
      • Multi paths at modest cost
      • Tolerates fabric failure
      *Ref: A Scalable, Commodity Data Center Network Architecture, Al-Fares et al, SIGCOMM 2008
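The server count on slide 22, 20*(DK/4), is easy to evaluate for a given switch size. The sketch below computes it for K aggregation switches with D ports each and 20 servers per TOR switch, as labeled on the slide. The example value of 144 for both D and K is an illustrative assumption.

```python
def clos_server_count(d_ports: int, k_aggr: int, servers_per_tor: int = 20) -> int:
    """Servers supported by the Clos fabric on the slide: 20 * (D*K/4)."""
    if d_ports <= 0 or k_aggr <= 0:
        raise ValueError("port and switch counts must be positive")
    if (d_ports * k_aggr) % 4 != 0:
        raise ValueError("D*K must be divisible by 4 for a whole number of TORs")
    tor_switches = (d_ports * k_aggr) // 4
    return servers_per_tor * tor_switches


# Illustrative sizing (assumed values): 144 aggregation switches, 144 ports each.
print(clos_server_count(d_ports=144, k_aggr=144))
```

Because the count grows with the product D*K, the fabric can be scaled out by adding more switches of the same commodity size instead of buying larger core routers.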
  23. Addressing and Routing: Name-Location Separation*
      • Switches run link-state routing and maintain only switch-level topology
      • Servers use flat names
      [Diagram: directory service holding the mappings x → ToR2, y → ToR3, z → ToR4; switches ToR1 to ToR4; packets carried as "ToR3 | y | payload" and "ToR3 | z | payload"; lookup & response between sender and directory service]
      *VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009
  24. Addressing and Routing: Name-Location Separation*
      • Switches run link-state routing and maintain only switch-level topology
      • Servers use flat names
      [Diagram: same as slide 23, with step labels added]
      *VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009
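The name-location separation on slides 23 and 24 can be sketched as a directory lookup followed by encapsulation. The sender knows only the flat server name, asks the directory service for the ToR switch currently hosting that name, and wraps the packet with that location. The class and function names below are hypothetical and are not the VL2 implementation; the mappings are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dst_tor: str    # location: ToR switch that currently hosts the destination
    dst_name: str   # flat, location-independent server name
    payload: bytes


class DirectoryService:
    """Hypothetical in-memory stand-in for the directory service."""

    def __init__(self) -> None:
        self._location: dict[str, str] = {}

    def update(self, name: str, tor: str) -> None:
        self._location[name] = tor

    def lookup(self, name: str) -> str:
        return self._location[name]


def encapsulate(directory: DirectoryService, dst_name: str, payload: bytes) -> Packet:
    """Resolve the flat name to a ToR switch and wrap the payload."""
    return Packet(directory.lookup(dst_name), dst_name, payload)


directory = DirectoryService()
for name, tor in [("x", "ToR2"), ("y", "ToR3"), ("z", "ToR4")]:
    directory.update(name, tor)

print(encapsulate(directory, "y", b"payload"))
# If a server moves, only the directory entry changes; its name stays the same.
directory.update("y", "ToR4")
print(encapsulate(directory, "y", b"payload"))
```

Since switches route only on ToR addresses, a server can move between racks without renumbering, which addresses the "server IP address assignments are topological" problem from slide 20.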
  25. VL2 Fabric: Objectives and Solutions
      Objective | Approach | Solution
      1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
      2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Clos based network, Valiant LB flow routing
      3. Performance isolation | Enforce hose model using existing mechanisms only | TCP
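Valiant load balancing, named on slide 25, spreads traffic by bouncing each flow off an intermediate switch chosen without regard to its destination. A minimal sketch follows, assuming the choice is made by hashing the flow identifier so that all packets of a flow take the same path. The hashing scheme, switch names and flow tuple layout are illustrative assumptions; the slides do not describe the selection mechanism.

```python
import hashlib
from collections import Counter


def pick_intermediate(flow: tuple, intermediates: list[str]) -> str:
    """Choose an intermediate switch for a flow.

    The flow tuple (src, dst, src_port, dst_port, proto) is hashed so the
    choice is pseudo-random across flows but stable within one flow.
    """
    if not intermediates:
        raise ValueError("need at least one intermediate switch")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediates)
    return intermediates[index]


switches = ["Int1", "Int2", "Int3", "Int4"]  # hypothetical switch names
flows = [("x", "y", 40000 + i, 80, "tcp") for i in range(1000)]
print(Counter(pick_intermediate(f, switches) for f in flows))
```

Keeping a flow on one path avoids packet reordering, while the spread across many flows is what lets the Clos fabric approach uniform capacity for hose-model traffic.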
  26. Protocol Interactions
  27. TCP Incast Collapse: Problem
      Affects key datacenter applications with barrier synchronization boundaries, e.g. DFS, web search, MapReduce.
      Source: Nagle et al, The Panasas ActiveScale Storage Cluster – Delivering Scalable High Bandwidth Storage, SC2004
  28. [Figure only; no slide text]
  29. New Cluster Based Storage System
  30. Incast Application Overfills Buffers
  31. Solution: TCP with ms-RTO*
      • Little adverse effect on WAN traffic
      *Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication, Vasudevan et al, SIGCOMM 2009
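To illustrate why a finer-grained retransmission timeout helps with incast, the sketch below applies the standard TCP formula, RTO = SRTT + 4 * RTTVAR clamped to a minimum, using two different minimums. The RTT values and both minimum settings (200 ms and 1 ms) are assumptions chosen for illustration, not measurements from the talk or the cited paper.

```python
def retransmission_timeout(srtt_s: float, rttvar_s: float, min_rto_s: float) -> float:
    """Standard-style RTO: smoothed RTT plus four times its variation,
    but never below the configured minimum."""
    return max(min_rto_s, srtt_s + 4.0 * rttvar_s)


# Assumed datacenter round-trip figures: 100 us smoothed RTT, 25 us variation.
srtt, rttvar = 100e-6, 25e-6

coarse = retransmission_timeout(srtt, rttvar, min_rto_s=0.200)  # assumed coarse minimum
fine = retransmission_timeout(srtt, rttvar, min_rto_s=0.001)    # assumed ms-scale minimum

print(f"coarse min RTO: sender idles {coarse * 1e3:.0f} ms after a loss")
print(f"fine min RTO:   sender idles {fine * 1e3:.0f} ms after a loss")
```

With a coarse minimum the timeout is orders of magnitude longer than the round-trip time, so one dropped packet at a synchronization barrier leaves the link idle for most of the timeout. On a WAN path, where SRTT + 4 * RTTVAR already exceeds the minimum, lowering the minimum changes little, which is consistent with the "little adverse effect on WAN traffic" bullet.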
  32. Incast Collapse: an unsolved problem at scale*
      Solution space is complex:
      • Network conditions can impact RTT
      • Switch buffer management strategies
      • Goodput can be unstable with load/number of senders
      *Understanding TCP Incast Throughput Collapse in Datacenter Networks, Griffith et al, WREN 2009
  33. Conclusions
  34. Data Center Computing
      • Opportunities to realize energy efficiency, particularly in IO sub-systems
      • Data center fabrics need to be re-architected for application scalability and cost
      • WAN artifacts can create bottlenecks
  35. NOCs in the Data Center
      • Energy efficiency: local (distributed) energy management decision & coordination by NOC
      • Fabric communication: NOC can reduce intra-chip/socket communication latencies between VMs
      • Congestion management: NOC can assist in traffic orchestration across VMs
  36. Thank you!
