Data Center Network Research: Opportunities and Challenges

       Chuanxiong Guo (郭传雄)

   Microsoft Research Asia (MSRA)
      2011.04.15
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




Background: personal experience
• Bandwidth is a scarce resource
 Network   Memory   Disk    CPU                  Year
 10Mb/s    2MB      10MB    386/20MHz            1994
 100Mb/s   128MB    2GB     Pentium II/233MHz    1998
 100Mb/s   256MB    40GB    Pentium III/800MHz   2002
 1Gb/s     2GB      160GB   Core 2/2GHz          2007
 1Gb/s     4GB      500GB   Core 2 Quad/3GHz     2011

 Growth over 17 years:
   Network   x100
   Memory    x2000, but slow access
   Disk      x50000
   CPU       x150 (clock) x4 (cores), plus instruction-level progress
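The growth factors in the last row can be checked from the 1994 and 2011 rows. The sketch below is only a worked calculation: it assumes binary units for memory and disk (1GB = 1024MB) and reads "386/20M" as a 20MHz clock, so the results land near, not exactly on, the rounded figures on the slide.

```python
# Growth factors between the 1994 and 2011 rows of the table.
# Assumptions: 1 GB = 1024 MB, and "386/20M" means a 20 MHz clock.
MB = 1
GB = 1024 * MB

pc_1994 = {"network_mbps": 10, "memory_mb": 2 * MB, "disk_mb": 10 * MB,
           "cpu_mhz": 20, "cores": 1}
pc_2011 = {"network_mbps": 1000, "memory_mb": 4 * GB, "disk_mb": 500 * GB,
           "cpu_mhz": 3000, "cores": 4}


def growth(old, new):
    """Return new/old for every resource present in both snapshots."""
    return {key: new[key] / old[key] for key in old}


factors = growth(pc_1994, pc_2011)
for name, factor in factors.items():
    print(f"{name:>13}: x{factor:g}")
```

Network bandwidth grows by two orders of magnitude while memory and disk grow by three to four, which is the gap the slide points at.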
Background: technology trends
– Disk is cheap (TB and PB are common)
   • 500 RMB for 1TB
– Memory is cheap (32GB per PC is not uncommon)
   • 150 RMB for 2GB DRAM
– CPU is powerful yet inexpensive (multi-core)
   • 2000 RMB for an Intel Core i7 with 4 cores
– But network bandwidth is a scarce resource
   • Intra-DC: replication everywhere for fault tolerance
   • Inter-DC: input and output need bandwidth
   • $50 per 1G port, $500 per 10G port
– $0.1 = 1GB bandwidth = 1 CPU hour = 1GB storage per month
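A small worked calculation with the prices quoted above. The port prices and the $0.1 unit price come from the slide; the workload passed to `job_cost` is a made-up example, and both function names are illustrative only.

```python
# Port prices from the slide: port speed in Gb/s -> price in USD.
PORT_PRICE_USD = {1: 50, 10: 500}

# Unit price from the slide, kept in cents to avoid float rounding:
# 1 GB transferred = 1 CPU hour = 1 GB stored for a month = $0.10.
UNIT_PRICE_CENTS = 10


def price_per_gbps(speed_gbps):
    """Port price divided by port speed, in USD per Gb/s."""
    return PORT_PRICE_USD[speed_gbps] / speed_gbps


def job_cost(transfer_gb, cpu_hours, storage_gb_months):
    """Cost in USD of a (hypothetical) job under the slide's price equivalence."""
    units = transfer_gb + cpu_hours + storage_gb_months
    return UNIT_PRICE_CENTS * units / 100


print(price_per_gbps(1), price_per_gbps(10))
print(job_cost(transfer_gb=100, cpu_hours=50, storage_gb_months=200))
```

At these prices a 10G port costs the same per Gb/s as a 1G port, so moving to faster ports does not by itself make bandwidth cheaper.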
DCN building blocks




Server   Rack   Container   Data Center
DCN reference design
•   Does not scale
•   Low bandwidth
•   Single point of failure
•   High cost
Right time for DCN research
• It is a real problem
• It is an important problem
  – DCN as the infrastructure for cloud computing
• The assumptions are different
  – Data centers are owned by a single organization
  – We can innovate at both end-hosts and network
    devices
  – Security is easier (closed environment and trusted
    people)
DCN research: opportunities
• Full of research problems
  – Scalability: tens of thousands to millions of servers
  – Performance
  – Fault tolerance
  – Cost saving
  – Feel free to suggest new “TCP” protocols
• You can invent your own DCN!
Research challenges
Applications
•   Search
•   Distributed execution engine
•   Distributed file systems
•   Online social networking
•   HPC applications

Architectures
•   Topology design
•   Network virtualization
•   Electrical/optical switching
•   Commodity vs. special system

Technologies
•   DCN management
•   DCN platform
•   Energy efficiency

Protocols
•   DCN routing
•   TCP incast congestion control
•   Multicast
Architecture design
•   Scaling: from thousands to millions of servers
•   High capacity: support various traffic patterns
•   Fault tolerance
•   Cost efficient
•   Easy to deploy and manage




Fat-tree (ucsd-sigcomm08)




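The slide shows only the fat-tree figure. As a rough sizing aid, the sketch below counts the elements of a k-ary fat-tree built from identical k-port switches, following the standard construction (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches). The function name and the returned dictionary layout are illustrative.

```python
def fat_tree_size(k):
    """Element counts for a k-ary fat-tree built from identical k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    edge = k * half          # k pods, k/2 edge switches in each
    aggregation = k * half   # k pods, k/2 aggregation switches in each
    core = half * half       # (k/2)^2 core switches
    hosts = edge * half      # every edge switch serves k/2 hosts
    return {
        "pods": k,
        "edge": edge,
        "aggregation": aggregation,
        "core": core,
        "switches": edge + aggregation + core,
        "hosts": hosts,
    }


for ports in (4, 24, 48):
    size = fat_tree_size(ports)
    print(ports, size["switches"], size["hosts"])
```

With 48-port switches this gives 27,648 hosts from 2,880 switches, which is how the design scales with commodity parts.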
VL2 (msrr-sigcomm09)

• OSPF+ECMP
• Link speeds shown in the figure: 10G, 10G, 1G
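The slide names ECMP without detail. The sketch below shows the general idea of hash-based equal-cost multipath selection: hash a flow's 5-tuple and use the result to pick one of the equal-cost next hops, so packets of one flow stay on one path. It is a generic illustration, not VL2's actual implementation; the CRC32 hash, the tuple layout, and the next-hop names are all assumptions.

```python
import zlib


def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow 5-tuple.

    flow: (src_ip, dst_ip, protocol, src_port, dst_port); layout is an assumption.
    """
    if not next_hops:
        raise ValueError("need at least one next hop")
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Hypothetical uplinks and flows, for illustration only.
uplinks = ["agg-1", "agg-2", "agg-3", "agg-4"]
for src_port in range(40000, 40004):
    flow = ("10.0.0.1", "10.0.1.2", 6, src_port, 80)
    print(flow, "->", ecmp_next_hop(flow, uplinks))
```

Because the choice depends only on the flow identifier, no per-flow state is needed in the switch, but two large flows can still hash onto the same link.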
DCell/BCube (msra-sigcomm08,09)

• Put intelligence at servers
• Use Ethernet switches as crossbars
• Innovations in topology design and routing

[Figures: DCell and BCube]
Architecture: optical/electrical switching (ucsd-sigcomm10, rice-sigcomm10)
• A hybrid architecture
   • Optical circuit switching
   • Electrical packet switching
Protocols: TCP incast congestion control

[Figure: senders S1, S2, ..., Sn all transmitting to a single receiver R]

cmu-sigcomm09, msra-conext10
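To see why many synchronized senders hurt a single receiver, here is a deliberately simple toy model, not taken from the cited papers. It assumes every sender bursts a full window at the same instant into one switch port that can buffer `buffer_pkts` packets and drain `drain_pkts` more during the burst. All parameter names and values are hypothetical.

```python
def incast_drops(num_senders, window_pkts, buffer_pkts, drain_pkts):
    """Packets dropped when all senders burst at once into one switch port."""
    offered = num_senders * window_pkts   # packets arriving in the burst
    capacity = buffer_pkts + drain_pkts   # packets the port can absorb
    return max(0, offered - capacity)


# Hypothetical numbers: 10-packet windows, 64-packet buffer, 20 packets drained.
for senders in (1, 4, 16, 64):
    print(senders, incast_drops(senders, 10, 64, 20))
```

In this model losses start once the combined burst exceeds what the port can absorb. In real TCP those losses can trigger retransmission timeouts, which is the throughput collapse that incast congestion control aims to avoid.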
Technologies: research platform
• A DCN research platform
  – High performance: comparable to ASIC
  – Easy to program: comparable to commodity server
  – Rich functions
     • Programmable packet forwarding
     • Experiment with various control/management functions
     • Can implement various routing/congestion control
       designs
• ServerSwitch (msra-nsdi11)
Applications
• A unified network for both data center and
  HPC applications?
                      Data center                HPC
Topology              Tree-based                 Torus/mesh, fat-tree
Routing               Single path routing        Deterministic routing
                      L2 spanning tree           Per-packet adaptive
                      L3 shortest path routing   routing to exploit path
                                                 diversity
Flow control          Packets can be dropped     No packet drop
                      End-to-end                 Hop by hop
Application support   Search, e-commerce,        Scientific applications
                      cloud computing
Programming API       TCP/IP socket              MPI/RDMA
Team
• Chuanxiong Guo, Guohan Lu, Haitao Wu,
  Yongqiang Xiong
• Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin
  Jia, Jun Li
• Alumni/Alumnae
  – members: Songwu Lu, Dan Li
  – interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang,
    Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao
    Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce
    Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
Modular, mega-data center networking

BCube       BCube        BCube
BCube      MDCube        BCube
BCube       BCube        BCube
BCube: Server-centric network

BCube1 (level-1 switches):   <1,0>   <1,1>   <1,2>   <1,3>
BCube0 (level-0 switches):   <0,0>   <0,1>   <0,2>   <0,3>
Servers:   00 01 02 03   10 11 12 13   20 21 22 23   30 31 32 33
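The figure's labels follow a regular pattern that can be reproduced in a few lines. The sketch below assumes the usual BCube construction: a BCube_k built from n-port switches has n^(k+1) servers with (k+1)-digit addresses, and a server attaches at level l to the switch whose index is its address with digit l removed. The function names are illustrative; n = 4 and k = 1 match the figure.

```python
from itertools import product


def bcube_servers(n, k):
    """All server addresses of a BCube_k built from n-port switches."""
    return list(product(range(n), repeat=k + 1))


def bcube_switches(server, n):
    """Switches (level, index) a server attaches to, one per level.

    The address is a digit tuple, most significant digit first,
    e.g. (1, 3) for server 13. The level-l switch index is the
    address with digit l removed, read as a base-n number.
    """
    k = len(server) - 1
    attached = []
    for level in range(k + 1):
        pos = k - level                      # tuple position of digit `level`
        rest = server[:pos] + server[pos + 1:]
        index = 0
        for digit in rest:
            index = index * n + digit
        attached.append((level, index))
    return attached


servers = bcube_servers(4, 1)
print(len(servers), "servers")
print("server 13 attaches to", bcube_switches((1, 3), 4))
```

For the figure's parameters this gives 16 servers and 8 switches, with server 13 attached to <0,1> and <1,3>.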
2-D MDCube
[Figure: MDCube structure]
Problem: servers for packet forwarding?
[Figure: the BCube1 topology (switches <1,0> to <1,3> and <0,0> to <0,3>, servers 00 to 33), with a forwarding node marked]
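Why a server ends up forwarding packets can be seen from a digit-correcting route, in the spirit of the routing described for BCube: each hop changes one address digit, and two servers that differ in one digit share a switch, so every intermediate address on the path is a server that must relay the packet. The function below is a minimal sketch; its name and the `order` parameter are illustrative.

```python
def bcube_route(src, dst, order=None):
    """Server-to-server path that corrects one differing address digit per hop.

    src and dst are digit tuples (most significant digit first).
    order lists the tuple positions to correct, in sequence; by default
    all positions are visited from left to right.
    """
    if len(src) != len(dst):
        raise ValueError("addresses must have the same number of digits")
    if order is None:
        order = range(len(src))
    path = [tuple(src)]
    current = list(src)
    for pos in order:
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]
            path.append(tuple(current))
    return path


path = bcube_route((0, 0), (3, 3))
print(path)                       # source, relaying server(s), destination
print("forwarding nodes:", path[1:-1])
```

Choosing a different digit order, for example `order=(1, 0)`, yields a different relay server, which is where the path diversity of the topology comes from.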
Solution: ServerSwitch

• Software: full programmability at server CPU
   – Kernel module for low latency processing
   – User space for easy-to-use programmability
• PCI-E: low latency and high throughput interconnection
• Hardware: packet forwarding in commodity switching ASIC
   – High performance and limited programmability
Testbed
• A BCube testbed
  – 16 servers (Dell Precision 490 workstation with
    Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB
    disk)
  – Eight 8-port mini-switches (D-Link 8-port Gigabit
    switch DGS-1008D)
• NIC
  – Intel Pro/1000 PT quad-port Ethernet NIC
  – NetFPGA
Summary
• DCN is an area full of opportunities and
  challenges
• The best is yet to come!
• Further information
  – http://research.microsoft.com/en-us/projects/msradcn/default.aspx




Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 

Data Center Network Research: Opportunities and Challenges

  • 1. Data Center Network Research: Opportunities and Challenges. Chuanxiong Guo, Microsoft Research Asia (MSRA), 2011.04.15
  • 2. Outline
      • DCN background
      • Opportunities
      • Research challenges
      • A modular DCN design
  • 5. Background: personal experience
      • Bandwidth is a scarce resource
        Network   Memory   Disk     CPU                Year
        10Mb/s    2MB      10MB     386/20M            1994
        100Mb/s   128MB    2GB      PentiumII/233      1998
        100Mb/s   256MB    40GB     PentiumIII/800     2002
        1Gb/s     2GB      160GB    Core2/2GHz         2007
        1Gb/s     4GB      500GB    Core2 Quad/3GHz    2011
      • Growth over the 17 years: network ×100; memory ×2000, but slow access; disk ×50000; CPU ×150 ×4, but multi-core and instruction-level progress
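The growth factors in the last row of the slide 5 table can be re-derived from its first and last rows. The sketch below is only an illustration: the dictionary keys are made-up names, "386/20M" is read as a 20 MHz clock, and decimal prefixes (1 GB = 1000 MB) are assumed so that the ratios come out as round numbers.

```python
# Values copied from the 1994 and 2011 rows of the slide 5 table.
# Assumption: decimal prefixes (1 GB = 1000 MB, 1 Gb/s = 1000 Mb/s).
FIRST = {"network_mbps": 10, "memory_mb": 2, "disk_mb": 10, "cpu_mhz": 20}
LAST = {"network_mbps": 1000, "memory_mb": 4000, "disk_mb": 500_000, "cpu_mhz": 3000}


def growth_factors(first, last):
    """Return how many times each resource grew between the two rows."""
    return {name: last[name] // first[name] for name in first}


factors = growth_factors(FIRST, LAST)
for name, factor in factors.items():
    print(f"{name}: x{factor}")
```

Network bandwidth grows by the smallest factor of the four, which is the point the slide makes; the CPU figure covers clock rate only and leaves out the four cores.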
  • 6. Background: technology trends
      – Disk is cheap (TB and PB are common)
          • 500 RMB for 1TB
      – Memory is cheap (32GB in a PC is not uncommon)
          • 150 RMB for 2GB DRAM
      – CPU is powerful yet inexpensive (multi-core)
          • 2000 RMB for an Intel Core i7 with 4 cores
      – But network bandwidth is a scarce resource
          • Intra-DC: replication everywhere for fault tolerance
          • Inter-DC: input and output need bandwidth
          • $50 per 1G port, $500 per 10G port
      – $0.1 = 1GB bandwidth = 1 CPU hour = 1GB storage per month
  • 7. DCN building blocks: Server, Rack, Container, Data Center
  • 8. DCN reference design
      • Does not scale
      • Low bandwidth
      • Single point of failure
      • High cost
  • 9. Outline
      • DCN background
      • Opportunities
      • Research challenges
      • A modular DCN design
  • 10. Right time for DCN research
      • It is a real problem
      • It is an important problem
          – DCN as the infrastructure for cloud computing
      • The assumptions are different
          – Data centers are owned by a single organization
          – We can innovate at both end hosts and network devices
          – Security is easier (closed environment and trusted people)
  • 11. DCN research: opportunities
      • Full of research problems
          – Scalability: tens of thousands to millions of servers
          – Performance
          – Fault tolerance
          – Cost saving
          – Feel free to suggest new “TCP” protocols
      • You can invent your own DCN!
  • 12. Outline
      • DCN background
      • Opportunities
      • Research challenges
      • A modular DCN design
  • 13. Research challenges
      Applications
          • Search
          • Distributed execution engine
          • Distributed file systems
          • Online social networking
          • HPC applications
      Architectures
          • Topology design
          • Network virtualization
          • Electrical/optical switching
          • Commodity vs. special system
      Technologies
          • DCN management
          • DCN platform
          • Energy efficiency
      Protocols
          • DCN routing
          • TCP incast congestion control
          • Multicast
  • 14. Architecture design
      • Scaling: from thousands to millions of servers
      • High capacity: support various traffic patterns
      • Fault tolerance
      • Cost efficient
      • Easy to deploy and manage
  • 16. VL2 (msrr-sigcomm09): OSPF+ECMP; link speeds labeled in the figure: 10G, 10G, 1G
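The VL2 slide names ECMP (equal-cost multi-path) as part of the design. A minimal sketch of the general ECMP idea follows: hash a flow's 5-tuple and use the result to pick one of several equal-cost next hops, so that all packets of one flow follow the same path while different flows spread out. The function name, the switch names, and the use of SHA-256 are assumptions made for illustration; real switches use their own hardware hash functions, and this is not taken from the VL2 paper.

```python
import hashlib


def ecmp_next_hop(flow, next_hops):
    """Pick one equal-cost next hop for a flow (illustrative sketch).

    flow      -- 5-tuple (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops -- non-empty list of equal-cost next hops
    """
    # Serialize the 5-tuple so the same flow always produces the same key.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Map the first 8 bytes of the digest onto the list of next hops.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical uplinks and flow, for illustration only.
uplinks = ["agg-1", "agg-2", "agg-3", "agg-4"]
flow = ("10.0.0.1", "10.0.1.2", 6, 40000, 80)
print(ecmp_next_hop(flow, uplinks))
```

Hashing per flow rather than per packet avoids reordering packets within a TCP connection, at the price of uneven load when a few flows are much larger than the rest.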
  • 17. DCell/BCube (msra-sigcomm08,09)
      • Put intelligence at servers
      • Use Ethernet switches as crossbar
      • Innovations in topology design and routing
      • Figures: DCell, BCube
  • 18. Architecture: optical/electrical switching (ucsd-sigcomm10, rice-sigcomm10)
      • A hybrid architecture
      • Optical circuit switching
      • Electrical packet switching
  • 19. Protocols: TCP incast congestion control (cmu-sigcomm09, msra-conext10)
      • Figure: senders S1, S2, ..., Sn and one receiver R
  • 20. Technologies: research platform
      • A DCN research platform
          – High performance: comparable to ASIC
          – Easy to program: comparable to commodity server
          – Rich functions
              • Programmable packet forwarding
              • Experiment with various control/management functions
              • Can implement various routing/congestion control designs
      • ServerSwitch (msra-nsdi11)
  • 21. Applications
      • A unified network for both data center and HPC applications?
                              Data center                              HPC
        Topology              Tree-based                               Torus/mesh, fat-tree
        Routing               Single-path routing; L2 spanning tree;   Deterministic routing; per-packet adaptive
                              L3 shortest-path routing                 routing to exploit path diversity
        Flow control          Packets can be dropped; end-to-end       No packet drop; hop by hop
        Application support   Search, e-commerce, cloud computing      Scientific applications
        Programming API       TCP/IP socket                            MPI/RDMA
  • 22. Outline
      • DCN background
      • Opportunities
      • Research challenges
      • A modular DCN design
  • 23. Team
      • Chuanxiong Guo, Guohan Lu, Haitao Wu, Yongqiang Xiong
      • Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin Jia, Jun Li
      • Alumni/Alumna
          – members: Songwu Lu, Dan Li
          – interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang, Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
  • 24. Modular, mega-data center networking
  • 25. Modular, mega-data center networking
      • Figure labels: eight BCube blocks and one MDCube
  • 26. BCube: Server-centric network
      • BCube1: <1,0> <1,1> <1,2> <1,3>
      • BCube0: <0,0> <0,1> <0,2> <0,3>
      • Servers: 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
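The labels on the BCube slide (switches <level, index>, servers 00 to 33) follow a regular addressing scheme. The sketch below illustrates it under the published BCube construction as understood here: a BCube_k built from n-port switches has n^(k+1) servers, each server attaches to one switch per level, and a path is found by correcting one address digit per hop. Function names are hypothetical, and fault handling and multi-path selection are left out.

```python
from itertools import product


def bcube_servers(n, k):
    """All server addresses (a_k, ..., a_0) of a BCube_k with n-port switches."""
    return list(product(range(n), repeat=k + 1))


def switches_of(server):
    """Switches a server attaches to, as (level, switch_index) pairs.

    At level l the server plugs into the switch whose index is the
    server address with digit a_l removed.
    """
    k = len(server) - 1
    attached = []
    for level in range(k + 1):
        pos = k - level  # position of digit a_level inside the tuple
        attached.append((level, server[:pos] + server[pos + 1:]))
    return attached


def route(src, dst):
    """Single-path routing: fix one differing digit per hop, highest first."""
    path = [src]
    current = list(src)
    for pos in range(len(src)):
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]
            path.append(tuple(current))
    return path


servers = bcube_servers(4, 1)  # the 16 servers 00..33 on the slide
all_switches = {sw for s in servers for sw in switches_of(s)}
print(len(servers), len(all_switches))
print(route((0, 0), (1, 3)))
```

With n = 4 and k = 1 this gives 16 servers and 8 switches, which is consistent with the 16-server, 8-switch testbed described on slide 30. Each hop in `route` crosses one switch and lands on an intermediate server, which is why servers take part in packet forwarding in this design.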
  • 27. 2-D MDCube (MDCube structure)
  • 28. Problem: servers for packet forwarding?
      • BCube1: <1,0> <1,1> <1,2> <1,3>
      • BCube0: <0,0> <0,1> <0,2> <0,3>
      • Servers: 00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33
      • Figure label: Forwarding node
  • 29. Solution: ServerSwitch
      Software
          • Full programmability at server CPU
              – Kernel module for low-latency processing
              – User space for easy-to-use programmability
      Hardware
          • Low-latency and high-throughput PCI-E interconnection
          • Packet forwarding in commodity switching ASIC
              – High performance and limited programmability
  • 30. Testbed
      • A BCube testbed
          – 16 servers (Dell Precision 490 workstation with Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB disk)
          – 8 8-port mini-switches (DLink 8-port Gigabit switch DGS-1008D)
      • NIC
          – Intel Pro/1000 PT quad-port Ethernet NIC
          – NetFPGA
  • 31. Summary
      • DCN is an area full of opportunities and challenges
      • The best is yet to come!
      • Further information
      • http://research.microsoft.com/en-us/projects/msradcn/default.aspx