An introduction to the Design of Warehouse-Scale Computers, by Alessio Villardita
A brief overview of the main factors involved in the design of Warehouse-Scale Computers (WSCs), from the hardware to the cooling system to overall plant energy efficiency, always keeping in mind the cost of such a large architecture.
Co-Author: Pietro Piscione (https://www.linkedin.com/pub/pietro-piscione/84/b37/926)
A work based on:
"The Datacenter as a Computer, An Introduction to the Design of Warehouse-Scale Machines, Second Edition"
by
Luiz André Barroso
Jimmy Clidaras
Urs Hölzle
This document provides an overview of the architecture of warehouse-scale computers (WSCs). It describes how WSCs consist of large numbers of standardized servers organized in racks and arrays. The servers communicate over an Ethernet network hierarchy with switches at the rack and array level. This network architecture provides high aggregate bandwidth and storage capacity but also increases latency for remote memory access compared to local server memory. The document outlines the key components and networking design of WSCs.
The document discusses intelligent placement of datacenters for internet services. It aims to minimize costs and environmental impacts by developing a framework to model datacenter characteristics, costs, incentives and select optimal locations. The approach uses simulated annealing combined with linear programming to evaluate solutions and optimize total costs, subject to constraints like response time and availability. Evaluating various locations shows smart placement can save millions. Future work includes testing with real service data and incentives from other regions.
From Rack scale computers to Warehouse scale computers, by Ryousei Takano
This document discusses the transition from rack-scale computers to warehouse-scale computers through the disaggregation of technologies. It provides examples of rack-scale architectures like Open Compute Project and Intel Rack Scale Architecture. For warehouse-scale computers, it examines HP's The Machine project using application-specific cores, universal memory, and photonics fabric. It also outlines UC Berkeley's FireBox project utilizing 1 terabit/sec optical fibers, many-core systems-on-chip, and non-volatile memory modules connected via high-radix photonic switches.
This document discusses how data is increasingly dominating high performance computing workloads. It notes that while computing power doubles every two years, data storage and movement capabilities are not keeping pace. This is leading to a "data tsunami" as experiments and simulations generate terabytes of data per day. The document then summarizes Sun Microsystems' end-to-end infrastructure for data-centric HPC workflows, including their Lustre parallel storage system, unified storage, tape archives, high performance computing blades, and InfiniBand switches. It positions Sun as uniquely able to deliver an integrated solution from computation to long-term data retention to help users cope with the challenges posed by rapidly growing datasets.
2012 benjamin klenk-future-memory_technologies-presentation, by Saket Vihari
1) Memory technologies like DRAM are hitting physical limits as transistors cannot scale further and require more power for refresh. 2) Emerging technologies like Phase Change Memory (PCM), Hybrid Memory Cube (HMC), Racetrack Memory, and Spin-Torque Transfer RAM (STTRAM) aim to provide higher capacity, bandwidth and lower power. 3) HMC in particular stacks DRAM vertically using through-silicon vias to provide more banks and higher bandwidth, but fabrication is costly. PCM provides much higher density but slower access times. Racetrack and STTRAM are promising but still require significant research to improve characteristics like access time and density.
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage, by Bill Kohnen
By placing intelligent memory on the same substrate as the processing unit, Hybrid Memory Cubes deliver optimum performance and a lower total cost of ownership.
This document discusses optimizations for TCP/IP networking performance on multicore systems. It describes several inefficiencies in the Linux kernel TCP/IP stack related to shared resources between cores, broken data locality, and per-packet processing overhead. It then introduces mTCP, a user-level TCP/IP stack that addresses these issues through a thread model with pairwise threading, batch packet processing from I/O to applications, and a BSD-like socket API. mTCP achieves a 2.35x performance improvement over the kernel TCP/IP stack on a web server workload.
The IBM POWER10 processor represents the 10th generation of the POWER family of enterprise computing engines. Its performance is a result of both powerful processing cores and high-bandwidth intra- and inter-chip interconnect. POWER10 systems can be configured with up to 16 processor chips and 1920 simultaneous threads of execution. Cross-system memory sharing, through the new Memory Inception technology, and 2 Petabytes of addressing space support an expansive memory system. The POWER10 processing core has been significantly enhanced over its POWER9 predecessor, including a doubling of vector units and the addition of an all-new matrix math engine. Throughput gains from POWER9 to POWER10 average 30% at the core level and three-fold at the socket level. Those gains can reach ten- or twenty-fold at the socket level for matrix-intensive computations.
The document summarizes the author's participation report at the IEEE CloudCom 2014 conference. Some key points include:
- The author attended sessions on virtualization and HPC on cloud.
- Presentations had a strong academic focus and many presenters were Asian.
- Eight papers on HPC on cloud covered topics like reliability, energy efficiency, performance metrics, and applications like Monte Carlo simulations.
The document summarizes several AI accelerators for cloud datacenters including Google TPU, HabanaLabs Gaudi, Graphcore IPU, and Baidu Kunlun. It discusses their architectures, performance, and how they address challenges in datacenters like workload diversity and energy efficiency. The accelerators use specialized hardware like systolic arrays and FPGA/ASIC designs to achieve much higher performance and efficiency than CPUs and GPUs for AI tasks like training deep learning models.
This document provides an overview and summary of key concepts around virtualization that will be covered in more depth at a technical deep dive session, including:
- Virtualization capabilities for desktops/laptops and servers including workstation virtualization and server consolidation.
- How virtual machines work and the overhead associated with virtualization.
- Properties of virtualization like partitioning, isolation, and encapsulation.
- Benefits of server virtualization like consolidation, simpler management, and automated resource pooling.
- Comparison of "hosted" and vSphere virtualization architectures.
- Technologies used in virtualization like binary translation, hardware assistance from Intel VT/AMD-V.
- Ability to virtualize CPU intensive applications with
- The document discusses trends in AI chips, including the rise of deep learning models enabled by increased computing power and data availability.
- It outlines the AI stack from algorithms and neural network models down to chips, memory, and hardware. Popular deep learning model types and applications are also summarized.
- The trends are towards more specialized hardware like Google's TPUs for cloud servers and dedicated chips for mobile/edge devices from companies like Qualcomm and Nvidia. Processing-in-memory and new memory technologies may help address bandwidth bottlenecks.
- Overall hardware is still catching up to the needs of large neural networks, and there is a lack of unified software tools and frameworks to program diverse AI accelerators.
This document provides a summary of the IBM POWER9 AC922 system with 6 GPUs. It includes details on the POWER9 processor which features 24 cores per die, an enhanced cache hierarchy up to 120MB, and on-chip accelerators. The AC922 system utilizes two POWER9 processors, supports up to 512GB memory via 16 DDR4 DIMMs, and has three Nvidia Volta GPUs per socket connected via NVLink 2.0. It also discusses the POWER ISA v3.0 instruction set and how POWER9 serves as a premier acceleration platform with technologies like CAPI, OpenCAPI, and NVLink.
Exploring emerging technologies in the HPC co-design space, by jsvetter
This document discusses emerging technologies for high performance computing (HPC), focusing on heterogeneous computing and non-volatile memory. It provides an overview of HPC architectures past and present, highlighting the trend toward more heterogeneous systems using GPUs and other accelerators. The document discusses challenges for applications to adapt to these changing architectures. It also explores potential future technologies like 3D memory and discusses the Department of Energy's efforts in codesign centers to facilitate collaboration between application developers and emerging hardware.
Design Considerations, Installation, and Commissioning of the RedRaider Cluster at the Texas Tech University
High Performance Computing Center
Outline of this talk
HPCC Staff and Students
Previous clusters
• History, performance, usage patterns, and experience
Motivation for Upgrades
• Compute Capacity Goals
• Related Considerations
Installation and Benchmarks
Conclusions and Q&A
This document presents a software-based technique for partitioning shared last-level caches (L2 caches) on multicore systems to improve performance. It implements page coloring to allocate physical pages for each process to distinct cache line colors. Experimental results on a Power5 system show this approach can control cache usage and improve performance for multiprogrammed workloads by up to 17% compared to an uncontrolled shared cache. The document also finds that cache stall rates provide a better performance analysis metric than miss rates for some workloads.
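For reference, the page-coloring mechanism the paper builds on fits in a few lines: a physical page's "color" is the group of cache sets its lines map to, so an allocator that gives two processes disjoint colors partitions the cache between them. A minimal sketch, assuming an illustrative cache geometry rather than the Power5's actual one:

```python
def page_color(pfn, cache_bytes, assoc, page_bytes=4096):
    """Color of a physical page frame: which slice of cache sets it maps
    to. Pages of different colors can never evict each other's lines."""
    num_colors = cache_bytes // (assoc * page_bytes)
    return pfn % num_colors

# Assumed geometry for illustration: a 2 MB, 8-way cache with 4 KB pages
# gives 2 MB / (8 * 4 KB) = 64 colors; an allocator could give one
# process pages of colors 0-31 and another process colors 32-63.
for pfn in (0, 1, 63, 64, 130):
    print(pfn, "->", page_color(pfn, cache_bytes=2 * 1024 * 1024, assoc=8))
```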
Design and implementation of a reliable and cost-effective cloud computing in..., by Francesco Taurino
This document summarizes the INFN Napoli experience in designing and implementing a reliable and cost-effective cloud computing infrastructure. Key aspects included using existing hardware, virtualization and clustering technologies to consolidate services and reduce costs. A network with redundant switches and storage servers using GlusterFS provided high availability. Custom tools were developed to simplify administration tasks like provisioning, migration, and load balancing of virtual machines. The solution provided an efficient and reliable private cloud with over one year of uninterrupted uptime.
This document discusses scaling text mining to one million documents. It describes the resource requirements of various text mining analytics and different scaling framework options to distribute the workload across multiple machines. The key challenges in scaling up include managing the corpus, integrating and testing analytics, tracking errors and progress, and storing and accessing the output. Distributed frameworks like UIMA Asynchronous Scaleout and Hadoop can help put computationally heavy analytics on separate machines to improve performance.
A Prototype Storage Subsystem based on Phase Change Memory, by IBM Research
IBM scientists demonstrated for the first time a hybrid storage and caching subsystem, code-named Project Theseus, at the 2014 Non-Volatile Memories Workshop in San Francisco, California. Remarkably, they did so using two-year-old PCM chip prototypes.
This document provides an introduction to high-performance computing (HPC) including definitions, applications, hardware, and software. It defines HPC as utilizing parallel processing through computer clusters and supercomputers to solve complex modeling problems. The document then describes typical HPC cluster hardware such as computing nodes, a head node, switches, storage, and a KVM. It also outlines cluster management software, job scheduling, and parallel programming tools like MPI that allow programs to run simultaneously on multiple processors. An example HPC cluster at SIU called Maxwell is presented with its technical specifications and a tutorial on logging into and running simple MPI programs on the system.
Windows Server 2012 includes several improvements to networking and Hyper-V that help address challenges around availability, reliability, security, and costs. New features like Receive Side Scaling, Receive Segment Coalescing, and Dynamic Virtual Machine Queuing improve network performance and scalability. Single Root I/O Virtualization and NIC Teaming provide network virtualization capabilities and redundancy. These features help optimize hardware utilization, reduce latency and complexity, and improve throughput and workload scaling.
Superfluid Orchestration of heterogeneous Reusable Functional Blocks for 5G n..., by Stefano Salsano
The demo is composed of three scenes presenting tools and results from the Superfluidity project.
1) RDCL 3D is an extensible web framework which can be used to edit, validate, and visualize service and component descriptors expressed in different modelling languages (RDCLs), and to deploy the components/services over execution platforms.
2) Software defined wireless network (RAN as a Service). An end-to-end wireless network is described as a chain of RFBs (Reusable Functional Blocks) with RDCL 3D. This chain is dynamically instantiated in a cloud environment using containers. The demonstration shows a full software solution orchestrating different RFBs (RAN and CORE) over Central/EDGE/Front-End clouds. The fronthaul network is also made reprogrammable through SDN, which is also deployed as RFBs.
3) Orchestration of micro-VNFs (Unikernels). We have added support for Unikernels (ClickOS) in the XEN hypervisor and in OpenVIM Virtual Infrastructure Manager. Regular VMs (XEN HVM) and Unikernels can run together in the same infrastructure. In the demo we dynamically instantiate an end-to-end service on the infrastructure by chaining regular VMs and Unikernel-based VNFs.
This document discusses hardware trends and challenges for building exascale computers. It describes the evolution of processor/node architectures including multi-core and many-core designs. Reaching exascale performance will require addressing power consumption, concurrency, scalability, and fault tolerance issues. Evolutionary paths using commodity processors are unlikely to succeed, while aggressive approaches using clean-sheet designs for low-power customized chips may be needed to achieve exascale performance by 2018. International efforts are underway to develop exascale systems, but overcoming technical challenges to efficiently utilize extreme parallelism remains difficult.
Gluster Webinar: Introduction to GlusterFS, by GlusterFS
GlusterFS is an open source, scale-out network filesystem. It runs on commodity hardware and allows indefinite growth in capacity and performance by simply adding server nodes. Key benefits include flexibility to deploy on any hardware, linearly scalable performance, and superior storage economics compared to traditional storage solutions. GlusterFS uses a distributed hashing technique instead of a metadata server to provide high availability and reliability.
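The "no metadata server" point follows from deterministic, hash-based placement: any client can compute where a file lives from its name alone. A toy sketch of the core idea; GlusterFS's elastic hashing actually assigns hash ranges to bricks per directory, and the brick names here are made up:

```python
import hashlib

BRICKS = ["server1:/brick", "server2:/brick", "server3:/brick"]  # example pool

def locate(filename):
    """Deterministic placement: every client hashes the name the same
    way, so no central metadata lookup is needed."""
    digest = hashlib.md5(filename.encode()).hexdigest()
    return BRICKS[int(digest, 16) % len(BRICKS)]

print(locate("report.pdf"))  # every client computes the same brick
```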
Datacenters have unique characteristics including massive scale, limited geographic scope, and regular topologies. Their goals include providing extreme bisection bandwidth, low latency, predictable performance, and differentiation between tenants. Traditional network designs do not meet these goals, requiring new approaches that leverage single administration, control over endpoints and traffic placement, and commodity hardware.
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D..., by Linaro
Session ID: HKG18-500K1
Session Name: HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the Datacenter
Speaker: Dileep Bhandarkar
Track: Keynote
★ Session Summary ★
For decades we have been able to take advantage of Moore's Law to improve single-thread performance and reduce power and cost with each generation of semiconductor technology. Technology has continued to advance since the end of Dennard scaling more than 10 years ago, but the pace has slowed, and server performance increases have relied on growing core counts and power budgets.
At the same time, workloads have changed in the era of cloud computing: scale-out is becoming more important than scale-up, and domain-specific architectures have started to emerge to improve the energy efficiency of emerging workloads like deep learning.
This talk will provide a historical perspective and discuss the emerging trends driving the development of modern server processors.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-500k1/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-500k1.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-500k1.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Keynote
The document summarizes emerging computing trends in data centers, including:
1) The shift to multi-core CPU designs after Dennard scaling broke down, driven by the need for energy efficient designs for cloud computing.
2) The rise of heterogeneous computing using application-specific accelerators like GPUs and FPGAs to improve efficiency for targeted workloads like machine learning.
3) How technologies developed for mobile and edge computing like ARM cores can improve data center server efficiency through typical-use optimization rather than just peak performance.
The document discusses network design concepts for building a resilient network. It emphasizes the importance of considering redundancy at multiple levels, from the physical infrastructure to network protocols. Well-designed networks are modular, have clearly defined functional layers, and incorporate redundancy through techniques like load balancing and diverse circuit paths. Hierarchical network designs with logical areas can also improve convergence times during failures.
This document discusses network-on-chip (NoC) architectures for multiprocessor systems-on-chip. It describes how NoCs use routers and wires to connect hundreds or thousands of processor cores. The document outlines the different layers of a typical NoC architecture, including the application, transport, network, data link and physical layers. It also discusses common NoC router architectures and design methodologies, and introduces a bidirectional NoC architecture that aims to improve bandwidth utilization.
The von Neumann Memory Barrier and Computer Architectures for the 21st Century, by Perry Lea
Computer architecture and the von Neumann memory barrier. New computer architectures for the 21st century: neuromorphic computing, processing in memory, and dataflow computing. Applications to machine learning, AI, image processing, and other use cases. Future Technology Conference 2018, Vancouver BC.
GPUs are specialized processors designed for graphics processing. CUDA (Compute Unified Device Architecture) allows general purpose programming on NVIDIA GPUs. CUDA programs launch kernels across a grid of blocks, with each block containing multiple threads that can cooperate. Threads have unique IDs and can access different memory types including shared, global, and constant memory. Applications that map well to this architecture include physics simulations, image processing, and other data-parallel workloads. The future of CUDA includes more general purpose uses through GPGPU and improvements in virtual memory, size, and cooling.
A System on Chip (SoC) is an IC that integrates all the components of an electronic system. This presentation covers current trends and challenges in IP-based SoC design.
Solace Systems The Evolution of Messaging The Rise of the Appliance, by Iosif Itkin
Solace Systems The Evolution of Messaging The Rise of the Appliance
Clive Andrews
Mat Hobbis
Obninsk, 2 March, 2013
LSE The focus beyond Low Latency
EXTENT Trading Technology Trends & Quality Assurance
This document discusses disruptive technologies, specifically how Moore's Law has impacted the technology industry and networking. It provides three key points:
1. Moore's Law, which predicted the doubling of transistors on integrated circuits every two years, has been the guiding principle for new product development. However, for networking, transistor count has doubled but speed has increased slowly.
2. Networking performance has not kept up with Moore's Law like CPU performance has. Network ASICs have increased 10x over 12 years while CPUs increased 64x.
3. Merchant silicon using full custom chip designs has allowed networking to scale at Moore's Law growth rates, providing higher port density, lower price per port, and lower power consumption.
The document outlines an agenda for a presentation on the VEDLIoT project. The agenda includes an introduction to VEDLIoT by Pedro Trancoso, a presentation on VEDLIoT Hardware Platforms by Kevin Mika, and a discussion of Performance Evaluation and Benchmarking in VEDLIoT by Mario Pormann. The VEDLIoT project aims to develop very efficient deep learning techniques for IoT applications through the use of heterogeneous hardware platforms and accelerators.
In-memory processing has started to become the norm in large-scale data handling. This is a close-to-the-metal analysis of highly important but often neglected aspects of memory access times and how they impact big data and NoSQL technologies. We cover aspects such as the TLB, Transparent Huge Pages, the QPI link, hyperthreading, and the impact of virtualization on high-memory-footprint applications. We present benchmarks of various technologies ranging from Cloudera's Impala to Couchbase and how they are impacted by the underlying hardware. The key takeaway is a better understanding of how to size a cluster, how to choose a cloud provider and an instance type for big data and NoSQL workloads, and why not every core or GB of RAM is created equal.
The CMS online cluster consists of more than 2700 computers, mostly running under Scientific Linux CERN. They run the 15000 application instances responsible for the data acquisition and experiment control in a private network. The high availability of the network and services and the independence from external networks allows their operation around the clock. After testing virtualization, it is being deployed to further enhance high availability while allowing even easier servicing. Due to the ever increasing luminosity provided to CMS by the LHC, the cluster size and software running in it has been evolving to meet the increased demand of performance. Only in the last year, the processing power of the High Level Trigger farm was increased by 50% without disruption to ongoing operations and it is foreseen to continue growing. At the same time, large updates of the running software happen once every two weeks with smaller updates occurring all the time due to the many developers of the different subsystems. The configuration management infrastructure based on quattor has been instrumented accordingly to be flexible and easy to use by the software librarians while still performant and robust. Big parts of the cluster can be reconfigured and failing computers reinstalled in only a few minutes. The monitoring infrastructure is being revamped to increase performance and allow a fine grained and user configurable notification that will allow the final experts to receive the notifications of the problems directly and on demand. Details will be given on the adopted solutions which include the following topics: implementation of the redundant and load balanced network and core IT services; deployment and configuration management infrastructure and its customization; the new monitoring infrastructure; virtualization techniques for redundant services… Special emphasis will be put on the scalable approach allowing to increase the size of the cluster with no administration overhead. Finally, the lessons learnt from the two years of running will be presented together with the prospects for the short and long term upgrades and the new technologies now in the pipeline.
Multicloud as the Next Generation of Cloud Infrastructure, by Brad Eckert
So, what are data center networks really built for? Short answer "applications".
Whether it is a public cloud provider, private enterprise, FSI, or telco cloud, the nature of applications across each data center type imposes a different set of demands on the underlying network infrastructure. A next-generation architecture is one that is versatile yet modular enough to address these different application needs, whether they are HPC and Big Data, legacy, or real-time content. A common architecture goal is a unified and consolidated network design that can leverage standardized technology attributes and integrate a versatile workload environment, from high-performance bare-metal servers to a microservices-enabled container environment. This tutorial is aimed at an in-depth, structured understanding of data center business and technical requirements and how EVPN-VXLAN constructs serve as a Swiss-army-knife approach to meeting them. Practical case studies translate theoretical concepts into building blocks for designing and automating multi-tenant data center deployments. We explore how a unified technology solution can help build a network that grows with increasing east-west traffic and seamlessly connects with the backbone for north-south communication, while leveraging familiar protocol concepts to achieve security insertion. We will also go over operator issues with traffic optimization, multicast and BUM traffic handling, and other common pitfalls. A final step is to define requirements for a cohesive solution using a centralized controller that lets a data center network operator apply the same degree of agility and visibility to both the physical network and the application infrastructure, to truly build a software-defined data center.
PLNOG 8: Ivan Pepelnjak - Data Center Fabrics - What Really Matters, by PROIDEA
The document discusses data center fabric architectures. It describes how traditional data center designs focus on north-south traffic but modern applications generate more east-west traffic between servers. New fabric architectures are needed to provide flexible workload placement and mobility. Common fabric approaches use leaf-spine Clos network designs with non-blocking switching fabrics to provide any-to-any connectivity between endpoints. Large-scale fabrics can be built today using existing switching equipment and protocols like ECMP routing rather than new technologies. The key is to keep layer 2 domains small and use overlay encapsulation for virtual networks.
The document discusses navigating data center architectures, including:
- Juniper offers three data center options (EX Series, QFabric System, and Contrail) which can present confusing alternatives.
- The document outlines four key data center architectures: Virtual Chassis Fabric, IP Fabric, QFabric, and open architectures. It provides details on capabilities and use cases for each.
- Juniper's MetaFabric architecture is presented as a flexible portfolio that spans switching, routing, management, network virtualization, security, and professional services to address customer data center needs.
PacketCloud: an Open Platform for Elastic In-network Services, by yeung2000
This document proposes PacketCloud, an open platform for hosting elastic in-network services. PacketCloud uses cloudlets located at ISP network edges to provide virtual instances for third-party services. These services can be user-requested or transparently intercept traffic. A prototype demonstrates services like encryption achieving over 500Mbps on one node and over 10Gbps across 20 nodes in a cloudlet with minimal delay. The platform aims to efficiently share network resources while providing economic rewards for ISPs and third parties.
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility, by inside-BigData.com
In this deck from the Swiss HPC Conference, Mark Wilkinson presents: 40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility.
"DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics, and astrophysics, cosmology, and nuclear physics, all areas in which the UK is world-leading. DiRAC provides a variety of compute resources, matching machine architecture to the algorithm design and requirements of the research problems to be solved. As a single federated Facility, DiRAC allows more effective and efficient use of computing resources, supporting the delivery of the science programs across the STFC research communities. It provides a common training and consultation framework and, crucially, provides critical mass and a coordinating structure for both small- and large-scale cross-discipline science projects, the technical support needed to run and develop a distributed HPC service, and a pool of expertise to support knowledge transfer and industrial partnership projects. The on-going development and sharing of best-practice for the delivery of productive, national HPC services with DiRAC enables STFC researchers to produce world-leading science across the entire STFC science theory program."
Watch the video: https://wp.me/p3RLHQ-k94
Learn more: https://dirac.ac.uk/
and
http://hpcadvisorycouncil.com/events/2019/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor..., by Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Best 20 SEO Techniques To Improve Website Visibility In SERP, by Pixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
How to Get CNIC Information System with Paksim Ga.pptx, by danishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
UiPath Test Automation using UiPath Test Suite series, part 5, by DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of a CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Threats to mobile devices are increasingly prevalent and growing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many of those features trade security for convenience and capability. This best-practices guide outlines steps users can take to better protect personal devices and information.
HCL Notes and Domino License Cost Reduction in the World of DLAU, by panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and the licenses under the CCB and CCX model have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefit it brings you. Above all, you certainly want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove redundant or unused accounts to save money. There are also some practices that can lead to unnecessary spending, for example using a person document instead of a mail-in database for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It will give you the tools and know-how to stay on top of things. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
Topics covered:
- Reducing license costs by finding and fixing misconfigurations and redundant accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to make the best use of it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices you can apply immediately
Driving Business Innovation: Latest Generative AI Advancements & Success Story, by Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024, by Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
TrustArc Webinar - 2024 Global Privacy Survey, by TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack, by shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Programming Foundation Models with DSPy - Meetup Slides, by Zilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
5. Background: personal experience
• Bandwidth is a scarce resource

Year   Network    Memory   Disk     CPU
1994   10Mb/s     2MB      10MB     386/20MHz
1998   100Mb/s    128MB    2GB      Pentium II/233
2002   100Mb/s    256MB    40GB     Pentium III/800
2007   1Gb/s      2GB      160GB    Core 2/2GHz
2011   1Gb/s      4GB      500GB    Core 2 Quad/3GHz

Over these 17 years: network x100; memory x2000, but slow access; disk x50000; CPU x150 in clock and x4 in cores, with multi-core and instruction-level progress.
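The growth multiples quoted above follow directly from the table's endpoints. A quick back-of-the-envelope check in Python (values copied from the 1994 and 2011 rows, units normalized by hand):

```python
# Endpoint specs from the table above, normalized to common units.
specs_1994 = {"network_mbps": 10, "memory_mb": 2, "disk_mb": 10, "cpu_mhz": 20}
specs_2011 = {"network_mbps": 1000, "memory_mb": 4096,
              "disk_mb": 500_000, "cpu_mhz": 3000}

years = 2011 - 1994
for key, old in specs_1994.items():
    growth = specs_2011[key] / old
    cagr = growth ** (1 / years) - 1  # compound annual growth rate
    print(f"{key:>13}: x{growth:>8,.0f} over {years} years (~{cagr:.0%}/year)")
# Network grew ~x100 while disk grew ~x50,000: bandwidth is the laggard.
```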
6. Background: technology trends
– Disk is cheap (TB and PB are common)
• 500 RMB for 1TB
– Memory is cheap (32GB in a PC is not uncommon)
• 150 RMB for 2GB DRAM
– CPU is powerful yet inexpensive (multi-core)
• 2000 RMB for an Intel Core i7 with 4 cores
– But network bandwidth is a scarce resource
• Intra-DC: replication everywhere for fault tolerance
• Inter-DC: input and output need bandwidth
• $50 per 1G port, $500 per 10G port
– $0.1 buys roughly 1GB of bandwidth = 1 CPU hour = 1GB of storage per month
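To see why replication makes the network the pressure point even at price parity, here is a toy cost model built only on the $0.1 parity above; the dataset size, replication factor, and CPU hours are made-up inputs, not figures from the talk:

```python
# Toy cost model: $0.1 buys ~1 GB of transfer, ~1 CPU hour,
# or ~1 GB-month of storage (the parity quoted above).
UNIT_COST = 0.1

dataset_gb = 1000   # 1 TB working set (assumption)
replicas = 3        # typical intra-DC replication factor (assumption)
cpu_hours = 200     # processing time for one job (assumption)

storage_cost = dataset_gb * replicas * UNIT_COST        # one month, all copies
compute_cost = cpu_hours * UNIT_COST
network_cost = dataset_gb * (replicas - 1) * UNIT_COST  # traffic to fan out copies

print(f"storage ${storage_cost:.0f}/month, compute ${compute_cost:.0f}, "
      f"network ${network_cost:.0f} per rewrite of the dataset")
```

Every rewrite of the dataset pays the fan-out traffic again, which is why intra-DC bandwidth, rather than storage or CPU, tends to be the binding constraint.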
8. DCN reference design
• Does not scale
• Low bandwidth
• Single point of failure
• High cost
9. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
10. Right time for DCN research
• It is a real problem
• It is an important problem
– DCN as the infrastructure for cloud computing
• The assumptions are different
– Data centers are owned by a single organization
– We can innovate at both end-hosts and network devices
– Security is easier (closed environment and trusted people)
11. DCN research: opportunities
• Full of research problems
– Scalability: from tens of thousands to millions of servers
– Performance
– Fault tolerance
– Cost saving
– Feel free to suggest new "TCP" protocols
• You can invent your own DCN!
12. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
13. Research challenges
Applications
• Search
• Distributed execution engine
• Distributed file systems
• Online social networking
• HPC applications
Architectures
• Topology design
• Network virtualization
• Electrical/optical switching
• Commodity vs. special system
Technologies
• DCN management
• DCN platform
• Energy efficiency
Protocols
• DCN routing
• TCP incast congestion control
• Multicast
14. Architecture design
• Scaling: from thousands to millions of servers
• High capacity: support various traffic patterns
• Fault tolerance
• Cost efficient
• Easy to deploy and manage
17. DCell/BCube (MSRA, SIGCOMM'08/'09)
• Put intelligence at the servers
• Use Ethernet switches as the crossbar
• Innovations in topology design and routing
[Figures: the DCell and BCube topologies]
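To make the topologies concrete, here is a minimal sketch of BCube's server addressing and digit-correcting routing; the function names and digit order are illustrative assumptions, and the actual BCubeRouting algorithm additionally permutes the correction order to derive multiple parallel paths:

```python
from itertools import product

def bcube_servers(n, k):
    """All server addresses in BCube(n, k): (k+1)-digit base-n tuples."""
    return list(product(range(n), repeat=k + 1))

def bcube_route(src, dst):
    """One shortest server-to-server path: correct one address digit per
    hop, since servers differing only in digit l share a level-l switch."""
    path, cur = [src], list(src)
    for level, digit in enumerate(dst):
        if cur[level] != digit:
            cur[level] = digit
            path.append(tuple(cur))
    return path

print(len(bcube_servers(4, 1)))     # BCube(4,1) has 16 servers
print(bcube_route((0, 0), (3, 2)))  # [(0, 0), (3, 0), (3, 2)]
```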
20. Technologies: research platform
• A DCN research platform
– High performance: comparable to an ASIC
– Easy to program: comparable to a commodity server
– Rich functions
• Programmable packet forwarding
• Experiment with various control/management functions
• Can implement various routing/congestion-control designs
• ServerSwitch (MSRA, NSDI'11)
21. Applications
• A unified network for both data center and HPC applications?
– Topology: data centers are tree-based; HPC uses torus/mesh and fat-tree.
– Routing: data centers use single-path routing (L2 spanning tree, L3 shortest-path routing); HPC uses deterministic routing and per-packet adaptive routing to exploit path diversity.
– Flow control: data centers allow packet drops and control congestion end-to-end; HPC avoids packet drops with hop-by-hop flow control.
– Application support: data centers serve search, e-commerce, and cloud computing; HPC serves scientific applications.
– Programming API: data centers use the TCP/IP socket; HPC uses MPI/RDMA.
22. Outline
• DCN background
• Opportunities
• Research challenges
• A modular DCN design
23. Team
• Chuanxiong Guo, Guohan Lu, Haitao Wu, Yongqiang Xiong
• Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin Jia, Jun Li
• Alumni/Alumna
– members: Songwu Lu, Dan Li
– interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang, Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
29. Solution: ServerSwitch
• Software: full programmability at the server CPU
– Kernel module for low-latency processing
– User space for easy-to-use programmability
• PCI-E interconnection: low latency and high throughput
• Hardware: packet forwarding in a commodity switching ASIC
– High performance but limited programmability
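The division of labor can be sketched as a fast-path/slow-path dispatcher: flows matched in the ASIC's table are forwarded in hardware at line rate, and everything else is punted over PCI-E to software on the server CPU, which can install new rules. All names below are illustrative, not the ServerSwitch API:

```python
# Illustrative sketch of the ServerSwitch split (not the real API).
asic_table = {}  # flow key -> output port, programmed from software

def asic_forward(pkt):
    """Fast path: exact-match lookup in the switching ASIC's table."""
    return asic_table.get(pkt["dst"])  # None means "no rule: punt to CPU"

def cpu_process(pkt):
    """Slow path: arbitrary software logic on the server CPU; here it
    picks a port and installs a rule so the flow stays in hardware."""
    port = hash(pkt["dst"]) % 4   # toy routing decision
    asic_table[pkt["dst"]] = port  # program the ASIC over PCI-E
    return port

def handle(pkt):
    port = asic_forward(pkt)
    return port if port is not None else cpu_process(pkt)

print(handle({"dst": "10.0.0.2"}))  # first packet takes the slow path
print(handle({"dst": "10.0.0.2"}))  # later packets stay on the fast path
```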
31. Summary
• DCN is an area full of opportunities and challenges
• The best is yet to come!
• Further information:
http://research.microsoft.com/en-us/projects/msradcn/default.aspx