Streaming Exa-scale Data over 100Gbps Networks
Mehmet Balman, Computational Research Division, Lawrence Berkeley National Laboratory
Collaborators: Eric Pouyoul, Yushu Yao, E. Wes Bethel, Burlen Loring, Prabhat, John Shalf, Alex Sim, Arie Shoshani, Dean N. Williams, Brian L. Tierney
Outline
• A recent 100Gbps demo by ESnet and Internet2 at SC11
• One of the applications: data movement of large datasets with many files (Scaling the Earth System Grid to 100Gbps Networks)
Climate Data Distribution
• ESG data nodes
• Data replication in the ESG Federation
• Local copies: data files are copied into temporary storage at HPC centers for post-processing and further climate analysis.
Climate Data over 100Gbps
• Data volume in climate applications is increasing exponentially.
• An important challenge in managing ever-increasing data sizes in climate science is the large variance in file sizes.
• Climate simulation data consists of a mix of relatively small and large files, with an irregular file size distribution in each dataset.
• Many small files
Keep the data channel full
[Diagram: "request a file" / "send file" exchanges over RPC and FTP; each file pays a request/response round trip before its data flows]
• Concurrent transfers
• Parallel streams
lots-of-small-files problem!
File-centric tools?
• Not necessarily high-speed (same distance)
  - Latency is still a problem
[Diagram: "request a dataset" / "send data" over a 100Gbps pipe vs. a 10Gbps pipe]
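Why per-file request/response exchanges hurt at this scale can be seen with a back-of-the-envelope model. This is an illustrative Python sketch, not something from the slides; the 50 ms round-trip time and the file sizes are assumed numbers chosen only to show the trend.

# Rough model of the lots-of-small-files problem: every file pays one
# request/response round trip before its bytes start flowing.
# The RTT and file sizes below are assumptions for illustration only.

def effective_gbps(file_size_bytes, rtt_s=0.05, link_gbps=100.0):
    """Effective per-file throughput when each file costs one RTT up front."""
    transfer_s = file_size_bytes * 8 / (link_gbps * 1e9)
    return file_size_bytes * 8 / ((rtt_s + transfer_s) * 1e9)

for size in (1 << 20, 64 << 20, 4 << 30):          # 1 MB, 64 MB, 4 GB files
    print(f"{size >> 20:5d} MB file -> {effective_gbps(size):6.2f} Gbps effective")

With these assumed numbers a 1 MB file sees well under 1 Gbps while a 4 GB file approaches line rate, which is why concurrent transfers and parallel streams (previous slide) or bundling many files into a continuous stream (following slides) are needed to keep a 100Gbps channel full.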
Framework for the Memory-mapped Network Channel
Memory caches are logically mapped between client and server.
Moving climate files efficiently
Advantages
• Decoupling I/O and network operations (a sketch follows below)
  - front-end (I/O processing)
  - back-end (networking layer)
• Not limited by the characteristics of the file sizes: an on-the-fly tar approach, bundling and sending many files together
• Dynamic data channel management: the parallelism level can be increased or decreased, both in the network communication and in the I/O read/write operations, without closing and reopening the data channel connection (as is done in regular FTP variants).
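A minimal sketch of this decoupling, using hypothetical names rather than MemzNet's actual code: front-end threads read files of any size and publish fixed-size blocks into a bounded cache, back-end threads drain the cache to the network, and the number of threads on either side can change independently of the data channel.

# Sketch of decoupled I/O and network stages (hypothetical structure, not the
# MemzNet source): readers pack files into uniform blocks, senders stream
# whichever blocks are ready, so file size no longer dictates the transfer unit.
import queue

BLOCK_SIZE = 4 * 1024 * 1024                 # uniform block size
blocks = queue.Queue(maxsize=256)            # bounded pool of in-flight blocks

def front_end(paths, n_senders):
    """I/O stage: read files of any size and emit fixed-size blocks."""
    for path in paths:
        with open(path, "rb") as f:
            while chunk := f.read(BLOCK_SIZE):
                blocks.put(chunk)            # blocks, not files, are the unit
    for _ in range(n_senders):
        blocks.put(None)                     # one end-of-stream marker per sender

def back_end(send):
    """Network stage: push ready blocks; its parallelism is independent of I/O."""
    while (chunk := blocks.get()) is not None:
        send(chunk)

# e.g. 8 reader threads and 4 sender threads (threading.Thread around front_end
# and back_end) could be resized without closing the underlying connection;
# `send` would wrap the TCP stream(s).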
The SC11 100Gbps demo environment
The SC11 100Gbps Demo
• CMIP3 data (35TB) from the GPFS filesystem at NERSC
  - Block size 4MB
  - Each block's data section was aligned according to the system page size
  - 1GB cache both at the client and the server
• At NERSC, 8 front-end threads on each host for reading data files in parallel
• At ANL/ORNL, 4 front-end threads for processing received data blocks
• 4 parallel TCP streams (four back-end threads) were used for each host-to-host connection
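The cache sizing on this slide works out to 256 blocks per side (1GB / 4MB). The snippet below is a hedged sketch of how page-aligned 4MB blocks could be carved out of one memory-mapped cache; it illustrates the arithmetic and the alignment constraint, not the demo's actual allocator.

# 1GB cache of 4MB blocks = 256 blocks in flight per side.  The anonymous
# mmap below is one way to get page-aligned buffers (an assumed approach,
# not necessarily what the demo code used).
import mmap

BLOCK_SIZE = 4 * 1024 * 1024            # 4MB, a multiple of the page size
CACHE_SIZE = 1 * 1024 * 1024 * 1024     # 1GB cache at client and at server
NUM_BLOCKS = CACHE_SIZE // BLOCK_SIZE   # 256

assert BLOCK_SIZE % mmap.PAGESIZE == 0  # block data sections stay page-aligned

cache = mmap.mmap(-1, CACHE_SIZE)       # one mapping backs the whole cache

def block_view(i):
    """Zero-copy view of block i; block boundaries fall on page boundaries."""
    return memoryview(cache)[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]

print(f"{NUM_BLOCKS} blocks of {BLOCK_SIZE >> 20} MB, page size {mmap.PAGESIZE} bytes")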
83Gbps throughput
MemzNet: memory-mapped zero-copy network channel
[Diagram: front-end threads on each end access memory blocks in a local cache; the two caches exchange blocks over the network]
Memory caches are logically mapped between client and server.
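Since blocks from many files share the channel and parallel streams can deliver them out of order, the receiving front-end needs to know where each payload belongs. A small per-block tag is one way to do that; the layout below is an illustrative assumption, not MemzNet's actual wire format.

# Hypothetical block tag (file id, file offset, payload length) so a received
# block can be placed back into the right file at the right offset.  The field
# layout is an assumption for illustration, not MemzNet's wire format.
import struct

TAG = struct.Struct("!IQI")      # file id (u32), offset (u64), length (u32)

def pack_block(file_id, offset, payload):
    return TAG.pack(file_id, offset, len(payload)) + payload

def unpack_block(blob):
    file_id, offset, length = TAG.unpack_from(blob)
    return file_id, offset, blob[TAG.size:TAG.size + length]

# Example: tag a 4MB chunk taken from offset 8MB of (hypothetical) file 42.
chunk = bytes(4 * 1024 * 1024)
fid, off, payload = unpack_block(pack_block(42, 8 << 20, chunk))
assert (fid, off, len(payload)) == (42, 8 << 20, len(chunk))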
ANI Middleware Testbed
[Diagram: ANI 100Gbps testbed topology, updated December 11, 2011. NERSC hosts nersc-diskpt-1/2/3 and ANL hosts anl-mempt-1/2/3, each with 4x10GE (MM) NICs (Myricom, Chelsio, Mellanox, HotLava), connect through site switches and ANI 100G routers across the ANI 100G network. Note: ANI 100G routers and the 100G wave available until summer 2012; testbed resources after that subject to funding availability.]
SC11 100Gbps demo
Many TCP Streams
(a) Total throughput vs. the number of concurrent memory-to-memory transfers; (b) interface traffic, packets per second (blue) and bytes per second, over a single NIC with different numbers of concurrent transfers. Three hosts, each with 4 available NICs, giving a total of 10 10Gbps NIC pairs, were used to saturate the 100Gbps pipe in the ANI Testbed. 10 data movement jobs, each corresponding to a NIC pair at source and destination, were started simultaneously. Each peak represents a different test; 1, 2, 4, 8, 16, 32, and 64 concurrent streams per job were initiated for 5-minute intervals (e.g., at concurrency level 4 there are 40 streams in total).
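A test like the one in this figure can be approximated with a simple harness that opens N concurrent TCP streams per job and pushes in-memory buffers for a fixed interval. The sketch below is an assumed stand-in for the actual testbed scripts; host, port, and the 50MB socket buffer are parameters, and it expects a discard-style sink listening at the destination.

# Minimal memory-to-memory load generator (assumed harness, not the testbed's
# own tooling): one "job" drives N concurrent TCP streams to one destination.
import socket
import threading
import time

def one_stream(host, port, seconds=300, chunk=4 << 20):
    buf = bytes(chunk)                                   # no disk I/O involved
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 50 << 20)  # 50MB
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            s.sendall(buf)

def run_job(host, port, streams):
    """Start `streams` concurrent senders (1, 2, 4, ... 64 in the tests)."""
    workers = [threading.Thread(target=one_stream, args=(host, port))
               for _ in range(streams)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()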
Effects of many streams
ANI testbed, 100Gbps (10 x 10G NICs, three hosts): interrupts/CPU vs. the number of concurrent transfers (1, 2, 4, 8, 16, 32, 64; 5-minute intervals); TCP buffer size is 50MB.
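One way to reproduce this measurement is to sample /proc/interrupts on the transfer hosts while the concurrent streams run. The sketch below is an assumed measurement helper (Linux-only), not the instrumentation used on the testbed.

# Sample /proc/interrupts twice and report per-CPU interrupt rates, the
# quantity plotted against the number of concurrent transfers.
# Assumed helper for Linux hosts; not the testbed's own instrumentation.
import time

def percpu_interrupt_totals(path="/proc/interrupts"):
    with open(path) as f:
        cpus = f.readline().split()              # header row: CPU0 CPU1 ...
        totals = [0] * len(cpus)
        for line in f:
            counts = line.split()[1:]            # drop the "IRQ:" label
            for i in range(min(len(cpus), len(counts))):
                if counts[i].isdigit():
                    totals[i] += int(counts[i])
        return totals

before = percpu_interrupt_totals()
time.sleep(5)
after = percpu_interrupt_totals()
for cpu, (a, b) in enumerate(zip(before, after)):
    print(f"CPU{cpu}: {(b - a) / 5:.0f} interrupts/s")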
MemzNet's Performance
[Plots: throughput of GridFTP vs. MemzNet in the SC11 demo and in the ANI Testbed; TCP buffer size is set to 50MB]
MemzNet's Architecture for data streaming
Acknowledgements
Peter Nugent, Zarija Lukic, Patrick Dorn, Evangelos Chaniotakis, John Christman, Chin Guok, Chris Tracy, Lauren Rotman, Jason Lee, Shane Canon, Tina Declerck, Cary Whitney, Ed Holohan, Adam Scovel, Linda Winkler, Jason Hill, Doug Fuller, Susan Hicks, Hank Childs, Mark Howison, Aaron Thomas, John Dugan, Gopal Vaswani