Private Cloud Initiatives at AIST (産総研におけるプライベートクラウドへの取り組み)
Presented at the 33rd Grid Consortium (グリッド協議会) Workshop.

Private Cloud Initiatives at AIST: Presentation Transcript

  • 1. Title slide: Private Cloud Initiatives at AIST, presented December 21, 2011, at the 33rd Grid Consortium Workshop.
  • 2. Outline of the talk (topics include AIST's cloud systems, virtualization, and I/O performance).
  • 3. X as a Service.
  • 4. Cloud service layers: Software as a Service (SaaS), e.g. Google Apps and Salesforce.com; Platform as a Service (PaaS) for Web applications, e.g. Google App Engine and Windows Azure; Infrastructure as a Service (IaaS), e.g. Amazon EC2.
  • 5. Characteristics of IaaS: pay-as-you-go pricing, self-service provisioning over the Web, flexible use of IT resources, and cloud bursting.
  • 7. AIST's HPC systems: the AIST Super Cluster (March 2004 to February 2010), the AIST Green Cloud (since March 2010), and the AIST Super Cloud (since June 2011).
  • 8. The AIST Super Cluster: F-32, P-32, and M-64 node groups.
  • 10. Application highlight: a full-electron calculation of photosynthetic proteins with more than 20,000 atoms, which received the ACM/IEEE Supercomputing 2005 Best Paper Award for "Full Electron Calculation Beyond 20,000 Atoms: Ground Electronic State of Photosynthetic Proteins".
  • 11. Multi-site simulation example: an NEB + hybrid QM/MD calculation using GridRPC + MPI over 1,193 CPUs, with QM tasks distributed via RPC to NCSA (64x4 CPUs), SDSC (64x2), Purdue (64x4), USC (64x1), and AIST F32/P32, and the MD part run with MPI on AIST F32 (41x1 CPUs) under an NEB scheduler.
  • 12. Top500 ranking of the P-32 cluster: 19th at its debut, later 458th.
  • 13. AIST Green Cloud Server (AGC), in operation since March 2010: 8 Dell PowerEdge M1000e enclosures holding 16 Dell PowerEdge M610 blades each (128 nodes); each node has two Intel Xeon E5540 CPUs (2.53 GHz, 8 MB cache), 48 GB of memory, and two 300 GB HDDs; the interconnect is Mellanox M3601Q InfiniBand 4X QDR (32-port) plus Gigabit Ethernet.
  • 14. AIST Super Cloud Server (ASC), in operation since June 2011: 12 Dell PowerEdge M1000e enclosures holding 16 Dell PowerEdge M610 blades each (192 nodes); each node has two Intel Xeon E5620 CPUs (2.4 GHz, 12 MB cache), 24 GB of memory, and two 600 GB HDDs; networking is 10 Gigabit Ethernet plus Gigabit Ethernet.
  • 15. Cost and power comparison (AIST Super Cluster 2004 / AGC 2009 / ASC 2011), showing roughly a 1/10 reduction: procurement cost (M yen) 1500 / 76 / 66; cost over 5 years (M yen) 400 / 28 / 43; nodes 1408 / 128 / 192; cores 2816 / 1024 / 1536; peak performance (TFlops) 14.6 / 10.4 / 14.6; power (kW) 800 / 63 / 86 and 460 / 40 / 48; power efficiency (Gflops/kW) 18.25 / 165.08 / 169.77 and 31.74 / 260 / 304.17.
  • 16. ASC operation: pay-as-you-go allocation of nodes either as bare-metal machines (BMM) or as KVM virtual machines managed with OpenNebula; the host OS is Scientific Linux 6.0, and available guest/BMM OS images include CentOS 5.6, CentOS 4.9, and openSUSE 11.4 (the slide shows a chart of node allocation over time).
  • 17. Positioning of the AIST Super Cluster, AIST Green Cloud, and AIST Super Cloud, procured at 1-2 year intervals; T2K systems and IT infrastructure are also mentioned.
  • 19. Example: the ASC system.
  • 20. Components of a private cloud stack: hypervisors (VMware ESXi, Xen, KVM, Hyper-V), guest OS images, and cloud management middleware, much of it OSS (Eucalyptus, OpenStack, CloudStack, OpenNebula, Nimbus, Wakame, Rocks, Condor, VMware vSphere, ...).
  • 21. What an IaaS layer manages: VM images, VM hosts, and the instances running on them.
  • 22. Representative cloud middleware: Rocks (http://www.rocksclusters.org/), developed at UCSD, based on CentOS (RHEL) and extended with Rolls; Eucalyptus (http://www.eucalyptus.com/), from UCSB, compatible with Amazon EC2/S3; OpenStack (http://openstack.org/), started by NASA and Rackspace (NASA previously used Eucalyptus); and OpenNebula (http://opennebula.org/).
  • 23. Feature comparison of Rocks, Eucalyptus, OpenStack, and OpenNebula: supported host OS (e.g. RHEL5), supported hypervisors (Xen, KVM, VMware, Hyper-V), VM networking (VLANs), shared /home, and whether the VMM is driven through libvirt.
  • 24. About OpenNebula: developed at the Complutense University of Madrid, licensed under the Apache License 2.0, commercially supported by C12G Labs, and funded in part by EU FP7 projects (http://opennebula.org/about:about).
  • 25. OpenNebula architecture: a frontend on the global network runs ONED and the scheduler and is driven from the CLI or the Sunstone GUI; VM hosts run the VMM and are controlled over SSH; /srv/cloud (with one/ and images/) is shared with the hosts via NFS or images are copied with scp; the VMs themselves sit on a local network.
  • 26. OpenNebula interfaces: a CUI that operates on VM IDs and host IDs, and Sunstone, a Web GUI.
  • 27. Contextualization: a mechanism for passing per-VM settings into the guest at boot, such as the root password, SSH keys, the address of oned, /etc/hosts, /etc/resolv.conf, and NFS mounts (figure from the OpenNebula documentation).
  • 28. Contextualization in practice: the VM template (test.one) references a context.sh script that is made available inside the guest, and the VM is started with "% onevm create test.one" (a template sketch follows).
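As a rough illustration of such a template (not taken from the slides; the image name, network name, and file path are placeholders), an OpenNebula template with a CONTEXT section might look like this:

    cat > test.one <<'EOF'
    NAME   = "test"
    CPU    = 1
    MEMORY = 512
    DISK   = [ IMAGE = "ttylinux" ]        # hypothetical registered image
    NIC    = [ NETWORK = "cloud-net" ]     # hypothetical virtual network
    # Files listed in CONTEXT are packed into an ISO that the guest mounts
    # at boot; context.sh can then read the variables defined here.
    CONTEXT = [
      HOSTNAME  = "$NAME",
      IP_PUBLIC = "$NIC[IP]",
      FILES     = "/srv/cloud/one/context.sh"
    ]
    EOF
    onevm create test.one   # submit to oned; the scheduler places the VM on a host
    onevm list              # watch the state change (pend -> prol -> runn)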
  • 29. MAC_PREFIX:IP addressing: OpenNebula derives a VM's MAC address from its IP address, so the guest can recover its IP from the NIC's MAC at boot. "% onevnet show 1" lists leases such as LEASE=[IP=192.168.57.209, MAC=02:00:c0:a8:39:d1, USED=0, VID=-1]; the MAC is the "02:00" prefix followed by the four IP octets in hexadecimal (a conversion sketch follows).
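A minimal bash sketch (not from the slides) of how a guest can recover its IP from the MAC, assuming OpenNebula's default 02:00 MAC prefix and an interface named eth0:

    MAC=$(cat /sys/class/net/eth0/address)            # e.g. 02:00:c0:a8:39:d1
    IFS=: read -r _ _ o1 o2 o3 o4 <<< "$MAC"          # drop the 02:00 prefix
    printf '%d.%d.%d.%d\n' 0x$o1 0x$o2 0x$o3 0x$o4    # -> 192.168.57.209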
  • 30. Remaining issues: OS image management, VLAN-based network isolation, and PCI device (I/O) support, which OpenNebula 3.0 does not yet provide.
  • 31. Operational experience: managing VM images and guest OSes, exchanging VMs with SDSC, P2V conversion, and coping with hardware differences on ASC.
  • 32. Notes on the virtualization layer: the choice of VMM matters; QEMU, Xen, and libvirt form the common Linux stack (libvirt is also used by OpenStack); the host CPU can be exposed to the guest with QEMU's -cpu host; I/O performance is the main remaining concern and is discussed next (a configuration sketch follows).
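As an assumed example (not from the slides) of exposing the host CPU's feature flags to the guest so that tuned HPC binaries run unmodified:

    qemu-system-x86_64 -enable-kvm -cpu host -smp 8 -m 4096 \
        -drive file=guest.img,if=virtio    # guest.img is a placeholder image
    # In newer libvirt releases the equivalent is <cpu mode='host-passthrough'/>
    # in the domain XML.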
  • 33. I/O virtualization.
  • 35. Target workloads range from database applications to HPC.
  • 36. Three ways to give VMs network I/O: I/O emulation (guest driver plus a virtual switch in the VMM), PCI passthrough (the guest uses the physical driver directly), and SR-IOV (each VM gets a virtual function; switching happens in the NIC's VEB). NIC sharing among VMs: emulation yes, passthrough no, SR-IOV yes; performance: emulation poor, passthrough good, SR-IOV good. A sketch of enabling SR-IOV follows.
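A hedged sketch (assumed commands, not from the slides) of creating SR-IOV virtual functions on an Intel igb NIC of that era:

    modprobe -r igb                   # reload the driver with VFs enabled
    modprobe igb max_vfs=7            # igb module parameter to create VFs
    lspci | grep "Virtual Function"   # the VFs appear as extra PCI devices
    # A VF (e.g. at PCI address 05:10.0) can then be assigned to a guest,
    # as in the migration sequence on the later slides.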
  • 37. Evaluation environment (AIST Green Cloud, 16 HPC compute nodes, one VM per node). Compute node: Dell PowerEdge M610 with two quad-core Intel Xeon E5540 (2.53 GHz), Intel 5520 chipset, 48 GB DDR3 memory, Mellanox ConnectX InfiniBand (MT26428); blade switch: Mellanox M3601Q (QDR, 16 ports). Host environment: Debian 6.0.1, Linux kernel 2.6.32-5-amd64, KVM 0.12.50, gcc/gfortran 4.4.5, Open MPI 1.4.2. VM environment: 8 VCPUs, 45 GB memory.
  • 38. MPI point-to-point bandwidth (MB/s, higher is better) versus message size (1 byte to 1 GB), comparing bare metal with KVM using InfiniBand PCI passthrough.
  • 39. NPB BT-MZ results (Gop/s total and parallel efficiency, higher is better) on 1 to 16 nodes for bare metal, KVM, and Amazon EC2; the degradation of parallel efficiency is 2% for KVM and 14% for EC2.
  • 40. Bloss (hybrid MPI + OpenMP) parallel efficiency on 1 to 16 nodes for bare metal, KVM, and Amazon EC2; the degradation of parallel efficiency is 8% for KVM and 22% for EC2.
  • 41. Related publications on HPC clouds: a SACSIS2011 paper on InfiniBand PCI passthrough for HPC (pp. 109-116, May 2011), evaluating KVM and Xen PVM with NPB BT-MZ and Bloss; a journal paper in ACS37 on HPC clouds, including NUMA effects with KVM and Xen HVM; and Takano et al., "Toward a practical 'HPC Cloud': Performance tuning of a virtualized InfiniBand cluster", CUTE2011, December 2011, on VMM tuning with the HPC Challenge benchmark.
  • 42. Live migration with PCI passthrough: a guest with a passed-through PCI device cannot be migrated as-is, so the passed-through NIC is bonded with a paravirtual NIC in active-standby mode inside the guest; with an SR-IOV NIC, each guest gets a VF for fast I/O plus a PV NIC as the standby, sharing a single physical NIC (a bonding sketch follows).
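A sketch of the guest-side bonding under assumed interface names (eth0 = virtio PV NIC, eth1 = SR-IOV VF); not taken verbatim from the slides:

    modprobe bonding mode=active-backup primary=eth1 miimon=100
    ip link set bond0 up
    ifenslave bond0 eth0 eth1             # traffic prefers the VF while it exists
    ip addr add 192.168.0.2/24 dev bond0  # address taken from the demo slide
    # When the VF is hot-removed before migration, bond0 fails over to the
    # virtio slave; after migration a new VF is enslaved and becomes primary.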
  • 43. Initial state: inside the guest OS, bond0 bonds eth0 (virtio, bridged through tap0 and br0 on the host) with eth1 (igbvf, an SR-IOV VF); both hosts drive their SR-IOV NIC with the igb driver.
  • 44. Step 1: detach the VF from the guest with the QEMU monitor command "(qemu) device_del vf0"; bond0 fails over to the virtio slave.
  • 45. Step 2: start live migration with "(qemu) migrate -d tcp:x.x.x.x:y" toward a destination QEMU launched with "$ qemu -incoming tcp:0:y ..."; during migration the guest runs on the virtio NIC only.
  • 46. Step 3: on the destination host, attach a local VF with "(qemu) device_add pci-assign,host=05:10.0,id=vf0"; the bond switches back to the VF (a consolidated sketch of the sequence follows).
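The whole sequence can be scripted; the following is a hedged sketch (not from the slides) that assumes each QEMU was started with a monitor socket (-monitor unix:/tmp/mon,server,nowait) and that the destination was launched with -incoming tcp:0:4444; the PCI address 05:10.0 and device id vf0 follow the slides' example:

    mon() { echo "$1" | socat - unix-connect:/tmp/mon; }

    mon "device_del vf0"                 # 1. hot-unplug the VF; bond0 fails over to virtio
    sleep 2                              #    give the bond time to switch
    mon "migrate -d tcp:dst-host:4444"   # 2. start live migration (poll with "info migrate")
    # 3. after migration completes, re-attach a VF on the destination host:
    ssh dst-host 'echo "device_add pci-assign,host=05:10.0,id=vf0" | socat - unix-connect:/tmp/mon'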
  • 47. Demonstration: an MPI job with rank 0 on a physical host and rank 1 in the guest (bond0 over virtio eth0 and igbvf eth1), using addresses 192.168.0.1-192.168.0.3 on the 192.168.0.0/24 network, while the guest is migrated between the two SR-IOV hosts.
  • 49. GridARS: a resource management system in which a global resource coordinator (GRC) co-allocates resources through per-domain resource managers (CRM/NRM/SRM for compute, network, and storage) and monitoring/execution components (AEM, DMS/A, DMC/C) across multiple domains (Domain 0 to Domain 3).
  • 50. PRAGMA Grid/Clouds: 26 institutions in 17 countries/regions, 23 compute sites, and 10 VM sites, including AIST, OsakaU, and UTsukuba (Japan), SDSC and IndianaU (USA), KISTI and KMU (Korea), CNIC, JLU, and LZU (China), ASGC and NCHC (Taiwan), HKU (Hong Kong), UoHyd (India), NECTEC and KU (Thailand), ASTI (Philippines), HCMUT, HUT, IOIT-Hanoi, and IOIT-HCM (Vietnam), MIMOS and USM (Malaysia), MU (Australia), BESTGrid (New Zealand), UZH (Switzerland), UChile (Chile), UValle (Colombia), and CeNAT-ITCR (Costa Rica).
  • 51. Putting it all together on PRAGMA: VM images are stored in Gfarm file systems, vm-deploy scripts at PRAGMA sites copy images on demand from Gfarm, the images are modified and started at each site (Rocks Xen, Rocks KVM, and OpenNebula KVM deployments at SDSC, AIST, NCHC, IU, LZU, and Osaka), and application jobs (AIST QuickQuake, Web Map Service, GeoGrid + Bloss, HotSpot; NCHC FmoRf; UCSD Autodock) are managed with Condor.
  • 52. Summary: as of 2011 the private cloud is operated with OpenNebula, with OpenStack and other OSS also under consideration; remaining issues include PCI passthrough I/O support.
  • 53. Closing remarks (cf. "rough consensus and running code").