XS Boston 2008 Fault Tolerance

713 views

Published on

Yoshi Tamura: VM Synchronization for Fault Tolerance Using DomT

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
713
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
8
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

XS Boston 2008 Fault Tolerance

  1. 1. Kemari: Virtual Machine Synchronization for Fault Tolerance using DomT Yoshi Tamura NTT Cyber Space Labs. tamura.yoshiaki@lab.ntt.co.jp 2008/6/24
  2. 2. Outline  Our goal Brush up on Xen Summit 2007  Design  Architecture overview  Implementation  Evaluation  Conclusion Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 2
  3. 3. What is Kemari? (Kemari)  Kemari is a football game that players keep a ball in the air Don’t drop the ball! Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 3
  4. 4. Our goal Don’t drop the ball! Don’t drop the VMs! Hardware failure Keep running transparently Kemari: Virtual Machine Synchronization Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 4
  5. 5. What needs to be done?   Virtual Machine Synchronization   Primary VM and Secondary VM must be identical   Detection of failure Extension of existing techniques   Failover mechanism Failover Primary node Secondary node Hardware Apps Apps Failure VM Synchronization Guest OS Guest OS VMM VMM Network Hardware Hardware SAN Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 5
  6. 6. How to synchronize VMs? Secondary Primary   Need to make the overhead of VM VM sync smaller 1.  Pause Primary, and sync with   Make sync time shorter Secondary tsync Only transfer updated data tinterval   Sync VMs less often 2. Resume Primary after sync Secondary must be able to continue transparently   Sync VMs before sending or receiving Events   Events: Storage, network, console Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 6
  7. 7. What happens if synced on specific intervals? Primary VM Primary VMM Storage Secondary VMM Secondary VM Vi-1 1. Sync VM 2. Update Secondary Vi 3. Sync completed Vi 4. Read request Sj 5. Reply Vi+1 6. Write request Sj+1 Vi+2 7. Reply × 8. Failover to Secondary VM after detecting HW failure Vi Sj+1 9. Read request Vi : VM’s state Sj : Storage’s state The state between VM and storage isn’t consistent   Secondary VM won’t be able to continue transparently Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 7
  8. 8. Sync on events from VM to storage Primary VM Primary VMM Storage Secondary VMM Secondary VM Vi Sj Vi-1 1. Read / Write request 2. Sync VM and event 3. Update Secondary Vi 4. Sync completed Resume point: 5. Resume Read / Write Just before operating storage Sj+1 6. Reply Vi : VM’s state Vi+1 Sj : Storage’s state   Secondary will redo the same operation as Primary   Secondary will receive the same reply as Primary Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 8
  9. 9. Demo… Hardware failure VNC Server VNC Client Guest VM is running on Kemari   Guest is running VNC server, and the client accesses via VNC client   xclock is launched from the client   See what happens to the clock when the primary physical server is   shut downed from HP iLO2 management console Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 9
  10. 10. Architecture overview Kemari Kemari Sync DomT Dom0 DomT DomT Dom0 Back-end Front-end Front-end Back-end Kemari Xen Xen Network Hardware Hardware SAN   The core of the synchronization mechanism resides in hypervisor to synchronize virtual machines efficiently   LOC ≅ 3000 (hypervisor: 1000, Dom0+Tools: 2000) Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 10
  11. 11. What is DomT? Domain U Domain T Shadow pfn mfn pfn mfn PT PT PT PT PT   Para-virtualized domain which uses shadow page table (auto-translated-mode)   Don't have to translate the page tables on transferring   DomT patch set for xen-3.0.4 was written by Michael A Fetterman from University of Cambridge Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 11
  12. 12. Implementation of Kemari  Event Channel tapping  Transferring DomT  Restoring para-virtualized devices Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 12
  13. 13. Event Channel tapping   Simple but the key component of Kemari   Monitors IN/OUT or Both   Registered function is called on specific events   Dynamically attachable   May be useful for measurements DomT Dom0 2 5 Front-end Back-end ECS_TAP 1 3 4 Kemari Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 13
  14. 14. Transferring DomT 1 (VMM) 2 (VMM) 3 (Tools) 4 (Tools) DomT Shared tmp Grant table buffer buffer (VMM/Tools) Log dirty VCPU DomT region Primary Secondary bitmap 1.  Pauses DomT and locks the grant tables. No need to suspend! •  Grant tables are mapped at the last 4 pages of DomT region 2.  Extracts dirtied pfn from the bitmap, copies pfns and the vcpu to the shared buffer, and notifies Tools via event channel 3.  Maps dirtied pages, transfers pages and vcpu to the secondary 4.  Secondary prepares temp buffers to rollback when failure is detected during transfer Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 14
  15. 15. Restoring para-virtualized devices 1. Device Channel is DomT stored in DomT region Front-end Device Channel 2. Attach the Back-end to Dom0 the Device Channel using BACK_RING_ATTACH Back-end macro Response Response consumer producer 3. Adjust producer and consumer indexes of the Back-end appropriately Request Request producer consumer Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 15
  16. 16. Evaluation   Evaluation items   Performance of the Primary VM (Network and File I/O) using netperf and iozone   Test machines   Hardware spec •  CPU: Intel Xeon 3GHz X 2 •  Memory: 4GB •  Network: Gigabit Ethernet, InfiniBand •  SAN: FC Disk Array   VM spec •  VMM: Xen 3.0.4 with DomT support •  Guest OS: Debian Etch •  Memory: 512MB Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 16
  17. 17. Performance of Primary VM ■ DomT ■ Kemari (Ethernet) ■ Kemari (InfiniBand) 100 8 100 90 90 7 80 80 6 Throughput [MB/sec] Throughput [MB/sec] 70 70 Throughput [Mb/sec] 5 60 60 50 4 50 40 40 3 30 30 2 20 20 1 10 10 0 0 0 Network O_SYNC Buffered + fsync   InfiniBand boosted the performance of Network and Buffered + fsync, both of which dirties many pages   All benchmarks continued transparently when the primary server was shut downed from the HP iLO 2 management console Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 17
  18. 18. Conclusion  Kemari is a Virtual Machine synchronization mechanism to achieve Fault Tolerance  Don't drop the ball! Don't drop the VMs!  Implemented Kemari using Xen and DomT   Thanks to Michael from University of Cambridge  Demonstrated Kemari achieved acceptable performance Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 18
  19. 19. Future work  Demonstrate the range of applications Kemari can manage to run transparently  Improve the performance of I/O intensive applications that send numbers of events  Hosting HVM domains with PV drivers  Hosting multiple domains simultaneously  Functions to implement for practical use such as detection of HW failure and failover mechanism Copyright © 2007-2008 Nippon Telegraph and Telephone Corporation 19

×