Implements BIOS emulation support for BHyVe

Published in: Technology

  1. Implements BIOS emulation support for BHyVe
     Takuya ASADA <syuu@freebsd.org>
     Sunday, March 17, 2013
  2. Before talking about BIOS emulation on BHyVe, let’s take a quick look at BHyVe’s internal structure and Intel VT-x
  3. BHyVe Overview
     • bhyveload loads the guest OS
     • bhyve is the userland part of the hypervisor; it emulates devices
     • bhyvectl is a management tool
     • libvmmapi is the userland API
     • vmm.ko is the kernel part of the hypervisor
     [Diagram: (1) bhyveload creates the VM instance and loads the guest kernel; (2) bhyve runs the VM instance, connecting the guest kernel to a disk image, a tap device, and a stdin/stdout console; (3) bhyvectl destroys the VM instance. All three tools go through libvmmapi, which uses mmap/ioctl on /dev/vmm/${vm_name} (vmm.ko) in the FreeBSD kernel.]
  4. vmm.ko
     • Provides /dev/vmm/${vmname}
     • Each vmm device file holds the state of one VM instance
     • The device file is created via sysctl: hw.vmm.create
     • It is destroyed via sysctl: hw.vmm.destroy
  5. /dev/vmm/${vmname} interfaces
     • read/write/mmap: the guest memory area can be accessed with standard syscalls (which means you can even dump guest memory with the dd command)
     • ioctl: provides various operations on the VM
  6. /dev/vmm/${vmname} ioctls
     • VM_MAP_MEMORY: maps a guest memory area of the requested size
     • VM_SET/GET_REGISTER: accesses guest registers
     • VM_RUN: runs the guest machine until a virtual device is accessed (or some other trap happens)
  7. bhyveload
     • FreeBSD bootloader ported to userland: userboot
     • bhyveload loads userboot.so as a dynamic library and calls its loader_main function
     • Once called, it does the following:
       • Parses UFS on the disk image and finds the kernel
       • Loads the kernel into the guest memory area (using mmap)
       • Sets the initial guest register values (using the VM_SET_REGISTER ioctl)
         • RIP = kernel entry point
         • CR0 = Paging enable | Protected mode enable
         • EFER = Long mode enable | Long mode active
       • Initializes the page table and sets its address in CR3
       • Creates the GDT, IDT, and LDT, and sets their addresses in GDTR, IDTR, and LDTR
       • Initializes TR
     • The guest machine starts from the kernel entry point with 64-bit mode enabled
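As a rough illustration of the register values listed on this slide, the sketch below builds the CR0 and EFER values from their architectural bit positions. The struct and function names are hypothetical; bhyveload itself sets each register through the VM_SET_REGISTER ioctl, which is not shown here.

```c
#include <stdint.h>

/* Architectural bit positions in CR0 and in the EFER MSR. */
#define CR0_PE   (1u << 0)   /* Protected mode enable */
#define CR0_PG   (1u << 31)  /* Paging enable */
#define EFER_LME (1u << 8)   /* Long mode enable */
#define EFER_LMA (1u << 10)  /* Long mode active */

/* Hypothetical container for the values a loader would hand to the VM. */
struct initial_regs {
    uint64_t rip;
    uint64_t cr0;
    uint64_t cr3;
    uint64_t efer;
};

/* Build the register state for a guest that starts directly in 64-bit mode. */
static struct initial_regs
long_mode_entry_state(uint64_t kernel_entry, uint64_t page_table_root)
{
    struct initial_regs r;

    r.rip = kernel_entry;            /* RIP = kernel entry point */
    r.cr0 = CR0_PG | CR0_PE;         /* paging + protected mode */
    r.cr3 = page_table_root;         /* address of the page table */
    r.efer = EFER_LME | EFER_LMA;    /* long mode enabled and active */
    return r;
}
```

Because the guest starts with paging and long mode already on, it never executes any real-mode code, which is why this boot path needs no BIOS.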
  8. bhyve
     • The bhyve command runs a loop like the following:
         while (1) {
             ioctl(VM_RUN);
             device_io_emulation();
         }
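The loop on this slide can be fleshed out a little without touching the real vmm.ko interface. The sketch below is a simulation only: mock_vm_run() stands in for ioctl(VM_RUN) and replays a scripted list of exits, and the exit reasons and struct names are hypothetical, not the ones vmm.ko defines.

```c
#include <stddef.h>

/* Hypothetical exit reasons; the real set is larger. */
enum exit_reason { EXIT_INOUT, EXIT_MMIO, EXIT_SHUTDOWN };

struct vm_exit {
    enum exit_reason reason;
    unsigned port;               /* I/O port for EXIT_INOUT, unused otherwise */
};

/* Stand-in for a VM: replays a scripted sequence of exits. */
struct mock_vm {
    const struct vm_exit *script;
    size_t len;
    size_t pos;
};

/* Stand-in for ioctl(VM_RUN): returns the next exit, then shutdown forever. */
static struct vm_exit mock_vm_run(struct mock_vm *vm)
{
    struct vm_exit done = { EXIT_SHUTDOWN, 0 };

    if (vm->pos < vm->len)
        return vm->script[vm->pos++];
    return done;
}

/* Run until the guest shuts down; return how many device accesses were seen. */
static int run_loop(struct mock_vm *vm)
{
    int emulated = 0;

    for (;;) {
        struct vm_exit e = mock_vm_run(vm);
        if (e.reason == EXIT_SHUTDOWN)
            break;
        emulated++;              /* device_io_emulation() would run here */
    }
    return emulated;
}
```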
  9. Intel VT-x: Hardware assisted virtualization
     [Diagram: VMX root mode and VMX non-root mode, each with User (Ring 3) and Kernel (Ring 0); VMEntry goes from root mode to non-root mode, VMExit goes back.]
     • New CPU modes: VMX root mode (hypervisor) / VMX non-root mode (guest)
     • When an event occurs that must be emulated by the hypervisor, the CPU stops the guest and exits to the hypervisor → VMExit
  10. VT-x configuration
      • Which events should be handled by the hypervisor? It depends on the hypervisor implementation!
      • VT-x is configurable! You can enable or disable each event
      • You can also change some of the CPU’s behavior
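One concrete example of such a per-event switch is the VMCS exception bitmap, a 32-bit field in which bit n selects whether exception vector n causes a VMExit. The helper below is only a sketch of the bit manipulation; the function name is made up, and writing the value into the VMCS (done inside the kernel part of a hypervisor) is not shown.

```c
#include <stdint.h>

#define VECTOR_DB 1u   /* #DB, the debug exception */
#define VECTOR_GP 13u  /* #GP, general protection fault */

/*
 * Return a copy of the exception bitmap with the given vector
 * set (trap != 0: the exception causes a VMExit) or cleared
 * (trap == 0: the exception is delivered to the guest).
 */
static uint32_t exception_bitmap_set(uint32_t bitmap, unsigned vector, int trap)
{
    uint32_t bit = 1u << (vector & 31u);  /* only vectors 0..31 exist here */

    return trap ? (bitmap | bit) : (bitmap & ~bit);
}
```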
  11. BHyVe BIOS emulation project
      • Google Summer of Code ’12: “BHyVe BIOS emulation to boot legacy systems”
      • Project goal: implement BIOS emulation on the BHyVe hypervisor, so that BHyVe can support more guest OSes
  12. Limitations of bhyveload
      • It’s legacy free! Yay!
      • But...
        • It only supports FreeBSD/amd64
        • You need to implement a kernel loader for each OS
      • We want to run more OSes on BHyVe!
  13. Why not just implement an OS loader?
      • Better than supporting an ugly legacy BIOS? True! But...
      • An OS loader depends heavily on the kernel implementation
      • You would need to implement an OS loader for each OS, e.g. a Linux loader, a NetBSD loader, an OpenBSD loader...
      • It may be very hard to implement a loader for a proprietary OS
      • Even if the OS loader works, the guest OS may still call a BIOS interrupt handler → DIE! This is common on 32-bit x86 OSes. Most 64-bit OSes are legacy free.
  14. BIOS interrupt call
      • Ex: sys/boot/i386/mbr/mbr.s
          main.5: movw %sp,%di              # Save stack pointer
                  movb 0x1(%si),%dh         # Load head
                  movw 0x2(%si),%cx         # Load cylinder:sector
                  movw $LOAD,%bx            # Transfer buffer
                  testb $FL_PACKET,flags    # Try EDD?
                  jz main.7                 # No.
                  pushw %cx                 # Save %cx
                  pushw %bx                 # Save %bx
                  movw $0x55aa,%bx          # Magic
                  movb $0x41,%ah            # BIOS: EDD extensions
                  int $0x13                 # present?   ← BIOS interrupt call
  15. What happens when it is called?
      int 13h → software interrupt (INTx) → CPU reads the interrupt vector → the BIOS call handler (in ROM) executes → it performs I/O by in/out or MMIO → hardware
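The "CPU reads the interrupt vector" step can be sketched in a few lines. In real mode the interrupt vector table starts at physical address 0 and each entry is 4 bytes: a 16-bit offset followed by a 16-bit segment. The function below looks up the handler for a vector in a copy of low memory; the names are illustrative, and the buffer is assumed to cover at least the first 1 KiB.

```c
#include <stdint.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * low_mem points at a copy of guest physical address 0.
 * Returns the linear address (segment * 16 + offset) of the
 * real-mode handler registered for the given vector.
 */
static uint32_t ivt_handler_address(const uint8_t *low_mem, uint8_t vector)
{
    const uint8_t *entry = low_mem + (uint32_t)vector * 4;
    uint16_t offset = read_le16(entry);
    uint16_t segment = read_le16(entry + 2);

    return ((uint32_t)segment << 4) + offset;
}
```

For example, an entry holding offset 0xec59 and segment 0xf000 for vector 0x13 resolves to linear address 0xfec59, inside the ROM area below 1 MiB.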
  16. How Linux KVM handles BIOS
      • KVM uses QEMU as its userland process
      • QEMU ships a real BIOS called “SeaBIOS”, an open-source BIOS
      • SeaBIOS performs I/O with in/out instructions or MMIO
      • KVM handles these I/O accesses and emulates the devices
  17. BIOS call handling on KVM
      Guest: int 13h → software interrupt (INTx) → CPU reads the interrupt vector → interrupt handler executes → SeaBIOS performs I/O to the virtual hardware by in/out or MMIO → VMExit
      Hypervisor: QEMU hardware emulation; QEMU emulates the hardware I/O
  18. Bring SeaBIOS into BHyVe?
      • I wanted to use it
      • But we can’t bring the code into FreeBSD
      • Because it’s GPLv3 licensed
  19. OK then, is there a BSDL BIOS?
      • Unfortunately, we haven’t found any BSD-licensed BIOS
      • But there is a BSD-licensed DOS emulator in Ports: doscmd
      • It has a DOS and BIOS interrupt call emulator that runs on FreeBSD/i386
  20. How doscmd works
      • Maps pages in the low memory area (<1MB) to hold the DOS app
      • Sets up the interrupt vector and interrupt handlers (a handler just issues HLT; IRET)
      • Loads the DOS app into the low memory area
      • Enters virtual 8086 mode (i386_vm86(2)) and jumps to the DOS app’s entry address
      • The CPU executes the DOS app in virtual 8086 mode
      • When the DOS app makes a DOS/BIOS interrupt call, it is handled by the interrupt handler, which issues a HLT instruction
      • Once the HLT instruction is issued, the CPU leaves virtual 8086 mode
      • doscmd emulates the DOS/BIOS interrupt call
      • Returns to virtual 8086 mode
  21. How doscmd works
      DOS app in v8086 mode: int 13h → software interrupt (INTx) → CPU reads the interrupt vector → interrupt handler executes → it issues a HLT instruction → trap on the HLT instruction
      BIOS emulation (doscmd on FreeBSD/i386): doscmd emulates the BIOS call
  22. Differences in BIOS handling: QEMU vs doscmd
      • QEMU
        • Runs a real BIOS in the guest machine
        • The BIOS interrupt handler handles the BIOS interrupt call
        • QEMU just emulates the hardware devices
      • doscmd
        • Has no real BIOS
        • The interrupt handler only exists to trap out of the vm86 machine
        • doscmd itself emulates the BIOS interrupt call handler
  23. Plan to emulate BIOS on BHyVe
      • Extract only the necessary code from doscmd and make it a library that exports two functions: biosemul_init() / biosemul_call()
      • biosemul_init() performs BIOS-compatible initialization (initializes register values, loads the boot sector, initializes the interrupt vector, installs the interrupt handler)
      • The interrupt handler uses a VMCALL instruction instead of HLT, because the guest OS may also use HLT and we don’t want the BIOS emulation code to handle that
      • biosemul_call() handles a BIOS interrupt call: it runs the BIOS interrupt call emulation using the doscmd code
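To make the "install interrupt handler" step more concrete, here is a minimal sketch of placing a VMCALL-based stub in guest memory and pointing an interrupt vector at it. The stub bytes encode VMCALL; STI; RETF 2. The stub location (F000:0100) and the function name are assumptions for illustration, not values taken from the project, and a real implementation would likely use one stub per vector so the hypervisor can tell the calls apart.

```c
#include <stdint.h>
#include <string.h>

#define STUB_SEGMENT 0xf000u  /* hypothetical segment for the stub */
#define STUB_OFFSET  0x0100u  /* hypothetical offset for the stub */

/* Machine code for: VMCALL; STI; RETF 2 */
static const uint8_t bios_stub[] = {
    0x0f, 0x01, 0xc1,         /* vmcall */
    0xfb,                     /* sti */
    0xca, 0x02, 0x00          /* retf $2 */
};

/*
 * guest_mem is a host pointer to guest physical address 0 and must
 * cover at least the first 1 MiB. Copies the stub into guest memory
 * and writes offset:segment into the vector's 4-byte IVT entry.
 */
static void install_bios_stub(uint8_t *guest_mem, uint8_t vector)
{
    uint32_t stub_addr = (STUB_SEGMENT << 4) + STUB_OFFSET;
    uint8_t *entry = guest_mem + (uint32_t)vector * 4;

    memcpy(guest_mem + stub_addr, bios_stub, sizeof(bios_stub));
    entry[0] = STUB_OFFSET & 0xff;    /* offset, low byte */
    entry[1] = STUB_OFFSET >> 8;      /* offset, high byte */
    entry[2] = STUB_SEGMENT & 0xff;   /* segment, low byte */
    entry[3] = STUB_SEGMENT >> 8;     /* segment, high byte */
}
```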
  24. How to handle a BIOS interrupt call in BHyVe
      Guest: int 13h → software interrupt (INTx) → CPU reads the interrupt vector → interrupt call handler executes → it issues VMCALL → VMExit by the VMCALL instruction
      Hypervisor: BIOS emulation; the doscmd code emulates the BIOS call
  25. Why not trap the interrupt directly?
      • Intel VT-x is able to trap interrupts directly (no need to issue a VMCALL instruction in the interrupt handler)
      • Why shouldn’t we use that for BIOS emulation? Because the guest OS may reuse the BIOS interrupt vector numbers for different software interrupts after entering protected mode
      • Bootloaders may also invoke an interrupt handler by jumping to its address (btx does this)
  26. Problems (1)
      • doscmd is not 64-bit safe! Some type definitions need to be rewritten
        Ex: u_long → uint32_t
      • doscmd maps the guest memory area at 0x0. Maybe we could also mmap the guest memory area at 0x0 on BHyVe, but I rewrote the code instead
        Ex: *(char *)(0x400) = 0;
              ↓
            *(char *)(0x400 + guest_mem) = 0;
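The rewrite shown on this slide adds the host base address to every guest address. One way to keep such accesses in one place is a small accessor with a bounds check, sketched below; struct guest_ram and both function names are hypothetical and do not come from the doscmd or bhyve sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side view of guest RAM: base is where guest physical 0 is mapped. */
struct guest_ram {
    uint8_t *base;
    size_t size;
};

/* Translate a guest physical address to a host pointer, or NULL if the
 * range [gpa, gpa + len) does not fit inside guest RAM. */
static uint8_t *guest_ptr(const struct guest_ram *ram, uint32_t gpa, size_t len)
{
    if (gpa > ram->size || len > ram->size - gpa)
        return NULL;
    return ram->base + gpa;
}

/* Equivalent of "*(char *)(gpa + guest_mem) = v" with a range check.
 * Returns 0 on success, -1 if the address is outside guest RAM. */
static int guest_write8(const struct guest_ram *ram, uint32_t gpa, uint8_t v)
{
    uint8_t *p = guest_ptr(ram, gpa, 1);

    if (p == NULL)
        return -1;
    *p = v;
    return 0;
}
```

Using fixed-width types such as uint32_t for guest addresses also covers the first point on the slide: the value has the same size whether the emulator is built for i386 or amd64.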
  27. Problems (2)
      • Guest register storage: doscmd keeps register values in its own structure, but BHyVe requires an ioctl to get or set each guest register
      • I decided to copy all registers first, then emulate the BIOS interrupt call, and write back the modified registers afterwards
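The copy-in, emulate, write-back pattern described here might look like the sketch below. Everything in it is a stand-in: struct mock_vcpu replaces the real VM, vcpu_get()/vcpu_set() replace the register ioctls, and sample_handler() is a made-up handler that clears AH the way a successful INT 13h call would.

```c
#include <stdint.h>

enum reg { REG_AX, REG_BX, REG_CX, REG_DX, REG_COUNT };

/* Hypothetical stand-in for a vCPU; counts simulated ioctl calls. */
struct mock_vcpu {
    uint32_t regs[REG_COUNT];
    int ioctl_calls;
};

static uint32_t vcpu_get(struct mock_vcpu *v, enum reg r)
{
    v->ioctl_calls++;
    return v->regs[r];
}

static void vcpu_set(struct mock_vcpu *v, enum reg r, uint32_t val)
{
    v->ioctl_calls++;
    v->regs[r] = val;
}

typedef void (*bios_handler)(uint32_t regs[REG_COUNT]);

/* Made-up handler: reports success by clearing AH (bits 8..15 of AX). */
static void sample_handler(uint32_t regs[REG_COUNT])
{
    regs[REG_AX] &= 0x00ffu;
}

/* Copy all registers in, run the handler on the local copy,
 * then write back only the registers the handler changed. */
static void emulate_with_writeback(struct mock_vcpu *v, bios_handler h)
{
    uint32_t local[REG_COUNT], before[REG_COUNT];

    for (int i = 0; i < REG_COUNT; i++)
        before[i] = local[i] = vcpu_get(v, (enum reg)i);
    h(local);
    for (int i = 0; i < REG_COUNT; i++)
        if (local[i] != before[i])
            vcpu_set(v, (enum reg)i, local[i]);
}
```

Writing back only the changed registers is a small optimization of this sketch; the slide only says that modified registers are written back after emulation.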
  28. Debugging the BIOS emulator
      • When I started implementing BIOS emulation, I inserted a register dump for each BIOS interrupt call
      • In practice, dumping only at each BIOS interrupt call gives too little information to determine what is going on
      • The emulation didn’t work: the guest eventually jumped to a strange EIP and crashed, and I had no idea why
      • I haven’t found a way to run BHyVe on an emulator and get an instruction-level trace
      • BHyVe can run on VMware, but I haven’t found a way to do tracing on it
      • So I decided to implement instruction-level tracing in BHyVe
  29. Implementing an instruction-level tracer on BHyVe (1)
      • If the guest CPU is emulated, dumping each instruction is very easy: just dump everything when the instruction decoder is called
      • But on BHyVe the guest program runs natively, because it uses VT-x
      • This means there is no way to inspect instructions or dump registers until a VMExit occurs
      • So we can raise an exception on every instruction
      • You could insert instructions that raise an exception, but x86 already has a flag for single-step debugging (the TF bit in EFLAGS)
  30. Implementing an instruction-level tracer on BHyVe (2)
      • At first, I implemented the following rule:
        • Set the TF bit in EFLAGS and enable VMExit on the #DB exception
        • bhyve handles the #DB exception, disassembles the instruction at EIP, steps EIP forward, and does a VMEnter again
      • I suddenly realized that the VMExit happens BEFORE the instruction is executed! USELESS!!
  31. Implementing an instruction-level tracer on BHyVe (3)
      • I changed my mind and handled it the same way as a BIOS interrupt call (the interrupt handler issues a VMCALL instruction → VMExit)
      • EIP and some other registers have been pushed on the stack, because the handler has not returned yet; they need to be fetched from the stack to dump them
        • OLD_EIP = *(uint16_t *)(ESP)
        • OLD_CS = *(uint16_t *)(ESP + 2)
        • OLD_EFLAGS = *(uint16_t *)(ESP + 4)
        • OLD_ESP = *(uint16_t *)(ESP + 6)
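Fetching the saved state from the guest stack could look like the sketch below. It assumes a real-mode interrupt, where the CPU pushes only FLAGS, CS, and IP (2 bytes each) and the stack lives at linear address SS * 16 + SP. Under that assumption the pre-interrupt SP is derived as SP + 6 instead of being read from the stack as in the list above. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* State saved by a real-mode interrupt, plus the SP value before it. */
struct int_frame {
    uint16_t ip;
    uint16_t cs;
    uint16_t flags;
    uint16_t sp_before;
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * guest_mem is a host pointer to guest physical address 0.
 * ss and sp are the guest's stack segment and pointer at VMExit,
 * i.e. after the CPU has pushed FLAGS, CS, and IP.
 */
static struct int_frame
read_int_frame(const uint8_t *guest_mem, uint16_t ss, uint16_t sp)
{
    uint32_t base = (uint32_t)ss << 4;
    struct int_frame f;

    f.ip = read_le16(guest_mem + base + sp);
    f.cs = read_le16(guest_mem + base + (uint16_t)(sp + 2));
    f.flags = read_le16(guest_mem + base + (uint16_t)(sp + 4));
    f.sp_before = (uint16_t)(sp + 6);   /* three 16-bit pushes undone */
    return f;
}
```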
  32. Instruction-level tracer output
        [trace] 16bit ip:7c3e cs:0 flags:102 ss:0 sp:7ffe ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:cld
        [trace] 16bit ip:7c3f cs:0 flags:102 ss:0 sp:7ffe ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:xor %cx, %cx
        [trace] 16bit ip:7c41 cs:0 flags:146 ss:0 sp:7ffe ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:mov %cx, %es
        [trace] 16bit ip:7c43 cs:0 flags:146 ss:0 sp:7ffe ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:mov %cx, %ds
        [trace] 16bit ip:7c45 cs:0 flags:146 ss:0 sp:7ffe ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:mov %cx, %ss
        [trace] 16bit ip:7c4a cs:0 flags:146 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:mov %sp, %si
        [trace] 16bit ip:7c4c cs:0 flags:146 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:mov $0x700, %di
        [trace] 16bit ip:7c4f cs:0 flags:146 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:0 edx:80 insn:incb %ch
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:100 edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:ff edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:fe edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:fd edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:fc edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:fb edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:fa edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:f9 edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:f8 edx:80 insn:rep movsw
        [trace] 16bit ip:7c51 cs:0 flags:102 ss:0 sp:7c00 ds:0 cr0:30 eax:0 ebx:0 ecx:f7 edx:80 insn:rep movsw
  33. Tracing suddenly stops! (1)
      • The TF bit in EFLAGS can be cleared under some conditions
        • popf overwrites EFLAGS: the #DB exception is still raised immediately after the popf instruction, so setting the TF bit in OLD_EFLAGS (on the stack) solves the issue (the guest machine restores EFLAGS with IRET)
  34. Tracing suddenly stops! (2)
      • The TF bit in EFLAGS can be cleared under some conditions
        • BIOS interrupt call VMExit: it looks like the CPU clears the TF flag when it is interrupted
        • doscmd uses the following interrupt call handler for BIOS interrupt calls: VMCALL; STI; RETF 2
        • RETF 2 returns without restoring EFLAGS from the stack (it pops IP and CS, then discards the saved flags), so changing OLD_EFLAGS (on the stack) has no effect
        • Setting the TF bit directly in EFLAGS solves the issue
      • But we must not set the TF bit in EFLAGS when the interrupt is a #DB exception; that causes an infinite loop
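The two rules for keeping single-stepping alive can be summarized in code. The sketch below assumes the tracer knows which vector caused the VMCALL exit; the function names are invented for illustration. TF is bit 8 of EFLAGS (0x100), which matches the flags:102 values in the trace output.

```c
#include <stdint.h>

#define EFLAGS_TF 0x100u  /* trap flag, bit 8 of EFLAGS */
#define VECTOR_DB 1u      /* #DB, the debug exception */

/*
 * #DB handler path: the handler returns with IRET, which reloads
 * FLAGS from the stack, so TF is set in the saved copy.
 */
static uint16_t saved_flags_with_tf(uint16_t saved_flags)
{
    return (uint16_t)(saved_flags | EFLAGS_TF);
}

/*
 * BIOS call path: the handler returns with RETF 2, which discards
 * the saved FLAGS, so TF is set in the live EFLAGS instead. For a
 * #DB exit the live EFLAGS is left untouched to avoid the infinite
 * loop described on the slide.
 */
static uint32_t eflags_for_resume(uint32_t eflags, uint8_t vector)
{
    if (vector == VECTOR_DB)
        return eflags;
    return eflags | EFLAGS_TF;
}
```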
  35. Tracing suddenly stops! (3)
      • lidt is executed just before switching to protected mode
      • After IDTR changes, the #DB exception can no longer be handled
        • Because the #DB handler is only installed in the real-mode interrupt vector, not in the IDT
        • I modified the IDT and implemented a #DB handler in btx
      • The #DB exception is not raised in real mode after the lidt instruction
        • Probably because the IDT for protected mode is not valid in real mode
      • After switching to protected mode, tracing can be resumed by setting the TF flag in EFLAGS
  36. An exception causes an exception
      • I’m not really sure, but it looks like an exception is raised inside an exception handler
      • Because of this, the handler can’t print the error on the console
      • I inserted a VMCALL at the beginning of the exception handler to dump everything
  37. BTX interrupt call causes an exception
        [trace] 32bit-kern eip:9332 cs:18 eflags:106 ss:10 esp:17b8 ds:10 cr0:31 eax:31 ebx:9357 ecx:0 edx:70000 insn:decb %al
        [trace] 32bit-kern eip:9334 cs:18 eflags:106 ss:10 esp:17b8 ds:10 cr0:31 eax:30 ebx:9357 ecx:0 edx:70000 insn:mov %eax, %cr0
        [trace] 32bit-kern eip:9097 cs:8 eflags:146 ss:0 esp:1800 ds:0 cr0:31 eax:102 ebx:2820 ecx:0 edx:708ee insn:mov $0x10, %cl
        [trace] 32bit-kern eip:9099 cs:8 eflags:146 ss:0 esp:1800 ds:0 cr0:31 eax:102 ebx:2820 ecx:10 edx:708ee insn:mov %ecx, %ss
        [trace] 32bit-kern eip:909d cs:8 eflags:146 ss:10 esp:1800 ds:0 cr0:31 eax:102 ebx:2820 ecx:38 edx:708ee insn:ltr %cx
        [except] 32bit-kern exception:13 error_code:38 eip:909d cs:8 eflags:10146 ss:10 esp:1800 insn:ltr %cx ds:0 cr0:31 eax:102 ebx:2820 ecx:38 edx:708ee
      • INT 0x31 (a BIOS call from a BTX app) causes an exception at the LTR instruction
      • I have no idea why... → I tried skipping all BIOS calls in boot2 and loader and using in/out instead
  38. rep causes an exception in loader
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a0200 ebx:201000 ecx:52f edx:50000a insn:rep movsb
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a0200 ebx:201000 ecx:52e edx:50000a insn:rep movsb
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a0200 ebx:201000 ecx:52d edx:50000a insn:rep movsb
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a0200 ebx:201000 ecx:52c edx:50000a insn:rep movsb
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a0290 ebx:201000 ecx:52b edx:50000a insn:rep movsb
        [trace] 32bit-kern eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc ds:10 cr0:31 eax:a027b ebx:201000 ecx:52a edx:50000a insn:rep movsb
        [except] 32bit-kern exception:3 error_code:0 eip:2000c4 cs:8 eflags:10106 ss:10 esp:ffc insn:rep movsb ds:10 cr0:31 eax:a0236 ebx:201000 ecx:529 edx:50000a
      • I really don’t have a good idea about this one...
  39. Demonstration
  40. Conclusion
      • A test implementation of a BIOS emulator for BHyVe has been written
      • An instruction-level tracer was implemented on top of it for debugging
      • The guest reaches the /boot/loader stage, but dies before loading a kernel
      • Advice from bootloader developers is really needed
      • Advice on better debugging methods is also needed (Is there a hardware debugger for x86? Or maybe VMware has a cool debugging feature?)