rrxv6: Build a RISC-V xv6 Kernel in Rust
Author: Yodalee
<lc85301@gmail.com>
Outline
1. Background Introduction
2. rrxv6 Overview
3. Inside rrxv6
4. Conclusion
This talk does not go into much detail about xv6, RISC-V, or operating systems in general.
Introduction to xv6
● A re-implementation of Dennis Ritchie's and Ken Thompson's Unix Version 6 (v6)
● Written in ANSI C for multiprocessor machines; ported from x86 to RISC-V
● Now used in operating systems courses at many universities
Introduction to RISC-V
● An open standard Instruction Set Architecture (ISA) based on RISC principles
● Began in 2010 at the University of California, Berkeley
● An alternative to proprietary architectures such as x86 and ARM
https://github.com/riscvarchive/riscv-cores-list
rrxv6 Overview
Star me on GitHub!
Notes (in Traditional Chinese):
https://yodalee.me/series/rrxv6/
Source Code
https://github.com/yodalee/rrxv6
rrxv6 Overview (so far)
● Application
● Kernel: Virtual Memory, PLIC, UART, Scheduler, Memory Allocator
● rv64 (CSR operation abstraction)
● RISC-V 64-bit hardware (on QEMU)
rrxv6 Hello World
qemu-system-riscv64 -machine virt -bios none -m 128M -smp 4 -nographic -s \
    -kernel target/riscv64imac-unknown-none-elf/debug/rrxv6
Required Toolchain
1. Use rustup to install the Rust target riscv64imac-unknown-none-elf (lp64 ABI),
   and set it as the build target in .cargo/config
2. Install the RISC-V GCC toolchain (riscv64-unknown-elf-gcc) for linking
3. Install qemu-system-riscv to emulate the hardware
# .cargo/config
[build]
target = "riscv64imac-unknown-none-elf"
[target.riscv64imac-unknown-none-elf]
rustflags = ["-C", "link-arg=-Tlinker.ld"]
rrxv6 Source Structure
● Assembly: entry.S, switch.S, …
● Rust kernel: kalloc.rs, kvm.rs, scheduler.rs, …
● User: initcode.S
● Build script: build.rs, linker.ld
All of these are combined into the kernel binary.
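// build.rs
use cc::Build;
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    // Assemble entry.S and link it into the kernel as the static library "asm".
    Build::new().file("src/entry.S").compile("asm");
    Ok(())
}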
Inside the rrxv6
Inside the Cargo.toml
bit_field                           get_bit, set_bit, get_bits, set_bits
bitflags                            bitmask generator
volatile-register                   volatile read/write of memory addresses
mvdnes/spin-rs                      spinlock on static variables
lazy_static                         create static variables easily
rust-osdev/linked-list-allocator    memory allocation
Going std-less
Rust provides no std for the bare-metal RISC-V target (riscv64imac-unknown-none-elf).
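// main.rs
#![no_std]
#![no_main]
use core::panic::PanicInfo;
// Exported unmangled so that entry.S can refer to the symbol STACK0.
#[no_mangle]
static STACK0: [u8; STACK_SIZE * NCPU] = [0; STACK_SIZE * NCPU];
// Called from entry.S, so it needs an unmangled name and the C ABI.
#[no_mangle]
pub extern "C" fn start() -> ! {
    loop {}
}
#[panic_handler]
fn panic(_panic: &PanicInfo<'_>) -> ! {
    loop {}
}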
# entry.S
.global _entry
_entry:
    # sp = STACK0 + ((hartid + 1) * STACK_SIZE); the stack grows downward
    la sp, STACK0
    li a0, STACK_SIZE
    csrr a1, mhartid
    addi a1, a1, 1
    mul a0, a0, a1
    add sp, sp, a0
    call start
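The address arithmetic in _entry can be checked on a host machine with a small Rust sketch. The values of STACK_SIZE and NCPU below are assumptions (4096 as in the assembly comment, 4 as in the -smp 4 flag), and stack_top is a hypothetical helper that is not part of rrxv6.

```rust
/// Per-hart stack size in bytes (assumed value).
const STACK_SIZE: u64 = 4096;
/// Number of harts (assumed value, matching `-smp 4`).
const NCPU: u64 = 4;

/// Value that `_entry` loads into `sp` for a given hart.
/// Stacks grow downward, so each hart starts at the high end of its
/// own STACK_SIZE slice of STACK0; that is why hartid is incremented.
fn stack_top(stack0: u64, hartid: u64) -> u64 {
    assert!(hartid < NCPU, "hartid out of range");
    stack0 + (hartid + 1) * STACK_SIZE
}
```

With this layout hart 0 uses the first 4096 bytes of STACK0 and the last hart ends exactly at STACK0 + STACK_SIZE * NCPU, the end of the array.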
What do we have?
● The core library: https://doc.rust-lang.org/core/index.html
● Much of std simply re-exports modules that come from core, for example:
○ core::alloc
○ core::ptr
○ core::panic
○ core::mem
rv64: CSR Abstraction
// set M Previous Privilege mode to Supervisor, for mret.
unsigned long x = r_mstatus();
x &= ~MSTATUS_MPP_MASK ;
x |= MSTATUS_MPP_S ;
w_mstatus(x);
// set M Exception Program Counter to main, for mret.
// requires gcc -mcmodel=medany
w_mepc((uint64)main);
// disable paging for now.
w_satp(0);
// delegate all interrupts and exceptions to supervisor mode.
w_medeleg(0xffff);
w_mideleg(0xffff);
w_sie(r_sie() | SIE_SEIE | SIE_STIE | SIE_SSIE);
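As a rough illustration of the first step in the C code above, the read-modify-write of the MPP field can be written as a pure Rust function. The constant values are taken from the RISC-V privileged specification (mstatus.MPP occupies bits 12:11); the function name is hypothetical and the sketch takes the register value as an argument instead of reading the CSR.

```rust
/// mstatus.MPP is the two-bit field at bits 12:11.
const MSTATUS_MPP_MASK: u64 = 3 << 11;
/// MPP encoding for Supervisor mode (0b01).
const MSTATUS_MPP_S: u64 = 1 << 11;

/// Returns `mstatus` with MPP set to Supervisor and all other bits unchanged.
fn with_mpp_supervisor(mstatus: u64) -> u64 {
    (mstatus & !MSTATUS_MPP_MASK) | MSTATUS_MPP_S
}
```

Clearing the field before OR-ing matters: if MPP held Machine mode (0b11), a plain OR would leave it at Machine.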
rv64: CSR Abstraction
An OS configures the processor by programming its CSRs (control and status registers).
Reading or writing a CSR requires the assembly instructions csrr and csrw.
Inline assembly is unsafe in Rust.
let mut x: u64;
unsafe { asm!("csrr {}, sie", out(reg) x); }
x |= (1 << 1) | (1 << 5) | (1 << 9);
unsafe { asm!("csrw sie, {}", in(reg) x); }
rv64: CSR Abstraction
pub struct Sie {
    bits: u64
}
impl Sie {
    pub fn from_read() -> Self {
        let bits: u64;
        csrr!("sie", bits);
        Self { bits }
    }
    pub fn write(self) {
        csrw!("sie", self.bits);
    }
}
pub enum Interrupt {
    SoftwareInterrupt,
    TimerInterrupt,
    ExternalInterrupt,
}
impl Sie {
    pub fn set_supervisor_enable(&mut self, interrupt: Interrupt) {
        self.bits |= match interrupt {
            Interrupt::SoftwareInterrupt => (1 << 1),
            Interrupt::TimerInterrupt => (1 << 5),
            Interrupt::ExternalInterrupt => (1 << 9),
        };
    }
}
rv64: CSR Abstraction
let mut sie = Sie::from_read();
sie.set_supervisor_enable(Interrupt::SoftwareInterrupt);
sie.set_supervisor_enable(Interrupt::TimerInterrupt);
sie.set_supervisor_enable(Interrupt::ExternalInterrupt);
sie.write();
● Now the code is meaningful and readable, with no unsafe noise at the call site.
● The abstraction is also zero-cost: the calls above compile down to three instructions.
80001c24: csrr a0,sie
80001c28: ori a0,a0,546 (0x222)
80001c2c: csrw sie,a0
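The immediate 546 (0x222) in the ori instruction is just the three enable bits combined. The following host-side sketch reproduces that with a mock of the Sie type: from_bits and bits are hypothetical stand-ins for from_read and write, since csrr and csrw cannot run outside a RISC-V hart.

```rust
#[derive(Clone, Copy)]
pub enum Interrupt {
    SoftwareInterrupt,
    TimerInterrupt,
    ExternalInterrupt,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Sie {
    bits: u64,
}

impl Sie {
    /// Stand-in for `from_read()`: takes the value instead of executing `csrr`.
    pub fn from_bits(bits: u64) -> Self {
        Self { bits }
    }

    /// Stand-in for `write()`: returns the value instead of executing `csrw`.
    pub fn bits(self) -> u64 {
        self.bits
    }

    pub fn set_supervisor_enable(&mut self, interrupt: Interrupt) {
        self.bits |= match interrupt {
            Interrupt::SoftwareInterrupt => 1 << 1, // SSIE
            Interrupt::TimerInterrupt => 1 << 5,    // STIE
            Interrupt::ExternalInterrupt => 1 << 9, // SEIE
        };
    }
}

/// Enables all three supervisor interrupt sources on top of `current`.
pub fn enable_all(current: u64) -> u64 {
    let mut sie = Sie::from_bits(current);
    sie.set_supervisor_enable(Interrupt::SoftwareInterrupt);
    sie.set_supervisor_enable(Interrupt::TimerInterrupt);
    sie.set_supervisor_enable(Interrupt::ExternalInterrupt);
    sie.bits()
}
```

Because every match arm is a constant, the optimizer can fold the three calls into a single OR with 0x222, which is what the disassembly shows.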
MemoryAllocator
A #![no_std] kernel has no heap allocator by default.
Everything in the alloc crate that allocates memory is unusable until we provide one:
Box::new(16)
Rc::new(32)
String::from_utf8()
format!("{}", 64)
vec![1,2,3,4,5]
Arc::new(48)
MemoryAllocator
We have to provide a #[global_allocator]: a static whose type implements the GlobalAlloc trait.
pub unsafe trait GlobalAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8;
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout);
    // optional
    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8;
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8;
}
MemoryAllocator - Dummy Implementation
struct DummyAllocator;
unsafe impl GlobalAlloc for DummyAllocator {
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 { null_mut() }
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}
#[global_allocator]
static ALLOCATOR: DummyAllocator = DummyAllocator {};
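The dummy allocator satisfies the trait but every allocation fails. One step up is a bump allocator, sketched below for a hosted build. This is not what rrxv6 uses (it uses linked-list-allocator); BumpAllocator, Arena and HEAP_SIZE are hypothetical names chosen for the illustration, and the sketch never frees memory.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::ptr::null_mut;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Size of the backing memory (assumed value for the sketch).
const HEAP_SIZE: usize = 4096;

/// Backing memory, aligned so that small alignments need no padding at offset 0.
#[repr(align(16))]
struct Arena([u8; HEAP_SIZE]);

pub struct BumpAllocator {
    arena: UnsafeCell<Arena>,
    /// Offset of the first free byte in the arena.
    next: AtomicUsize,
}

// Safety: `next` is only advanced atomically, so no two callers get the same bytes.
unsafe impl Sync for BumpAllocator {}

impl BumpAllocator {
    pub const fn new() -> Self {
        Self {
            arena: UnsafeCell::new(Arena([0; HEAP_SIZE])),
            next: AtomicUsize::new(0),
        }
    }
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.arena.get() as *mut u8;
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Padding needed so that the returned address honours the alignment.
            let pad = unsafe { base.add(cur) }.align_offset(layout.align());
            let end = match cur
                .checked_add(pad)
                .and_then(|start| start.checked_add(layout.size()))
            {
                Some(end) if end <= HEAP_SIZE => end,
                _ => return null_mut(), // out of memory
            };
            match self
                .next
                .compare_exchange(cur, end, Ordering::SeqCst, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base.add(cur + pad) },
                Err(seen) => cur = seen, // another caller won the race; retry
            }
        }
    }

    // A bump allocator cannot reuse memory, so dealloc is a no-op.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}
```

In a kernel the instance would be registered with #[global_allocator] like the dummy one; here it can also be exercised directly by calling alloc on a local value.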
MemoryAllocator
#[global_allocator]
static ALLOCATOR: LockedHeap = LockedHeap::empty();
pub const KERNELBASE: u64 = 0x8000_0000;
pub const PHYSTOP: u64 = KERNELBASE + 128 * 1024 * 1024;
pub fn init_kvm() {
    ALLOCATOR
        .lock()
        .init(KERNELBASE, PHYSTOP)
}
I use rust-osdev/linked-list-allocator (and submitted a patch to it).
Mutable Static
In a kernel, some static data must be mutable.
For example:
● The UART module
● The Scheduler module
● The CPU register data
Mutable Static
Accessing a mutable static is unsafe in Rust. (what a surprise)
static mut DATA: u32 = 0;
println!("{}", DATA);
error[E0133]: use of mutable static is unsafe and requires unsafe
function or block
--> a.rs:4:20
|
4 | println!("{}", DATA);
| ^^^^ use of mutable static
Mutable Static Solution 1: Unsafe
Just add unsafe wherever you access the static.
Not recommended: it makes the code ugly and hard to read.
static mut DATA: u32 = 0;
unsafe {
    println!("{}", DATA);
    DATA = 1;
    println!("{}", DATA);
}
Mutable Static Solution 2: Hide Inside Option
1. Declare the static as None:
static mut SCHEDULER: Option<Scheduler> = None;
2. Add an initializer that replaces it with Some:
pub fn init_scheduler() {
    unsafe {
        SCHEDULER = Some(Scheduler::new());
    }
}
3. Add a getter:
pub fn get_scheduler() -> &'static mut Scheduler {
    unsafe {
        SCHEDULER.as_mut().unwrap()
    }
}
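A hosted variant of the same pattern is sketched below. The Scheduler struct with a ticks field is a hypothetical placeholder, and addr_of_mut! is used only because recent compilers lint against taking references to a static mut directly; that choice is an assumption about the toolchain, not something taken from rrxv6.

```rust
use core::ptr::addr_of_mut;

/// Hypothetical scheduler state, just enough to show the pattern.
pub struct Scheduler {
    pub ticks: u64,
}

static mut SCHEDULER: Option<Scheduler> = None;

pub fn init_scheduler() {
    // Safety: assumed to run exactly once, before anything else touches SCHEDULER.
    unsafe {
        *addr_of_mut!(SCHEDULER) = Some(Scheduler { ticks: 0 });
    }
}

pub fn get_scheduler() -> &'static mut Scheduler {
    // Safety: the caller must make sure no other reference is live at the same
    // time. Nothing enforces this, which is why the pattern is not thread-safe.
    unsafe {
        (&mut *addr_of_mut!(SCHEDULER))
            .as_mut()
            .expect("init_scheduler() has not been called")
    }
}
```

The unsafe blocks are now confined to two functions, so callers write get_scheduler().ticks += 1 without any unsafe of their own, but the responsibility for avoiding aliasing moves to convention.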
Mutable Static Solution 3: Use Lock
I use spin-rs in rrxv6, and as the example here.
pub struct SpinMutex<T: ?Sized, R = Spin> {
    pub(crate) lock: AtomicBool,
    data: UnsafeCell<T>,
}
pub struct SpinMutexGuard<'a, T: ?Sized + 'a> {
    lock: &'a AtomicBool,
    data: &'a mut T,
}
● fn lock() -> SpinMutexGuard: atomically changes lock to true
● impl Drop::drop(): atomically changes lock to false
● impl Deref::deref() -> &T: returns self.data
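The lock, drop and deref steps can be put together into a small working spinlock. This is a simplified sketch modelled on the two structs shown above, not the actual spin-rs source: it drops the ?Sized bound and the R relax parameter, and adds a try_lock method to make the locked state observable.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

pub struct SpinMutex<T> {
    lock: AtomicBool,
    data: UnsafeCell<T>,
}

// Safety: access to `data` is serialized by `lock`.
unsafe impl<T: Send> Sync for SpinMutex<T> {}

pub struct SpinMutexGuard<'a, T> {
    lock: &'a AtomicBool,
    data: &'a mut T,
}

impl<T> SpinMutex<T> {
    pub const fn new(data: T) -> Self {
        Self { lock: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    /// Atomically changes `lock` from false to true; returns None if already held.
    pub fn try_lock(&self) -> Option<SpinMutexGuard<'_, T>> {
        self.lock
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .ok()
            .map(|_| SpinMutexGuard {
                lock: &self.lock,
                // Safety: we just took the lock, so no other guard exists.
                data: unsafe { &mut *self.data.get() },
            })
    }

    /// Spins until the lock is acquired.
    pub fn lock(&self) -> SpinMutexGuard<'_, T> {
        loop {
            if let Some(guard) = self.try_lock() {
                return guard;
            }
            std::hint::spin_loop();
        }
    }
}

impl<T> Deref for SpinMutexGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &*self.data
    }
}

impl<T> DerefMut for SpinMutexGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut *self.data
    }
}

impl<T> Drop for SpinMutexGuard<'_, T> {
    /// Atomically changes `lock` back to false.
    fn drop(&mut self) {
        self.lock.store(false, Ordering::Release);
    }
}
```

While a guard is alive, try_lock returns None; a second call to lock on the same hart would therefore spin forever, which is the deadlock risk of this approach.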
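Mutable Static Solution 3: Use Lock
Limitation: be careful not to create a deadlock.
I once put all the process information behind a single lock: Mutex<PROCESS[16]>
1. The scheduler calls lock() and gets process[0].
2. Context switch from the scheduler to process[0], with the lock still held.
3. A timer interrupt returns control to the scheduler.
4. The scheduler calls lock() again: deadlock.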
Mutable Static Solution Comparison
          Pros           Cons
Unsafe    Easy to use    Readability
Option    Easy to use    Not thread-safe
Lock      Thread-safe    Deadlock
Usually I use:
1. A struct with fields of type Mutex<data>
2. A static struct wrapped in Option
Pointer
Rust has plain (non-smart) pointers, but they are seldom used.
rrxv6 uses two kinds of pointers:
1. Raw pointer -> memory allocator, virtual memory, scheduler
2. NonNull pointer -> process trapframe and page table
Raw Pointer
The type notation is *const T or *mut T.
A raw pointer is the bridge between an address and a reference. Usually Rust only lets you change a type with:
1. Primitive types: for example u32 as u64
2. Trait objects: between base and derived
The raw pointer breaks this limitation.
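Raw Pointer - Type Conversion
Data: T  --[&]-->  Reference: &T, &mut T
Reference: &T, &mut T  --[as]-->  Raw pointer: *const T, *mut T
Raw pointer: *const T, *mut T  --[*]-->  Data: T
Raw pointer: *const T, *mut T  <--[as]-->  Data address: u64
Raw pointer: *const T, *mut T  --[as]-->  Raw pointer: *const S, *mut S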
Raw Pointer - Type Conversion
let next_table: &mut PageTable = unsafe { &mut *(pte.addr() as *mut PageTable) };
free_pagetable(next_table, …);
Path taken: Data address: u64  --[as]-->  Raw pointer: *mut PageTable  --[*]-->  Data: PageTable  --[&mut]-->  Reference: &mut PageTable
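The conversions can be tried on a host machine with a self-contained sketch. PageTable here is a hypothetical stand-in (512 64-bit entries, page aligned), and the three helper functions are illustrative names rather than rrxv6 functions.

```rust
/// Hypothetical stand-in for a page table: 512 64-bit entries, page aligned.
#[repr(C, align(4096))]
pub struct PageTable {
    pub entries: [u64; 512],
}

/// reference -> raw pointer -> address. No unsafe needed: only dereferencing
/// a raw pointer is unsafe, creating one is not.
pub fn addr_of_table(table: &mut PageTable) -> u64 {
    table as *mut PageTable as usize as u64
}

/// address -> raw pointer -> reference, the same path as the line above.
///
/// Safety: `addr` must be the address of a live PageTable that is not
/// accessed through any other reference while the result is in use.
pub unsafe fn table_from_addr<'a>(addr: u64) -> &'a mut PageTable {
    unsafe { &mut *(addr as usize as *mut PageTable) }
}

/// *const T -> *const S: view a u32 as its four bytes (native byte order).
pub fn bytes_of(value: &u32) -> [u8; 4] {
    let p = value as *const u32 as *const [u8; 4];
    // Safety: [u8; 4] has the same size as u32 and a weaker alignment.
    unsafe { *p }
}
```

Only the two dereferences sit in unsafe blocks; the casts themselves are ordinary safe expressions, which is why the compiler cannot catch a wrong address.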
NonNull Pointer
A better pointer than the raw pointer: it must be non-null, and it is covariant.
Usually it is used as Option<NonNull<T>>.
It is still dangerous to dereference a NonNull that is not properly initialized, for example one created by NonNull::dangling().
NonNull Pointer - Type Conversion
Raw pointer: *mut T  --[NonNull::new()]-->  NonNull<T>  --[as_mut()]-->  Reference: &T, &mut T
let mut page_table_ptr = NonNull::new(kalloc() as *mut PageTable)?;
let page_table = unsafe { page_table_ptr.as_mut() };
map_pages(page_table, …)
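The value of NonNull::new is that a failed allocation turns into None, which the ? operator can propagate. Below is a hosted sketch of that flow: kalloc and kfree are hypothetical stand-ins built on Box (with a flag to simulate failure), not the rrxv6 allocator.

```rust
use std::ptr::{null_mut, NonNull};

pub struct PageTable {
    pub entries: [u64; 512],
}

/// Hypothetical stand-in for kalloc(): a zeroed page table, or null when `fail` is set.
fn kalloc(fail: bool) -> *mut u8 {
    if fail {
        null_mut()
    } else {
        Box::into_raw(Box::new(PageTable { entries: [0; 512] })) as *mut u8
    }
}

/// Hypothetical stand-in for kfree(): gives the memory back to Box.
fn kfree(table: NonNull<PageTable>) {
    // Safety: `table` came from `kalloc(false)` and is freed exactly once.
    unsafe { drop(Box::from_raw(table.as_ptr())) }
}

/// Allocates a table and marks its first entry; None if allocation failed.
fn new_page_table(fail: bool) -> Option<NonNull<PageTable>> {
    // NonNull::new returns None for a null pointer, so `?` handles the failure.
    let mut ptr = NonNull::new(kalloc(fail) as *mut PageTable)?;
    // Safety: the pointer is non-null, aligned, initialized and uniquely owned here.
    let table = unsafe { ptr.as_mut() };
    table.entries[0] = 1;
    Some(ptr)
}
```

The null check happens once at construction; every later user of the NonNull<PageTable> can skip it, although dereferencing still needs unsafe.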
Summary-Why Rust?
1. Rust forces you to do things in a safe way (no excuses)
-> It takes more time to build the foundation in Rust.
2. Rust gives a higher level of abstraction: PageTableVisitor, iterator functions, etc.
// lock in xv6
struct spinlock {
    uint locked;
};
struct spinlock tickslock;
uint ticks;
{
    acquire(&tickslock);
    ticks++;
    release(&tickslock);
}
// lock in rrxv6
lazy_static! {
    static ref TICK: Mutex<u64> = Mutex::new(0);
}
let mut tick = TICK.lock();
*tick += 1;
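The Rust version ties the counter to its lock, so there is no way to touch the value without holding it, and no release call to forget. A hosted sketch of the same idea follows; it assumes std::sync::Mutex as a stand-in for the spin Mutex used in the kernel, and run_ticks is a hypothetical driver added for the demonstration.

```rust
use std::sync::Mutex;
use std::thread;

/// The counter lives inside the lock; std's Mutex::new is const, so no
/// lazy_static is needed in this hosted sketch.
static TICK: Mutex<u64> = Mutex::new(0);

/// One timer tick: the guard releases the lock when it goes out of scope.
fn tick() {
    let mut tick = TICK.lock().unwrap();
    *tick += 1;
}

/// Runs `per_thread` ticks on each of `threads` threads and returns how
/// much the counter grew.
fn run_ticks(threads: u64, per_thread: u64) -> u64 {
    let before = *TICK.lock().unwrap();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(move || {
                for _ in 0..per_thread {
                    tick();
                }
            });
        }
    });
    *TICK.lock().unwrap() - before
}
```

No increments are lost even with several threads, because each one holds the guard for the duration of its update.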
Conclusion
1. It is possible to build a kernel using Rust without std.
2. Kernel code is inherently unsafe.
3. It is all about encapsulation:
a. To create the correct type and hide the unsafe operations.
b. To convince the compiler that your program is safe.
Future Work
Features to implement:
● Virtio
● IPC
● User-space programs in Rust
○ If we get the system calls right, C programs should work as well.
Reference
1. xv6-riscv: https://github.com/mit-pdos/xv6-riscv
2. RISC-V specification: https://riscv.org/technical/specifications/
3. Rust bare-metal: https://docs.rust-embedded.org/book/
4. Tock-os: https://github.com/tock/tock
5. blog_os: https://github.com/phil-opp/blog_os
Keep the Spirit
Thanks for Listening
