Experience on porting HIGHMEM to 32-bit RISC-V Linux
Eric Lin
2020.8.2
About me
• 2016 ~ 2018 NCKU
– MS
• 2018.12 ~ 2020.7 Andes Technology
– Software engineer in Linux kernel team
• Experience
– RISC-V Linux kernel
– Device driver
– U-Boot
• Email: dslin1010@gmail.com
Outline
• Why do we need highmem?
• Porting highmem
• An end to high memory?
Why do we need high memory?
• In the earliest days, the kernel maintained a 'direct map' that mapped all physical memory into kernel space.
– This makes it easy for the kernel to manipulate any page in the system.
• A 32-bit platform has only 4GB of virtual address space.
– Kernel and user space share it, which reduces the TLB flush cost when switching between kernel and user space.
– The 4GB address space is split 1:3 (1GB => kernel, 3GB => user).
• With the direct map alone, the kernel can map only 1GB of physical memory.
[Figure: the 4GB virtual address space. User space occupies 3G starting at 0x00000000; the kernel direct map occupies 1G from 0xC0000000 to 0xFFFFFFFF and maps 1G of physical memory at a fixed va_pa_offset.]
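The direct map described above amounts to a fixed-offset translation. The following is a minimal user-space sketch of that arithmetic, not kernel code: the helper names and the assumption that physical RAM starts at address 0 are illustrative only.

```c
#include <stdint.h>

/* Hypothetical values for illustration: kernel space starts at
 * 0xC0000000 and physical RAM is assumed to start at address 0. */
#define PAGE_OFFSET   0xC0000000u
#define PHYS_BASE     0x00000000u
#define VA_PA_OFFSET  (PAGE_OFFSET - PHYS_BASE)

/* With a direct map, translation is a single subtraction or addition. */
static inline uint32_t virt_to_phys_direct(uint32_t va)
{
    return va - VA_PA_OFFSET;
}

static inline uint32_t phys_to_virt_direct(uint32_t pa)
{
    return pa + VA_PA_OFFSET;
}

/* Bytes of physical memory reachable through the direct map alone:
 * everything from PAGE_OFFSET to the top of the 32-bit address space. */
static inline uint64_t direct_map_capacity(void)
{
    return (UINT64_C(1) << 32) - PAGE_OFFSET;
}
```

With a 1:3 split the capacity comes to exactly 1GB, which is why physical memory beyond that needs another mechanism.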
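Why do we need high memory? (cont.)
• About 896MB is reserved for the linear mapping (direct map) => low memory
• The kernel virtual address space above 896MB is used for:
– Temporary mapping
• PKMAP => kmap(), VMALLOC => vmalloc(), vmap()
– Permanent mapping
• FIXMAP => DTB, early_ioremap()
[Figure: i386 layout. User space takes 3G below 0xC0000000 and the kernel takes 1G above it. The kernel part consists of the 896MB linear mapping (direct map) followed by VMALLOC, PKMAP and FIXMAP. When physical memory exceeds 1G, the first 896MB is low mem (ZONE_NORMAL), reached through va_pa_offset, and the rest is high mem (ZONE_HIGHMEM).]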
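The split between low and high memory can be worked out with simple arithmetic. The sketch below uses the i386-style numbers from this slide; the 128MB figure is derived here as 1GB minus 896MB, and the struct and function names are hypothetical.

```c
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* 1GB of kernel space, of which 128MB (1GB - 896MB) is left for
 * VMALLOC, PKMAP and FIXMAP. */
#define KERNEL_SPACE    MB(1024)
#define RESERVED_SPACE  MB(128)
#define LOWMEM_LIMIT    (KERNEL_SPACE - RESERVED_SPACE)   /* 896MB */

struct mem_split {
    uint64_t low;   /* bytes covered by the direct map (ZONE_NORMAL) */
    uint64_t high;  /* bytes beyond the direct map (ZONE_HIGHMEM) */
};

/* Split a given amount of RAM into low and high memory. */
static struct mem_split split_ram(uint64_t ram_bytes)
{
    struct mem_split s;

    s.low = ram_bytes < LOWMEM_LIMIT ? ram_bytes : LOWMEM_LIMIT;
    s.high = ram_bytes - s.low;
    return s;
}
```

For a machine with 2GB of RAM this gives 896MB of low memory and 1152MB of high memory; a machine with 512MB has no high memory at all.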
Porting highmem
• Decide the RV32 Linux memory layout
– Refer to other architectures (arm, x86, nds32)
– (Optional) Move VMALLOC, FIXMAP, etc. above PAGE_OFFSET
• This leaves more address space for user processes.
• Add a PKMAP region to the virtual address space
– Used by kmap()
• Temporary mapping for a single page
• The page comes from ZONE_HIGHMEM via alloc_page(__GFP_HIGHMEM)
• Create a page table for pkmap
• Add memory slots in FIXMAP for kmap_atomic()
• Add architecture-specific kmap() and kmap_atomic()
• A 64-bit platform does not need highmem: the kernel has 128GB of address space.
• After porting highmem, we would like the RV32 Linux memory layout to look as follows:
[Figure: RISC-V Linux 5.6 memory layout, RV32 and RV64 side by side. RV32: user 3G from 0x00000000 to PAGE_OFFSET (0xc0000000); kernel from PAGE_OFFSET to 0xffffffff, with the regions linear mapping (direct map), PKMAP, FIXMAP, PCI_IO, VMEMMAP, VMALLOC and Reserved. RV64: user from 0x00000000_00000000 (16777215 TB); kernel 128 GB from PAGE_OFFSET (0xffffffe0_00000000) to 0xffffffff_ffffffff; regions shown are linear mapping (direct map), VMALLOC, FIXMAP, VMEMMAP and PCI_IO.]
Move memory layout
(arch/riscv/include/asm/pgtable.h)
+#define VMALLOC_SIZE (SZ_128M)
+/* Reserve 4MB from top of RAM to align with PGDIR_SIZE */
+#define VMALLOC_END (0xffc00000UL)
+#define VMALLOC_START (VMALLOC_END - VMALLOC_SIZE)
…
#define VMEMMAP_END (VMALLOC_START - 1)
#define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE)
+#ifdef CONFIG_HIGHMEM
+/* Set LOWMEM_END alignment with PGDIR_SIZE */
+#define LOWMEM_END (ALIGN_DOWN(PKMAP_BASE, SZ_4M))
+#define LOWMEM_SIZE (LOWMEM_END - PAGE_OFFSET)
+#endif /* CONFIG_HIGHMEM */
#define TASK_SIZE PAGE_OFFSET
---------------------------
(arch/riscv/include/asm/highmem.h)
+#define PKMAP_BASE (FIXADDR_START - SZ_2M)
[Figure: RV32 memory layout after the change. Regions and sizes: user 3G (0x00000000 up to TASK_SIZE at 0xc0000000), linear mapping (direct map, ending at LOWMEM_END), PKMAP 2M, FIXMAP 4M, PCI_IO 16M, VMEMMAP 16M, VMALLOC 128M (ending at VMALLOC_END) and Reserved 4M (up to 0xffffffff).]
• Locate VMALLOC_END and LOWMEM_END
• 4MB must be reserved at the top of the address space, above VMALLOC_END, to align with PGDIR_SIZE
• Add a new region for PKMAP
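To see where the regions land, the macros above can be evaluated in a small user-space sketch. The region sizes are the ones given in the figure. The placement of PCI_IO directly below VMEMMAP, and of FIXMAP directly below PCI_IO, is an assumption made for this example; the excerpt itself does not state it.

```c
#include <stdint.h>

#define SZ_2M    0x00200000u
#define SZ_4M    0x00400000u
#define SZ_16M   0x01000000u
#define SZ_128M  0x08000000u

/* Simplified power-of-two ALIGN_DOWN for this sketch. */
#define ALIGN_DOWN(x, a)  ((x) & ~((a) - 1u))

#define PAGE_OFFSET    0xc0000000u
#define VMALLOC_END    0xffc00000u                 /* 4MB reserved above */
#define VMALLOC_SIZE   SZ_128M
#define VMALLOC_START  (VMALLOC_END - VMALLOC_SIZE)
#define VMEMMAP_SIZE   SZ_16M                      /* size from the figure */
#define VMEMMAP_START  (VMALLOC_START - VMEMMAP_SIZE)

/* Assumption: PCI_IO (16M) sits directly below VMEMMAP, and
 * FIXMAP (4M) directly below PCI_IO. */
#define PCI_IO_START   (VMEMMAP_START - SZ_16M)
#define FIXADDR_START  (PCI_IO_START - SZ_4M)

#define PKMAP_BASE     (FIXADDR_START - SZ_2M)
#define LOWMEM_END     ALIGN_DOWN(PKMAP_BASE, SZ_4M)
#define LOWMEM_SIZE    (LOWMEM_END - PAGE_OFFSET)
```

Under these assumptions PKMAP_BASE evaluates to 0xf5600000, LOWMEM_END to 0xf5400000, and the linear mapping covers 852MB. Because PKMAP_BASE is 2MB-aligned but not 4MB-aligned, aligning LOWMEM_END down to PGDIR_SIZE leaves a 2MB gap between the end of low memory and the PKMAP region.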
Porting highmem
• setup_bootmem()
– max_low_pfn => the end of low memory
– max_pfn => the end of physical memory
(arch/riscv/mm/init.c)
void __init setup_bootmem(void)
{
…
#ifdef CONFIG_HIGHMEM
	max_low_pfn = (PFN_DOWN(__pa(LOWMEM_END)));
	max_pfn = PFN_DOWN(memblock_end_of_DRAM());
	memblock_set_current_limit(__pa(LOWMEM_END));
#else
…
[Figure: physical memory divided into low mem, ending at max_low_pfn, and high mem, ending at max_pfn.]
static void __init zone_sizes_init(void)
{
…
#endif
	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
#ifdef CONFIG_HIGHMEM
	max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
#endif
	free_area_init_nodes(max_zone_pfns);
}
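The two page frame numbers set in setup_bootmem() decide how many pages each zone receives. The sketch below models that calculation in user space. The function and struct names are hypothetical, and the clamp for machines whose RAM ends inside low memory is added for the model rather than taken from the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT   12
#define PFN_DOWN(x)  ((x) >> PAGE_SHIFT)

struct zone_pages {
    uint64_t normal;   /* pages in ZONE_NORMAL */
    uint64_t highmem;  /* pages in ZONE_HIGHMEM */
};

/* All arguments are physical addresses. lowmem_end_pa plays the role
 * of __pa(LOWMEM_END) in the excerpt above. */
static struct zone_pages count_zone_pages(uint64_t dram_start,
                                          uint64_t dram_end,
                                          uint64_t lowmem_end_pa)
{
    uint64_t min_pfn = PFN_DOWN(dram_start);
    uint64_t max_pfn = PFN_DOWN(dram_end);
    uint64_t max_low_pfn = PFN_DOWN(lowmem_end_pa);
    struct zone_pages z;

    /* If RAM ends before the low-memory limit, there is no highmem. */
    if (max_low_pfn > max_pfn)
        max_low_pfn = max_pfn;

    z.normal = max_low_pfn - min_pfn;
    z.highmem = max_pfn - max_low_pfn;
    return z;
}
```

As an example with made-up numbers, 2GB of DRAM starting at 0x80000000 and a low-memory limit 852MB above the DRAM base yields 0x35400 pages in ZONE_NORMAL and 0x4AC00 pages in ZONE_HIGHMEM.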
Porting highmem (cont.)
• Prepare a page table and install it in swapper_pg_dir
– swapper_pg_dir is the kernel's page directory.
• 32-bit RISC-V uses a two-level page table
[Figure: an entry in swapper_pg_dir (pgd) points to the pkmap page table (pmd, pkmap_p), whose pte entries point to pages in physical memory.]
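In the two-level scheme used by 32-bit RISC-V (Sv32), a virtual address splits into a 10-bit first-level index, a 10-bit second-level index and a 12-bit page offset. The sketch below shows that split in user space. The helper names mirror the kernel's pgd_index() and pte_index() but are re-implemented here for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT    12
#define PGDIR_SHIFT   22
#define PTRS_PER_PTE  1024u

/* Index into the first-level table (swapper_pg_dir for the kernel). */
static inline uint32_t pgd_index(uint32_t va)
{
    return va >> PGDIR_SHIFT;
}

/* Index into the second-level table selected by the first-level entry. */
static inline uint32_t pte_index(uint32_t va)
{
    return (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1u);
}

/* Byte offset inside the 4KB page. */
static inline uint32_t offset_in_page(uint32_t va)
{
    return va & ((1u << PAGE_SHIFT) - 1u);
}
```

Each first-level entry therefore covers 4MB (PGDIR_SIZE), which is why the layout aligns region boundaries to 4MB.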
Porting highmem (cont.)
• Add a new function, pkmap_init(), that creates the pkmap page table
start_kernel()
  -> setup_arch()
    -> paging_init()
      -> pkmap_init()
+static void __init pkmap_init(void)
+{
…
+	/*
+	 * Permanent kmaps:
+	 */
+	vaddr = PKMAP_BASE;
+
+	pgd = swapper_pg_dir + pgd_index(vaddr);
+	p4d = p4d_offset(pgd, vaddr);
+	pud = pud_offset(p4d, vaddr);
+	pmd = pmd_offset(pud, vaddr);
+	pkmap_p = (pte_t *)__va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
…
+	memset(pkmap_p, 0, PAGE_SIZE);
+	pfn = PFN_DOWN(__pa(pkmap_p));
+	set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) |
+			pgprot_val(__pgprot(_PAGE_TABLE))));
+
+	/* Adjust pkmap page table base */
+	pkmap_page_table = pkmap_p + pte_index(vaddr);
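The set_pmd() call in pkmap_init() packs the page frame number of the new page table into a directory entry. The sketch below models that encoding in user space. PTE_PFN_SHIFT and PTE_TABLE are stand-ins chosen for this example for the kernel's _PAGE_PFN_SHIFT (10) and _PAGE_TABLE (the valid bit only); the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT      12
#define PTE_PFN_SHIFT   10          /* stand-in for _PAGE_PFN_SHIFT */
#define PTE_VALID       (1u << 0)   /* V bit */
#define PTE_RWX         (7u << 1)   /* R, W and X bits */
#define PTE_TABLE       PTE_VALID   /* stand-in for _PAGE_TABLE */

/* Build an entry that points to a next-level page table located at
 * physical address table_pa (must be page aligned). */
static inline uint32_t make_table_entry(uint32_t table_pa)
{
    uint32_t pfn = table_pa >> PAGE_SHIFT;

    return (pfn << PTE_PFN_SHIFT) | PTE_TABLE;
}

/* Recover the physical address of the table an entry points to. */
static inline uint32_t table_entry_to_pa(uint32_t entry)
{
    return (entry >> PTE_PFN_SHIFT) << PAGE_SHIFT;
}

/* An entry is a leaf mapping only if any of R, W or X is set;
 * valid with R = W = X = 0 means "pointer to the next level". */
static inline int entry_is_leaf(uint32_t entry)
{
    return (entry & PTE_RWX) != 0;
}
```

The final line of the excerpt, pkmap_p + pte_index(vaddr), then selects the slot inside that table corresponding to PKMAP_BASE, which matters when PKMAP_BASE is not aligned to PGDIR_SIZE.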
Porting highmem (cont.)
// arch/riscv/include/asm/fixmap.h
enum fixed_addresses {
	FIX_PTE,
	FIX_PMD,
	FIX_EARLYCON_MEM_BASE,
+#ifdef CONFIG_HIGHMEM
+	FIX_KMAP_RESERVED,
+	FIX_KMAP_BEGIN,
+	FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS),
+#endif
+	__end_of_fixed_addresses,
};
#define FIXADDR_TOP (PKMAP_BASE)
#define FIXADDR_SIZE ((__end_of_fixed_addresses) << PAGE_SHIFT)
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
[Figure: the FIXMAP region between FIXADDR_START and FIXADDR_TOP, holding __end_of_fixed_addresses slots of 4K each. The kmap_atomic slots lie between FIX_KMAP_BEGIN and FIX_KMAP_END.]
• Add memory slots in FIXMAP for kmap_atomic()
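Each fixmap index corresponds to one 4K virtual page counted downwards from FIXADDR_TOP, and kmap_atomic() picks a per-CPU slot inside the FIX_KMAP range. The sketch below models that address calculation in user space. The values of FIXADDR_TOP and KM_TYPE_NR are made up for illustration, and FIX_KMAP_BEGIN is set to 4 because that is its position in the enum shown above.

```c
#include <stdint.h>

#define PAGE_SHIFT      12

/* Hypothetical values for illustration only. */
#define FIXADDR_TOP     0xf5c00000u
#define KM_TYPE_NR      16u

/* Position of FIX_KMAP_BEGIN in the enum above (FIX_PTE is 0). */
#define FIX_KMAP_BEGIN  4u

/* Same arithmetic as the generic __fix_to_virt(): slots grow
 * downwards from FIXADDR_TOP, one page per index. */
static inline uint32_t fix_to_virt(uint32_t idx)
{
    return FIXADDR_TOP - (idx << PAGE_SHIFT);
}

/* Slot used by a given CPU for a given atomic-kmap type. */
static inline uint32_t kmap_atomic_fix_index(uint32_t cpu, uint32_t type)
{
    return FIX_KMAP_BEGIN + type + KM_TYPE_NR * cpu;
}
```

Because every CPU owns its own block of KM_TYPE_NR slots, an atomic mapping never needs a lock or a search for a free slot, which is what makes it usable in contexts that cannot sleep.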
After porting
• If it succeeds, you will see …
After porting
• If it fails, you will see …
????
An end to high memory?
• I sent my first kernel patch upstream, but …
• Arnd Bergmann (Linaro) replied …
– I would much prefer to not see highmem added to new architectures
at all if possible, see https://lwn.net/Articles/813201/
• Johannes Weiner would like to improve memory-reclaim performance
– Inode-cache shrinking vs. highmem
• Inodes, being kernel data structures, must live in low memory
• Page-cache pages can be placed in high memory
• With a large number of one-byte files on a 7GB machine, low-memory pressure invokes the inode shrinker, which reclaims inodes that still have populated page cache. This can drop gigabytes of hot and active page cache.
• Linus Torvalds says …
Reference
• https://www.kernel.org/doc/Documentation/vm/highmem.txt
• https://lwn.net/Articles/813201/
• https://lkml.org/lkml/2020/4/2/253
Porting Generic KASAN to RISC-V
胡峻銘(Nick Hu)
About me
• 2015 ~ 2017 in NCTU
– MS
• Works at Andes Technology
– Software engineer in Linux kernel team
– 2017.12 ~ now
• Experience
– Linux kernel
• NDS32
– Perf, suspend to ram/standby, …
• RISC-V
– Perf, KASAN, …
– drivers
Outline
• What is KASAN?
• How to use KASAN?
• How does KASAN work?
• How to port Generic KASAN to RISC-V
• Patch to upstream
What is KASAN
• Kernel Address SANitizer (KASAN)
• A dynamic memory error detector for
– Out-of-bounds accesses
– Use-after-free
• It can detect errors in
– Global variables
– Stack
– Heap
• Works with the SLUB/SLAB memory allocators
How to use KASAN
• Select CONFIG_KASAN
• Requires GCC 4.9.2 or later
– KASAN relies on compile-time instrumentation
• Choose one of
– CONFIG_KASAN_OUTLINE
– CONFIG_KASAN_INLINE
How KASAN works
• Shadow memory records the status of memory
– One byte of shadow memory describes each 8 bytes of memory
– The shadow memory is filled with magic numbers
[Figure: a shadow memory byte (shown with the value 3) next to the 8 bytes of memory it describes.]
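The shadow byte encoding used by Generic KASAN can be illustrated with a small user-space model: 0 means all 8 bytes are accessible, a value N from 1 to 7 means only the first N bytes are accessible, and a negative value means the whole 8-byte granule is inaccessible. The function name and the 0xFF poison value below are hypothetical; the kernel uses several distinct magic numbers for different kinds of poisoned memory.

```c
#include <stdint.h>
#include <string.h>

#define KASAN_SHADOW_SCALE_SHIFT  3
#define KASAN_GRANULE  ((size_t)1 << KASAN_SHADOW_SCALE_SHIFT)  /* 8 bytes */
#define MODEL_POISON   0xFF   /* hypothetical magic number, -1 as int8_t */

/* shadow[i] describes bytes [8*i, 8*i + 7] of a modelled region that is
 * `granules` * 8 bytes long. Mark the first obj_size bytes accessible
 * and everything after them poisoned. */
static void shadow_mark_object(int8_t *shadow, size_t granules,
                               size_t obj_size)
{
    size_t full = obj_size >> KASAN_SHADOW_SCALE_SHIFT;
    size_t tail = obj_size & (KASAN_GRANULE - 1);

    memset(shadow, MODEL_POISON, granules);   /* poison everything */
    if (full > granules)
        full = granules;                      /* clamp to the region */
    memset(shadow, 0, full);                  /* fully accessible granules */
    if (tail != 0 && full < granules)
        shadow[full] = (int8_t)tail;          /* partially accessible */
}
```

For a 19-byte object in a 32-byte region, the four shadow bytes become 0, 0, 3 and a negative poison value: two full granules, one granule with three valid bytes, and one redzone granule.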
How KASAN works
• The compiler inserts check code before each load/store instruction
– Example:
ffffffe0000022f4: 41e080e7 jalr 1054(ra) # ffffffe00036770e <__asan_load1>
ffffffe0000022f8: 00094783 lbu a5,0(s2)
ffffffe0000022fc: 00190b13 addi s6,s2,1
ffffffe000002300: cbc9 beqz a5,ffffffe000002392 <mount_block_root+0x148>
ffffffe000002302: 01879963 bne a5,s8,ffffffe000002314 <mount_block_root+0xca>
ffffffe000002306: 854a mv a0,s2
ffffffe000002308: 00365097 auipc ra,0x365
ffffffe00000230c: 450080e7 jalr 1104(ra) # ffffffe000367758 <__asan_store1>
ffffffe000002310: fe0b0fa3 sb zero,-1(s6)
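In the disassembly above, each lbu or sb is preceded by a call to __asan_load1 or __asan_store1. The test behind such a 1-byte check can be modelled in user space as follows. This is a simplified sketch over a small shadow array, not the kernel's implementation; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT  3
#define KASAN_SHADOW_MASK  (((size_t)1 << KASAN_SHADOW_SCALE_SHIFT) - 1)

/* `offset` is a byte offset into a modelled region, and `shadow` holds
 * one signed byte per 8 bytes of that region. Returns true if a 1-byte
 * access at `offset` should be reported. */
static bool byte_is_poisoned(const int8_t *shadow, size_t offset)
{
    int8_t s = shadow[offset >> KASAN_SHADOW_SCALE_SHIFT];
    int8_t last_accessed;

    if (s == 0)
        return false;               /* whole granule accessible */

    /* Position of the accessed byte inside its 8-byte granule. */
    last_accessed = (int8_t)(offset & KASAN_SHADOW_MASK);

    /* s in 1..7: only the first s bytes are valid.
     * s negative: the comparison is always true, so every byte fails. */
    return last_accessed >= s;
}
```

The common case costs one shadow load and one compare, which is why the check can be placed in front of every memory access.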
How to port Generic KASAN to RISC-V
• From Documentation/dev-tools/kasan.rst
– Generic KASAN dedicates 1/8th of kernel memory to its shadow memory
• KASAN_SHADOW_SCALE_SHIFT=3
• Decide the shadow memory address range
– KASAN_SHADOW_SIZE
• (UL(1) << (38 - KASAN_SHADOW_SCALE_SHIFT))
– KASAN_SHADOW_START
• 0xffffffc000000000 /* 2^64 - 2^38 */
– KASAN_SHADOW_END
• (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
How to port Generic KASAN to RISC-V
• Translate a memory address to the corresponding shadow address
– kasan_mem_to_shadow()
– KASAN_SHADOW_OFFSET
• By definition: shadow_addr - (addr >> KASAN_SHADOW_SCALE_SHIFT)
• Value: (KASAN_SHADOW_END - (1ULL << (64 - KASAN_SHADOW_SCALE_SHIFT)))
static inline void *kasan_mem_to_shadow(const void *addr)
{
	return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
		+ KASAN_SHADOW_OFFSET;
}
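The constants from the two slides above can be checked with 64-bit arithmetic. The sketch below re-implements the translation on plain integers in user space; mem_to_shadow() is a stand-in for kasan_mem_to_shadow() that avoids pointer arithmetic.

```c
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT  3
#define KASAN_SHADOW_SIZE   (UINT64_C(1) << (38 - KASAN_SHADOW_SCALE_SHIFT))
#define KASAN_SHADOW_START  UINT64_C(0xffffffc000000000)  /* 2^64 - 2^38 */
#define KASAN_SHADOW_END    (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
#define KASAN_SHADOW_OFFSET \
    (KASAN_SHADOW_END - (UINT64_C(1) << (64 - KASAN_SHADOW_SCALE_SHIFT)))

/* Integer version of kasan_mem_to_shadow(): one shadow byte per
 * 8 bytes of memory, shifted into place by a constant offset. */
static inline uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

With these values KASAN_SHADOW_OFFSET is 0xdfffffc800000000. The offset is chosen so that the highest address, 0xffffffffffffffff, maps to the last shadow byte (KASAN_SHADOW_END - 1), and the 32GB shadow region describes the top 2^38 bytes of the address space.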
How to port Generic KASAN to RISC-V
• kasan_early_init()
– Initializes the shadow memory mapping at an early stage
• Everything maps to ‘kasan_early_shadow_page’
• kasan_init()
– Maps the memory areas that do not need checking to ‘kasan_early_shadow_page’
– Allocates physical memory for the remaining shadow memory
• KASAN can then poison this shadow memory
Patch to upstream
• V1
– https://lkml.org/lkml/2019/8/7/135
• V4
– https://lkml.org/lkml/2019/10/27/814
END

Experience on porting HIGHMEM and KASAN to RISC-V at COSCUP 2020

  • 1. Experience on porting HIGHMEM to 32bit RISC-V Linux Eric Lin 2020.8.2
  • 2. About me • 2016 ~ 2018 NCKU – MS • 2018.12 ~ 2020.7 Andes technology – Software engineer in Linux kernel team • Experience – RISC-V Linux kernel – Device driver – U-BOOT • Email:dslin1010@gmail.com
  • 3. Outline
      • Why do we need highmem?
      • Porting highmem
      • An end to high memory?
  • 4. Why do we need high memory?
      • In the earliest days, the kernel maintained a "direct map" that mapped all physical memory into kernel space.
          – This makes it easy for the kernel to manipulate any page in the system.
      • A 32-bit platform has only 4GB of virtual address space.
          – To reduce the TLB flush cost when switching between kernel and user space,
          – the 4GB address space is split 1:3 (1GB => kernel, 3GB => user).
      • With the direct map alone, the kernel can map only 1GB of physical memory.
      [Figure: the 32-bit address space from 0x00000000 to 0xFFFFFFFF, split at 0xC0000000 into 3G of user space and 1G of kernel space (direct map); the kernel region maps 1G of physical memory through va_pa_offset.]
  • 5. Why do we need high memory? (cont.)
      • About 896MB is reserved for the linear mapping (direct map) => low memory
      • The kernel virtual address space above 896MB holds:
          – Temporary mappings
              • PKMAP => kmap(); VMALLOC => vmalloc(), vmap()
          – Permanent mappings
              • FIXMAP => DTB, early_ioremap()
      [Figure: i386 layout. User space takes 3G and the kernel takes 1G starting at 0xC0000000: an 896MB linear mapping (direct map), then VMALLOC, PKMAP and FIXMAP. With physical memory > 1G, the first 896MB is low mem (ZONE_NORMAL), reached through va_pa_offset, and the rest is high mem (ZONE_HIGHMEM).]
  • 6. Porting highmem
      • Decide the RV32 Linux memory layout
          – Refer to other architectures (arm, x86, nds32)
          – (Optional) Move VMALLOC, FIXMAP, etc. after PAGE_OFFSET
              • This leaves more address space for user processes.
      • Add a PKMAP region to the virtual address space
          – for kmap()
              • Temporary mapping for a single page
              • alloc_page(__GFP_HIGHMEM) allocates from ZONE_HIGHMEM
              • Create a page table for pkmap
      • Add memory slots in FIXMAP for kmap_atomic()
      • Add the architecture-specific kmap() and kmap_atomic()
  • 7.
      • A 64-bit platform does not need highmem => the kernel has 128GB of address space.
      • After porting highmem, we would like the RV32 Linux memory layout to look as below:
      [Figure: RISC-V 5.6 Linux memory layout, RV32 next to RV64. RV32 spans 0x00000000 to 0xffffffff with PAGE_OFFSET at 0xc0000000; regions shown: user (3G), kernel linear mapping (direct map), PKMAP, FIXMAP, PCI_IO, VMEMMAP, VMALLOC, Reserved. RV64 spans 0x00000000_00000000 to 0xffffffff_ffffffff with PAGE_OFFSET at 0xffffffe0_00000000; regions shown: user, FIXMAP, PCI_IO, VMEMMAP, VMALLOC, kernel linear mapping (direct map); size labels: 128 GB and 16777215 TB.]
  • 8. Move memory layout
      (arch/riscv/include/asm/pgtable.h)
          +#define VMALLOC_SIZE     (SZ_128M)
          +/* Reserve 4MB from top of RAM to align with PGDIR_SIZE */
          +#define VMALLOC_END      (0xffc00000UL)
          +#define VMALLOC_START    (VMALLOC_END - VMALLOC_SIZE)
          …..
           #define VMEMMAP_END      (VMALLOC_START - 1)
           #define VMEMMAP_START    (VMALLOC_START - VMEMMAP_SIZE)
          +#ifdef CONFIG_HIGHMEM
          +/* Set LOWMEM_END alignment with PGDIR_SIZE */
          +#define LOWMEM_END       (ALIGN_DOWN(PKMAP_BASE, SZ_4M))
          +#define LOWMEM_SIZE      (LOWMEM_END - PAGE_OFFSET)
          +#endif /* CONFIG_HIGHMEM */
           #define TASK_SIZE        PAGE_OFFSET
      (arch/riscv/include/asm/highmem.h)
          +#define PKMAP_BASE       (FIXADDR_START - SZ_2M)
      • Locate VMALLOC_END and LOWMEM_END
      • 4MB must be reserved from the top of RAM to align with PGDIR_SIZE
      • Add a new region for PKMAP
      [Figure: RV32 layout from 0x00000000 to 0xffffffff with TASK_SIZE at 0xc0000000, LOWMEM_END and VMALLOC_END marked; regions labelled: user 3G, linear mapping (direct map), PKMAP 2M, FIXMAP 4M, PCI_IO 16M, VMEMMAP 16M, VMALLOC 128M, Reserved 4M.]
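The address arithmetic on this slide can be worked through in user space. The sketch below is only an illustration, not the patch itself: it takes the region sizes from the diagram labels and assumes the regions are stacked downwards from VMALLOC_END in the order VMALLOC, VMEMMAP, PCI_IO, FIXMAP, PKMAP. That stacking order, the fixed 4M FIXMAP size, and the helper names `align_down()`, `lowmem_end()` and `lowmem_size()` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Region sizes as labelled in the slide's diagram. */
#define SZ_2M   0x00200000u
#define SZ_4M   0x00400000u
#define SZ_16M  0x01000000u
#define SZ_128M 0x08000000u

#define PAGE_OFFSET   0xc0000000u
#define VMALLOC_END   0xffc00000u   /* leaves the top 4MB reserved */
#define VMALLOC_SIZE  SZ_128M
#define VMALLOC_START (VMALLOC_END - VMALLOC_SIZE)

/* Assumption: the remaining regions are stacked downwards in this order. */
#define VMEMMAP_START (VMALLOC_START - SZ_16M)
#define PCI_IO_START  (VMEMMAP_START - SZ_16M)
#define FIXADDR_START (PCI_IO_START - SZ_4M)
#define PKMAP_BASE    (FIXADDR_START - SZ_2M)

/* Round x down to a multiple of a; a must be a power of two. */
static inline uint32_t align_down(uint32_t x, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return x & ~(a - 1);
}

/* End of the linear mapping, aligned to one PGD entry (4MB on Sv32). */
static inline uint32_t lowmem_end(void)
{
    return align_down(PKMAP_BASE, SZ_4M);
}

/* Number of bytes the direct map can cover. */
static inline uint32_t lowmem_size(void)
{
    return lowmem_end() - PAGE_OFFSET;
}
```

Under these assumptions PKMAP_BASE comes out at 0xf5600000, LOWMEM_END at 0xf5400000, and the direct map covers 852MB. The real numbers depend on the actual fixmap size in the tree being patched.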
  • 9. Porting highmem
      • setup_bootmem()
          – max_low_pfn => the end of low memory
          – max_pfn => the end of physical memory
      (arch/riscv/mm/init.c)
          void __init setup_bootmem(void)
          {
              …
          #ifdef CONFIG_HIGHMEM
              max_low_pfn = (PFN_DOWN(__pa(LOWMEM_END)));
              max_pfn = PFN_DOWN(memblock_end_of_DRAM());
              memblock_set_current_limit(__pa(LOWMEM_END));
          #else

          static void __init zone_sizes_init(void)
          {
              …….
          #endif
              max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
          #ifdef CONFIG_HIGHMEM
              max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
          #endif
              free_area_init_nodes(max_zone_pfns);
          }
      [Figure: physical memory divided into low mem, ending at max_low_pfn, and high mem, ending at max_pfn.]
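To see how the two PFN limits split physical memory into ZONE_NORMAL and ZONE_HIGHMEM, here is a minimal user-space sketch. The LOWMEM_END value, the DRAM base and size used in the usage note, the `struct zone_limits` type and the `compute_zone_limits()` helper are all hypothetical; the clamp for small-memory systems is an addition of this sketch and is not shown on the slide.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define PAGE_OFFSET 0xc0000000ull
#define LOWMEM_END  0xf5400000ull   /* hypothetical, 4MB-aligned */

struct zone_limits {
    uint64_t max_low_pfn;    /* first PFN past ZONE_NORMAL */
    uint64_t max_pfn;        /* first PFN past the end of DRAM */
    uint64_t highmem_pages;  /* pages that land in ZONE_HIGHMEM */
};

/* Mimics PFN_DOWN(__pa(LOWMEM_END)) and PFN_DOWN(memblock_end_of_DRAM()). */
static struct zone_limits compute_zone_limits(uint64_t dram_base,
                                              uint64_t dram_size)
{
    struct zone_limits z;
    /* __pa() for the linear mapping: pa = va - (PAGE_OFFSET - dram_base) */
    uint64_t lowmem_end_pa = dram_base + (LOWMEM_END - PAGE_OFFSET);

    assert((dram_base & ((1ull << PAGE_SHIFT) - 1)) == 0);

    z.max_pfn = (dram_base + dram_size) >> PAGE_SHIFT;
    z.max_low_pfn = lowmem_end_pa >> PAGE_SHIFT;
    /* Sketch-only clamp: with little RAM everything is low memory. */
    if (z.max_low_pfn > z.max_pfn)
        z.max_low_pfn = z.max_pfn;
    z.highmem_pages = z.max_pfn - z.max_low_pfn;
    return z;
}
```

For example, with DRAM at 0x80000000 and 1GB of RAM, `compute_zone_limits(0x80000000ull, 0x40000000ull)` puts 852MB in ZONE_NORMAL and leaves 44032 pages (172MB) for ZONE_HIGHMEM.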
  • 10. Porting highmem (cont.)
      • Prepare a page table and install it in swapper_pg_dir
          – swapper_pg_dir is the kernel's page directory pointer.
      • 32-bit RISC-V uses a 2-level page table
      [Figure: swapper_pg_dir points to the pgd; the pgd/pmd entry points to the pkmap page table (pkmap_p), whose pte entries point to PAGEs in physical memory.]
  • 11. Porting highmem (cont.)
      • Add a new function pkmap_init() for creating the pkmap page table
      • Call path: start_kernel() -> setup_arch() -> paging_init() -> pkmap_init()
          +static void __init pkmap_init(void)
          +{
           .
          +	/*
          +	 * Permanent kmaps:
          +	 */
          +	vaddr = PKMAP_BASE;
          +
          +	pgd = swapper_pg_dir + pgd_index(vaddr);
          +	p4d = p4d_offset(pgd, vaddr);
          +	pud = pud_offset(p4d, vaddr);
          +	pmd = pmd_offset(pud, vaddr);
          +	pkmap_p = (pte_t *)__va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
           …….
          +	memset(pkmap_p, 0, PAGE_SIZE);
          +	pfn = PFN_DOWN(__pa(pkmap_p));
          +	set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) |
          +			pgprot_val(__pgprot(_PAGE_TABLE))));
          +
          +	/* Adjust pkmap page table base */
          +	pkmap_page_table = pkmap_p + pte_index(vaddr);
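The index arithmetic behind `pgd_index(vaddr)` and `pte_index(vaddr)` can be checked in isolation. The sketch below models Sv32 paging (4KB pages, 1024 entries per table, so one top-level entry maps 4MB); the PKMAP_BASE value and the `fits_one_pte_table()` helper are hypothetical and only illustrate why a 2MB PKMAP region can be served by the single page allocated for `pkmap_p`.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22            /* Sv32: one PGD entry maps 4MB */
#define PTRS_PER_PTE 1024u
#define PKMAP_BASE   0xf5600000u   /* hypothetical */
#define PKMAP_SIZE   0x00200000u   /* 2MB, as on the layout slide */

/* Index of the top-level entry that covers va. */
static inline uint32_t pgd_index(uint32_t va)
{
    return va >> PGDIR_SHIFT;
}

/* Index of the last-level entry that covers va. */
static inline uint32_t pte_index(uint32_t va)
{
    return (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* True if [start, start + size) is covered by one last-level table. */
static inline int fits_one_pte_table(uint32_t start, uint32_t size)
{
    assert(size > 0);
    return pgd_index(start) == pgd_index(start + size - 1);
}
```

With this hypothetical base, the PKMAP region uses top-level entry 981 and last-level entries 512 to 1023 of one table, which is why `pkmap_page_table` is set to `pkmap_p + pte_index(vaddr)` rather than to `pkmap_p` itself.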
  • 12. Porting highmem (cont.)
      • Add memory slots in FIXMAP
      //arch/riscv/include/asm/fixmap.h
           enum fixed_addresses {
               FIX_PTE,
               FIX_PMD,
               FIX_EARLYCON_MEM_BASE,
          +#ifdef CONFIG_HIGHMEM
          +    FIX_KMAP_RESERVED,
          +    FIX_KMAP_BEGIN,
          +    FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS),
          +#endif
          +
               __end_of_fixed_addresses,
           };
           #define FIXADDR_TOP      (PKMAP_BASE)
           #define FIXADDR_SIZE     ((__end_of_fixed_addresses) << PAGE_SHIFT)
           #define FIXADDR_START    (FIXADDR_TOP - FIXADDR_SIZE)
      [Figure: the fixmap area between FIXADDR_START and FIXADDR_TOP, divided into 4K slots up to __end_of_fixed_addresses; the slots from FIX_KMAP_BEGIN to FIX_KMAP_END are used by kmap_atomic.]
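As a rough illustration of how a kmap_atomic() slot turns into a virtual address, the sketch below reproduces the enum from the slide and applies the generic fixmap rule that slot `idx` lives at `FIXADDR_TOP - (idx << PAGE_SHIFT)`. The FIXADDR_TOP, KM_TYPE_NR and NR_CPUS values are hypothetical, the end marker is renamed `END_OF_FIXED_ADDRESSES` for user-space use, and `fix_to_virt()` and `kmap_slot_vaddr()` are helper names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define FIXADDR_TOP 0xf5c00000u   /* hypothetical */
#define KM_TYPE_NR  20            /* hypothetical: slots per CPU */
#define NR_CPUS     4             /* hypothetical */

enum fixed_addresses {
    FIX_PTE,
    FIX_PMD,
    FIX_EARLYCON_MEM_BASE,
    FIX_KMAP_RESERVED,
    FIX_KMAP_BEGIN,
    FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS),
    END_OF_FIXED_ADDRESSES,       /* __end_of_fixed_addresses on the slide */
};

/* Fixmap slots grow downwards from FIXADDR_TOP, one page per index. */
static inline uint32_t fix_to_virt(uint32_t idx)
{
    assert(idx < END_OF_FIXED_ADDRESSES);
    return FIXADDR_TOP - (idx << PAGE_SHIFT);
}

/* Each CPU owns KM_TYPE_NR consecutive slots after FIX_KMAP_BEGIN. */
static inline uint32_t kmap_slot_vaddr(uint32_t cpu, uint32_t type)
{
    assert(cpu < NR_CPUS && type < KM_TYPE_NR);
    return fix_to_virt(FIX_KMAP_BEGIN + type + KM_TYPE_NR * cpu);
}
```

Giving every CPU its own block of slots means an atomic mapping never has to take a lock or sleep, at the cost of NR_CPUS * KM_TYPE_NR pages of virtual address space.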
  • 13. After porting • If it succeeds, you will see …
  • 14. After porting • If it fails, you will see … ????
  • 15. An end to high memory?
      • I sent my first kernel patch upstream, but …
      • Arnd Bergmann (Linaro) replied …
          – I would much prefer to not see highmem added to new architectures at all if possible, see https://lwn.net/Articles/813201/
      • Weiner would like to improve memory-reclaim performance
          – Inode-cache shrinking vs. highmem
              • Inodes, being kernel data structures => low memory
              • Page-cache pages => can be placed in high memory
              • With a large number of one-byte files on a 7G machine, the inode shrinker is invoked and reclaims inodes that still have populated page cache. This can drop gigabytes of hot and active page cache.
      • Linus Torvalds says …
  • 17. Port Generic KASAN on RISC V 胡峻銘(Nick Hu)
  • 18. About me
      • 2015 ~ 2017: NCTU
          – MS
      • Work at Andes Technology
          – Software engineer in the Linux kernel team
          – 2017.12 ~ now
      • Experience
          – Linux kernel
              • NDS32
                  – Perf, suspend to RAM/standby, …
              • RISC-V
                  – Perf, KASAN, …
          – Drivers
  • 19. Outline
      • What is KASAN?
      • How to use KASAN?
      • How does KASAN work?
      • How to port Generic KASAN to RISC-V
      • Patch to upstream
  • 20. What is KASAN
      • Kernel Address SANitizer (KASAN)
      • Dynamic memory error detector
          – Out-of-bounds accesses
          – Use-after-free
      • It can detect errors in
          – Global variables
          – Stack
          – Heap
      • Works with the slub/slab memory allocators
  • 21. How to use KASAN
      • Select CONFIG_KASAN
      • Needs GCC version 4.9.2 or later
          – Compile-time instrumentation
      • Choose
          – CONFIG_KASAN_OUTLINE or
          – CONFIG_KASAN_INLINE
  • 22. How KASAN works
      • Shadow memory is used to mark the status of memory
          – One byte of shadow memory describes each 8 bytes of memory
          – The shadow memory is filled with magic numbers
      [Figure: a shadow memory byte (labelled 3) and the 8 bytes of memory it describes.]
  • 23. How KASAN works
      • The compiler inserts check code before load/store instructions
          – Example:
              ffffffe0000022f4: 41e080e7  jalr  1054(ra) # ffffffe00036770e <__asan_load1>
              ffffffe0000022f8: 00094783  lbu   a5,0(s2)
              ffffffe0000022fc: 00190b13  addi  s6,s2,1
              ffffffe000002300: cbc9      beqz  a5,ffffffe000002392 <mount_block_root+0x148>
              ffffffe000002302: 01879963  bne   a5,s8,ffffffe000002314 <mount_block_root+0xca>
              ffffffe000002306: 854a      mv    a0,s2
              ffffffe000002308: 00365097  auipc ra,0x365
              ffffffe00000230c: 450080e7  jalr  1104(ra) # ffffffe000367758 <__asan_store1>
              ffffffe000002310: fe0b0fa3  sb    zero,-1(s6)
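What a call such as `__asan_load1` has to decide can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel's implementation: it uses a small static shadow array indexed by offset instead of real addresses, and the names `shadow`, `poison_all()`, `unpoison()` and `load1_is_bad()` are made up for the example. It follows the generic KASAN shadow encoding: 0 means all 8 bytes are accessible, a value N from 1 to 7 means only the first N bytes are, and a negative value means the whole 8-byte granule is poisoned.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3
#define GRANULE       (1u << KASAN_SHADOW_SCALE_SHIFT)  /* 8 bytes */
#define MEM_SIZE      64u
#define SHADOW_POISON ((int8_t)-1)  /* hypothetical magic value */

/* One shadow byte per 8 bytes of (simulated) memory. */
static int8_t shadow[MEM_SIZE >> KASAN_SHADOW_SCALE_SHIFT];

/* Mark every granule as inaccessible. */
static void poison_all(void)
{
    for (uint32_t g = 0; g < sizeof shadow; g++)
        shadow[g] = SHADOW_POISON;
}

/* Make `size` bytes starting at granule-aligned offset `off` accessible. */
static void unpoison(uint32_t off, uint32_t size)
{
    assert(off % GRANULE == 0 && off + size <= MEM_SIZE);
    uint32_t g = off >> KASAN_SHADOW_SCALE_SHIFT;
    for (; size >= GRANULE; size -= GRANULE)
        shadow[g++] = 0;            /* whole granule accessible */
    if (size)
        shadow[g] = (int8_t)size;   /* only the first `size` bytes */
}

/* The check a 1-byte load needs: non-zero means "report a bug". */
static int load1_is_bad(uint32_t off)
{
    int8_t s = shadow[off >> KASAN_SHADOW_SCALE_SHIFT];
    if (s == 0)
        return 0;
    return (int8_t)(off & (GRANULE - 1)) >= s;
}
```

After `poison_all()` and `unpoison(0, 13)`, offsets 0 to 12 pass the check, while offset 13 (just past the object) and offset 16 (a fully poisoned granule) are flagged.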
  • 24. How to port Generic KASAN on RISC-V
      • In Documentation/dev-tools/kasan.rst
          – Generic KASAN dedicates 1/8th of kernel memory to its shadow memory
              • KASAN_SHADOW_SCALE_SHIFT=3
      • Decide the shadow memory address
          – KASAN_SHADOW_SIZE
              • (UL(1) << (38 - KASAN_SHADOW_SCALE_SHIFT))
          – KASAN_SHADOW_START
              • 0xffffffc000000000 /* 2^64 - 2^38 */
          – KASAN_SHADOW_END
              • (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
  • 25. How to port Generic KASAN on RISC-V
      • Translate a memory address to the corresponding shadow address
          – kasan_mem_to_shadow()
          – KASAN_SHADOW_OFFSET
              • shadow_addr - (addr >> KASAN_SHADOW_SCALE_SHIFT)
              • (KASAN_SHADOW_END - (1ULL << (64 - KASAN_SHADOW_SCALE_SHIFT)))
          static inline void *kasan_mem_to_shadow(const void *addr)
          {
              return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
                      + KASAN_SHADOW_OFFSET;
          }
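The constants from the two KASAN slides can be checked numerically. The sketch below uses plain 64-bit integers instead of pointers and a helper named `mem_to_shadow()` (a stand-in for `kasan_mem_to_shadow()`, chosen for this example) to confirm that the top 2^38 bytes of the address space map exactly onto [KASAN_SHADOW_START, KASAN_SHADOW_END).

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_SIZE   (1ull << (38 - KASAN_SHADOW_SCALE_SHIFT))
#define KASAN_SHADOW_START  0xffffffc000000000ull   /* 2^64 - 2^38 */
#define KASAN_SHADOW_END    (KASAN_SHADOW_START + KASAN_SHADOW_SIZE)
#define KASAN_SHADOW_OFFSET \
    (KASAN_SHADOW_END - (1ull << (64 - KASAN_SHADOW_SCALE_SHIFT)))

/* The shadow region is 1/8th of the 2^38 bytes it describes. */
static_assert((KASAN_SHADOW_SIZE << KASAN_SHADOW_SCALE_SHIFT) == (1ull << 38),
              "shadow must cover 2^38 bytes");

/* Integer version of kasan_mem_to_shadow() from the slide. */
static inline uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}
```

With these values KASAN_SHADOW_OFFSET evaluates to 0xdfffffc800000000: the highest address 0xffffffffffffffff maps to KASAN_SHADOW_END - 1, and 0xffffffc000000000 maps to KASAN_SHADOW_START.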
  • 26. How to port Generic KASAN on RISC-V
      • kasan_early_init()
          – Initialize the shadow-memory mapping at an early stage
              • Everything maps to 'kasan_early_shadow_page'
      • kasan_init()
          – Map the memory areas that don't need checking to 'kasan_early_shadow_page'
          – Allocate physical space for the shadow memory
              • KASAN can then poison the shadow memory
  • 27. Patch to upstream
      • V1
          – https://lkml.org/lkml/2019/8/7/135
      • V4
          – https://lkml.org/lkml/2019/10/27/814
  • 28. END

Editor's Notes

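  1. Hi everyone, I'm Eric. I'm very happy to be at COSCUP today to share how to port HIGHMEM to the 32-bit RISC-V Linux kernel.
  2. A brief self-introduction: after my master's degree I joined the Linux kernel team at Andes Technology. I am still learning about the Linux kernel.
  3. I will mainly cover why Linux has the HIGHMEM mechanism, how to port highmem, and whether the highmem mechanism is going to be deprecated. That last topic comes mainly from an LWN article published this February.
  4. In the early days the kernel used a direct map to map all physical memory into kernel space. The benefit is that it is convenient for the Linux kernel to manage those pages. A 32-bit platform has only a 4GB address space. To reduce the TLB flush overhead when switching between kernel and user space, the kernel splits the 4GB address space 1:3. As a result the kernel direct map can cover only 1GB of physical memory. If physical memory is larger than 1GB, how can the kernel map more than 1GB of memory?
  5. highmem mainly reserves 896MB for the direct map. The corresponding physical memory is called low mem (normal zone), and anything above 896MB is called high mem (high memory zone). In the virtual address space there are mainly two mapping mechanisms. Temporary mapping => PKMAP (kmap), VMALLOC (vmalloc, vmap). Permanent mapping: FIXADDR or FIXMAP => (DTB or early ioremap).
  6. With the concepts above, we can start porting highmem. Here I mainly share a few key points of the port. Understand the RV32 Linux memory layout; here I mainly referred to arm and x86. Add a PKMAP region for kmap(); kmap mainly establishes the va -> pa mapping when we allocate a page from the highmem zone, so a page table has to be created for pkmap. Add some memory slots. Add the architecture kmap() and kmap_atomic().
  7. First we need to understand the RV64 memory layout as of 5.6, shown in the figure on the right. As a reminder, 64-bit does not need the highmem mechanism, because its direct map already covers 128GB, which is plenty. Next, we want to move vmalloc, fixmap and so on upward so that user space gets more room.
  8. This is mainly in the pgtable.h file, which defines where the whole layout starts and ends. Also note that 4MB of space is reserved because it must align with PGDIR_SIZE; without that alignment the system will not boot.
  9. Next, two important parameters are defined in setup_bootmem(): max_low_pfn => the PFN where low memory ends; max_pfn => the end of the whole physical memory. Linux needs these two parameters when it initializes each zone, which is what you see on the boot screen, where the range of each zone is computed.
  10. Next, as mentioned, we need to create a page table for the PKMAP region, mainly to establish the va-to-pa mapping for kmap. First request a page from the system (red box); this page is pkmap's page table. Assign it to swapper_pg_dir; swapper_pg_dir is the kernel page directory pointer. I mentioned earlier why 4MB is reserved: as we can see, one pgd entry manages 4MB, and without that alignment the pmd entry of the next level cannot be found.
  11. This is mainly the implementation of the previous slide: a pkmap_init has to be added to paging_init, which is called from start_kernel.
  12. After porting, if everything goes well, the boot screen shows two zones, and the memory layout shows the PKMAP region we just added. After entering the shell, the memory info shows how many high memory pages the kernel has.
  13. Usually this is when you have to start debugging.
  14. After getting highmem ported, I wanted to upstream the first kernel patch of my life, but after sending the patch, Linaro ...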