QEMU Sandboxing for dummies

Eduardo Otubo <otubo@redhat.com>
Senior Software Engineer
27/Jan/2018
Agenda
1. Secure Computing: the basics
2. Libseccomp
3. QEMU sandboxing v1
4. QEMU sandboxing v2 and more options
Secure Computing: the basics
● The first kernel version with seccomp support dates from March 8th, 2005 (2.6.12)
Commit by: Andrea Arcangeli
● A process calls prctl() with PR_SET_SECCOMP on itself, after which it is
allowed only: exit(), sigreturn(), read() and write()
○ Any other system call terminates the process: SIGKILL in this original
strict mode, SIGSYS in the later filter mode
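The strict mode described above can be tried out with a few lines of C. This is only a minimal sketch, assuming a Linux kernel built with seccomp support; the function name strict_mode_demo is made up for illustration. A child process enters strict mode, performs an allowed write(), then attempts getpid(), which is outside the allowed set.

```c
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper (not from the slides): fork a child, put it in
 * seccomp strict mode and let it attempt a forbidden system call.
 * Returns the signal that terminated the child, or -1 if the child
 * did not die from a signal or an error occurred.
 */
int strict_mode_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* From here on only read(), write(), exit() and sigreturn() work. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            syscall(SYS_exit, 2);        /* strict mode not available */

        static const char msg[] = "write() is still allowed\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)n;

        syscall(SYS_getpid);             /* forbidden: the kernel sends SIGKILL */
        syscall(SYS_exit, 0);            /* not reached */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The child leaves through the raw exit system call because the C library's _exit() uses exit_group(), which strict mode does not allow.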
● Second kernel implementation, with dynamic seccomp policies:
January 11th, 2011; commit by: Will Drewry <wad@chromium.org>
● Now used through the seccomp() system call
● Uses BPF (Berkeley Packet Filter)
○ An in-kernel data-link-layer packet filter whose abstracted API also
works as a generic filter
/* syscall_nr is assumed to be offsetof(struct seccomp_data, nr) */
struct sock_filter filter[] = {
    /* Grab the system call number */
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
    /* Jump table for the allowed syscalls */
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#ifdef __NR_sigreturn
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#endif
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
    BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
    BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
    /* ... */
};
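The listing above is only an excerpt. As a rough, self-contained sketch of the same mechanism, the code below builds a tiny filter and installs it with prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...). The choice of getppid() as the denied call and the function names are assumptions made for illustration. A production filter should also check the arch field of struct seccomp_data, which is left out here for brevity.

```c
#include <errno.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Install a filter that makes getppid() fail with EPERM and allows the rest. */
static int install_filter(void)
{
    struct sock_filter filter[] = {
        /* Load the system call number from struct seccomp_data. */
        BPF_STMT(BPF_LD + BPF_W + BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* getppid: fall through to the ERRNO return; anything else skips it. */
        BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, __NR_getppid, 0, 1),
        BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Unprivileged processes must set no_new_privs before adding a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/*
 * Illustrative helper: run the filter in a child process.
 * Returns 0 if the child saw getppid() fail with EPERM, 1 if the call
 * was not blocked, 2 if the filter could not be installed, -1 on error.
 */
int bpf_filter_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (install_filter() != 0)
            _exit(2);
        errno = 0;
        long rc = syscall(SYS_getppid);
        _exit(rc == -1 && errno == EPERM ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Returning an errno instead of killing the process keeps the sketch easy to observe; swapping the action for SECCOMP_RET_KILL gives the behaviour of the excerpt.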
Libseccomp
● Paul Moore (2011)
● Userspace layer to make life easier:
○ Abstracts complex BPF constructions
○ Abstracts differences between architectures and their ABIs
○ Optimizes filter construction for best performance
○ Offers several actions for a matched filter: kill, trap (SIGSYS), allow,
among others
QEMU sandboxing v1
static const struct QemuSeccompSyscall seccomp_whitelist[] = {
    { SCMP_SYS(timer_settime), 255 },
    { SCMP_SYS(timer_gettime), 254 },
    { SCMP_SYS(futex), 253 },
    { SCMP_SYS(select), 252 },
    { SCMP_SYS(recvfrom), 251 },
    { SCMP_SYS(sendto), 250 },
    { SCMP_SYS(read), 249 },
    { SCMP_SYS(brk), 248 },
    { SCMP_SYS(clone), 247 },
    { SCMP_SYS(mmap), 247 },
    { SCMP_SYS(mprotect), 246 },
    { SCMP_SYS(execve), 245 },
    { SCMP_SYS(open), 245 },
    { SCMP_SYS(ioctl), 245 },
    { SCMP_SYS(recvmsg), 245 },
    { SCMP_SYS(sendmsg), 245 },
    /* ... */
};
● Basic whitelist approach (--sandbox on)
○ Every system call is blocked, except for the ones that are explicitly
whitelisted
● Various compatibility problems; requires lots of testing with
different workloads
● It’s safe, right?
Not really!
● QEMU links to many different shared libraries, and there is no way
to determine which code paths QEMU triggers in these libraries and
thus to identify which syscalls are genuinely needed.
● Sometimes a missing syscall makes QEMU abort right at the beginning,
before boot (which is good?), but sometimes the VM has been running for
days and suddenly aborts (which is terrible)
QEMU sandboxing v2
● Extended blacklist approach (--sandbox on,...)
● Everything is allowed except for a few sets that are definitely not
allowed
○ Default system calls: basic set of forbidden system calls (kexec, swapon,
swapoff, mount, umount, etc.)
○ obsolete
○ elevateprivileges
○ spawn
○ resourcecontrol
Obsolete system calls
● Old system calls that were useful in the past but became obsolete or
were replaced by newer versions
○ Like readdir() being replaced by getdents()
● Blocked by default, but can be re-enabled with the option:
--sandbox on,obsolete=allow
Elevated Privileges
● This option blocks all set*uid|gid system calls, which are known
to be required by some features like bridge helpers
● This option also calls prctl(PR_SET_NO_NEW_PRIVS), which keeps
QEMU and any process it spawns from gaining privileges through execve()
● This mode can be switched on or off with the option:
--sandbox on,elevateprivileges=allow|deny|children
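The no_new_privs flag mentioned above can be observed directly. A minimal sketch, assuming Linux 3.5 or newer; the function name is made up for illustration. It sets the flag in a child process and reads it back with PR_GET_NO_NEW_PRIVS.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper: set no_new_privs in a child and report the flag.
 * Returns 1 if the flag is set in the child, 0 if not, -1 on error.
 */
int no_new_privs_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * One-way switch: once set it cannot be cleared, and it is
         * inherited across fork() and preserved across execve().
         */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
            _exit(255);
        /* PR_GET_NO_NEW_PRIVS returns the current value of the flag. */
        _exit(prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1 ? 1 : 0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status);
}
```

With the flag set, executing a setuid binary no longer changes the effective user ID, which is why helpers that rely on setuid stop working under this option.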
Spawn
● This option prevents new processes from being created with fork() or
exec() at all, privileged or not.
● Things like the bridge helper, the SMB server, ifup/down scripts and the
migration exec: protocol would all be disabled.
● This mode can be switched on or off with the option:
--sandbox on,spawn=allow|deny
Resource Control
● Prevents QEMU from setting process affinity, scheduler priority, etc.
● This shouldn’t be QEMU’s responsibility, but rather that of management
software like libvirt.
● This mode can be switched on or off with the option:
--sandbox on,resourcecontrol=allow|deny
QEMU sandboxing v2
static const struct QemuSeccompSyscall blacklist[] = {
    /* default set of syscalls to blacklist */
    { SCMP_SYS(reboot), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(swapon), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(swapoff), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(syslog), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(mount), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(umount), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(kexec_load), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(afs_syscall), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(break), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(ftime), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(getpmsg), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(gtty), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(lock), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(mpx), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(prof), QEMU_SECCOMP_SET_DEFAULT },
    { SCMP_SYS(profil), QEMU_SECCOMP_SET_DEFAULT },
    /* ... */
};
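In a table like this, each entry carries the set it belongs to, so the command-line options only need to select sets. The sketch below models that selection in plain C, without libseccomp. The flag names, their values, the string-based table and is_blocked() are all hypothetical, chosen only to illustrate the idea; the example entries are taken from the option descriptions earlier in the slides.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical set flags; the real QEMU values are not shown on the slides. */
enum {
    SET_DEFAULT         = 1 << 0,
    SET_OBSOLETE        = 1 << 1,
    SET_PRIVILEGED      = 1 << 2,
    SET_SPAWN           = 1 << 3,
    SET_RESOURCECONTROL = 1 << 4,
};

struct blacklist_entry {
    const char *name;   /* syscall name; a string keeps the sketch standalone */
    uint32_t    set;    /* the set this syscall belongs to */
};

static const struct blacklist_entry blacklist[] = {
    { "reboot",            SET_DEFAULT },
    { "mount",             SET_DEFAULT },
    { "readdir",           SET_OBSOLETE },
    { "setuid",            SET_PRIVILEGED },
    { "fork",              SET_SPAWN },
    { "execve",            SET_SPAWN },
    { "sched_setaffinity", SET_RESOURCECONTROL },
};

/*
 * Returns 1 if `name` would be denied given the bitmask of enabled sets,
 * 0 otherwise. System calls that are not listed are always allowed,
 * which is what makes this a blacklist.
 */
int is_blocked(const char *name, uint32_t enabled_sets)
{
    for (size_t i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++) {
        if (strcmp(blacklist[i].name, name) == 0)
            return (blacklist[i].set & enabled_sets) != 0;
    }
    return 0;
}
```

Under these assumptions, an option such as spawn=deny would simply add SET_SPAWN to the enabled mask, and a real implementation would add one deny rule per selected entry instead of answering a lookup.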
Some thoughts on QEMU sandboxing
● Sandboxing is not the definitive solution for security in virtualization,
but rather a good layer to stack on top of others like:
○ MAC/DAC (Mandatory Access Control and Discretionary Access Control)
○ SELinux
○ Remote management using SSH/TLS/SSL
○ Guest image encryption
○ Virtual Trusted Platform Module (vTPM)
● The sandbox v2 options are not low-level knobs to control system calls, but
rather high-level knobs to control concepts.
Questions?