FlexSC
Yongrae Jo, Mirae Lim
CSED504
A few comments before we start
● This implementation does not strictly follow the specification in the paper (e.g. the syscall thread is replaced by a workqueue worker thread)
● The paper contains two programmable parts
– FlexSC: exception-less system call mechanism
– libflexsc: developed for applying FlexSC to real applications (e.g. Apache). It is compliant with standard POSIX threads and requires modifying the glibc syscall wrappers and POSIX libraries
● We focus on the FlexSC implementation
When a system call occurs
[Figure: user/kernel timeline; the system call raises an exception and the user side is blocked while the kernel runs]
Traditional exception-based system calls have drawbacks
1. Direct cost
2. Indirect cost
The situation is even worse when...
[Figure: user/kernel timeline]
Costs of synchronous syscalls
● Mode switch cost (direct cost)
– flushing the user-mode pipeline
– saving a few registers onto the kernel stack
– allocating an execution stack
– changing the protection domain (ring 3 → 0)
– redirecting execution to the registered exception handler
– returning control back to user mode
● Processor structure pollution (indirect cost)
– user-mode state is replaced by kernel-mode state
– affected processor structures: L1 data and instruction caches, TLB, branch prediction tables, prefetch buffers, and the unified L2 and L3 caches → cache pollution
Benefits of FlexSC
● Lower direct cost
– Fewer mode switches, by separating cores into user-mode cores and kernel-mode cores
● Lower indirect cost
– Decoupling execution from invocation through system call scheduling on dedicated cores
● Linux policy: activation (invocation) and execution of a system call are bound together → is this policy always good?
Overview: when the user invokes a system call
[Figure, steps labeled 1 to 4: the user program runs normal user instructions up to the syscall boundary; through libflexsc it invokes the system call with a memory store operation into a sysentry (sysentry[0] ... sysentry[n]) of the syspage in shared memory; the scanner thread calls queue_work() on the syscall workqueue; worker threads on the other CPUs run syscall() and store the return value]
System call entry (sysentry)
● A collection of the information needed to execute a given system call
● status = {free, submitted, busy, done}
● Why 64 bytes? An even smaller size is possible
– 64 is a divisor of the popular cache line sizes of today’s processors
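As a rough illustration of how such an entry could fit in 64 bytes, here is a minimal sketch. The field names and widths are assumptions made for this example and are not taken from the slides; only the four status values and the 64-byte size are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four states come from the slides; the numeric values are an assumption. */
enum sysentry_status {
    SYSENTRY_FREE = 0,
    SYSENTRY_SUBMITTED,
    SYSENTRY_BUSY,
    SYSENTRY_DONE
};

/* Hypothetical layout: field names and widths are illustrative only. */
struct sysentry {
    uint16_t sysnum;    /* system call number */
    uint16_t nargs;     /* number of valid entries in args[] */
    uint32_t status;    /* one of enum sysentry_status */
    uint64_t args[6];   /* up to six system call arguments */
    int64_t  ret;       /* return value written by the kernel side */
};

/* Fail the build if the entry ever stops being exactly 64 bytes. */
static_assert(sizeof(struct sysentry) == 64, "sysentry must be 64 bytes");

/* Number of entries that fit in one page of the given size. */
size_t sysentries_per_page(size_t page_size)
{
    return page_size / sizeof(struct sysentry);
}
```

With a 4 KiB page, `sysentries_per_page(4096)` yields 64 entries, which is consistent with the NUM_SYSENTRY (= 64) value used later in the deck.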
State transition of sysentry
U: user thread, K: kernel thread
● Free → Submitted (U: invokes system call)
● Submitted → Busy (K: executing system call); the scanner thread queues a work item when the state becomes Submitted
● Busy → Done (K: done with the syscall and recording the return value); a worker thread executes the syscall
● Done → Free (U: consumes return value)
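The cycle Free → Submitted → Busy → Done → Free, with each step owned by either the user thread (U) or a kernel thread (K), can be written down as a small checker. This is only a sketch of the state machine; the names `advance`, `enum state` and `enum actor` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum state { FREE, SUBMITTED, BUSY, DONE };
enum actor { USER, KERNEL };

/*
 * Advance *s by one step if `who` owns the next transition.
 * Returns false (and leaves *s untouched) when the wrong side tries to move it.
 */
bool advance(enum state *s, enum actor who)
{
    assert(s != NULL);
    switch (*s) {
    case FREE:       /* U: invokes system call */
        if (who != USER) return false;
        *s = SUBMITTED;
        return true;
    case SUBMITTED:  /* K: starts executing the system call */
        if (who != KERNEL) return false;
        *s = BUSY;
        return true;
    case BUSY:       /* K: done, return value recorded */
        if (who != KERNEL) return false;
        *s = DONE;
        return true;
    case DONE:       /* U: consumes return value */
        if (who != USER) return false;
        *s = FREE;
        return true;
    }
    return false;
}
```

Because every state has exactly one owner, the user and kernel sides never need to write the same entry at the same time.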
Shared memory: for fast communication of sysentry
Overview: shared memory
[Figure: the syspage (sysentry[0] ... sysentry[n]) sits at a memory address visible to both user and kernel; the scanner thread reads it, and worker threads run the syscall, store the return value, and update the status]
Worker threads enable asynchronous system calls and system call scheduling
Issuing a system call in FlexSC
Only a write operation is needed
Usage pattern from user space
Populate a sysentry
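A minimal user-space sketch of this usage pattern follows. It assumes a single submitting thread, a hypothetical 64-byte entry layout, and C11 atomics for publishing the status; `flexsc_submit` and `flexsc_poll` are illustrative names, not functions from the original implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { ST_FREE, ST_SUBMITTED, ST_BUSY, ST_DONE };
#define NUM_SYSENTRY 64

/* Hypothetical entry layout (assumption, see lead-in). */
struct sysentry {
    _Atomic uint32_t status;
    uint16_t sysnum;
    uint16_t nargs;
    uint64_t args[6];
    int64_t  ret;
};
static_assert(sizeof(struct sysentry) == 64, "sysentry must be 64 bytes");

/*
 * Issue a system call by plain memory writes: find a free entry, fill it in,
 * then publish it. Returns the entry index, or -1 if no entry is free.
 */
int flexsc_submit(struct sysentry *page, uint16_t sysnum,
                  const uint64_t *args, uint16_t nargs)
{
    assert(page != NULL);
    if (nargs > 6)
        return -1;
    for (int i = 0; i < NUM_SYSENTRY; i++) {
        if (atomic_load_explicit(&page[i].status, memory_order_acquire) != ST_FREE)
            continue;
        page[i].sysnum = sysnum;
        page[i].nargs = nargs;
        for (uint16_t a = 0; a < nargs; a++)
            page[i].args[a] = args[a];
        /* Release store: the arguments become visible before the status does. */
        atomic_store_explicit(&page[i].status, ST_SUBMITTED, memory_order_release);
        return i;
    }
    return -1;
}

/* Non-blocking check: returns 1 and frees the entry once the call is done. */
int flexsc_poll(struct sysentry *entry, int64_t *ret)
{
    if (atomic_load_explicit(&entry->status, memory_order_acquire) != ST_DONE)
        return 0;
    *ret = entry->ret;
    atomic_store_explicit(&entry->status, ST_FREE, memory_order_release);
    return 1;
}
```

The caller can keep doing useful work between `flexsc_submit` and a successful `flexsc_poll`, which is the point of making the call exception-less.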
Scanner thread
● It scans through the syspage and checks whether each entry’s state is “submitted”
● If an entry’s state is “submitted”, it puts a work item into the workqueue
syspage: a collection of sysentry
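One pass of such a scan could look like the user-space sketch below. The `enqueue` callback stands in for the kernel's queue_work(), and marking the entry busy at enqueue time (so a later pass does not queue it twice) is an assumption of this sketch, not something the slides state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { ST_FREE, ST_SUBMITTED, ST_BUSY, ST_DONE };

/* Hypothetical entry layout (assumption). */
struct sysentry {
    _Atomic uint32_t status;
    uint16_t sysnum;
    uint16_t nargs;
    uint64_t args[6];
    int64_t  ret;
};

/* Stand-in for queue_work(): receives each entry that was just claimed. */
typedef void (*enqueue_fn)(struct sysentry *entry, void *ctx);

/* Scan n entries once; claim every submitted entry and hand it to enqueue. */
int scan_syspage(struct sysentry *page, int n, enqueue_fn enqueue, void *ctx)
{
    int queued = 0;
    assert(page != NULL && enqueue != NULL);
    for (int i = 0; i < n; i++) {
        uint32_t expected = ST_SUBMITTED;
        /* Submitted -> Busy, atomically, so each request is queued only once. */
        if (atomic_compare_exchange_strong_explicit(
                &page[i].status, &expected, ST_BUSY,
                memory_order_acq_rel, memory_order_acquire)) {
            enqueue(&page[i], ctx);
            queued++;
        }
    }
    return queued;
}

/* Example callback: counts how many entries were handed over. */
void count_entry(struct sysentry *entry, void *ctx)
{
    (void)entry;
    ++*(int *)ctx;
}
```

A real scanner would call this in a loop, yielding or sleeping between passes.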
Shared memory issues
● One way for a kernel thread to access shared user memory is to fork from the user process
● In 2.6.x kernels, kernel_thread(thread_fn, arg, CLONE_VM | CLONE_FS | CLONE_FILES) is a simple solution
● But on recent kernels, forking from the user process this way is simply not possible: kernel_thread() causes a segfault (http://www.spinics.net/lists/newbies/msg57445.html)
syspage allocation from user space
Allocate in page units, since memory is shared page by page
Lock the page so that it is not swapped out before it is pinned into kernel space
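From user space, the two steps above might be sketched as follows, using posix_memalign() for a page-aligned allocation and mlock() to keep the page resident. `alloc_syspage` and `free_syspage` are hypothetical helper names. Treating an mlock() failure as non-fatal is a choice made for this sketch, since the lock can be refused under a low RLIMIT_MEMLOCK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Allocate one zeroed, page-aligned page to serve as the syspage.
 * *len_out receives the page size; *locked_out is 1 if mlock() succeeded.
 * Returns NULL on allocation failure.
 */
void *alloc_syspage(size_t *len_out, int *locked_out)
{
    void *page = NULL;
    long page_size = sysconf(_SC_PAGESIZE);

    assert(len_out != NULL && locked_out != NULL);
    if (page_size <= 0)
        return NULL;
    if (posix_memalign(&page, (size_t)page_size, (size_t)page_size) != 0)
        return NULL;

    /* Touch the page so it is backed by memory; zero also means "free". */
    memset(page, 0, (size_t)page_size);

    /* Keep the page from being swapped out before the kernel pins it. */
    *locked_out = (mlock(page, (size_t)page_size) == 0);
    *len_out = (size_t)page_size;
    return page;
}

/* Undo alloc_syspage(). */
void free_syspage(void *page, size_t len, int locked)
{
    if (locked)
        munlock(page, len);
    free(page);
}
```

The returned address and length are what a registration call would pass to the kernel so that it can pin and map the same page.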
Mapping syspage into kernel virtual address space
● A kernel thread simply can’t access the user address space directly
● Generally, accessing a physical address directly is not recommended
● So the user page needs to be mapped to a kernel virtual page
src: https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
[Figure: x86-64 virtual address space, split into kernel space and user space]
get_user_pages(), kmap()
● get_user_pages(): pins a list of pages from user space in memory and returns their struct page pointers
● kmap(): gets a kernel virtual address for a physical page
Workqueue: for asynchronous execution of syscalls
Workqueue: asynchronous execution mechanism
● A workqueue is an asynchronous execution mechanism that is widely used across the kernel
● It is used for various purposes, from simple context bouncing to hosting a persistent in-kernel service thread
Workqueue design & problems of the legacy workqueue (~2010)
src: http://events.linuxfoundation.org/sites/events/files/slides/Async%20execution%20with%20wqs.pdf
Concurrency Managed Workqueue (CMWQ, 2010~present)
src: http://events.linuxfoundation.org/sites/events/files/slides/Async%20execution%20with%20wqs.pdf
workqueue in FlexSC
● It is used to execute asynchronous system calls
● The scanner thread calls queue_work_on(cpu, workqueue, work) to wake up a worker thread
● It enables “decoupling execution from invocation” by specifying a particular CPU
– The CPU that invokes a system call from user space is different from the CPU that executes that system call
– Reduces indirect cost
workqueue in FlexSC: initialization
● Create the workqueue
● Set the maximum number of active worker threads to NUM_SYSENTRY (= 64)
workqueue in FlexSC: work_struct
Worker handler
After executing do_syscall(), it populates the return value and updates the entry state to “done” so that the user can consume it
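The handler's job can be mimicked in user space as in the sketch below. It is not the kernel handler: the libc syscall() wrapper stands in for do_syscall(), the entry layout is a hypothetical one, and the code assumes Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { ST_FREE, ST_SUBMITTED, ST_BUSY, ST_DONE };

/* Hypothetical entry layout (assumption). */
struct sysentry {
    _Atomic uint32_t status;
    uint16_t sysnum;
    uint16_t nargs;
    uint64_t args[6];
    int64_t  ret;
};

/*
 * Execute the request described by a busy entry, record the return value,
 * then mark the entry done so the user side can consume it.
 */
void worker_handler(struct sysentry *entry)
{
    assert(atomic_load(&entry->status) == ST_BUSY);

    /* Stand-in for do_syscall(). Note that libc's syscall() reports failure
     * as -1 plus errno, unlike the in-kernel negative-errno convention. */
    long r = syscall((long)entry->sysnum,
                     (long)entry->args[0], (long)entry->args[1],
                     (long)entry->args[2], (long)entry->args[3],
                     (long)entry->args[4], (long)entry->args[5]);

    entry->ret = r;
    /* Release store: the return value is visible before the "done" status. */
    atomic_store_explicit(&entry->status, ST_DONE, memory_order_release);
}
```

For example, an entry whose `sysnum` is `SYS_getpid` ends up with `ret` equal to the caller's process ID and `status` equal to `ST_DONE`.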
Interface for user programs
● New system calls
– flexsc_register(): initializes the FlexSC system and allows the calling process to use FlexSC
– flexsc_exit(): terminates the FlexSC system
– flexsc_wait() (not implemented yet): for when the user process has nothing to do except wait for syscall execution
● libflexsc
– makes FlexSC easy to use from user programs
– the paper extends libflexsc to be compatible with standard POSIX threads
– in our case, it just provides a wrapper function for syscalls and some initialization (setting CPU affinity, allocating the syspage, locking the page, ...)
Limitations of the current implementation
● The following limitations occur because the servicing kernel thread can’t fork from the user process
● The user’s file descriptor table is not shared
– file I/O related system calls (e.g. read, write) are not supported
● The entire user address space is not shared
– system calls with pointer arguments are not supported, since a pointer can refer to an address outside the shared page
● Only a limited set of system calls is supported
Further work
● Overcome the current limitations (sharing the file descriptor table, sharing the entire user space if possible)
● Invoke a callback function when an asynchronous system call is done
● Measure the exact cost of system calls using system performance analysis tools
References
● Soares, L., and Stumm, M. FlexSC: Flexible System Call Scheduling with Exception-Less System Calls. In 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI) (2010), pp. 33–46.
Thanks
