POSIX Realtime Evented Patterns
An introduction to AIO, IO reactors, callbacks and context switches

Lourens Naudé
Trade2win Limited
09/12/09

The Call Stack

    Keeps track of subroutine execution (call + return)
    Dynamic, grows up or down depending on machine architecture
    Composed of one or more stack frames (one per active call)
    Frequently includes local data storage, params, additional return state etc.
    Optimized for a single action
    Ordered
    Usually a single call stack per process, except ...


Context Switches: Threads + Fiber / Coroutine overheads

    A stack per Thread / Fiber
    Context specific data
    Transient long-running contexts are memory hungry
    Guard against transient threads with a pre-initialized Thread pool
    Threaded full stack Web Apps == expensive context switches
    clean_backtrace ?
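
The pre-initialized thread pool mentioned above could look roughly like the following. This is only a sketch under assumptions: the `TinyPool` class and its `submit` and `shutdown` methods are hypothetical names made up for illustration, not an API from the talk or from any library it mentions.

```ruby
require 'thread'

# Hypothetical illustration: a fixed set of worker threads is created once,
# so no thread (and no thread stack) is allocated per job.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker blocks on the queue until a job or a stop marker arrives.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Hand a block to whichever worker becomes free first.
  def submit(&job)
    @jobs << job
  end

  # Send one stop marker per worker, then wait for all of them to exit.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = TinyPool.new(4)
10.times { |i| pool.submit { results << i * i } }
pool.shutdown

squares = []
squares << results.pop until results.empty?
squares.sort!
```

The ten jobs share four long-lived threads, so the cost of creating a thread and its stack is paid four times rather than once per job.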



Ruby Threads: Green Threads

    Scheduled with a timer (SIGVTALRM) every 10ms
    Each thread is allowed a 10ms time slice
    Not the most efficient scheduler
    Coupled with select IO multiplexing for portability
    Can wait for: an fd, select, a PID, sleep state, a join
    MRI 1.8: Green Threads, cheap to spawn + switch
    MRI 1.9: Native OS threads, GIL, more expensive to spawn
    JRuby: Ruby thread == Java thread

Fibers: Coroutines

    A resumable execution state
    Computes a partial result – generator
    Yields back to its caller
    Caller resumes
    Facilities for data exchange
    Initial 4k stack size and very fast context switches
    MRI 1.9 and JRuby only
    Cooperative scheduling for IO
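
The generator and data exchange points above can be illustrated with Ruby's built-in `Fiber` class (MRI 1.9 and later). The variable names `fib`, `echo`, `first_ten` and `replies` are made up for this sketch.

```ruby
# A fiber used as a generator: it computes one partial result at a time
# and yields it back to its caller, which decides when to resume it.
fib = Fiber.new do
  a, b = 0, 1
  loop do
    Fiber.yield a      # hand the partial result to the caller and pause here
    a, b = b, a + b
  end
end

first_ten = Array.new(10) { fib.resume }

# Data exchange in both directions: the value passed to resume becomes
# the return value of Fiber.yield inside the fiber.
echo = Fiber.new do |msg|
  loop { msg = Fiber.yield(msg.upcase) }
end

replies = [echo.resume("ping"), echo.resume("pong")]
```

Nothing runs inside either fiber until the caller resumes it, which is what makes the scheduling cooperative.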



IO Reactor Pattern

    Main loop with a tick quantum (10 to 100ms)
    Operations register themselves with the reactor
    Events: process forked, fd readable, cmd finished, timer fired, IO timeout etc.
    Callbacks and errbacks
    Reactor notified by lower level subsystems: select, epoll, kqueue etc.
    Twisted (Python), EventMachine (Ruby, C++, Java)
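
A minimal sketch of the loop described above, using only `IO.select` from Ruby core. `TinyReactor` and its methods are hypothetical names for illustration; this is not how EventMachine or Twisted are implemented, and it leaves out errbacks and the epoll / kqueue backends.

```ruby
# Hypothetical minimal reactor: operations register a callback for a readable
# IO or for a timer, and the main loop dispatches them as events arrive.
class TinyReactor
  def initialize(tick = 0.01)   # tick quantum in seconds (10ms)
    @tick = tick
    @readers = {}               # io => callback
    @timers = []                # [fire_at, callback]
    @running = false
  end

  def on_readable(io, &callback)
    @readers[io] = callback
  end

  def add_timer(delay, &callback)
    @timers << [Time.now + delay, callback]
  end

  def stop
    @running = false
  end

  def run
    @running = true
    while @running
      # select is the lower level subsystem that notifies the reactor;
      # it returns nil when nothing became readable within one tick.
      ready, = IO.select(@readers.keys, nil, nil, @tick)
      (ready || []).each { |io| @readers[io].call(io) }

      due, @timers = @timers.partition { |fire_at, _| fire_at <= Time.now }
      due.each { |_, callback| callback.call }
    end
  end
end

reader, writer = IO.pipe
events = []
reactor = TinyReactor.new

reactor.on_readable(reader) do |io|
  events << io.read_nonblock(16)
  reactor.stop
end
reactor.add_timer(0.02) { writer.write("hello") }
reactor.run
```

The timer callback writes to the pipe, the next tick sees the read end become readable, and the read callback stops the loop.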




Reactor and Contexts: Best Practices for Multi Threading

Reactor and Threads

    Operations fire on the reactor thread
    Enumerated and invoked in FIFO order
    Blocking operations block the reactor
    Defer: schedule an operation on a background thread
    Schedule: push a deferred context back to the reactor thread
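
The defer and schedule steps above can be sketched with a thread-safe queue. `DeferringReactor` is a hypothetical class written for this example and only borrows the vocabulary of the slide; a real reactor would more likely hand deferred work to a thread pool than spawn a thread per operation.

```ruby
require 'thread'

# Hypothetical sketch: the reactor thread only runs short callbacks taken
# from a FIFO queue; blocking work is deferred to a background thread.
class DeferringReactor
  def initialize
    @queue = Queue.new          # thread-safe FIFO of pending callbacks
  end

  # Schedule: push a callback so that it runs on the reactor thread.
  def schedule(&callback)
    @queue << callback
  end

  # Defer: run a blocking operation on a background thread, then schedule
  # the callback with its result back onto the reactor thread.
  def defer(operation, &callback)
    Thread.new do
      result = operation.call
      schedule { callback.call(result) }
    end
  end

  def stop
    @queue << :stop
  end

  def run
    while (callback = @queue.pop) != :stop
      callback.call
    end
  end
end

reactor = DeferringReactor.new
log = []
reactor_thread = Thread.current

slow = lambda { sleep 0.05; 42 }          # stands in for blocking IO
reactor.defer(slow) do |value|
  log << [:deferred_done, value, Thread.current == reactor_thread]
  reactor.stop
end
reactor.schedule { log << [:tick] }       # not held up by the slow operation
reactor.run
```

The `:tick` callback runs while the slow operation is still sleeping on its own thread, and the deferred result is delivered back on the reactor thread.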




[Figure: Blocked Reactor]

[Figure: Reactor with a deferred operation]

System Calls: Syscalls and the Kernel

    Function calls into the OS: read, write, fork, sbrk etc.
    User vs Kernel space context switch, much more expensive than function calls within a process
    Usually implies data transfer between User and Kernel
    Important to reduce syscalls for high throughput environments
    Some lift workloads ...
    sendfile: request a file to be served directly from the kernel without User space overhead
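
To make the contrast concrete, here is a sketch comparing a chunked copy in user space with `IO.copy_stream` from Ruby core. Whether MRI actually uses sendfile or a similar in-kernel copy for a given call depends on the platform and build, so treat that part as an assumption. The file names and sizes are arbitrary.

```ruby
require 'tempfile'

src = Tempfile.new('src')
src.write('x' * 100_000)
src.flush

dst = Tempfile.new('dst')

# Userland copy: every chunk crosses the user/kernel boundary twice
# (roughly one read and one write syscall per 16KB chunk).
File.open(src.path, 'rb') do |input|
  File.open(dst.path, 'wb') do |output|
    while (chunk = input.read(16 * 1024))
      output.write(chunk)
    end
  end
end
userland_size = File.size(dst.path)

# IO.copy_stream asks the runtime to move the bytes; on platforms that
# support it, MRI may hand the whole copy to the kernel.
copied = IO.copy_stream(src.path, dst.path)
```

Both variants produce the same file; the difference is how often data and control cross between user and kernel space.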


POSIX Realtime (AIO): POSIX Async IO extensions

    Introduced in Linux Kernel 2.6.x
    Floating spec for a number of years, currently defined in POSIX.1b
    Implementations vary across platforms, much like browser compatibility
    Fallback to blocking operations in most implementations
    Powers popular reverse proxies like Squid, Nginx, Varnish etc.




[Figure: And then there were Specs]

AIO Control Blocks

[Figure: Control Block Struct]

    File descriptor with proper r/w mode set
    Buffer region for read / write
    Type of operation: read / write
    Priority: higher priority == faster execution
    Offset: Position to read from / write to
    Bytes: Amount of data to transfer
    Callback mechanism: no op, thread or signal
    Best wrapped in a custom struct for embedding domain logic specific to the use cases


AIO Operations

AIO Operations on a single fd

    aio_read: sync / async read
    aio_write: sync / async write
    aio_error: error state, if any, for an operation
    Poll aio_error until it stops returning EINPROGRESS to simulate a blocking operation
    aio_cancel: cancel a submitted job
    aio_suspend: block the caller until one or more in-progress operations complete
    aio_fsync: asynchronously force a file's queued writes to disk
    aio_return: return status from a given operation
    Uniform API, single AIO Control Block as arg

AIO List Operations

    The previously mentioned API calls still carry the overhead of one syscall per call
    lio_listio: submit a batch of control blocks with a
    single syscall
    Modes: blocking (LIO_WAIT) and non-blocking (LIO_NOWAIT); individual control blocks can be marked no-op (LIO_NOP)
    Array of control blocks, number of operations and an optional callback as arguments
    Callback fires when all operations are done
    Callbacks from individual control blocks still fire
    Useful for app specific correlation

AIO and Syscalls

[Figure: 8 files, read]

[Figure: 8 files, async read]

Revisit Threads and Fibers

    Concept from James “raggi” Tucker
    Cheap switching of MRI green threads
    Let's embrace this …
    Stopped threads don't have scheduler overhead




Fibered IO Interpreter

    Thread.stop (or sleep) and Thread#wakeup for pooled or transient threads
    Stopped threads are excluded by the scheduler, which saves 10ms of runtime per stopped thread when IO bound
    Model fits very well with existing threaded servers like Mongrel
    No need for an IO reactor – we delegate this to the OS and syscalls
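
The stop and wakeup cycle above can be sketched with plain Ruby threads. The `inbox` array stands in for an IO completion event; the AIO extension itself is not used here, and all names are assumptions made for the example.

```ruby
# Sketch: a worker parks itself with Thread.stop while it has nothing to do
# and is woken with Thread#wakeup once its (simulated) IO has completed.
inbox = []
log = []

worker = Thread.new do
  Thread.stop while inbox.empty?   # stopped: not runnable until woken
  log << "worker got #{inbox.shift}"
end

# Wait until the worker has actually parked itself.
sleep 0.001 until worker.status == 'sleep'
log << 'worker is stopped'

inbox << 'read complete'           # stands in for an IO completion event
worker.wakeup                      # mark the worker as runnable again
worker.join
```

While the worker is stopped the scheduler has no reason to give it a time slice, which is the saving the slide refers to.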




Links and References

    A few related projects
    http://github.com/eventmachine/eventmachine
    EventMachine repository
    http://github/methodmissing/aio
    Work in progress AIO extension for MRI, API in flux, but usable
    http://github/methodmissing/callback
    A native MRI callback object
    http://github/methodmissing/channel
    Fixed-size pub/sub channels for MRI

Questions?

Thanks!
@methodmissing (github / twitter)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Barcamp PT

  • 1. POSIX Realtime Evented Patterns: An introduction to AIO, IO reactors, callbacks and context switches. Lourens Naudé, Trade2win Limited
  • 2. The Call Stack 09/12/09 The Call Stack 2
  • 4. The Call Stack
      • Keeps track of subroutine execution (call + return)
      • Dynamic, grows up or down depending on machine architecture
      • Composed of n-1 stack frames
      • Frequently includes local data storage, params, additional return state etc.
      • Optimized for a single action
      • Ordered
      • Usually a single call stack per process, except ...
  • 5. Context Switches: Threads + Fiber / Coroutine overheads
  • 6. Context Switches
      • A stack per Thread / Fiber
      • Context specific data
      • Transient long running contexts are memory hungry
      • Guard against transient threads with a pre-initialized Thread pool
      • Threaded full stack Web Apps == expensive context switches
      • clean_backtrace ?
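The "pre-initialized Thread pool" point on slide 6 could look roughly like the sketch below. `TinyPool`, `submit` and `shutdown` are hypothetical names chosen for illustration, not part of any library; the idea is only that the threads (and their stacks) are created once, up front, instead of per request.

```ruby
# Hypothetical helper: a fixed-size pool whose threads are created once.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker blocks on the queue until a job (or :stop) arrives.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Enqueue a block; one of the pre-started workers will run it.
  def submit(&block)
    @jobs << block
  end

  # Ask every worker to exit, then wait for all of them.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = TinyPool.new(4)
10.times { |i| pool.submit { results << i * i } }
pool.shutdown
squares = Array.new(results.size) { results.pop }.sort
```

A job that raises would kill its worker in this sketch; a real pool would rescue inside the worker loop.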
  • 7. Ruby Threads: Green Threads
  • 9. Ruby Threads
      • Scheduled with a timer (SIGVTALRM) every 10ms
      • Each thread is allowed a 10ms time slice
      • Not the most efficient scheduler
      • Coupled with select-based IO multiplexing for portability
      • Can wait for: fd, select, a PID, sleep state, a join
      • MRI 1.8: Green Threads, cheap to spawn + switch
      • MRI 1.9: Native OS threads, GIL, more expensive to spawn
      • JRuby: Ruby thread == Java thread
  • 10. Fibers: Coroutines
  • 12. Fibers
      • A resumable execution state
      • Computes a partial result – generator
      • Yields back to its caller
      • Caller resumes
      • Facilities for data exchange
      • Initial 4k stack size and very fast context switches
      • MRI 1.9 and JRuby only
      • Cooperative scheduling for IO
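The generator and data exchange bullets on slide 12 can be illustrated with Ruby's built-in `Fiber`. This is a minimal sketch; the Fibonacci generator and the `step` argument are illustrative choices, not something from the talk.

```ruby
fib = Fiber.new do
  a, b = 0, 1
  loop do
    # Hand the partial result to the caller and pause here. Whatever the
    # caller passes to the next #resume becomes the value of Fiber.yield.
    step = Fiber.yield(a)
    step ||= 1
    step.times { a, b = b, a + b }
  end
end

# The caller resumes the fiber each time it wants the next value.
first_five = Array.new(5) { fib.resume }

# Data exchange in the other direction: ask the generator to skip ahead.
skipped = fib.resume(3)
```

Nothing runs between two `resume` calls, which is what makes the switch cooperative: the fiber only gives up control at `Fiber.yield`.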
  • 13. Reactor Pattern: IO Reactor Pattern
  • 15. Reactor Pattern
      • Main loop with a tick quantum (10 to 100ms)
      • Operations register themselves with the reactor
      • Process forked, fd readable, cmd finished, timer fired, IO timeout etc.
      • Callbacks and errbacks
      • Reactor notified by lower level subsystems: select, epoll, kqueue etc.
      • Twisted (Python), EventMachine (Ruby, C++, Java)
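A toy version of the loop described on slide 15 might look like this. `MiniReactor` and its methods are hypothetical names for this sketch only; it uses `IO.select` as the notification subsystem and a 10ms tick, and it is not how EventMachine or Twisted are implemented.

```ruby
# Hypothetical minimal reactor: readable-IO callbacks plus one-shot timers.
class MiniReactor
  def initialize(tick = 0.01)   # tick quantum in seconds (10ms)
    @tick = tick
    @readers = {}               # io => callback
    @timers = []                # [fire_at, callback] pairs
    @running = false
  end

  # Operations register themselves with the reactor.
  def on_readable(io, &callback)
    @readers[io] = callback
  end

  def remove(io)
    @readers.delete(io)
  end

  def add_timer(seconds, &callback)
    @timers << [now + seconds, callback]
  end

  def stop
    @running = false
  end

  def run
    @running = true
    while @running
      # Wait at most one tick for any registered IO to become readable.
      ready, = IO.select(@readers.keys, nil, nil, @tick)
      (ready || []).each { |io| @readers[io]&.call(io) }
      # Fire the timers that are due, in registration order.
      due, @timers = @timers.partition { |at, _| at <= now }
      due.each { |_, callback| callback.call }
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

events = []
reader, writer = IO.pipe
reactor = MiniReactor.new
reactor.on_readable(reader) do |io|
  events << io.read_nonblock(64)
  reactor.remove(io)
end
reactor.add_timer(0.02) { writer.write("ping") }
reactor.add_timer(0.10) { events << :timeout; reactor.stop }
reactor.run
```

Every callback runs on the thread that called `run`, so a callback that blocks stalls the whole loop.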
  • 16. Reactor and Contexts: Best Practices for Multi Threading
  • 17. Reactor and Threads
      • Operations fire on the reactor thread
      • Enumerated and invoked FIFO
      • Blocking operations block the reactor
      • Defer: schedule an operation on a background thread
      • Schedule: push a deferred context back to the reactor thread
  • 19. Reactor with a deferred operation
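The defer / schedule pair from slide 17 can be sketched as follows. `DeferringReactor` is a hypothetical class written for this example (EventMachine exposes similar ideas, but this is not its API or implementation). It assumes a thread-safe `Queue` for scheduled blocks and a self-pipe so background threads can wake the `select` call.

```ruby
# Hypothetical sketch of defer/schedule around a single reactor thread.
class DeferringReactor
  def initialize
    @scheduled = Queue.new          # FIFO of blocks to run on the reactor thread
    @wake_r, @wake_w = IO.pipe      # self-pipe used to interrupt select
    @running = false
  end

  # Defer: run `operation` on a background thread, then hand its result
  # back to `callback` on the reactor thread.
  def defer(operation, &callback)
    Thread.new do
      result = operation.call
      schedule { callback.call(result) }
    end
  end

  # Schedule: push a block onto the reactor's queue (safe from any thread).
  def schedule(&block)
    @scheduled << block
    @wake_w.write(".")
  end

  def stop
    @running = false
  end

  def run
    @running = true
    while @running
      ready, = IO.select([@wake_r], nil, nil, 0.05)
      @wake_r.read_nonblock(1024, exception: false) if ready
      # Invoke everything queued so far, FIFO, on this (the reactor) thread.
      @scheduled.pop.call until @scheduled.empty?
    end
  end
end

reactor = DeferringReactor.new
reactor_thread = Thread.current
ran_on = {}
answer = nil

# A blocking operation that would otherwise stall the reactor.
slow = lambda { ran_on[:operation] = Thread.current; sleep 0.05; 21 * 2 }

reactor.defer(slow) do |value|
  ran_on[:callback] = Thread.current
  answer = value
  reactor.stop
end
reactor.run
```

The blocking work happens off the reactor thread, while the callback that touches reactor state runs back on it.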
  • 20. System Calls: Syscalls and the Kernel
  • 21. Syscalls and the Kernel
  • 22. Syscalls and the Kernel
      • Function calls into the OS: read, write, fork, sbrk etc.
      • User vs Kernel space context switch, much more expensive than function calls within a process
      • Usually implies data transfer between User and Kernel
      • Important to reduce syscalls for high throughput environments
      • Some lift workloads ...
      • sendfile: request a file to be served directly from the kernel without User space overhead
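From Ruby, the closest stdlib call to the sendfile bullet on slide 22 is probably `IO.copy_stream`. The sketch below assumes a Unix-like system (it uses `UNIXSocket.pair`), and the names `payload`, `left` and `right` are illustrative. Whether MRI actually delegates to sendfile(2) depends on the platform and the Ruby build, so treat that as an implementation detail rather than a guarantee.

```ruby
require "tempfile"
require "socket"

payload = "x" * 100_000
file = Tempfile.new("payload")
file.write(payload)
file.flush

left, right = UNIXSocket.pair
reader = Thread.new { right.read }   # drain the socket until EOF

# One call moves the whole file to the socket. A hand-written loop would
# instead read each chunk into user space and write it back out again,
# paying for at least two syscalls and two copies per chunk.
sent = File.open(file.path, "rb") { |src| IO.copy_stream(src, left) }
left.close

received = reader.value
file.close!
```

`IO.copy_stream` returns the number of bytes copied, which makes it easy to check that the transfer was complete.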
  • 23. POSIX Realtime (AIO): POSIX Async IO extensions
  • 24. POSIX Realtime (AIO)
      • Introduced in Linux Kernel 2.6.x
      • Floating spec for a number of years, currently defined in POSIX.1b
      • Implementation resembles browser compat
      • Fallback to blocking operations in most implementations
      • Powers popular reverse proxies like Squid, Nginx, Varnish etc.
  • 25. And then there were Specs
  • 26. AIO Control Blocks
  • 28. AIO Control Blocks
      • File descriptor with proper r/w mode set
      • Buffer region for read / write
      • Type of operation: read / write
      • Priority: higher priority == faster execution
      • Offset: Position to read from / write to
      • Bytes: Amount of data to transfer
      • Callback mechanism: no op, thread or signal
      • Best wrapped in a custom struct for embedding domain logic specific to the use cases
  • 29. AIO Operations
  • 30. AIO Operations on a single fd
  • 31. AIO Operations on a single fd
      • aio_read: sync / async read
      • aio_write: sync / async write
      • aio_error: error state, if any, for an operation
      • Poll aio_error until it stops returning EINPROGRESS to simulate a blocking operation
      • aio_cancel: cancel a submitted job
      • aio_suspend: block the caller until one of the given operations completes
      • aio_fsync: forcefully sync a write op to disk
      • aio_return: return status from a given operation
      • Uniform API, single AIO Control Block as arg
  • 32. AIO List Operations
  • 34. AIO list operations
      • The previously mentioned APIs still have a syscall per call overhead
      • lio_listio: submit a batch of control blocks with a single syscall
      • Modes: blocking (LIO_WAIT) and non-blocking (LIO_NOWAIT); individual control blocks can be marked as no-op (LIO_NOP)
      • Array of control blocks, number of operations and an optional callback as arguments
      • Callback fires when all operations are done
      • Callbacks from individual control blocks still fire
      • Useful for app specific correlation
  • 35. AIO and Syscalls: AIO Syscalls
  • 37. 8 files, async read
  • 38. Revisit Threads and Fibers: Threads and Fibers, revisited
  • 39. Revisit Threads and Fibers
      • Concept from James “raggi” Tucker
      • Cheap switching of MRI green threads
      • Let's embrace this …
      • Stopped threads don't have scheduler overhead
  • 40. Revisit Threads and Fibers
  • 41. Fibered IO Interpreter
  • 42. Fibered IO Interpreter
      • Thread#sleep and Thread#wakeup for pooled or transient threads
      • Stopped threads are excluded by the scheduler, which saves 10ms of runtime per stopped thread when IO bound
      • Model fits very well with existing threaded servers like Mongrel
      • No need for an IO reactor – we delegate this to the OS and syscalls
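The park-and-wake idea from slide 42 can be sketched with plain Ruby threads. In the standard library the parking call is spelled `Thread.stop` (or `Kernel#sleep` with no argument) and the wake call is `Thread#wakeup`. The `inbox` array and the `deliver` lambda are hypothetical stand-ins for "IO completed for this thread"; in the talk's design that signal would come from the AIO layer.

```ruby
inbox = []
handled = []

worker = Thread.new do
  loop do
    # Park until woken. A stopped thread is not considered by the scheduler.
    # The loop re-checks the condition in case of a spurious wakeup.
    Thread.stop while inbox.empty?
    job = inbox.shift
    break if job == :quit
    handled << job.upcase
  end
end

# Hypothetical helper: hand a job to the parked worker and wake it up.
deliver = lambda do |job|
  # Wait until the worker is really parked, otherwise the wakeup could
  # arrive before Thread.stop and be lost.
  sleep 0.001 until worker.status == "sleep"
  inbox << job
  worker.wakeup
end

deliver.call("get /")
deliver.call("get /about")
deliver.call(:quit)
worker.join
```

The explicit status check is the price of not using a `Queue` or `ConditionVariable`: `Thread#wakeup` only has an effect on a thread that is already sleeping.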
  • 43. Links: Links and References
  • 44. Links and References: a few related projects
      • http://github.com/eventmachine/eventmachine : Event Machine repository
      • http://github/methodmissing/aio : Work in progress AIO extension for MRI, API in flux, but usable
      • http://github/methodmissing/callback : A native MRI callback object
      • http://github/methodmissing/channel : Fixed sized pub sub channels for MRI
  • 46. Thanks for listening! @methodmissing (github / twitter)