SlideShare a Scribd company logo
1 of 14
epoll - The I/O Hero 
Evented I/O in Linux 
Mohsin Shafeeque MS080400037
Operating System 
●Processor blindly executes instructions. 
●Its the OS that puts a structure that enables a lot 
oThings like multi tasking 
oInteraction with I/O devices 
o Providing APIs and utility functions 
●In essence, lots of services! 
●Without them, there is no screen, no keyboard no HD. 
●But today I'll talk about a single service that has a great 
impact. 
●Or say, a single function that made Linux Kernel hell lot 
valuable. 
●And that is, evented I/O.
A little about I/O 
●All I/O devices are slow 
oDisks 
oNetworks 
●Processor are too much fast. 
oA 10 msec disk operation. 
oAnd processor has executed millions of instructions. 
●Two modes of I/O 
oBlocking 
 Process blocks until operation completes 
oNon blocking 
 Process continues. Asynchronous. 
●Non-blocking has many implementations
Non blocking I/O 
●There are many hardware/software implementations. 
●Polling 
oContinues looping to check status polling 
oWastes CPU cycles 
●Signals 
oOS generated interrupts. 
oMight leave other processes inconsistent. 
●Callbacks 
oPointer to functions. 
oStack deepening issue. (callback issuing I/O) 
●Interrupts 
o Hardware interrupts in kernel mode.
Web servers - I/O hungry! 
●Its not just the disk fragmentation and file copy 
programs that are I/O hungry. 
●Even more hungry are the web servers! 
●In the age of Internet, web server performance is 
critical. 
●And all of it relies on throughput! 
●Number of requests/clients served per second. 
● There are many models around it. 
●But one particular service/function called epoll has 
accelerated all this.
How web server works? 
●Before we look at epoll(), lets look at servers. 
●The open up a socket. 
●Wait for incoming connections. 
●There are three models here: 
oOne process per connection (Apache?) 
oOne thread per connection. 
●In first case, a new process spawned on every request. 
●Second case, new thread created for each request. 
●Both aren't very scalable. 
● Third option: 
oOne thread multiple connections! 
●Lets see how it works!
Create sockets, select then! 
●Single thread creates many sockets. 
●Each socket is a file descriptor so an array of them. 
●Server code calls the below given select function. 
oint select(int nfds, fd_set *readfds, fd_set 
*writefds, fd_set *exceptfds, struct timeval 
*timeout); 
●nfds - number of file descriptors. 
●readfs - those to be read. 
●writefds - those to be written. 
●exceptfds - those to check for error. 
●timeout - time to sleep at max. 
● Program calls this function, which makes it to sleep. 
●The call would only return when some descriptor is ready.
select - Zooming in 
●From the man page: 
select() and pselect() allow a program to monitor multiple file 
descriptors, waiting until one or more of the file descriptors become 
"ready" for some class of I/O operation (e.g., input possible). A file 
descriptor is considered ready if it is possible to perform the corre‐sponding 
I/O operation (e.g., read(2)) without blocking.
Problem with select 
●It takes O(n) time! 
●That is, if 500 file descriptors (sockets) are being 
watched, 
●it might take 500 steps to return the fd that's ready for 
I/O. 
●And that's a problem! 
●Already we have kernel to user mode switching overhead. 
●And then this O(n) 
● Solution....? 
●epoll() - Introduced in Linux 2.5.44. 
●Takes O(1) time. That's fast! 
●Lets have a look.
epoll details 
int epoll_create(int size); 
Creates an epoll object and returns its file descriptor. 
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); 
Controls (configures) which file descriptors are watched by this object, and for which events. 
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
How it works? 
●epoll_wait is called by the server code. 
●It will return any fds that are ready (i.e. have data). 
●Two modes of operations 
oedge triggered. 
olevel-triggered. 
●In edge triggered, process will be invoked only once per 
new arrival. 
oE.g. 2 KB received, process reads 1 KB only, next call 
will block till further data arrives even if 1 KB already 
in. 
●In level triggered, process will be invoked till the buffer 
is empty. 
oE.g. 2 KB received, 1 KB read, next call won't block but 
would return same descriptor.
So what? 
●This model has enabled the servers to handle thousands 
of requests in handful of threads. 
●Server creates sockets, on each request arrival, a file 
descriptor created and monitored. 
●Event driven! 
●Ngnix! The second largest server on internet uses this 
model. 
●Written by a Russian. Handles 70 million calls a day. 
●Ngnix used by Wordpress! Many others! 
● Node.js - The new hotness. Based on V8 Javascript 
Engine. 
●Uses the epoll() to handle thousands of requests per 
thread. Apache in comparison is on thread per request.
Conclusion 
●Operating systems do provide services. 
●But sometimes, even a single function call can open up a 
new world of possibilities. 
● epoll() is such an example. 
●I/O is most critical portion of an OS. 
●Even if it is single tasking system efficient I/O is what 
the system is all about. 
●And epoll() adds event driven I/O to Linux. 
●Interesting applications are being developed on top of 
epoll()
References 
●epoll() official man page. 
●Linux Device Drivers - poll epoll 
● Node.js - Evented I/O for Javascript 
● NGNIX - The Russian Webserver

More Related Content

What's hot

Reverse Engineering Dojo: Enhancing Assembly Reading Skills
Reverse Engineering Dojo: Enhancing Assembly Reading SkillsReverse Engineering Dojo: Enhancing Assembly Reading Skills
Reverse Engineering Dojo: Enhancing Assembly Reading SkillsAsuka Nakajima
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingKernel TLV
 
(2013 DEVIEW) 멀티쓰레드 프로그래밍이 왜이리 힘드나요?
(2013 DEVIEW) 멀티쓰레드 프로그래밍이  왜이리 힘드나요? (2013 DEVIEW) 멀티쓰레드 프로그래밍이  왜이리 힘드나요?
(2013 DEVIEW) 멀티쓰레드 프로그래밍이 왜이리 힘드나요? 내훈 정
 
Using cgroups in docker container
Using cgroups in docker containerUsing cgroups in docker container
Using cgroups in docker containerVinay Jindal
 
QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?Pradeep Kumar
 
Locking in Linux Traffic Control subsystem
Locking in Linux Traffic Control subsystemLocking in Linux Traffic Control subsystem
Locking in Linux Traffic Control subsystemCong Wang
 
How Linux Processes Your Network Packet - Elazar Leibovich
How Linux Processes Your Network Packet - Elazar LeibovichHow Linux Processes Your Network Packet - Elazar Leibovich
How Linux Processes Your Network Packet - Elazar LeibovichDevOpsDays Tel Aviv
 
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves Kurz
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves KurzIntegrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves Kurz
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves KurzHostedbyConfluent
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingEric Lin
 
The Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchThe Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchTe-Yen Liu
 
Docker cheat-sheet
Docker cheat-sheetDocker cheat-sheet
Docker cheat-sheetPeđa Delić
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceScyllaDB
 
5G Network Slicing Using Mininet
5G Network Slicing Using Mininet5G Network Slicing Using Mininet
5G Network Slicing Using MininetMohammed Abuibaid
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)Kirill Tsym
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughThomas Graf
 
Scapy TLS: A scriptable TLS 1.3 stack
Scapy TLS: A scriptable TLS 1.3 stackScapy TLS: A scriptable TLS 1.3 stack
Scapy TLS: A scriptable TLS 1.3 stackAlexandre Moneger
 
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Odinot Stanislas
 

What's hot (20)

Reverse Engineering Dojo: Enhancing Assembly Reading Skills
Reverse Engineering Dojo: Enhancing Assembly Reading SkillsReverse Engineering Dojo: Enhancing Assembly Reading Skills
Reverse Engineering Dojo: Enhancing Assembly Reading Skills
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
(2013 DEVIEW) 멀티쓰레드 프로그래밍이 왜이리 힘드나요?
(2013 DEVIEW) 멀티쓰레드 프로그래밍이  왜이리 힘드나요? (2013 DEVIEW) 멀티쓰레드 프로그래밍이  왜이리 힘드나요?
(2013 DEVIEW) 멀티쓰레드 프로그래밍이 왜이리 힘드나요?
 
Using cgroups in docker container
Using cgroups in docker containerUsing cgroups in docker container
Using cgroups in docker container
 
QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?
 
Locking in Linux Traffic Control subsystem
Locking in Linux Traffic Control subsystemLocking in Linux Traffic Control subsystem
Locking in Linux Traffic Control subsystem
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
How Linux Processes Your Network Packet - Elazar Leibovich
How Linux Processes Your Network Packet - Elazar LeibovichHow Linux Processes Your Network Packet - Elazar Leibovich
How Linux Processes Your Network Packet - Elazar Leibovich
 
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves Kurz
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves KurzIntegrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves Kurz
Integrating Sparkplug IoT Edge of Network Nodes with Kafka with Yves Kurz
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem porting
 
The Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchThe Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitch
 
Docker cheat-sheet
Docker cheat-sheetDocker cheat-sheet
Docker cheat-sheet
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
 
5G Network Slicing Using Mininet
5G Network Slicing Using Mininet5G Network Slicing Using Mininet
5G Network Slicing Using Mininet
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
Cron
CronCron
Cron
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
Scapy TLS: A scriptable TLS 1.3 stack
Scapy TLS: A scriptable TLS 1.3 stackScapy TLS: A scriptable TLS 1.3 stack
Scapy TLS: A scriptable TLS 1.3 stack
 
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
 

Viewers also liked

Non-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsNon-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsMarcus Frödin
 
Scala at foursquare
Scala at foursquareScala at foursquare
Scala at foursquarejorgeortiz85
 
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...Facultad de Informática UCM
 
Wait queue
Wait queueWait queue
Wait queueRoy Lee
 
How we (Almost) Forgot Lambda Architecture and used Elasticsearch
How we (Almost) Forgot Lambda Architecture and used ElasticsearchHow we (Almost) Forgot Lambda Architecture and used Elasticsearch
How we (Almost) Forgot Lambda Architecture and used ElasticsearchMichael Stockerl
 
HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015Avinash Ramineni
 
Hadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoopHadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoopVictoria López
 
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVMState: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVMJonas Bonér
 
Linux下Poll和Epoll内核源码剖析
Linux下Poll和Epoll内核源码剖析Linux下Poll和Epoll内核源码剖析
Linux下Poll和Epoll内核源码剖析Hao(Robin) Dong
 
Node js presentation
Node js presentationNode js presentation
Node js presentationmartincabrera
 
Unity Internals: Memory and Performance
Unity Internals: Memory and PerformanceUnity Internals: Memory and Performance
Unity Internals: Memory and PerformanceDevGAMM Conference
 
Memory Management: What You Need to Know When Moving to Java 8
Memory Management: What You Need to Know When Moving to Java 8Memory Management: What You Need to Know When Moving to Java 8
Memory Management: What You Need to Know When Moving to Java 8AppDynamics
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneRahul Jain
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitSpark Summit
 
Modern UI Development With Node.js
Modern UI Development With Node.jsModern UI Development With Node.js
Modern UI Development With Node.jsRyan Anklam
 
Hadoop HDFS Detailed Introduction
Hadoop HDFS Detailed IntroductionHadoop HDFS Detailed Introduction
Hadoop HDFS Detailed IntroductionHanborq Inc.
 
Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture AppDynamics
 
Nodejs Explained with Examples
Nodejs Explained with ExamplesNodejs Explained with Examples
Nodejs Explained with ExamplesGabriele Lana
 

Viewers also liked (20)

Non-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsNon-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.js
 
Scala at foursquare
Scala at foursquareScala at foursquare
Scala at foursquare
 
Memory Pools for C and C++
Memory Pools for C and C++Memory Pools for C and C++
Memory Pools for C and C++
 
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
Fast and energy-efficient eNVM based memory organisation at L3-L1 layers for ...
 
Wait queue
Wait queueWait queue
Wait queue
 
How we (Almost) Forgot Lambda Architecture and used Elasticsearch
How we (Almost) Forgot Lambda Architecture and used ElasticsearchHow we (Almost) Forgot Lambda Architecture and used Elasticsearch
How we (Almost) Forgot Lambda Architecture and used Elasticsearch
 
HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015
 
Hadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoopHadoop, MapReduce and R = RHadoop
Hadoop, MapReduce and R = RHadoop
 
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVMState: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM
State: You're Doing It Wrong - Alternative Concurrency Paradigms For The JVM
 
Linux下Poll和Epoll内核源码剖析
Linux下Poll和Epoll内核源码剖析Linux下Poll和Epoll内核源码剖析
Linux下Poll和Epoll内核源码剖析
 
Node js presentation
Node js presentationNode js presentation
Node js presentation
 
Unity Internals: Memory and Performance
Unity Internals: Memory and PerformanceUnity Internals: Memory and Performance
Unity Internals: Memory and Performance
 
Memory Management: What You Need to Know When Moving to Java 8
Memory Management: What You Need to Know When Moving to Java 8Memory Management: What You Need to Know When Moving to Java 8
Memory Management: What You Need to Know When Moving to Java 8
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Understanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And ProfitUnderstanding Memory Management In Spark For Fun And Profit
Understanding Memory Management In Spark For Fun And Profit
 
NodeJS for Beginner
NodeJS for BeginnerNodeJS for Beginner
NodeJS for Beginner
 
Modern UI Development With Node.js
Modern UI Development With Node.jsModern UI Development With Node.js
Modern UI Development With Node.js
 
Hadoop HDFS Detailed Introduction
Hadoop HDFS Detailed IntroductionHadoop HDFS Detailed Introduction
Hadoop HDFS Detailed Introduction
 
Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture Anatomy of a Modern Node.js Application Architecture
Anatomy of a Modern Node.js Application Architecture
 
Nodejs Explained with Examples
Nodejs Explained with ExamplesNodejs Explained with Examples
Nodejs Explained with Examples
 

Similar to Linux epoll - Evented I/O for High Performance Servers

Linux multiplexing
Linux multiplexingLinux multiplexing
Linux multiplexingMark Veltzer
 
Deep Dive into Node.js Event Loop.pdf
Deep Dive into Node.js Event Loop.pdfDeep Dive into Node.js Event Loop.pdf
Deep Dive into Node.js Event Loop.pdfShubhamChaurasia88
 
M|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleM|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleMariaDB plc
 
Linux 开源操作系统发展新趋势
Linux 开源操作系统发展新趋势Linux 开源操作系统发展新趋势
Linux 开源操作系统发展新趋势Anthony Wong
 
uWSGI - Swiss army knife for your Python web apps
uWSGI - Swiss army knife for your Python web appsuWSGI - Swiss army knife for your Python web apps
uWSGI - Swiss army knife for your Python web appsTomislav Raseta
 
Introduction to node.js
Introduction to node.jsIntroduction to node.js
Introduction to node.jsSu Zin Kyaw
 
Performance optimization techniques for Java code
Performance optimization techniques for Java codePerformance optimization techniques for Java code
Performance optimization techniques for Java codeAttila Balazs
 
The internet of $h1t
The internet of $h1tThe internet of $h1t
The internet of $h1tAmit Serper
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.comRenzo Tomà
 
The art of concurrent programming
The art of concurrent programmingThe art of concurrent programming
The art of concurrent programmingIskren Chernev
 
Why kernelspace sucks?
Why kernelspace sucks?Why kernelspace sucks?
Why kernelspace sucks?OpenFest team
 
Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Amin Astaneh
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbWei Shan Ang
 
LAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMGLAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMGLinaro
 
Network Automation: Ansible 101
Network Automation: Ansible 101Network Automation: Ansible 101
Network Automation: Ansible 101APNIC
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleJérôme Petazzoni
 

Similar to Linux epoll - Evented I/O for High Performance Servers (20)

Linux multiplexing
Linux multiplexingLinux multiplexing
Linux multiplexing
 
Deep Dive into Node.js Event Loop.pdf
Deep Dive into Node.js Event Loop.pdfDeep Dive into Node.js Event Loop.pdf
Deep Dive into Node.js Event Loop.pdf
 
M|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScaleM|18 Architectural Overview: MariaDB MaxScale
M|18 Architectural Overview: MariaDB MaxScale
 
Linux 开源操作系统发展新趋势
Linux 开源操作系统发展新趋势Linux 开源操作系统发展新趋势
Linux 开源操作系统发展新趋势
 
uWSGI - Swiss army knife for your Python web apps
uWSGI - Swiss army knife for your Python web appsuWSGI - Swiss army knife for your Python web apps
uWSGI - Swiss army knife for your Python web apps
 
Introduction to node.js
Introduction to node.jsIntroduction to node.js
Introduction to node.js
 
Nodejs
NodejsNodejs
Nodejs
 
Performance optimization techniques for Java code
Performance optimization techniques for Java codePerformance optimization techniques for Java code
Performance optimization techniques for Java code
 
Linux-Internals-and-Networking
Linux-Internals-and-NetworkingLinux-Internals-and-Networking
Linux-Internals-and-Networking
 
The internet of $h1t
The internet of $h1tThe internet of $h1t
The internet of $h1t
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.com
 
The art of concurrent programming
The art of concurrent programmingThe art of concurrent programming
The art of concurrent programming
 
Why kernelspace sucks?
Why kernelspace sucks?Why kernelspace sucks?
Why kernelspace sucks?
 
Monkey Server
Monkey ServerMonkey Server
Monkey Server
 
Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodb
 
LAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMGLAS16-209: Finished and Upcoming Projects in LMG
LAS16-209: Finished and Upcoming Projects in LMG
 
Network Automation: Ansible 101
Network Automation: Ansible 101Network Automation: Ansible 101
Network Automation: Ansible 101
 
Concept of thread
Concept of threadConcept of thread
Concept of thread
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
 

Recently uploaded

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Recently uploaded (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

Linux epoll - Evented I/O for High Performance Servers

  • 1. epoll - The I/O Hero Evented I/O in Linux Mohsin Shafeeque MS080400037
  • 2. Operating System ●Processor blindly executes instructions. ●Its the OS that puts a structure that enables a lot oThings like multi tasking oInteraction with I/O devices o Providing APIs and utility functions ●In essence, lots of services! ●Without them, there is no screen, no keyboard no HD. ●But today I'll talk about a single service that has a great impact. ●Or say, a single function that made Linux Kernel hell lot valuable. ●And that is, evented I/O.
  • 3. A little about I/O ●All I/O devices are slow oDisks oNetworks ●Processor are too much fast. oA 10 msec disk operation. oAnd processor has executed millions of instructions. ●Two modes of I/O oBlocking  Process blocks until operation completes oNon blocking  Process continues. Asynchronous. ●Non-blocking has many implementations
  • 4. Non blocking I/O ●There are many hardware/software implementations. ●Polling oContinues looping to check status polling oWastes CPU cycles ●Signals oOS generated interrupts. oMight leave other processes inconsistent. ●Callbacks oPointer to functions. oStack deepening issue. (callback issuing I/O) ●Interrupts o Hardware interrupts in kernel mode.
  • 5. Web servers - I/O hungry! ●Its not just the disk fragmentation and file copy programs that are I/O hungry. ●Even more hungry are the web servers! ●In the age of Internet, web server performance is critical. ●And all of it relies on throughput! ●Number of requests/clients served per second. ● There are many models around it. ●But one particular service/function called epoll has accelerated all this.
  • 6. How web server works? ●Before we look at epoll(), lets look at servers. ●The open up a socket. ●Wait for incoming connections. ●There are three models here: oOne process per connection (Apache?) oOne thread per connection. ●In first case, a new process spawned on every request. ●Second case, new thread created for each request. ●Both aren't very scalable. ● Third option: oOne thread multiple connections! ●Lets see how it works!
  • 7. Create sockets, select then! ●Single thread creates many sockets. ●Each socket is a file descriptor so an array of them. ●Server code calls the below given select function. oint select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); ●nfds - number of file descriptors. ●readfs - those to be read. ●writefds - those to be written. ●exceptfds - those to check for error. ●timeout - time to sleep at max. ● Program calls this function, which makes it to sleep. ●The call would only return when some descriptor is ready.
  • 8. select - Zooming in ●From the man page: select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform the corre‐sponding I/O operation (e.g., read(2)) without blocking.
  • 9. Problem with select ●It takes O(n) time! ●That is, if 500 file descriptors (sockets) are being watched, ●it might take 500 steps to return the fd that's ready for I/O. ●And that's a problem! ●Already we have kernel to user mode switching overhead. ●And then this O(n) ● Solution....? ●epoll() - Introduced in Linux 2.5.44. ●Takes O(1) time. That's fast! ●Lets have a look.
  • 10. epoll details int epoll_create(int size); Creates an epoll object and returns its file descriptor. int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); Controls (configures) which file descriptors are watched by this object, and for which events. int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
  • 11. How it works? ●epoll_wait is called by the server code. ●It will return any fds that are ready (i.e. have data). ●Two modes of operations oedge triggered. olevel-triggered. ●In edge triggered, process will be invoked only once per new arrival. oE.g. 2 KB received, process reads 1 KB only, next call will block till further data arrives even if 1 KB already in. ●In level triggered, process will be invoked till the buffer is empty. oE.g. 2 KB received, 1 KB read, next call won't block but would return same descriptor.
  • 12. So what? ●This model has enabled the servers to handle thousands of requests in handful of threads. ●Server creates sockets, on each request arrival, a file descriptor created and monitored. ●Event driven! ●Ngnix! The second largest server on internet uses this model. ●Written by a Russian. Handles 70 million calls a day. ●Ngnix used by Wordpress! Many others! ● Node.js - The new hotness. Based on V8 Javascript Engine. ●Uses the epoll() to handle thousands of requests per thread. Apache in comparison is on thread per request.
  • 13. Conclusion ●Operating systems do provide services. ●But sometimes, even a single function call can open up a new world of possibilities. ● epoll() is such an example. ●I/O is most critical portion of an OS. ●Even if it is single tasking system efficient I/O is what the system is all about. ●And epoll() adds event driven I/O to Linux. ●Interesting applications are being developed on top of epoll()
  • 14. References ●epoll() official man page. ●Linux Device Drivers - poll epoll ● Node.js - Evented I/O for Javascript ● NGNIX - The Russian Webserver