This document provides an overview of asynchronous I/O programming. It introduces asynchronous I/O, surveys the specific APIs (Berkeley sockets, select, poll, epoll, KQueue, and Posix AIO), weighs the advantages and drawbacks of asynchronous programming, and gives examples for each API. It also covers libraries such as libevent and frameworks such as Twisted that provide asynchronous functionality.
Have you ever heard of FreeBSD? Probably.
Have you ever interacted with its kernel? Probably not.
In this talk, Gili Yankovitch (nyxsecuritysolutions.com) will talk about the FreeBSD operating system, its network stack and how to write network drivers for it.
The talk will cover the following topics:
* Kernel/User interaction in FreeBSD
* The FreeBSD Network Stack
* Network Buffers API
* L2 and L3 Hooking
Although we don't use it for the core web application, most other places in Launchpad that have to deal with concurrency issues do it using Twisted. This talk will survey these areas and talk about issues we've found and design patterns we've found helpful.
Nadav Markus goes over the path from a simple crash POC provided by Google Project Zero (for CVE-2015-7547), to a fully weaponized exploit.
He explores how an attacker can utilize the behavior of the Linux kernel in order to bypass ASLR, allowing an attacker to remotely execute code on vulnerable targets.
You can create and test your OpenFlow controller (OFC) with Trema.
You can create tests effectively with test frameworks.
You can run the same tests against both a test network and a real network.
Troubleshooting common oslo.messaging and RabbitMQ issues (Michael Klishin)
This talk focuses on troubleshooting of common oslo.messaging and RabbitMQ issues in OpenStack environments. Co-presented at the OpenStack Summit Austin in April 2016.
Pipework: Software-Defined Network for Containers and Docker (Jérôme Petazzoni)
Pipework lets you connect together containers in arbitrarily complex scenarios. Pipework uses cgroups and namespaces and works with "plain" LXC containers (created with lxc-start), and with the awesome Docker.
It's nothing less than Software-Defined Networking for Linux Containers!
This is a short presentation about Pipework, given at the Docker Networking meet-up November 6th in Mountain View.
More information:
- https://github.com/jpetazzo/pipework
- http://www.meetup.com/Docker-Networking/
FBTFTP: an open-source framework to build dynamic TFTP servers (Angelo Failla)
Talk given at EuroPython2016, Bilbao:
https://ep2016.europython.eu/conference/talks/fbtftp-facebooks-python3-framework-for-tftp-servers
TFTP was first standardized in ’81 (same year I was born!) and one of its primary uses is in the early stages of network booting. TFTP is very simple to implement, and one of the reasons it is still in use is that its small footprint allows engineers to fit the code into very low-resource single-board computers, system-on-a-chip implementations and, on modern hardware, mainboard chipsets.
It is therefore a crucial protocol deployed in almost every data center environment. It is used, together with DHCP, to chain load Network Boot Programs (NBPs), like Grub2 and iPXE. They allow machines to bootstrap themselves and install operating systems off of the network, downloading kernels and initrds via HTTP and starting them up.
At Facebook, we had been using the standard in.tftpd daemon for years; however, we started to reach its limitations. These were partly due to our scale and the way TFTP was deployed in our infrastructure, but also due to protocol specifications based on requirements from the '80s.
To address those limitations we ended up writing our own framework for creating dynamic TFTP servers in Python3, and we decided to open source it.
I will take you through the framework and the features it offers. I'll discuss the specific problems that motivated us to create it. We will look at practical examples of how to use it, along with a little code, to build your own server tailored to your infra needs.
In this talk Jiří Pírko discusses the design and evolution of the VLAN implementation in Linux, the challenges and pitfalls as well as hardware acceleration and alternative implementations.
Jiří Pírko is a major contributor to kernel networking and the creator of libteam for link aggregation.
You're Off the Hook: Blinding Security Software (Cylance)
User-mode hooking is dead. It’s also considered harmful due to interference with OS-level exploit mitigations like Control Flow Guard (CFG). At BlackHat US 2016, the “Captain Hook” talk revealed there were multiple serious security issues in AV hooking — we will put the final nail in the coffin by showing how trivial it is to bypass user-mode hooks. We will demonstrate a universal user-mode unhooking approach that can be included in any binary to blind security software from monitoring code execution and perform heuristic analysis. The tool and source code will be released on GitHub after the talk.
Alex Matrosov | Principal Research Scientist
Jeff Tang | Senior Security Researcher
OSDC 2017 | Linux Performance Profiling and Monitoring by Werner Fischer (NETWAYS)
Nowadays system administrators have great choices when it comes down to Linux performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin.
More than 25 different tools are presented. Ranging from well known tools like strace, iostat, tcpdump or vmstat to new features like Linux tracepoints or perf_events. You will also learn which tools can be monitored by Icinga and which monitoring plugins are already available for that.
At the end the goal is to gather reference points to look at, whenever you are faced with performance problems.
Take the chance to close your knowledge gaps and learn how to get the most out of your system.
OSDC 2017 | Open POWER for the data center by Werner Fischer (NETWAYS)
IBM's POWER (Performance Optimization With Enhanced RISC) architecture is known to run mission-critical applications and to provide bank-style "RAS" (Reliability, Availability, Serviceability) features since 1990. Opening the architecture in 2013 enabled other vendors like Tyan or Rackspace to build servers based on the current POWER8 edition of this architecture. The current POWER8 CPUs provide up to 12 cores with 8x Simultaneous Multithreading - leading to 96 threads per CPU. Up to eight memory channels enable up to 230 GB/s memory bandwidth per CPU. Increased L1, L2, L3 and new L4 caches help to boost the performance of memory-bound applications like databases, by providing more than 1 TB/s of bandwidth. In this talk Werner will give an overview of the architecture and show the performance possibilities of POWER8, using the PostgreSQL database as an example. By comparing PostgreSQL 9.4, 9.5 and 9.6 benchmarking results he will visualize the increased efficiency thanks to PostgreSQL's optimizations for POWER over the last years. Finally, he will outline one other benefit of OpenPOWER systems: from the very beginning (the first instruction to initialize the first CPU core, long before DRAM, firmware management or PCIe works) up to running your Linux OS and application like a database, only open source code gets executed.
ParaForming - Patterns and Refactoring for Parallel Programming (khstandrews)
Despite Moore's "law", uniprocessor clock speeds have now stalled. Rather than single processors running at ever higher clock speeds, it is common to find dual-, quad- or even hexa-core processors, even in consumer laptops and desktops. Future hardware will not be slightly parallel, however, as in today's multicore systems, but will be massively parallel, with manycore and perhaps even megacore systems becoming mainstream.

This means that programmers need to start thinking parallel. To achieve this they must move away from traditional programming models where parallelism is a bolted-on afterthought. Rather, programmers must use languages where parallelism is deeply embedded into the programming model from the outset.

By providing a high-level model of computation, without explicit ordering of computations, declarative languages in general, and functional languages in particular, offer many advantages for parallel programming. One of the most fundamental advantages of the functional paradigm is purity. In a purely functional language, as exemplified by Haskell, there are simply no side effects: it is therefore impossible for parallel computations to conflict with each other in ways that are not well understood.

ParaForming aims to radically improve the process of parallelising purely functional programs through a comprehensive set of high-level parallel refactoring patterns for Parallel Haskell, supported by advanced refactoring tools. By matching parallel design patterns with appropriate algorithmic skeletons using advanced software refactoring techniques and novel cost information, we will bridge the gap between fully automatic and fully explicit approaches to parallelisation, helping programmers "think parallel" in a systematic, guided way. This talk introduces the ParaForming approach, gives some examples and shows how effective parallel programs can be developed using advanced refactoring technology.
Historically, sharing a Linux server entailed all kinds of untenable compromises. In addition to the security concerns, there was simply no good way to keep one application from hogging resources and messing with the others. The classic “noisy neighbor” problem made shared systems the bargain-basement slums of the Internet, suitable only for small or throwaway projects.
Serious use-cases traditionally demanded dedicated systems. Over the past decade virtualization (in conjunction with Moore’s law) has democratized the availability of what amount to dedicated systems, and the result is hundreds of thousands of websites and applications deployed into VPS or cloud instances. It’s a step in the right direction, but still has glaring flaws.
Most of these websites are just piles of code sitting on a server somewhere. How did that code get there? How can it be scaled? Secured? Maintained? It's anybody's guess. There simply isn't enough SysAdmin talent in the world to meet the demands of managing all these apps with anything close to best practices without a better model.
Containers are a whole new ballgame. Unlike VMs, you skip the overhead of running an entire OS for every application environment. There’s also no need to provision a whole new machine to have a place to deploy, meaning you can spin up or scale your application with orders of magnitude more speed and accuracy.
Sanger OpenStack presentation March 2017 (Dave Holland)
A description of the Sanger Institute's journey with OpenStack to date, covering RHOSP, Ceph, S3, user applications, and future plans. Given at the Sanger Institute's OpenStack Day.
I think this presentation about Adapteva's Parallella is one of the most comprehensive to date. Feel free to use it. I gave this talk on 10th Dec 2014 at Cloud Research Lab, Ericsson AB, Lund, Sweden.
Scylla Summit 2018: Make Scylla Fast Again! Find out how using Tools, Talent,... (ScyllaDB)
Scylla strives to deliver high throughput at low, consistent latencies under any scenario. But in the field things can and do get slower than one would like. Some of those issues come from bad data modelling and anti-patterns. Some others from lack of resources and bad system configuration, and in rare cases even product malfunction.
But how do you tell them apart? And once you do, how do you fix your application or reconfigure your system? Scylla has a rich ecosystem of tools available to answer those questions, and in this talk we'll discuss the proper use of some of them and how to take advantage of each tool's strengths. We will discuss real examples using tools like CQL tracing, nodetool commands, the Scylla monitor and others.
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ... (Spark Summit)
Spark is by its nature very fault tolerant. However, faults and application failures can and do happen in production at scale.
In this talk, we’ll discuss the nuts and bolts of fault tolerance in Spark.
We will begin with a brief overview of the sorts of fault tolerance offered, and lead into a deep dive of the internals of fault tolerance. This will include a discussion of Spark on YARN, scheduling, and resource allocation.
We will then spend some time on a case study and discussing some tools used to find and verify fault tolerance issues. Our case study comes from a customer who experienced an application outage that was root caused to a scheduler bug. We discuss the analysis we did to reach this conclusion and the work that we did to reproduce it locally. We highlight some of the techniques used to simulate faults and find bugs.
At the end, we’ll discuss some future directions for fault tolerance improvements in Spark, such as scheduler and checkpointing changes.
Under The Hood Of A Shard-Per-Core Database Architecture (ScyllaDB)
Most databases are based on architectures that pre-date advances to modern hardware. This results in performance issues, the need to overprovision, and a high total cost of ownership. In this webinar, we will discuss the advances to modern server technology and take a deep dive into ScyllaDB’s shard-per-core architecture and our asynchronous engine, the Seastar framework.
Join us to learn how Seastar (and ScyllaDB):
- Avoid locks and contention on the CPU level
- Bypass kernel bottlenecks
- Implement its per-core shared-nothing autosharding mechanism
- Utilize modern storage hardware
- Leverage NUMA to get the best RAM performance
- Balance your data across CPUs and nodes for the best and smoothest performance
Plus we’ll cover the advantages of unlocking vertical scalability.
2. Outline
● Asynchronous IO
  – What, why, when
  – What about threads
● Programming Asynchronously
  – Berkeley sockets, select, poll
  – epoll, KQueue
  – Posix AIO, libevent, Twisted
3. What is Asynchronous IO
● Also called event-driven or non-blocking
● Performing IO without blocking
  – Changing blocking operations to be non-blocking
  – Requires program reorganization
● The brain must follow
● The program must never block
  – Instead, ask what can be done without blocking
4. Why Asynchronous IO
● Concurrency is really difficult
  – Coordinating threads and locks is non-trivial
  – Very non-trivial
  – Don't make things hard
● Threads and locks are not free
  – They take up resources and require kernel switches
● Often true concurrency is not needed
● Asynchronous programs are race-free
  – By definition
5. Drawbacks
● The program must not block
  – Only a problem when learning
  – And with crappy APIs
● Removes “normal” program flow
  – Due to mixing of IO channels
  – Requires state machinery or continuations
● Hard to use multiple CPU cores
  – But doable
6. Program Structure
● One process must handle several sockets
  – State is not shared between connections
  – Often solved using state machines
  – Makes program flow more restricted
● Debugging is different – but not harder
  – Threads and locks are darn hard to debug
7. Avoiding Asynchronous IO
● Not suitable for everything
● When true concurrency is needed
  – Very seldom
  – Utilizing several CPUs with shared data
● Or when badly suited
  – Long-running computations
    ● Can be split, but bad abstraction
8. Are Threads Evil?
● No, just misunderstood
  – Used when inappropriate
● Coordination is the problem
  – Often goes wrong
● Not free
  – Spawning a thread for each incoming connection
  – 500 incoming connections per second => problem
  – Thread pools used for compensating
9. Setting Things Straight
● Async. IO and threads can do the same
  – Probably includes performance
● It's about having the best abstraction
  – Or: Making things easy
  – Concurrency is not known to simplify things
10. Async. Programming
● First introduced with Berkeley sockets
● Then came select, later poll
● Recent trends
  – epoll, KQueue, Posix AIO, Twisted
11. Berkeley Sockets

int s = socket(family, type, protocol);
fcntl(s, F_SETFL, O_NONBLOCK);
// bind+listen or connect (also async.)
void *buffer = malloc(1024);
ssize_t retval = read(s, buffer, 1024);
if (retval == -1 && errno == EAGAIN) {
    // Reschedule command
}

● Unsuitable for many sockets
  – Many kernel switches
  – The try/except paradigm is ill-suited for IO
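The same dance works from Python's standard library, which raises BlockingIOError where C's read() returns -1 with errno set to EAGAIN. A minimal sketch using a local socket pair (variable names are illustrative):

```python
import errno
import select
import socket

# A connected pair of sockets; one end is made non-blocking.
a, b = socket.socketpair()
a.setblocking(False)

# Nothing has been sent yet, so where C's read() would fail with
# errno == EAGAIN, Python raises BlockingIOError instead of blocking.
try:
    a.recv(1024)
    got_eagain = False
except BlockingIOError as exc:
    got_eagain = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

b.sendall(b"ping")             # now there is data pending
select.select([a], [], [], 5)  # wait (max 5 s) until readable
data = a.recv(1024)            # completes without blocking

a.close()
b.close()
print(got_eagain, data)
```

Instead of retrying blindly, the sketch waits for readability before the second recv() — exactly the "ask what can be done without blocking" idea from earlier.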
12. Select

int select(int n, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

● Monitors three sets of file descriptors for changes
  – Tells which actions can be done without blocking
● pselect() variant to avoid signal races
● Has a limit on the maximum number of file descriptors
● Availability: *nix, BSD, Windows
  – Reasonably portable
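Python exposes the same call as select.select(); a small sketch in which two socket pairs stand in for real client connections:

```python
import select
import socket

# Two connected socket pairs stand in for two client connections.
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
b2.sendall(b"hello")  # only the second pair has data pending

# select() takes the read/write/exception interest sets and returns
# the subsets that can be serviced without blocking.
readable, writable, errored = select.select([a1, a2], [], [], 5)

messages = [sock.recv(1024) for sock in readable]

for s in (a1, b1, a2, b2):
    s.close()
print(messages)
```

Only the socket with pending data comes back in the readable set; the idle one is simply skipped until the next call, which is why the whole fd list must be re-passed (and re-scanned) every time.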
13. Poll

int poll(struct pollfd *ufds, unsigned int nfds, int timeout);

struct pollfd {
    int fd;        /* file descriptor */
    short events;  /* requested events */
    short revents; /* returned events */
};

● Basically a select() with a different API
● No limit on file descriptors
● Availability: *nix, BSD
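The events/revents flow maps directly onto Python's select.poll wrapper; a sketch (illustrative names; select.poll is not available on Windows):

```python
import select
import socket

a, b = socket.socketpair()
b.sendall(b"data")

# Unlike select()'s three fd_sets, poll() takes a per-fd event mask
# (events) and reports what actually happened per fd (revents).
p = select.poll()
p.register(a.fileno(), select.POLLIN)

events = p.poll(5000)  # timeout in milliseconds
ready = [fd for fd, revents in events if revents & select.POLLIN]
ready_is_a = ready == [a.fileno()]

a.close()
b.close()
print(ready_is_a)
```

The registration step hints at the statefulness to come with epoll, but here the kernel still scans the whole registered set on every poll() call.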
14. State vs. Stateless
● select() and poll() are stateless
  – This forces them to be O(n) operations
● The kernel must scan all file descriptors
  – This tends to suck for a large number of file descriptors
● Having state means more code in the kernel
  – Continuously monitor a set of file descriptors
  – Turns an O(n) operation into O(1)
15. Epoll 1/2

int epfd = epoll_create(EPOLL_QUEUE_LEN);
int client_sock = socket();

static struct epoll_event ev;
ev.events = EPOLLIN | EPOLLOUT | EPOLLERR;
ev.data.fd = client_sock;
int res = epoll_ctl(epfd, EPOLL_CTL_ADD, client_sock, &ev);

// Main loop
struct epoll_event events[MAX_EVENTS];
while (1) {
    int nfds = epoll_wait(epfd, events, MAX_EVENTS, TIMEOUT);
    for (int i = 0; i < nfds; i++) {
        int fd = events[i].data.fd;
        handle_io_on_socket(fd);
    }
}
16. Epoll 2/2
● Has state in the kernel
  – File descriptors must be added and removed
  – Slightly more complex API
● Outperforms select and poll
  – When the number of open files is high
● Availability: Linux only (introduced in 2.5)
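Python's selectors module wraps these stateful APIs behind one interface, which makes the register/wait/unregister life cycle easy to show without depending on Linux; a minimal sketch (names are illustrative):

```python
import selectors
import socket

# DefaultSelector picks the best stateful mechanism the platform has:
# epoll on Linux, kqueue on BSD/macOS, falling back to poll or select.
sel = selectors.DefaultSelector()

a, b = socket.socketpair()
b.sendall(b"ready")

# Register interest once (the epoll_ctl(EPOLL_CTL_ADD, ...) step),
# then wait repeatedly without re-passing the whole fd set each time.
sel.register(a, selectors.EVENT_READ, data="client-1")

events = sel.select(timeout=5)
labels = [key.data for key, mask in events]

sel.unregister(a)  # the EPOLL_CTL_DEL counterpart
sel.close()
a.close()
b.close()
print(labels)
```

The per-registration `data` slot plays the role of epoll_event.data: it carries whatever the application needs to dispatch on, so the main loop never has to look the connection up by fd.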
17. KQueue
● We've had enough APIs for now
● Works much like epoll
● Also does disk IO, signals, file/process events
● Arguably the best interface
● Availability: BSD
18. libevent
● Event-based library
  – Portable (Linux, BSD, Solaris, Windows)
  – Slightly better abstraction
● Can use select, poll, epoll, KQueue, /dev/poll
  – Possible to change between them
● Can also be used for benchmarking :)
21. Asynchronous Disk IO
● What about disk IO?
  – Often blocks for so little time that it can be ignored
  – But sometimes it cannot
● Databases are the prime target
● Sockets have been asynchronous for many years
  – With disk IO limping behind
● Posix AIO to the rescue
22. Posix AIO
● Relatively new standard
  – Introduced in Linux in 2005
● Was in several vendor trees before that
  – Often emulated in libc using threads
● Also does vector operations
● API sample:

int aio_read(struct aiocb *aiocbp);
int aio_suspend(const struct aiocb *const iocbs[],
                int niocb, const struct timespec *timeout);
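The thread-based emulation mentioned above is easy to sketch in Python: submit the blocking read to a worker thread and hand back a waitable handle. This illustrates the idea only, not the libc implementation; the aio_read/offset names merely mirror the C API:

```python
import concurrent.futures
import os
import tempfile

# Prepare a small file to read back "asynchronously".
fd, path = tempfile.mkstemp()
os.write(fd, b"asynchronous disk IO")
os.close(fd)

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def aio_read(path, nbytes, offset=0):
    """Submit a read and return a future: roughly an aiocb handle."""
    def blocking_read():
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(nbytes)
    return pool.submit(blocking_read)

request = aio_read(path, 12)      # returns immediately
data = request.result(timeout=5)  # like aio_suspend(): wait for completion

pool.shutdown()
os.unlink(path)
print(data)
```

The caller's thread never blocks on the disk; it only blocks, optionally and with a timeout, when it chooses to collect the result — the same contract aio_read/aio_suspend offer.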
23. Twisted
● An event-driven framework
● Lives at: www.twistedmatrix.com
● Started as a game network library
● Rather large
  – Around 200K lines of Python
  – Basic support for 30+ protocols
● Has web, mail, ssh clients and servers
● Also has its own database API
● And security infrastructure (pluggable)
24. Twisted vs. the world
● Probably the most advanced framework in its class
● Its only real competitor is ACE
  – Twisted borrows a lot of inspiration from it
  – ACE is also historically important
● Some claim that java.nio and C# Async are also asynchronous frameworks
25. Twisted Architecture
● Tries hard to keep things orthogonal
● Reactor – Transport – Factory – Protocol – Application
  – As opposed to mingled-together libraries
● Makes changes very easy (once mastered)
  – Often is about combining things the right way
  – Also means more boilerplate code
26. Twisted Echo Server
from twisted.internet import reactor, protocol
class Echo(protocol.Protocol):
def dataReceived(self, data):
self.transport.write(data)
factory = protocol.ServerFactory()
factory.protocol = Echo
reactor.listenTCP(8000, factory)
reactor.run()
● Lots of magic behind the curtain
  – Protocols, Factories and the Reactor
27. The Twisted Reactor
● The main loop of Twisted
● Don't call us, we'll call you (framework)
  – You insert code into the reactor
● Also does scheduling
  – Future calls, looping calls
● Interchangeable
  – select, poll, WFMO, IOCP, CoreFoundation, KQueue
  – Integrates with GTK, Qt, wxWidgets
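The "main loop plus scheduling" idea can be shown with a toy reactor in plain Python. The names here (`TinyReactor`, `call_later`) are illustrative, not Twisted's API, although `reactor.callLater` behaves much like this; a real reactor would block in select()/epoll() instead of sleeping:

```python
import heapq
import time

class TinyReactor:
    """A toy event loop: one thread, a heap of scheduled calls."""

    def __init__(self):
        self._calls = []  # heap of (due_time, seq, callable)
        self._seq = 0     # tie-breaker so callables are never compared
        self.running = False

    def call_later(self, delay, func):
        heapq.heappush(self._calls, (time.monotonic() + delay, self._seq, func))
        self._seq += 1

    def stop(self):
        self.running = False

    def run(self):
        self.running = True
        while self.running and self._calls:
            due, _, func = heapq.heappop(self._calls)
            now = time.monotonic()
            if due > now:
                time.sleep(due - now)  # a real reactor would select() here
            func()                     # "don't call us, we'll call you"

reactor = TinyReactor()
log = []
reactor.call_later(0.02, lambda: log.append("second"))
reactor.call_later(0.01, lambda: log.append("first"))
reactor.call_later(0.03, reactor.stop)
reactor.run()
print(log)  # ['first', 'second']
```

The application only inserts callables; the loop decides when they run, which is exactly the inversion of control the slide describes.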
28. Factories and Protocols
● Factories produce protocols
  – On incoming connections
● A protocol represents a connection
  – Provides methods such as dataReceived
  – Which are called from the reactor
● The application is built on top of protocols
  – May use several transports, factories, and protocols
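The factory/protocol split can be illustrated without any networking at all. This is a hand-rolled sketch (the class names and the list-backed transport are mine, not Twisted's), showing the one-protocol-instance-per-connection rule:

```python
class EchoProtocol:
    """One instance per connection; dataReceived is called by the event loop."""

    def __init__(self):
        self.transport = []  # stand-in transport: just collects writes

    def dataReceived(self, data):
        self.transport.append(data)  # echo the bytes back

class Factory:
    protocol = EchoProtocol

    def buildProtocol(self):
        return self.protocol()  # a fresh protocol per incoming connection

factory = Factory()
conn1 = factory.buildProtocol()  # first client connects
conn2 = factory.buildProtocol()  # second client: completely separate state
conn1.dataReceived(b"hello")
conn2.dataReceived(b"world")
print(conn1.transport, conn2.transport)  # [b'hello'] [b'world']
```

Because each connection gets its own protocol object, per-connection state lives naturally on `self`, with no explicit state machine keyed by file descriptor.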
29. Twisted Deferreds
● The single most confusing aspect of Twisted
  – Removes the concept of stack execution
● Vital to understand, as they are the program flow
  – An alternative to state machines
● Think of them as a one-shot continuation
  – Or: an object representing what should happen
  – Or: a promise that data will be delivered
  – Or: a callback
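The "one-shot continuation" view can be made concrete with a minimal deferred-like object in plain Python. This is an illustration of the idea only, not Twisted's actual (much richer) Deferred:

```python
class MiniDeferred:
    """A one-shot promise: fire once, then run the callback chain."""

    def __init__(self):
        self.callbacks = []
        self.fired = False
        self.result = None

    def addCallback(self, fn):
        if self.fired:
            self.result = fn(self.result)  # already fired: run immediately
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        assert not self.fired, "a deferred fires only once"
        self.fired = True
        self.result = result
        for fn in self.callbacks:          # each result feeds the next callback
            self.result = fn(self.result)

d = MiniDeferred()
d.addCallback(lambda page: page.upper())
d.addCallback(lambda page: page + "!")
d.callback("the page")  # "data arrives": now the chain runs
print(d.result)  # THE PAGE!
```

Nothing blocks while waiting: the deferred just records what should happen, and the chain executes whenever `callback` is eventually fired.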
30. Deferred Example
● Fetching a web page:

def gotPage(page):
    print "I got the page:", page

deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
deferred.addCallback(gotPage)
return deferred

● The getPage function takes time to complete
  – But blocking is prohibited
● When the page is retrieved, the deferred will fire
  – And callbacks will be invoked
31. Deferreds and Errors
def gotPage(page):
    print "I got the page:", page

def getPage_errorback(error):
    print "Didn't get the page, me so sad - reason:", error

deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
deferred.addCallback(gotPage)
deferred.addErrback(getPage_errorback)

● Separates normal program flow from error handling
32. Chaining Callbacks
deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
deferred.addCallback(gotPage)
deferred.addCallback(changePage)
deferred.addCallback(uploadPage)

● Often several sequential actions are needed
● The result from gotPage is provided as the argument to changePage
● The execution stack disappears
  – Causes headaches during learning
  – Coroutines can make programs more stack-like
33. Deferreds and Coroutines
@defer.deferredGenerator
def myFunction(self):
    d = defer.waitForDeferred(getPageStream("http://www.cs.aau.dk"))
    yield d ; stream = d.getResult()
    while True:
        d = defer.waitForDeferred(stream.read())
        yield d ; content = d.getResult()
        if content is None:
            break
        print content

● Notice: yield instead of return
  – The function is reentrant
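The deferredGenerator idea -- the reactor resuming a suspended generator when a result arrives -- can be mimicked with a plain generator and a driver loop. All names here are illustrative, and the "stream" is just a string:

```python
def read_stream():
    """Yields whenever it would block; the driver sends the awaited value back."""
    stream = yield "open-request"       # suspend until the "stream" is ready
    results = []
    while True:
        content = yield "read-request"  # suspend until the next chunk arrives
        if content is None:
            break
        results.append(content)
    return results                      # surfaces as StopIteration.value

def drive(gen, stream, chunks):
    # A toy reactor: answer each suspension with the value it "waited" for.
    gen.send(None)      # run to the first yield ("open-request")
    gen.send(stream)    # resume with the stream; runs to the first "read-request"
    try:
        for chunk in chunks + [None]:
            gen.send(chunk)  # resume mid-body with each chunk in turn
    except StopIteration as done:
        return done.value

out = drive(read_stream(), "a-stream", ["a", "b"])
print(out)  # ['a', 'b']
```

Each `yield` is a suspension point, and each `send` re-enters the function exactly where it left off -- which is why the slide calls the function reentrant.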
34. Why Twisted
● Grasping the power of Twisted is difficult
  – Also hard to explain
● Makes cross-protocol integration very easy
● Provides a lot of functionality and power
  – But one needs to know how to use it
● Drawbacks
  – Steep learning curve
  – Portability: existing applications cannot easily be ported to Twisted
35. Summary
● Asynchronous IO
  – An alternative to threads
  – Never block
  – Requires program reorganization
● Asynchronous Programming
  – Stateless: Berkeley sockets, select, poll
  – State: epoll, KQueue
  – Disk: Posix AIO
  – libevent and Twisted
● Exercises?