PSS Guest Lecture


Asynchronous IO Programming

20th April 2006

by Henrik Thostrup Jensen
Outline
    ● Asynchronous IO
        – What, why, when
        – What about threads
    ● Programming Asynchronous...
What is Asynchronous IO
    ● Also called event-driven or non-blocking
    ● Performing IO without blocking
        – Cha...
Why Asynchronous IO
    ● Concurrency is really difficult
        – Coordinating threads and locks is non-trivial
   ...
Drawbacks
    ● The program must not block
        – Only a problem when learning
        – And with crappy APIs
...
Program Structure
    ● One process must handle several sockets
        – State is not shared between connections
   ...
Avoiding Asynchronous IO
    ● Not suitable for everything
    ● When true concurrency is needed
        – Very seldom
  ...
Are Threads Evil
    ● No - just misunderstood
        – Used when inappropriate
    ● Coordination is the problem
 ...
Setting Things Straight
    ● Async. IO and threads can do the same
        – Probably includes performance
    ● It's ...
Async. Programming
    ● First introduced with Berkeley sockets
    ● Then came select, later poll
    ● Recent trends
...
Berkeley Sockets

    int s = socket(family, type, protocol);
    fcntl(s, F_SETFL, O_NONBLOCK);
    // bind+listen or con...
Select

    int select(int n, fd_set *readfds, fd_set *writefds,
               fd_set *exceptfds, struct timeval *timeout...
Poll

    int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
    struct pollfd {
            int fd;          ...
State vs. Stateless
    ● select() and poll() are stateless
        – This forces them to be O(n) operations
         ...
Epoll 1/2

int epfd = epoll_create(EPOLL_QUEUE_LEN);
int client_sock = socket(family, type, protocol);
static struct epoll_event ev;
ev.events =...
Epoll 2/2
    ● Has state in the kernel
        – File descriptors must be added and removed
        – Slightly more comp...
KQueue
    ● We've had enough APIs for now
    ● Works much like epoll
    ● Also does disk IO, signals, file/process ev...
libevent
    ● Event based library
        – Portable (Linux, BSD, Solaris, Windows)
        – Slightly better abstract...
libevent benchmark 1/2




libevent benchmark 2/2




Asynchronous Disk IO
    ● What about disk IO
        – Often blocks so little time that it can be ignored
        – Bu...
Posix AIO
    ● Relatively new standard
        – Introduced in Linux in 2005
            ● Was in several vendor t...
Twisted
    ● An event-driven framework
    ● Lives at: www.twistedmatrix.com
    ● Started as game network library
...
Twisted vs. the world
    ● Probably most advanced framework in its class
    ● Only real competitor is ACE
        – Tw...
Twisted Architecture
    ● Tries hard to keep things orthogonal
    ● Reactor–Transport–Factory–Protocol–Application
  ...
Twisted Echo Server
    from twisted.internet import reactor, protocol

    class Echo(protocol.Protocol):
        def dat...
The Twisted Reactor
    ● The main loop of Twisted
    ● Don't call us, we'll call you (framework)
        – You insert c...
Factories and Protocols
    ● Factories produce protocols
        – On incoming connections
    ● A protocol represent...
Twisted Deferreds
    ● The single most confusing aspect in Twisted
        – Removes the concept of stack execution
...
Deferred Example
    ● Fetching a web page:

    def gotPage(page):
        print "I got the page:", page

    de...
Deferreds and Errors

    def gotPage(page):
        print "I got the page:", page

    def getPage_errorback(erro...
Chaining Callbacks
    deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
    deferred.addCallback(gotPage)
 ...
Deferreds and Coroutines
    @defer.deferredGenerator
    def myFunction(self):
        d = getPageStream("http://www....
Why Twisted
    ● Grasping the power of Twisted is difficult
        – Also hard to explain
    ● Makes cross protocol ...
Summary
    ● Asynchronous IO
        – An alternative to threads
        – Never block
        – Requires program ...
Asynchronous Io Programming

Published in: Technology, Education

Asynchronous Io Programming

  1. PSS Guest Lecture: Asynchronous IO Programming, 20th April 2006, by Henrik Thostrup Jensen
  2. Outline
     ● Asynchronous IO
         – What, why, when
         – What about threads
     ● Programming Asynchronously
         – Berkeley sockets, select, poll
         – epoll, KQueue
         – Posix AIO, libevent, Twisted
  3. What is Asynchronous IO
     ● Also called event-driven or non-blocking
     ● Performing IO without blocking
         – Changing blocking operations to be non-blocking
         – Requires program reorganization
     ● The brain must follow
     ● The program must never block
         – Instead ask what can be done without blocking
  4. Why Asynchronous IO
     ● Concurrency is really difficult
         – Coordinating threads and locks is non-trivial
         – Very non-trivial
         – Don't make things hard
     ● Threads and locks are not free
         – They take up resources and require kernel switches
     ● Often true concurrency is not needed
     ● Asynchronous programs are race free
         – By definition
  5. Drawbacks
     ● The program must not block
         – Only a problem when learning
         – And with crappy APIs
     ● Removes “normal” program flow
         – Due to mixing of IO channels
         – Requires state machinery or continuations
     ● Hard to use multiple CPU cores
         – But doable
  6. Program Structure
     ● One process must handle several sockets
         – State is not shared between connections
         – Often solved using state machines
         – Makes program flow more restricted
     ● Debugging is different - but not harder
         – Threads and locks are darn hard to debug
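The "state is not shared between connections" point can be sketched with one small state object per connection. This is only an illustration in Python 3: the class name `LineConnection`, the newline-delimited protocol, and the integer keys standing in for sockets are all assumptions, not something taken from the lecture.

```python
class LineConnection:
    """Per-connection state: bytes received so far that do not yet form a full line."""

    def __init__(self):
        self.buffer = b""
        self.lines = []

    def data_received(self, chunk):
        # Called whenever the event loop has read some bytes for this connection.
        self.buffer += chunk
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self.lines.append(line)


def demo_interleaved():
    # Two hypothetical connections, keyed by a made-up descriptor number.
    connections = {1: LineConnection(), 2: LineConnection()}
    # Chunks arrive interleaved, as they would from a single event loop.
    arrivals = [(1, b"hel"), (2, b"foo\nba"), (1, b"lo\n"), (2, b"r\n")]
    for fd, chunk in arrivals:
        connections[fd].data_received(chunk)
    return connections[1].lines, connections[2].lines
```

Each connection keeps its own partial input, so the single process can switch between sockets at any point without the connections interfering with each other.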
  7. Avoiding Asynchronous IO
     ● Not suitable for everything
     ● When true concurrency is needed
         – Very seldom
         – Utilizing several CPUs with shared data
     ● Or when badly suited
         – Long running computations
             ● Can be split, but bad abstraction
  8. Are Threads Evil
     ● No - just misunderstood
         – Used when inappropriate
     ● Coordination is the problem
         – Often goes wrong
     ● Not free
         – Spawning a thread for each incoming connection
         – 500 incoming connections per second => problem
         – Thread pools used for compensating
  9. Setting Things Straight
     ● Async. IO and threads can do the same
         – Probably includes performance
     ● It's about having the best abstraction
         – Or: Making things easy
         – Concurrency is not known to simplify things
  10. Async. Programming
      ● First introduced with Berkeley sockets
      ● Then came select, later poll
      ● Recent trends
          – epoll, KQueue, Posix AIO, Twisted
  11. Berkeley Sockets
          // needs <sys/socket.h>, <fcntl.h>, <unistd.h>, <errno.h>, <stdlib.h>
          int s = socket(family, type, protocol);
          fcntl(s, F_SETFL, O_NONBLOCK);
          // bind+listen or connect (also async.)
          void *buffer = malloc(1024);
          ssize_t retval = read(s, buffer, 1024);
          if (retval == -1 && errno == EAGAIN) {
              // Nothing to read yet: reschedule command
          }
      ● Unsuitable for many sockets
          – Many kernel switches
          – The try/except paradigm is ill suited for IO
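The same try-and-reschedule pattern can be sketched in Python 3 with the standard `socket` module. This is a minimal sketch, assuming a Unix-like system where `socket.socketpair()` delivers data immediately; the function names `try_read` and `demo_nonblocking_read` are made up for illustration.

```python
import socket


def try_read(sock, nbytes=1024):
    """Return the bytes available on sock, or None if the read would block."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # Corresponds to EAGAIN/EWOULDBLOCK in C: nothing to read yet, retry later.
        return None


def demo_nonblocking_read():
    a, b = socket.socketpair()
    try:
        a.setblocking(False)     # comparable to setting O_NONBLOCK with fcntl()
        first = try_read(a)      # nothing has been sent yet
        b.sendall(b"hello")
        second = try_read(a)     # data is now buffered in the kernel
        return first, second
    finally:
        a.close()
        b.close()
```

With many sockets, this approach means one system call per socket per attempt, which is the "many kernel switches" drawback noted on the slide.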
  12. Select
          int select(int n, fd_set *readfds, fd_set *writefds,
                     fd_set *exceptfds, struct timeval *timeout);
      ● Monitors three sets of file descriptors for changes
          – Tells which actions can be done without blocking
      ● pselect() variant to avoid signal race
      ● Has a limit on the maximum number of file descriptors
      ● Availability: *nix, BSD, Windows
          – Reasonably portable
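A minimal Python 3 sketch of the same idea, using the standard `select.select()` wrapper around the system call. The helper names are assumptions for illustration, and the demo assumes a Unix-like `socket.socketpair()`.

```python
import select
import socket


def readable_sockets(socks, timeout=0.0):
    """Return the sockets in socks that can be read without blocking."""
    # Three lists go in (read, write, exceptional); three filtered lists come out.
    readable, _writable, _exceptional = select.select(socks, [], [], timeout)
    return readable


def demo_select():
    a, b = socket.socketpair()
    c, d = socket.socketpair()
    try:
        before = readable_sockets([a, c])              # nothing sent yet
        b.sendall(b"ping")                             # only 'a' gets data
        after = readable_sockets([a, c], timeout=1.0)
        return len(before), after == [a]
    finally:
        for s in (a, b, c, d):
            s.close()
```

Note that the full list of sockets is passed in on every call, which is what the later "stateless" slide refers to.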
  13. Poll
          int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
          struct pollfd {
              int fd;          /* file descriptor */
              short events;    /* requested events */
              short revents;   /* returned events */
          };
      ● Basically a select() with a different API
      ● No limit on file descriptors
      ● Availability: *nix, BSD
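For comparison, a Python 3 sketch using `select.poll()`, which wraps the same system call. It assumes a platform where `poll` is available (it is not on Windows); `demo_poll` is a made-up name.

```python
import select
import socket


def demo_poll():
    a, b = socket.socketpair()
    poller = select.poll()
    # Register interest in "readable" events, like setting pollfd.events = POLLIN.
    poller.register(a.fileno(), select.POLLIN)
    try:
        idle = poller.poll(0)          # timeout in milliseconds; nothing is ready
        b.sendall(b"x")
        ready = poller.poll(1000)      # list of (fd, revents) pairs
        matches = [(fd == a.fileno(), bool(revents & select.POLLIN))
                   for fd, revents in ready]
        return idle, matches
    finally:
        poller.unregister(a.fileno())
        a.close()
        b.close()
```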
  14. State vs. Stateless
      ● select() and poll() are stateless
          – This forces them to be O(n) operations
      ● The kernel must scan all file descriptors
          – This tends to suck for large numbers of file descriptors
      ● Having state means more code in the kernel
          – Continuously monitor a set of file descriptors
          – Turns an O(n) operation into O(1)
  15. Epoll 1/2
          int epfd = epoll_create(EPOLL_QUEUE_LEN);
          int client_sock = socket(family, type, protocol);
          static struct epoll_event ev;
          ev.events = EPOLLIN | EPOLLOUT | EPOLLERR;
          ev.data.fd = client_sock;
          int res = epoll_ctl(epfd, EPOLL_CTL_ADD, client_sock, &ev);
          // Main loop
          struct epoll_event events[MAX_EVENTS];  // storage for returned events
          while (1) {
              int nfds = epoll_wait(epfd, events, MAX_EVENTS, TIMEOUT);
              for (int i = 0; i < nfds; i++) {
                  int fd = events[i].data.fd;
                  handle_io_on_socket(fd);
              }
          }
  16. Epoll 2/2
      ● Has state in the kernel
          – File descriptors must be added and removed
          – Slightly more complex API
      ● Outperforms select and poll
          – When the number of open files is high
      ● Availability: Linux only (introduced in 2.5)
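The register-once, wait-many-times pattern can also be sketched in Python 3 with `select.epoll`. This is a sketch that assumes Linux; on other platforms the hypothetical `demo_epoll` helper simply returns None.

```python
import select
import socket


def demo_epoll():
    if not hasattr(select, "epoll"):
        return None                      # epoll is Linux-only
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        # The kernel keeps the interest set; it is not passed again on each wait.
        ep.register(a.fileno(), select.EPOLLIN)
        idle = ep.poll(0)                # timeout in seconds; nothing is ready
        b.sendall(b"x")
        ready = ep.poll(1.0)             # list of (fd, events) pairs
        data = [a.recv(16) for fd, events in ready
                if fd == a.fileno() and events & select.EPOLLIN]
        ep.unregister(a.fileno())        # descriptors must be removed explicitly
        return idle, data
    finally:
        ep.close()
        a.close()
        b.close()
```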
  17. KQueue
      ● We've had enough APIs for now
      ● Works much like epoll
      ● Also does disk IO, signals, file/process events
      ● Arguably best interface
      ● Availability: BSD
  18. libevent
      ● Event based library
          – Portable (Linux, BSD, Solaris, Windows)
          – Slightly better abstraction
      ● Can use select, poll, epoll, KQueue, /dev/poll
          – Possible to change between them
      ● Can also be used for benchmarking :-)
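The idea of one event API over interchangeable backends can be illustrated with Python 3's standard `selectors` module. This is an analogy only, not libevent itself; `read_once` and its callback convention are assumptions made for the sketch.

```python
import selectors
import socket


def read_once(selector_cls=selectors.DefaultSelector):
    """Register a callback for a readable socket and dispatch one round of events."""
    sel = selector_cls()                 # the backend is chosen here and nowhere else
    a, b = socket.socketpair()
    a.setblocking(False)
    received = []

    def on_readable(sock):
        received.append(sock.recv(1024))

    # The callback travels with the registration as opaque 'data'.
    sel.register(a, selectors.EVENT_READ, data=on_readable)
    try:
        b.sendall(b"hello")
        for key, _mask in sel.select(timeout=1.0):
            key.data(key.fileobj)        # call the callback registered above
        return received
    finally:
        sel.unregister(a)
        sel.close()
        a.close()
        b.close()
```

Calling `read_once()` uses whatever the platform offers (epoll or kqueue where available), while `read_once(selectors.SelectSelector)` forces the select-based backend with the same application code.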
  19. libevent benchmark 1/2 (plot not preserved in this transcript)
  20. libevent benchmark 2/2 (plot not preserved in this transcript)
  21. Asynchronous Disk IO
      ● What about disk IO
          – Often blocks so little time that it can be ignored
          – But sometimes it cannot
      ● Databases are the prime target
      ● Sockets have been asynchronous for many years
          – With disk IO limping behind
      ● Posix AIO to the rescue
  22. Posix AIO
      ● Relatively new standard
          – Introduced in Linux in 2005
              ● Was in several vendor trees before that
          – Often emulated in libc using threads
      ● Also does vector operations
      ● API Sample:
          int aio_read(struct aiocb *aiocbp);
          int aio_suspend(const struct aiocb* const iocbs[],
                          int niocb, const struct timespec *timeout);
  23. Twisted
      ● An event-driven framework
      ● Lives at: www.twistedmatrix.com
      ● Started as game network library
      ● Rather large
          – Around 200K lines of Python
          – Basic support for 30+ protocols
      ● Has web, mail, ssh clients and servers
      ● Also has its own database API
      ● And security infrastructure (pluggable)
  24. Twisted vs. the world
      ● Probably most advanced framework in its class
      ● Only real competitor is ACE
          – Twisted borrows a lot of inspiration from here
          – ACE is also historically important
      ● Some claim that java.nio and C# Async are also asynchronous frameworks
  25. Twisted Architecture
      ● Tries hard to keep things orthogonal
      ● Reactor–Transport–Factory–Protocol–Application
          – As opposed to mingled together libraries
      ● Makes changes very easy (once mastered)
          – Often is about combining things the right way
          – Also means more boilerplate code
  26. Twisted Echo Server
          from twisted.internet import reactor, protocol

          class Echo(protocol.Protocol):
              def dataReceived(self, data):
                  self.transport.write(data)

          factory = protocol.ServerFactory()
          factory.protocol = Echo
          reactor.listenTCP(8000, factory)
          reactor.run()
      ● Lots of magic behind the curtain
          – Protocols, Factories and the Reactor
  27. The Twisted Reactor
      ● The main loop of Twisted
      ● Don't call us, we'll call you (framework)
          – You insert code into the reactor
      ● Also does scheduling
          – Future calls, looping calls
      ● Interchangeable
          – select, poll, WFMO, IOCP, CoreFoundation, KQueue
          – Integrates with GTK, Qt, wxWidgets
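The "you insert code into the reactor" and "future calls" points can be sketched with a toy loop in Python 3. `ToyReactor` is a hypothetical class written for this illustration and is not Twisted's reactor; it uses a virtual clock instead of real time and does no IO at all.

```python
import heapq
import itertools


class ToyReactor:
    """A toy main loop that only schedules future calls (no IO, virtual clock)."""

    def __init__(self):
        self._now = 0.0
        self._queue = []                    # heap of (when, sequence, fn, args)
        self._sequence = itertools.count()  # tie-breaker keeps insertion order

    def call_later(self, delay, fn, *args):
        entry = (self._now + delay, next(self._sequence), fn, args)
        heapq.heappush(self._queue, entry)

    def run(self):
        # The loop owns control flow: it calls the code that was handed to it.
        while self._queue:
            when, _seq, fn, args = heapq.heappop(self._queue)
            self._now = when
            fn(*args)


def demo_reactor():
    reactor = ToyReactor()
    log = []

    def first():
        log.append("first")
        # Callbacks may schedule more work while the loop is running.
        reactor.call_later(0.5, log.append, "nested")

    reactor.call_later(2, log.append, "second")
    reactor.call_later(1, first)
    reactor.run()
    return log
```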
  28. Factories and Protocols
      ● Factories produce protocols
          – On incoming connections
      ● A protocol represents a connection
          – Provides methods such as dataReceived
          – Which are called from the reactor
      ● The application is built on top of protocols
          – May use several transports, factories, and protocols
  29. Twisted Deferreds
      ● The single most confusing aspect in Twisted
          – Removes the concept of stack execution
      ● Vital to understand as they are the program flow
          – An alternative to state machines
      ● Think of them as a one-shot continuation
          – Or: An object representing what should happen
          – Or: A promise that data will be delivered
          – Or: A callback
  30. Deferred Example
      ● Fetching a web page:
          def gotPage(page):
              print "I got the page:", page

          deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
          deferred.addCallback(gotPage)
          return deferred
      ● The getPage function takes time to complete
          – But blocking is prohibited
      ● When the page is retrieved the deferred will fire
          – And callbacks will be invoked
  31. Deferreds and Errors
          def gotPage(page):
              print "I got the page:", page

          def getPage_errorback(error):
              print "Didn't get the page, me so sad - reason:", error

          deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
          deferred.addCallback(gotPage)
          deferred.addErrback(getPage_errorback)
      ● Separates normal program flow from error handling
  32. Chaining Callbacks
          deferred = twisted.web.client.getPage("http://www.cs.aau.dk/")
          deferred.addCallback(gotPage)
          deferred.addCallback(changePage)
          deferred.addCallback(uploadPage)
      ● Often several sequential actions are needed
      ● The result from gotPage is provided as argument to changePage
      ● The execution stack disappears
          – Causes headaches during learning
          – Coroutines can make program more stack-like
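To make the chaining behaviour concrete, here is a toy one-shot callback chain in Python 3. `ToyDeferred` is a hypothetical class written for this sketch; it is not Twisted's Deferred and leaves out most of what the real one does. It only shows how each callback's return value feeds the next one and how a failure skips ahead to the next errback.

```python
class ToyDeferred:
    """A toy one-shot callback chain; NOT Twisted's Deferred implementation."""

    def __init__(self):
        self._chain = []        # (callback, errback) pairs, run in order
        self._fired = False
        self._failed = False
        self._result = None

    def add_callback(self, fn):
        self._chain.append((fn, None))
        self._run()
        return self

    def add_errback(self, fn):
        self._chain.append((None, fn))
        self._run()
        return self

    def callback(self, result):
        self._fire(result, failed=False)

    def errback(self, error):
        self._fire(error, failed=True)

    def _fire(self, value, failed):
        if self._fired:
            raise RuntimeError("a one-shot deferred can only fire once")
        self._fired, self._result, self._failed = True, value, failed
        self._run()

    def _run(self):
        if not self._fired:
            return
        while self._chain:
            on_success, on_failure = self._chain.pop(0)
            handler = on_failure if self._failed else on_success
            if handler is None:
                continue                      # wrong kind of handler: skip it
            try:
                self._result = handler(self._result)
                self._failed = False          # a handler that returns recovers
            except Exception as exc:
                self._result, self._failed = exc, True


def demo_success():
    log = []
    d = ToyDeferred()
    d.add_callback(lambda page: page.upper())    # result is passed down the chain
    d.add_callback(log.append)
    d.add_errback(lambda err: log.append("error: %s" % err))
    d.callback("<html>")
    return log


def demo_failure():
    log = []
    d = ToyDeferred()
    d.add_callback(lambda page: page.upper())    # skipped on failure
    d.add_errback(lambda err: log.append("error: %s" % err))
    d.errback(RuntimeError("timeout"))
    return log
```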
  33. Deferreds and Coroutines
          @defer.deferredGenerator
          def myFunction(self):
              # deferredGenerator expects each deferred wrapped in waitForDeferred
              d = defer.waitForDeferred(getPageStream("http://www.cs.aau.dk"))
              yield d ; stream = d.getResult()
              while True:
                  d = defer.waitForDeferred(stream.read())
                  yield d ; content = d.getResult()
                  if content is None:
                      break
                  print content
      ● Notice: yield instead of return
          – The function is reentrant
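The "yield instead of return" idea can be sketched without Twisted using plain Python 3 generators. The names `fetch_all` and `drive` and the canned list of results are assumptions for illustration; `drive` stands in for an event loop that resumes the generator whenever a result is ready.

```python
def fetch_all(read_request):
    """Generator coroutine: yields a request, is resumed later with its result."""
    chunks = []
    while True:
        # Execution pauses here; the driver is free to do other work meanwhile.
        content = yield read_request
        if content is None:          # None marks end of stream in this sketch
            break
        chunks.append(content)
    return b"".join(chunks)


def drive(coroutine, results):
    """Toy driver: resume the generator once per available result."""
    next(coroutine)                  # run up to the first yield
    for result in results:
        try:
            coroutine.send(result)   # resume where the generator left off
        except StopIteration as stop:
            return stop.value        # the generator's return value
    raise RuntimeError("coroutine is still waiting but no results are left")


def demo_coroutine():
    return drive(fetch_all("read"), [b"ab", b"cd", None])
```

The body of `fetch_all` reads like ordinary sequential code with a loop and local variables, which is the "more stack-like" style the previous slide refers to.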
  34. Why Twisted
      ● Grasping the power of Twisted is difficult
          – Also hard to explain
      ● Makes cross protocol integration very easy
      ● Provides a lot of functionality and power
          – But one needs to know how to use it
      ● Drawbacks
          – Steep learning curve
          – Portability: Applications cannot be ported to Twisted
  35. Summary
      ● Asynchronous IO
          – An alternative to threads
          – Never block
          – Requires program reorganization
      ● Asynchronous Programming
          – Stateless: Berkeley sockets, select, poll
          – State: epoll, KQueue
          – Disk: Posix AIO
          – libevent and Twisted
      ● Exercises?