Asynchronous Io Programming
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

Asynchronous Io Programming

on

  • 20,505 views

 

Statistics

Views

Total Views
20,505
Views on SlideShare
20,417
Embed Views
88

Actions

Likes
33
Downloads
484
Comments
3

9 Embeds 88

http://www.slideshare.net 50
http://rapd.wordpress.com 22
http://www.slideee.com 5
http://lj-toys.com 3
http://192.168.6.52 3
http://www.techgig.com 2
http://www.risingcode.com 1
http://risingcode.com 1
http://www.linkedin.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • great a sync network server explanation
    Are you sure you want to
    Your message goes here
    Processing…
  • Impressive presentation of 'Asynchronous Io Programming'. You've shown your credibility on presentation with this slideshow. This one deserves thumbs up. I'm John, owner of www.freeringtones.ws/ . Hope to see more quality slides from you.

    Best wishes.
    Are you sure you want to
    Your message goes here
    Processing…
  • hey there,could you please mail this across to me,it will really assist me for my function.thank you very much.
    Anisa
    http://financejedi.com http://healthjedi.com
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Asynchronous Io Programming Presentation Transcript

  • 1. PSS Guest Lecture Asynchronous IO  Programming th 20  April 2006 by Henrik Thostrup Jensen 1
  • 2. Outline Asynchronous IO ● What, why, when – What about threads – Programming Asynchronously ● Berkeley sockets, select, poll – epoll, KQueue – Posix AIO, libevent, Twisted – 2
  • 3. What is Asynchronous IO Also called event­driven or non­blocking ● Performing IO without blocking ● Changing blocking operations to be non­blocking – Requires program reorganization – The brain must follow ● The program must never block ● Instead ask what can be done without blocking – 3
  • 4. Why Asynchronous IO Concurrency is really difficult ● Coordinating threads and locks is non­trivial – Very non­trivial – Don't make things hard – Threads and locks are not free ● They take up resources and require kernel switches – Often true concurrency is not needed ● Asynchronous programs are race free ● By definition – 4
  • 5. Drawbacks The program must not block ● Only a problem when learning – And with crappy APIs – Removes “normal” program flow ● Due to mixing of IO channels – Requires state machinery or continuations – Hard to use multiple CPU cores ● But doable – 5
  • 6. Program Structure One process must handle several sockets ● State is not shared between connections – Often solved using state machines – Makes program flow more restricted – Debugging is different – but not harder ● Threads and locks are darn hard to debug – 6
  • 7. Avoiding Asynchronous IO Not suitable for everything ● When true concurrency is needed ● Very seldom – Utilizing several CPUs with shared data – Or when badly suited ● Long running computations – Can be split, but bad abstraction ● 7
  • 8. Are Threads Evil No ­ just misunderstood ● Used when inappropriate – Coordination is the problem ● Often goes wrong – Not free ● Spawning a thread for each incoming connection – 500 incoming connections per second => problem – Thread pools used for compensating – 8
  • 9. Setting Things Straight Async. IO and threads can do the same ● Probably includes performance – It's about having the best abstraction ● Or: Making things easy – Concurrency is not known to simplify things – 9
  • 10. Async. Programming First introduced with Berkeley sockets ● Then came select, later poll ● Recent trends ● epoll, KQueue, Posix AIO, Twisted – 10
  • 11. Berkeley Sockets int s = socket(family, type, protocol); fcntl(s, F_SETFL, O_NONBLOCK); // bind+listen or connect (also async.) void *buffer = malloc(1024); retval = read(s, buffer, 1024); if (retval == -EAGAIN) { // Reschedule command } Unsuitable for many sockets ● Many kernel switches – The try/except paradigm is ill suited for IO – 11
  • 12. Select int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); Monitors three sets of file descriptors for changes ● Tell which actions can be done without blocking – pselect() variant to avoid signal race ● Has a limit of of maximum file descriptors ● Availability: *nix, BSD, Windows ● Reasonably portable – 12
  • 13. Poll int poll(struct pollfd *ufds, unsigned int nfds, int timeout); struct pollfd { int fd; /* file descriptor */ short events; /* requested events */ short revents; /* returned events */ }; Basically a select() with a different API ● No limit on file descriptors ● Availability: *nix, BSD ● 13
  • 14. State vs. Stateless select() and poll() is stateless ● This forces them to be O(n) operations – The kernel must scan all file descriptors ● This tend to suck for large number of file descriptors – Having state means more code in the kernel ● Continuously monitor a set of file descriptions – Makes an O(n) operation to O(1) – 14
  • 15. Epoll 1/2 int epfd = epoll_create(EPOLL_QUEUE_LEN); int client_sock = socket(); static struct epoll_event ev; ev.events = EPOLLIN | EPOLLOUT | EPOLLERR; ev.data.fd = client_sock; int res = epoll_ctl(epfd, EPOLL_CTL_ADD, client_sock, &ev); // Main loop struct epoll_event *events; while (1) { int nfds = epoll_wait(epfd, events, MAX_EVENTS, TIMEOUT); for(int i = 0; i < nfds; i++) { int fd = events[i].data.fd; handle_io_on_socket(fd); } } 15
  • 16. Epoll 2/2 Has state in the kernel ● File descriptors must be added and removed – Slight more complex API – Outperforms select and poll ● When the number of open files is high – Availability: Linux only (introduced in 2.5) ● 16
  • 17. KQueue We've had enough APIs for now ● Works much like epoll ● Also does disk IO, signals, file/process events ● Arguably best interface ● Availability: BSD ● 17
  • 18. libevent Event based library ● Portable (Linux, BSD, Solaris, Windows) – Slightly better abstraction – Can use select, poll, epoll, KQueue, /dev/poll ● Possible to change between them – Can also be used for benchmarking :­) ● 18
  • 19. libevent benchmark 1/2 19
  • 20. libevent benchmark 2/2 20
  • 21. Asynchronous Disk IO What about disk IO ● Often blocks so little time that it can be ignored – But sometimes it cannot – Databases are the prime target ● Sockets has been asynchronous for many years ● With disk IO limping behind – Posix AIO to the rescue ● 21
  • 22. Posix AIO Relatively new standard ● Introduced in Linux in 2005 – Was in several vendor trees before that ● Often emulated in libc using threads – Also does vector operations ● API Sample: ● int aio_read(struct aiocb *aiocbp); int aio_suspend(const struct aiocb* const iocbs[], int niocb, const struct timespec *timeout); 22
  • 23. Twisted An event­driven framework ● Lives at: www.twistedmatrix.com ● Started as game network library ● Rather large ● Around 200K lines of Python – Basic support for 30+ protocols – Has web, mail, ssh clients and servers ● Also has its own database API ● And security infrastructure (pluggable) ● 23
  • 24. Twisted vs. the world Probably most advanced framework in its class ● Only real competitor is ACE  ● Twisted borrows a lot of inspiration from here – ACE is also historically important – Some claim that java.nio and C# Async are also  ● asynchronous frameworks 24
  • 25. Twisted Architecture Tries hard to keep things orthogonal ● Reactor–Transport–Factory–Protocol­Application ● As opposed to mingled together libraries – Makes changes very easy (once mastered) ● Often is about combining things the right way – Also means more boilerplate code – 25
  • 26. Twisted Echo Server from twisted.internet import reactor, protocol class Echo(protocol.Protocol): def dataReceived(self, data): self.transport.write(data) factory = protocol.ServerFactory() factory.protocol = Echo reactor.listenTCP(8000, factory) reactor.run() Lots of magic behind the curtain ● Protocols, Factories and the Reactor – 26
  • 27. The Twisted Reactor The main loop of Twisted ● Don't call us, we'll call you (framework) ● You insert code into the reactor – Also does scheduling ● Future calls, looping calls – Interchangeable ● select, poll, WMFO, IOCP, CoreFoundation, KQueue – Integrates with GTK, Qt, wxWidgets – 27
  • 28. Factories and Protocols Factories produces protocols ● On incoming connections – A protocol represents a connection ● Provides method such as dataReceived – Which are called from the reactor – The application is build on top of protocols ● May use several transports, factories, and protocols – 28
  • 29. Twisted Deferreds The single most confusing aspect in Twisted ● Removes the concept of stack execution – Vital to understand as they are the program flow ● An alternative to state machines – Think of them as a one­shot continuation ● Or: An object representing what should happen – Or: A promise that data will be delivered – Or: A callback – 29
  • 30. Deferred Example Fetching a web page: ● def gotPage(page): print quot;I got the page:quot;, page deferred = twisted.web.getPage(quot;http://www.cs.aau.dk/quot;) deferred.addCallback(gotPage) return deferred The getPage function takes time to complete ● But blocking is prohibited – When the page is retrieve the deferred will fire ● And callbacks will be invoked – 30
  • 31. Deferreds and Errors def gotPage(page): print quot;I got the page:quot;, page def getPage_errorback(error): print quot;Didn't get the page, me so sad - reason:quot;, error deferred = twisted.web.getPage(quot;http://www.cs.aau.dk/quot;) deferred.addCallback(gotPage) deferred.addErrback(getPage_errorback) Separates normal program flow from error  ● handling 31
  • 32. Chaining Callbacks deferred = twisted.web.getPage(quot;http://www.cs.aau.dk/quot;) deferred.addCallback(gotPage) deferred.addCallback(changePage) deferred.addCallback(uploadPage) Often several sequential actions are needed ● The result from gotPage is provided as argument  ● to changePage The execution stack disappears ● Causes headaches during learning – Coroutines can make program more stack­like – 32
  • 33. Deferreds and Coroutines @defer.deferredGenerator def myFunction(self): d = getPageStream(quot;http://www.cs.aau.dkquot;) yield d ; stream = d.getResult() while True: d = stream.read() yield d ; content = d.getResult() if content is None: break print content Notice: yield instead of return ● The function is reentrant – 33
  • 34. Why Twisted Grasping the power of Twisted is difficult ● Also hard to explain – Makes cross protocol integration very easy ● Provides a lot of functionality and power ● But one needs to know how to use it – Drawbacks ● Steep learning curve – Portability: Applications cannot be ported to Twisted – 34
  • 35. Summary Asynchronous IO ● An alternative to threads – Never block – Requires program reorganization – Asynchronous Programming ● Stateless: Berkeley sockets, select, poll – State: epoll, Kqueue – Disk: Posix AIO – libevent and Twisted – Exercises? ● 35