How we use Twisted in Launchpad


Although we don't use it for the core web application, most other parts of Launchpad that have to deal with concurrency do so using Twisted. This talk surveys these areas and discusses the problems we have run into and the design patterns we have found helpful.

    1. 1. How we use Twisted in Launchpad KiwiPyCon 2009 Michael Hudson, Canonical Ltd
    2. 2. Introduction • This talk attempts to present some “real world” use of Twisted as part of Launchpad, a large – one could even say “enterprise scale” – open source application. • First, a survey of where Twisted is used • Then a more detailed example
    3. 3. Twisted? • “Event-driven networking engine written in Python and licensed under the MIT license” • Has abstractions for handling concurrency without going insane • (in other words, it doesn’t use threads) • • Not a web framework!
    4. 4. Launchpad? • Collaboration and hosting platform for software projects • Particularly: Ubuntu • Code hosting, bugs, translations, … • The service: • The code: (licensed under Affero GPLv3)
    5. 5. Why Twisted? • Honestly: not completely sure, the decision was made before I started • Generally a high quality product, even back in 2004 when Launchpad was new • Wide range of supported protocols (SSH, SFTP, HTTP), easy to add more • Solid process management
    6. 6. Where Twisted? • Everywhere there’s concurrency • (Apart from the web application, which is a “thread per request” Zope application) • Codehosting, librarian, branch puller, code imports, build farm, mirror prober… • In more detail…
    7. 7. Codehosting SSH • A Twisted Conch server listens on • Custom authentication: keys checked by querying an XML-RPC server • Custom file system for SFTP, maps external paths to internal ones based on branch db id • Launches and tracks “bzr serve” processes for bzr+ssh branch access
    8. 8. Librarian • Simple application for storing files • Written using twisted.web and a simple custom upload protocol • Very simple: upload files, get a HTTP URL to download them from • Simple, but very effective; manages many terabytes of data
    9. 9. Branch Puller • Copies branch data from where it is uploaded to a read only area • Uses Twisted for process management, not network access • Twisted code quite generic: dispatches jobs to subprocesses, monitors them for activity • Will talk more about this “ProcessMonitor” later in the talk
    10. 10. Code Imports • As far as Twisted usage goes, similar to puller • Runs code import in subprocess, monitors for activity, informs database of progress • Can take from seconds to weeks to complete
    11. 11. The Build Farm (a.k.a. Soyuz) • XML-RPC client and server for dispatching builds to builders • The sort of environment where things go wrong a lot; the code is by now very robust against timeouts and similar failures
    12. 12. The Build Farm (a.k.a. Soyuz) • Build machines (buildds) not allowed to make network connections by the firewall • They run an XML-RPC server that has methods like: • “ping”: are you alive • “status”: what are you doing • “build”: start doing this • Runs build as subprocess, monitors output
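The three methods on this slide can be pictured with a small sketch. It uses the standard library's `xmlrpc.server` rather than Twisted, purely to show the shape of the interface; the class name, the status strings and the `serve` helper are assumptions made for illustration, not Launchpad's code.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer


class BuilderInterface:
    """Hypothetical stand-in for the XML-RPC interface a build machine exposes."""

    def __init__(self):
        self._current = None  # name of the build in progress, if any

    def ping(self):
        # "are you alive"
        return True

    def status(self):
        # "what are you doing"
        if self._current is None:
            return "IDLE"
        return "BUILDING " + self._current

    def build(self, name):
        # "start doing this"; a real builder would launch a subprocess here.
        if self._current is not None:
            return False
        self._current = name
        return True


def serve(host="127.0.0.1", port=0):
    """Serve the interface on a background thread; port 0 picks a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(BuilderInterface())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Because the builder only ever answers requests, it never has to open an outgoing connection, which is what the firewall rule requires. A client would talk to it with `xmlrpc.client.ServerProxy("http://host:port/")` and call `ping()`, `status()` or `build("some-build")` on the proxy.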
    13. 13. The Build Farm (a.k.a. Soyuz) • Buildd-manager is a daemon process that periodically: • calls the “status” method on every builder (in parallel), then • dispatches pending builds to idle builders • fetches completed builds from builders over HTTP
    14. 14. Mirror prober • Checks that Ubuntu mirrors are up to date • Highly parallel HTTP client • Robust timeout handling
    15. 15. Quick Twisted Jargon Primer 1 • Deferred: a result you don’t have yet • E.g. the result of making an XML-RPC call across the network • Events: successfully got result, failed somehow • Interesting fact: if some operation returns a Deferred, you need to worry about it failing – Deferreds highlight “integration points”
    16. 16. Quick Twisted Jargon Primer 2 • A Protocol represents a network connection: • Handles data in an asynchronous manner • Events: “connection made”, “data received”, “connection lost” • A ProcessProtocol represents a subprocess: • Similar, but processes have multiple streams • “connection lost” becomes “process exited”
    17. 17. Example: ProcessMonitorProtocol • Use case: • Run a subprocess • Report its activity and output so that it can be summarized on a web page • Kill if no progress shown for too long • Kill harder (SIGKILL) if it doesn’t die after SIGINT • Builds on ProcessProtocol
    18. 18. Example: ProcessMonitorProtocol • Race conditions galore: • Process exits just as you’re reporting progress • An attempt to report progress fails just as you receive output • Production experience helped us beat these out of the code :-)
    19. 19. Example: ProcessMonitorProtocol • ProcessMonitorProtocol serializes notifications and event handling with a DeferredLock, a convenience that essentially prevents the callbacks of one Deferred from running until another’s have completed
    20. 20. Example: ProcessMonitorProtocol

            class Example(ProcessMonitorProtocol):
                """Reports activity on all output.

                self.endpoint is an XML-RPC proxy.
                """

                def outReceived(self, data):
                    self.resetTimeout()
                    self.runNotification(
                        self.endpoint.callRemote, "progress")
    21. 21. Other Canonical uses of Twisted • Landscape (system management/monitoring): • client side: various Twisted processes talking over DBUS • server side: long running/unruly processes managed by a Twisted daemon • Ubuntu One (“your personal cloud”): • File sharing client and server both implemented using Twisted
    22. 22. Questions? Thanks for listening!
    23. 23. Further Reading • IRC channels (all on Freenode): • #twisted • #launchpad (users) • #launchpad-dev (developers) • Mailing lists: • • •