How we use Twisted in
    Launchpad
          KiwiPyCon 2009

    Michael Hudson, Canonical Ltd
    michael.hudson@canonical.com
Introduction

• This talk attempts to present some “real
  world” use of Twisted as part of Launchpad,
  a large – one could even say “enterprise
  scale” – open source application.
• First, a survey of where Twisted is used
• Then a more detailed example
Twisted?
• “Event-driven networking engine written in
  Python and licensed under the MIT license”
• Has abstractions for handling concurrency
  without going insane
• (in other words, it doesn’t use threads)
• http://twistedmatrix.com/trac/
• Not a web framework!
Launchpad?
• Collaboration and hosting platform for
  software projects
• Particularly: Ubuntu
• Code hosting, bugs, translations, …
• The service: https://launchpad.net
• The code: https://launchpad.net/launchpad
  (licensed under Affero GPLv3)
Why Twisted?
• Honestly: not completely sure, the decision
  was made before I started
• Generally a high quality product, even back
  in 2004 when Launchpad was new
• Wide range of supported protocols (SSH,
  SFTP, HTTP), easy to add more
• Solid process management
Where Twisted?

• Everywhere there’s concurrency
• (Apart from the web application, that’s a
  “thread per request” Zope web application)
• Codehosting, librarian, branch puller, code
  imports, build farm, mirror prober…
• In more detail…
Codehosting
            SSH
• A Twisted Conch server listens on
  bazaar.launchpad.net:22

• Custom authentication: keys checked by
  querying an XML-RPC server
• Custom file system for SFTP, maps external
  paths to internal ones based on branch db id
• Launches and tracks “bzr serve” processes
  for bzr+ssh branch access
Librarian
• Simple application for storing files
• Written using twisted.web and a simple
  custom upload protocol
• Very simple: upload files, get an HTTP URL to
  download them from
• Simple, but very effective; manages many
  terabytes of data
Branch Puller
• Copies branch data from where it is
  uploaded to a read only area
• Uses Twisted for process management, not
  network access
• Twisted code quite generic: dispatches jobs
  to subprocesses, monitors them for activity
• Will talk more about this “ProcessMonitor”
  later in the talk
Code Imports

• As far as Twisted usage goes, similar to puller
• Runs code import in subprocess, monitors
  for activity, informs database of progress
• Can take from seconds to weeks to
  complete
The Build Farm
        (a.k.a. Soyuz)

• XML-RPC client and server for dispatching
  builds to builders
• Sort of environment where things go wrong
  a lot, by now very robust against timeouts
  etc
The Build Farm
        (a.k.a. Soyuz)
• Build machines (buildds) not allowed to
  make network connections by the firewall
• They run an XML-RPC server that has
  methods like:
  • “ping”: are you alive
  • “status”: what are you doing
  • “build”: start doing this
• Runs build as subprocess, monitors output
The Build Farm
        (a.k.a. Soyuz)
• Buildd-manager is a daemon process that
  periodically:
 • calls the “status” method on every builder
    (in parallel), then
 • dispatches pending builds to idle builders
 • fetches completed builds from builders
    over HTTP
Mirror prober


• Checks that Ubuntu mirrors are up to date
• Highly parallel HTTP client
• Robust timeout handling
Quick Twisted
         Jargon Primer 1
•   Deferred: a   result you don’t have yet
    • E.g. the result of making an XML-RPC call
      across the network
    • Events: successfully got result, failed
      somehow
    • Interesting fact: if some operation returns a
      Deferred, you need to worry about it failing
      – Deferreds highlight “integration points”
Quick Twisted
       Jargon Primer 2
• A Protocol represents a network connection:
 • Handles data in an asynchronous manner
 • Events: “connection made”, “data received”,
    “connection lost”
• A ProcessProtocol represents a subprocess:
 • Similar, but processes have multiple streams
 • “connection lost” becomes “process exited”
Example:
      ProcessMonitorProtocol

• Use case:
 • Run a subprocess
 • Report its activity and output so that it
   can be summarized on a web page
 • Kill if no progress shown for too long
   • Kill harder (SIGKILL) if it doesn’t die
      after SIGINT
• Builds on ProcessProtocol
Example:
      ProcessMonitorProtocol

• Race conditions galore:
 • Process exits just as you’re reporting
    progress
  • An attempt to report progress fails just as
    you receive output
• Production experience helped us beat these
  out of the code :-)
Example:
       ProcessMonitorProtocol


• ProcessMonitorProtocol serializes notifications
    and event handling with a DeferredLock – a
    convenience that essentially prevents
    callbacks from one deferred running until
    another’s have completed
Example:
      ProcessMonitorProtocol

class Example(ProcessMonitorProtocol):
    """Reports activity on all output.

    self.endpoint is an XML-RPC proxy.
    """

    def outReceived(self, data):
        self.resetTimeout()
        self.runNotification(
            self.endpoint.callRemote, "progress")
Other Canonical
     uses of Twisted
• Landscape (system management/monitoring):
 • client side: various Twisted processes
    talking over DBUS
 • server side: long running/unruly processes
    managed by a Twisted daemon
• Ubuntu One (“your personal cloud”):
 • File sharing client and server both
    implemented using Twisted
Questions?

Thanks for listening!
Further Reading
• IRC channels (all on Freenode):
  • #twisted
  • #launchpad (users)
  • #launchpad-dev (developers)
• Mailing lists:
  • twisted-python@twistedmatrix.com
  • launchpad-users@lists.launchpad.net
  • launchpad-dev@lists.launchpad.net
