Cowboy development with Django

Keynote for DjangoCon 2009, presented on the 8th of September 2009. Covers two cowboy projects - and MP expenses - and talks about ways of "reining in the cowboy" and developing in a more sustainable way.



Presentation Transcript

  • Cowboy development with Django Simon Willison DjangoCon 2009
  • Just one problem... we didn’t have cowboys in England
  • The Napoleonic Wars
  • A Napoleonic Sea Fort
  • Super Evil Dev Fort
  • Photos by Cindy Li
  • (Built in 1 week and 10 months)
  • DEMO
  • Search uses the geospatial branch of Xapian
    Species database comes from Freebase
    Photos can be imported from Flickr
    “Suggest changes” to our Zoo information uses model objects representing proposed changes to other model objects
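The "Suggest changes" idea on this slide can be sketched in plain Python. This is a minimal illustration, not the site's actual code: the `Zoo` and `ProposedChange` classes, their fields, and the `approve`/`reject` methods are all hypothetical names, and the real project used Django models rather than dataclasses.

```python
from dataclasses import dataclass


@dataclass
class Zoo:
    """Stand-in for a model object; the real site used Django models."""
    name: str
    website: str = ""


@dataclass
class ProposedChange:
    """A pending edit to one attribute of another object (hypothetical shape)."""
    target: Zoo
    attribute: str
    new_value: str
    status: str = "pending"  # becomes "approved" or "rejected"

    def approve(self) -> None:
        # Only at this point does the suggestion touch the real object.
        setattr(self.target, self.attribute, self.new_value)
        self.status = "approved"

    def reject(self) -> None:
        # The target is left untouched; only the suggestion is marked.
        self.status = "rejected"


zoo = Zoo(name="Example Zoo")
change = ProposedChange(zoo, "website", "http://zoo.example/")
website_before_approval = zoo.website  # still empty: nothing applied yet
change.approve()
```

Storing the suggestion as its own object means it can be listed, reviewed, and rejected without ever modifying the record it refers to.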
  • /dev/fort
    What is /dev/fort? Imagine a place of no distractions, no IM, no Twitter, in fact no internet. Within, a group of a dozen or more developers, designers, thinkers and doers. And a lot of food. Now imagine that place is a fort. The idea behind /dev/fort is to throw a group of people together, cut them off from the rest of the world, and ...
    Cohort 3: Winter 2009. The trip: the third /dev/fort will run from 9th to 16th November on the Kintyre Peninsula in Scotland.
    Cohort 2: Summer 2009. The trip: the second /dev/fort ran from 30th May to 6th June 2009 at Knockbrex Castle in Scotland. As with the first cohort, we have a few remaining problems still to iron out (thorny issues inside Django we were hoping to avoid, that sort of thing). We hope to have the site in alpha by the end of the summer. Cohort members: Ryan Alexander, Steven Anderson, James Aylett, Hannah Donovan, Natalie Downe, Mark Norman Francis, Matthew Hasler, Steve Marshall, Richard Pope, Gareth Rushgrove, Simon Willison.
    Cohort 1: Winter 2008
  • Cowboy development at work
  • MP expenses
  • Heather Brooke
  • January 2005 The FOI request
  • February 2008 The Information Tribunal
  • “Transparency will damage democracy”
  • January 2009 The exemption law
  • March 2009 The mole
  • “All of the receipts of 650-odd MPs, redacted and unredacted, are for sale at a price of £300,000, so I am told. The price is going up because of the interest in the subject.”
    Sir Stuart Bell MP, Newsnight, 30th March
  • 8th May, 2009 The Daily Telegraph
  • At the Guardian...
  • April: “Expenses are due out in a couple of months, is there anything we can do?”
  • June: “Expenses have been bumped forward, they’re out next week!”
  • Thursday 11th June The proof-of-concept
  • Monday 15th June The tentative go-ahead
  • Tuesday 16th June Designer + client-side engineer
  • Wednesday 17th June Operations engineer
  • Thursday 18th June Launch day!
  • How we built it
  • $ convert Frank_Comm.pdf pages.png
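The slide shows a single ImageMagick call. A small sketch of how that call might be generated for a batch of PDFs follows. The helper names and the one-directory-per-PDF layout are assumptions made for illustration; only the `convert input.pdf pages.png` form comes from the slide.

```python
import os


def build_convert_command(pdf_path, out_dir):
    """Return the argv for the `convert` call shown on the slide.

    Given a multi-page PDF, ImageMagick writes numbered files
    (pages-0.png, pages-1.png, ...) derived from the output name.
    """
    return ["convert", pdf_path, os.path.join(out_dir, "pages.png")]


def commands_for(pdf_paths, out_root):
    """One output directory per PDF, named after the file (an assumption)."""
    commands = []
    for pdf_path in pdf_paths:
        stem = os.path.splitext(os.path.basename(pdf_path))[0]
        commands.append(
            build_convert_command(pdf_path, os.path.join(out_root, stem))
        )
    return commands


cmds = commands_for(["Frank_Comm.pdf"], "out")
# Each command could then be run with subprocess.run(cmd, check=True),
# after creating the output directory.
```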
  • Frictionless registration
  • Page filters
  • page_filters = (
        # Maps name of filter to dictionary of kwargs to doc.pages.filter()
        ('reviewed', { 'votes__isnull': False }),
        ('unreviewed', { 'votes__isnull': True }),
        ('with line items', { 'line_items__isnull': False }),
        ('interesting', { 'votes__interestingvote__status': 'yes' }),
        ('interesting but known', { 'votes__interestingvote__status': 'known' ...
    )
    page_filters_lookup = dict(page_filters)
  • pages = doc.pages.all()
    if page_filter:
        kwargs = page_filters_lookup.get(page_filter)
        if kwargs is None:
            raise Http404('Invalid page filter: %s' % page_filter)
        pages = pages.filter(**kwargs).distinct()
    # Build the filters
    filters = []
    for name, kwargs in page_filters:
        filters.append({
            'name': name,
            'count': doc.pages.filter(**kwargs).distinct().count(),
        })
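The pattern on these two slides is an ordered tuple of named filters plus a dict built from it for lookup. It can be tried without Django. The sketch below is an approximation under stated assumptions: pages are plain dicts, each filter is a Python predicate instead of ORM kwargs, and `filter_pages` and `filter_counts` are hypothetical helper names.

```python
# Pages are plain dicts here; the original used Django querysets.
pages = [
    {"id": 1, "votes": ["interesting"], "line_items": []},
    {"id": 2, "votes": [], "line_items": []},
    {"id": 3, "votes": ["boring"], "line_items": ["Hotel, 120.00"]},
]

# Ordered (name, predicate) pairs: the order drives the UI,
# the dict built from them drives lookup by name.
page_filters = (
    ("reviewed", lambda p: bool(p["votes"])),
    ("unreviewed", lambda p: not p["votes"]),
    ("with line items", lambda p: bool(p["line_items"])),
)
page_filters_lookup = dict(page_filters)


def filter_pages(pages, page_filter=None):
    """Return the pages matching the named filter, or all pages if none given."""
    if not page_filter:
        return list(pages)
    predicate = page_filters_lookup.get(page_filter)
    if predicate is None:
        # The Django view raises Http404 at this point.
        raise KeyError("Invalid page filter: %s" % page_filter)
    return [p for p in pages if predicate(p)]


def filter_counts(pages):
    """Build the filter list: one {'name', 'count'} dict per filter, in order."""
    return [
        {"name": name, "count": sum(1 for p in pages if predicate(p))}
        for name, predicate in page_filters
    ]
```

Keeping the filters in one data structure means a new filter is a one-line change that shows up in both the lookup and the counts.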
  • Matching names
  • On the day
  • def get_mp_pages():
        "Returns list of (mp-name, mp-page-url) tuples"
        soup = Soup(urllib.urlopen(INDEX_URL))
        mp_links = []
        for link in soup.findAll('a'):
            if link.get('title', '').endswith("'s allowances"):
                mp_links.append(
                    (link['title'].replace("'s allowances", ''), link['href'])
                )
        return mp_links
  • def get_pdfs(mp_url):
        "Returns list of (description, years, pdf-url, size) tuples"
        soup = Soup(urllib.urlopen(mp_url))
        pdfs = []
        trs = soup.findAll('tr')[1:] # Skip the first, it's the table header
        for tr in trs:
            name_td, year_td, pdf_td = tr.findAll('td')
            name = name_td.string
            year = year_td.string
            pdf_url = pdf_td.find('a')['href']
            size = pdf_td.find('a').contents[-1].replace('(', '').replace(')', '')
            pdfs.append(
                (name, year, pdf_url, size)
            )
        return pdfs
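The link-matching step of the scraper can be exercised offline. The sketch below swaps BeautifulSoup for the standard library's `html.parser` and takes HTML text instead of fetching a URL, so it is not the code used on the day. The `AllowanceLinkParser` class and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser

SUFFIX = "'s allowances"


class AllowanceLinkParser(HTMLParser):
    """Collects (mp-name, href) pairs from <a title="X's allowances"> tags."""

    def __init__(self):
        super().__init__()
        self.mp_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        title = attrs.get("title") or ""
        if title.endswith(SUFFIX) and attrs.get("href"):
            # Strip the suffix to leave just the MP's name.
            self.mp_links.append((title[: -len(SUFFIX)], attrs["href"]))


def get_mp_pages(html):
    """Same return shape as the slide's function, but takes HTML text."""
    parser = AllowanceLinkParser()
    parser.feed(html)
    return parser.mp_links


# Made-up sample markup; the real index page is not reproduced here.
SAMPLE = """
<ul>
  <li><a title="Jane Doe's allowances" href="/mps/jane-doe/">Jane Doe</a></li>
  <li><a title="Contact us" href="/contact/">Contact</a></li>
  <li><a title="John Roe's allowances" href="/mps/john-roe/">John Roe</a></li>
</ul>
"""
mp_links = get_mp_pages(SAMPLE)
```

Separating parsing from fetching also makes a scraper like this testable against saved copies of the page.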
  • “Drop Everything”
  • Photoshop + AppleScript vs. Java + IntelliJ
  • Images on our docroot (S3 upload was taking too long)
  • Blitz QA
  • Launch! (on EC2)
  • Crash #1: more Apache children than MySQL connections
  • unreviewed_count = Page.objects.filter(
        votes__isnull = True
    ).distinct().count()
  • SELECT COUNT(DISTINCT `expenses_page`.`id`)
    FROM `expenses_page`
    LEFT OUTER JOIN `expenses_vote` ON (
        `expenses_page`.`id` = `expenses_vote`.`page_id`
    )
    WHERE `expenses_vote`.`id` IS NULL
  • unreviewed_count = cache.get('homepage:unreviewed_count')
    if unreviewed_count is None:
        unreviewed_count = Page.objects.filter(
            votes__isnull = True
        ).distinct().count()
        cache.set('homepage:unreviewed_count', unreviewed_count, 60)
  • With 70,000 pages and a LOT of votes... DB takes up 135% of CPU
    Cache the count in memcached... DB drops to 35% of CPU
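The get, compute on miss, set-with-timeout pattern from the caching slide can be run without memcached. The sketch below is a simplified stand-in: `TTLCache` and `cached_count` are hypothetical names, the clock is injected so expiry can be shown without waiting, and the returned count is a placeholder rather than a real figure.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached's get/set with a timeout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, timeout):
        self._data[key] = (value, self._clock() + timeout)


def cached_count(cache, key, compute, timeout=60):
    """Return the cached value, running `compute` only on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, timeout)
    return value


calls = []


def expensive_count():
    calls.append(1)  # record each time the "database" is hit
    return 12345     # placeholder value, not a real count


now = [0.0]
cache = TTLCache(clock=lambda: now[0])
first = cached_count(cache, "homepage:unreviewed_count", expensive_count)
second = cached_count(cache, "homepage:unreviewed_count", expensive_count)
now[0] = 61.0  # move the fake clock past the 60 second timeout
third = cached_count(cache, "homepage:unreviewed_count", expensive_count)
```

With a 60 second timeout the expensive query runs at most about once a minute per key, however many requests arrive, at the cost of a count that can be up to a minute stale.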
  • unreviewed_count = Page.objects.filter(
        votes__isnull = True
    ).distinct().count()
    reviewed_count = Page.objects.filter(
        votes__isnull = False
    ).distinct().count()
  • unreviewed_count = Page.objects.filter(
        is_reviewed = False
    ).count()
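Replacing the join with an `is_reviewed` flag is a denormalisation: the flag has to be kept in step whenever a vote is recorded. The sketch below uses SQLite from the standard library rather than MySQL and Django, with trimmed-down tables. The `add_vote` helper is hypothetical, and how the real site maintained the flag is not shown in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE expenses_page (
        id INTEGER PRIMARY KEY,
        is_reviewed INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE expenses_vote (
        id INTEGER PRIMARY KEY,
        page_id INTEGER NOT NULL REFERENCES expenses_page(id)
    );
""")
conn.executemany("INSERT INTO expenses_page (id) VALUES (?)", [(1,), (2,), (3,)])


def add_vote(conn, page_id):
    """Record a vote and keep the denormalised flag in step with it."""
    conn.execute("INSERT INTO expenses_vote (page_id) VALUES (?)", (page_id,))
    conn.execute("UPDATE expenses_page SET is_reviewed = 1 WHERE id = ?", (page_id,))


add_vote(conn, 1)
add_vote(conn, 1)  # a second vote on the same page must not change the count
add_vote(conn, 3)

# The join-based count, as in the SQL slide...
join_count = conn.execute("""
    SELECT COUNT(DISTINCT expenses_page.id)
    FROM expenses_page
    LEFT OUTER JOIN expenses_vote ON expenses_page.id = expenses_vote.page_id
    WHERE expenses_vote.id IS NULL
""").fetchone()[0]

# ...and the single-table count that the flag makes possible.
flag_count = conn.execute(
    "SELECT COUNT(*) FROM expenses_page WHERE is_reviewed = 0"
).fetchone()[0]
```

Both queries give the same answer as long as every code path that adds a vote also sets the flag, which is the price of the cheaper count.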
  • Migrating to InnoDB on a separate server
  • ssh mps-live "mysqldump mp_expenses" |
      sed 's/ENGINE=MyISAM/ENGINE=InnoDB/g' |
      sed 's/CHARSET=latin1/CHARSET=utf8/g' |
      ssh mysql-big "mysql -u root mp_expenses"
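The two `sed` substitutions in the pipeline are plain text replacements on the dump. A rough Python equivalent of that one step is sketched below, in case it helps to check the rewrite on a sample line first. The function name and the sample line are illustrative only.

```python
def rewrite_dump_line(line):
    """Apply the two substitutions from the sed commands to one line of SQL."""
    line = line.replace("ENGINE=MyISAM", "ENGINE=InnoDB")
    line = line.replace("CHARSET=latin1", "CHARSET=utf8")
    return line


# Illustrative table footer in the style mysqldump emits.
sample = ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"
converted = rewrite_dump_line(sample)
```

Like the `sed` version, this is a blind text substitution, so it would also rewrite any row data that happened to contain those exact strings.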
  • Reining in the cowboy
  • Reining in the cowboy
    An RSS to JSON proxy service
    Pair programming
    Comprehensive unit tests, with mocks
    Continuous integration (TeamCity)
    Deployment scripts against CI build numbers
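To make "unit tests, with mocks" concrete for a service of this kind, here is a small sketch. It is not the project's code: `rss_to_json`, the injected `fetch` callable, and the sample feed are all assumptions, and the only claim is that a mock lets the conversion logic be tested without any network access.

```python
import json
import xml.etree.ElementTree as ET
from unittest import mock


def rss_to_json(fetch, url):
    """Fetch an RSS 2.0 document via `fetch(url)` and return a JSON string."""
    channel = ET.fromstring(fetch(url)).find("channel")
    items = [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in channel.findall("item")
    ]
    return json.dumps({"title": channel.findtext("title"), "items": items})


# Made-up feed used as the mock's canned response.
FEED = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>First</title><link>http://feed.example/1</link></item>
</channel></rss>"""

# The mock replaces the network call, so the test needs no HTTP access.
fake_fetch = mock.Mock(return_value=FEED)
result = json.loads(rss_to_json(fake_fetch, "http://feed.example/rss"))
fake_fetch.assert_called_once_with("http://feed.example/rss")
```

Passing the fetcher in as an argument is one way to keep the test suite free of external dependencies.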
  • Points of embarrassment
    Database required to run the test suite
    Logging? What logging?
    Tests get deployed alongside the code (!)
    ... but generally pretty smooth sailing
  • A final thought
  • Web development in 2005
    Diagram components: relational database, cache, application, admin tools, templates, XML feeds
  • Web development in 2009
    Diagram components: relational database, search index, datastructure servers, external web services, non-relational database, cache, admin tools, application, message queue, offline workers, monitoring and reporting, templates, XML feeds, API, webhooks
  • Thank you