Cowboy development with Django

Simon Willison
DjangoCon 2009

http://www.youtube.com/watch?v=nZx9sNXv9h0

Just one problem... we
didn’t have cowboys in
England

A Napoleonic Sea Fort

http://en.wikipedia.org/wiki/File:Alderney_-_Fort_Clonque_02.jpg

http://www.anotherurl.com/travel/fort_clonque/handbook.htm

Photos by Cindy Li

http://www.flickr.com/photos/cindyli/sets/72157610369683426/

WildLifeNearYou.com
(Built in 1 week and 10 months)

Search uses the geospatial branch of Xapian

Species database comes from Freebase

Photos can be imported from Flickr

“Suggest changes” to our Zoo information uses
model objects representing proposed changes to
other model objects
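
A minimal sketch of that idea in plain Python, not the site's actual Django models: a proposed change is itself a stored object that names a target, a field and a new value, and it is only applied once someone approves it. The `Zoo` and `ProposedChange` classes, their fields and the example URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Zoo:
    # Hypothetical stand-in for a model holding zoo information.
    name: str
    url: str = ""


@dataclass
class ProposedChange:
    # The change is stored as data until a moderator reviews it.
    target: Zoo
    field: str
    new_value: str
    status: str = "pending"

    def approve(self):
        # Apply the stored change to the target object.
        if not hasattr(self.target, self.field):
            raise AttributeError("Unknown field: %s" % self.field)
        setattr(self.target, self.field, self.new_value)
        self.status = "approved"

    def reject(self):
        # Leave the target untouched.
        self.status = "rejected"


zoo = Zoo(name="Example Zoo")
change = ProposedChange(target=zoo, field="url", new_value="http://zoo.example/")
change.approve()
```

Keeping the change as its own object means pending suggestions can be listed, reviewed and rejected without ever touching the record they refer to.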

/dev/fort
What is /dev/fort?
Imagine a place of no distractions, no IM, no Twitter, in fact, no internet. Within, a group of a dozen or more developers, designers, thinkers and doers. And a lot of food.
Now imagine that place is a fort.
The idea behind /dev/fort is to throw a group of people together, cut them off from the rest of the world, and ...

Cohort 3: Winter 2009
The trip
The third /dev/fort will run from 9th to 16th November on the Kintyre Peninsula in Scotland.

Cohort 2: Summer 2009
The trip
The second /dev/fort ran from 30th May to 6th June 2009 at Knockbrex Castle in Scotland. As with the first cohort, we have a few remaining problems still to iron out (thorny issues inside Django we were hoping to avoid, that sort of thing). We hope to have the site in alpha by the end of the summer.

Cohort members
Ryan Alexander, Steven Anderson, James Aylett, Hannah Donovan, Natalie Downe, Mark Norman Francis, Matthew Hasler, Steve Marshall, Richard Pope, Gareth Rushgrove, Simon Willison.

Cohort 1: Winter 2008

http://devfort.com/

February 2008
The Information Tribunal

“Transparency will
damage democracy”

January 2009
The exemption law

“All of the receipts of 650-odd MPs,
redacted and unredacted, are for sale at a
price of £300,000, so I am told. The price
is going up because of the interest in the
subject.”
Sir Stuart Bell, MP
Newsnight, 30th March

8th May, 2009
The Daily Telegraph

April: “Expenses are due out in
a couple of months, is there
anything we can do?”

June: “Expenses have been
bumped forward, they’re out
next week!”

Thursday 11th June
The proof-of-concept

Monday 15th June
The tentative go-ahead

Tuesday 16th June
Designer + client-side engineer

Wednesday 17th June
Operations engineer

Thursday 18th June
Launch day!

$ convert Frank_Comm.pdf pages.png

page_filters = (
    # Maps name of filter to dictionary of kwargs to doc.pages.filter()
    ('reviewed', {
        'votes__isnull': False
    }),
    ('unreviewed', {
        'votes__isnull': True
    }),
    ('with line items', {
        'line_items__isnull': False
    }),
    ('interesting', {
        'votes__interestingvote__status': 'yes'
    }),
    ('interesting but known', {
        'votes__interestingvote__status': 'known'
    }),
    ...
)
page_filters_lookup = dict(page_filters)

pages = doc.pages.all()
if page_filter:
    kwargs = page_filters_lookup.get(page_filter)
    if kwargs is None:
        raise Http404('Invalid page filter: %s' % page_filter)
    pages = pages.filter(**kwargs).distinct()

# Build the filters
filters = []
for name, kwargs in page_filters:
    filters.append({
        'name': name,
        'count': doc.pages.filter(**kwargs).distinct().count(),
    })
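
The pattern in these slides is one ordered table of named filters that drives both the lookup for the current request and the list of counts shown in the UI. A rough sketch of the same shape without Django, assuming plain dictionaries for pages and predicates in place of `filter()` kwargs (the sample data and the `select_pages` and `filter_counts` helpers are hypothetical):

```python
# Hypothetical in-memory pages; in the slides these are Django model rows.
pages = [
    {"id": 1, "votes": 2, "line_items": 0},
    {"id": 2, "votes": 0, "line_items": 0},
    {"id": 3, "votes": 1, "line_items": 3},
]

# Ordered (name, predicate) pairs, mirroring the page_filters tuple.
page_filters = (
    ("reviewed", lambda p: p["votes"] > 0),
    ("unreviewed", lambda p: p["votes"] == 0),
    ("with line items", lambda p: p["line_items"] > 0),
)
page_filters_lookup = dict(page_filters)


def select_pages(page_filter=None):
    # No filter selected: return everything.
    if not page_filter:
        return list(pages)
    predicate = page_filters_lookup.get(page_filter)
    if predicate is None:
        # The Django view raises Http404 at this point.
        raise LookupError("Invalid page filter: %s" % page_filter)
    return [p for p in pages if predicate(p)]


def filter_counts():
    # One entry per filter, in declaration order, for display.
    return [
        {"name": name, "count": sum(1 for p in pages if predicate(p))}
        for name, predicate in page_filters
    ]
```

Adding a new filter then means adding one entry to the table; the lookup and the counts pick it up automatically.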

http://github.com/simonw/datamatcher

def get_mp_pages():
    "Returns list of (mp-name, mp-page-url) tuples"
    soup = Soup(urllib.urlopen(INDEX_URL))
    mp_links = []
    for link in soup.findAll('a'):
        if link.get('title', '').endswith("'s allowances"):
            mp_links.append(
                (link['title'].replace("'s allowances", ''), link['href'])
            )
    return mp_links

def get_pdfs(mp_url):
    "Returns list of (description, years, pdf-url, size) tuples"
    soup = Soup(urllib.urlopen(mp_url))
    pdfs = []
    trs = soup.findAll('tr')[1:] # Skip the first, it's the table header
    for tr in trs:
        name_td, year_td, pdf_td = tr.findAll('td')
        name = name_td.string
        year = year_td.string
        pdf_url = pdf_td.find('a')['href']
        size = pdf_td.find('a').contents[-1].replace('(', '').replace(')', '')
        pdfs.append(
            (name, year, pdf_url, size)
        )
    return pdfs
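
The link-matching step of `get_mp_pages()` can be tried without a network connection. The sketch below is an assumption-laden variant, not the talk's code: it swaps the slides' `Soup` parser for the standard library's `html.parser` and runs against a made-up HTML snippet. `AllowanceLinkParser`, `parse_mp_links` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

SUFFIX = "'s allowances"


class AllowanceLinkParser(HTMLParser):
    """Collects (mp-name, href) pairs from <a> tags whose title ends with SUFFIX."""

    def __init__(self):
        super().__init__()
        self.mp_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        title = attrs.get("title") or ""
        if title.endswith(SUFFIX) and attrs.get("href"):
            # Strip the suffix to leave just the MP's name.
            self.mp_links.append((title[: -len(SUFFIX)], attrs["href"]))


def parse_mp_links(html):
    parser = AllowanceLinkParser()
    parser.feed(html)
    return parser.mp_links


# Made-up markup for illustration only.
sample = (
    '<a title="Jane Doe\'s allowances" href="/mps/jane-doe">Jane Doe</a>'
    '<a title="Help" href="/help">Help</a>'
)
mp_links = parse_mp_links(sample)
```

Separating parsing from fetching like this also makes the scraper easy to test against saved copies of the pages.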

Photoshop + AppleScript
vs.
Java + IntelliJ

Images on our docroot (S3
upload was taking too long)

Crash #1: more Apache
children than MySQL
connections

unreviewed_count = Page.objects.filter(
    votes__isnull=True
).distinct().count()

SELECT
    COUNT(DISTINCT `expenses_page`.`id`)
FROM
    `expenses_page` LEFT OUTER JOIN `expenses_vote` ON (
        `expenses_page`.`id` = `expenses_vote`.`page_id`
    ) WHERE `expenses_vote`.`id` IS NULL

unreviewed_count = cache.get('homepage:unreviewed_count')
if unreviewed_count is None:
    unreviewed_count = Page.objects.filter(
        votes__isnull=True
    ).distinct().count()
    cache.set('homepage:unreviewed_count', unreviewed_count, 60)
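
The caching step follows the usual get-or-compute pattern: look the value up, and only on a miss run the expensive query and store the result with a timeout. A self-contained sketch, assuming a tiny in-process `TTLCache` class in place of memcached and a placeholder `expensive_count()` in place of the real query (both are hypothetical, as is the returned number):

```python
import time


class TTLCache:
    """Tiny in-process stand-in for memcached (hypothetical helper)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, timeout):
        self._data[key] = (value, time.monotonic() + timeout)


cache = TTLCache()
calls = []


def expensive_count():
    # Placeholder for the slow COUNT(DISTINCT ...) query.
    calls.append(1)
    return 70000


def get_unreviewed_count():
    count = cache.get("homepage:unreviewed_count")
    if count is None:
        count = expensive_count()
        cache.set("homepage:unreviewed_count", count, 60)
    return count
```

With a 60 second timeout the count shown can be up to a minute stale, which is the trade made for running the query at most once a minute instead of on every page view.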

With 70,000 pages and a LOT of votes...

DB takes up 135% of CPU

Cache the count in memcached...

DB drops to 35% of CPU


reviewed_count = Page.objects.filter(
    # was: votes__isnull = False
    is_reviewed = False
).count()
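
The change here reads as denormalisation: rather than joining against the votes table on every count, each page carries its own `is_reviewed` flag. A minimal sketch of how such a flag might be kept up to date, assuming it is set at the moment a vote is recorded (the `Page` class and `record_vote` helper below are hypothetical, not the project's models):

```python
class Page:
    def __init__(self, page_id):
        self.id = page_id
        self.votes = []
        # Denormalised copy of "this page has at least one vote".
        self.is_reviewed = False


def record_vote(page, vote):
    # Write the vote and update the flag in the same step,
    # so that counting never needs to inspect the votes.
    page.votes.append(vote)
    page.is_reviewed = True


pages = [Page(1), Page(2), Page(3)]
record_vote(pages[0], "interesting")
record_vote(pages[0], "known")

# Counting is now a scan over one boolean, with no join and no DISTINCT.
unreviewed_count = sum(1 for p in pages if not p.is_reviewed)
```

The cost is that every code path that adds a vote must also maintain the flag.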

Migrating to InnoDB on a
separate server

ssh mps-live "mysqldump mp_expenses" |
sed 's/ENGINE=MyISAM/ENGINE=InnoDB/g' |
sed 's/CHARSET=latin1/CHARSET=utf8/g' |
ssh mysql-big "mysql -u root mp_expenses"
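
The two `sed` stages simply rewrite the table options in the dump text as it streams past. The same substitution in Python, as a sketch for experimenting on a sample string (`convert_dump` and the sample statement are hypothetical):

```python
def convert_dump(sql):
    # Same substitutions as the two sed commands: every occurrence is replaced.
    sql = sql.replace("ENGINE=MyISAM", "ENGINE=InnoDB")
    return sql.replace("CHARSET=latin1", "CHARSET=utf8")


sample = "CREATE TABLE `t` (`id` int) ENGINE=MyISAM DEFAULT CHARSET=latin1;"
converted = convert_dump(sample)
```

Like the `sed` version, this is a blind text replacement: it would also alter those strings if they happened to appear inside row data.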

Reining in the cowboy

An RSS to JSON proxy service
Pair programming
Comprehensive unit tests, with mocks
Continuous integration (Team City)
Deployment scripts against CI build numbers

Points of embarrassment

Database required to run the test suite
Logging? What logging?
Tests get deployed alongside the code (!)
... but generally pretty smooth sailing

Web development in 2005
Relational database, Cache
Application, Admin tools
Templates, XML feeds

Web development in 2009
Relational database, Search index, Datastructure servers, External web services, Non-relational database, Cache
Application, Admin tools, Message queue, Offline workers, Monitoring and reporting
Templates, XML feeds, API, Webhooks
