Os Django



  1. 1. Django Master Class Jeremy Dunck • Jacob Kaplan-Moss • Simon Willison Handouts for the tutorial given at OSCON, July 23rd, 2007. Available online at http://toys.jacobian.org/presentations/2007/oscon/tutorial/. 2 So here’s what we’ve got on the plate: 1. Unit testing (Simon). First because it’s important, dammit! 2. Stupid middleware tricks (Jacob). Make middleware work for you, not against you. 3. Signals (Jeremy). Get notified when important things happen. 4. Forms & AJAX (Simon). Django’s newforms library rocks, and it goes great with this whole “AJAX” thing. Now you can finally show your face in those cool “Web 2.0” cliques. 5. Template tag patterns (Jacob). Save time writing those repetitive tags by factoring out common tasks. 6. Custom fields (Jeremy). Because not every piece of data is a simple, primitive type. 7. OpenID (Simon). Learn the straight dope about OpenID, and see how to integrate it into your Django site. 8. The “rest” of the stack (Jacob). Also known as “how to scale your website by copying LiveJournal.” 9. GIS (Jeremy). Store data about our planet. Or another one. 3 First up: testing. If you pay attention to only one part of this tutorial, make it this one.
  2. 2. 4 Test Driven Development By Example (by Kent Beck) is the bible here. If you have the discipline for it, this is a really rewarding way of programming. It works particularly well if you are pair programming with someone who can keep you on the straight and narrow. 5 The other end of the spectrum. Write tests only when you need them. This is a really great way to tackle tricky bugs, where the hardest problem is often replicating them. Replicate them with a test, then solve them. The test will guarantee they don't come back to haunt you again later. 6 I speak from experience here. I had a project with a beautiful test suite. I let things lapse while dashing for a deadline. I still haven’t got all the tests working again, which discourages me from running the tests at all, which massively devalues the test suite. 7 Until I saw Ruby on Rails, I had basically resigned myself to the fact that testing web apps was too hard to be worth doing, thanks to the difficulties involved in testing something with external persistent state (a database) and most interactions happening over HTTP. Rails used fixtures to tackle the database testing problem, and included a bunch of clever hooks for making everything else easy. Django has since evolved a similar set of features, albeit with a distinctly Pythonic flavour.
  3. 3. 8 Doctests are used extensively by Django itself for unit testing the ORM - they have the nice side-effect of doubling as documentation, automatically generated for the website: » http://www.djangoproject.com/documentation/models/ You are encouraged to use them for testing your own models as well; Django's built-in test runner will detect and execute them. 9 10 11
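The doctest slides themselves did not make it into this transcript. As a minimal sketch of the idea in plain Python (no Django needed), the hypothetical full_name function below keeps its tests in its docstring, where they double as documentation, and the standard doctest module runs them:

```python
import doctest


def full_name(first, last):
    """Join a first and last name, ignoring empty parts.

    >>> full_name("Kent", "Beck")
    'Kent Beck'
    >>> full_name("Cher", "")
    'Cher'
    """
    return " ".join(part for part in (first, last) if part)


if __name__ == "__main__":
    # Runs every example found in this module's docstrings.
    doctest.testmod()
```

The same docstring style works on a model class or its methods, which is where Django's test runner would pick the examples up.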
  4. 4. 12 A naive approach. 13 It doesn't work for the edge cases. That's why tests should always target the edge cases. You can try using 365.25 instead, but it still won't pass every test. 14 This passes all the tests. I'm ashamed to admit how long it took me to get here; the tests were invaluable. 15
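The code on these slides is missing from the transcript. Assuming the function under test computes an age in years from a date of birth (a guess based on the 365 and 365.25 remarks), here is a sketch of why dividing a day count fails at the edges and how comparing month and day avoids the problem. The function names are made up for illustration:

```python
from datetime import date


def naive_age(born, today, days_per_year=365):
    """Divide elapsed days by a fixed year length (breaks on edge cases)."""
    return int((today - born).days / days_per_year)


def age(born, today):
    """Count whole years, subtracting one if this year's birthday is still ahead."""
    years = today.year - born.year
    if (today.month, today.day) < (born.month, born.day):
        years -= 1
    return years


# One day short of a 21st birthday: leap days push the 365-day version over.
born, today = date(1980, 1, 1), date(2000, 12, 31)
print(naive_age(born, today), age(born, today))

# Exactly on a first birthday: the 365.25 version falls just short.
born, today = date(2001, 3, 1), date(2002, 3, 1)
print(naive_age(born, today, 365.25), age(born, today))
```

Both edge cases are the kind of thing a test suite should pin down before the "obvious" implementation ships.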
  5. 5. 16 17 http://www.kottke.org/04/10/normalized-data discusses the quote in more detail. Cal is the lead engineer on Flickr, and knows exactly what it takes to build a system that scales to millions of users. Denormalisation is an excellent way to speed up your queries - at a cost of added complexity in your application code. It's an ideal case study for unit testing. 18 Many online forums have a view which shows the most recent 20 or so threads along with a count of the number of replies to each. This can be a pretty expensive SQL query, and can be dramatically sped up by denormalising the data. 19 Here, num_replies is the denormalised field. It stores the number of replies that are attached to that thread - information that already exists in the database (and is now stored twice).
  6. 6. 20 Fixtures provide a way of pre-populating a database with test data - great for writing tests against. These fixtures are saved in a file called forum/fixtures/threadcount.json. You can easily generate your own fixtures using the command ./manage.py dumpdata; to pretty-print the JSON, use ./manage.py dumpdata --indent 2; and for XML instead, use ./manage.py dumpdata --indent 2 --format xml. If you’ve got PyYAML installed, you can also use --format yaml. 21 This test case clears the database and loads our threadcount.json fixtures before each test. It contains two tests: one that adds a new reply to a thread, and another that deletes a reply. The tests check that num_replies accurately reflects the number of replies associated with a thread. 22 Our tests fail, because we don't have a mechanism for keeping the counter in sync with the actual data yet.
  7. 7. 23 By overriding the save and delete methods on the Reply model, we can update the num_replies of the parent thread when a reply is added or deleted. 24 The tests pass! It says 7 because I ran the test runner against a project, which included a couple of other simple applications. 25 Bonus slide: here's an alternative way of solving the denormalised counter problem, this time using signals instead of custom delete() and save() methods. If you’ve not yet learned about signals, don’t fret; Jeremy is going to cover them in a little bit. Mark this slide and come back to it then... 26
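The model code from the slide is not in this transcript, so here is a plain-Python stand-in for the technique rather than real Django models. The Thread and Reply classes, the replies list and the saved flag are all assumptions made for illustration; in a Django model the same bookkeeping would sit in overridden save() and delete() methods that also call the parent implementation.

```python
class Thread:
    def __init__(self, title):
        self.title = title
        self.num_replies = 0  # denormalised counter
        self.replies = []     # stand-in for the replies table


class Reply:
    def __init__(self, thread, body):
        self.thread = thread
        self.body = body
        self.saved = False

    def save(self):
        # Only bump the counter the first time this reply is stored,
        # so re-saving an edited reply does not inflate the count.
        if not self.saved:
            self.thread.replies.append(self)
            self.thread.num_replies += 1
            self.saved = True

    def delete(self):
        # Keep the counter in step when a stored reply is removed.
        if self.saved:
            self.thread.replies.remove(self)
            self.thread.num_replies -= 1
            self.saved = False
```

The invariant the tests should check is that num_replies always equals the real number of replies, whatever order of saves and deletes happened.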
  8. 8. 27 Two tests here. The first simply checks that Django's trailing slash adding middleware is configured correctly, and that /register/ returns a 200 (“OK”) status code. The second checks that /register/ uses the register.html template, demonstrating both the verbose way of doing this and the self.assertTemplateUsed shortcut. 28 A more complex example. This illustrates two useful concepts: POSTing to a form using client.post(), and intercepting sent e-mails using mail.outbox. Test cases that inherit from django.test.TestCase automatically hook in to Django's email framework and intercept messages, so that instead of being sent out via SMTP they are stored in a queue and made available for assertion testing. A good rule for tests is that they should never interact with external services, unless the services themselves are being tested. 29 More on testing with Django: » http://www.djangoproject.com/documentation/testing/ I've also used BeautifulSoup for running tests against the structure of my HTML before, but I generally find this counter-productive as HTML frequently changes during development without having much of an impact on the functionality of the application.
  9. 9. 30 Part the second: Middleware. 31 Most people understand a request/response cycle along these lines. This is correct, of course, but it’s also overly simplistic; there are a number of steps that this simple Request/View/Response understanding leaves out. In particular, it suggests that if we want certain behavior to happen on each request, we’re forced to write it into a view (since the view is the only part of this cycle that Django doesn’t control internally). Often we want to perform tasks on each and every request -- think about returning gzipped content, for example. If Django were really this simplistic, something along those lines would be basically impossible. 32 So this is how a request “really” works (and actually even this outline simplifies things somewhat). I don’t have time to go over all the intricate details here, but notice the pieces of “middleware” that let you hook in at various points in the cycle and override the default behavior. For example, you can see that the request middleware can return a response and “short-circuit” the entire view phase. This is how the caching framework is able to work so fast: if the page is in the cache, the view doesn’t even need to be called. A note on terminology: we call this feature “middleware”, though this term can be a bit misleading to folks with an “enterprisey” background. Ruby on Rails calls its similar feature “filters”, which would work well for Django (if not for the conflict with the naming of template filters). If you’re confused, think of middleware as essentially callbacks at particular moments in a single request cycle.
  10. 10. 33 Here’s a simple piece of middleware (modified from a post to DjangoSnippets by “Leonidas”). In particular, this is a piece of “request middleware.” There’s really nothing special about middleware; a piece of middleware is just a Python class that defines a particular API. Here, by defining process_request(), this object can be used as a piece of request middleware. 34 A piece of middleware can define multiple handlers. It can also save state as instance attributes (i.e. self.foo = 'whatever'), but note that for performance a single middleware instance is reused for all requests. 35 “Installing” a piece of middleware is as simple as registering it in MIDDLEWARE_CLASSES. This example shows some built-in Django middleware along with the piece of middleware from the previous slide. 36 The order of MIDDLEWARE_CLASSES is important; middleware is processed “top-down” during the request phase, and “bottom-up” during the response phase.
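The middleware code from these slides is not in the transcript. The following is a toy model of the request cycle in plain Python, not Django's actual handler: requests are plain dicts, responses are plain strings, and the three middleware classes and the handle function are invented for illustration. It is meant to show the two points made above, that request middleware runs top-down and can short-circuit the view, while response middleware runs bottom-up.

```python
class BlockBots:
    """Request middleware: may short-circuit by returning a response."""

    def process_request(self, request):
        if request.get("user_agent") == "bot":
            return "403 Forbidden"
        return None  # carry on with the normal cycle


class Shout:
    """Response middleware: modifies the response on its way out."""

    def process_response(self, request, response):
        return response.upper()


class AddFooter:
    def process_response(self, request, response):
        return response + " -- footer"


def handle(request, view, middleware):
    response = None
    for mw in middleware:  # request phase: top-down
        if hasattr(mw, "process_request"):
            response = mw.process_request(request)
            if response is not None:
                break  # short-circuit: the view is never called
    if response is None:
        response = view(request)
    for mw in reversed(middleware):  # response phase: bottom-up
        if hasattr(mw, "process_response"):
            response = mw.process_response(request, response)
    return response
```

With the list [BlockBots(), Shout(), AddFooter()], the footer is added before the shouting happens, because AddFooter sits at the bottom of the list and so sees the response first.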
  11. 11. 37 Here’s another way of looking at it. Middleware is the onion skin around the view; you can think of each middleware class as a “layer” that wraps the view and can intercept data on its way in or out. 38 The four types of middleware callbacks. 39 It’s nasty and slimy, but a great example of request middleware is a three-click paywall like some news sites use. That is, you get free access to the site, but after your third page you get redirected to a login/registration page. Pretty straightforward, but note that request middleware may return an HttpResponse or suitable subclass. If so, the rest of the request is short-circuited and the view is never handled. However, the middleware may also return None, which signals that the normal request cycle should be continued. 40 View middleware... isn’t really very useful, honestly. It’s mostly there for a debugging hook -- it’s a nice place to hook in if you’d like to wrap and profile a view, for example. I’m going to skip showing an example, because you probably won’t ever need to use it.
  12. 12. 41 You’ll use response middleware any time you need to modify the output before it gets sent to the browser. 42 Cute, eh? 43 Like view middleware, exception middleware isn’t all that useful in end-user code; it’s mostly there as a hook for doing frameworky stuff. So I’ve cheated and taken an example from Django itself: the built-in TransactionMiddleware that handles keeping each request in its own transaction. Here we can see the rollback step taken by the exception hook. (There’s of course a similar commit step in the response middleware, but that’s not shown here.) 44 More: » http://www.djangoproject.com/documentation/middleware/ » http://www.djangobook.com/en/beta/chapter16/ » http://code.djangoproject.com/wiki/ContributedMiddleware » http://www.djangosnippets.org/tags/middleware/
  13. 13. 45 Jacob’s discussion of middleware showed that it provides hooks for additional processing of HTTP requests and responses. I’m going to cover signals, which provide similar hooks in Django’s lifecycle and ORM. 46 You’ve probably used something like Django’s signaling tools before in the form of Observer from the book “Design Patterns” (a.k.a. the Gang of Four book), or from Qt, Java, or .Net programming. Something so popular has got to be useful, right? When you’re first starting out with a toolset, it’s common to just make things work. But, when your codebase grows or you wish to start combining and layering components, directly referencing other modules and applications leads to circular dependencies, tight coupling, difficulties in testing, and, yes, sadness. You can use Django’s stock signals to hook into other apps and to customize ORM behavior. You can also provide your own signals for use in other applications. 47 The core idea is that signals provide a way to communicate and coordinate without directly expressing dependencies. Note that it’s possible to have multiple handlers per signal. They’ll run sequentially, but their order is undefined. You shouldn’t write signal handlers with the expectation that an earlier handler has altered state.
  14. 14. 48 Here’s a simple example from the Django codebase. We want Django’s ORM to be useful without the HTTP handler, and vice versa. But we also want to make sure that when an HTTP request is finished, the DB connection is closed. The core.request_finished signal is used to notify the ORM that the connection is no longer needed. 49 Using signals starts with choosing one. You can either use a stock Django one or publish your own. Defining your own signal is as simple as creating an object to represent it. Once you’ve chosen your signal, you’ll write a handler based on the arguments the signal’s sender provides. Finally, you’ll connect your handler to the signal. 50 Django includes a number of signals which it uses internally.
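To make the three steps concrete (choose or define a signal, write a handler, connect the two), here is a tiny observer implementation in plain Python. It is not Django's dispatcher and its API is made up for this sketch; the Signal class, the closed list and the "http-handler" sender are all illustrative assumptions. The point is only that the sender and the handler never import each other.

```python
class Signal:
    """A minimal stand-in for a signal: a list of connected handlers."""

    def __init__(self):
        self.handlers = []

    def connect(self, handler):
        self.handlers.append(handler)

    def send(self, sender, **kwargs):
        # Call every connected handler; collect whatever they return.
        return [handler(sender=sender, **kwargs) for handler in self.handlers]


# Step 1: define a signal (an object that represents the event).
request_finished = Signal()

# Step 2: write a handler based on the arguments the sender provides.
closed = []

def close_connection(sender, **kwargs):
    closed.append(sender)

# Step 3: connect the handler to the signal.
request_finished.connect(close_connection)

# Elsewhere, the HTTP side announces the event without knowing who listens.
request_finished.send(sender="http-handler")
```

Because handlers are just entries in a list here, several can listen to the same signal, which mirrors the "multiple handlers per signal" point made earlier.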
  15. 15. 51 django.core.signals is home to a couple more request-related signals. request_started is sent when the request handler first begins processing, and is used internally to reset db.connection.queries, a list of all queries executed by Django’s ORM which is kept when settings.DEBUG == True. got_request_exception is used to indicate an exception occurred while processing a request. It’s used internally to roll back any pending database transaction as well as for exception reporting in django.test. 52 Now we get to the good stuff. The class_prepared signal indicates that a Model class has been constructed. It’s used internally for some housekeeping such as ensuring that every model has a Manager and resolving recursive model relationships. This signal is very early in the life of a Model, so some pretty radical features are possible. The pre and post init signals allow signal handlers to munge data just as a model instance is created. We’ll see an example in GenericForeignKey a bit later. The pre and post save signals allow a signal handler to do additional processing in response to the model being saved. The pre and post delete signals serve a similar purpose. post_syncdb is sent by django.core.management just after an app’s models have been added to the database. It’s used for interactive prompting, as seen in auth’s initial superuser prompt. 53 One nice use of signals is to add additional functionality to existing code. Suppose we want to get an email any time a model is saved with a pub_date attribute set in the future.
  16. 16. 54 Note that if you just want this type of handling on a single model which you control, you’d probably be better off overriding the save method in your model definition rather than using a signal. But in this case, we want to handle multiple models. We’ll need to listen to a save signal. We can use either pre- or post-save in this case, since the signal will not be manipulating the data about to be saved. We’ll use pre_save. Django dispatches the pre_save signal with the keyword arguments sender (the model class) and instance (the model object). We’ll need to define a signal handler to use these parameters. 55 Connecting to a signal is pretty simple-- just call dispatcher.connect, passing in the handler and the signal for which it should be called. 56 Recall that pre_save offers both the model class and instance as parameters. In this case, we care about the model instance, but not the class. Django’s dispatching system will match up the published arguments with the subscribed handlers. There’s no need to accept all parameters explicitly in the handlers.
  17. 17. 57 Since we’re trying to handle many different models, we’ll have to assume some common interface. Here, we check whether the model has the attributes we expect, and if not, we stop processing the signal. 58 Now, whenever a model instance is saved, mail_on_future will be called. 59 Another use of signals is to adapt from one form in an API call to another. 60 GenericForeignKey makes it possible to refer to any kind of related instance using ForeignKey-like semantics. It does this by storing the related instance’s content type and primary key value. But there’s a hitch-- models with regular ForeignKey fields can be constructed with references to the related model instance. In this example, we’re assigning an author to a story. GenericForeignKey, however, requires both a content type and a foreign key. The API would be more consistent with ForeignKey if we had a way to hide that complexity. In this example, we’d like to assign a target object for a Comment.
  18. 18. 61 To accomplish this, GenericForeignKey listens for the pre_init signal and alters the model construction call from the nice form to the ugly (but necessary) form. 62 In the pre_init handler, GenericForeignKey inspects the constructor kwargs for the desired usage. 63 And then it replaces the given model instance with its related content type and primary key. 64 This reduces the lines of code needed to use the GenericForeignKey, and makes the API more like a standard ForeignKey. Nice!
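The handler code from these slides is missing here, so this is only a sketch of the kwargs rewrite it describes, in plain Python. The function name, the default field names (target, content_type, object_id), the Story class and the use of a lowercased class name in place of a real content type row are all assumptions made for illustration.

```python
class Story:
    """Hypothetical related object; only its primary key matters here."""

    def __init__(self, pk):
        self.pk = pk


def rewrite_kwargs(kwargs, name="target",
                   ct_field="content_type", fk_field="object_id"):
    """Swap kwargs[name] (an instance) for a content-type/primary-key pair."""
    if name in kwargs:
        obj = kwargs.pop(name)
        # Stand-in for looking up the real content type of the instance.
        kwargs[ct_field] = type(obj).__name__.lower()
        kwargs[fk_field] = obj.pk
    return kwargs


# The "nice" constructor call ...
nice = {"target": Story(pk=7), "text": "First!"}
# ... becomes the "ugly (but necessary)" one before the model is built.
ugly = rewrite_kwargs(nice)
```

Running the rewrite from a pre-construction hook is what lets callers keep writing the nice form.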
  19. 19. 65 You can find further information on signals as implemented in Django with these links: » http://en.wikipedia.org/wiki/Observer_pattern » http://code.djangoproject.com/wiki/Signals » http://pydispatcher.sourceforge.net/ 66 These projects, available on http://code.google.com/, all use signals. django-multilingual, in particular, is very ambitious; it uses signals to dynamically create models featuring parallel texts for originally-specified models. Additionally, it substitutes its own custom (oldforms) manipulators in to facilitate data entry of multilingual text. Have a look and have fun. 67 68
  20. 20. 69 70 71 This view has three return values: the empty string, if it was given an empty username; the text 'Unavailable', if it was given a username that is unavailable; and the text 'Available' for usernames that are available. 72 The jQuery function takes a CSS selector as its first argument; here we are passing a selector for the span element with id="msg", but it supports all sorts of advanced selectors including ones from CSS 2 and 3, XPath and a few that are unique to jQuery. The function returns a wrapper object around the collection of elements matched by the selector. jQuery methods can then be called on the wrapper; in this case we are calling the load method, which uses Ajax to retrieve a fragment of HTML from a URL and then injects it in to the element(s) on which it was called.
  21. 21. 73 For convenience, jQuery sets up $() as an alias to itself. jQuery and $ are the only two symbols it adds to your global namespace, and you can revert $ back to what it was before if you want to (for compatibility with Prototype, for example). 74 Here we're binding a function to the keyup event of the input field. Every time a key is released it performs the Ajax request. 75 Finally, we set the whole thing to run when the page has finished loading. This ensures that the input element has been loaded in to the browser's DOM. $(document).ready() fires after the DOM has been loaded but before all of the images have been loaded - this means it's a better way to attach JavaScript behaviours than the more traditional window.onload, which can take a lot longer to fire. 76 The $ function also acts as a shortcut for $(document).ready, if you pass it a function instead of a selector string.
  22. 22. 77 All Web applications need server-side validation, to ensure the integrity (and security) of data submitted by the client. Application usability can be enhanced by adding JavaScript client-side validation, but this often leads to duplicated validation logic - the same rules expressed once in Python and once in JavaScript. With Ajax, we can reuse the server-side code for client-side validation. 78 Django's newforms library allows us to define form validation logic in a similar way to Django models - declaratively, using a subclass of newforms.Form. 79 Here's the server-side code that goes with that form. If the form has been POSTed, it checks if it is valid. If it is, it sends an e-mail (in this case) and redirects the user. If the form is invalid or has not yet been submitted, the contact page is displayed. 80 The template looks like this. form.as_p provides a simple default layout for the form; the template can be extended to define exactly how the form should look if a custom display is required.
  23. 23. 81 Let's add client-side validation, reusing our ContactForm for validation. This view expects to be POSTed either the whole form or just one of the fields; if just one field is provided, the field= GET variable is used to specify which one. The view returns a Python dictionary rendered to JSON, a useful data format for Ajax as it can be evaluated as regular JavaScript. It makes use of a custom JsonResponse class, which knows how to render a Python object as JSON. 82 Here's JsonResponse. I often include this utility class in my applications when I'm working with JSON. Note that it sets the correct Content-Type header, "application/json". This can make debugging difficult as the browser will attempt to download the content directly; an improved version could check for settings.DEBUG and serve using "text/plain". 83 This is the accompanying JavaScript. The validateInput function is called for an input field, and performs an HTTP POST (using jQuery's Ajax features) against the view we just defined. It makes use of the jquery.form.js plugin, which adds the formToArray() method to the jQuery object. jQuery plugins provide a clever mechanism for extending jQuery's functionality without needing to increase the size of the main jquery.js file. The validateInput function is attached to every input field on the page, using jQuery's handy custom :input selector.
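The view and the response class from these slides are not in the transcript. As a rough sketch of the validation logic only, in plain Python with the standard json module, the function below validates either a single named field or the whole submission and returns the result as a JSON string. The field names, the validators, the error messages and the shape of the JSON are all assumptions, and a plain dict of validator functions stands in for the form class.

```python
import json


def not_empty(value):
    if not value.strip():
        raise ValueError("This field is required.")


def looks_like_email(value):
    not_empty(value)
    if "@" not in value:
        raise ValueError("Enter a valid e-mail address.")


# Hypothetical stand-in for the form's declared fields.
VALIDATORS = {
    "name": not_empty,
    "email": looks_like_email,
    "message": not_empty,
}


def validate(data, field=None):
    """Validate one field if `field` is given, otherwise every field."""
    names = [field] if field else list(VALIDATORS)
    errors = {}
    for name in names:
        try:
            VALIDATORS[name](data.get(name, ""))
        except ValueError as exc:
            errors[name] = [str(exc)]
    return json.dumps({"valid": not errors, "errors": errors})
```

Validating a single field is what makes the per-keystroke or per-blur Ajax check cheap, while the full-form path reuses exactly the same rules on submit.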
  24. 24. 84 Here's the showErrors function, which displays any errors in the errorlist associated with the form element. relatedErrorList() uses jQuery's DOM traversal functions to find the error list associated with the input element, and creates one if there isn't one already. 85 86 87 Bonus slide: here’s that validate_contact method repackaged as a generic view.
  25. 25. 88 More: » http://www.djangoproject.com/documentation/newforms/ » http://jquery.com/ » http://dojotoolkit.org/ » http://developer.yahoo.com/yui/ » http://www.prototypejs.org/ » http://www.djangosnippets.org/tags/ajax/ 89 Custom template tags are supremely useful. Write ‘em for a while, however, and you start to discover some patterns you use over and over again. In this part, I’ll go over five common needs, and the patterns I use to handle them. 90 The first use case: simple data (e.g. a list, text, etc.) in, simple data out. When you’ve got one of these tasks, think “filter!” 91 An example filter to “piratize” text. Filters really are damn simple, so there’s not much more to say about this.
  26. 26. 92 Use case #2: you’ve got some programmatically-generated data (e.g. from the results of a database lookup, or system call, or ...) that you’d like to render into the template. In this case, the @simple_tag decorator is your friend. 93 Here’s a pretty simple example: display a server’s uptime. Not a very useful tag, but shows the basic pattern pretty well. 94 Use case #3: you’ve got something you want to display in a template tag, but it’s expensive and you don’t want template authors killing your servers. The solution is to cache the results of template tags. 95 I’ve written a set of node subclasses that illustrate one way you could use caching with template tags. It’s a useful idea even if you don’t use these specific bits.
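The caching node subclasses mentioned above are not included in this transcript. To show just the idea, here is a small in-process sketch in plain Python: the expensive function runs at most once per timeout window, and every other render reuses the stored value. The CachedTag class and its clock parameter are inventions for this example; a real project would more likely store the value in its cache backend.

```python
import time


class CachedTag:
    """Wrap an expensive callable and reuse its result until it expires."""

    def __init__(self, func, timeout, clock=time.time):
        self.func = func
        self.timeout = timeout  # seconds a rendered value stays fresh
        self.clock = clock      # injectable so tests can control time
        self._value = None
        self._expires = None

    def render(self):
        now = self.clock()
        if self._expires is None or now >= self._expires:
            # Cache miss or stale value: do the expensive work once.
            self._value = self.func()
            self._expires = now + self.timeout
        return self._value
```

Making the clock injectable is a small design choice that keeps the expiry logic testable without sleeping.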
  27. 27. 96 This is a use case that doesn’t come up very often, but sometimes you need to do pretty complex stuff. 97 Here’s an example (also available at djangosnippets.org) of what I’m talking about. These tags depend on each other, and you’ll need to handle the child tokens “inside” the switch tag correctly. 98 The important parts to notice here are the three commented lines. First we gather all the child nodes until the {% endswitch %} tag; then we delete that {% endswitch %} tag; then we pull out just {% case %} nodes. From there, it’s a matter of returning the node type. The case handler is very similar; it just doesn’t have to do the get_nodes_by_type() call. 99 Here’s (the render method of) the switch node. Notice all that it does is delegate rendering off to the case node after doing some checks.
  28. 28. 100 Finally, this is the interesting part of the case node. Pretty simple: check for equality, and (when requested) render all the child nodes passed in. Again, the full code’s available online at http://www.djangosnippets.org/snippets/300/. 101 This is a common complaint: “I’ve got this cool tag, but I hate having to {% load %} it everywhere!” The solution is to make it a builtin. 102 And here’s how. You can stick this code anywhere that’ll get loaded on startup; I suggest installing it in a top-ish-level __init__.py. 103 More resources: » http://djangoproject.com/documentation/templates_python/ : the official template documentation. » http://code.google.com/p/django-template-utils/ : James’ template utils have some good examples. » http://www.djangosnippets.org/ : There are lots of good resources here.
  29. 29. 104 There are two different kinds of fields in Django: newforms.Field (which Simon covered earlier), and db.models.Field. Here I’ll cover model fields. Model fields provide a way to customize the behavior of the ORM and to provide a richer interface when dealing with model instances. 105 There are many model fields that come with Django. Here are a few that run the spectrum of sophistication. A CharField requires a maxlength argument, and otherwise supports common validation parameters like blank, null, and default. I’m sure you’ve used one before. Note that each of those parameters could be implemented as a validator given in validator_list. They’re included in the CharField implementation because they are so commonly useful. Next on the spectrum is URLField, which is a CharField with a larger default maxlength and an additional option to validate that the resource identified actually exists. A FileField goes further by contributing helper functions, such as get_FIELD_url, to the associated model. As I covered earlier, GenericForeignKey provides an abstraction layer over the ContentType package in order to make model instances refer to any other model. Developers using Django can tap into this power, too. 106 We’ll start with a validating ISBNField. An ISBN is a unique identifier assigned to each edition (or sometimes printing) of any book. They come in 10 and 13 digit varieties; 13 digits is the new standard. The last digit is a check digit and can be used to verify validity.
  30. 30. 107 We need to subclass an existing Field class. The base Field class provides hooks needed for Django to manage persistence. We’ll usually want to override the Field.__init__ in order to set constraints, and we need to map our Field into a database column. 108 Before we get to the actual field, a little warning about validation. Form processing is in flux on trunk right now. Oldforms is being replaced with Newforms. Oldforms used manipulators, which validated, in part, using a field’s validator_list. There’s some debate right now whether validation logic belongs in models, forms, or both. Rather than get sidelined with that debate and the many ways to currently do it, I’m going to cheat and not use forms here. Instead, I’ll rely on Model.validate, which, at least on trunk right now, calls validate for each of the fields on the model. Watch this space. 109 Let’s get started. We’ll inherit from CharField to start with, since ISBNs are a string of characters. Here’s our custom validator. If you’re not familiar, validators must raise a ValidationError exception to indicate failure.
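The validator from the slide is not in this transcript. Here is a sketch of the check-digit logic in plain Python, covering both the 10-digit and 13-digit forms. The ValidationError class defined here is a local stand-in so the example runs without Django, and the function name, signature and messages are illustrative rather than the ones on the slide.

```python
import re


class ValidationError(Exception):
    """Local stand-in for the framework's validation exception."""


def is_isbn(value):
    """Raise ValidationError unless value is a valid ISBN-10 or ISBN-13."""
    digits = value.replace("-", "").replace(" ", "").upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", digits):
        # ISBN-10: weights 10 down to 1, total divisible by 11; X means 10.
        values = [10 if c == "X" else int(c) for c in digits]
        total = sum(w * v for w, v in zip(range(10, 0, -1), values))
        ok = total % 11 == 0
    elif re.fullmatch(r"[0-9]{13}", digits):
        # ISBN-13: weights alternate 1 and 3, total divisible by 10.
        total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(digits))
        ok = total % 10 == 0
    else:
        ok = False
    if not ok:
        raise ValidationError("Enter a valid ISBN.")
```

Stripping hyphens and spaces first means the same check works whether the number was typed as 0-306-40615-2 or 0306406152.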
  31. 31. 110 In the ISBNField’s __init__, we’ll force maxlength to be 13, since all ISBNs are at most that many characters. We also add the isISBN validator to validator_list, as an example of how we could support oldforms. 111 Finally we add get_internal_type to tell Django to map ISBNField to the CharField database column type. 112 Now we can use the ISBNField like any stock field. We can give it a valid ISBN and have it pass, or a bad ISBN and have it fail. 113 Given an ISBN, it’s common to want related information about a book such as the title. Let’s change ISBNField so that it contributes a CharField for the title in addition to its own field.
  32. 32. 114 So, I’ve written a method that, given an ISBN, returns the title of that book. I’ve also tweaked the ISBNField.__init__ to take an optional title_field argument. This is used to determine the name of the title field on this model. 115 Every Field has a contribute_to_class method, which Django uses to help define the Model class. In the last example, we just let the standard Field.contribute_to_class do its thing, but now we want to alter the model class definition to include an extra Field for the title. The tricky part here is incrementing the creation counter. The creation counter is used to maintain field order when one Django model inherits from another one. But it also affects the order of field value assignment in the model’s constructor. We want ISBN to be set after the title field so that we can fill in the title based on the ISBN value. If the ISBN field occurred before the title in the model definition, the title set by the ISBNField might be overwritten. Finally, we contribute the new title field to the model we’re helping to build. 116 Actually, there’s one more step to the contribution. We’d like the title attribute to be derived from the given ISBN. If you want to control what happens on an attribute access, you typically use a property. In Django, the Field instance is attached to the Model class. This is important to realize, because a single Field instance can’t manage the model instances. Instead, we need to use a “descriptor”.
  33. 33. 117 Descriptors are objects that take a class or instance as a parameter, and resolve attribute lookup using both that reference and internal state. See Guido’s discussion here: http://www.python.org/download/releases/2.2.3/descrintro/ Since serving the attribute resolution is tightly related to the Field itself, I’ve made the Field instance itself serve as the descriptor for the Model class. 118 Here’s the descriptor “set” method for setting the value of the field on the model. We ensure that the call is for a model instance rather than the model class. This prevents overriding the field on the class in outside code. Then, if the ISBN is a string or None, the ISBN is stashed in the model instance’s dictionary, and the title is set to correspond to the ISBN. 119 Finally, when the ISBNField’s attribute is accessed, we return the value from the model instance’s dictionary. This is the descriptor’s “getter” method. 120 There we have it: an ISBNField that manages a related title field.
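The field code from these slides did not survive in the transcript. As a sketch of the descriptor mechanics only, in plain Python with no Django, the class below lives on the class object, keeps each instance's ISBN in that instance's own dictionary, and fills in a title whenever the ISBN is set. The TITLES table, lookup_title, ISBNDescriptor and Book are all hypothetical names for this example.

```python
TITLES = {"9780306406157": "Example Title"}  # hypothetical lookup table


def lookup_title(isbn):
    return TITLES.get(isbn)


class ISBNDescriptor:
    """One object on the class that manages a value per instance."""

    def __init__(self, name, title_attr):
        self.name = name
        self.title_attr = title_attr

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if value is not None and not isinstance(value, str):
            raise TypeError("ISBN must be a string or None")
        # Stash the value on the instance, then derive the title from it.
        instance.__dict__[self.name] = value
        setattr(instance, self.title_attr,
                lookup_title(value) if value else None)


class Book:
    isbn = ISBNDescriptor("isbn", "title")

    def __init__(self, isbn=None):
        self.title = None
        self.isbn = isbn  # goes through ISBNDescriptor.__set__
```

Storing the value in instance.__dict__ rather than on the descriptor is the key point: one descriptor is shared by every instance, so it cannot hold per-instance state itself.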
  34. 34. 121 More resources on Django’s model creation lifecycle:
» http://code.djangoproject.com/wiki/DevModelCreation
» http://toys.jacobian.org/presentations/2007/pycon/tutorials/advanced/#s22
The TagField that’s part of django-tagging (http://code.google.com/p/django-tagging/) is a good example. And more information about the Python magic that lets this work:
» http://www.python.org/download/releases/2.2.3/descrintro/
» http://docs.python.org/ref/attribute-access.html
122 123
  35. 35. 124 OpenID solves the “too many passwords” problem: with OpenID, you don’t have to come up with a brand new username and password on every site where you need an account. It’s decentralised, which means that there’s no central entity controlling everyone’s identity, unlike Microsoft Passport or Six Apart’s TypeKey. It’s an open standard, supported by Open Source libraries. For a much more detailed introduction, watch the video of my Google Tech Talk (or read through the slides):
» http://video.google.com/videoplay?docid=2288395847791059857
» http://www.slideshare.net/simon/implications-of-openid-google-tech-talk/
125 These are some of my OpenIDs. It’s perfectly normal for people to have more than one (people have maintained multiple online personas since the early days of the Internet), but in practice most people will pick one and use it on most sites. If you have a LiveJournal or AOL account, you have an OpenID already. If you don’t have one, there are plenty of places where you can get one: http://openid.net/wiki/index.php/OpenIDServers
126 You can watch a screencast of OpenID in action here: http://simonwillison.net/2006/openid-screencast/
  36. 36. 127 128 If you view the HTML source of a page that is an OpenID, you’ll find this in the <head> section. This tells the OpenID consumer (the site you are signing in to) where your provider’s server is. This is the URL you will be redirected to in order to “prove” that you own that OpenID. Proof is often done by signing in to that site with a username and password, but other forms of authentication are possible as well. The consumer also establishes a shared secret with the provider, if they haven’t communicated before. This lets them communicate securely even though your browser is carrying the information back and forth between the two of them.
129
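The discovery step in note 128 can be sketched as follows. This is a rough illustration using only the standard library, not how the JanRain library does it: OpenID 1.x identity pages advertise the provider with a `<link rel="openid.server">` element, and the sketch simply pulls that href out of the page. The class and function names and the example URL are made up for this sketch, and a real consumer would also fetch the page over HTTP and restrict the search to the <head>.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect the hrefs of <link rel="openid.*"> elements in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # A rel attribute may hold several space-separated values.
        for rel in (attributes.get("rel") or "").lower().split():
            if rel.startswith("openid.") and href:
                self.links[rel] = href


def find_openid_server(html):
    """Return the provider URL advertised by the page, or None."""
    parser = OpenIDLinkParser()
    parser.feed(html)
    return parser.links.get("openid.server")


# A made-up identity page; the provider URL is a placeholder.
PAGE = """<html><head>
<title>Example identity page</title>
<link rel="openid.server" href="https://provider.example.com/server">
</head><body></body></html>"""

server = find_openid_server(PAGE)
```

The consumer would then redirect the browser to the URL held in `server` to start the “prove you own it” step.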
  37. 37. 130 Simple registration (sreg) essentially acts as a way of helping you to pre-fill a registration form. As part of the OpenID sign in process, the consumer can ask your provider for this information. Your provider will explicitly ask your permission before passing it back. There are no guarantees that complete (or indeed any) information will be passed back at all, so consumers can’t rely on this working. More here: http://simonwillison.net/2007/Jun/30/sreg/
131 132 The reference implementation is the JanRain OpenID library: http://www.openidenabled.com/openid/libraries/python/. It’s a great library, and really isn’t that hard to use. But there is an easier way...
133 The models are used by the JanRain library for persistence; you don’t have to worry about them at all. Full instructions here: http://django-openid.googlecode.com/svn/trunk/openid.html
  38. 38. 134 The full middleware line is 'django_openidconsumer.middleware.OpenIDMiddleware', but that didn't fit on the slide. You need to add this somewhere after the session middleware, which must be activated for the OpenID functionality to work.
135 The first URL will be your sign-in page, where users are directed to begin signing in with OpenID. The second is the URL that the user will be redirected back to upon successful sign in with their OpenID provider. The third is the signout page, which users can use to sign out of your application.
136
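The ordering requirement from note 134 might look like this in a settings file. Treat it as a sketch under assumptions rather than the project's documented setup: the OpenID middleware path comes from the note above, the session entries are Django's standard ones, and the rest of the list is a placeholder for whatever your project already uses.

```python
# settings.py fragment (setting names as used by 2007-era Django).
INSTALLED_APPS = (
    "django.contrib.sessions",   # sessions must be enabled for OpenID sign-in
    "django_openidconsumer",     # provides the persistence models from note 133
)

MIDDLEWARE_CLASSES = (
    "django.middleware.common.CommonMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    # Must come somewhere after the session middleware.
    "django_openidconsumer.middleware.OpenIDMiddleware",
)
```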
  39. 39. 137 138 It may not be instantly obvious why it is useful to have users sign in with more than one OpenID at once. There are a number of reasons, but the most interesting is that sites may well start to offer API services around the OpenIDs they provide - for example, a last.fm OpenID may be used to retrieve that user's last.fm music preferences, while an Upcoming.org OpenID could provide access to their calendar. Supporting multiple OpenIDs allows services to be developed that can take advantage of these site-specific APIs.
139 140 By “coming soon”, I mean really soon. There's a small chance I'll have released the first of these before giving this tutorial.
  40. 40. 141 More info:
» http://openid.net/ : the official OpenID site; also home to the OpenID mailing lists.
» http://www.openidenabled.com/ : a directory of OpenID-enabled applications.
» http://simonwillison.net/tags/openid/ : all of Simon’s writings on OpenID.
» http://code.google.com/p/django-openid/ : home of the django-openid library.
142 143 So: diagrammed loosely, this is what a typical website looks like, right?
  41. 41. 144 Ahem.
145 This is more like it. This is LiveJournal’s current architecture, taken from slides presented by Brad Fitzpatrick. Yes, LiveJournal is a big site, but 90% of good scaling is foresight. Planning ahead to an architecture like this is the only way we’ll actually get there without too much trouble.
146 The thing is, this is the only part of that cluster that’s LiveJournal-specific. In any big application, there’s a bunch of other code that does infrastructure-related activities, and all that is reusable. In fact, poke under the hood at most big web sites (MySpace, Facebook, Slashdot, etc.) and you’ll find many tools crop up over and over again. The wonder of the LAMP-ish stack these days is that you can use the same tools the big boys use. The fact that MySpace gets 6000 hits/second out of Memcached makes me not worry at all about my 60. I’m going to go over a few of these tools that’ll give you the most “bang for your buck.”
  42. 42. 147 The first tool I’ll look at is Perlbal. Perlbal is a “reverse proxy load balancer and web server”, which is a fancy way of describing a tool that mediates between web browsers and backend web servers. Perlbal can do a whole bunch more, actually (including acting as a part of MogileFS, which is awesome but which I can’t cover in this tutorial), but I’ll just focus on its role as a reverse proxy. There are, of course, other load balancers (Apache’s mod_proxy and nginx come to mind), and much of the following applies to them. I use Perlbal, so that’s what I’m gonna talk about.
148 So why use a reverse proxy at all? Well, even if you’ve only got a single web server, Perlbal can still save your butt. Although it takes only fractions of a second to generate a page, a slow client can take a relatively long time to download that content. In most situations even your faster clients have far smaller pipes than your server; this leaves the server to spend the majority of its time “spoonfeeding” rendered data down to clients. Perlbal (and other reverse proxies) will buffer a certain amount of content and trickle it down to clients, leaving your backend free to handle more requests. Second, if all your requests go through a proxy, it’s amazingly easy to swap out backend web servers, add more as traffic increases, or otherwise move things around. Without a proxy, you’d spend a bunch of time rebinding IP addresses, and possibly end up locked into a server you don’t like. Finally, if you’re lucky you’ll get to the point that a single server won’t handle all the traffic you’re throwing at it. Perlbal makes it incredibly easy to add more backend servers if and when that happens.
149 Unfortunately, Perlbal isn’t documented all that well. The docs in SVN are pretty good, and the mailing list is a great place to get help. I’ll also show some example configs over the next few slides.
» http://danga.com/perlbal/
» http://code.sixapart.com/svn/perlbal/trunk/doc/
» http://lists.danga.com/mailman/listinfo/perlbal
  43. 43. 150 Here’s a stripped down version of the Perlbal config for ljworld.com. We’re using the virtual host plugin to delegate based on domain name. The domain name points to a “service”, which (since it’s a proxy) points to a “pool” of servers. We’re using a cute trick for the pool here; instead of listing the servers in the config file, we point to a “nodefile” of backend web servers.
151 This is that node file; one IP:port per line. The clever thing is that Perlbal notices if this file changes and automatically reconfigures the pool; this means that changing the pool is as simple as changing this file.
152 A couple of tricks we’ve learned over a few years of using Perlbal:
» Because you’re now behind a proxy, REMOTE_ADDR won’t be correct (it’ll always be set to the IP of Perlbal itself). Django’s included XForwardedForMiddleware will correctly set REMOTE_ADDR for you.
» Perlbal has some neat tricks; check out X-Reproxy-File and X-Reproxy-URL.
» It’s often useful to know which backend server actually handled a request. We use a special X-header to keep track of that (X-Beatles).
» If you’ve got a change you’re not sure about, you can always deploy it to a single server and let Perlbal hand just a portion of requests to that server.
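The first trick above (repairing the client address behind a proxy) can be sketched like this. It is a standalone illustration of the idea, not Django's middleware: `apply_forwarded_for` is a hypothetical helper, it assumes the proxy adds an X-Forwarded-For header (seen by the backend as `HTTP_X_FORWARDED_FOR`), and it should only be trusted when every request really does arrive through your own proxy.

```python
def apply_forwarded_for(meta):
    """Return a copy of a request META dict with REMOTE_ADDR taken from
    the X-Forwarded-For header, when that header is present."""
    fixed = dict(meta)
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    # The header may hold a comma-separated chain of addresses;
    # the first entry is the original client.
    client_ip = forwarded.split(",")[0].strip()
    if client_ip:
        fixed["REMOTE_ADDR"] = client_ip
    return fixed


# What a backend sees behind the proxy (addresses are placeholders).
behind_proxy = {
    "REMOTE_ADDR": "10.0.0.5",                        # the proxy itself
    "HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.5",  # client, then proxy
}
repaired = apply_forwarded_for(behind_proxy)
```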
  44. 44. 153 The next tool on our little micro-tour is memcached. It’s an in-memory object caching system, and it’s the secret to making your sites run fast. Django’s caching framework will use memcached, and for any serious production-quality site you should let it.
154 Really, there’s no reason not to use memcached, so I’m not going to spend much time advocating it. If you choose a different cache backend you deserve what you get.
155 This is how easy it is to start memcached.
156 And this is all you need to do to make Django use it (well, besides installing the memcached client library, which is pure Python and will run anywhere). Since it confuses some people, the second line shows how to use multiple memcached backend servers.
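Since the slide code is not reproduced in these notes, here is a hedged sketch of what notes 155 and 156 describe. It assumes the `memcached://` URI form of the `CACHE_BACKEND` setting used by Django releases of this era, with several servers separated by semicolons; the helper function, the second setting name and all addresses are placeholders invented for this example.

```python
def memcached_backend(servers):
    """Build a memcached:// CACHE_BACKEND URI from 'host:port' strings."""
    if not servers:
        raise ValueError("at least one memcached server is required")
    return "memcached://%s/" % ";".join(servers)


# One local daemon, started for example with:
#   memcached -d -m 1024 -l 127.0.0.1 -p 11211
CACHE_BACKEND = memcached_backend(["127.0.0.1:11211"])

# Several servers sharing the load (hypothetical name, placeholder addresses).
CACHE_BACKEND_MULTI = memcached_backend(["10.0.0.1:11211", "10.0.0.2:11211"])
```

In a real settings file you would normally just write the resulting string literal; the helper only makes the format explicit.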
  45. 45. 157 Some tricks:
» More memcached servers generally equals better performance (e.g. four 1 GB servers will perform better than one 4 GB server). That’s because the memcached protocol hashes twice: once on the client to determine the server, and once on the server. This gives a roughly even distribution of keys across servers, and hence better performance. You do want roughly equal cache sizes on each server so that keys expire at a similar rate everywhere.
» You want to make sure to use unique keys if you’re running multiple sites against the same cache. Otherwise one.example.com/A/ could get the same key as two.example.com/A/, and that’s bad. We use DJANGO_SETTINGS_MODULE as the key prefix, and it works well.
» Memcached has no namespaces, so try to design keys that don’t need ‘em. In a bind, you can use some external value that you increment when you need a “new” namespace.
158 The final tool I’ll look at is Capistrano. Although it’s classified as a deployment utility, you can really think of Capistrano as a tool to run the same command on a bunch of servers at once. The most useful command is svn update, but you can really run anything.
159 Once you end up with multiple web servers, keeping ‘em in sync is hard, and NFS is failure-prone. Deployment tools keep you sane.
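The two memcached key tricks from note 157 (a per-site prefix, and an incrementing value that fakes a namespace) can be sketched together. This is an illustration under assumptions, not the authors' code: `NamespacedCache` is a hypothetical wrapper, a plain dict stands in for the memcached client, and the prefixes play the role of each site's DJANGO_SETTINGS_MODULE value.

```python
class NamespacedCache:
    """Prefix every key with a site identifier and a namespace version."""

    def __init__(self, backend, site_prefix):
        self.backend = backend          # dict-like store standing in for memcached
        self.site_prefix = site_prefix  # e.g. the site's settings module name

    def _version_key(self, namespace):
        return "%s:ns:%s" % (self.site_prefix, namespace)

    def _version(self, namespace):
        # The "external value": a counter stored alongside the cached data.
        return self.backend.setdefault(self._version_key(namespace), 1)

    def make_key(self, namespace, key):
        return "%s:%s:v%d:%s" % (
            self.site_prefix, namespace, self._version(namespace), key)

    def set(self, namespace, key, value):
        self.backend[self.make_key(namespace, key)] = value

    def get(self, namespace, key):
        return self.backend.get(self.make_key(namespace, key))

    def invalidate(self, namespace):
        # Bumping the counter makes every old key in the namespace unreachable;
        # the stale entries are simply left to expire.
        self.backend[self._version_key(namespace)] = self._version(namespace) + 1


shared = {}  # one cache shared by two sites
one = NamespacedCache(shared, "one.settings")
two = NamespacedCache(shared, "two.settings")
one.set("pages", "/A/", "page from site one")
two.set("pages", "/A/", "page from site two")
```

Calling `one.invalidate("pages")` then starts a fresh “pages” namespace for the first site without touching the second site's entries.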
  46. 46. 160 Yes, it’s Ruby :) The Capistrano DSL, though, is pretty sweet; here I’m defining a remote command I can easily run with cap upgrade_project. I can’t really show many more code examples since each site will be different, but I suggest just reading through the manual and playing around; it’s really not very hard.
161 A couple of tricks we’ve learned:
» If you’ve got a “restart” task (to reload Apache or whatever), make sure to stagger the restarts so you don’t have any downtime.
» Capistrano is great to combine with a build process. We use it to crunch and combine JavaScript, and it rocks. deploy javascript combines the build process and the roll-out process.
» It’s also a good idea to bake cache-busting into your code deployment task.
162 http://www.unessa.net/en/hoyci/2007/06/using-capistrano-deploy-django-apps/ has a good introduction to using Capistrano with Django.
  47. 47. 163 164 This material wasn’t ready when the handouts needed to go to print, but it’ll be available online.
165 © 2007 Dunck, Kaplan-Moss, Willison. All rights reserved.