Making Django and NoSQL Play Nice

Talk given at DjangoCon.eu 2010.

    1. Making Django and NoSQL Play Nice. Alex Gaynor, Berlin.
    2. NoSQL: any database that doesn’t speak SQL. Usually non-relational databases, e.g. Cassandra, Redis, MongoDB.
    3. 2 Part Talk: 50% Current Internals, 50% Coming Changes.
    4. What does playing nice mean?
           from mongoengine import connection

           def my_view(request):
               objects = connection.do_something()
       BAD
    5. What does playing nice mean? (continued)
           # settings.py
           DATABASES = {
               "default": {
                   "ENGINE": "django_mongo",
               }
           }

           # models.py
           from django.db import models

           class MyPerfectlyNormalModel(models.Model):
               name = models.CharField(max_length=12)
       GOOD
    6. Why do we care? Admin, Forms, Serializers, Model validation, API Generators, Metadata. Makes my brain hurt less.
    7. Into the rabbit hole we go!
    8. Lay of the land: Models, Managers, QuerySets, Queries, Compilers, Backends.
    9. Models:
           from django.db import models

           class Category(models.Model):
               name = models.CharField(max_length=100)
               slug = models.SlugField()
               parent = models.ForeignKey("self", null=True)
    10. Managers: Category.objects
    11. QuerySets: Category.objects.get_query_set()
    12. Queries: Category.objects.get_query_set().query
    13. Compilers:
           qs = Category.objects.get_query_set()
           qs.query.get_compiler(qs.db)
    14. QuerySets: the whole damned thing.
    15. QuerySet: This is the top layer of query state. From here on out it’s like an onion. Not backend specific. Attributes: _db, _result_cache, _iter, query.
    16. Query: Right now this holds all state for a query. It’s semi-backend specific. Right now there’s one, and it’s specific to SQL backends. Computes most JOINs, aggregates, etc. Translates Q objects into Where objects.
    17. The Query Problem: It’s something of a lossy translation. Translating filter(), values(), and other calls into internal data structures is lossless with respect to the database being used, but not with respect to other databases. If you’ve got all SQL databases you’re fine, but if you mix in a non-relational DB you’ve got problems. More on this later.
    18. SQLCompiler: Takes a Query and a connection and turns it into SQL (and executes it). This also does some computation of joins (for select_related). This is the totally backend specific part (the only part that knows about the actual connection and database). The rest of the chain just *assumes* a SQL db.
    19. django.db.backends.*: This is where backends live. Not super exciting. A bunch of flags and methods to control very small parts of SQL creation. Also introspection, creation, and shell.
    20. You call methods on a QuerySet,
           which calls methods on a Query,
           which mutates some data structures.
       You evaluate a QuerySet,
           which asks its Query for a Compiler,
           which generates some SQL,
           which calls some methods on the backend,
           which gets a cursor and evaluates it.
    21. The Problem
    22. Query thinks in terms of SQL: it chooses between join types, it generates table aliases, it splits filters between HAVING and WHERE, and probably some other stuff.
    23. Why is this an issue? How do I ask a MongoDBCompiler to compile a LEFT OUTER JOIN vs. an INNER JOIN? Or a HAVING vs. a WHERE? These concepts don’t map cleanly, so the translation is lossy across backends.
    24. Design Decisions: Not everything is a technical problem.
    25. Do we emulate JOINs? Category.objects.filter(parent__parent__name="Tech")
    26. Do we maintain secondary indices? Category.objects.filter(name="Tech")
    27. Different databases have different features: True of SQL databases, but more so for non-relational databases. There is no lingua franca like SQL.
    28. A solution: Or something close enough...
    29. Make Query do less: Instead of generating two trees of WHERE and HAVING, generate a single tree of filters. Don’t generate JOINs at all. Push that all down to the compiler.
    30. Make SQLCompiler do more: Generate all JOINs. Split the filter tree into HAVING vs. WHERE. Can generate more efficient JOINs with global knowledge. Probably makes it easier to fix some other ORM bugs.
    31. Plan of action: 1) Change the ORM up. 2) Build MongoDB prototype backend. 3) ??? 4) Profit.
    32. http://alexgaynor.net/ Slides will be up there
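
The split proposed in slides 29 and 30 (Query keeps one backend-neutral filter tree, each compiler decides what to do with it) can be sketched in plain Python without Django. This is an illustrative sketch under stated assumptions, not Django's actual internals: Filter, Node, Query, SQLCompiler, and DocumentCompiler below are hypothetical stand-ins, the "depth" field is made up for the example, and the document compiler only builds a MongoDB-style filter dict rather than talking to a real database.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass
class Filter:
    """One lookup, e.g. name exact "Tech" (hypothetical structure)."""
    field: str
    op: str  # "exact" or "gt" in this sketch
    value: Any


@dataclass
class Node:
    """A group of filters joined by AND or OR."""
    connector: str
    children: List[Union["Node", Filter]]


class Query:
    """Backend-neutral: records what was asked for, nothing SQL-specific."""

    def __init__(self, collection):
        self.collection = collection
        self.filters = Node("AND", [])

    def add_filter(self, field, op, value):
        self.filters.children.append(Filter(field, op, value))


class SQLCompiler:
    """Turns the filter tree into a parameterized WHERE clause."""
    OPS = {"exact": "=", "gt": ">"}

    def compile(self, query):
        where, params = self._node(query.filters)
        sql = "SELECT * FROM %s" % query.collection
        if where:
            sql += " WHERE " + where
        return sql, params

    def _node(self, node):
        if isinstance(node, Filter):
            return "%s %s %%s" % (node.field, self.OPS[node.op]), [node.value]
        parts, params = [], []
        for child in node.children:
            child_sql, child_params = self._node(child)
            if child_sql:
                parts.append(child_sql)
                params.extend(child_params)
        if not parts:
            return "", []
        joined = (" %s " % node.connector).join(parts)
        if len(parts) > 1:
            joined = "(%s)" % joined
        return joined, params


class DocumentCompiler:
    """Turns the same tree into a MongoDB-style filter document."""

    def compile(self, query):
        return query.collection, self._node(query.filters)

    def _node(self, node):
        if isinstance(node, Filter):
            if node.op == "exact":
                return {node.field: node.value}
            return {node.field: {"$" + node.op: node.value}}
        children = [self._node(child) for child in node.children]
        if not children:
            return {}
        if len(children) == 1:
            return children[0]
        return {"$" + node.connector.lower(): children}


# The same Query object is handed to both compilers.
query = Query("category")
query.add_filter("name", "exact", "Tech")
query.add_filter("depth", "gt", 1)

sql, params = SQLCompiler().compile(query)
collection, spec = DocumentCompiler().compile(query)
```

Because Query never decides on JOIN types, aliases, or HAVING vs. WHERE, nothing is lost before a compiler sees the tree; each compiler is free to translate, reject, or emulate what its database cannot express.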
