Advanced Django ORM techniques

D
Daniel RosemanSoftware Developer at Government Digital Service
Advanced Django ORM
     techniques
 Daniel Roseman   http://blog.roseman.org.uk
About Me
• Python user for five years
• Discovered Django four years ago
• Worked full-time with Python/Django since
  2008.
• Top Django answerer on StackOverflow!
• Occasionally blog on Django, concentrating
  on efficient use of the ORM.
Contents

• Behind the scenes: models and fields
• How model relationships work
• More efficient relationships
• Other optimising techniques
Django ORM
efficiency: a story
Advanced Django ORM techniques
414 queries!
How can you stop this
 happening to you?


             http://www.flickr.com/photos/m0n0/4479450696
Behind the scenes:
models and fields


               http://www.flickr.com/photos/spacesuitcatalyst/847530840
Defining a model

• Model structure initialised via metaclass
• Called when model is first defined
• Resulting model class stored in cache to
  use when instantiated
Fields

• Fields have contribute_to_class
• Adds methods, eg get_FOO_display()
• Enables use of descriptors for field access
Model metadata

•   Model._meta

•   .fields

•   .get_field(fieldname)

•   .get_all_related_objects()
Model instantiation

• Instance is populated from database initially
• Has no subsequent relationship with db
  until save
• No identity between models
Querysets
• Model=manager returns a queryset:
  foos Foo.objects.all()

• Queryset is an ordered list of instances
  of a single model
• No database access yet
• Slice: foos[0]
• Iterate: {% for foo in foos %}
Where do all those
  queries come from?
• Repeated queries
• Lack of caching
• Relational lookup
• Templates as well as views
Repeated queries
    def get_absolute_url(self):
      return "%s/%s" % (
         self.category.slug,
         self.slug
      )


    Same category, but query is
    repeated for each article
Repeated queries
• Same link on every
  page

• Dynamic, so can't
  go in urlconf

• Could be cached
  or memoized
Relationships




        http://www.flickr.com/photos/katietegtmeyer/124315322
Relational lookups

• Forwards:
  foo.bar.field



• Backwards:
  bar.foo_set.all()
Example models
class Foo(models.Model):
 name = models.CharField(max_length=10)


class Bar(models.Model):
 name = models.CharField(max_length=10)
 foo = models.ForeignKey(Foo)
Forwards relationship

>>> bar = Bar.objects.all()[0]
>>> bar.__dict__
{'id': 1, 'foo_id': 1, 'name': u'item1'}
Forwards relationship
>>> bar.foo.name
u'item1'
>>> bar.__dict__
{'_foo_cache': <Foo: Foo object>, 'id': 1,
'foo_id': 1, 'name': u'item1'}
Fowards relationships
• Relational access implemented via a
    descriptor:
    django.db.models.fields.related.
    SingleRelatedObjectDescriptor

•   __get__ tries to access _foo_cache

• If doesn't exist, does lookup and creates
    cache
select_related
• Automatically follows foreign keys in SQL
  query
• Prepopulates _foo_cache
• Doesn't follow null=True relationships by
  default
• Makes query more expensive, so be sure
  you need it
Backwards relationships
{% for foo in my_foos %}
 {% for bar in foo.bar_set.all %}
  {{ bar.name }}
 {% endfor %}
{% endfor %}
Backwards relationships
• One query per foo
• If you iterate over foo_set again, you
  generate a new set of db hits
• No _foo_cache
• select_related does not work here
Optimising backwards
    relationships

• Get all related objects at once
• Sort by ID of parent object
• Then cache in hidden attribute as with
  select_related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
qs = Foo.objects.filter(criteria=whatever)
obj_dict = dict([(obj.id, obj)
           for obj in qs])
objects = Bar.objects.filter(foo__in=qs)
relation_dict = {}
for obj in objects:
 relation_dict.setdefault(
          obj.foo_id, []).append(obj)
for id, related in relation_dict.items():
 obj_dict[id]._related = related
Optimising backwards
[{'time': '0.000', 'sql': u'SELECT
"foobar_foo"."id", "foobar_foo"."name" FROM
"foobar_foo"'},
{'time': '0.000', 'sql': u'SELECT
"foobar_bar"."id", "foobar_bar"."name",
"foobar_bar"."foo_id" FROM "foobar_bar"
WHERE "foobar_bar"."foo_id" IN (SELECT
U0."id" FROM "foobar_foo" U0)'}]
Optimising backwards

• Still quite expensive, as can mean large
  dependent subquery – MySQL in particular
  very bad at these
• But now just two queries instead of n
• Not automatic – need to remember to use
  _related_items attribute
Generic relations
• Foreign key to ContentType, object_id
• Descriptor to enable direct access
• iterating through creates n+m
  queries(n=number of source objects,
  m=number of different content types)
• ContentType objects automatically cached
• Forwards relationship creates _foo_cache
• but select_related doesn't work
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
generics = {}
for item in queryset:
  generics.setdefault(item.content_type_id,
                 
 set()).add(item.object_id)
content_types = ContentType.objects.in_bulk(
                 generics.keys())
relations = {}
for ct, fk_list in generics.items():
 ct_model = content_types[ct].model_class()
 relations[ct] = ct_model.objects.
           in_bulk(list(fk_list))
for item in queryset:
 setattr(item, '_content_object_cache',
    relations[content_type_id][item.object_id]
 )
Other optimising
  techniques
Memoizing
• Cache property on first access
• Can cache within instance, if multiple
  accesses within same request
def get_expensive_items(self):
 if not hasattr(self, '_cache'):
  self._cache = self.expensive_op()
 return self._cache
DB Indexes

• Pay attention to slow query log and
  debug toolbar output
• Add extra indexes where necessary -
  especially for multiple-column lookup
• Use EXPLAIN
Outsourcing

• Does all the logic need to go in the web
  app?
• Services - via eg Piston
• Message queues
• Distributed tasks, eg Celery
Summary

• Understand where queries are coming
  from
• Optimise where necessary, within Django
  or in the database
• and...
PROFILE
Daniel Roseman

http://blog.roseman.org.uk
1 of 51

Recommended

PHP - Introduction to Object Oriented Programming with PHP by
PHP -  Introduction to  Object Oriented Programming with PHPPHP -  Introduction to  Object Oriented Programming with PHP
PHP - Introduction to Object Oriented Programming with PHPVibrant Technologies & Computers
2.8K views22 slides
PHP - Introduction to PHP Forms by
PHP - Introduction to PHP FormsPHP - Introduction to PHP Forms
PHP - Introduction to PHP FormsVibrant Technologies & Computers
579 views30 slides
HTML and CSS crash course! by
HTML and CSS crash course!HTML and CSS crash course!
HTML and CSS crash course!Ana Cidre
2.7K views29 slides
PHP Cookies and Sessions by
PHP Cookies and SessionsPHP Cookies and Sessions
PHP Cookies and SessionsNisa Soomro
4.9K views14 slides
Advanced Javascript by
Advanced JavascriptAdvanced Javascript
Advanced JavascriptAdieu
8.2K views88 slides
Fundamental JavaScript [UTC, March 2014] by
Fundamental JavaScript [UTC, March 2014]Fundamental JavaScript [UTC, March 2014]
Fundamental JavaScript [UTC, March 2014]Aaron Gustafson
8.5K views105 slides

More Related Content

What's hot

Javascript 101 by
Javascript 101Javascript 101
Javascript 101Shlomi Komemi
2.1K views32 slides
Php with MYSQL Database by
Php with MYSQL DatabasePhp with MYSQL Database
Php with MYSQL DatabaseComputer Hardware & Trouble shooting
6.5K views52 slides
Regular Expressions in PHP by
Regular Expressions in PHPRegular Expressions in PHP
Regular Expressions in PHPAndrew Kandels
3.8K views20 slides
Form Handling using PHP by
Form Handling using PHPForm Handling using PHP
Form Handling using PHPNisa Soomro
3.5K views14 slides
html5.ppt by
html5.ppthtml5.ppt
html5.pptNiharika Gupta
42K views13 slides
Class 7 - PHP Object Oriented Programming by
Class 7 - PHP Object Oriented ProgrammingClass 7 - PHP Object Oriented Programming
Class 7 - PHP Object Oriented ProgrammingAhmed Swilam
2.6K views50 slides

What's hot(20)

Regular Expressions in PHP by Andrew Kandels
Regular Expressions in PHPRegular Expressions in PHP
Regular Expressions in PHP
Andrew Kandels3.8K views
Form Handling using PHP by Nisa Soomro
Form Handling using PHPForm Handling using PHP
Form Handling using PHP
Nisa Soomro3.5K views
Class 7 - PHP Object Oriented Programming by Ahmed Swilam
Class 7 - PHP Object Oriented ProgrammingClass 7 - PHP Object Oriented Programming
Class 7 - PHP Object Oriented Programming
Ahmed Swilam2.6K views
Introduction to HTML5 by IT Geeks
Introduction to HTML5Introduction to HTML5
Introduction to HTML5
IT Geeks905 views
JavaScript - Chapter 13 - Browser Object Model(BOM) by WebStackAcademy
JavaScript - Chapter 13 - Browser Object Model(BOM)JavaScript - Chapter 13 - Browser Object Model(BOM)
JavaScript - Chapter 13 - Browser Object Model(BOM)
WebStackAcademy3.4K views
Chap 4 PHP.pdf by HASENSEID
Chap 4 PHP.pdfChap 4 PHP.pdf
Chap 4 PHP.pdf
HASENSEID1.2K views
Introduction to SQLAlchemy and Alembic Migrations by Jason Myers
Introduction to SQLAlchemy and Alembic MigrationsIntroduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic Migrations
Jason Myers6.3K views
CSS Transitions, Transforms, Animations by Rob LaPlaca
CSS Transitions, Transforms, Animations CSS Transitions, Transforms, Animations
CSS Transitions, Transforms, Animations
Rob LaPlaca6.9K views
PHP Workshop Notes by Pamela Fox
PHP Workshop NotesPHP Workshop Notes
PHP Workshop Notes
Pamela Fox5.9K views

Viewers also liked

Advanced Django by
Advanced DjangoAdvanced Django
Advanced DjangoSimon Willison
55.4K views81 slides
What's new in Django 1.7 by
What's new in Django 1.7What's new in Django 1.7
What's new in Django 1.7Daniel Roseman
2K views17 slides
Django orm-tips by
Django orm-tipsDjango orm-tips
Django orm-tipsTareque Hossain
1.3K views7 slides
Basic Django ORM by
Basic Django ORMBasic Django ORM
Basic Django ORMAyun Park
1.3K views9 slides
Django ORM by
Django ORMDjango ORM
Django ORMJunsu Kim
480 views32 slides
Introduction to Django REST Framework, an easy way to build REST framework in... by
Introduction to Django REST Framework, an easy way to build REST framework in...Introduction to Django REST Framework, an easy way to build REST framework in...
Introduction to Django REST Framework, an easy way to build REST framework in...Zhe Li
2.6K views23 slides

Viewers also liked(19)

Basic Django ORM by Ayun Park
Basic Django ORMBasic Django ORM
Basic Django ORM
Ayun Park1.3K views
Django ORM by Junsu Kim
Django ORMDjango ORM
Django ORM
Junsu Kim480 views
Introduction to Django REST Framework, an easy way to build REST framework in... by Zhe Li
Introduction to Django REST Framework, an easy way to build REST framework in...Introduction to Django REST Framework, an easy way to build REST framework in...
Introduction to Django REST Framework, an easy way to build REST framework in...
Zhe Li2.6K views
Django: Advanced Models by Ying-An Lai
Django: Advanced ModelsDjango: Advanced Models
Django: Advanced Models
Ying-An Lai827 views
Django In Depth by ubernostrum
Django In DepthDjango In Depth
Django In Depth
ubernostrum16.5K views
Django REST Framework by Load Impact
Django REST FrameworkDjango REST Framework
Django REST Framework
Load Impact1.9K views
REST Easy with Django-Rest-Framework by Marcel Chastain
REST Easy with Django-Rest-FrameworkREST Easy with Django-Rest-Framework
REST Easy with Django-Rest-Framework
Marcel Chastain7.5K views
Full Stack & Full Circle: What the Heck Happens In an HTTP Request-Response C... by Carina C. Zona
Full Stack & Full Circle: What the Heck Happens In an HTTP Request-Response C...Full Stack & Full Circle: What the Heck Happens In an HTTP Request-Response C...
Full Stack & Full Circle: What the Heck Happens In an HTTP Request-Response C...
Carina C. Zona7.5K views
12 tips on Django Best Practices by David Arcos
12 tips on Django Best Practices12 tips on Django Best Practices
12 tips on Django Best Practices
David Arcos50.8K views
Tabela de números romanos by Dann Senda
Tabela de números romanosTabela de números romanos
Tabela de números romanos
Dann Senda63.1K views
(PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014 by Amazon Web Services
(PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014(PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014
(PFC403) Maximizing Amazon S3 Performance | AWS re:Invent 2014
Amazon Web Services49.5K views
Top 10 senior technical architect interview questions and answers by tonychoper5406
Top 10 senior technical architect interview questions and answersTop 10 senior technical architect interview questions and answers
Top 10 senior technical architect interview questions and answers
tonychoper540612.2K views
facebook architecture for 600M users by Jongyoon Choi
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
Jongyoon Choi72K views
Web Development with Python and Django by Michael Pirnat
Web Development with Python and DjangoWeb Development with Python and Django
Web Development with Python and Django
Michael Pirnat188.7K views

Similar to Advanced Django ORM techniques

Hibernate Tutorial for beginners by
Hibernate Tutorial for beginnersHibernate Tutorial for beginners
Hibernate Tutorial for beginnersrajkamal560066
96 views30 slides
Powerful Generic Patterns With Django by
Powerful Generic Patterns With DjangoPowerful Generic Patterns With Django
Powerful Generic Patterns With DjangoEric Satterwhite
8.4K views74 slides
Django workshop : let's make a blog by
Django workshop : let's make a blogDjango workshop : let's make a blog
Django workshop : let's make a blogPierre Sudron
2.1K views40 slides
Django Search by
Django SearchDjango Search
Django SearchPeter Herndon
1.7K views9 slides
The Django Book, Chapter 16: django.contrib by
The Django Book, Chapter 16: django.contribThe Django Book, Chapter 16: django.contrib
The Django Book, Chapter 16: django.contribTzu-ping Chung
1.2K views32 slides
Backbone.js Simple Tutorial by
Backbone.js Simple TutorialBackbone.js Simple Tutorial
Backbone.js Simple Tutorial추근 문
4.7K views16 slides

Similar to Advanced Django ORM techniques(20)

Hibernate Tutorial for beginners by rajkamal560066
Hibernate Tutorial for beginnersHibernate Tutorial for beginners
Hibernate Tutorial for beginners
rajkamal56006696 views
Powerful Generic Patterns With Django by Eric Satterwhite
Powerful Generic Patterns With DjangoPowerful Generic Patterns With Django
Powerful Generic Patterns With Django
Eric Satterwhite8.4K views
Django workshop : let's make a blog by Pierre Sudron
Django workshop : let's make a blogDjango workshop : let's make a blog
Django workshop : let's make a blog
Pierre Sudron2.1K views
The Django Book, Chapter 16: django.contrib by Tzu-ping Chung
The Django Book, Chapter 16: django.contribThe Django Book, Chapter 16: django.contrib
The Django Book, Chapter 16: django.contrib
Tzu-ping Chung1.2K views
Backbone.js Simple Tutorial by 추근 문
Backbone.js Simple TutorialBackbone.js Simple Tutorial
Backbone.js Simple Tutorial
추근 문4.7K views
Django class based views (Dutch Django meeting presentation) by Reinout van Rees
Django class based views (Dutch Django meeting presentation)Django class based views (Dutch Django meeting presentation)
Django class based views (Dutch Django meeting presentation)
Reinout van Rees1.4K views
Chap 3 Python Object Oriented Programming - Copy.ppt by muneshwarbisen1
Chap 3 Python Object Oriented Programming - Copy.pptChap 3 Python Object Oriented Programming - Copy.ppt
Chap 3 Python Object Oriented Programming - Copy.ppt
muneshwarbisen116 views
اسلاید جلسه ۹ کلاس پایتون برای هکر های قانونی by Mohammad Reza Kamalifard
اسلاید جلسه ۹ کلاس پایتون برای هکر های قانونیاسلاید جلسه ۹ کلاس پایتون برای هکر های قانونی
اسلاید جلسه ۹ کلاس پایتون برای هکر های قانونی
Firebase for Apple Developers by Peter Friese
Firebase for Apple DevelopersFirebase for Apple Developers
Firebase for Apple Developers
Peter Friese101 views
Hands On Spring Data by Eric Bottard
Hands On Spring DataHands On Spring Data
Hands On Spring Data
Eric Bottard1.6K views
A Related Matter: Optimizing your webapp by using django-debug-toolbar, selec... by Christopher Adams
A Related Matter: Optimizing your webapp by using django-debug-toolbar, selec...A Related Matter: Optimizing your webapp by using django-debug-toolbar, selec...
A Related Matter: Optimizing your webapp by using django-debug-toolbar, selec...
Christopher Adams5.3K views
Django Class-based views (Slovenian) by Luka Zakrajšek
Django Class-based views (Slovenian)Django Class-based views (Slovenian)
Django Class-based views (Slovenian)
Luka Zakrajšek1.1K views
Django Forms: Best Practices, Tips, Tricks by Shawn Rider
Django Forms: Best Practices, Tips, TricksDjango Forms: Best Practices, Tips, Tricks
Django Forms: Best Practices, Tips, Tricks
Shawn Rider23.4K views
Mongo and Harmony by Steve Smith
Mongo and HarmonyMongo and Harmony
Mongo and Harmony
Steve Smith1.5K views
Declarative Data Modeling in Python by Joshua Forman
Declarative Data Modeling in PythonDeclarative Data Modeling in Python
Declarative Data Modeling in Python
Joshua Forman834 views
Alfresco Content Modelling and Policy Behaviours by J V
Alfresco Content Modelling and Policy BehavioursAlfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy Behaviours
J V2.7K views
Core data in Swfit by allanh0526
Core data in SwfitCore data in Swfit
Core data in Swfit
allanh0526114 views

Recently uploaded

Empathic Computing: Delivering the Potential of the Metaverse by
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the MetaverseMark Billinghurst
470 views80 slides
SAP Automation Using Bar Code and FIORI.pdf by
SAP Automation Using Bar Code and FIORI.pdfSAP Automation Using Bar Code and FIORI.pdf
SAP Automation Using Bar Code and FIORI.pdfVirendra Rai, PMP
19 views38 slides
Voice Logger - Telephony Integration Solution at Aegis by
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at AegisNirmal Sharma
17 views1 slide
Report 2030 Digital Decade by
Report 2030 Digital DecadeReport 2030 Digital Decade
Report 2030 Digital DecadeMassimo Talia
14 views41 slides
Black and White Modern Science Presentation.pptx by
Black and White Modern Science Presentation.pptxBlack and White Modern Science Presentation.pptx
Black and White Modern Science Presentation.pptxmaryamkhalid2916
14 views21 slides

Recently uploaded(20)

Empathic Computing: Delivering the Potential of the Metaverse by Mark Billinghurst
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the Metaverse
Mark Billinghurst470 views
SAP Automation Using Bar Code and FIORI.pdf by Virendra Rai, PMP
SAP Automation Using Bar Code and FIORI.pdfSAP Automation Using Bar Code and FIORI.pdf
SAP Automation Using Bar Code and FIORI.pdf
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma17 views
Black and White Modern Science Presentation.pptx by maryamkhalid2916
Black and White Modern Science Presentation.pptxBlack and White Modern Science Presentation.pptx
Black and White Modern Science Presentation.pptx
maryamkhalid291614 views
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas... by Bernd Ruecker
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
iSAQB Software Architecture Gathering 2023: How Process Orchestration Increas...
Bernd Ruecker26 views
Unit 1_Lecture 2_Physical Design of IoT.pdf by StephenTec
Unit 1_Lecture 2_Physical Design of IoT.pdfUnit 1_Lecture 2_Physical Design of IoT.pdf
Unit 1_Lecture 2_Physical Design of IoT.pdf
StephenTec11 views
handbook for web 3 adoption.pdf by Liveplex
handbook for web 3 adoption.pdfhandbook for web 3 adoption.pdf
handbook for web 3 adoption.pdf
Liveplex19 views
Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2216 views
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by Network Automation Forum
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman27 views
Spesifikasi Lengkap ASUS Vivobook Go 14 by Dot Semarang
Spesifikasi Lengkap ASUS Vivobook Go 14Spesifikasi Lengkap ASUS Vivobook Go 14
Spesifikasi Lengkap ASUS Vivobook Go 14
Dot Semarang35 views
Data-centric AI and the convergence of data and model engineering: opportunit... by Paolo Missier
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...
Paolo Missier34 views
STPI OctaNE CoE Brochure.pdf by madhurjyapb
STPI OctaNE CoE Brochure.pdfSTPI OctaNE CoE Brochure.pdf
STPI OctaNE CoE Brochure.pdf
madhurjyapb12 views

Advanced Django ORM techniques

  • 1. Advanced Django ORM techniques Daniel Roseman http://blog.roseman.org.uk
  • 2. About Me • Python user for five years • Discovered Django four years ago • Worked full-time with Python/Django since 2008. • Top Django answerer on StackOverflow! • Occasionally blog on Django, concentrating on efficient use of the ORM.
  • 3. Contents • Behind the scenes: models and fields • How model relationships work • More efficient relationships • Other optimising techniques
  • 7. How can you stop this happening to you? http://www.flickr.com/photos/m0n0/4479450696
  • 8. Behind the scenes: models and fields http://www.flickr.com/photos/spacesuitcatalyst/847530840
  • 9. Defining a model • Model structure initialised via metaclass • Called when model is first defined • Resulting model class stored in cache to use when instantiated
  • 10. Fields • Fields have contribute_to_class • Adds methods, eg get_FOO_display() • Enables use of descriptors for field access
  • 11. Model metadata • Model._meta • .fields • .get_field(fieldname) • .get_all_related_objects()
  • 12. Model instantiation • Instance is populated from database initially • Has no subsequent relationship with db until save • No identity between models
  • 13. Querysets • Model=manager returns a queryset: foos Foo.objects.all() • Queryset is an ordered list of instances of a single model • No database access yet • Slice: foos[0] • Iterate: {% for foo in foos %}
  • 14. Where do all those queries come from? • Repeated queries • Lack of caching • Relational lookup • Templates as well as views
  • 15. Repeated queries def get_absolute_url(self): return "%s/%s" % ( self.category.slug, self.slug ) Same category, but query is repeated for each article
  • 16. Repeated queries • Same link on every page • Dynamic, so can't go in urlconf • Could be cached or memoized
  • 17. Relationships http://www.flickr.com/photos/katietegtmeyer/124315322
  • 18. Relational lookups • Forwards: foo.bar.field • Backwards: bar.foo_set.all()
  • 19. Example models class Foo(models.Model): name = models.CharField(max_length=10) class Bar(models.Model): name = models.CharField(max_length=10) foo = models.ForeignKey(Foo)
  • 20. Forwards relationship >>> bar = Bar.objects.all()[0] >>> bar.__dict__ {'id': 1, 'foo_id': 1, 'name': u'item1'}
  • 21. Forwards relationship >>> bar.foo.name u'item1' >>> bar.__dict__ {'_foo_cache': <Foo: Foo object>, 'id': 1, 'foo_id': 1, 'name': u'item1'}
  • 22. Fowards relationships • Relational access implemented via a descriptor: django.db.models.fields.related. SingleRelatedObjectDescriptor • __get__ tries to access _foo_cache • If doesn't exist, does lookup and creates cache
  • 23. select_related • Automatically follows foreign keys in SQL query • Prepopulates _foo_cache • Doesn't follow null=True relationships by default • Makes query more expensive, so be sure you need it
  • 24. Backwards relationships {% for foo in my_foos %} {% for bar in foo.bar_set.all %} {{ bar.name }} {% endfor %} {% endfor %}
  • 25. Backwards relationships • One query per foo • If you iterate over foo_set again, you generate a new set of db hits • No _foo_cache • select_related does not work here
  • 26. Optimising backwards relationships • Get all related objects at once • Sort by ID of parent object • Then cache in hidden attribute as with select_related
  • 27. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 28. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 29. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 30. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 31. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 32. qs = Foo.objects.filter(criteria=whatever) obj_dict = dict([(obj.id, obj) for obj in qs]) objects = Bar.objects.filter(foo__in=qs) relation_dict = {} for obj in objects: relation_dict.setdefault( obj.foo_id, []).append(obj) for id, related in relation_dict.items(): obj_dict[id]._related = related
  • 33. Optimising backwards [{'time': '0.000', 'sql': u'SELECT "foobar_foo"."id", "foobar_foo"."name" FROM "foobar_foo"'}, {'time': '0.000', 'sql': u'SELECT "foobar_bar"."id", "foobar_bar"."name", "foobar_bar"."foo_id" FROM "foobar_bar" WHERE "foobar_bar"."foo_id" IN (SELECT U0."id" FROM "foobar_foo" U0)'}]
  • 34. Optimising backwards • Still quite expensive, as can mean large dependent subquery – MySQL in particular very bad at these • But now just two queries instead of n • Not automatic – need to remember to use _related_items attribute
  • 35. Generic relations • Foreign key to ContentType, object_id • Descriptor to enable direct access • iterating through creates n+m queries(n=number of source objects, m=number of different content types) • ContentType objects automatically cached • Forwards relationship creates _foo_cache • but select_related doesn't work
  • 36. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 37. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 38. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 39. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 40. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 41. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 42. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 43. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 44. generics = {} for item in queryset: generics.setdefault(item.content_type_id, set()).add(item.object_id) content_types = ContentType.objects.in_bulk( generics.keys()) relations = {} for ct, fk_list in generics.items(): ct_model = content_types[ct].model_class() relations[ct] = ct_model.objects. in_bulk(list(fk_list)) for item in queryset: setattr(item, '_content_object_cache', relations[content_type_id][item.object_id] )
  • 45. Other optimising techniques
  • 46. Memoizing • Cache property on first access • Can cache within instance, if multiple accesses within same request def get_expensive_items(self): if not hasattr(self, '_cache'): self._cache = self.expensive_op() return self._cache
  • 47. DB Indexes • Pay attention to slow query log and debug toolbar output • Add extra indexes where necessary - especially for multiple-column lookup • Use EXPLAIN
  • 48. Outsourcing • Does all the logic need to go in the web app? • Services - via eg Piston • Message queues • Distributed tasks, eg Celery
  • 49. Summary • Understand where queries are coming from • Optimise where necessary, within Django or in the database • and...

Editor's Notes

  1. (background: montage of Limmud, rosemanblog, Capital, Classic, Heart, GlassesDirect)
  2. Some of same ideas in Guido&apos;s Appstats talk this morning
  3. It&apos;s a model, in a field, geddit?
  4. For more, see Marty Alchin, Pro Django (Apress)
  5. descriptors used especially in related objects - see later
  6. Very useful for introspection and working out what&apos;s going on
  7. explain identity: multiple instances relating to same model row aren&apos;t the same object, changes made to one don&apos;t reflect the other; even saving one with new values won&apos;t be reflected in others.
  8. Update, Aggregates, Q, F
  9. Find repeated queries with my branch of the django-debug-toolbar, or SimonW&apos;s original query debug middleware
  10. Actually in 1.2 there&apos;s an extra _state object in __dict__, which is used for the multiple DB support (which I&apos;m not covering here).
  11. Lack of model identity means that accessing the related item on one instance does not cause cache to be created on other instances that might reference the same db row
  12. Note: backwards cache does work on OneToOne as of 1.2
  13. +----+--------------------+-----------+-----------------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-----------+-----------------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | sandy_bar | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 2 | DEPENDENT SUBQUERY | U0 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where | +----+-----------+----+--------------------+-----------+-----------------+---------------+---------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-----------+-----------------+---------------+---------+---------+------+------+-------------+ | 1 | PRIMARY | sandy_bar | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 2 | DEPENDENT SUBQUERY | U0 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where | +----+--------------------+-----------+-----------------+---------------+---------+---------+------+------+-------------+ --------+-----------+-----------------+---------------+---------+---------+------+------+-------------+