The amazing world
behind your ORM
Louise Grandjonc
About me
Louise Grandjonc (louise@ulule.com)
Lead developer at Ulule (www.ulule.com)
Django developer - Postgres enthusiast
@louisemeta on Twitter
Today’s agenda
1. How do we end up with performance problems?
2. How can we see them without roughly guessing how long you’re waiting before seeing your page?
3. What does it change in our everyday developer job?
How do we end up with
performance problems?
Why do we use ORMs?
(and why that’s not so terrible)
1. To annoy the DBAs
2. Because we can avoid having to worry about DB connections
3. We keep using our main language
4. We are a bit afraid of SQL
5. 90% of the time, we don’t really need to do more than really simple SELECT and INSERT, so why bother doing it worse than our ORM would?
Why we should know what our ORM is doing
Not looking at what happens will cause performance problems, because…
1. The ORM executes queries that you might not expect
2. Your queries might not be optimised and you won’t know about it
3. To make DBAs like you, even if you’re using an ORM
How can we see them without roughly
guessing how long you’re waiting
before seeing your page?
How can I see what is happening when I do stuff?
1. Django debug toolbar (to see queries and their EXPLAIN in your Django view)
Advantages: can be easily included in your Django templates
Problems: does not let you see everything (AJAX calls !); if you’re working on an API, you cannot use it!
2. Django devserver: puts all your database logs into your runserver output
Advantages: you’re not missing the AJAX calls
3. Simply look at your database logs
Advantages: you can see everything, you won’t be disturbed if you ever change project/programming language/framework/computer, and you can configure how you see your logs
Problems: you don’t know where your logs are?
Where are my logs?
owl_conference=# show log_directory ;
log_directory
---------------
pg_log
(1 row)
owl_conference=# show data_directory ;
data_directory
-------------------------
/usr/local/var/postgres
(1 row)
owl_conference=# show log_filename ;
log_filename
-------------------------
postgresql-%Y-%m-%d.log
(1 row)
Having good looking logs
(and logging everything like a crazy owl)
owl_conference=# SHOW config_file;
config_file
-----------------------------------------
/usr/local/var/postgres/postgresql.conf
(1 row)
In your postgresql.conf
log_filename = 'postgresql-%Y-%m-%d.log'
log_statement = 'all'
logging_collector = on
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,host=%h,app=%a'
Having good looking logs
Your logs should look like
user=owly,db=owl_conference,host=127.0.0.1,app=owl
LOG: statement: SELECT "owl"."id", "owl"."name",
"owl"."employer_name", "owl"."favourite_food", "owl"."job_id",
"owl"."fur_color" FROM "owl" WHERE "owl"."job_id" = 1 LIMIT 10
user=owly,db=owl_conference,host=127.0.0.1,app=owl
LOG: duration: 0.297 ms
The app= value comes from application_name in your Django settings:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'owl_conference',
        'USER': 'owly',
        'PASSWORD': 'mouseEating',
        'HOST': '127.0.0.1',
        'OPTIONS': {'application_name': 'owl'}
    }
}
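Once every statement is logged, the log file quickly gets long, so a few lines of Python can help pick out the slow ones. This is only a minimal sketch: it assumes the `user=…,db=…,host=…,app=…` part of the `log_line_prefix` shown in this section and `duration:` lines like the sample one. The names `parse_duration_line` and `slow_entries` are hypothetical helpers, not part of Django or PostgreSQL.

```python
import re

# Matches the key=value part of the prefix followed by a "duration: X ms" message.
# Assumption: the log_line_prefix from this section; adapt the pattern to your own.
DURATION_RE = re.compile(
    r"user=(?P<user>[^,]*),db=(?P<db>[^,]*),host=(?P<host>[^,]*),app=(?P<app>.*?)"
    r"\s*LOG:\s+duration: (?P<ms>\d+(?:\.\d+)?) ms"
)


def parse_duration_line(line):
    """Return a dict for a 'duration' log line, or None for any other line."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["ms"] = float(entry["ms"])
    return entry


def slow_entries(lines, threshold_ms):
    """Yield the parsed duration lines that took at least threshold_ms."""
    for line in lines:
        entry = parse_duration_line(line)
        if entry is not None and entry["ms"] >= threshold_ms:
            yield entry


sample = [
    "user=owly,db=owl_conference,host=127.0.0.1,app=owl LOG:  duration: 0.297 ms",
    "user=owly,db=owl_conference,host=127.0.0.1,app=owl LOG:  statement: SELECT 1",
    "user=owly,db=owl_conference,host=127.0.0.1,app=owl LOG:  duration: 120.5 ms",
]
slow = list(slow_entries(sample, threshold_ms=100))
```

In practice you would pass an open log file instead of the `sample` list; the 100 ms threshold is an arbitrary value chosen for the illustration.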
Yep ! I’ve seen my logs… But …
Where are these queries executed in my code?
Django only executes a query when it actually needs to use the objects (querysets are lazy) !
Let’s take an example…
Example
Query executed in: the template
View:
def index(request):
    owls = Owl.objects.filter(employer_name='Ulule')
    context = {'owls': owls}
    return render(request, 'owls/index.html', context)
Template:
{% for owl in owls %}
<p> {{ owl.name }} </p>
{% endfor %}
Query (runs when the template loops over owls):
SELECT "owl"."id", "owl"."name", "owl"."employer_name",
"owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM
"owl" WHERE "owl"."employer_name" = 'Ulule'
Example
Query executed in: the view
View:
def index(request):
    owls = Owl.objects.filter(employer_name='Ulule')
    owl_count = len(owls)
    context = {'owls': owls, 'owl_count': owl_count}
    return render(request, 'owls/index.html', context)
Template:
{% for owl in owls %}
<p> {{ owl.name }} </p>
{% endfor %}
Query (runs at len(owls) in the view; the template loop reuses the cached result):
SELECT "owl"."id", "owl"."name", "owl"."employer_name",
"owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM
"owl" WHERE "owl"."employer_name" = 'Ulule'
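The two examples differ only in where the rows are first needed. The toy class below tries to make that visible without Django: it is a hypothetical stand-in (`LazyQuery` is not Django's implementation, and the owl names are made up) that runs its "query" on first use and caches the rows afterwards.

```python
class LazyQuery:
    """Toy stand-in for a lazy queryset: nothing runs until the rows are needed."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that "hits the database"
        self._cache = None    # rows are kept here after the first evaluation

    def _rows(self):
        if self._cache is None:
            self._cache = self._fetch()
        return self._cache

    def __iter__(self):
        # A template {% for %} loop ends up iterating, like this.
        return iter(self._rows())

    def __len__(self):
        # len(owls) in the view ends up here.
        return len(self._rows())


executed = []  # every "query" that actually ran


def fetch_owls():
    executed.append("SELECT ... FROM owl WHERE employer_name = 'Ulule'")
    return ["Hedwig", "Errol"]


owls = LazyQuery(fetch_owls)
assert executed == []             # building the query runs nothing
owl_count = len(owls)             # first use: the query runs here
names = [name for name in owls]   # second use: served from the cache
```

The point of the sketch is the order of events: the place where the query shows up in your logs is the first place that touches the rows, not the line that builds the queryset.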
Yep ! I’ve seen my logs… But …
Where are these queries executed in my code?
How to spot where your query is executed?
1. Each model has a table to store data. Find the model.
2. Where in my view or in my form am I using this model to get/filter objects?
3. Where am I using these objects? Is it in my view/form? Are they passed into the context and used in templates?
What does it change in our everyday
developer job?
(Or how to really do something when you have a problem)
The two most common
problems of any ORM user…
1. I have way too many queries… Why ?
2. One of my queries is freakin' slow… Why?
Once upon a time… 1000 times
The danger of loops in your code, and how your templates are making fun of you…
1. Preload stuff ! The ORM executes queries when it needs the data. If you’re looping over a foreign key without any preload, it will query every time it needs the foreign key… Imagine you have a loop over 1 million objects. Use prefetch_related and select_related (see next slide)
2. In an ideal world, no query should ever be executed from your Django HTML template. All the data should be in your context; you should never have « surprise » queries from your templates !
Once upon a time… 1000 times
select_related or prefetch_related?
In Django, select_related and prefetch_related help you reduce the number of queries by preloading foreign keys or many-to-many relations.
1. select_related uses a join (only for foreign keys):
- Advantages: only one query
- Problem: if you are joining big tables, with a lot of columns and no index, it can be slow… We’ll talk about that next.
2. prefetch_related runs a second query on the related table (for foreign keys and many-to-many):
- Advantages: no big join
- Problem: more queries
Example … 1
Without preloading (one extra query each time owl.job is accessed):
def index(request):
    owls = Owl.objects.filter(employer_name='Ulule')
    context = {'owls': owls}
    for owl in owls:
        # do stuff
        owl.job
    return render(request, 'owls/index.html', context)
With select_related (the jobs are loaded by the same query):
def index(request):
    owls = (
        Owl.objects
        .filter(employer_name='Ulule')
        .select_related('job')
    )
    context = {'owls': owls}
    for owl in owls:
        # do stuff
        owl.job
    return render(request, 'owls/index.html', context)
Example … 1
Using select_related
owls = (
    Owl.objects
    .filter(employer_name='Ulule')
    .select_related('job')
)
SELECT "owl"."id", "owl"."name", "owl"."employer_name",
"owl"."favourite_food", "owl"."job_id", "owl"."fur_color",
"job"."id", "job"."name" FROM "owl" LEFT OUTER JOIN "job"
ON ("owl"."job_id" = "job"."id") WHERE
"owl"."employer_name" = 'Ulule'
Example … 1
Using prefetch_related
owls = (
    Owl.objects
    .filter(employer_name='Ulule')
    .prefetch_related('job')
)
SELECT "owl"."id", "owl"."name", "owl"."employer_name",
"owl"."favourite_food", "owl"."job_id", "owl"."fur_color"
FROM "owl" WHERE "owl"."employer_name" = 'Ulule'
SELECT "job"."id", "job"."name" FROM "job" WHERE "job"."id"
IN (2)
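To see the difference in query counts without a Django project, the three strategies can be imitated with hand-written SQL. The sketch below swaps in SQLite from the Python standard library for PostgreSQL, and the rows (owl and job names) are made up; it only illustrates the shape of the queries, not what the ORM literally emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owl (id INTEGER PRIMARY KEY, name TEXT,
                      job_id INTEGER REFERENCES job(id));
    INSERT INTO job VALUES (1, 'postman'), (2, 'pet');
    INSERT INTO owl VALUES (1, 'Hedwig', 2), (2, 'Errol', 1), (3, 'Pigwidgeon', 2);
""")

queries = []
conn.set_trace_callback(queries.append)  # record every statement from here on


def naive():
    """No preload: one query for the owls, then one per owl for its job."""
    owls = conn.execute("SELECT id, name, job_id FROM owl ORDER BY id").fetchall()
    result = []
    for _, name, job_id in owls:
        job = conn.execute("SELECT name FROM job WHERE id = ?", (job_id,)).fetchone()
        result.append((name, job[0]))
    return result


def joined():
    """select_related style: a single query with a join."""
    return conn.execute(
        "SELECT owl.name, job.name FROM owl "
        "LEFT OUTER JOIN job ON owl.job_id = job.id ORDER BY owl.id"
    ).fetchall()


def prefetched():
    """prefetch_related style: a second query with IN, joined in Python."""
    owls = conn.execute("SELECT id, name, job_id FROM owl ORDER BY id").fetchall()
    job_ids = sorted({job_id for _, _, job_id in owls})
    placeholders = ", ".join("?" * len(job_ids))
    jobs = dict(conn.execute(
        f"SELECT id, name FROM job WHERE id IN ({placeholders})", job_ids))
    return [(name, jobs[job_id]) for _, name, job_id in owls]


def count_queries(func):
    """Run func and return (result, number of SQL statements it executed)."""
    queries.clear()
    result = func()
    return result, len(queries)


naive_result, naive_count = count_queries(naive)
joined_result, joined_count = count_queries(joined)
prefetched_result, prefetched_count = count_queries(prefetched)
```

With three owls the naive version issues 1 + 3 statements, the join issues one, and the prefetch issues two; the gap grows with every extra row in the loop.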
One of my queries is super slow…
Let’s talk about EXPLAIN !
What is EXPLAIN
Gives you the execution plan chosen by the query planner that your database will use to execute your SQL statement
EXPLAIN (ANALYZE) my super query;
Using ANALYZE will actually execute your query! (Don’t worry, you can ROLLBACK)
BEGIN;
EXPLAIN ANALYZE my super query;
ROLLBACK;
Mmmm… Query planner?
The magical thing that generates possible execution plans for a query and estimates the cost of each plan.
The best (cheapest) one is used to execute your query (hopefully)
So, what does it look like ?
Let’s imagine a slow query… I’m trying to get all the owls working at Ulule (super rare job for an owl)
Python version
Owl.objects.filter(employer_name='Ulule')
DB version
SELECT "owl"."id", "owl"."name", "owl"."employer_name",
"owl"."favourite_food", "owl"."job_id", "owl"."fur_color"
FROM "owl" WHERE "owl"."employer_name" = 'Ulule'
And…
owl_conference=# EXPLAIN ANALYZE
SELECT * FROM owl WHERE
employer_name='Ulule';
QUERY PLAN
------------------------------------
Seq Scan on owl (cost=0.00..205.01 rows=1 width=35) (actual time=1.945..1.946 rows=1 loops=1)
  Filter: ((employer_name)::text = 'Ulule'::text)
  Rows Removed by Filter: 10000
Planning time: 0.080 ms
Execution time: 1.965 ms
(5 rows)
Let’s go step by step ! .. 1
Costs
(cost=0.00..205.01 rows=1 width=35)
- 0.00: estimated cost of retrieving the first row
- 205.01: estimated cost of retrieving all rows
- rows=1: estimated number of rows returned
- width=35: estimated average width of a row (in bytes)
(actual time=1.945..1.946 rows=1 loops=1)
- Only if you use ANALYZE: gives you the real times (in milliseconds)
- loops=1: number of times your seq scan (index scan etc.) was executed
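The cost numbers are not milliseconds but arbitrary planner units. For a sequential scan, the PostgreSQL documentation gives the estimate as pages read times `seq_page_cost`, plus rows scanned times `cpu_tuple_cost`, plus `cpu_operator_cost` per row for a WHERE comparison. The sketch below redoes that arithmetic with the default values of those settings. The page count of 80 is an assumption inferred from the plan, not something the slides state; the 10001 rows come from the plan (1 row returned plus 10000 removed by the filter).

```python
# PostgreSQL's default planner constants (they can be changed in postgresql.conf).
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seq_scan_cost(pages, rows, filter_operators=0):
    """Estimated total cost of a sequential scan, in planner units."""
    io_cost = pages * SEQ_PAGE_COST
    cpu_cost = rows * (CPU_TUPLE_COST + filter_operators * CPU_OPERATOR_COST)
    return io_cost + cpu_cost


# Assumption: the owl table holds its 10001 rows on 80 pages.
plain_scan = round(seq_scan_cost(pages=80, rows=10001), 2)
filtered_scan = round(seq_scan_cost(pages=80, rows=10001, filter_operators=1), 2)
print(plain_scan, filtered_scan)
```

Under that assumption the filtered scan comes out at 205.01, the total cost shown in the plan for the `employer_name` filter.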
Let’s go step by step ! .. 2
Seq Scan
Seq Scan on owl ...
Filter: ((employer_name)::text = 'Ulule'::text)
Rows Removed by Filter: 10000
Scans the entire table and retrieves the rows that match your WHERE clause
It’s okay for small tables but can be very expensive… Do you need an index?
Let’s go step by step ! .. 3
Index scan
What if there is an index on this column?
QUERY PLAN
-------------------------------------------------
Index Scan using employer_name_owl on owl (cost=0.29..8.30 rows=1 width=35) (actual time=0.034..0.034 rows=1 loops=1)
  Index Cond: ((employer_name)::text = 'Ulule'::text)
Planning time: 0.387 ms
Execution time: 0.066 ms
(4 rows)
The index is traversed to find the entries matching your clause, and the corresponding rows are then fetched from the table one by one.
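The switch from a full scan to an index lookup can be reproduced on a laptop without a PostgreSQL server. The sketch below uses SQLite from the Python standard library as a stand-in, so the plan wording is SQLite's (`EXPLAIN QUERY PLAN`), not PostgreSQL's, and the exact text varies between SQLite versions. The table contents are made up; the index name is borrowed from the plan above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE owl (id INTEGER PRIMARY KEY, name TEXT, employer_name TEXT)"
)
conn.executemany(
    "INSERT INTO owl (name, employer_name) VALUES (?, ?)",
    [(f"owl{i}", "post office") for i in range(1000)] + [("Owly", "Ulule")],
)


def plan(sql):
    """Return the human-readable steps of SQLite's query plan for sql."""
    # The description is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM owl WHERE employer_name = 'Ulule'"

before = plan(query)   # no index yet: a full table scan
conn.execute("CREATE INDEX employer_name_owl ON owl (employer_name)")
after = plan(query)    # the index is now used to find the row

print(before)
print(after)
```

The first plan reports a scan of the whole table and the second one a search using `employer_name_owl`, which is the same change as going from Seq Scan to Index Scan in the PostgreSQL plans.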
Let’s go step by step ! .. 4
With an index and a really common value !
owl_conference=# EXPLAIN SELECT * FROM "owl"
WHERE "owl"."employer_name" = 'post office';
QUERY PLAN
-------------------------------------------------
Seq Scan on owl (cost=0.00..205.01 rows=7001 width=35)
  Filter: ((employer_name)::text = 'post office'::text)
(2 rows)
For common values, it’s quicker for the database to read the whole table than to go through the index.
Let’s go step by step ! .. 5
Bitmap Heap Scan
With an index and a common value (but not too common)
owl_conference=# EXPLAIN SELECT * FROM owl WHERE
owl.employer_name = 'Hogwarts';
QUERY PLAN
-------------------------------------------------
Bitmap Heap Scan on owl (cost=47.78..152.78 rows=2000 width=35)
  Recheck Cond: ((employer_name)::text = 'Hogwarts'::text)
  -> Bitmap Index Scan on employer_name_owl (cost=0.00..47.28 rows=2000 width=0)
        Index Cond: ((employer_name)::text = 'Hogwarts'::text)
(4 rows)
Let’s go step by step ! ..4
Bitmap Heap Scan…
Index scan: goes through the index one tuple pointer at a time and reads the data from the pages. Uses the index order.
Bitmap heap scan: sorts the tuple pointers in physical (on-disk) order and goes through them.
Avoids little « physical jumps » between pages
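The benefit of sorting the pointers first can be shown with a toy count of page changes. This is only an illustration: the `(page, offset)` pairs below are hypothetical, and a real bitmap heap scan works on a bitmap of pages rather than a sorted Python list.

```python
def page_switches(tuple_pointers):
    """Count how often consecutive reads land on a different page."""
    switches = 0
    previous_page = None
    for page, _offset in tuple_pointers:
        if page != previous_page:
            switches += 1
            previous_page = page
    return switches


# Hypothetical (page, offset) pointers, in the order the index returns them.
index_order = [(7, 1), (2, 4), (7, 3), (2, 1), (5, 2), (2, 9), (7, 8)]

# Index scan: follow the pointers as they come, jumping between pages.
index_scan_switches = page_switches(index_order)

# Bitmap heap scan: put the pointers in physical order first, then read.
bitmap_scan_switches = page_switches(sorted(index_order))

print(index_scan_switches, bitmap_scan_switches)
```

Here the same seven rows need seven page changes in index order but only three once the pointers are sorted, one per distinct page.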
So we have 3 types of scan
1. Sequential scan
2. Index scan
3. Bitmap heap scan
And now let’s join stuff !
And now let’s join stuff…
Nested loops
owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job
ON (job.id = owl.job_id) WHERE job.id=1;
QUERY PLAN
-------------------------------------------------------------
Nested Loop (cost=blabla) (actual time=blabla)
  -> Seq Scan on job (cost=blabla)
        Rows Removed by Filter: 6
  -> Seq Scan on owl (cost=blabla)
        Filter: (job_id = 1)
        Rows Removed by Filter: 1000
Planning time: 0.150 ms
Execution time: 3.663 ms
(9 rows)
And now let’s join stuff…
Nested loops
Used for little tables, can be slow
[Image: nested loop illustration. This image does not match the previous query ;)]
And now let’s join stuff…
Hash Join
owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job
ON (job.id = owl.job_id) WHERE job.id>1;
QUERY PLAN
-------------------------------------------------------------
Hash Join (cost=1.17..318.70 rows=10001 width=56) (actual time=0.033..36.021 rows=1000 loops=1)
  Hash Cond: (owl.job_id = job.id)
  -> Seq Scan on owl (cost=blabla)
  -> Hash (cost=blabla)
        Buckets: 1024 Batches: 1 Memory Usage: 9kB
        -> Seq Scan on job (cost=blabla)
              Filter: (id > 1)
              Rows Removed by Filter: 1
Planning time: 0.235 ms
(10 rows)
And now let’s join stuff…
Hash Join
The smaller table is hashed because the hash table has to fit into memory
And now let’s join stuff…
Merge Join
owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job
ON (job.id = owl.id) WHERE owl.id>1;
QUERY PLAN
-------------------------------------------------------------
Merge Join (cost=blabla)
  Merge Cond: (owl.id = job.id)
  -> Index Scan using owl_pkey on owl (cost=blabla)
        Index Cond: (id > 1)
  -> Sort (cost=blabla)
        Sort Key: job.id
        Sort Method: quicksort Memory: 25kB
        -> Seq Scan on job (cost=blabla)
Planning time: 0.453 ms
Execution time: 0.102 ms
(10 rows)
And now let’s join stuff…
Merge Join
Used for big tables, an index can be
used to avoid sorting
So we have 3 types of joins
1. Nested loop
2. Hash join
3. Merge join
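The three join strategies are easier to compare when written out on plain Python lists. The functions below are simplified sketches of each idea, not PostgreSQL's implementation: rows are dictionaries, the data is made up, and the merge join sorts its inputs itself where the database could rely on an index.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, outer_key, inner_key):
    """For every outer row, scan the whole inner table."""
    return [(o, i) for o in outer for i in inner if o[outer_key] == i[inner_key]]


def hash_join(big, small, big_key, small_key):
    """Hash the smaller table once, then probe it for each row of the bigger one."""
    buckets = defaultdict(list)
    for s in small:
        buckets[s[small_key]].append(s)
    return [(b, s) for b in big for s in buckets.get(b[big_key], [])]


def merge_join(left, right, left_key, right_key):
    """Walk two inputs sorted on the join key in step."""
    left = sorted(left, key=lambda row: row[left_key])
    right = sorted(right, key=lambda row: row[right_key])
    result, j = [], 0
    for l_row in left:
        # Skip right rows whose key is too small to match this or any later left row.
        while j < len(right) and right[j][right_key] < l_row[left_key]:
            j += 1
        # Emit every right row with the same key, without moving j,
        # so the next left row with that key sees them too.
        k = j
        while k < len(right) and right[k][right_key] == l_row[left_key]:
            result.append((l_row, right[k]))
            k += 1
    return result


def pairs(rows):
    """Normalise a join result to sorted (owl id, job id) pairs for comparison."""
    return sorted((owl["id"], job["id"]) for owl, job in rows)


jobs = [{"id": 1, "name": "postman"}, {"id": 2, "name": "pet"}]
owls = [
    {"id": 1, "name": "Hedwig", "job_id": 2},
    {"id": 2, "name": "Errol", "job_id": 1},
    {"id": 3, "name": "Pigwidgeon", "job_id": 2},
]

nested = pairs(nested_loop_join(owls, jobs, "job_id", "id"))
hashed = pairs(hash_join(owls, jobs, "job_id", "id"))
merged = pairs(merge_join(owls, jobs, "job_id", "id"))
```

All three return the same pairs; what differs is the work done to get there, which is what the planner's cost estimate tries to capture.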
And a last word about
ORDER BY
(last part, I swear !)
And now let’s order stuff…
owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl ORDER BY
owl.job_id, owl.favourite_food;
QUERY PLAN
-------------------------------------------------------------
Sort (cost=844.47..869.47 rows=10001 width=35) (actual
time=7.252..8.090 rows=10001 loops=1)
Sort Key: job_id, favourite_food
Sort Method: quicksort Memory: 1166kB
-> Seq Scan on owl (cost=0.00..180.01 rows=10001
width=35) (actual time=0.017..1.181 rows=10001 loops=1)
Planning time: 0.142 ms
Execution time: 8.665 ms
(6 rows)
Everything is sorted in memory (which is why it can be costly)
And now let’s order stuff…
With an index
owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl ORDER BY
owl.job_id, owl.favourite_food;
QUERY PLAN
-------------------------------------------------------------
Index Scan using owl_job_id_favourite_food on owl
(cost=0.29..544.66 rows=10001 width=35) (actual
time=0.016..2.835 rows=10001 loops=1)
Planning time: 0.098 ms
Execution time: 3.510 ms
(3 rows)
Simply use index order
And now let’s order stuff…
ORDER BY LIMIT
owl_conference=# EXPLAIN ANALYZE SELECT name, employer_name
FROM owl ORDER BY name LIMIT 10;
QUERY PLAN
-------------------------------------------------------------
Limit (cost…) (actual time…)
-> Sort (cost…) (actual time…)
Sort Key: name
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on owl (cost=0.00..180.01 rows=10001
width=16) (actual time=0.032..5.856 rows=10002 loops=1)
Planning time: 0.201 ms
Execution time: 15.846 ms
(7 rows)
Like with quicksort, all the data has to be sorted… Why is the memory used so much smaller?
Top-N heap sort
- A heap (a kind of tree) with a bounded size is used
- For each row
  - If the heap isn’t full, the tuple is added at the right place
  - If the heap is full and the value is smaller (for ASC) than the largest value currently in the heap
    - The tuple is inserted at the right place and the last value is popped
  - Else the value is discarded
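The steps above can be sketched with Python's `heapq` module. This is an illustration of the idea rather than PostgreSQL's code: `heapq` is a min-heap, so a small wrapper class (hypothetical, named `_Reversed` here) inverts comparisons to keep the largest retained value on top, and the owl names are made up.

```python
import heapq


class _Reversed:
    """Wrap a value so that heapq (a min-heap) keeps the LARGEST value on top."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return other.value < self.value


def top_n(rows, n):
    """Return the n smallest rows in ascending order, keeping at most n in memory."""
    if n <= 0:
        return []
    heap = []
    for row in rows:
        if len(heap) < n:
            # Heap not full yet: just add the row.
            heapq.heappush(heap, _Reversed(row))
        elif row < heap[0].value:
            # Smaller than the largest retained row: replace that one.
            heapq.heapreplace(heap, _Reversed(row))
        # Otherwise the row can never be in the first n: discard it.
    return sorted(item.value for item in heap)


names = ["Potter", "Hedwig", "Errol", "Pigwidgeon", "Archimedes"]
first_three = top_n(names, 3)
```

Every row is still read once, but only `n` of them are held at any time, which is why the memory reported for top-N heapsort stays small however large the table is.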
Top-N heap sort
Example (if it wasn’t clear…)
[Figures: data to order; heap after iterations 1, 2, 3 and 10; inserting a new smaller value, Potter eliminated (Voldy’s dream); heap in the end, after sorting all stuff]
Be careful when you ORDER BY !
1. Sorting without a LIMIT or an index can be heavy
2. You might need an index; only EXPLAIN will tell you
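Whether a sort step disappears once a matching index exists can be checked quickly with SQLite from the Python standard library. As before, this is a stand-in for PostgreSQL: the plan text is SQLite's and may differ between versions, and the table is left empty because only the plan matters here. The index name is the one from the plan shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE owl (id INTEGER PRIMARY KEY, name TEXT, "
    "job_id INTEGER, favourite_food TEXT)"
)


def plan(sql):
    """Return the human-readable steps of SQLite's query plan for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM owl ORDER BY job_id, favourite_food"

without_index = plan(query)   # expected to include an explicit sort step
conn.execute(
    "CREATE INDEX owl_job_id_favourite_food ON owl (job_id, favourite_food)"
)
with_index = plan(query)      # expected to read rows in index order instead

print(without_index)
print(with_index)
```

SQLite reports the explicit sort as a temporary B-tree for ORDER BY; with the index in place that step should be gone, which mirrors the Sort node disappearing from the PostgreSQL plan.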
Conclusion
- Looking at your DB logs, whatever your favourite solution is, will help you build a website with good performance
- Always know where your queries come from
- Careful about loops ! Use prefetch_related and select_related to avoid O(n) queries
- If you have a slow query, there is no magical solution: look into EXPLAIN to understand what’s going wrong and find a solution
Thank you for your attention !
Any questions?
Owly design: zimmoriarty (https://www.instagram.com/zimmoriarty/)
To go further - sources
https://momjian.us/main/writings/pgsql/optimizer.pdf
https://use-the-index-luke.com/sql/plans-dexecution/postgresql/operations
http://tech.novapost.fr/postgresql-application_name-django-settings.html

Conf orm - explain

  • 1. The amazing world behind your ORM Louise Grandjonc
  • 2. Louise Grandjonc (louise@ulule.com) Lead developer at Ulule (www.ulule.com) Django developer - Postgres enthusiast @louisemeta on twitter About me
  • 3. 1. How do we end up with performance problems? 2. How can we see them without roughly guessing how long you’re waiting before seeing your page? 3. What does it change in our everyday developer job? Today’s agenda
  • 4. How do we end up with performance problems?
  • 5. 1. To annoy the DBAs 2. Because we can avoid having to worry about DB connections 3. We keep using our main language 4. We are a bit afraid of SQL 5. 90% of the time, we don’t really need to do more than really simple SELECT and INSERT, so why bother doing it worse than our ORM would? Why do we use ORMs? (and why that’s not so terrible)
  • 6. Not looking at what happens will cause performance problems, because… 1. The ORMs execute queries that you might not expect 2. Your queries might not be optimised and you won’t know about it 3. To make DBAs like you, even if you’re using an ORM Why we should know what our ORM is doing
  • 7. How can we see them without roughly guessing how long you’re waiting before seeing your page?
  • 8. How can I see what is happening when I do stuff? 1. Django debug toolbar (to see queries and their EXPLAIN in your Django view) Advantages: can be easily included in your Django templates Problems: does not allow you to see everything (ajax calls !), and if you’re working on an API, you cannot use it! 2. Django devserver: puts all the logs of your database into your runserver output Advantages: you’re not missing the ajax calls 3. Simply look at your database logs Advantages: you can see everything, you won’t be disturbed if you ever change project/programming language/framework/computer, you can configure how you see your logs Problems: you don’t know where your logs are?
  • 9. Where are my logs? owl_conference=# show log_directory ; log_directory --------------- pg_log (1 row) owl_conference=# show data_directory ; data_directory ------------------------- /usr/local/var/postgres (1 row) owl_conference=# show log_filename ; log_filename ------------------------- postgresql-%Y-%m-%d.log (1 row)
  • 10. Having good looking logs (and logging everything like a crazy owl) owl_conference=# SHOW config_file; config_file ----------------------------------------- /usr/local/var/postgres/postgresql.conf (1 row) In your postgresql.conf log_filename = 'postgresql-%Y-%m-%d.log' log_statement = 'all' logging_collector = on log_min_duration_statement = 0 log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,host=%h,app=%a'
  • 11. Having good looking logs user=owly,db=owl_conference,host=127.0.0.1,app=owl LOG: statement: SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM "owl" WHERE "owl"."job_id" = 1 LIMIT 10 user=owly,db=owl_conference,host=127.0.0.1,app=owl LOG: duration: 0.297 ms DATABASES = { 'default': { 'ENGINE': 'django.db.backends.postgresql_psycopg2', 'NAME': 'owl_conference', 'USER': 'owly', 'PASSWORD': 'mouseEating', 'HOST': '127.0.0.1', 'OPTIONS': {'application_name': 'owl'} } } Your logs should look like
  • 12. Yep ! I’ve seen my logs… But … Where are these queries executed in my code? Django will always execute your queries when it needs to use the object ! Let’s take an example…
  • 13. Example Template def index(request): owls = Owl.objects.filter(employer_name='Ulule') context = {'owls': owls} return render(request, 'owls/index.html', context) SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM "owl" WHERE "owl"."employer_name" = 'Ulule' {% for owl in owls %} <p> {{ owl.name }} </p> {% endfor %}
  • 14. Example View def index(request): owls = Owl.objects.filter(employer_name='Ulule') owl_count = len(owls) context = {'owls': owls, 'owl_count': owl_count} return render(request, 'owls/index.html', context) SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM "owl" WHERE "owl"."employer_name" = 'Ulule' {% for owl in owls %} <p> {{ owl.name }} </p> {% endfor %}
  • 15. Yep ! I’ve seen my logs… But … Where are these queries executed in my code? How to spot where your query is executed? 1. Each model has a table to store data. Find the model. 2. Where in my view, or in my form am I using this model to get/filter objects? 3. Where am I using these objects? Is it in my view/form? Passed into the context and used in templates?
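
The two examples above differ only in where the query runs: in the template loop, or at `len(owls)` in the view. A minimal sketch of that lazy behaviour, in plain Python with no Django involved. `LazyQuery` and `fake_database` are hypothetical names made up for this illustration and are not Django's implementation; the only point is that building the object costs nothing and the first real use triggers the query.

```python
class LazyQuery:
    """Toy stand-in for a lazy query object (hypothetical, not Django's QuerySet)."""

    def __init__(self, sql, run):
        self.sql = sql
        self._run = run        # callable that "hits the database"
        self._cache = None     # rows are fetched once, then reused

    def _fetch(self):
        if self._cache is None:
            self._cache = self._run(self.sql)
        return self._cache

    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())


executed = []  # log of every query that really ran


def fake_database(sql):
    """Pretend database: records the query and returns two owl names."""
    executed.append(sql)
    return ["Hedwig", "Errol"]


owls = LazyQuery("SELECT ... FROM owl WHERE employer_name = 'Ulule'", fake_database)
queries_after_build = len(executed)   # building the object runs nothing

owl_count = len(owls)                 # first real use: the query runs here
queries_after_len = len(executed)

names = [name for name in owls]       # reuses the cached rows, no new query
queries_after_loop = len(executed)
```

In this toy model the query is logged exactly when `len(owls)` is called, which mirrors the "Example View" slide: the same SQL, executed in the view instead of the template.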
  • 16. What does it change in our everyday developer job? (Or how to really do something when you have a problem)
  • 17. The two most common problems of any ORM user… 1. I have way too many queries… Why ? 2. One of my queries is freakin' slow… Why?
  • 18. Once upon a time… 1000 times The danger of loops in your code, and how your templates are making fun of you… 1. Preload stuff ! The ORM executes the queries when it needs the data. If you’re looping over a foreign key without any preload, it will just query every time it needs the foreign key… Imagine you have a loop over 1 million objects. Use prefetch_related and select_related (see next slide) 2. In an ideal world, no query should ever be executed from your Django HTML template. All the data should be in your context, you should never have « surprise » queries from your templates !
  • 19. Once upon a time… 1000 times select_related or prefetch_related? In Django, select_related and prefetch_related will help you lower your number of queries by preloading the foreign keys or many-to-many relations. 1. select_related uses a join (only for foreign keys): - Advantages: only one request - Problem: if you are joining big tables, with a lot of columns and no index, it can be slow… We’ll talk about that next. 2. prefetch_related does a second request on the related table (for foreign keys and many-to-many): - Advantages: no big join - Problem: more queries
  • 20. Example … 1 def index(request): owls = Owl.objects.filter(employer_name='Ulule') context = {'owls': owls} for owl in owls: # do stuff owl.job return render(request, 'owls/index.html', context) def index(request): owls = Owl.objects.filter(employer_name='Ulule').select_related('job') context = {'owls': owls} for owl in owls: # do stuff owl.job return render(request, 'owls/index.html', context)
  • 21. Example … 1 Using select_related owls = Owl.objects.filter(employer_name='Ulule').select_related('job') SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color", "job"."id", "job"."name" FROM "owl" LEFT OUTER JOIN "job" ON ("owl"."job_id" = "job"."id") WHERE "owl"."employer_name" = 'Ulule'
  • 22. Example … 1 Using prefetch_related owls = Owl.objects.filter(employer_name='Ulule').prefetch_related('job') SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM "owl" WHERE "owl"."employer_name" = 'Ulule' SELECT "job"."id", "job"."name" FROM "job" WHERE "job"."id" IN (2)
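
To make the difference in query counts concrete, here is a rough sketch that replays the three strategies by hand. It assumes an in-memory SQLite database from the Python standard library instead of PostgreSQL and Django, and the `run` and `count_queries` helpers as well as the sample rows are invented for the illustration. The SQL only mimics the shape of what the slides show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owl (id INTEGER PRIMARY KEY, name TEXT,
                      job_id INTEGER REFERENCES job(id));
    INSERT INTO job VALUES (1, 'postman'), (2, 'developer');
    INSERT INTO owl VALUES (1, 'Hedwig', 1), (2, 'Errol', 1), (3, 'Owly', 2);
""")

query_count = 0


def run(sql, params=()):
    """Execute one query and count it (hypothetical helper)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def count_queries(strategy):
    """Run a strategy and return (rows, number of queries it issued)."""
    global query_count
    query_count = 0
    rows = strategy()
    return rows, query_count


def lazy_loop():
    # No preload: one query for the owls, then one more per owl for its job.
    pairs = []
    for owl_name, job_id in run("SELECT name, job_id FROM owl ORDER BY id"):
        (job_name,) = run("SELECT name FROM job WHERE id = ?", (job_id,))[0]
        pairs.append((owl_name, job_name))
    return pairs


def join_style():
    # Shape of select_related: a single query with a join.
    return run(
        "SELECT owl.name, job.name FROM owl "
        "LEFT OUTER JOIN job ON owl.job_id = job.id ORDER BY owl.id"
    )


def prefetch_style():
    # Shape of prefetch_related: a second query with IN, joined in Python.
    owls = run("SELECT name, job_id FROM owl ORDER BY id")
    job_ids = sorted({job_id for _, job_id in owls})
    placeholders = ", ".join("?" for _ in job_ids)
    jobs = dict(run(f"SELECT id, name FROM job WHERE id IN ({placeholders})", job_ids))
    return [(owl_name, jobs[job_id]) for owl_name, job_id in owls]


lazy_rows, lazy_queries = count_queries(lazy_loop)
join_rows, join_queries = count_queries(join_style)
prefetch_rows, prefetch_queries = count_queries(prefetch_style)
```

All three strategies return the same pairs; only the number of round trips changes. The loop grows with the number of owls, while the join and the prefetch stay constant.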
  • 23. One of my queries is super slow… Let’s talk about EXPLAIN !
  • 24. What is EXPLAIN Gives you the execution plan chosen by the query planner that your database will use to execute your SQL statement Using ANALYZE will actually execute your query! (Don’t worry, you can ROLLBACK) EXPLAIN (ANALYZE) my super query; BEGIN; EXPLAIN ANALYZE my super query; ROLLBACK;
  • 25. Mmmm… Query planner? The magical thing that generates execution plans for a query and calculates the cost of each plan. The best one is used to execute your query (hopefully)
  • 26. So, what does it look like ? Let’s imagine a slow query… I’m trying to have all the owls working at Ulule (super rare job for an owl) Python version DB version Owl.objects.filter(employer_name='Ulule') SELECT "owl"."id", "owl"."name", "owl"."employer_name", "owl"."favourite_food", "owl"."job_id", "owl"."fur_color" FROM "owl" WHERE "owl"."employer_name" = 'Ulule'
  • 27. And… owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl WHERE employer_name='Ulule'; QUERY PLAN ------------------------------------ Seq Scan on owl (cost=0.00..205.01 rows=1 width=35) (actual time=1.945..1.946 rows=1 loops=1) Filter: ((employer_name)::text = 'Ulule'::text) Rows Removed by Filter: 10000 Planning time: 0.080 ms Execution time: 1.965 ms (5 rows)
  • 28. Let’s go step by step ! .. 1 Costs (cost=0.00..205.01 rows=1 width=35) 0.00: cost of retrieving the first row; 205.01: cost of retrieving all rows; rows=1: estimated number of rows returned; width=35: average width of a row (in bytes) (actual time=1.945..1.946 rows=1 loops=1) Only if you use ANALYZE: gives you the real times; loops=1: number of times your seq scan (index scan etc.) was executed
  • 29. Let’s go step by step ! .. 2 Seq Scan Seq Scan on owl ... Filter: ((employer_name)::text = 'Ulule'::text) Rows Removed by Filter: 10000 Scans the entire table and retrieves the rows that correspond to your WHERE clause. It’s okay for small tables but can be very expensive… Do you need an index?
  • 30. Let’s go step by step ! .. 3 Index scan QUERY PLAN ------------------------------------------------- Index Scan using employer_name_owl on owl (cost=0.29..8.30 rows=1 width=35) (actual time=0.034..0.034 rows=1 loops=1) Index Cond: ((employer_name)::text = 'Ulule'::text) Planning time: 0.387 ms Execution time: 0.066 ms (4 rows) What if there is an index on this column? The index is visited row by row in order to retrieve the data corresponding to your clause.
  • 31. Let’s go step by step ! .. 4 owl_conference=# EXPLAIN SELECT * FROM "owl" WHERE "owl"."employer_name" = 'post office'; QUERY PLAN ------------------------------------------------- Seq Scan on owl (cost=0.00..205.01 rows=7001 width=35) Filter: ((employer_name)::text = 'post office'::text) (2 rows) With an index and a really common value ! For really common values, it’s quicker for the db to read all the data than to scan the index.
  • 32. Let’s go step by step ! .. 5 Bitmap Heap Scan owl_conference=# EXPLAIN SELECT * FROM owl WHERE owl.employer_name = 'Hogwarts'; QUERY PLAN ------------------------------------------------- Bitmap Heap Scan on owl (cost=47.78..152.78 rows=2000 width=35) Recheck Cond: ((employer_name)::text = 'Hogwarts'::text) -> Bitmap Index Scan on employer_name_owl (cost=0.00..47.28 rows=2000 width=0) Index Cond: ((employer_name)::text = 'Hogwarts'::text) (4 rows) With an index and a common value (but not too common)
  • 33. Let’s go step by step ! ..4 Bitmap Heap Scan… Index scan: goes through your index tuple pointers one at a time and reads the data from the pages. Uses the index order. Bitmap Heap Scan: orders the tuple pointers in physical order and goes through them. Avoids little « physical jumps » between pages
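
The « physical jumps » argument can be pictured with a toy model. The sketch below is only an illustration in plain Python, not how PostgreSQL is implemented: tuple pointers are assumed to be `(page, offset)` pairs, the 20 pages and 200 matching rows are made-up numbers, and `page_switches` is a hypothetical helper that counts how often the reader has to move to a different page.

```python
import random


def page_switches(tuple_pointers):
    """Count how many times the reader has to move to a different page."""
    switches = 0
    current_page = None
    for page, _offset in tuple_pointers:
        if page != current_page:
            switches += 1
            current_page = page
    return switches


random.seed(42)

# Index order: matching rows come back sorted by value, so the pages they
# live on appear in no particular order (assumed: 200 rows over 20 pages).
pointers_in_index_order = [
    (random.randrange(20), random.randrange(50)) for _ in range(200)
]

# Bitmap heap scan idea: sort the pointers by physical position first.
pointers_in_physical_order = sorted(pointers_in_index_order)

index_scan_switches = page_switches(pointers_in_index_order)
bitmap_scan_switches = page_switches(pointers_in_physical_order)
distinct_pages = len({page for page, _ in pointers_in_index_order})
```

Once the pointers are sorted, each page is visited exactly once, so `bitmap_scan_switches` equals the number of distinct pages, while the index order can never do better than that. The price is that the rows no longer come out in index order.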
  • 34. So we have 3 types of scan 1. Sequential scan 2. Index scan 3. Bitmap heap scan And now let’s join stuff !
  • 35. And now let’s join stuff… Nested loops owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job ON (job.id = owl.job_id) WHERE job.id=1; QUERY PLAN ------------------------------------------------------------- Nested Loop (cost=blabla) (actual time=blabla) -> Seq Scan on job (cost=blabla) Rows Removed by Filter: 6 -> Seq Scan on owl (cost=blabla) Filter: (job_id = 1) Rows Removed by Filter: 1000 Planning time: 0.150 ms Execution time: 3.663 ms (9 rows)
  • 36. And now let’s join stuff… Nested loops Used for little tables, can be slow This image does not match the previous query ;)
  • 37. And now let’s join stuff… Hash Join owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job ON (job.id = owl.job_id) WHERE job.id>1; QUERY PLAN ------------------------------------------------------------- Hash Join (cost=1.17..318.70 rows=10001 width=56) (actual time=0.033..36.021 rows=1000 loops=1) Hash Cond: (owl.job_id = job.id) -> Seq Scan on owl (cost=blabla) -> Hash (cost=blabla) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on job (cost=blabla) Filter: (id > 1) Rows Removed by Filter: 1 Planning time: 0.235 ms (10 rows)
  • 38. And now let’s join stuff… Hash Join The smaller table is hashed because it has to fit into memory
  • 39. And now let’s join stuff… Merge Join owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl JOIN job ON (job.id = owl.id) WHERE owl.id>1; QUERY PLAN ------------------------------------------------------------- Merge Join (cost=blabla) Merge Cond: (owl.id = job.id) -> Index Scan using owl_pkey on owl (cost=blabla) Index Cond: (id > 1) -> Sort (cost=blabla) Sort Key: job.id Sort Method: quicksort Memory: 25kB -> Seq Scan on job (cost=blabla) Planning time: 0.453 ms Execution time: 0.102 ms (10 rows)
  • 40. And now let’s join stuff… Merge Join Used for big tables, an index can be used to avoid sorting
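
The three join strategies can be sketched in a few lines of plain Python over lists of tuples. This is a simplified illustration, not PostgreSQL's code: the sample rows are invented, and the merge join below assumes that `job.id` is unique, which keeps the merging loop short.

```python
from collections import defaultdict

# Invented sample data: jobs are (id, name), owls are (id, name, job_id).
jobs = [(1, "postman"), (2, "developer"), (3, "teacher")]
owls = [(1, "Hedwig", 1), (2, "Errol", 1), (3, "Owly", 2), (4, "Pig", 3)]


def nested_loop_join(owls, jobs):
    # For every owl, scan every job: fine for little tables, slow otherwise.
    return [(owl[1], job[1]) for owl in owls for job in jobs if owl[2] == job[0]]


def hash_join(owls, jobs):
    # Hash the smaller table once, then probe it for each owl.
    buckets = defaultdict(list)
    for job in jobs:
        buckets[job[0]].append(job)
    return [(owl[1], job[1]) for owl in owls for job in buckets.get(owl[2], [])]


def merge_join(owls, jobs):
    # Sort both sides on the join key (an index could provide this order),
    # then walk them together. Assumes job ids are unique.
    owls = sorted(owls, key=lambda owl: owl[2])
    jobs = sorted(jobs, key=lambda job: job[0])
    result = []
    j = 0
    for owl in owls:
        while j < len(jobs) and jobs[j][0] < owl[2]:
            j += 1
        if j < len(jobs) and jobs[j][0] == owl[2]:
            result.append((owl[1], jobs[j][1]))
    return result


nested_rows = sorted(nested_loop_join(owls, jobs))
hash_rows = sorted(hash_join(owls, jobs))
merge_rows = sorted(merge_join(owls, jobs))
```

All three produce the same rows. What differs is the work done: the nested loop compares every pair, the hash join pays for building the table in memory, and the merge join pays for sorting unless an index already gives the order.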
  • 41. So we have 3 types of joins 1. Nested loop 2. Hash join 3. Merge join And a last word about ORDER BY (last part, I swear !)
  • 42. And now let’s order stuff… owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl ORDER BY owl.job_id, owl.favourite_food; QUERY PLAN ------------------------------------------------------------- Sort (cost=844.47..869.47 rows=10001 width=35) (actual time=7.252..8.090 rows=10001 loops=1) Sort Key: job_id, favourite_food Sort Method: quicksort Memory: 1166kB -> Seq Scan on owl (cost=0.00..180.01 rows=10001 width=35) (actual time=0.017..1.181 rows=10001 loops=1) Planning time: 0.142 ms Execution time: 8.665 ms (6 rows) Everything is sorted in memory (which is why it can be costly)
  • 43. And now let’s order stuff… With an index owl_conference=# EXPLAIN ANALYZE SELECT * FROM owl ORDER BY owl.job_id, owl.favourite_food; QUERY PLAN ------------------------------------------------------------- Index Scan using owl_job_id_favourite_food on owl (cost=0.29..544.66 rows=10001 width=35) (actual time=0.016..2.835 rows=10001 loops=1) Planning time: 0.098 ms Execution time: 3.510 ms (3 rows) Simply use index order
  • 44. And now let’s order stuff… ORDER BY LIMIT owl_conference=# EXPLAIN ANALYZE SELECT name, employer_name FROM owl ORDER BY name LIMIT 10; QUERY PLAN ------------------------------------------------------------- ------------------------------------------------------- Limit (cost…) (actual time…) -> Sort (cost…) (actual time…) Sort Key: name Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on owl (cost=0.00..180.01 rows=10001 width=16) (actual time=0.032..5.856 rows=10002 loops=1) Planning time: 0.201 ms Execution time: 15.846 ms (7 rows) Like with quicksort, all the data has to be read… So why is the memory used so much smaller?
  • 45. Top-N heap sort - A heap (sort of tree) is used with a bounded size - For each row - If the heap isn’t full, tuple added at the right place - If heap is full and value smaller (for ASC) than current values - Tuple inserted at the right place, last value popped - Else value discarded
  • 46. Top-N heap sort Data to order … Iterations 1.. 2.. 3 Iteration 10
  • 47. Top-N heap sort Example (if it wasn’t clear…) Inserting new smaller value, Potter eliminated (Voldy’s dream) Heap in the end, after sorting all stuff
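
The bounded-heap idea described in the last slides can be sketched with Python's `heapq` module. This is an illustration of the principle under a few assumptions, not PostgreSQL's implementation: `heapq` is a min-heap, so the small `Reversed` wrapper (a hypothetical helper) flips comparisons to keep the largest kept value at the top, ready to be evicted. The names are sample data.

```python
import heapq


class Reversed:
    """Wrapper that inverts ordering so heapq behaves like a max-heap."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value > other.value


def top_n_smallest(rows, n):
    """Return the n smallest rows in ascending order, keeping at most n in memory."""
    if n <= 0:
        return []
    heap = []  # never holds more than n items
    for row in rows:
        if len(heap) < n:
            # Heap isn't full: the row is simply added.
            heapq.heappush(heap, Reversed(row))
        elif row < heap[0].value:
            # Row is smaller than the largest kept value: swap them.
            heapq.heapreplace(heap, Reversed(row))
        # Otherwise the row is discarded straight away.
    return sorted(item.value for item in heap)


names = ["Potter", "Granger", "Weasley", "Dumbledore", "Hagrid",
         "Snape", "Malfoy", "Lupin", "Black", "Voldemort"]
first_three = top_n_smallest(names, 3)
```

Every row is still read once, but only `n` rows are held at any time, which is why the plan with LIMIT reports so little memory compared with the full quicksort.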
  • 48. Be careful when you ORDER BY ! 1. Sorting with sort key without limit or index can be heavy 2. You might need an index, only EXPLAIN will tell you
  • 50. Conclusion - Looking at your DB logs, whatever your favourite solution is, will help you build a website with good performance - Always know where your queries come from - Careful about loops ! Use prefetch_related and select_related to avoid O(n) queries - If you have a slow query, there is no magical solution, look into EXPLAIN to understand what’s going wrong and find a solution
  • 51. Thank you for your attention ! Any questions? Owly design: zimmoriarty (https://www.instagram.com/zimmoriarty/)
  • 52. To go further - sources Owly design: zimmoriarty (https://www.instagram.com/zimmoriarty/) https://momjian.us/main/writings/pgsql/optimizer.pdf https://use-the-index-luke.com/sql/plans-dexecution/postgresql/operations http://tech.novapost.fr/postgresql-application_name-django-settings.html