Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Efficient Django
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Abstract
Tips and best practices for avoiding scalability
issues an...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Hi!
● I'm David Arcos
● Python/Django developer since 2008
● Co-org...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
●
“We improve your sales conversions, using
predictive algorithms t...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
1) Basic concepts
David Arcos - @DZPMEfficient Django – #EuroPython 2016
The Pareto Principle
"For many events, roughly 80% of the effects
c...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Prioritize and focus
Focus on the few tasks that will have the most...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Basic scalability
“Potential to be enlarged to handle a growing
amo...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Database performance
●
Do less requests:
– Less reads
– Less writes...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Templates
●
Cache them
●
Jinja2 is a bit faster than the default en...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Cache
●
Generic approach: cache at each stack level
●
The cache doc...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Cache
●
Generic approach: cache at each stack level
●
The cache doc...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Bottlenecks
●
Where is your bottleneck?
●
CPU bound or I/O bound?
–...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
2) Measuring
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Can't improve what you don't measure
●
Measure your system to find ...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Monitoring
●
System: load, CPU, memory...
●
Database: q/s, response...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Profiling
●
The cProfile module provides profiling of
Python progra...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
timeit
●
The timeit module is a simple way to time
execution time o...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
ipdb
●
Like pdb, but for ipython
– tab completion, syntax highlight...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
django-debug-toolbar
●
Display debug information about the current
...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
django-debug-toolbar-line-profiler
●
A toolbar panel for profiling
...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
3) Tips and tricks
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Add db indexes
●
Single (db_index) or multiple (index_together)
●
B...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Do bulk operations
●
Will greatly reduce the number of SQL queries:...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Get related objects
●
Return FK fields in same query:
– qs.select_r...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Slow admin?
●
Use list_select_related
●
Overwrite get_queryset() wi...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Cachalot
●
Caches your Django ORM queries and
automatically invalid...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Queues and workers
●
Do slow stuff later
●
Some operations can be q...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Cached sessions
●
Use SESSION_ENGINE to set cached sessions:
– Non-...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Persistent connections
●
Use CONN_MAX_AGE to set the lifetime of a
...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
UUIDs
●
Use UUID for Primary Keys (instead of
incremental IDs)
– Gu...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Slow tests?
●
Skip migrations: --keepdb
●
Run in parallel: --parall...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
4) Conclusions
●
Measure first
●
Optimize only the bottleneck
●
Go ...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Good resources
●
The official Django documentation
●
Book: “High Pe...
David Arcos - @DZPMEfficient Django – #EuroPython 2016
Thanks for attending!
- Get the slides at http://slideshare.net/DZP...
Upcoming SlideShare
Loading in …5
×

Efficient Django

3,257 views

Published on

Does Django scale? How to manage traffic peaks? What happens when the database grows too big? How to find (and fix) the bottlenecks?

We will overview the basics concepts, we'll use metrics to find bottlenecks, and finally we'll see some tips and tricks to improve the scalability and the performance of a Django project.

Main topics:
- System architecture
- Database performance
- Finding bottlenecks
- Monitoring, profiling, debugging
- Query optimization
- Dealing with a slow admin
- Queues and workers
- Faster tests

Talk given at #EuroPython 2016: https://ep2016.europython.eu/conference/talks/efficient-django

Published in: Technology

Efficient Django

  1. 1. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Efficient Django
  2. 2. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Abstract Tips and best practices for avoiding scalability issues and performance bottlenecks in Django ● 1) Basic concepts: the theory ● 2) Measuring: how to find bottlenecks ● 3) Tips and tricks ● 4) Conclusion (yes, it scales!)
  3. 3. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Hi! ● I'm David Arcos ● Python/Django developer since 2008 ● Co-organizer at Python Barcelona ● CTO at Lead Ratings
  4. 4. David Arcos - @DZPMEfficient Django – #EuroPython 2016 ● “We improve your sales conversions, using predictive algorithms to rate the leads” ● Prediction API, “Machine Learning as a Service” ● http://lead-ratings.com
  5. 5. David Arcos - @DZPMEfficient Django – #EuroPython 2016 1) Basic concepts
  6. 6. David Arcos - @DZPMEfficient Django – #EuroPython 2016 The Pareto Principle "For many events, roughly 80% of the effects come from 20% of the causes"
  7. 7. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Prioritize and focus Focus on the few tasks that will have the most impact
  8. 8. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Basic scalability “Potential to be enlarged to handle a growing amount of work” ● Stateless app servers – Load balance them, scale horizontally ● Keep the state on the database(s) – This is the difficult part! Each system is different
  9. 9. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Database performance ● Do less requests: – Less reads – Less writes ● Do faster requests: – Indexed fields – De-normalize
  10. 10. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Templates ● Cache them ● Jinja2 is a bit faster than the default engine – but cache them anyways ● You can do fragment caching (for blocks)
  11. 11. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cache ● Generic approach: cache at each stack level ● The cache documentation is excellent ● Beware of the cache invalidation!
  12. 12. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cache ● Generic approach: cache at each stack level ● The cache documentation is excellent ● Beware of the cache invalidation!
  13. 13. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Bottlenecks ● Where is your bottleneck? ● CPU bound or I/O bound? – CPU? Run heavy calculations in async workers – Memory? Compress objects before caching – Database? Read from db replicas ● How to find it?
  14. 14. David Arcos - @DZPMEfficient Django – #EuroPython 2016 2) Measuring
  15. 15. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Can't improve what you don't measure ● Measure your system to find bottlenecks ● Optimize those bottlenecks ● Verify the improvements ● Rinse and repeat!
  16. 16. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Monitoring ● System: load, CPU, memory... ● Database: q/s, response time, size ● Cache: q/s, hit rate ● Queue: length ● Custom: metrics for your app
  17. 17. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Profiling ● The cProfile module provides profiling of Python programs by collecting data: – Number of calls, running time, time per call...
  18. 18. David Arcos - @DZPMEfficient Django – #EuroPython 2016 timeit ● The timeit module is a simple way to time execution time of small bits of Python code:
  19. 19. David Arcos - @DZPMEfficient Django – #EuroPython 2016 ipdb ● Like pdb, but for ipython – tab completion, syntax highlighting, better tracebacks, better introspection… ● Use ipdb.set_trace() to add a breakpoint and jump in with the debugger
  20. 20. David Arcos - @DZPMEfficient Django – #EuroPython 2016 django-debug-toolbar ● Display debug information about the current request/response ● Panels, very modular
  21. 21. David Arcos - @DZPMEfficient Django – #EuroPython 2016 django-debug-toolbar-line-profiler ● A toolbar panel for profiling Django Debug Panel ● Chrome extension ● For AJAX requests and non-HTML responses
  22. 22. David Arcos - @DZPMEfficient Django – #EuroPython 2016 3) Tips and tricks
  23. 23. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Add db indexes ● Single (db_index) or multiple (index_together) ● Be sure to profile and measure! – Sometimes it’s not obvious (i.e., admin) – Huge difference, i.e. from 15s to 3 ms (3.5M rows) ● But: uses more space, slower writes
  24. 24. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Do bulk operations ● Will greatly reduce the number of SQL queries: – Model.objects.bulk_create() – qs.update() <- maybe with F() expressions – qs.delete()
  25. 25. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Get related objects ● Return FK fields in same query: – qs.select_related() ● Return M2M fields, extra query: – qs.prefetch_related()
  26. 26. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Slow admin? ● Use list_select_related ● Overwrite get_queryset() with prefetch_related ● Is ordering using an index? Same for search_fields ● readonly_fields will avoid FK/M2M queries ● Use the raw_id_fields widget (or better: django-salmonella) ● Extend admin/filter.html to show filters as <select>
  27. 27. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cachalot ● Caches your Django ORM queries and automatically invalidates them
  28. 28. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Queues and workers ● Do slow stuff later ● Some operations can be queued, and executed asynchronously in workers ● Use Celery
  29. 29. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cached sessions ● Use SESSION_ENGINE to set cached sessions: – Non-persistent: don’t hit the DB – Persistent: don’t hit the DB… so often
  30. 30. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Persistent connections ● Use CONN_MAX_AGE to set the lifetime of a database connection (persistence)
  31. 31. David Arcos - @DZPMEfficient Django – #EuroPython 2016 UUIDs ● Use UUID for Primary Keys (instead of incremental IDs) – Guaranteed uniqueness, avoid collisions – UUIDs are well-indexed ● Easier db sharding
  32. 32. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Slow tests? ● Skip migrations: --keepdb ● Run in parallel: --parallel ● Disable unused middlewares, installed_apps, password hashers, logging, etc… ● Use mocking whenever possible
  33. 33. David Arcos - @DZPMEfficient Django – #EuroPython 2016 4) Conclusions ● Measure first ● Optimize only the bottleneck ● Go for the low-hanging fruit ● Measure again
  34. 34. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Good resources ● The official Django documentation ● Book: “High Performance Django” ● Blog: “Instagram Engineering” ● “Latency Numbers Every Programmer Should Know”
  35. 35. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Thanks for attending! - Get the slides at http://slideshare.net/DZPM - We are looking for engineers and data scientists!

×