Advertisement
Advertisement

More Related Content

Advertisement

Recently uploaded(20)

Efficient Django

  1. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Efficient Django
  2. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Abstract Tips and best practices for avoiding scalability issues and performance bottlenecks in Django ● 1) Basic concepts: the theory ● 2) Measuring: how to find bottlenecks ● 3) Tips and tricks ● 4) Conclusion (yes, it scales!)
  3. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Hi! ● I'm David Arcos ● Python/Django developer since 2008 ● Co-organizer at Python Barcelona ● CTO at Lead Ratings
  4. David Arcos - @DZPMEfficient Django – #EuroPython 2016 ● “We improve your sales conversions, using predictive algorithms to rate the leads” ● Prediction API, “Machine Learning as a Service” ● http://lead-ratings.com
  5. David Arcos - @DZPMEfficient Django – #EuroPython 2016 1) Basic concepts
  6. David Arcos - @DZPMEfficient Django – #EuroPython 2016 The Pareto Principle "For many events, roughly 80% of the effects come from 20% of the causes"
  7. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Prioritize and focus Focus on the few tasks that will have the most impact
  8. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Basic scalability “Potential to be enlarged to handle a growing amount of work” ● Stateless app servers – Load balance them, scale horizontally ● Keep the state on the database(s) – This is the difficult part! Each system is different
  9. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Database performance ● Do less requests: – Less reads – Less writes ● Do faster requests: – Indexed fields – De-normalize
  10. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Templates ● Cache them ● Jinja2 is a bit faster than the default engine – but cache them anyways ● You can do fragment caching (for blocks)
  11. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cache ● Generic approach: cache at each stack level ● The cache documentation is excellent ● Beware of the cache invalidation!
  12. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cache ● Generic approach: cache at each stack level ● The cache documentation is excellent ● Beware of the cache invalidation!
  13. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Bottlenecks ● Where is your bottleneck? ● CPU bound or I/O bound? – CPU? Run heavy calculations in async workers – Memory? Compress objects before caching – Database? Read from db replicas ● How to find it?
  14. David Arcos - @DZPMEfficient Django – #EuroPython 2016 2) Measuring
  15. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Can't improve what you don't measure ● Measure your system to find bottlenecks ● Optimize those bottlenecks ● Verify the improvements ● Rinse and repeat!
  16. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Monitoring ● System: load, CPU, memory... ● Database: q/s, response time, size ● Cache: q/s, hit rate ● Queue: length ● Custom: metrics for your app
  17. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Profiling ● The cProfile module provides profiling of Python programs by collecting data: – Number of calls, running time, time per call...
  18. David Arcos - @DZPMEfficient Django – #EuroPython 2016 timeit ● The timeit module is a simple way to time execution time of small bits of Python code:
  19. David Arcos - @DZPMEfficient Django – #EuroPython 2016 ipdb ● Like pdb, but for ipython – tab completion, syntax highlighting, better tracebacks, better introspection… ● Use ipdb.set_trace() to add a breakpoint and jump in with the debugger
  20. David Arcos - @DZPMEfficient Django – #EuroPython 2016 django-debug-toolbar ● Display debug information about the current request/response ● Panels, very modular
  21. David Arcos - @DZPMEfficient Django – #EuroPython 2016 django-debug-toolbar-line-profiler ● A toolbar panel for profiling Django Debug Panel ● Chrome extension ● For AJAX requests and non-HTML responses
  22. David Arcos - @DZPMEfficient Django – #EuroPython 2016 3) Tips and tricks
  23. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Add db indexes ● Single (db_index) or multiple (index_together) ● Be sure to profile and measure! – Sometimes it’s not obvious (i.e., admin) – Huge difference, i.e. from 15s to 3 ms (3.5M rows) ● But: uses more space, slower writes
  24. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Do bulk operations ● Will greatly reduce the number of SQL queries: – Model.objects.bulk_create() – qs.update() <- maybe with F() expressions – qs.delete()
  25. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Get related objects ● Return FK fields in same query: – qs.select_related() ● Return M2M fields, extra query: – qs.prefetch_related()
  26. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Slow admin? ● Use list_select_related ● Overwrite get_queryset() with prefetch_related ● Is ordering using an index? Same for search_fields ● readonly_fields will avoid FK/M2M queries ● Use the raw_id_fields widget (or better: django-salmonella) ● Extend admin/filter.html to show filters as <select>
  27. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cachalot ● Caches your Django ORM queries and automatically invalidates them
  28. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Queues and workers ● Do slow stuff later ● Some operations can be queued, and executed asynchronously in workers ● Use Celery
  29. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Cached sessions ● Use SESSION_ENGINE to set cached sessions: – Non-persistent: don’t hit the DB – Persistent: don’t hit the DB… so often
  30. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Persistent connections ● Use CONN_MAX_AGE to set the lifetime of a database connection (persistence)
  31. David Arcos - @DZPMEfficient Django – #EuroPython 2016 UUIDs ● Use UUID for Primary Keys (instead of incremental IDs) – Guaranteed uniqueness, avoid collisions – UUIDs are well-indexed ● Easier db sharding
  32. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Slow tests? ● Skip migrations: --keepdb ● Run in parallel: --parallel ● Disable unused middlewares, installed_apps, password hashers, logging, etc… ● Use mocking whenever possible
  33. David Arcos - @DZPMEfficient Django – #EuroPython 2016 4) Conclusions ● Measure first ● Optimize only the bottleneck ● Go for the low-hanging fruit ● Measure again
  34. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Good resources ● The official Django documentation ● Book: “High Performance Django” ● Blog: “Instagram Engineering” ● “Latency Numbers Every Programmer Should Know”
  35. David Arcos - @DZPMEfficient Django – #EuroPython 2016 Thanks for attending! - Get the slides at http://slideshare.net/DZPM - We are looking for engineers and data scientists!
Advertisement