DJANGO ADVANCED
DATAFLOWS
ASYNC, PUSH/SOCKETS & CACHING
Mitch Kuchenberg - backend developer @Ambassador
DJANGO ADVANCED DATAFLOWS
THEME AND VARIATION
▸ Standard request/response covers 90-ish% of use cases,
but what happens when…
▸ …there’s lots of data to process in the request?
▸ …you have to serve 1000s of requests per second?
▸ …you need to send push notifications to the client?
▸ …you need scheduled tasks for things like weekly
reports, newsletters, etc.?
THEME AND VARIATION
3 very powerful tools when used responsibly
> LET’S ILLUSTRATE WHAT
ALL 3 DO WITH SOME
EXAMPLES…
SPEC I
▸ We need to make an endpoint for user signup.
▸ Accept POST with {"email_address":
"email@example.com", "password": "biscuits"}
▸ Django view validates that data, creates a User object in
the DB, generates some signup info (maybe share URLs and a
few foreign key objects?…), and returns a response
including information from those new object instances
and a 201 “Created” status.
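As a rough illustration of the spec above, here is a framework-free sketch of the synchronous flow. The `signup` function, the in-memory `USERS` dict, and the share-URL scheme are all hypothetical stand-ins for a real Django view, model, and URL configuration; a real view would also hash the password rather than ignore it.

```python
import json
import uuid

# Hypothetical in-memory stand-in for the users table.
USERS = {}


def signup(raw_body: str):
    """Validate the POST body, create the user, and return (status, body)."""
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be valid JSON"}
    if not isinstance(data, dict):
        return 400, {"error": "body must be a JSON object"}

    email = data.get("email_address", "")
    password = data.get("password", "")
    if not isinstance(email, str) or "@" not in email or not password:
        return 400, {"error": "email_address and password are required"}
    if email in USERS:
        return 409, {"error": "user already exists"}

    # The "heavy lifting": create the user plus its related signup info.
    user_id = str(uuid.uuid4())
    share_url = f"https://example.com/share/{user_id}"  # hypothetical URL scheme
    USERS[email] = {"id": user_id, "share_url": share_url}

    # 201 Created, with details taken from the newly created objects.
    return 201, {"id": user_id, "email_address": email, "share_url": share_url}
```

Everything happens inside the request, so the caller waits for all of it before seeing the 201.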
▸ This should look kind of familiar.
▸ Great! No external libraries necessary here, but…
DIAGRAM I
PROBLEM I
▸ There’s a lot of heavy lifting to do with generating the User
object and its foreign keys, so the response is taking
700 ms or more.
▸ This works when you’re only serving a few hundred
requests per minute, but it’s definitely not going to scale.
▸ Something must be done!
CELERY IS AN ASYNCHRONOUS TASK
QUEUE/JOB QUEUE BASED ON
DISTRIBUTED MESSAGE PASSING. IT IS
FOCUSED ON REAL-TIME OPERATION,
BUT SUPPORTS SCHEDULING AS WELL.
www.celeryproject.org
SOLUTION I: DJANGO-CELERY
▸ Configuring Celery can be a little tricky but there are lots
of tutorials out there.
▸ Condensed version: Celery allows you to create and
execute asynchronous tasks from within your Django view
(and elsewhere!).
▸ This allows us to pass the heavy lifting off to the task to be
completed in the background and just return a 202
“accepted” response.
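The hand-off described above can be sketched without Celery installed. This is only an approximation: a `ThreadPoolExecutor` stands in for the Celery workers and broker, and `build_signup_info`, `signup_view`, and `RESULTS` are hypothetical names. With Celery, the submit line would be a call such as `build_signup_info.delay(job_id, email)` on a decorated task.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a Celery worker pool (assumption: real code would use a broker and workers).
executor = ThreadPoolExecutor(max_workers=2)
RESULTS = {}  # hypothetical store for finished signup info


def build_signup_info(job_id, email):
    """The slow part of signup, now running outside the request."""
    time.sleep(0.05)  # simulate the heavy lifting
    RESULTS[job_id] = {
        "email_address": email,
        "share_url": f"https://example.com/share/{job_id}",  # hypothetical URL
    }
    return RESULTS[job_id]


def signup_view(email):
    """Queue the work and answer immediately with 202 Accepted."""
    job_id = str(uuid.uuid4())
    # With Celery this would be roughly: build_signup_info.delay(job_id, email)
    future = executor.submit(build_signup_info, job_id, email)
    # The future is returned only so the caller can observe completion in this sketch.
    return 202, {"job_id": job_id}, future
```

The view returns before the work is finished, which is why the response is 202 rather than 201: the resource has been accepted for creation, not created.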
SOLUTION I: DJANGO-CELERY
▸ Slightly more complicated, but nothing revolutionary.
▸ We’re now sufficiently performant. Depending on setup
this could scale very well! But…
PROBLEM II
▸ We don’t want users to have to check their email inboxes
for their info.
▸ We’d like users to receive this info from within the client.
▸ But how do we notify the client from within an
asynchronous task when the client is no longer awaiting an
HTTP response?
▸ Something must be done!
INSTANTLY UPDATE BROWSERS,
MOBILES AND IOT DEVICES
WITH OUR SIMPLE, EVENT-
BASED API.
https://pusher.com/features
SOLUTION II: PUSHER
▸ Pusher provides a relatively simple API for pushing
notifications to “channels.” The client and our API agree
on a channel name; the client subscribes to that channel over
a socket connection and the API publishes events to it.
▸ Condensed version: Pusher allows our API to
communicate with clients (browsers, mobile devices, IoT
devices) outside of standard request/response.
▸ Now our users can receive the data they need when the
Celery task is ready to publish it.
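To make the channel idea concrete, here is a small in-process sketch. `FakePusher` is a hypothetical stand-in, not the Pusher client; it only mirrors the general shape of a `trigger(channel, event, data)` call. The channel name, the event name, and `signup_task` are likewise assumptions for illustration.

```python
from collections import defaultdict


class FakePusher:
    """In-process stand-in for a push service: named channels with subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Client side: listen for events on a channel."""
        self._subscribers[channel].append(callback)

    def trigger(self, channel, event, data):
        """Server side: publish an event; returns how many listeners received it."""
        for callback in self._subscribers[channel]:
            callback(event, data)
        return len(self._subscribers[channel])


def signup_task(push, channel, email):
    """Background task: compute the signup info, then push it to the client."""
    info = {"email_address": email, "share_url": "https://example.com/share/abc"}
    push.trigger(channel, "signup-complete", info)
    return info
```

In this sketch the client would subscribe to a per-signup channel before (or right after) receiving the 202, so the task has somewhere to publish once it finishes.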
SOLUTION II: PUSHER
▸ Note the only difference between this diagram and the last one is
we’ve replaced “Email” with “Pusher” and flipped the arrow that
was going from “Client” to “Email.”
▸ Boom! Now we can publish data to the client after computing it
asynchronously via Celery. But…
PROBLEM III
▸ Pusher imposes a strict limit on the size of the data you can
publish: 10 KB.
▸ The data we want to publish is larger than 10 KB!
▸ But how can the client receive the data if we can’t send it
over the wire?
▸ Something must be done!
https://github.com/pusher/pusher-http-python/blob/master/pusher/pusher.py#L151
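One way to notice the problem before publishing is to measure the serialized payload. This is a sketch under assumptions: the 10 KB figure comes from the slide above, while measuring the UTF-8 length of the JSON encoding is a guess at how the limit is applied. `payload_size` and `fits_in_push` are hypothetical helpers.

```python
import json

MAX_PUSH_BYTES = 10 * 1024  # the 10 KB limit described above


def payload_size(data) -> int:
    """Size in bytes of the JSON-encoded payload (assumed measurement)."""
    return len(json.dumps(data).encode("utf-8"))


def fits_in_push(data) -> bool:
    """True if the payload is small enough to publish directly."""
    return payload_size(data) <= MAX_PUSH_BYTES
```

A task could call `fits_in_push(info)` and fall back to another delivery route when it returns False.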
REDIS IS AN OPEN SOURCE (BSD
LICENSED), IN-MEMORY DATA
STRUCTURE STORE, USED AS
DATABASE, CACHE AND MESSAGE
BROKER.
http://redis.io/
SOLUTION III: REDIS
▸ Django’s default caching API provides a very simple interface for
setting, retrieving, and deleting cache items.
▸ Condensed version: we’ll just store the data we need as a cache
item with a unique key (some variant of UUID) and return a URL
containing that UUID, which the client can GET.
▸ The client will GET the URL (something like
https://mydjangoapi.com/get_cached_response/<uuidhere>/).
▸ From within that new view we retrieve the cache item (based on
the key provided in the request arg) and return it in the response.
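The store-then-fetch flow above might look like the following. `SimpleCache` is a dict-backed stand-in written for this sketch; its `set(key, value, timeout)` and `get(key, default)` methods only imitate the shape of Django's cache API, and `stash_for_client`, `get_cached_response`, and the five-minute timeout are assumptions.

```python
import time
import uuid


class SimpleCache:
    """Tiny dict-backed stand-in for a cache backend with timeouts."""

    def __init__(self):
        self._items = {}

    def set(self, key, value, timeout):
        # Store the value together with its expiry time.
        self._items[key] = (value, time.monotonic() + timeout)

    def get(self, key, default=None):
        value, expires_at = self._items.get(key, (default, None))
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return default
        return value


cache = SimpleCache()
BASE_URL = "https://mydjangoapi.com/get_cached_response"


def stash_for_client(data, timeout=300):
    """Task side: cache the large payload and return the URL to publish."""
    key = uuid.uuid4().hex
    cache.set(key, data, timeout)
    return f"{BASE_URL}/{key}/"


def get_cached_response(key):
    """View side: look up the cached payload by the key from the URL."""
    data = cache.get(key)
    if data is None:
        return 404, {"error": "not found or expired"}
    return 200, data
```

Only the short URL travels through the push channel, so the payload size limit no longer matters; the client follows up with an ordinary GET.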
SOLUTION III: REDIS
▸ Problem solved.
IN REVIEW
▸ Celery, Pusher, and Redis are very robust and powerful when
used in the right places.
▸ Celery allows your API to run memory- and time-intensive
computations without bogging down throughput.
▸ Pusher enables communication between your Celery tasks
and clients (pub-sub).
▸ Redis is an easy key-value store your app can use to
store information that you don’t want to create a model for.
ONE LAST NOTE
▸ Be careful when using Celery! It’s powerful, but with great power
comes the potential for great harm.
▸ Once you’re operating outside of the standard request/response
paradigm it’s easy to expose yourself to serious programmatic flaws.
▸ Race conditions: watch your timing!
▸ Celery memory problems: use reset_queries() and keep tasks as
short as possible!
▸ Message queues are transient in nature, so make sure you’re
backing up your data. Use acks_late=True whenever a task is
dealing with data you absolutely cannot lose.
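One consequence of late acknowledgement is that a task can be delivered again if a worker dies partway through, so such tasks are safest when running them twice does no extra harm. Below is a minimal sketch of that idea; `record_signup`, `PROCESSED`, and `LEDGER` are hypothetical, and a real system would keep the processed IDs in durable storage rather than in memory.

```python
PROCESSED = set()  # IDs of messages already handled (hypothetical, in-memory)
LEDGER = []        # the side effect we must not duplicate


def record_signup(message_id, email):
    """Apply the side effect once per message_id; repeats are ignored."""
    if message_id in PROCESSED:
        return False  # redelivered message: nothing to do
    LEDGER.append(email)
    PROCESSED.add(message_id)
    return True
```

Calling `record_signup` a second time with the same `message_id` returns False and leaves `LEDGER` unchanged.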
CELERY’S SOURCE CODE IS COMPLICATED…
▸ …like, really really complicated. It makes problems difficult
to track down and debug.
$ sfood celery | sfood-graph | dot -Tpdf
THANKS FOR
LISTENING!
- The Diplomats
SHOUTOUTS
▸ Matt, Brando, Jeff, Chase, and all of my fellow Diplomats.
▸ You! Thanks for coming out.
▸ My sources:
▸ http://www.celeryproject.org/
▸ https://www.djangoproject.com/
▸ https://pusher.com/
▸ http://redis.io/
▸ https://github.com/pusher/pusher-http-python
▸ https://www.draw.io/ really simple online diagram editor.
▸ http://www.django-rest-framework.org/ didn’t cover DRF here
but check it out!
