Django and working with
large database tables
Django Stockholm Meetup Group
March 30, 2017
About me
Ilian Iliev
Platform Engineer at Lifesum
ilian@ilian.io
www.ilian.io
The setup
2.5GHz i7, 16GB Ram, MacBook Pro
Django 1.10
MySQL 5.7.14
PostgreSQL 9.5.4
The Models
from django.db import models

class Tag(models.Model):
    name = models.CharField(max_length=255)

class User(models.Model):
    name = models.CharField(max_length=255)
    date = models.DateTimeField(null=True)

class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    tags = models.ManyToManyField(Tag)
The Change
class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    tags = models.ManyToManyField(Tag, blank=True)
The weird migration
ALTER TABLE `big_tables_message_tags`
    DROP FOREIGN KEY `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`
    FOREIGN KEY (`tag_id`) REFERENCES `big_tables_tag` (`id`);
ALTER TABLE `big_tables_message_tags`
    DROP FOREIGN KEY `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`
    FOREIGN KEY (`message_id`) REFERENCES `big_tables_message` (`id`);
MySQL
Rows ~ 2.7M
Size ~ 88MB
message_id index size ~ 48MB
tag_id index size ~ 61MB
Migration time ~ 41 sec
The weird migration
ALTER TABLE "big_tables_message_tags"
    DROP CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id";
ALTER TABLE "big_tables_message_tags"
    ADD CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id"
    FOREIGN KEY ("tag_id") REFERENCES "big_tables_tag" ("id") DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "big_tables_message_tags"
    DROP CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id";
ALTER TABLE "big_tables_message_tags"
    ADD CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id"
    FOREIGN KEY ("message_id") REFERENCES "big_tables_message" ("id") DEFERRABLE INITIALLY DEFERRED;
PostgreSQL
Rows ~ 2.8M
Size ~ 83MB
message_id index size ~ 77MB
tag_id index size ~ 119MB
Migration time ~ 3.2 sec
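
To put the two stats slides side by side, here is a small arithmetic sketch. It only reuses the approximate figures quoted above (rows and migration time per backend); the "rows per second" framing is an assumption made for comparison, not something measured in the talk.

```python
# Figures copied from the MySQL and PostgreSQL stats slides (all approximate).
runs = {
    "MySQL 5.7.14": {"rows": 2.7e6, "seconds": 41.0},
    "PostgreSQL 9.5.4": {"rows": 2.8e6, "seconds": 3.2},
}

for name, run in runs.items():
    # Rows in the join table divided by the wall-clock migration time.
    rate = run["rows"] / run["seconds"]
    print("%s: ~%d rows/sec" % (name, round(rate)))

# How many times longer the same no-op migration took on MySQL.
slowdown = runs["MySQL 5.7.14"]["seconds"] / runs["PostgreSQL 9.5.4"]["seconds"]
print("MySQL took ~%.1fx as long" % slowdown)
```

Both runs were on the same laptop, so the ratio is only a rough indication of how differently the two backends handle dropping and re-adding a foreign key.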
Solution
Modify the migration that created the field and add the change there
* It is a known issue: https://code.djangoproject.com/ticket/25253
Adding fields to big tables
class MessagesTags(models.Model):
    message = models.ForeignKey(Message)
    tag = models.ForeignKey(Tag)
    added_by = models.ForeignKey(User, null=True)
Timing
MySQL: 31 sec
PostgreSQL: 5.3 sec
MySQL INPLACE
ALTER TABLE `big_tables_message_tags`
    ADD COLUMN `added_by_id` integer NULL,
    ALGORITHM=INPLACE, LOCK=NONE;
ALTER TABLE `big_tables_message_tags`
    ADD CONSTRAINT `big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id`
    FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`),
    ALGORITHM=INPLACE, LOCK=NONE;
* When adding a foreign key, the INPLACE algorithm is supported only when foreign_key_checks is disabled.
  Otherwise, only the COPY algorithm is supported.
Running this on prod
Running it on prod resulted in the API crashing
Non-locking query, but still too heavy for the DB
Aurora appears even slower
Alternative
class MessagesTagsExtend(models.Model):
    STATUS_PENDING_REVIEW = 0
    STATUS_APPROVED = 10
    DEFAULT_STATUS = STATUS_PENDING_REVIEW

    message_tag = models.OneToOneField(MessagesTags)
    status = models.IntegerField(default=DEFAULT_STATUS)
Alternative
class MessagesTags(models.Model):
    ...

    @property
    def status(self):
        try:
            return self.messagestagsextend.status
        except MessagesTagsExtend.DoesNotExist:
            print('here')  # debug output: no extension row exists yet
            return MessagesTagsExtend.DEFAULT_STATUS

    @status.setter
    def status(self, value):
        obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self)
        obj.status = value
        obj.save()
        self.messagestagsextend = obj
* Performance has not been tested in a production environment
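
One cost of the property approach is that reading status for many rows can trigger one extra lookup per row. A minimal sketch of the alternative, fetching every status in a single joined query with the default filled in, is shown below using the standard-library sqlite3 module. The table and column names here are hypothetical stand-ins, not the ones Django would generate; in Django the comparable step would presumably be select_related('messagestagsextend') on the queryset.

```python
import sqlite3

DEFAULT_STATUS = 0  # mirrors MessagesTagsExtend.DEFAULT_STATUS

# In-memory stand-ins for the big table and its small extension table
# (hypothetical names, chosen only for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message_tags (id INTEGER PRIMARY KEY);
    CREATE TABLE message_tags_extend (
        message_tag_id INTEGER UNIQUE REFERENCES message_tags (id),
        status INTEGER NOT NULL
    );
    INSERT INTO message_tags (id) VALUES (1), (2), (3);
    INSERT INTO message_tags_extend (message_tag_id, status) VALUES (2, 10);
""")

# One query for all rows: rows without an extension record fall back
# to the default status instead of raising DoesNotExist one by one.
rows = conn.execute(
    "SELECT mt.id, COALESCE(ext.status, ?) "
    "FROM message_tags AS mt "
    "LEFT JOIN message_tags_extend AS ext ON ext.message_tag_id = mt.id "
    "ORDER BY mt.id",
    (DEFAULT_STATUS,),
).fetchall()
print(rows)  # [(1, 0), (2, 10), (3, 0)]
```

Only row 2 has an extension record, so it is the only one that reports the approved status (10); the others get the default.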
Iterating on big tables
for x in MessagesTags.objects.all():
    print(x)
+ Single SQL query
- Loads everything in memory
Iterating on big tables
for x in MessagesTags.objects.iterator():
    print(x)
+ Single SQL query
+ Loads pieces of the result in memory
- prefetch_related() has no effect (Django 1.10)
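
A third option, not covered in the talk, is to walk the table in primary-key order, one bounded query per chunk. The sketch below illustrates the idea with the standard-library sqlite3 module; the table name, the helper iter_in_chunks and the chunk size are assumptions made for this example. In Django the per-chunk query would presumably look like MessagesTags.objects.filter(pk__gt=last_pk).order_by('pk')[:chunk_size], and since each chunk is an ordinary queryset, prefetch_related() should apply to it.

```python
import sqlite3


def iter_in_chunks(conn, chunk_size=1000):
    """Yield (id, message_id, tag_id) rows in id order, one query per chunk."""
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, message_id, tag_id FROM message_tags "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        for row in rows:
            yield row
        # Resume after the last id seen, so no OFFSET scan is needed.
        last_id = rows[-1][0]


# Demo data in an in-memory database (a stand-in for the real table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_tags "
    "(id INTEGER PRIMARY KEY, message_id INTEGER, tag_id INTEGER)"
)
conn.executemany(
    "INSERT INTO message_tags (message_id, tag_id) VALUES (?, ?)",
    [(i, i % 5) for i in range(1, 2501)],
)

seen = [row[0] for row in iter_in_chunks(conn, chunk_size=1000)]
print(len(seen), seen[0], seen[-1])  # 2500 1 2500
```

The trade-off is several small queries instead of one, in exchange for memory use bounded by the chunk size.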
Questions?

Editor's Notes

  1. How many of you use MySQL? How many use PostgreSQL? Anyone using SQLite or Oracle?
  2. And of course you will always have to add select_related().
  3. Single SQL query. Loads everything in memory. I killed it after it had taken 2 GB of RAM and still hadn't started printing the results.
  4. And of course you will always have to add select_related(). Consider using values() and values_list().
  5. Thank you for listening. Do you have any questions?