Django and working with large database tables

Common problems faced when working with Django on big tables. Slides from my presentation for the Django Stockholm Meetup group.

Published in: Software

  1. Django and working with large database tables
     Django Stockholm Meetup Group
     March 30, 2017
  2. About me
     Ilian Iliev
     Platform Engineer at Lifesum
     ilian@ilian.io
     www.ilian.io
  3. The setup
     2.5GHz i7, 16GB RAM, MacBook Pro
     Django 1.10
     MySQL 5.7.14
     PostgreSQL 9.5.4
  4. The Models
         from django.db import models


         class Tag(models.Model):
             name = models.CharField(max_length=255)


         class User(models.Model):
             name = models.CharField(max_length=255)
             date = models.DateTimeField(null=True)


         class Message(models.Model):
             sender = models.ForeignKey(User, related_name='sent_messages')
             receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
             tags = models.ManyToManyField(Tag)
  5. The Change
         class Message(models.Model):
             sender = models.ForeignKey(User, related_name='sent_messages')
             receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
             # The only change: blank=True on the many-to-many field.
             tags = models.ManyToManyField(Tag, blank=True)
  6. The weird migration
         ALTER TABLE `big_tables_message_tags`
             DROP FOREIGN KEY `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
         ALTER TABLE `big_tables_message_tags`
             ADD CONSTRAINT `big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`
             FOREIGN KEY (`tag_id`) REFERENCES `big_tables_tag` (`id`);
         ALTER TABLE `big_tables_message_tags`
             DROP FOREIGN KEY `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`;
         ALTER TABLE `big_tables_message_tags`
             ADD CONSTRAINT `big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`
             FOREIGN KEY (`message_id`) REFERENCES `big_tables_message` (`id`);
  7. MySQL
     Rows: ~2.7M
     Size: ~88MB
     message_id index size: ~48MB
     tag_id index size: ~61MB
     Migration time: ~41 sec
  8. The weird migration
         ALTER TABLE "big_tables_message_tags"
             DROP CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id";
         ALTER TABLE "big_tables_message_tags"
             ADD CONSTRAINT "big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id"
             FOREIGN KEY ("tag_id") REFERENCES "big_tables_tag" ("id")
             DEFERRABLE INITIALLY DEFERRED;
         ALTER TABLE "big_tables_message_tags"
             DROP CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id";
         ALTER TABLE "big_tables_message_tags"
             ADD CONSTRAINT "big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id"
             FOREIGN KEY ("message_id") REFERENCES "big_tables_message" ("id")
             DEFERRABLE INITIALLY DEFERRED;
  9. PostgreSQL
     Rows: ~2.8M
     Size: ~83MB
     message_id index size: ~77MB
     tag_id index size: ~119MB
     Migration time: ~3.2 sec
  10. Solution
      Modify the migration that created the field and add the change there.
      * It is a known issue: https://code.djangoproject.com/ticket/25253
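One way to read this solution: `blank=True` affects validation only, not the database schema, so the operation that originally created the field can carry the option from the start, and `makemigrations` then has no difference left to turn into a constraint-rebuilding migration. Below is a minimal sketch of such a hand-edited migration. The file name `0001_initial.py`, the trimmed field list and the empty dependency list are assumptions for illustration; only the `big_tables` app label is taken from the table names on the slides.

```python
# big_tables/migrations/0001_initial.py (hypothetical file name, trimmed sketch)
from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = []

    operations = [
        # CreateModel operations for Tag and User omitted for brevity.
        migrations.CreateModel(
            name='Message',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True,
                                        serialize=False, verbose_name='ID')),
                # sender and receiver foreign keys omitted for brevity.
                # blank=True is added by hand here instead of in a new migration.
                ('tags', models.ManyToManyField(blank=True, to='big_tables.Tag')),
            ],
        ),
    ]
```

Since the schema itself does not change, databases that already applied the original migration should need no further action after this edit.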
  11. Adding fields to big tables
          class MessagesTags(models.Model):
              message = models.ForeignKey(Message)
              tag = models.ForeignKey(Tag)
              # New nullable column added to the existing big table.
              added_by = models.ForeignKey(User, null=True)
  12. Timing
      MySQL: 31 sec
      PostgreSQL: 5.3 sec
  13. MySQL INPLACE
          ALTER TABLE `big_tables_message_tags`
              ADD COLUMN `added_by_id` integer NULL,
              ALGORITHM INPLACE, LOCK NONE;
          ALTER TABLE `big_tables_message_tags`
              ADD CONSTRAINT `big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id`
              FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`),
              ALGORITHM INPLACE, LOCK NONE;
      * The INPLACE algorithm is supported when foreign_key_checks is disabled. Otherwise, only the COPY algorithm is supported.
  14. Running this on prod
      Running it on prod resulted in the API crashing
      Non-locking query, but still too heavy for the DB
      Aurora appears to be even slower
  15. Alternative
          class MessagesTagsExtend(models.Model):
              STATUS_PENDING_REVIEW = 0
              STATUS_APPROVED = 10
              DEFAULT_STATUS = STATUS_PENDING_REVIEW

              # The new column lives in a separate table, so the big table is untouched.
              message_tag = models.OneToOneField(MessagesTags)
              status = models.IntegerField(default=DEFAULT_STATUS)
  16. Alternative
          class MessagesTags(models.Model):
              ...

              @property
              def status(self):
                  # Fall back to the default when no extension row exists yet.
                  try:
                      return self.messagestagsextend.status
                  except MessagesTagsExtend.DoesNotExist:
                      print('here')
                      return MessagesTagsExtend.DEFAULT_STATUS

              @status.setter
              def status(self, value):
                  # Create the extension row on first write, then cache it on the instance.
                  obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self)
                  obj.status = value
                  obj.save()
                  self.messagestagsextend = obj
      * Performance has not been tested in a production environment
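The property above resolves the status one object at a time. For bulk reads, the same "extension row or default" fallback can be expressed as a single LEFT JOIN. The sketch below uses an in-memory SQLite database from the standard library purely as a stand-in for MySQL or PostgreSQL. The table and column names follow Django's default naming and are assumptions, as are the helper names `build_demo_db` and `statuses`.

```python
import sqlite3

DEFAULT_STATUS = 0  # mirrors STATUS_PENDING_REVIEW from the slides


def build_demo_db():
    """Create a tiny in-memory stand-in for the two tables (assumed names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE big_tables_messagestags (id INTEGER PRIMARY KEY);
        CREATE TABLE big_tables_messagestagsextend (
            id INTEGER PRIMARY KEY,
            message_tag_id INTEGER NOT NULL UNIQUE
                REFERENCES big_tables_messagestags (id),
            status INTEGER NOT NULL
        );
        INSERT INTO big_tables_messagestags (id) VALUES (1), (2), (3);
        -- Only row 2 has an extension row; rows 1 and 3 use the default.
        INSERT INTO big_tables_messagestagsextend (message_tag_id, status)
            VALUES (2, 10);
        """
    )
    return conn


def statuses(conn):
    """Return {message_tag_id: status}, using the default for missing rows."""
    cursor = conn.execute(
        """
        SELECT mt.id, COALESCE(ext.status, ?)
        FROM big_tables_messagestags AS mt
        LEFT JOIN big_tables_messagestagsextend AS ext
               ON ext.message_tag_id = mt.id
        ORDER BY mt.id
        """,
        (DEFAULT_STATUS,),
    )
    return dict(cursor.fetchall())
```

Calling `statuses(build_demo_db())` reads every status in one query, without ever altering the big table.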
  17. Iterating on big tables
          for x in MessagesTags.objects.all():
              print(x)
      + Single SQL query
      - Loads everything in memory
  18. Iterating on big tables
          for x in MessagesTags.objects.iterator():
              print(x)
      + Single SQL query
      + Loads pieces of the result in memory
      - prefetch_related does not work
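A common middle ground between the two approaches, not shown in the slides, is to walk the table in primary-key order in bounded chunks. Each chunk is evaluated as an ordinary queryset, so memory stays bounded and options such as `prefetch_related` should apply per chunk, at the cost of one query per chunk instead of a single query. The helper name `iterate_in_chunks` and its `chunk_size` default are assumptions for illustration. The function relies only on the `order_by`, `filter` and slicing calls that Django querysets provide.

```python
def iterate_in_chunks(queryset, chunk_size=1000):
    """Yield objects in primary-key order, loading one bounded chunk at a time."""
    ordered = queryset.order_by('pk')
    last_pk = None
    while True:
        # Resume after the last primary key seen instead of using OFFSET.
        chunk = ordered if last_pk is None else ordered.filter(pk__gt=last_pk)
        objects = list(chunk[:chunk_size])
        if not objects:
            break
        for obj in objects:
            yield obj
        last_pk = objects[-1].pk
```

With the models from the slides, usage might look like `for x in iterate_in_chunks(MessagesTags.objects.prefetch_related('tag')): print(x)`.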
  19. Questions?
