3. The setup
MacBook Pro, 2.5GHz i7, 16GB RAM
Django 1.10
MySQL 5.7.14
PostgreSQL 9.5.4
4. The Models
from django.db import models

class Tag(models.Model):
    name = models.CharField(max_length=255)

class User(models.Model):
    name = models.CharField(max_length=255)
    date = models.DateTimeField(null=True)

class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    tags = models.ManyToManyField(Tag)
5. The Change
class Message(models.Model):
    sender = models.ForeignKey(User, related_name='sent_messages')
    receiver = models.ForeignKey(User, related_name='recieved_messages', null=True)
    # The only change: blank=True on the many-to-many field
    tags = models.ManyToManyField(Tag, blank=True)
6. The weird migration (MySQL)
ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id` FOREIGN KEY (`tag_id`)
REFERENCES `big_tables_tag` (`id`);
ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY
`big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id`;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message__message_id_95bfb6e6_fk_big_tables_message_id` FOREIGN KEY
(`message_id`) REFERENCES `big_tables_message` (`id`);
7. MySQL
Rows ~ 2.7M
Size ~ 88MB
message_id index size ~ 48MB
tag_id index size ~ 61MB
Migration time ~ 41 sec
8. The weird migration (PostgreSQL)
ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT
"big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id";
ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT
"big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id" FOREIGN KEY ("tag_id")
REFERENCES "big_tables_tag" ("id") DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE "big_tables_message_tags" DROP CONSTRAINT
"big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id";
ALTER TABLE "big_tables_message_tags" ADD CONSTRAINT
"big_tables_message_message_id_95bfb6e6_fk_big_tables_message_id" FOREIGN KEY
("message_id") REFERENCES "big_tables_message" ("id") DEFERRABLE INITIALLY DEFERRED;
9. PostgreSQL
Rows ~ 2.8M
Size ~ 83MB
message_id index size ~ 77MB
tag_id index size ~ 119MB
Migration time ~ 3.2 sec
10. Solution
Modify the migration that created the field and add the change there
* It is a known issue: https://code.djangoproject.com/ticket/25253
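One way to spot this kind of migration before it reaches a big table is to scan the output of `manage.py sqlmigrate` for constraints that are dropped and then re-added under the same name. The following is a minimal sketch, assuming the SQL is already available as a string; `find_readded_constraints` and `SAMPLE_SQL` are hypothetical names for illustration and not part of Django.

```python
import re

# Matches an identifier quoted with backticks (MySQL) or double quotes (PostgreSQL).
_QUOTED = r'[`"]([^`"]+)[`"]'
_DROP = re.compile(r'DROP\s+(?:FOREIGN\s+KEY|CONSTRAINT)\s+' + _QUOTED, re.IGNORECASE)
_ADD = re.compile(r'ADD\s+CONSTRAINT\s+' + _QUOTED, re.IGNORECASE)


def find_readded_constraints(sql):
    """Return the sorted names of constraints that are both dropped and added."""
    dropped = set(_DROP.findall(sql))
    added = set(_ADD.findall(sql))
    return sorted(dropped & added)


# Sample input: one of the statement pairs from the MySQL migration above.
SAMPLE_SQL = """
ALTER TABLE `big_tables_message_tags` DROP FOREIGN KEY
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id`;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message_tags_tag_id_5eb6034e_fk_big_tables_tag_id` FOREIGN KEY (`tag_id`)
REFERENCES `big_tables_tag` (`id`);
"""

readded = find_readded_constraints(SAMPLE_SQL)
for name in readded:
    print('dropped and re-added: ' + name)
```

A non-empty result only flags the migration for a closer look; it does not prove the migration is a no-op.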
11. Adding fields to big tables
class MessagesTags(models.Model):
    message = models.ForeignKey(Message)
    tag = models.ForeignKey(Tag)
    added_by = models.ForeignKey(User, null=True)  # the new field
13. MySQL INPLACE
ALTER TABLE `big_tables_message_tags` ADD COLUMN `added_by_id`
integer NULL, ALGORITHM INPLACE, LOCK NONE;
ALTER TABLE `big_tables_message_tags` ADD CONSTRAINT
`big_tables_message_ta_added_by_id_88e3a4dc_fk_big_tables_user_id`
FOREIGN KEY (`added_by_id`) REFERENCES `big_tables_user` (`id`),
ALGORITHM INPLACE, LOCK NONE;
* When adding a foreign key, the INPLACE algorithm is supported only when
foreign_key_checks is disabled. Otherwise, only the COPY algorithm is supported.
14. Running this on prod
Running it on prod resulted in the API crashing
The query is non-locking, but still too heavy for the DB
Aurora appears to be even slower
16. Alternative
class MessagesTags(models.Model):
    ...

    @property
    def status(self):
        # Read from the related MessagesTagsExtend row, if there is one
        try:
            return self.messagestagsextend.status
        except MessagesTagsExtend.DoesNotExist:
            # No extension row yet: fall back to the default
            return MessagesTagsExtend.DEFAULT_STATUS

    @status.setter
    def status(self, value):
        # Create the extension row on first write
        obj, _ = MessagesTagsExtend.objects.get_or_create(message_tag=self)
        obj.status = value
        obj.save()
        self.messagestagsextend = obj
* Performance has not been tested in a production environment
17. Iterating on big tables
for x in MessagesTags.objects.all():
    print(x)
+ Single SQL query
- Loads everything in memory
18. Iterating on big tables
for x in MessagesTags.objects.iterator():
    print(x)
+ Single SQL query
+ Loads pieces of the result in memory
- prefetch_related does not work with iterator()
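A possible middle ground is to walk the table in primary-key order, one chunk at a time, so that only one chunk is in memory and each chunk is an ordinary queryset. The sketch below shows the idea without a database. `iterate_in_chunks` and its `fetch_page` callback are hypothetical names, the Django callback in the comment is untested, and positive integer primary keys are assumed.

```python
from collections import namedtuple


def iterate_in_chunks(fetch_page, chunk_size=1000):
    """Yield rows one by one while fetching them chunk_size at a time.

    fetch_page(last_pk, limit) is a hypothetical callback that must return a
    list of at most `limit` rows with pk > last_pk, ordered by pk.
    Assumes positive integer primary keys.
    """
    last_pk = 0
    while True:
        page = fetch_page(last_pk, chunk_size)
        if not page:
            return
        for row in page:
            yield row
        # Resume after the last row seen instead of using OFFSET.
        last_pk = page[-1].pk


# With Django, the callback could look like this (untested sketch):
#   def fetch_page(last_pk, limit):
#       qs = MessagesTags.objects.filter(pk__gt=last_pk).order_by('pk')
#       return list(qs.prefetch_related('tag')[:limit])

# Stand-in for a table so the sketch runs without a database.
Row = namedtuple('Row', ['pk'])
FAKE_TABLE = [Row(pk) for pk in range(1, 11)]
QUERIES = []  # records the last_pk of each simulated query


def fake_fetch_page(last_pk, limit):
    QUERIES.append(last_pk)
    return [row for row in FAKE_TABLE if row.pk > last_pk][:limit]


seen_pks = [row.pk for row in iterate_in_chunks(fake_fetch_page, chunk_size=4)]
```

The trade-off is several SQL queries instead of one, plus one final query that returns nothing and ends the loop.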