Django Meetup: Django Multicolumn Joins

A presentation shared by Hearsay Social software engineer Jeremy Tillman.

    Django Meetup: Django Multicolumn Joins Presentation Transcript

    • Django Meetup: Django Multicolumn Joins. Jeremy Tillman, Software Engineer, Hearsay Social (@hssengineering)
    • Django Multicolumn Joins | © 2012 Hearsay Social | About Me
        • Joined Hearsay Social in May 2012 as Software Engineering Generalist
        • Computer Engineer BA, Purdue University
        • 3 years @ Microsoft working on versions of Windows Server
        • 9 years of database experience
            – Access, SQL Server, MySQL
        • Loves sea turtles!
    • Why do we want multicolumn joins?
    • Django First App: Poll example
        from django.db import models

        class Poll(models.Model):
            question = models.CharField(max_length=200)
            pub_date = models.DateTimeField('date published')

        class Choice(models.Model):
            poll = models.ForeignKey(Poll)
            choice_text = models.CharField(max_length=200)
            votes = models.IntegerField(default=0)
    • What if we stored Polls for X number of customers?
        class Customer(models.Model):
            name = models.CharField(max_length=100)

            class Meta:
                ordering = ('name',)

        class Poll(models.Model):
            customer = models.ForeignKey(Customer)
            question = models.CharField(max_length=200)
            pub_date = models.DateTimeField('date published')

        class Choice(models.Model):
            poll = models.ForeignKey(Poll)
            choice_text = models.CharField(max_length=200)
            votes = models.IntegerField(default=0)

        CREATE TABLE customer(
            id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(100) NOT NULL);

        CREATE TABLE poll(
            id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            customer_id INT NOT NULL,
            question VARCHAR(200) NOT NULL,
            pub_date DATETIME NOT NULL,
            INDEX idx_customer (customer_id));

        CREATE TABLE choice(
            id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            poll_id INT NOT NULL,
            choice_text VARCHAR(200),
            votes INT NOT NULL DEFAULT 0,
            INDEX idx_poll (poll_id));
    • How is our data being stored?
        CREATE TABLE choice(
            id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            poll_id INT NOT NULL,
            choice_text VARCHAR(200),
            votes INT NOT NULL DEFAULT 0,
            INDEX idx_poll (poll_id));

        id          poll_id  choice_text       votes
        1           1        Ham               5
        2           7        Aries             8
        3           2        Elephant          9
        …           …        …                 …
        23,564,149  1        All of the above  2
        23,564,150  74       Sea turtle        7
    • Data locality part 1: Scope by poll
        CREATE TABLE choice(
            id INT NOT NULL,
            poll_id INT NOT NULL,
            choice_text VARCHAR(200),
            votes INT NOT NULL DEFAULT 0,
            PRIMARY KEY (poll_id, id));

        id          poll_id  choice_text       votes
        1           1        Ham               5
        1,562       1        Turkey            46
        23,564,149  1        All of the above  2
        …           …        …                 …
        18,242,234  74       Jelly fish        0
        23,564,150  74       Sea turtle        7
    • Data locality part 2: Scope by customer
        CREATE TABLE choice(
            id INT NOT NULL,
            customer_id INT NOT NULL,
            poll_id INT NOT NULL,
            choice_text VARCHAR(200),
            votes INT NOT NULL DEFAULT 0,
            PRIMARY KEY (customer_id, poll_id, id));

        id          poll_id  customer_id  choice_text       votes
        1           1        1            Ham               5
        1,562       1        1            Turkey            46
        23,564,149  1        1            All of the above  2
        18,242,234  74       1            Jelly fish        0
        23,564,150  74       1            Sea turtle        7
        …           …        …            …                 …
    • Representation in Django Models
        class Customer(models.Model):
            name = models.CharField(max_length=100)

            class Meta:
                ordering = ('name',)

        class Poll(models.Model):
            customer = models.ForeignKey(Customer)
            question = models.CharField(max_length=200)
            pub_date = models.DateTimeField('date published')

        class Choice(models.Model):
            customer = models.ForeignKey(Customer)
            poll = models.ForeignKey(Poll)
            choice_text = models.CharField(max_length=200)
            votes = models.IntegerField(default=0)
    • Customer Load/Data Balance
        customer_id  id
        1            1
        2            2
        3            3
        4            4
    • Customer Load/Data Balance: Split Customers
        customer_id  id
        3            3
        3            5
        4            4
        4            6

        customer_id  id
        1            1
        1            5
        2            2
        2            6
    • Add DB and Balance Load: id collision
        customer_id  id
        3            3
        3            5

        customer_id  id
        1            1
        1            5

        customer_id  id
        2            2
        2            6
        4            4
        4            6
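
The collision on the slide above can be checked with a small sketch. The row values are copied from the slide; the shard names and the find_collisions helper are hypothetical and exist only for illustration. It suggests why a key on id alone stops being unique once customers from different databases share one, while a key on (customer_id, id) stays unique.

```python
# Rows are (customer_id, id), copied from the slide.
# The shard names are hypothetical labels for the three tables.
shard_a = [(3, 3), (3, 5)]
shard_b = [(1, 1), (1, 5)]
shard_c = [(2, 2), (2, 6), (4, 4), (4, 6)]


def find_collisions(rows, key):
    """Return the set of key values that occur more than once in rows."""
    seen, dupes = set(), set()
    for row in rows:
        k = key(row)
        if k in seen:
            dupes.add(k)
        seen.add(k)
    return dupes


# Keyed on id alone: customers 2 and 4 both own a row with id 6.
id_only = find_collisions(shard_c, key=lambda row: row[1])

# Keyed on (customer_id, id): every row is still unique.
composite = find_collisions(shard_c, key=lambda row: row)

print(id_only, composite)
```

With a composite key, rows can be moved between databases per customer without renumbering.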
    • Queries: Find all choices for a poll?
        Poll
        customer_id  id  question
        1            1   What’s your seat pref.?
        1            2   Are you married?
        2            1   Gender?
        2            2   Did you have fun?

        Choice
        customer_id  poll_id  id  choice_text
        1            1        1   Window
        1            1        2   Aisle
        1            2        1   Yes
        1            2        2   No
        2            1        1   Male
        2            1        2   Female
        2            2        1   Yes?
    • Queries: Find all choices for a poll? Attempt 1) Using related set
        target_poll.choice_set.all()
        or
        Choice.objects.filter(poll=target_poll)

        SELECT * FROM choice WHERE poll_id = 1

        (Poll and Choice sample tables as on the first Queries slide.)
    • Queries: Find all choices for a poll? Attempt 2) Adding an F expression
        target_poll.choice_set.filter(customer=F('poll__customer'))
        or
        Choice.objects.filter(poll=target_poll, customer=F('poll__customer'))

        SELECT c.* FROM choice c
        INNER JOIN poll p ON c.poll_id = p.id
        WHERE c.poll_id = 1 AND c.customer_id = p.customer_id;

        (Poll and Choice sample tables as on the first Queries slide.)
    • Queries: Find all choices for a poll? Attempt 3) Filter explicitly
        target_poll.choice_set.filter(customer=target_poll.customer)
        or
        Choice.objects.filter(poll=target_poll, customer=target_poll.customer)

        SELECT * FROM choice WHERE poll_id = 1 AND customer_id = 2;

        (Poll and Choice sample tables as on the first Queries slide.)
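
To see what the SQL from the three attempts returns, here is a minimal sketch that loads the sample Poll and Choice rows into an in-memory SQLite database and runs each statement. SQLite stands in for the MySQL schema purely for convenience, and the texts helper and ORDER BY clauses are additions for illustration. It assumes target_poll is customer 2's poll 1, as the SQL in Attempt 3 implies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE poll (customer_id INT, id INT, question TEXT,
                       PRIMARY KEY (customer_id, id));
    CREATE TABLE choice (customer_id INT, poll_id INT, id INT, choice_text TEXT,
                         PRIMARY KEY (customer_id, poll_id, id));
""")
conn.executemany("INSERT INTO poll VALUES (?, ?, ?)", [
    (1, 1, "What's your seat pref.?"), (1, 2, "Are you married?"),
    (2, 1, "Gender?"), (2, 2, "Did you have fun?"),
])
conn.executemany("INSERT INTO choice VALUES (?, ?, ?, ?)", [
    (1, 1, 1, "Window"), (1, 1, 2, "Aisle"), (1, 2, 1, "Yes"), (1, 2, 2, "No"),
    (2, 1, 1, "Male"), (2, 1, 2, "Female"), (2, 2, 1, "Yes?"),
])


def texts(sql):
    """Run a query and return the first column of every row (illustrative helper)."""
    return [row[0] for row in conn.execute(sql)]


# Attempt 1: poll_id alone also matches customer 1's poll 1.
attempt1 = texts(
    "SELECT choice_text FROM choice WHERE poll_id = 1 ORDER BY customer_id, id")

# Attempt 2: the join is on poll.id only, so both customers' polls still match.
attempt2 = texts(
    "SELECT c.choice_text FROM choice c "
    "INNER JOIN poll p ON c.poll_id = p.id "
    "WHERE c.poll_id = 1 AND c.customer_id = p.customer_id "
    "ORDER BY c.customer_id, c.id")

# Attempt 3: both columns are constrained, so only customer 2's choices remain.
attempt3 = texts(
    "SELECT choice_text FROM choice "
    "WHERE poll_id = 1 AND customer_id = 2 ORDER BY id")

print(attempt1, attempt2, attempt3)
```

Only the third query is scoped to a single customer, at the cost of repeating the customer filter at every call site.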
    • Field Assignment
        quantity_inn = Customer.objects.create(id=15, name='Quantity Inn')
        quantity_poll = Poll.objects.create(id=1, customer=quantity_inn,
                                            question='What size bed do you prefer?')
        choice1 = Choice(id=1, choice_text="King", poll=quantity_poll)
        choice1.customer_id  # ??????
        choice1.customer = quantity_poll.customer  # Repetitive
    • What do we do?
    • Solution via Django 1.6
        class ForeignObject(othermodel, from_fields, to_fields[, **options])
        where:
        from django.db.models import ForeignObject
    • ForeignObject Usage
        class ForeignModel(models.Model):
            id1 = models.IntegerField()
            id2 = models.IntegerField()

        class ReferencingModel(models.Model):
            om_id1 = models.IntegerField()
            om_id2 = models.IntegerField()
            # Field names are passed as strings, paired position by position.
            om = ForeignObject(ForeignModel,
                               from_fields=('om_id1', 'om_id2'),
                               to_fields=('id1', 'id2'))
    • Conversion from ForeignKey to ForeignObject
        # Before: ForeignKey
        class Choice(models.Model):
            customer = models.ForeignKey(Customer)
            poll = models.ForeignKey(Poll)
            choice_text = models.CharField(max_length=200)
            votes = models.IntegerField(default=0)

        # After: ForeignObject
        class Choice(models.Model):
            customer = models.ForeignKey(Customer)
            poll_id = models.IntegerField()
            choice_text = models.CharField(max_length=200)
            votes = models.IntegerField(default=0)
            poll = models.ForeignObject(Poll,
                                        from_fields=('customer', 'poll_id'),
                                        to_fields=('customer', 'id'))
    • Queries with ForeignObject: Attempt 1) Using related set
        target_poll.choice_set.all()

        SELECT * FROM choice WHERE poll_id = 1 AND customer_id = 2;

        (Poll and Choice sample tables as on the first Queries slide.)
    • Queries with ForeignObject: Attempt 2) Manually stated
        Choice.objects.filter(poll=target_poll)

        SELECT * FROM choice WHERE poll_id = 1 AND customer_id = 2;

        (Poll and Choice sample tables as on the first Queries slide.)
    • Queries with ForeignObject: Attempt 3) Manually stated w/tuple
        Choice.objects.filter(poll=(2, 1))

        SELECT * FROM choice WHERE poll_id = 1 AND customer_id = 2;

        (Poll and Choice sample tables as on the first Queries slide.)
    • Field Assignment with ForeignObject
        quantity_inn = Customer.objects.create(id=15, name='Quantity Inn')
        quantity_poll = Poll.objects.create(id=1, customer=quantity_inn,
                                            question='What size bed do you prefer?')
        choice1 = Choice(id=1, choice_text="King", poll=quantity_poll)
        choice1.customer_id  # >> 15
        choice1.customer = quantity_poll.customer  # Not needed
    • “With great power comes great responsibility”
    • Tuple ordering matters
        Choice.objects.filter(poll=(1, 2))

        SELECT * FROM choice WHERE poll_id = 2 AND customer_id = 1;

        poll = models.ForeignObject(Poll,
                                    from_fields=('customer', 'poll_id'),
                                    to_fields=('customer', 'id'))

        (Poll and Choice sample tables as on the first Queries slide.)
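
The positional pairing behind the tuple lookup can be sketched in plain Python. This is not Django's implementation: bind_tuple is a hypothetical helper, and the column names customer_id and poll_id are assumed to be the database columns behind from_fields=('customer', 'poll_id').

```python
# Column order follows from_fields=('customer', 'poll_id') on the slide,
# written here as the underlying column names (an assumption of this sketch).
FROM_COLUMNS = ("customer_id", "poll_id")


def bind_tuple(values, columns=FROM_COLUMNS):
    """Pair each tuple position with the column at the same position."""
    if len(values) != len(columns):
        raise ValueError(
            "expected %d values, got %d" % (len(columns), len(values)))
    return dict(zip(columns, values))


as_written = bind_tuple((1, 2))  # customer 1, poll 2
swapped = bind_tuple((2, 1))     # customer 2, poll 1: a different row

print(as_written, swapped)
```

Swapping the two values silently selects a different poll rather than raising an error, which is why the ordering deserves care.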
    • IN Operator
        Choice.objects.filter(poll__in=[(2, 1), (2, 2)])

        SELECT * FROM choice
        WHERE (poll_id = 1 AND customer_id = 2)
           OR (poll_id = 2 AND customer_id = 2);

        poll = models.ForeignObject(Poll,
                                    from_fields=('customer', 'poll_id'),
                                    to_fields=('customer', 'id'))

        (Poll and Choice sample tables as on the first Queries slide.)
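
The OR expansion shown for the IN lookup can be reproduced by hand. The sketch below is not how Django builds the query: or_expand is a hypothetical helper that turns a list of (customer_id, poll_id) tuples into a parameterized WHERE clause, which is then run against the sample Choice rows in an in-memory SQLite database.

```python
import sqlite3


def or_expand(columns, pairs):
    """Build '(a = ? AND b = ?) OR (...)' plus the flat parameter list."""
    if not pairs:
        # An empty IN list matches nothing.
        return "0 = 1", []
    group = "(" + " AND ".join("%s = ?" % col for col in columns) + ")"
    where = " OR ".join([group] * len(pairs))
    params = [value for pair in pairs for value in pair]
    return where, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE choice (customer_id INT, poll_id INT, id INT, choice_text TEXT)")
conn.executemany("INSERT INTO choice VALUES (?, ?, ?, ?)", [
    (1, 1, 1, "Window"), (1, 1, 2, "Aisle"), (1, 2, 1, "Yes"), (1, 2, 2, "No"),
    (2, 1, 1, "Male"), (2, 1, 2, "Female"), (2, 2, 1, "Yes?"),
])

# Tuples are (customer_id, poll_id), matching the from_fields order.
where, params = or_expand(("customer_id", "poll_id"), [(2, 1), (2, 2)])
sql = "SELECT choice_text FROM choice WHERE " + where + " ORDER BY poll_id, id"
rows = [row[0] for row in conn.execute(sql, params)]

print(where)
print(rows)
```

Values are passed as bound parameters rather than interpolated into the SQL string; only the column names, which are fixed in code, are formatted in.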
    • IN Operator w/queryset
        Choice.objects.filter(poll__in=Poll.objects.filter(customer_id=2))

        SELECT c.* FROM choice c
        WHERE EXISTS (SELECT p.customer_id, p.id FROM poll p
                      WHERE p.customer_id = 2
                        AND p.customer_id = c.customer_id
                        AND p.id = c.poll_id);

        poll = models.ForeignObject(Poll,
                                    from_fields=('customer', 'poll_id'),
                                    to_fields=('customer', 'id'))

        (Poll and Choice sample tables as on the first Queries slide.)
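
The correlated EXISTS form above can be tried directly against the sample data. This sketch uses an in-memory SQLite database as a stand-in for the real backend; the ORDER BY clause and the bound parameter for the customer id are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE poll (customer_id INT, id INT, question TEXT);
    CREATE TABLE choice (customer_id INT, poll_id INT, id INT, choice_text TEXT);
""")
conn.executemany("INSERT INTO poll VALUES (?, ?, ?)", [
    (1, 1, "What's your seat pref.?"), (1, 2, "Are you married?"),
    (2, 1, "Gender?"), (2, 2, "Did you have fun?"),
])
conn.executemany("INSERT INTO choice VALUES (?, ?, ?, ?)", [
    (1, 1, 1, "Window"), (1, 1, 2, "Aisle"), (1, 2, 1, "Yes"), (1, 2, 2, "No"),
    (2, 1, 1, "Male"), (2, 1, 2, "Female"), (2, 2, 1, "Yes?"),
])

# The subquery is correlated on both key columns, so a choice matches only
# a poll with the same customer_id and the same poll id.
sql = """
    SELECT c.choice_text FROM choice c
    WHERE EXISTS (SELECT p.customer_id, p.id FROM poll p
                  WHERE p.customer_id = ?
                    AND p.customer_id = c.customer_id
                    AND p.id = c.poll_id)
    ORDER BY c.poll_id, c.id
"""
rows = [row[0] for row in conn.execute(sql, (2,))]

print(rows)
```

EXISTS works on backends that lack row-value IN, which is presumably why it is the generic form and the MySQL slides show a tuple IN instead.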
    • IN Operator with MySQL
        Choice.objects.filter(poll__in=[(2, 1), (2, 2)])

        SELECT * FROM choice
        WHERE (poll_id, customer_id) IN ((1, 2), (2, 2));

        poll = models.ForeignObject(Poll,
                                    from_fields=('customer', 'poll_id'),
                                    to_fields=('customer', 'id'))

        (Poll and Choice sample tables as on the first Queries slide.)
    • IN Operator w/queryset & MySQL
        Choice.objects.filter(poll__in=Poll.objects.filter(customer_id=2))

        SELECT c.* FROM choice c
        WHERE (c.customer_id, c.poll_id) IN
              (SELECT p.customer_id, p.id FROM poll p WHERE p.customer_id = 2);

        poll = models.ForeignObject(Poll,
                                    from_fields=('customer', 'poll_id'),
                                    to_fields=('customer', 'id'))

        (Poll and Choice sample tables as on the first Queries slide.)
    • ForeignKey vs ForeignObject: What's the difference?
        ForeignKey is a ForeignObject. Pseudo definition:
        ForeignObject(OtherModel,
                      from_fields=('self',),
                      to_fields=(OtherModel._meta.pk.name,))
    • ForeignKey usage: Order By Example
        Poll.objects.order_by('customer')

        class Customer(models.Model):
            name = models.CharField(max_length=100)

            class Meta:
                ordering = ('name',)

        class Poll(models.Model):
            customer = models.ForeignKey(Customer)
            question = models.CharField(max_length=200)
            pub_date = models.DateTimeField('date published')
    • ForeignKey usage: Order By Example
        Poll.objects.order_by('customer')

        SELECT p.* FROM poll p
        INNER JOIN customer c ON p.customer_id = c.id
        ORDER BY c.name ASC;

        (Customer and Poll models as on the previous slide.)
    • ForeignKey usage: Order By Example
        Poll.objects.order_by('customer_id')  # 'customer_id' is an alias for 'customer'

        SELECT p.* FROM poll p
        INNER JOIN customer c ON p.customer_id = c.id
        ORDER BY c.name ASC;

        (Customer and Poll models as on the first Order By slide.)
    • ForeignKey usage: Order By Example
        Poll.objects.order_by('customer__id')

        SELECT p.* FROM poll p
        INNER JOIN customer c ON p.customer_id = c.id
        ORDER BY p.customer_id ASC;

        (Customer and Poll models as on the first Order By slide.)
    • ForeignObject usage: Order By Example
        Poll.objects.order_by('customer_id')

        SELECT * FROM poll ORDER BY customer_id ASC;

        class Customer(models.Model):
            name = models.CharField(max_length=100)

            class Meta:
                ordering = ('name',)

        class Poll(models.Model):
            customer_id = models.IntegerField()
            question = models.CharField(max_length=200)
            pub_date = models.DateTimeField('date published')
            customer = models.ForeignObject(Customer,
                                            from_fields=('customer_id',),
                                            to_fields=('id',))
    • Still more fun stuff
        • ForeignObject.get_extra_descriptor_filter
        • ForeignObject.get_extra_restriction
        • More to come
    • Dig for more information:
        • ForeignObject source
            • django/db/models/fields/related.py
        • V1 version of patch (based on Django 1.4)
            • https://github.com/jtillman/django/tree/MultiColumnJoin
        • Blog post to come
            • Hearsay Social Blog (http://engineering.hearsaysocial.com/)
    • Questions?