Scaling Multi-Tenant Applications
Using the Django ORM & Postgres
Louise Grandjonc - PyCaribbean 2019
@louisemeta
About me
Software Engineer at Citus Data
Postgres enthusiast
@louisemeta and @citusdata on twitter
www.louisemeta.com
louise@citusdata.com
Today’s agenda
1. What do we mean by “multi-tenancy”?
2. Three ways to scale a multi-tenant app
3. Shared tables in your Django apps
4. Postgres, Citus and shared tables
What do we mean by
“multi-tenancy”
What do we mean by “multi-tenancy”
- Multiple customers (tenants)
- Each with their own data
- SaaS
- Examples: Shopify, Salesforce
What do we mean by “multi-tenancy”
A very realistic example
Owner examples:
- Hogwarts
- Ministry of Magic
- Harry Potter
- Post office
What do we mean by “multi-tenancy”
The problem of scaling multi-tenant apps
3 ways to scale
a multi-tenant app
Solution 1:
One database per tenant
One database per tenant
- A database is an organized collection of interrelated data
- Databases don't share resources:
  - Username and password
  - Connections
  - Memory
One database per tenant
1. Changing the settings
    # settings.py: one entry per tenant database
    DATABASES = {
        'tenant_id1': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': 'hogwarts',
            'USER': 'louise',
            'PASSWORD': 'abc',
            'HOST': '…',
            'PORT': '5432',
        },
        'tenant_id2': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': 'ministry',
            'USER': 'louise',
            'PASSWORD': 'abc',
            'HOST': '…',
            'PORT': '5432',
        },
        # …
    }
Warnings!
- Every tenant needs its own entry in the settings.
- When you have a new customer, you need to create a database and update the settings.
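Writing one settings entry per tenant by hand gets tedious quickly. A minimal sketch of generating the entries instead, assuming a hypothetical TENANTS registry (alias to database name) and a placeholder host; neither comes from the talk:

```python
def build_databases(tenants, user, password, host, port="5432"):
    """Return a DATABASES-style dict with one alias per tenant.

    `tenants` maps a database alias (e.g. 'tenant_id1') to a database name.
    """
    databases = {}
    for alias, db_name in tenants.items():
        databases[alias] = {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": db_name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,
            "PORT": port,
        }
    return databases


# Hypothetical registry; in practice it might be loaded from a file or a service.
TENANTS = {"tenant_id1": "hogwarts", "tenant_id2": "ministry"}

# "localhost" is a placeholder host used only to make the sketch runnable.
DATABASES = build_databases(TENANTS, user="louise", password="abc", host="localhost")
```

Adding a customer still means creating the database and reloading the settings; the sketch only removes the copy-paste.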
One database per tenant
2. Handling migrations
    python manage.py migrate --database=tenant_id1
Run this for each tenant whenever you have a new migration.
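Running the command once per tenant is easy to script. A rough sketch, assuming the list of database aliases is available to the script; the function names are illustrative:

```python
import subprocess
import sys


def migrate_commands(aliases, python=sys.executable):
    """Build one `manage.py migrate --database=<alias>` command per tenant."""
    return [
        [python, "manage.py", "migrate", f"--database={alias}"]
        for alias in aliases
    ]


def migrate_all(aliases, dry_run=True):
    """Run (or, with dry_run, only print) the migration for every tenant."""
    for cmd in migrate_commands(aliases):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first tenant whose migration fails.
            subprocess.run(cmd, check=True)
```

Calling migrate_all(["tenant_id1", "tenant_id2"]) prints the two commands; the total run time still grows with the number of tenants.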
One database per tenant
Changes needed to handle it with Django ORM
3. Creating your own database router
    class ExampleDatabaseRouter:
        """
        Determines on which tenant database to read/write
        """
        def db_for_read(self, model, **hints):
            """Returns the name of the right database depending on the query"""
            return 'tenant_idx'

        def db_for_write(self, model, **hints):
            """Returns the name of the right database depending on the query"""
            return 'tenant_idx'

        def allow_relation(self, obj1, obj2, **hints):
            """Determine if relationship is allowed between two objects.
            The two objects have to be on the same database ;)"""
            return obj1._state.db == obj2._state.db
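One possible way for a router to pick the right database is to read the current tenant's alias from a context variable. This is a sketch under that assumption, not the approach shown in the talk; set_current_alias and TenantDatabaseRouter are hypothetical names:

```python
from contextvars import ContextVar

# Database alias of the tenant being served; None means "not set yet".
_current_alias = ContextVar("current_tenant_alias", default=None)


def set_current_alias(alias):
    """Hypothetical helper: call it once per request, e.g. from middleware."""
    _current_alias.set(alias)


class TenantDatabaseRouter:
    """Routes reads and writes to the current tenant's database."""

    def _alias(self):
        alias = _current_alias.get()
        if alias is None:
            # Refuse to guess rather than risk touching another tenant's data.
            raise RuntimeError("no tenant selected for this request")
        return alias

    def db_for_read(self, model, **hints):
        return self._alias()

    def db_for_write(self, model, **hints):
        return self._alias()

    def allow_relation(self, obj1, obj2, **hints):
        # Django records the database an instance came from in `_state.db`.
        return obj1._state.db == obj2._state.db
```

The router would then be listed in the DATABASE_ROUTERS setting. Raising when no tenant is set is a design choice: a loud failure is usually preferable to a query silently hitting the wrong database.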
One database per tenant
PROS
- Start quickly
- Isolate customer (tenant) data
- Compliance is a bit easier
- If one customer is queried a lot, performance degradation will be low
CONS
- Time for DBA/developer to manage
- Hard to handle with ORMs
- Maintain consistency (ex: create index across all databases)
- Longer running migrations
- Performance degrades as # customers (tenants) goes up
Solution 2:
One schema per tenant
One schema per tenant
- A schema is a logical namespace that holds a set of tables
- Schemas in the same database share resources:
  - Username and password
  - Connections
  - Memory
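With one schema per tenant, the application typically creates a schema when a tenant signs up and points the connection's search_path at it before running queries. A minimal sketch that only builds the SQL strings; the tenant_ prefix and the slug rules are assumptions made for illustration:

```python
import re

# Only allow simple lowercase slugs so the name is safe to use as an identifier.
_SLUG = re.compile(r"[a-z][a-z0-9_]*")


def schema_name(tenant_slug):
    """Map a tenant slug to a schema name, rejecting unsafe identifiers."""
    if not _SLUG.fullmatch(tenant_slug):
        raise ValueError(f"invalid tenant slug: {tenant_slug!r}")
    return f"tenant_{tenant_slug}"


def create_schema_sql(tenant_slug):
    """SQL to run once, when the tenant is created."""
    return f'CREATE SCHEMA IF NOT EXISTS "{schema_name(tenant_slug)}"'


def use_schema_sql(tenant_slug):
    """SQL to run on a connection before serving this tenant's queries."""
    return f'SET search_path TO "{schema_name(tenant_slug)}", public'
```

Every schema holds its own copy of each table, which is why a change such as adding a column has to be repeated for every tenant.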
One schema per tenant
PROS
- Better resource utilization vs. one database per tenant
- Start quickly
- Logical isolation
CONS
- Hard to manage (ex: add column across all schemas)
- Longer running migrations
- Performance degrades as # customers (tenants) goes up
Solution 3:
Shared tables architecture
Shared tables architecture
PROS
- Easy maintenance
- Faster running migrations
- Best resource utilization
- Faster performance
- Scales to 1k-100k tenants
CONS
- Application code to guarantee isolation
- Make sure ORM calls are always scoped to a single tenant
3 ways to scale multi-tenant apps
Shared tables in your Django apps
Main problems to solve
- Make sure ORM calls are always scoped to a single tenant
- Include the tenant column in joins
django-multitenant
Automatically scopes all ORM calls to a single tenant
django-multitenant
Without django-multitenant:
    Owl.objects.filter(name='Hedwige')
    <=>
    SELECT * FROM app_owl WHERE name='Hedwige'
With django-multitenant:
    Owl.objects.filter(name='Hedwige')
    <=>
    SELECT * FROM app_owl
    WHERE name='Hedwige'
    AND owner_id = <tenant_id>
django-multitenant
Without django-multitenant:
    Letter.objects.filter(id=1).select_related('deliverer')
    <=>
    SELECT * FROM app_letter
    INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id)
    WHERE app_letter.id=1
With django-multitenant:
    Letter.objects.filter(id=1).select_related('deliverer')
    <=>
    SELECT * FROM app_letter
    INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id
    AND app_owl.owner_id=app_letter.owner_id)
    WHERE app_letter.id=1 AND app_owl.owner_id = <tenant_id>
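To see what the tenant-scoped join buys you, here is a small self-contained sketch that runs the same shape of query against an in-memory SQLite database. SQLite stands in for Postgres purely so the example runs anywhere, and the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_owl (id INTEGER, owner_id INTEGER, name TEXT);
    CREATE TABLE app_letter (id INTEGER, owner_id INTEGER,
                             deliverer_id INTEGER, content TEXT);
    -- Two tenants: owner_id 1 and owner_id 2.
    INSERT INTO app_owl VALUES (1, 1, 'Hedwige'), (2, 2, 'Errol');
    INSERT INTO app_letter VALUES (1, 1, 1, 'Dear Harry'),
                                  (2, 2, 2, 'Dear Minister');
""")

# The tenant column appears both in the join condition and in the filter.
SCOPED_QUERY = """
    SELECT app_letter.content, app_owl.name
    FROM app_letter
    INNER JOIN app_owl
        ON app_owl.id = app_letter.deliverer_id
       AND app_owl.owner_id = app_letter.owner_id
    WHERE app_letter.id = ? AND app_owl.owner_id = ?
"""


def letter_for_tenant(letter_id, tenant_id):
    """Return (content, owl name) rows visible to one tenant only."""
    return conn.execute(SCOPED_QUERY, (letter_id, tenant_id)).fetchall()
```

Letter 1 belongs to tenant 1, so letter_for_tenant(1, 1) finds it, while the same lookup scoped to tenant 2 comes back empty even though the id matches.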
django-multitenant
3 steps
1. Change models to use TenantModelMixin and TenantManagerMixin
2. Change ForeignKey to TenantForeignKey
3. Define tenant scoping: set_current_tenant(t)
django-multitenant
3 steps
Models before using django-multitenant
    from django.db import models

    class Owner(models.Model):
        type = models.CharField(max_length=10)  # add choice
        name = models.CharField(max_length=255)

    class Owl(models.Model):
        name = models.CharField(max_length=255)
        owner = models.ForeignKey(Owner, on_delete=models.CASCADE)
        feather_color = models.CharField(max_length=255)
        favorite_food = models.CharField(max_length=255)

    class Letter(models.Model):
        content = models.TextField()
        deliverer = models.ForeignKey(Owl, on_delete=models.CASCADE)
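django-multitenant
3 steps
Models with django-multitenant
    # Import paths follow the django-multitenant README; check your installed version.
    from django.db import models
    from django_multitenant.fields import TenantForeignKey
    from django_multitenant.mixins import TenantManagerMixin, TenantModelMixin

    class TenantManager(TenantManagerMixin, models.Manager):
        pass

    class Owner(TenantModelMixin, models.Model):
        type = models.CharField(max_length=10)  # add choice
        name = models.CharField(max_length=255)
        tenant_id = 'id'  # the owner itself is the tenant
        objects = TenantManager()

    class Owl(TenantModelMixin, models.Model):
        name = models.CharField(max_length=255)
        owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
        feather_color = models.CharField(max_length=255)
        favorite_food = models.CharField(max_length=255)
        tenant_id = 'owner_id'
        objects = TenantManager()

    class Letter(TenantModelMixin, models.Model):
        content = models.TextField()
        # TenantForeignKey adds the tenant column to the join on deliverer
        deliverer = TenantForeignKey(Owl, on_delete=models.CASCADE)
        owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
        tenant_id = 'owner_id'
        objects = TenantManager()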
django-multitenant
3 steps
set_current_tenant(t)
- Specifies which tenant the ORM calls should be scoped to
- Set in a middleware, as part of the authentication logic
- Or set explicitly at the top of a function (ex. view, external tasks/jobs)
django-multitenant
3 steps
set_current_tenant(t) in Middleware
    from django_multitenant.utils import set_current_tenant

    class TenantMiddleware:
        def __init__(self, get_response):
            self.get_response = get_response
            # One-time configuration and initialization.

        def __call__(self, request):
            # Assuming your app has a function to get the tenant associated with a user
            current_tenant = get_tenant_for_user(request.user)
            set_current_tenant(current_tenant)
            response = self.get_response(request)
            return response
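Outside of a request (external tasks and jobs), the tenant has to be set by hand, and it is easy to forget to restore the previous one afterwards. One way to make that harder to get wrong is a context manager. The sketch below uses local stand-ins for set_current_tenant and get_current_tenant so that it runs on its own; in a real project the library's functions would be used instead, and tenant_scope and count_letters_job are hypothetical names:

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Stand-in for the library's tenant storage, so the sketch is self-contained.
_tenant = ContextVar("tenant", default=None)


def set_current_tenant(tenant):
    _tenant.set(tenant)


def get_current_tenant():
    return _tenant.get()


@contextmanager
def tenant_scope(tenant):
    """Scope everything inside the `with` block to `tenant`, then restore."""
    previous = get_current_tenant()
    set_current_tenant(tenant)
    try:
        yield tenant
    finally:
        # Restore even if the job raised, so no tenant leaks into the next job.
        set_current_tenant(previous)


def count_letters_job(tenant):
    """Hypothetical background job; a real one would run ORM queries here."""
    with tenant_scope(tenant):
        return f"counting letters for {get_current_tenant()}"
```

A worker can then wrap each job in tenant_scope(t) rather than relying on every job author to call set_current_tenant at the top of the function.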
django-multitenant
Benefits of django-multitenant
- Drop-in implementation of shared tables architecture
- Guarantees isolation
- Ready to scale with distributed Postgres (Citus)
Postgres, Citus
and shared tables
Why Postgres
- Open source
- Constraints
- Rich SQL support
- Extensions
  - PostGIS / Geospatial
  - HLL
  - TopN
  - Citus
- Foreign data wrappers
- Fun indexes (GIN, GiST, BRIN…)
- CTEs
- Window functions
- Full text search
- Datatypes
  - JSONB
Why Citus
- Citus is an open source extension for PostgreSQL
- Implements a distributed architecture for Postgres
- Allows you to scale out CPU, memory, etc.
- Compatible with modern Postgres (up to 11)
Distributed Postgres with Citus
Foreign key colocation
- Full SQL support for queries on a single set of co-located shards
- Multi-statement transaction support for modifications on a single set of co-located shards
- Foreign keys
- …
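The idea behind co-location can be illustrated with a toy model: if every table is distributed on the tenant column with the same hash, all rows of a tenant end up on the same shard, so joins and foreign keys within a tenant never cross shards. The hash function and shard count below are illustrative assumptions, not what Citus uses internally:

```python
import zlib

SHARD_COUNT = 4  # illustrative value only


def shard_for(tenant_id, shard_count=SHARD_COUNT):
    """Toy placement: hash the tenant (distribution) column to a shard index."""
    return zlib.crc32(str(tenant_id).encode()) % shard_count


def place(rows, tenant_column="owner_id"):
    """Group rows by the shard their tenant column hashes to."""
    shards = {}
    for row in rows:
        shards.setdefault(shard_for(row[tenant_column]), []).append(row)
    return shards


owls = [{"id": 1, "owner_id": 1}, {"id": 2, "owner_id": 2}]
letters = [
    {"id": 10, "owner_id": 1, "deliverer_id": 1},
    {"id": 11, "owner_id": 2, "deliverer_id": 2},
]

# Both tables are placed with the same function on the same column,
# so a letter always lands on the shard that holds its owl.
owl_shards = place(owls)
letter_shards = place(letters)
```

This is also why queries need the tenant column in their filters: it tells the database which single set of co-located shards to look at.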
Distributed Postgres with Citus
Scope your queries!
Why Citus
Scale out Django!
github.com/citusdata/django-multitenant
louise@citusdata.com
citusdata.com/newsletter
@louisemeta, @citusdata