Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbean 2019 | Louise Grandjonc

There are a number of data architectures you could use when building a multi-tenant app, such as one database per customer or one schema per customer. These two options scale to an extent, say when you have tens of tenants. However, as you scale to hundreds or thousands of tenants, you start running into challenges with both performance and tenant maintenance. You can solve this problem by adding the notion of tenancy directly into the logic of your SaaS application. The challenge is how to implement and automate this in the Django ORM. We will talk about how to make a Django app tenant-aware and, at a broader level, explain how to scale out applications that are built on top of the Django ORM and follow a multi-tenant data model. We use PostgreSQL as our database of choice, though the logic and implementation can be extended to other relational databases as well.

Published in: Technology

  1. 1. Scaling Multi-Tenant Applications Using the Django ORM & Postgres Louise Grandjonc - Pycaribbean 2019 @louisemeta
  2. 2. About me: Software Engineer at Citus Data. Postgres enthusiast. @louisemeta and @citusdata on Twitter. www.louisemeta.com, louise@citusdata.com
  3. 3. Today’s agenda 1. What do we mean by “multi-tenancy”? 2. Three ways to scale a multi-tenant app 3. Shared tables in your Django apps 4. Postgres, Citus and shared tables
  4. 4. What do we mean by “multi-tenancy”
  5. 5. What do we mean by “multi-tenancy” - Multiple customers (tenants) - Each with their own data - SaaS - Examples: Shopify, Salesforce
  6. 6. What do we mean by “multi-tenancy” A very realistic example. Owner examples: - Hogwarts - Ministry of Magic - Harry Potter - Post office
  7. 7. What do we mean by “multi-tenancy” The problem of scaling multi-tenant apps
  8. 8. 3 ways to scale a multi-tenant app
  9. 9. Solution 1: One database per tenant
  10. 10. One database per tenant - Organized collection of interrelated data - Don’t share resources: username and password, connections, memory
  11. 11. One database per tenant
  12. 12. One database per tenant 1. Changing the settings

        DATABASES = {
            'tenant-{id1}': {
                'ENGINE': 'django.db.backends.postgresql',
                'NAME': 'hogwarts',
                'USER': 'louise',
                'PASSWORD': 'abc',
                'HOST': '…',
                'PORT': '5432'
            },
            'tenant-{id2}': {
                'ENGINE': 'django.db.backends.postgresql',
                'NAME': 'ministry',
                'USER': 'louise',
                'PASSWORD': 'abc',
                'HOST': '…',
                'PORT': '5432'
            },
            # … one entry per tenant
        }

      Warnings! - You need to have each tenant in the settings. - When you have a new customer, you need to create a database and change the settings.
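Writing one settings entry per tenant by hand quickly becomes repetitive. A minimal sketch of generating the dictionary instead is shown here; the `TENANT_DB_NAMES` mapping, the `build_databases` helper, the alias names and the `DB_*` environment variables are assumptions made for illustration and do not come from the talk.

```python
import os

# Hypothetical mapping of Django database alias -> Postgres database name.
TENANT_DB_NAMES = {
    "tenant_hogwarts": "hogwarts",
    "tenant_ministry": "ministry",
}


def build_databases(tenant_db_names):
    """Return a DATABASES dict with one Postgres entry per tenant."""
    # Django requires a 'default' alias; it may be left empty when every
    # query is routed explicitly to a tenant database.
    databases = {"default": {}}
    for alias, db_name in tenant_db_names.items():
        databases[alias] = {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": db_name,
            # Assumed environment variables, so credentials stay out of the code.
            "USER": os.environ.get("DB_USER", "louise"),
            "PASSWORD": os.environ.get("DB_PASSWORD", ""),
            "HOST": os.environ.get("DB_HOST", "localhost"),
            "PORT": os.environ.get("DB_PORT", "5432"),
        }
    return databases


DATABASES = build_databases(TENANT_DB_NAMES)
```

Even with a helper like this, a new customer still means creating a database and redeploying the settings, which is the maintenance cost the warnings describe.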
  13. 13. One database per tenant 2. Handling migrations: run python manage.py migrate --database=tenant_id1 for each tenant, every time you have a new migration
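Running the same command once per tenant is usually scripted. The sketch below builds one `migrate` command line per database alias and can run them in sequence; the alias list and the helper names are hypothetical, and it assumes the script is started from the directory that contains `manage.py`.

```python
import subprocess
import sys


def migrate_commands(aliases):
    """Build one `manage.py migrate` command line per database alias."""
    return [
        [sys.executable, "manage.py", "migrate", f"--database={alias}"]
        for alias in aliases
    ]


def migrate_all(aliases):
    """Run the migrations one tenant at a time, stopping at the first failure."""
    for command in migrate_commands(aliases):
        subprocess.run(command, check=True)


# Hypothetical aliases; in a real project they would come from settings.DATABASES.
TENANT_ALIASES = ["tenant_hogwarts", "tenant_ministry"]

# Print the commands instead of running them, so the sketch has no side effects.
for command in migrate_commands(TENANT_ALIASES):
    print(" ".join(command))
```

Because the tenants are migrated one after another, total migration time grows with the number of tenants, which is one of the drawbacks listed for this architecture.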
  14. 14. One database per tenant 3. Creating your own database router (changes needed to handle it with the Django ORM)

        class ExampleDatabaseRouter:
            """Determines on which tenant database to read/write."""

            def db_for_read(self, model, **hints):
                """Returns the name of the right database depending on the query."""
                return 'tenant_idx'

            def db_for_write(self, model, **hints):
                """Returns the name of the right database depending on the query."""
                return 'tenant_idx'

            def allow_relation(self, obj1, obj2, **hints):
                """Determine if a relationship is allowed between two objects.

                The two objects have to be on the same database ;)
                """
                pass
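One way the router skeleton could be filled in is sketched below. It assumes the alias of the current tenant's database is stored in a context variable by some earlier step (for example a middleware); `set_tenant_db`, `TenantDatabaseRouter` and the aliases are hypothetical names, and the `SimpleNamespace` objects only stand in for model instances so the behaviour can be seen without a database.

```python
import contextvars
from types import SimpleNamespace

# Hypothetical holder for the alias of the tenant database serving the current request.
_current_tenant_db = contextvars.ContextVar("current_tenant_db", default=None)


def set_tenant_db(alias):
    """Remember which tenant database the current request should use."""
    _current_tenant_db.set(alias)


class TenantDatabaseRouter:
    """Route every read and write to the current tenant's database."""

    def db_for_read(self, model, **hints):
        # Returning None tells Django this router has no opinion.
        return _current_tenant_db.get()

    def db_for_write(self, model, **hints):
        return _current_tenant_db.get()

    def allow_relation(self, obj1, obj2, **hints):
        # Django records the database an instance was loaded from in `_state.db`.
        return obj1._state.db == obj2._state.db


# Stand-ins for model instances, only to show the behaviour without a database.
set_tenant_db("tenant_hogwarts")
router = TenantDatabaseRouter()
hedwig = SimpleNamespace(_state=SimpleNamespace(db="tenant_hogwarts"))
errol = SimpleNamespace(_state=SimpleNamespace(db="tenant_ministry"))
print(router.db_for_read(model=None))
print(router.allow_relation(hedwig, errol))
```

In a Django project a router like this would be listed in the `DATABASE_ROUTERS` setting.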
  15. 15. One database per tenant
        PROS
        - Start quickly
        - Isolate customer (tenant) data
        - Compliance is a bit easier
        - If one customer is queried a lot, performance degradation for other tenants will be low
        CONS
        - Time for DBA/developer to manage
        - Hard to handle with ORMs
        - Hard to maintain consistency (e.g. creating an index across all databases)
        - Longer running migrations
        - Performance degrades as # customers (tenants) goes up
  16. 16. Solution 2: One schema per tenant
  17. 17. One schema per tenant - Logical namespaces to hold a set of tables - Share resources: username and password, connections, memory
  18. 18. One schema per tenant
  19. 19. One schema per tenant
        PROS
        - Better resource utilization vs. one database per tenant
        - Start quickly
        - Logical isolation
        CONS
        - Hard to manage (e.g. adding a column across all schemas)
        - Longer running migrations
        - Performance degrades as # customers (tenants) goes up
  20. 20. Solution 3: Shared tables architecture
  21. 21. Shared tables architecture
  22. 22. Shared tables architecture
  23. 23. Shared tables architecture
        PROS
        - Easy maintenance
        - Faster running migrations
        - Best resource utilization
        - Faster performance
        - Scales to 1k-100k tenants
        CONS
        - Application code to guarantee isolation
        - Make sure ORM calls are always scoped to a single tenant
  24. 24. 3 ways to scale multi-tenant apps
  25. 25. Shared tables in your Django apps
  26. 26. Main problems to solve - Make sure ORM calls are always scoped to a single tenant - Include the tenant column in joins
  27. 27. django-multitenant Automates all ORM calls to be scoped to a single tenant
  28. 28. django-multitenant
        Without tenant scoping:
        Owl.objects.filter(name='Hedwige') <=> SELECT * FROM app_owl WHERE name='Hedwige'
        With django-multitenant:
        Owl.objects.filter(name='Hedwige') <=> SELECT * FROM app_owl WHERE name='Hedwige' AND owner_id = <tenant_id>
  29. 29. django-multitenant
        Without tenant scoping:
        Letter.objects.filter(id=1).select_related('deliverer') <=> SELECT * FROM app_letter INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id) WHERE app_letter.id=1
        With django-multitenant:
        Letter.objects.filter(id=1).select_related('deliverer') <=> SELECT * FROM app_letter INNER JOIN app_owl ON (app_owl.id=app_letter.deliverer_id AND app_owl.owner_id=app_letter.owner_id) WHERE app_letter.id=1 AND app_owl.owner_id = <tenant_id>
  30. 30. django-multitenant 3 steps
        1. Change models to use TenantModelMixin and TenantManagerMixin
        2. Change ForeignKey to TenantForeignKey
        3. Define tenant scoping: set_current_tenant(t)
  31. 31. django-multitenant 3 steps Models before using django-multitenant

        from django.db import models

        class Owner(models.Model):
            type = models.CharField(max_length=10)  # add choice
            name = models.CharField(max_length=255)

        class Owl(models.Model):
            name = models.CharField(max_length=255)
            # on_delete is required since Django 2.0; CASCADE is an assumption
            owner = models.ForeignKey(Owner, on_delete=models.CASCADE)
            feather_color = models.CharField(max_length=255)
            favorite_food = models.CharField(max_length=255)

        class Letter(models.Model):
            content = models.TextField()
            deliverer = models.ForeignKey(Owl, on_delete=models.CASCADE)
  32. 32. django-multitenant 3 steps Models with django-multitenant

        from django.db import models
        from django_multitenant.fields import TenantForeignKey
        from django_multitenant.mixins import TenantManagerMixin, TenantModelMixin

        class TenantManager(TenantManagerMixin, models.Manager):
            pass

        class Owner(TenantModelMixin, models.Model):
            type = models.CharField(max_length=10)  # add choice
            name = models.CharField(max_length=255)
            tenant_id = 'id'
            objects = TenantManager()

        class Owl(TenantModelMixin, models.Model):
            name = models.CharField(max_length=255)
            # on_delete is required since Django 2.0; CASCADE is an assumption
            owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
            feather_color = models.CharField(max_length=255)
            favorite_food = models.CharField(max_length=255)
            tenant_id = 'owner_id'
            objects = TenantManager()

        class Letter(TenantModelMixin, models.Model):
            content = models.TextField()
            # TenantForeignKey (step 2) so that joins include the tenant column
            deliverer = TenantForeignKey(Owl, on_delete=models.CASCADE)
            owner = TenantForeignKey(Owner, on_delete=models.CASCADE)
            tenant_id = 'owner_id'
            objects = TenantManager()
  33. 33. django-multitenant 3 steps set_current_tenant(t) - Specifies which tenant the APIs should be scoped to - Set at authentication logic via middleware - Set explicitly at top of function (e.g. view, external tasks/jobs)
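For code that runs outside a request, such as background jobs, setting the tenant explicitly can be wrapped in a context manager so the previous value is always restored. The sketch below illustrates the idea with a stand-in tenant store built on `contextvars`; it is not django-multitenant's implementation, and `tenant_scope` and `send_daily_report` are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# Stand-in for the library's tenant storage; NOT django-multitenant's actual code.
_current_tenant = contextvars.ContextVar("current_tenant", default=None)


def get_current_tenant():
    """Return the tenant the current block of code is scoped to, if any."""
    return _current_tenant.get()


@contextmanager
def tenant_scope(tenant):
    """Scope the enclosed block to `tenant`, then restore the previous value."""
    token = _current_tenant.set(tenant)
    try:
        yield tenant
    finally:
        _current_tenant.reset(token)


def send_daily_report():
    """Hypothetical background job that must only see one tenant's data."""
    return f"report for {get_current_tenant()}"


with tenant_scope("Hogwarts"):
    report = send_daily_report()
```

With the real library, the body of such a context manager would presumably call `set_current_tenant(t)` on entry, as the slide describes.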
  34. 34. django-multitenant 3 steps set_current_tenant(t) in Middleware

        from django_multitenant.utils import set_current_tenant

        class TenantMiddleware:
            def __init__(self, get_response):
                self.get_response = get_response
                # One-time configuration and initialization.

            def __call__(self, request):
                # Assuming your app has a function to get the tenant associated with a user
                current_tenant = get_tenant_for_user(request.user)
                set_current_tenant(current_tenant)
                response = self.get_response(request)
                return response
  35. 35. django-multitenant Benefits of django-multitenant - Drop-in implementation of shared tables architecture - Guarantees isolation - Ready to scale with distributed Postgres (Citus)
  36. 36. Postgres, Citus and shared tables
  37. 37. Why Postgres - Open source - Constraints - Rich SQL support - Extensions (PostGIS / Geospatial, HLL, TopN, Citus) - Foreign data wrappers - Fun indexes (GIN, GiST, BRIN…) - CTEs - Window functions - Full text search - Datatypes (JSONB)
  38. 38. Why Citus - Citus is an open source extension for PostgreSQL - Implements a distributed architecture for Postgres - Allows you to scale out CPU, memory, etc. - Compatible with modern Postgres (up to 11)
  39. 39. Distributed Postgres with Citus
  40. 40. Distributed Postgres with Citus Foreign key colocation
  41. 41. Distributed Postgres with Citus Foreign key colocation - Full SQL support for queries on a single set of co-located shards - Multi-statement transaction support for modifications on a single set of co-located shards - Foreign keys - …
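Co-location follows from distributing every table on the tenant column. As a rough sketch, the statements below use Citus's `create_distributed_table` function with the distribution columns taken from each model's `tenant_id`; the `app_owner` table name is an assumption (the other two table names appear in the SQL on the earlier slides), and the helper only builds the SQL strings rather than executing them.

```python
# Table name -> distribution column, following the tenant_id of each model.
# 'app_owner' is an assumed table name; the referenced table is listed first.
DISTRIBUTED_TABLES = {
    "app_owner": "id",
    "app_owl": "owner_id",
    "app_letter": "owner_id",
}


def distribution_statements(tables):
    """Build one Citus create_distributed_table call per table."""
    return [
        f"SELECT create_distributed_table('{table}', '{column}');"
        for table, column in tables.items()
    ]


for statement in distribution_statements(DISTRIBUTED_TABLES):
    print(statement)
```

Citus expects primary keys and foreign keys on distributed tables to include the distribution column, so the constraints generated by Django would likely need adjusting before these statements succeed.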
  42. 42. Distributed Postgres with Citus Scope your queries!
  43. 43. Why Citus
  44. 44. Scale out Django! github.com/citusdata/django-multitenant louise@citusdata.com citusdata.com/newsletter @louisemeta @citusdata
