Scaling Dubsmash's backend
from 0 to 100+ million users
PYCON.DE Munich - Daniel Taschik – 10/29/2016
We hit a nerve.
>100M
Users
192
Countries
1.5B
Videos
Dubsmash
Connect Create Communicate
The Start
Backend
• Django-powered BE for content
management
• web-based Dubloader to add sounds
• deployed on Heroku
Content Delivery
• sound files in S3
• meta information in JSON file in S3
• files served via Cloudfront CDN
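
A minimal sketch of the "meta information in JSON file" idea, assuming a flat catalog that the app downloads through the CDN. The field names, the `.m4a` extension, the URL layout and `cdn.example.com` are illustrative assumptions, not the actual Dubsmash format.

```python
import json


def build_catalog(sounds, cdn_base_url):
    """Serialize sound metadata into one JSON document for clients to fetch."""
    entries = [
        {
            "id": s["id"],
            "title": s["title"],
            # Clients fetch audio through the CDN rather than from S3 directly.
            # The path layout and file extension are assumptions.
            "url": f"{cdn_base_url}/sounds/{s['id']}.m4a",
        }
        for s in sounds
    ]
    return json.dumps({"version": 1, "sounds": entries}, sort_keys=True)


catalog = build_catalog(
    [{"id": "Dzdcjc", "title": "Example sound"}],
    "https://cdn.example.com",
)
```

With a static file like this, the backend only handles uploads; read traffic goes to the CDN, which fits the low Dubloader request rate next to the very high traffic volume.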
Metrics
• Dubloader with < 100 req/min
• >500 TB! of traffic in January 2015
Dubsmash Service Landscape
Backend
Router
S3 sound storage
Cloudfront CDN
New Features: Registration & Search
User registration
• API based on REST framework
• Django user model
• store user’s most liked sounds
• push notifications for new content
Server-side Sound search
• new Django-based service
• search via ElasticSearch using Haystack
• Celery-based indexing on RQ
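
A rough sketch of queue-based indexing, using only the standard library. The in-memory queue stands in for the task broker and the dictionary stands in for the search engine; none of this is the Haystack, ElasticSearch or Celery API, and all function names are hypothetical.

```python
import queue
from collections import defaultdict

index_queue = queue.Queue()        # stand-in for the task broker
inverted_index = defaultdict(set)  # stand-in for the search engine


def enqueue_index(sound_id, title):
    """API side: record that a sound needs indexing and return immediately."""
    index_queue.put((sound_id, title))


def run_indexer():
    """Worker side: drain the queue and add each title's tokens to the index."""
    while not index_queue.empty():
        sound_id, title = index_queue.get()
        for token in title.lower().split():
            inverted_index[token].add(sound_id)


def search(query):
    """Return the ids of sounds whose title contains every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    matches = [inverted_index.get(t, set()) for t in tokens]
    return set.intersection(*matches)


enqueue_index("Dzdcjc", "Happy Birthday Song")
enqueue_index("3jGYzH", "Birthday Cake")
run_indexer()
```

The point of the split is that adding a sound never waits for the index update; the search service only sees the new sound once the worker has processed it.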
Metrics
• 100,000 registrations within first 24h
• >20,000 requests per minute on search service
caching
Cloudfront CDN
S3 sound storage
Dubsmash Service Landscape
Search
Router
main API
DubTalk
Social Graph Service
• friend relations on platform
• Django
• TitanDB on Cassandra
• later DynamoDB
DubTalk Service
• group & video management
• Django
Service Communication
• Async via Celery on RabbitMQ
• Sync via internal HTTPS API
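
The difference between the two communication styles can be sketched as follows. A plain function call stands in for the internal HTTPS request and a `queue.Queue` with one worker thread stands in for Celery on RabbitMQ; the function names and the toy data are assumptions for illustration.

```python
import queue
import threading

friends = {"daniel3": {"sarah"}}  # toy data owned by the graph service
notifications = []                # side effects produced by the worker
task_queue = queue.Queue()        # stand-in for the message broker


def graph_are_friends(a, b):
    """Sync path: the caller blocks until it has the answer."""
    return b in friends.get(a, set())


def worker():
    """Async path: consume tasks until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            break
        sender, receiver = task
        notifications.append(f"{receiver}: new video from {sender}")


def send_video(sender, receiver):
    # The friendship check decides the response, so it is done synchronously.
    if not graph_are_friends(sender, receiver):
        return False
    # The notification can happen later, so it is queued and we return.
    task_queue.put((sender, receiver))
    return True


t = threading.Thread(target=worker)
t.start()
accepted = send_video("daniel3", "sarah")
rejected = send_video("daniel3", "unknown_user")
task_queue.put(None)  # tell the worker to stop
t.join()
```

A reasonable rule of thumb: use the sync path when the caller needs the result to build its response, and the async path for work that may lag behind without the user noticing.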
Metrics
• > 50,000 requests per min on both
• > 150,000,000 videos stored
NoSQL
Cloudfront CDN
caching
S3 sound storage
Dubsmash Service Landscape
Graph
DubTalk
Router
Monolith
relational DB
Large Scale Problems
favorited sounds outgrew our PostgreSQL
• > 1,000,000,000 favorited sounds
• simple data model & access pattern
• Premium-7 120GB RAM, 1TB disk instance
dtaschik@unic0rn:~/dubsmash$ heroku pg:table-size -a dubsmash
name                               | size
-----------------------------------+------------
users_favs                         | 158 GB
dtaschik@unic0rn:~/dubsmash$ heroku pg:index-size -a dubsmash
name                               | size
-----------------------------------+------------
users_favs_username_key            | 132 GB

ID | username | sound_id
1  | daniel3  | Dzdcjc
2  | sarah    | 3jGYzH
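
A back-of-the-envelope check of these numbers, plus a toy version of the "simple data model & access pattern". It assumes 1 GB = 10^9 bytes and exactly one billion rows (the slide only gives a lower bound), so the per-row figures are rough upper estimates. `FavStore` is a hypothetical in-memory stand-in for a key-value store, not the service that was actually built.

```python
from collections import defaultdict

TABLE_BYTES = 158 * 10**9  # users_favs table size
INDEX_BYTES = 132 * 10**9  # users_favs_username_key index size
ROWS = 1_000_000_000       # more than 1 billion favorited sounds (lower bound)

bytes_per_row = TABLE_BYTES / ROWS
index_bytes_per_row = INDEX_BYTES / ROWS
index_to_table_ratio = INDEX_BYTES / TABLE_BYTES


class FavStore:
    """Toy key-value layout: one set of sound ids per username."""

    def __init__(self):
        self._favs = defaultdict(set)

    def add(self, username, sound_id):
        self._favs[username].add(sound_id)

    def remove(self, username, sound_id):
        self._favs[username].discard(sound_id)

    def list_for(self, username):
        # The only read pattern: all favorites of one user.
        return sorted(self._favs.get(username, ()))


store = FavStore()
store.add("daniel3", "Dzdcjc")
store.add("sarah", "3jGYzH")
```

Under these assumptions the index alone is about 84% of the table size. Because every read and write is keyed by a single user, the data needs no joins, which is what makes it a good candidate to move out of the relational database.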
Let’s make it a new service!
Cloudfront CDN
S3 sound storage
Dubsmash Service Landscape
Auth
Graph DubTalk
Favs
Router
Monolith
relational DB
caching
NoSQL
many more
Our Goal
Interested? Come and join! 😜
Building the largest mobile video communication
platform!
Questions?
daniel@dubsmash.com | daniel3 | @dtaschik
Let’s say it with video!
Thank you!


Editor's Notes

  1. Dubsmash is a product with global scale: 100M users in 192 countries have created over 1.5B videos in total. The most famous users include Jimmy Fallon, Hugh Jackman, Arnold Schwarzenegger, Jennifer Lopez and, recently, Emma Dickson (?).
  2. Connect, create, communicate with family and friends. It started, however, as a very simple, three-screen prototype of a creation tool.
  3. The cave: building Dubsmash out of a 50 sqm basement (souterrain) office in Berlin.
  4. early 2015
  5. started mid 2015
  6. mid 2016