SlideShare a Scribd company logo
1 of 34
Distributed Systems
Twitter Timeline
and Search
Presented By
Md. Rakib Trofder
BSSE-1129
Nazmus Sakib
Ahmed
BSSE-1108
Md. Siam
BSSE-1104
Use cases
01
Tweets
Use cases
Followers
Notifications
Emails
Timeline
Home
Timeline
Searching
High
Availability
Use cases
Constraints and
assumptions
02
State assumptions
01
100 million active users
02
500 million tweets per day
03
250 billion read
requests per month
04
10 billion searches per
month
State assumptions
10 deliveries fanout
for a tweet
5 billion tweets delivered
on fanout per day
150 billion tweets
delivered on fanout
per month
500 million
tweets per day
15 billion tweets
per month
State assumptions
Fast Optimized Heavy
Viewing the timeline Reads of tweets Reads and Writes
Timeline & Searching
Calculate usage
03
Calculate usage
250 billion read requests per month * (400 requests per second /
1 billion requests per month)
= 100 thousand read requests per second
15 billion tweets per month * (400 requests per second / 1 billion
requests per month) = 6,000 tweets per second
Calculate usage
150 billion tweets delivered on fanout per month * (400 requests
per second / 1 billion requests per month)
= 60 thousand tweets delivered on fanout per second
10 billion searches per month * (400 requests per second / 1
billion requests per month)= 4,000 search requests per second
Calculate usage
500 million tweets per day * 10 KB per tweet * 30 days per month
= 150 TB of new tweet content per month
150 TB of new tweet content per month * 12 months per year * 3 years
= 5.4 PB of new tweet content in 3 years
Design core
components
04
Database Selection (Maintaining Own
Timeline)
Indexing
Super Fast Index
Lookups
Faster Search
Inserting of Tweet to the
User’s Profile and Viewing
Own Timeline
Strict Relation
Relational
Database
Database Selection (Delivering tweets and
building the home timeline )
Overload
Writing the Tweets to
Everyone’s Profile who
Follows
Fast Write
Is the Only Solution
Caching
Non Relational
Database
Database Selection (Object Storing )
Object Database
Storing the Media (images,
videos) files
Services
User Info
Service
Tweet Info
Service
Timeline
Service
User Graph
Service
Search Service
Notification
Service Fan Out Service
Use case: User posts a tweet
● The Client posts a tweet to the Web Server, running as a reverse proxy
● The Web Server forwards the request to the Write API server
● The Write API stores the tweet in the user's timeline on a SQL database
Use case: User posts a tweet (Cont)
● The Write API contacts the Fan Out Service, which does the following:
○ Queries the User Graph Service to find the user's followers stored in the Memory
Cache
○ Stores the tweet in the home timeline of the user's followers in a Memory Cache
■ O(n) operation: 1,000 followers = 1,000 lookups and inserts
○ Stores the tweet in the Search Index Service to enable fast searching
○ Stores media in the Object Store
○ Uses the Notification Service to send out push notifications to followers:
■ Uses a Queue (not pictured) to asynchronously send out notifications
Use case: User Views the Home Timeline
● The Client posts a home timeline request to the Web Server
● The Web Server forwards the request to the Read API server
● The Read API server contacts the Timeline Service, which does the following:
○ Gets the timeline data stored in the Memory Cache, containing tweet ids and user ids -
O(1)
○ Queries the Tweet Info Service with a multiget to obtain additional info about the tweet
ids - O(n)
○ Queries the User Info Service with a multiget to obtain additional info about the user
ids - O(n)
Use case: User views the user timeline
● The Client posts a user timeline request to the Web Server
● The Web Server forwards the request to the Read API server
● The Read API retrieves the user timeline from the SQL Database
Use case: User searches keywords
● The Client sends a search request to the Web Server
● The Web Server forwards the request to the Search API server
● The Search API contacts the Search Service, which does the following:
○ Parses/tokenizes the input query, determining what needs to be searched
■ Removes markup
■ Breaks up the text into terms
■ Fixes typos
■ Normalizes capitalization
■ Converts the query to use boolean operation
● Queries the Search Cluster (ie Lucene) for the results:
○ Scatter gathers each server in the cluster to determine if there are any results for the
query
○ Merges, ranks, sorts, and returns the result
High Level Design
Scale the design
05
Testing
Benchmark Find Bottlenecks
CDN
Load Balancer
Scale the design
DNS
Replicas Master Slave Write
Scale the design
Points of Bottlenecks
SQL Write Master-
Slave
A single SQL write master-
slave might be
overwhelmed
Fan out service
In case of users with
millions of followers
Points of Optimization
(Fan Out Service)
● Avoid fanout of users with millions of followers
● Search to find tweets for highly-followed users
● Merge the results with user’s home tweets
● Re-order and serve them
Points of Optimization
(SQL Write Master-Slave)
● Federation (Functional Partitioning)
● Sharding
● Denormalization
● SQL Tuning
Points of Optimization
(Additional)
● Keep only several hundred tweets for each home timeline in the Memory Cache
● Keep only active users' home timeline info in the Memory Cache
● If a user was not previously active in the past 30 days, we could rebuild the timeline
from the SQL Database
● Query the User Graph Service to determine who the user is following
● Get the tweets from the SQL Database and add them to the Memory Cache
● Store only a month of tweets in the Tweet Info Service
● Store only active users in the User Info Service
● The Search Cluster would likely need to keep the tweets in memory to keep latency
low
Scaled
Design
CREDITS: Diese Präsentationsvorlage wurde von
Slidesgo erstellt, inklusive Icons von Flaticon und
Infografiken & Bilder von Freepik
Thanks!
Any Questions?

More Related Content

Similar to Twitter Timeline and Search Distributed System.pptx

Deploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePointDeploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePointDenny Lee
 
CSE5656 Complex Networks - Gathering Data from Twitter
CSE5656 Complex Networks - Gathering Data from TwitterCSE5656 Complex Networks - Gathering Data from Twitter
CSE5656 Complex Networks - Gathering Data from TwitterMarcello Tomasini
 
Twitter API, Streaming and SharePoint 2013
Twitter API, Streaming and SharePoint 2013Twitter API, Streaming and SharePoint 2013
Twitter API, Streaming and SharePoint 2013Sebastian Huppmann
 
SPSBE building an faq for end users
SPSBE building an faq for end usersSPSBE building an faq for end users
SPSBE building an faq for end usersPaul Hunt
 
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02Spsbe buildinganfaqforendusers-150422122027-conversion-gate02
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02BIWUG
 
O365Con18 - Hybrid SharePoint Deep Dive - Thomas Vochten
O365Con18 - Hybrid SharePoint Deep Dive - Thomas VochtenO365Con18 - Hybrid SharePoint Deep Dive - Thomas Vochten
O365Con18 - Hybrid SharePoint Deep Dive - Thomas VochtenNCCOMMS
 
Teams Provisioning with Power Automate and the Microsoft Graph
Teams Provisioning with Power Automate and the Microsoft GraphTeams Provisioning with Power Automate and the Microsoft Graph
Teams Provisioning with Power Automate and the Microsoft Graphfastlane66
 
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers Authoritas
 
Angular - Chapter 7 - HTTP Services
Angular - Chapter 7 - HTTP ServicesAngular - Chapter 7 - HTTP Services
Angular - Chapter 7 - HTTP ServicesWebStackAcademy
 
Dan Cohen, Hands On Seo from Internet World 2009
Dan Cohen, Hands On Seo from Internet World 2009Dan Cohen, Hands On Seo from Internet World 2009
Dan Cohen, Hands On Seo from Internet World 2009Dan Cohen
 
Automating your tasks with microsoft flow
Automating your tasks with microsoft flowAutomating your tasks with microsoft flow
Automating your tasks with microsoft flowDipti Chhatrapati
 
Rest webservice ppt
Rest webservice pptRest webservice ppt
Rest webservice pptsinhatanay
 
Social Architecture of SharePoint 2013 for Developers
Social Architecture of SharePoint 2013 for DevelopersSocial Architecture of SharePoint 2013 for Developers
Social Architecture of SharePoint 2013 for DevelopersPaul J. Swider
 
Web of Short URL’s
Web of Short URL’sWeb of Short URL’s
Web of Short URL’sIRJET Journal
 
Avninfosoft Seo strategy for 2014
Avninfosoft Seo strategy  for 2014Avninfosoft Seo strategy  for 2014
Avninfosoft Seo strategy for 2014Kaite Willson
 
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAH
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAHCARA MEMBUAT REFERENSI DAN SITASI PADA NASKAH
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAHRelawan Jurnal Indonesia
 

Similar to Twitter Timeline and Search Distributed System.pptx (20)

Deploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePointDeploying and Managing PowerPivot for SharePoint
Deploying and Managing PowerPivot for SharePoint
 
CSE5656 Complex Networks - Gathering Data from Twitter
CSE5656 Complex Networks - Gathering Data from TwitterCSE5656 Complex Networks - Gathering Data from Twitter
CSE5656 Complex Networks - Gathering Data from Twitter
 
Twitter API, Streaming and SharePoint 2013
Twitter API, Streaming and SharePoint 2013Twitter API, Streaming and SharePoint 2013
Twitter API, Streaming and SharePoint 2013
 
SPSBE building an faq for end users
SPSBE building an faq for end usersSPSBE building an faq for end users
SPSBE building an faq for end users
 
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02Spsbe buildinganfaqforendusers-150422122027-conversion-gate02
Spsbe buildinganfaqforendusers-150422122027-conversion-gate02
 
O365Con18 - Hybrid SharePoint Deep Dive - Thomas Vochten
O365Con18 - Hybrid SharePoint Deep Dive - Thomas VochtenO365Con18 - Hybrid SharePoint Deep Dive - Thomas Vochten
O365Con18 - Hybrid SharePoint Deep Dive - Thomas Vochten
 
Teams Provisioning with Power Automate and the Microsoft Graph
Teams Provisioning with Power Automate and the Microsoft GraphTeams Provisioning with Power Automate and the Microsoft Graph
Teams Provisioning with Power Automate and the Microsoft Graph
 
Jinchao demo v7
Jinchao demo v7Jinchao demo v7
Jinchao demo v7
 
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers
Tom Bennet – BrightonSEO April 2016: Site Speed for content Marketers
 
Updates from Microsoft Ignite
Updates from Microsoft Ignite Updates from Microsoft Ignite
Updates from Microsoft Ignite
 
Angular - Chapter 7 - HTTP Services
Angular - Chapter 7 - HTTP ServicesAngular - Chapter 7 - HTTP Services
Angular - Chapter 7 - HTTP Services
 
Dan Cohen, Hands On Seo from Internet World 2009
Dan Cohen, Hands On Seo from Internet World 2009Dan Cohen, Hands On Seo from Internet World 2009
Dan Cohen, Hands On Seo from Internet World 2009
 
Twitter System Design
Twitter System DesignTwitter System Design
Twitter System Design
 
Automating your tasks with microsoft flow
Automating your tasks with microsoft flowAutomating your tasks with microsoft flow
Automating your tasks with microsoft flow
 
Rest webservice ppt
Rest webservice pptRest webservice ppt
Rest webservice ppt
 
Social Architecture of SharePoint 2013 for Developers
Social Architecture of SharePoint 2013 for DevelopersSocial Architecture of SharePoint 2013 for Developers
Social Architecture of SharePoint 2013 for Developers
 
Twet
TwetTwet
Twet
 
Web of Short URL’s
Web of Short URL’sWeb of Short URL’s
Web of Short URL’s
 
Avninfosoft Seo strategy for 2014
Avninfosoft Seo strategy  for 2014Avninfosoft Seo strategy  for 2014
Avninfosoft Seo strategy for 2014
 
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAH
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAHCARA MEMBUAT REFERENSI DAN SITASI PADA NASKAH
CARA MEMBUAT REFERENSI DAN SITASI PADA NASKAH
 

More from Md. Rakib Trofder

Software System Reconstruction using large language model
Software System Reconstruction using large language modelSoftware System Reconstruction using large language model
Software System Reconstruction using large language modelMd. Rakib Trofder
 
Daily Scrum, Sprint Review & Retrospective.pptx
Daily Scrum, Sprint Review & Retrospective.pptxDaily Scrum, Sprint Review & Retrospective.pptx
Daily Scrum, Sprint Review & Retrospective.pptxMd. Rakib Trofder
 
Scrum & Sprint Planning.pptx
Scrum & Sprint Planning.pptxScrum & Sprint Planning.pptx
Scrum & Sprint Planning.pptxMd. Rakib Trofder
 
Agricultural Business with Technology
Agricultural Business with TechnologyAgricultural Business with Technology
Agricultural Business with TechnologyMd. Rakib Trofder
 
Artificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptxArtificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptxMd. Rakib Trofder
 
Mechanism behind BlogBee Application
Mechanism behind BlogBee ApplicationMechanism behind BlogBee Application
Mechanism behind BlogBee ApplicationMd. Rakib Trofder
 
Massive Open Online Courses (MOOC)
Massive Open Online Courses (MOOC)Massive Open Online Courses (MOOC)
Massive Open Online Courses (MOOC)Md. Rakib Trofder
 
BlogBee A Blog Based Social Media.pptx
BlogBee A Blog Based Social Media.pptxBlogBee A Blog Based Social Media.pptx
BlogBee A Blog Based Social Media.pptxMd. Rakib Trofder
 
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATION
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATIONINTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATION
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATIONMd. Rakib Trofder
 
Web Technology Tag Presentation.pptx
Web Technology Tag Presentation.pptxWeb Technology Tag Presentation.pptx
Web Technology Tag Presentation.pptxMd. Rakib Trofder
 
Video to text blog (blog bee)
Video to text blog (blog bee)Video to text blog (blog bee)
Video to text blog (blog bee)Md. Rakib Trofder
 
Http status code 416 vs 428, 503 vs 505
Http status code  416 vs 428, 503 vs 505Http status code  416 vs 428, 503 vs 505
Http status code 416 vs 428, 503 vs 505Md. Rakib Trofder
 
Economy politics and city life
Economy politics and city lifeEconomy politics and city life
Economy politics and city lifeMd. Rakib Trofder
 

More from Md. Rakib Trofder (20)

Software System Reconstruction using large language model
Software System Reconstruction using large language modelSoftware System Reconstruction using large language model
Software System Reconstruction using large language model
 
Daily Scrum, Sprint Review & Retrospective.pptx
Daily Scrum, Sprint Review & Retrospective.pptxDaily Scrum, Sprint Review & Retrospective.pptx
Daily Scrum, Sprint Review & Retrospective.pptx
 
Scrum & Sprint Planning.pptx
Scrum & Sprint Planning.pptxScrum & Sprint Planning.pptx
Scrum & Sprint Planning.pptx
 
Agricultural Business with Technology
Agricultural Business with TechnologyAgricultural Business with Technology
Agricultural Business with Technology
 
HTTP Caching.pptx
HTTP Caching.pptxHTTP Caching.pptx
HTTP Caching.pptx
 
Artificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptxArtificial Intelligence in Gaming.pptx
Artificial Intelligence in Gaming.pptx
 
Mechanism behind BlogBee Application
Mechanism behind BlogBee ApplicationMechanism behind BlogBee Application
Mechanism behind BlogBee Application
 
Massive Open Online Courses (MOOC)
Massive Open Online Courses (MOOC)Massive Open Online Courses (MOOC)
Massive Open Online Courses (MOOC)
 
Design Pattern.pptx
Design Pattern.pptxDesign Pattern.pptx
Design Pattern.pptx
 
BlogBee A Blog Based Social Media.pptx
BlogBee A Blog Based Social Media.pptxBlogBee A Blog Based Social Media.pptx
BlogBee A Blog Based Social Media.pptx
 
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATION
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATIONINTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATION
INTER-SYSTEMS EARNS ISO 9001-2008 CERTIFICATION
 
Web Technology Tag Presentation.pptx
Web Technology Tag Presentation.pptxWeb Technology Tag Presentation.pptx
Web Technology Tag Presentation.pptx
 
Video to text blog (blog bee)
Video to text blog (blog bee)Video to text blog (blog bee)
Video to text blog (blog bee)
 
Web tech tag explanation
Web tech tag explanationWeb tech tag explanation
Web tech tag explanation
 
Library assistant tool
Library assistant toolLibrary assistant tool
Library assistant tool
 
Http status code 416 vs 428, 503 vs 505
Http status code  416 vs 428, 503 vs 505Http status code  416 vs 428, 503 vs 505
Http status code 416 vs 428, 503 vs 505
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Page rank algorithm
Page rank algorithmPage rank algorithm
Page rank algorithm
 
Analytic hierarchy process
Analytic hierarchy processAnalytic hierarchy process
Analytic hierarchy process
 
Economy politics and city life
Economy politics and city lifeEconomy politics and city life
Economy politics and city life
 

Recently uploaded

Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerunnathinaik
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 

Recently uploaded (20)

Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 

Twitter Timeline and Search Distributed System.pptx

  • 2. Presented By Md. Rakib Trofder BSSE-1129 Nazmus Sakib Ahmed BSSE-1108 Md. Siam BSSE-1104
  • 7. State assumptions 01 100 million active users 02 500 million tweets per day 03 250 billion read requests per month 04 10 billion searches per month
  • 8. State assumptions 10 deliveries fanout for a tweet 5 billion tweets delivered on fanout per day 150 billion tweets delivered on fanout per month 500 million tweets per day 15 billion tweets per month
  • 9. State assumptions Fast Optimized Heavy Viewing the timeline Reads of tweets Reads and Writes Timeline & Searching
  • 11. Calculate usage 250 billion read requests per month * (400 requests per second / 1 billion requests per month) = 100 thousand read requests per second 15 billion tweets per month * (400 requests per second / 1 billion requests per month) = 6,000 tweets per second
  • 12. Calculate usage 150 billion tweets delivered on fanout per month * (400 requests per second / 1 billion requests per month) = 60 thousand tweets delivered on fanout per second 10 billion searches per month * (400 requests per second / 1 billion requests per month)= 4,000 search requests per second
  • 13. Calculate usage 500 million tweets per day * 10 KB per tweet * 30 days per month = 150 TB of new tweet content per month 150 TB of new tweet content per month * 12 months per year * 3 years = 5.4 PB of new tweet content in 3 years
  • 15. Database Selection (Maintaining Own Timeline) Indexing Super Fast Index Lookups Faster Search Inserting of Tweet to the User’s Profile and Viewing Own Timeline Strict Relation Relational Database
  • 16. Database Selection (Delivering tweets and building the home timeline ) Overload Writing the Tweets to Everyone’s Profile who Follows Fast Write Is the Only Solution Caching Non Relational Database
  • 17. Database Selection (Object Storing ) Object Database Storing the Media (images, videos) files
  • 18. Services User Info Service Tweet Info Service Timeline Service User Graph Service Search Service Notification Service Fan Out Service
  • 19. Use case: User posts a tweet ● The Client posts a tweet to the Web Server, running as a reverse proxy ● The Web Server forwards the request to the Write API server ● The Write API stores the tweet in the user's timeline on a SQL database
  • 20. Use case: User posts a tweet (Cont) ● The Write API contacts the Fan Out Service, which does the following: ○ Queries the User Graph Service to find the user's followers stored in the Memory Cache ○ Stores the tweet in the home timeline of the user's followers in a Memory Cache ■ O(n) operation: 1,000 followers = 1,000 lookups and inserts ○ Stores the tweet in the Search Index Service to enable fast searching ○ Stores media in the Object Store ○ Uses the Notification Service to send out push notifications to followers: ■ Uses a Queue (not pictured) to asynchronously send out notifications
  • 21. Use case: User Views the Home Timeline ● The Client posts a home timeline request to the Web Server ● The Web Server forwards the request to the Read API server ● The Read API server contacts the Timeline Service, which does the following: ○ Gets the timeline data stored in the Memory Cache, containing tweet ids and user ids - O(1) ○ Queries the Tweet Info Service with a multiget to obtain additional info about the tweet ids - O(n) ○ Queries the User Info Service with a multiget to obtain additional info about the user ids - O(n)
  • 22. Use case: User views the user timeline ● The Client posts a user timeline request to the Web Server ● The Web Server forwards the request to the Read API server ● The Read API retrieves the user timeline from the SQL Database
  • 23. Use case: User searches keywords ● The Client sends a search request to the Web Server ● The Web Server forwards the request to the Search API server ● The Search API contacts the Search Service, which does the following: ○ Parses/tokenizes the input query, determining what needs to be searched ■ Removes markup ■ Breaks up the text into terms ■ Fixes typos ■ Normalizes capitalization ■ Converts the query to use boolean operation ● Queries the Search Cluster (ie Lucene) for the results: ○ Scatter gathers each server in the cluster to determine if there are any results for the query ○ Merges, ranks, sorts, and returns the result
  • 28. Replicas Master Slave Write Scale the design
  • 29. Points of Bottlenecks SQL Write Master- Slave A single SQL write master- slave might be overwhelmed Fan out service In case of users with millions of followers
  • 30. Points of Optimization (Fan Out Service) ● Avoid fanout of users with millions of followers ● Search to find tweets for highly-followed users ● Merge the results with user’s home tweets ● Re-order and serve them
  • 31. Points of Optimization (SQL Write Master-Slave) ● Federation (Functional Partitioning) ● Sharding ● Denormalization ● SQL Tuning
  • 32. Points of Optimization (Additional) ● Keep only several hundred tweets for each home timeline in the Memory Cache ● Keep only active users' home timeline info in the Memory Cache ● If a user was not previously active in the past 30 days, we could rebuild the timeline from the SQL Database ● Query the User Graph Service to determine who the user is following ● Get the tweets from the SQL Database and add them to the Memory Cache ● Store only a month of tweets in the Tweet Info Service ● Store only active users in the User Info Service ● The Search Cluster would likely need to keep the tweets in memory to keep latency low
  • 34. CREDITS: Diese Präsentationsvorlage wurde von Slidesgo erstellt, inklusive Icons von Flaticon und Infografiken & Bilder von Freepik Thanks! Any Questions?