Building ZingMe News Feed System


               Chau Nguyen Nhat Thanh
                Senior Software Manager
             User Platform Division - VNG
Agenda

1) Introduction to News Feed
2) ZingMe News Feed system history
3) ZingMe News Feed system
4) Some statistics
5) Bonus
6) Q&A
Introduction to News Feed
●   Update friends' info
●   Update Biz info
●   Update VIP info
●   Interact with them through comments, likes, ...
●   Terms
    ●   Social graph: Users in most social networking
        sites are describable in terms of a social
        graph. The relationships between users are
        represented by adjacency lists. If Jack and Jill
        are friends, they are said to be adjacent. This
        is known as an "edge" in the graph. (from
        Quora)
    ●   Not only Friends
    ●   but also Followers …
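
The adjacency-list idea above can be sketched in a few lines. This is only an illustration, not the ZingMe implementation: the class name `SocialGraph` and its methods are hypothetical. Friendship is stored as an undirected edge and following as a directed one.

```python
from collections import defaultdict


class SocialGraph:
    """Hypothetical in-memory social graph built on adjacency lists."""

    def __init__(self):
        self._friends = defaultdict(set)    # undirected edges
        self._followers = defaultdict(set)  # directed: user -> who follows them

    def add_friendship(self, a, b):
        # Friends are adjacent in both directions.
        self._friends[a].add(b)
        self._friends[b].add(a)

    def add_follower(self, follower, followee):
        # Following is one-way: only the followee gains an audience member.
        self._followers[followee].add(follower)

    def audience(self, user):
        # Everyone who should see this user's actions: friends and followers.
        return self._friends[user] | self._followers[user]


graph = SocialGraph()
graph.add_friendship("jack", "jill")
graph.add_follower("fan", "jack")
print(sorted(graph.audience("jack")))
```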
Introduction to News Feed
●   What do we need?
    ●   When someone performs an action, their friends should see it on
        their home page as soon as possible
●   How do we solve the problem?
    ●   Solution 1: Push model (fan out on write)
    ●   Solution 2: Pull model (fan out on read)
    ●   Solution 3: Mixing push and pull (Feeding Frenzy, a paper from
        Yahoo)
Introduction to News Feed
●   Push model
    ●   This method involves denormalizing the user's activity
        data and pushing the meta data to all the user's friends
        at the time it occurs. (from Quora)
●   Pull model
    ●   This method involves keeping all recent activity data in
        memory and pulling in (or fanning out) that data at the
        time a user loads their home page. Data doesn't need to
        be pushed out to all subscribers as soon as it happens,
        so no back-log and no disk seeks (from Quora)
●   Mixed model
    ●   Push model for active users
    ●   Pull model for inactive users
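
To make the difference between the two models concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration, not the ZingMe code: the names `PushFeed`, `PullFeed`, and `FRIENDS` are hypothetical, and a simple counter stands in for timestamps. Push pays the cost when an item is written; pull pays it when the home page is read.

```python
from collections import defaultdict

# Hypothetical friend lists (adjacency lists).
FRIENDS = {"alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice"]}


class PushFeed:
    """Fan out on write: copy each new item into every friend's inbox."""

    def __init__(self, friends):
        self.friends = friends
        self.inbox = defaultdict(list)

    def publish(self, author, item):
        for friend in self.friends.get(author, []):
            self.inbox[friend].append((author, item))

    def home(self, user):
        # Reading is cheap: the inbox is already materialized.
        return list(reversed(self.inbox[user]))


class PullFeed:
    """Fan out on read: store items once, merge them at read time."""

    def __init__(self, friends):
        self.friends = friends
        self.outbox = defaultdict(list)
        self.clock = 0  # logical clock used to order items

    def publish(self, author, item):
        self.clock += 1
        self.outbox[author].append((self.clock, author, item))

    def home(self, user):
        merged = []
        for friend in self.friends.get(user, []):
            merged.extend(self.outbox[friend])
        merged.sort(reverse=True)  # newest first
        return [(author, item) for _, author, item in merged]


push, pull = PushFeed(FRIENDS), PullFeed(FRIENDS)
for feed in (push, pull):
    feed.publish("bob", "b1")
    feed.publish("carol", "c1")
print(push.home("alice"))
print(pull.home("alice"))
```

Both models return the same home page; a mixed model would pick one of the two code paths per user, for example based on how recently the user was active.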
ZingMe News Feed system history

●   First version
    ●   PHP for the workers
    ●   MySQL for feed items
    ●   MySQL for feed indexing
    ●   Full feature set: feed type filtering, ignoring
        users, ...
    ●   Restarting the DB and other services was our favorite
        job at that time :)
    ●   Lesson learned:
        –   A relational DB may not be a good fit for this kind of project
ZingMe News Feed system history

●   Second version
    ●   Still PHP for the workers
    ●   Cassandra for feed items
    ●   A home-built list ID service for feed indexing
    ●   Memcached for caching items
    ●   Removed all deluxe features :) (stupid features, a result of
        our limited technical skills)
    ●   Restarting Cassandra and waiting for compaction were our
        favorite jobs :) :)
    ●   Headaches with avatar changes
    ●   Lesson learned: trust only ourselves
ZingMe News Feed system history

●   Third version
    ●   Moved to Java for better performance
    ●   Still Cassandra for feed items
    ●   Trying out Redis in the lab
    ●   Kept only simple features (KISS)
    ●   Could not control Memcached
        –   The new item expired before the old one ???
        –   Is Memcached wrong ???
    ●   Could not trust Cassandra
    ●   Lesson learned: Memcached is not the “thuốc tiên” (miracle cure) :)
ZingMe News Feed system
    (The current one :))

●   Still using the push model, because Twitter has published some
    information about this model
●   We lacked the technical expertise to adopt the pull model
●   Beginning to understand a little about how to keep it
    scaling
●   No longer using Cassandra for this kind of
    system → do not blindly trust anyone; learn from what
    they do and try our best
ZingMe News Feed system

●   Feed Item
    ●   UserId, ObjectId, created date, ...
    ●   Storage: home-built, based on Kyoto Cabinet
    ●   Fast recovery after a crash
●   Feed Index
    ●   UserId → [feedId1, feedId2, ...]
    ●   Storage: home-built
    ●   Fast recovery after a crash
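
The two structures above can be pictured as two key-value maps. The sketch below uses plain Python dictionaries in place of the home-built storage; the names `FeedItem` and `FeedStore`, and the `max_index_len` cap on the index, are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    feed_id: int
    user_id: int     # who performed the action
    object_id: int   # what the action refers to
    created: float   # creation time, seconds since the epoch


class FeedStore:
    """Hypothetical in-memory stand-in for the item and index stores."""

    def __init__(self, max_index_len=1000):
        self.items = {}   # feedId -> FeedItem
        self.index = {}   # userId -> [feedId, ...], newest first
        self.max_index_len = max_index_len  # assumed cap, not from the slides

    def put_item(self, item):
        self.items[item.feed_id] = item

    def add_to_index(self, user_id, feed_id):
        ids = self.index.setdefault(user_id, [])
        ids.insert(0, feed_id)
        del ids[self.max_index_len:]  # drop the oldest entries beyond the cap

    def latest(self, user_id, count):
        ids = self.index.get(user_id, [])[:count]
        return [self.items[fid] for fid in ids if fid in self.items]


store = FeedStore(max_index_len=2)
for fid in (1, 2, 3):
    store.put_item(FeedItem(fid, user_id=7, object_id=100 + fid, created=0.0))
    store.add_to_index(8, fid)  # user 8 is a friend of user 7
print([item.feed_id for item in store.latest(8, 5)])
```

Keeping the index as a list of IDs keeps it small; the item itself is stored once, however many friends reference it.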
ZingMe News Feed system

●   Rate limit
    ●   Pre-filters spam and automated tools based on the rate of write requests
    ●   When a user hits the limit, block that user for a period of time
●   Feed writer
    ●   Receives the write command
    ●   Gets the next ID from the Generator
    ●   Pushes the item to the queue
    ●   Returns the feedId for future reference
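
The write path above can be sketched as follows. This is a simplified illustration under several assumptions: `RateLimiter` and `FeedWriter` are hypothetical names, the limits are arbitrary, `itertools.count` stands in for the ID Generator, and Python's in-process `queue.Queue` stands in for the real job queue.

```python
import itertools
import queue


class RateLimiter:
    """Sliding-window limiter that blocks a user after too many writes."""

    def __init__(self, max_writes, window_seconds, block_seconds):
        self.max_writes = max_writes
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.recent = {}         # userId -> timestamps of recent writes
        self.blocked_until = {}  # userId -> time the block ends

    def allow(self, user_id, now):
        if now < self.blocked_until.get(user_id, 0.0):
            return False
        stamps = [t for t in self.recent.get(user_id, [])
                  if now - t < self.window_seconds]
        if len(stamps) >= self.max_writes:
            # Limit hit: block the user for a while and reset the window.
            self.blocked_until[user_id] = now + self.block_seconds
            self.recent[user_id] = []
            return False
        stamps.append(now)
        self.recent[user_id] = stamps
        return True


class FeedWriter:
    def __init__(self, limiter, job_queue):
        self.limiter = limiter
        self.job_queue = job_queue
        self.next_id = itertools.count(1)  # stand-in for the ID Generator

    def write(self, user_id, object_id, now):
        if not self.limiter.allow(user_id, now):
            return None  # rejected by the rate limit
        feed_id = next(self.next_id)
        self.job_queue.put({"feed_id": feed_id, "user_id": user_id,
                            "object_id": object_id, "created": now})
        return feed_id  # the caller keeps this for future reference


jobs = queue.Queue()
writer = FeedWriter(RateLimiter(max_writes=2, window_seconds=10.0,
                                block_seconds=60.0), jobs)
results = [writer.write(1, 100 + i, now=float(i)) for i in range(3)]
print(results)
```

The third write within the window is rejected and the user stays blocked until the block period ends; the writer itself does no heavy work, it only assigns an ID and enqueues the job.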
ZingMe News Feed system

●   Gearman feed storage queue
    ●   Very fast
    ●   Supports clients in multiple languages
    ●   Sometimes blocks all the workers when the network is
        unstable :)
    ●   Handles most of our heavy jobs
ZingMe News Feed system

●   Feed Sync center
    ●   Syncs new feeds to other systems, such as:
        –   Spam detection
        –   Feed ranking system
        –   Logging system
    ●   Feed replication function for future use
ZingMe News Feed system

●   Feed Render worker
    ●   The main, heavy job:
        –   Get the feed item
        –   Extract the template id
        –   Get user info
        –   Render the feed based on them
    ●   Put the rendered feed into the appropriate cache
    ●   Mobile and desktop rendering are totally different
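
A render step along these lines might look like the sketch below. Everything concrete in it is an assumption: the templates, the `TEMPLATES` and `USERS` tables, the item fields, and the cache keyed by `(platform, feed_id)` are hypothetical; a plain dictionary stands in for the cache service.

```python
from string import Template

# Hypothetical templates, one per (template id, platform).
TEMPLATES = {
    ("status", "desktop"): Template("<div class='feed'><b>$name</b> $text</div>"),
    ("status", "mobile"): Template("$name: $text"),
}
# Hypothetical user info service.
USERS = {1: {"name": "Thanh"}}


def render_feed(item, users, templates, cache):
    """Render one feed item for every platform and cache each result."""
    user = users[item["user_id"]]          # get user info
    template_id = item["template_id"]      # extract the template id
    for (tid, platform), template in templates.items():
        if tid != template_id:
            continue
        html = template.substitute(name=user["name"], text=item["text"])
        # Mobile and desktop output differ, so each gets its own cache entry.
        cache[(platform, item["feed_id"])] = html


cache = {}
item = {"feed_id": 42, "user_id": 1, "template_id": "status", "text": "hello"}
render_feed(item, USERS, TEMPLATES, cache)
print(cache[("mobile", 42)])
```

Rendering ahead of time is what lets the read path return cached output without touching the item store or the user service.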
ZingMe News Feed system

●   Feed Aggregate
    ●   Get the feed index
    ●   Get the rendered items from the cache
    ●   Return them to the front-end
    ●   Some cheats:
        –   If fewer than 5 items are cached, instead of returning
            the data, return a JavaScript snippet that reloads the list
        –   At the same time, push a task to warm up the
            rendered cache
    ●   Automatic failover when a cache service dies
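
The read path and its "cheat" can be sketched as below. This is an illustration under stated assumptions, not the production logic: the names `aggregate`, `get_rendered`, `CacheDown`, and `DeadCache` are hypothetical, `reloadFeed` is a made-up front-end function, and the extra check that something is actually missing from the cache is an assumption added so that users with short feeds are not asked to reload forever.

```python
MIN_CACHED_ITEMS = 5  # threshold taken from the slide


class CacheDown(Exception):
    """Raised by a cache client whose service is unavailable."""


class DeadCache:
    def get(self, feed_id):
        raise CacheDown("cache service unavailable")


def get_rendered(feed_id, caches):
    # Failover: try each cache service in order, skipping dead ones.
    for cache in caches:
        try:
            hit = cache.get(feed_id)
        except CacheDown:
            continue
        if hit is not None:
            return hit
    return None


def aggregate(user_id, feed_index, caches, warmup_queue, page_size=10):
    feed_ids = feed_index.get(user_id, [])[:page_size]
    rendered, missing = [], []
    for fid in feed_ids:
        html = get_rendered(fid, caches)
        if html is None:
            missing.append(fid)
        else:
            rendered.append(html)
    if len(rendered) < MIN_CACHED_ITEMS and missing:
        # Too few cached items: ask the browser to retry and warm the cache.
        warmup_queue.extend(missing)
        return {"type": "reload", "script": "setTimeout(reloadFeed, 1000);"}
    return {"type": "items", "items": rendered}


feed_index = {1: [10, 9, 8, 7, 6, 5]}
replica = {10: "<p>10</p>", 9: "<p>9</p>"}  # a dict works as a cache client
warmup = []
first = aggregate(1, feed_index, [DeadCache(), replica], warmup)
# Simulate the warm-up task finishing, then read again.
replica.update({fid: f"<p>{fid}</p>" for fid in warmup})
second = aggregate(1, feed_index, [DeadCache(), replica], [])
print(first["type"], second["type"])
```

The first request gets the reload script while the missing items are queued for rendering; the second request, after the cache is warm, gets the full list even though the first cache service is down.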
Some statistics

●   ~15M actions / day
●   10% spam
    ●   Gift-received notifications
    ●   Meaningless statuses
●   Cache hit rate: 98%
●   ~80M registered users
●   ~3M active users / day
●   At most 1000 friends per user
●   Unlimited followers
Bonus
●   Twemcache (https://github.com/twitter/twemcache)
    ●   From Twitter
    ●   Solves most of our problems with memcached
    ●   More item eviction strategies
        –   Item LRU eviction: per-slabclass LRU eviction
        –   Random eviction: evict all items from a randomly chosen slab
        –   ...
    ●   Twemcache proxy
●   Redis (http://redis.io)
    ●   A replacement for home-built storage when you do not have enough time
    ●   Sets are supported by default
    ●   Cluster support
    ●   Persistence
Question and Answer
Q&A
●   What is the problem with followers?
    ●   Handle it with a trick
    ●   Cheat the owner ;)
We are hiring!!!!!!!
Q&A

Contact info:
    Chau Nguyen Nhat Thanh
    thanhcnn@vng.com.vn
    me.zing.vn/thanhcnn2000
