27 May 2008

       Folke Lemaitre
      Director of Development
       http://nl.netlog.com/folke




  What we learned about
scalability & high availability
Overview



‣ What is Netlog?
‣ Translations
‣ Network topology
‣ Scaling Databases
‣ Caching
‣ Search
‣ Q&A
What is Netlog?
Social Network



‣ Create your own profile
‣ Discover your friendsʼ activity
‣ Communicate
‣ Explore new content
‣ Applications
Your Profile
What: itʼs personal



  ‣ You rule: itʼs yours

  (Diagram labels: YOU, ANOTHER, Music, Photos, Games, Videos, People, Blogs, Relations.)
Friend Activity



 ‣ Share & discover friendsʼ activity
  • Toon Coppens uploads a new photo
  • Pinguke V changes her profile photo
  • Mari . comments on her photo
  • Jan Maarten Willems signs the guestbook of nico b
  • Jaak Noukens and Jo are now friends
  • Stijn Symons uploads a new photo
  • Kenny Gryp signs the guestbook of Lorenz Bogaert
Communication: Shouts
Communication: Ratings & Comments
Communication: Private messaging
Communication: Chat
Communication: Clans
Explore


‣ Profiles
‣ Photos
‣ Blogs
‣ Clans
‣ Music
‣ Events
‣ Videos
‣ Applications
‣ Pages
Applications



‣ OpenSocial
 • sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1

‣ Officially announced tomorrow at Google I/O
 • Stay tuned!

‣ Public launch planned for June
Developer Pages

      http://nl.netlog.com/go/developer
Itʼs going pretty good




 ‣ More than 35,000,000 unique members
 ‣ More than 4,000,000,000 pageviews/Month
 ‣ 19 languages and more coming up
 ‣ More than 20 countries
 ‣ Current Alexa Top-100 ranking (most visited web sites in the world)


 ‣ Current ComScore Europe Top-10 ranking
(Charts, January 2007 to April 2008: Monthly Visits, axis 0 to 200,000,000; Monthly Unique Visitors, axis 0 to 40,000,000; Monthly Page Requests, axis 0 to 5,000,000,000.)
(Pie chart of regions: Western Europe 46%, Southern Europe 22%, Americas 3%; further slices of 16%, 10% and 3% are split among Western Asia, Northern Europe and Eastern Europe.)
Translations
19 languages and a lot more coming!

Afrikaans, български, Català, česky, Dansk, Deutsch, Eesti, English, Español,
français, Hrvatski, Italiano, Latviešu valoda, Lietuvių kalba, Magyar, Nederlands,
Norsk (bokmål), Polski, Português, Română, Русский, Slovenčina, slovenščina,
suomi, Svenska, Türkçe
Translate Tool
Template
Parsed Template
Translated Template
Generated PHP code
Template Code
Template Output
Network Topology
Overview


Netlog Datacenters (diagram)

‣ Internet, Firewall
‣ Web Load Balancer, Web Cluster
‣ Static Load Balancer, Storage Servers
‣ CDN
‣ Database Pools (each pool: one Master with Slaves)
 • User Pool
 • Activity Pool
 • Friendships Pool
 • Primary Pool
 • ...
‣ Memcache Pools
 • Session Cache
 • General Cache
 • Html Cache
Web Servers

‣ Software
 • Apache 2
 • PHP 5.2.6
 • eAccelerator 0.9.5.2 for bytecode caching
 • Keepalived for high availability

‣ 200 servers
‣ 450 000 requests per second
Database Servers



‣ MySQL Enterprise 4.1.22
‣ 200 database servers
‣ 40 thousand tables
‣ 70 billion records
‣ 60 thousand queries per second
Memcache Servers



‣ Memcached 1.2.4
‣ 60 servers
‣ 250 thousand requests/second
‣ 450 GB of memory
Static servers



‣ Software:
 • Lighttpd
 • NginX

‣ Used for:
 • static files: css/javascript/images/...
 • user content: photos, videos

‣ Content Delivery Network: Akamai & Panther
Other servers



‣ OpenSocial:
 • Shindig
 • Tomcat

‣ Search:
 • Sphinx
Scaling Databases
Database & Scalability



‣ Database pools

‣ Replication

‣ Partitioning
Database Pools



‣ Different data on different database pools:
 • messaging
 • friendships
 • blogs
 • music
 • videos
 • ...
Replication



‣ write to one master
‣ read from multiple slaves (and master)
‣ pros
 • easy to implement
 • read intensive applications scale very well
‣ cons
 • write intensive applications donʼt scale
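
A minimal sketch of the read/write split described above. The class name, the server names and the "starts with SELECT" rule are illustrative assumptions, not Netlog's actual code.

```python
import random


class ReplicatedPool:
    """Routes writes to the master and reads to a randomly chosen reader."""

    def __init__(self, master, slaves):
        self.master = master
        # The slide notes that the master can serve reads as well.
        self.readers = list(slaves) + [master]

    def connection_for(self, sql):
        # Assumption: anything that is not a plain SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return random.choice(self.readers) if is_read else self.master


pool = ReplicatedPool("db-master", ["db-slave-1", "db-slave-2"])
print(pool.connection_for("UPDATE user SET nick = 'x' WHERE userid = 123"))
print(pool.connection_for("SELECT * FROM user WHERE userid = 123"))
```

Adding slaves adds read capacity, but every write still lands on the single master, which is the limit named under "cons".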
Partitioning (sharding)



‣ Divide data on primary key:
 • all user data for users with id 1 - 10 in database1
 • all user data for users with id 11 - 20 in database2
 • ...

‣ Best scaling possible
‣ How?
 • managed in code
 • MySQL partitioning (available from version 5.1)
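
A sketch of the "managed in code" option, using the toy ranges from the slide (ids 1 - 10 in database1, 11 - 20 in database2). The function name and the fixed range size are assumptions for illustration; a real setup would more likely use a lookup table so that users can be moved between shards.

```python
USERS_PER_SHARD = 10  # matches the toy ranges on the slide


def shard_for_user(userid: int) -> str:
    """Maps a user id to the database that holds all data for that user."""
    if userid < 1:
        raise ValueError("userid must be positive")
    return f"database{(userid - 1) // USERS_PER_SHARD + 1}"


print(shard_for_user(1))   # database1
print(shard_for_user(10))  # database1
print(shard_for_user(11))  # database2
```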
Analyse, analyse, analyse!


‣ Tag your queries
 •   SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */


‣ Analyse mysql slow logs
‣ Analyse process lists
‣ Analyse based on tags
 •   1023 User::getUser():230
 •   512 User::isOnline():124
 •   10   Activities::getActivity():320


‣ Cron job that runs every minute and checks for “too many
 connections”
 • if “too many connections”, log process list
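
The tag-based analysis above can be sketched as a small counter over logged queries. This assumes the tag is always the trailing SQL comment; the function name and the sample queries are made up for the example.

```python
import re
from collections import Counter

# Matches a trailing /* ... */ comment and captures its content.
TAG = re.compile(r"/\*\s*(.+?)\s*\*/\s*$")


def count_tags(queries):
    """Returns (tag, count) pairs, most frequent first."""
    counts = Counter()
    for sql in queries:
        match = TAG.search(sql)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts.most_common()


sample = [
    "SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */",
    "SELECT * FROM USER WHERE userid = 456 /*User::getUser():11 */",
    "SELECT 1",
]
for tag, n in count_tags(sample):
    print(n, tag)
```

Fed with a saved process list or slow log, the top entries point straight at the class, method and line that issue the most queries.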
Caching
Introduction to memcached



‣ Developed by Danga Interactive:
  • http://www.danga.com/


‣ Initially developed for LiveJournal:
  • http://www.livejournal.com/


‣ OpenSource
Introduction to memcached



‣ Least Recently Used
‣ Fast!
‣ Distributed
‣ Automatic failover
‣ Big Hash table: set/add/get/delete
What to cache?



‣ sessions
‣ query caching
‣ processed data
‣ generated html
Session Cache



‣ 99% hit ratio
‣ Time to live is 20 minutes
‣ Faster than session database
Query Cache



‣ Why memcache and not MySQL query cache?
 • MySQL invalidates all cached queries on a table on
     every update to that table
 • each replicated database keeps its own separate
     query cache

‣ Add to generic database classes
 • Cache key is query
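
A sketch of "cache key is query" inside a generic database layer. A plain dict stands in for the memcached client, and hashing the SQL text is an assumption added here because memcached keys are limited to 250 bytes and may not contain whitespace; all names are hypothetical.

```python
import hashlib

cache = {}      # stand-in for a memcached client
db_calls = []   # records which queries really reached the database


def run_query(sql):
    """Hypothetical database call; returns a fake result set."""
    db_calls.append(sql)
    return [("result of", sql)]


def cached_query(sql):
    """Returns the cached result for this exact SQL text, querying on a miss."""
    key = "q:" + hashlib.sha256(sql.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = run_query(sql)
    return cache[key]


cached_query("SELECT * FROM USER WHERE userid = 123")
cached_query("SELECT * FROM USER WHERE userid = 123")
print(len(db_calls))  # the second call was served from the cache
```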
Processed data



‣ Better to cache processed data than query
 results
HTML Caching



‣ Profile blocks are fully cached
‣ Data needed to generate html is also cached
‣ When data changes, the html is invalidated and the cached
 data is updated

‣ High cache hit rate on profile pages
3 ways of caching



‣ Cache with TTL

‣ Cache forever with invalidate

‣ Cache forever with update
Cache with TTL



‣ The good:
 • Quickly achieve better performance on existing code

‣ The bad:
 • Users see outdated information
 • TTL can not be high
 • Caching efficiency is minimal
Cache with TTL


‣ Cache friends for 5 minutes
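
The original code slide is not part of this transcript. A minimal sketch of caching a friends list for 5 minutes could look like this; `TTLCache` is a small in-memory stand-in for memcached and the loader function is hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for memcached with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]
            return None
        return value


FRIENDS_TTL = 5 * 60  # seconds
db_reads = []         # records every trip to the database


def load_friends_from_db(userid):
    """Hypothetical database read."""
    db_reads.append(userid)
    return [userid + 1, userid + 2]


def get_friends(cache, userid):
    key = f"friends:{userid}"
    friends = cache.get(key)
    if friends is None:
        friends = load_friends_from_db(userid)
        cache.set(key, friends, FRIENDS_TTL)
    return friends
```

A new friendship stays invisible for up to 5 minutes, which is the "users see outdated information" drawback listed above.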
Cache forever with invalidate



‣ The Good:
 • fairly easy to implement
 • user never sees outdated data
Cache friends forever


‣ For memcached this means ttl=0
Invalidate Cache
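
The code for these two slides is missing from the transcript. A sketch of the idea, with dicts standing in for memcached and for the friendships database (all names are assumptions):

```python
cache = {}                    # stand-in for memcached, entries never expire
friends_table = {1: [2, 3]}   # stand-in for the friendships database


def get_friends(userid):
    key = f"friends:{userid}"
    if key not in cache:
        # Stored without expiry; with memcached this means ttl=0.
        cache[key] = list(friends_table.get(userid, []))
    return cache[key]


def add_friend(userid, friendid):
    friends_table.setdefault(userid, []).append(friendid)
    # Invalidate: drop the cached list, the next read repopulates it.
    cache.pop(f"friends:{userid}", None)


print(get_friends(1))
add_friend(1, 4)
print(get_friends(1))
```

Note that ttl=0 only disables expiry; memcached can still evict the entry under memory pressure (least recently used), so the read path must always handle a miss.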
Cache forever with update



‣ The Good:
 • Best caching possible
 • Can reduce your select queries to the minimum
Update Cache (array)


‣ Only update cache when no db queries needed
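
A sketch of updating a cached array in place, assuming the same dict stand-ins as before (names are hypothetical). The cached list is only patched when it is already there; on a miss nothing is loaded just to update it.

```python
cache = {}           # stand-in for memcached
friends_table = {}   # stand-in for the friendships database


def get_friends(userid):
    key = f"friends:{userid}"
    if key not in cache:
        cache[key] = list(friends_table.get(userid, []))
    return cache[key]


def add_friend(userid, friendid):
    friends_table.setdefault(userid, []).append(friendid)
    key = f"friends:{userid}"
    cached = cache.get(key)
    if cached is not None:
        # The list is already cached: patch it, no database query needed.
        cache[key] = cached + [friendid]
    # Otherwise leave the cache empty; the next read will fill it.
```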
Update Cache (simple value)



‣ No need to check cache
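
For a simple value the write path can overwrite the cache unconditionally, since the new value is fully known. A sketch with hypothetical names and dict stand-ins:

```python
cache = {}         # stand-in for memcached
users_table = {}   # stand-in for the user database
db_reads = []      # records reads that had to go to the database


def set_nickname(userid, nickname):
    users_table[userid] = nickname
    # No lookup first: just overwrite whatever is cached.
    cache[f"nick:{userid}"] = nickname


def get_nickname(userid):
    key = f"nick:{userid}"
    if key not in cache:
        db_reads.append(userid)
        cache[key] = users_table.get(userid)
    return cache[key]
```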
Global Locking



‣ Use memcache as locking mechanism
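
A common way to do this relies on memcached's `add`, which only succeeds when the key does not exist yet. The sketch below assumes that approach; `FakeMemcache` is an in-memory stand-in that ignores the TTL, and the helper names are made up.

```python
import time


class FakeMemcache:
    """In-memory stand-in; a real client would also honour the ttl."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, ttl=0):
        # Like memcached add: fails if the key already exists.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        return self._data.pop(key, None) is not None


def acquire_lock(mc, name, ttl=10, retries=50, delay=0.01):
    """Tries to take the lock, retrying a few times before giving up."""
    for _ in range(retries):
        if mc.add(f"lock:{name}", 1, ttl):
            return True
        time.sleep(delay)
    return False


def release_lock(mc, name):
    mc.delete(f"lock:{name}")
```

With a real memcached the TTL matters: it frees the lock automatically if the holder crashes before releasing it.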
Global Locking: Chat Example

‣ Example: add new message to cached shared
 chat thread
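
A sketch of the chat case: the cached thread is read, extended and written back while holding a lock, so two concurrent writers cannot overwrite each other's message. The stand-in client and all key names are assumptions.

```python
class FakeMemcache:
    """In-memory stand-in for a memcached client (ttl is ignored)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ttl=0):
        self._data[key] = value

    def add(self, key, value, ttl=0):
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def append_chat_message(mc, thread_id, message):
    """Appends to the cached thread; returns False if the lock is taken."""
    lock_key = f"lock:chat:{thread_id}"
    if not mc.add(lock_key, 1, 5):
        return False  # another writer holds the lock, the caller may retry
    try:
        thread = mc.get(f"chat:{thread_id}") or []
        mc.set(f"chat:{thread_id}", thread + [message])
        return True
    finally:
        mc.delete(lock_key)
```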
Flooding detection



‣ User can only redo action A after a timeout
 • a guestbook message can only be posted once every
   2 minutes



‣ User can not do action A more than X times in T
 minutes
 • only 12 failed login attempts per hour are allowed
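
Both rules can be sketched on top of memcached's `add` and `incr`, assuming expiring keys do the timing. The in-memory stand-in, the function names and the key layout below are illustrative, not the code from the original slide.

```python
import time


class FakeMemcache:
    """In-memory stand-in with expiry, add and incr."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and self._clock() >= entry[1]:
            del self._data[key]
            entry = None
        return entry

    def add(self, key, value, ttl):
        if self._live(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def incr(self, key):
        entry = self._live(key)
        if entry is None:
            return None
        # Like memcached incr: the expiry time is left unchanged.
        self._data[key] = (entry[0] + 1, entry[1])
        return entry[0] + 1


def allow_after_timeout(mc, user, action, timeout):
    """Rule 1: the action is allowed once per timeout (seconds)."""
    return mc.add(f"flood:{action}:{user}", 1, timeout)


def allow_x_per_window(mc, user, action, limit, window):
    """Rule 2: at most `limit` actions per `window` seconds."""
    key = f"count:{action}:{user}"
    if mc.add(key, 1, window):
        return True  # first action in this window
    count = mc.incr(key)
    return count is not None and count <= limit
```

For the slide's examples this would be `allow_after_timeout(mc, user, "guestbook", 120)` and `allow_x_per_window(mc, user, "failed_login", 12, 3600)`.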
Search
MySQL full-text search



‣ Initially used for our search
  • can be very slow
  • extra load on most of our databases, since most
   content is searchable

‣ Better search engine needed
  • Sphinx!
  • OpenSource search engine developed by Andrew
   Aksyonoff (http://sphinxsearch.com/)
Sphinx Features



‣ very fast indexing
‣ very fast searching
 • 0.04 seconds average
 • 5 million searches / day
 • 60 searches / second
‣ distributed
‣ document fields
‣ stopwords
‣ api available in many languages
 •   PHP, Java, Python, Ruby, Perl, C++, ...
Sphinx Indexer



‣ Index is read-only (except for attributes)
‣ Build new index while searching old one
‣ How we index:
  • rebuild full index from data once in a while (daily,
      weekly)
  •   generate delta indexes often (every minute, 5
      minutes)
      •   contains changes for search index since last full index merge
  • full index merge of previous index and delta (every
      hour)
Sphinx Search



‣ Search query returns list of ids
‣ For every result page shown, we fetch data
 associated with ids
 • data is cached with memcache for every id
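
A sketch of that last step: the search engine returns ids only, and the page data is looked up per id in the cache, with one database read for the misses. The dict cache, the loader and the key format are assumptions for the example.

```python
cache = {}      # stand-in for memcached
db_reads = []   # records which ids had to be loaded from the database


def load_from_db(ids):
    """Hypothetical batched database read."""
    db_reads.append(list(ids))
    return {i: {"id": i, "title": f"item {i}"} for i in ids}


def hydrate(ids):
    """Turns search result ids into rows, using the per-id cache first."""
    found = {i: cache[f"item:{i}"] for i in ids if f"item:{i}" in cache}
    missing = [i for i in ids if i not in found]
    if missing:
        for i, row in load_from_db(missing).items():
            cache[f"item:{i}"] = row
            found[i] = row
    # Keep the ranking order that the search engine returned.
    return [found[i] for i in ids]


print([row["id"] for row in hydrate([30, 10, 20])])
```

Because entries are cached per id rather than per query, overlapping result pages from different searches share the same cached rows.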
Thank you!




             Questions?

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 

Netlog: What we learned about scalability & high availability

  • 1. 27 May 2008 Folke Lemaitre Director of Development http://nl.netlog.com/folke What we learned about scalability & high availability
  • 2. Overview ‣ What is Netlog? ‣ Translations ‣ Network topology ‣ Scaling Databases ‣ Caching ‣ Search ‣ Q&A
  • 4. Social Network ‣ Create your own profile ‣ Discover your friendsʼ activity ‣ Communicate ‣ Explore new content ‣ Applications
  • 6. What: itʼs personal ‣ You rule: itʼs yours (diagram labels: YOU, ANOTHER, Music, Photos, Games, Videos, People, Blogs, Relations)
  • 7. Friend Activity ‣ Share & discover friendsʼ activity • Toon Coppens uploads a new photo • Pinguke V changes her profile photo • Mari . comments on her photo • Jan Maarten Willems signs nico bʼs guestbook • Jaak Noukens and Jo are now friends • Stijn Symons uploads a new photo • Kenny Gryp signs Lorenz Bogaertʼs guestbook
  • 13. Explore Blogs Profiles Photos Clans Music Events Videos Applications Pages
  • 14. Applications ‣ OpenSocial • sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1 ‣ Officially announced tomorrow at Google I/O • Stay tuned! ‣ Public launch for June
  • 15. Developer Pages http://nl.netlog.com/go/developer
  • 16. Itʼs going pretty good ‣ More than 35,000,000 unique members ‣ More than 4,000,000,000 pageviews/Month ‣ 19 languages and more coming up ‣ More than 20 countries ‣ Current Alexa Top-100 ranking (most visited web sites in the world) ‣ Current ComScore Europe Top-10 ranking
  • 17. Itʼs going pretty good (charts, January 2007 to April 2008: Monthly Visits, Monthly Unique Visitors, Monthly Page Requests; breakdown by region: Western Europe, Southern Europe, Northern Europe, Eastern Europe, Western Asia, Americas)
  • 20. 19 languages and a lot more coming! Slovenčina Español Català Svenska suomi česky slovenščina Deutsch Magyar Nederlands français Русский Italiano Afrikaans English Dansk Türkçe Polski Hrvatski Lietuvių kalba Eesti Latviešu valoda Português Română български Norsk (bokmål)
  • 29. Overview Netlog Datacenters (diagram labels) ‣ Internet, Firewall, Web Load Balancer, Static Load Balancer ‣ Web Cluster ‣ Database Pools, each with Master and Slave servers: User Pool, Activity Pool, Friendships Pool, ... ‣ Memcache Pools: Session Cache, General Cache, Html Cache ‣ Primary Pool ‣ Storage Servers ‣ CDN
  • 30. Web Servers ‣ Software • Apache 2 • PHP 5.2.6 • eAccelerator 0.9.5.2 for bytecode caching • Keepalived for high availability ‣ 200 servers ‣ 450 000 requests per second
  • 31. Database Servers ‣ MySQL Enterprise 4.1.22 ‣ 200 database servers ‣ 40 thousand tables ‣ 70 billion records ‣ 60 thousand queries per second
  • 32. Memcache Servers ‣ Memcached 1.2.4 ‣ 60 servers ‣ 250 thousand requests/second ‣ 450 GB of memory
  • 33. Static servers ‣ Software: • Lighttpd • nginx ‣ Used for: • static files: css/javascript/images/... • user content: photos, videos ‣ Content Delivery Network: Akamai & Panther
  • 34. Other servers ‣ OpenSocial: • Shindig • Tomcat ‣ Search: • Sphinx
  • 36. Database & Scalability ‣ Database pools ‣ Replication ‣ Partitioning
  • 37. Database Pools ‣ Different data on different database pools: • messaging • friendships • blogs • music • videos • ...
  • 38. Replication ‣ write to one master ‣ read from multiple slaves (and master) ‣ pros • easy to implement • read intensive applications scale very well ‣ cons • write intensive applications donʼt scale
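The read/write split on the Replication slide can be sketched roughly as follows. This is a minimal illustration, not Netlog's code: `ReplicatedPool` and the server names are hypothetical, and the "connections" are plain strings.

```python
import itertools

class ReplicatedPool:
    """Routes writes to the master and spreads reads over slaves and master."""

    def __init__(self, master, slaves):
        self.master = master
        # Reads rotate over every slave plus the master (round-robin).
        self._readers = itertools.cycle(list(slaves) + [master])

    def connection_for(self, sql):
        # Anything that is not a SELECT may modify data and must go to the master.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.master

# Hypothetical server names, for illustration only.
pool = ReplicatedPool("master", ["slave1", "slave2"])
```

Adding slaves adds read capacity, but every write still lands on the single master, which is why write-intensive workloads do not scale this way.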
  • 39. Partitioning (sharding) ‣ Divide data on primary key: • all user data for users with id 1 - 10 in database1 • all user data for users with id 11 - 20 in database2 • ... ‣ Best scaling possible ‣ How? • managed in code • MySQL partitioning (available from version 5.1)
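The "managed in code" option could look something like the sketch below, which maps a user id to a database name using the slide's own example ranges. The function name and the shard size constant are assumptions made for illustration.

```python
USERS_PER_SHARD = 10  # the slide's example size; real shards would be far larger

def shard_for_user(user_id: int) -> str:
    """Return the name of the database that holds all data for this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    # ids 1-10 -> database1, ids 11-20 -> database2, ...
    return f"database{(user_id - 1) // USERS_PER_SHARD + 1}"
```

Because every query for a user goes through one lookup like this, both reads and writes spread across shards.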
  • 40. Analyse, analyse, analyse! ‣ Tag your queries • SELECT * FROM USER WHERE userid = 123 /* User::getUser():11 */ ‣ Analyse mysql slow logs ‣ Analyse process lists ‣ Analyse based on tags • 1023 User::getUser():230 • 512 User::isOnline():124 • 10 Activities::getActivity():320 ‣ minutely cron that checks for “too many connections” • if “too many connections”, log process list
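A rough sketch of query tagging and of the per-tag counts shown on the slide. The helper names are hypothetical, and the input is assumed to be a list of logged query strings, for example taken from a slow log or a process list.

```python
import re
from collections import Counter

def tag_query(sql: str, tag: str) -> str:
    """Append the calling location to the query as a SQL comment."""
    return f"{sql} /* {tag} */"

# Matches a trailing /* ... */ comment and captures its content.
TAG_RE = re.compile(r"/\*\s*(.+?)\s*\*/\s*$")

def count_by_tag(queries):
    """Count logged queries per tag, most frequent first."""
    counts = Counter()
    for query in queries:
        match = TAG_RE.search(query)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts.most_common()
```

The tag travels with the query into every log, so a spike in the process list can be traced back to a class, method and line.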
  • 42. Introduction to memcached ‣ Developed by Danga Interactive: • http://www.danga.com/ ‣ Initially developed for LiveJournal: • http://www.livejournal.com/ ‣ OpenSource
  • 43. Introduction to memcached ‣ Least Recently Used ‣ Fast! ‣ Distributed ‣ Automatic failover ‣ Big Hash table: set/add/get/delete
  • 44. What to cache? ‣ sessions ‣ query caching ‣ processed data ‣ generated html
  • 45. Session Cache ‣ 99% hit ratio ‣ Time to live is 20 minutes ‣ Faster than session database
  • 46. Query Cache ‣ Why memcache and not MySQL query cache? • MySQL invalidates cached queries on a table on every update • different query cache for different replicated databases ‣ Add to generic database classes • Cache key is query
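One way to add this to a generic database class, sketched under assumptions: `run_query` is a hypothetical callable that executes SQL, and a plain dict stands in for the memcached client. The query text is hashed because memcached keys are length-limited and may not contain whitespace.

```python
import hashlib

class CachedDatabase:
    """Wraps a query function with a look-aside cache keyed on the query text."""

    def __init__(self, run_query, cache):
        self.run_query = run_query  # hypothetical callable: sql -> rows
        self.cache = cache          # dict used as a stand-in for memcached

    @staticmethod
    def cache_key(sql):
        # Hash the query so the key is short and free of spaces.
        return "query:" + hashlib.md5(sql.encode("utf-8")).hexdigest()

    def select(self, sql):
        key = self.cache_key(sql)
        if key not in self.cache:
            # Cache miss: run the query once and remember the result.
            self.cache[key] = self.run_query(sql)
        return self.cache[key]
```

Expiry is left out of this sketch; a real client would pass a TTL or invalidate the entry explicitly.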
  • 47. Processed data ‣ Better to cache processed data than query results
  • 49. HTML Caching ‣ Profile blocks are fully cached ‣ Data needed to generate html is also cached ‣ When data changes, html is invalidated, cached data updated ‣ High cache hit rate on profile pages
  • 50. 3 ways of caching ‣ Cache with TTL ‣ Cache forever with invalidate ‣ Cache forever with update
  • 51. Cache with TTL ‣ The good: • Quickly achieve better performance on existing code ‣ The bad: • Users see outdated information • TTL cannot be high • Caching efficiency is minimal
  • 52. Cache with TTL ‣ Cache friends for 5 minutes
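The slide's code is not in this transcript. A minimal sketch of caching a friend list for 5 minutes might look like this, with `TTLCache` as a small in-memory stand-in for memcached and `load_friends_from_db` as a hypothetical loader.

```python
import time

class TTLCache:
    """Tiny in-memory stand-in for memcached with per-key expiry."""

    def __init__(self, clock=time.time):
        self._data, self._clock = {}, clock

    def get(self, key):
        value, expires = self._data.get(key, (None, 0))
        return value if self._clock() < expires else None

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

FRIENDS_TTL = 5 * 60  # seconds

def get_friends(user_id, cache, load_friends_from_db):
    key = f"friends:{user_id}"
    friends = cache.get(key)
    if friends is None:  # miss or expired
        friends = load_friends_from_db(user_id)
        cache.set(key, friends, FRIENDS_TTL)
    return friends
```

Nothing tells the cache when a friendship changes, so a reader can see a list that is up to 5 minutes old.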
  • 53. Cache forever with invalidate ‣ The Good: • fairly easy to implement • user never sees outdated data
  • 54. Cache friends forever ‣ For memcached this means ttl=0
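A sketch of the invalidate variant, under assumptions: plain dicts stand in for both the database and memcached (an entry stored without expiry plays the role of ttl=0), and the function names are hypothetical.

```python
def get_friends_cached(user_id, cache, db):
    key = f"friends:{user_id}"
    if key not in cache:
        # Miss: load from the database and keep the result indefinitely.
        cache[key] = sorted(db.get(user_id, set()))
    return cache[key]

def add_friend(user_id, friend_id, cache, db):
    db.setdefault(user_id, set()).add(friend_id)  # write to the database first
    cache.pop(f"friends:{user_id}", None)         # then drop the stale entry
```

Every code path that changes the friend list has to invalidate the key, and the first read after a change pays for a database query.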
  • 56. Cache forever with update ‣ The Good: • Best caching possible • Can reduce your select queries to the minimum
  • 57. Update Cache (array) ‣ Only update cache when no db queries needed
  • 58. Update Cache (simple value) ‣ No need to check cache
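The two update cases might be sketched as below. Dicts again stand in for the database and memcached, and all names are hypothetical. For the array, the cached list is patched only when it is already present, so the write path never runs an extra query; for the simple value, the new value is written without reading the cache first.

```python
def add_friend_cached(user_id, friend_id, cache, friends_db):
    friends_db.setdefault(user_id, []).append(friend_id)
    key = f"friends:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        # Entry exists: patch it instead of invalidating it.
        cache[key] = cached + [friend_id]
    # Entry missing: do nothing; the next read will load it from the database.

def set_nickname(user_id, nickname, cache, nickname_db):
    nickname_db[user_id] = nickname
    # Simple value: overwrite blindly, no need to check the cache first.
    cache[f"nickname:{user_id}"] = nickname
```

The array update is a read-modify-write, so two concurrent writers can overwrite each other; that is the kind of race a lock is meant to prevent.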
  • 59. Global Locking ‣ Use memcache as locking mechanism
  • 60. Global Locking: Chat Example ‣ Example: add new message to cached shared chat thread
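A common way to build such a lock relies on memcached's `add`, which stores a key only if it does not exist yet. The sketch below assumes that approach; `FakeMemcache` is an in-memory stand-in, and the key names, retry counts and timeouts are illustrative.

```python
import time

class FakeMemcache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self, clock=time.time):
        self._data, self._clock = {}, clock

    def _live(self, key):
        entry = self._data.get(key)
        return entry is not None and (entry[1] is None or self._clock() < entry[1])

    def get(self, key):
        return self._data[key][0] if self._live(key) else None

    def set(self, key, value, ttl=0):
        self._data[key] = (value, self._clock() + ttl if ttl else None)

    def add(self, key, value, ttl=0):
        # Like memcached's add: stores only if the key is absent.
        if self._live(key):
            return False
        self.set(key, value, ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)

def acquire_lock(cache, name, ttl=5, retries=50, wait=0.01):
    for _ in range(retries):
        if cache.add(f"lock:{name}", 1, ttl):  # only one caller can succeed
            return True
        time.sleep(wait)
    return False

def release_lock(cache, name):
    cache.delete(f"lock:{name}")

def post_chat_message(cache, thread_id, message):
    if not acquire_lock(cache, f"chat:{thread_id}"):
        raise TimeoutError("could not lock chat thread")
    try:
        key = f"chat:{thread_id}"
        messages = cache.get(key) or []
        cache.set(key, messages + [message])  # safe read-modify-write under the lock
    finally:
        release_lock(cache, f"chat:{thread_id}")
```

The TTL on the lock key means a crashed holder cannot block the thread forever, at the cost that a very slow holder may lose the lock before it finishes.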
  • 61. Flooding detection ‣ User can only redo action A after a timeout • a guestbook message can only be posted once every 2 minutes ‣ User cannot do action A more than X times in T minutes • only 12 failed login attempts per hour are allowed
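Both rules map onto memcached-style `add` and `incr` with expiry. The sketch below is one possible reading, not the deck's code: `FakeMemcache` is an in-memory stand-in, and the key names and function names are hypothetical.

```python
import time

class FakeMemcache:
    """In-memory stand-in offering memcached-like add and incr with expiry."""

    def __init__(self, clock=time.time):
        self._data, self._clock = {}, clock

    def _live(self, key):
        entry = self._data.get(key)
        return entry is not None and self._clock() < entry[1]

    def add(self, key, value, ttl):
        # Stores only if the key is absent or expired.
        if self._live(key):
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def incr(self, key):
        # Returns None when the key is absent; keeps the original expiry.
        if not self._live(key):
            return None
        value, expires = self._data[key]
        self._data[key] = (value + 1, expires)
        return value + 1

def allow_after_timeout(cache, user_id, action, timeout):
    """Rule 1: allow the action only if the previous one is `timeout` seconds old."""
    return cache.add(f"flood:{action}:{user_id}", 1, timeout)

def allow_within_limit(cache, user_id, action, limit, window):
    """Rule 2: allow at most `limit` actions per `window` seconds."""
    key = f"count:{action}:{user_id}"
    if cache.add(key, 1, window):  # first action opens a new window
        return True
    count = cache.incr(key)
    # None means the window expired between add and incr; allow in that case.
    return count is None or count <= limit
```

For the slide's examples, the guestbook rule would be `allow_after_timeout(cache, user_id, "guestbook", 120)` and the login rule `allow_within_limit(cache, user_id, "login_fail", 12, 3600)`. The counter uses a fixed window that starts at the first action, not a sliding one.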
  • 65. MySQL full-text search ‣ Initially used for our search • can be very slow • extra load on most of our databases, since most content is searchable ‣ Better search engine needed • Sphinx! • OpenSource search engine developed by Andrew Aksyonoff (http://sphinxsearch.com/)
  • 66. Sphinx Features ‣ very fast indexing ‣ very fast searching • 0.04 seconds average • 5 million searches / day • 60 searches / second ‣ distributed ‣ document fields ‣ stopwords ‣ api available in many languages • PHP, Java, Python, Ruby, Perl, C++, ...
  • 67. Sphinx Indexer ‣ Index is read-only (except for attributes) ‣ Build new index while searching old one ‣ How we index: • rebuild full index from data once in a while (daily, weekly) • generate delta indexes often (every minute, 5 minutes) • contains changes for search index since last full index merge • full index merge of previous index and delta (every hour)
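The full-plus-delta scheme can be modelled in a few lines. This is a toy model of the idea only: it does not use Sphinx or its API, the "index" is a plain dict rather than an inverted index, and deletions are ignored.

```python
class MainPlusDelta:
    """Toy model of a large main index plus a small, frequently rebuilt delta."""

    def __init__(self):
        self.main = {}   # doc_id -> text, rebuilt rarely (daily, weekly)
        self.delta = {}  # doc_id -> text changed since the last merge

    def rebuild_main(self, all_docs):
        self.main = dict(all_docs)
        self.delta = {}

    def rebuild_delta(self, changed_docs):
        # Cheap and frequent: only documents changed since the last merge.
        self.delta = dict(changed_docs)

    def merge(self):
        # Periodic merge: fold the delta into the main index.
        self.main = {**self.main, **self.delta}
        self.delta = {}

    def search(self, word):
        # The delta wins over the main index for the same document id.
        merged = {**self.main, **self.delta}
        return sorted(i for i, text in merged.items() if word in text.split())
```

Searches keep hitting the existing main index while the small delta is rebuilt, which is what keeps new content searchable within minutes.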
  • 68. Sphinx Search ‣ Search query returns list of ids ‣ For every result page shown, we fetch data associated with ids • data is cached with memcache for every id
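Turning a page of result ids into records might look like the sketch below. A dict stands in for memcached, and `load_from_db` is a hypothetical loader that takes a list of ids and returns a mapping from id to record.

```python
def fetch_results(ids, cache, load_from_db):
    """Return records in the order of `ids`, using the cache where possible."""
    missing = [i for i in ids if f"item:{i}" not in cache]
    if missing:
        # One batched lookup for everything the cache does not have yet.
        for item_id, record in load_from_db(missing).items():
            cache[f"item:{item_id}"] = record
    # Ids that no longer exist in the database are skipped.
    return [cache[f"item:{i}"] for i in ids if f"item:{i}" in cache]
```

Only the ids on the page being shown are fetched, so the cost per search stays small even when the result set is large.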
  • 70. Thank you! Questions?