27 May 2008

       Folke Lemaitre
      Director of Development
       http://nl.netlog.com/folke




  What we learned a...
Overview



‣ What is Netlog?
‣ Translations
‣ Network topology
‣ Scaling Databases
‣ Caching
‣ Search
‣ Q&A
What is Netlog?
Social Network



‣ Create your own profile
‣ Discover your friendsʼ activity
‣ Communicate
‣ Explore new content
‣ Applica...
Your Profile
What: itʼs personal



  ‣ You rule: itʼs yours

              Music                            YOU
                      ...
Friend Activity



 ‣ Share & discover friendsʼ activity
                     Pinguke V
                                  ...
Communication: Shouts
Communication: Ratings & Comments
Communication: Private messaging
Communication: Chat
Communication: Clans
Explore


                              Blogs
   Profiles
                  Photos




                           Clans    ...
Applications



‣ OpenSocial
 • sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1

‣ Officially announced tom...
Developer Pages

      http://nl.netlog.com/go/developer
Itʼs going pretty good




 ‣ More than 35,000,000 unique members
 ‣ More than 4,000,000,000 pageviews/Month
 ‣ 19 languag...
[Charts: monthly visits, monthly unique visitors and monthly page requests, January 2007 to April 2008]
Itʼs going pretty good
Translations
19 languages and a lot more coming!

                           Slovenčina
         Español Català
                        ...
Translate Tool
Template
Parsed Template
Translated Template
Generated PHP code
Template Code
Template Output
Network Topology
Overview


            Netlog Datacenters
                                                               Database Pools

 ...
Web Servers

‣ Software
 • Apache 2
 • PHP 5.2.6
 • eAccelerator 0.9.5.2 for bytecode caching
 • Keepalived for high avail...
Database Servers



‣ MySQL Enterprise 4.1.22
‣ 200 database servers
‣ 40 thousand tables
‣ 70 billion records
‣ 60 thousa...
Memcache Servers



‣ Memcached 1.2.4
‣ 60 servers
‣ 250 thousand requests/second
‣ 450 GB of memory
Static servers



‣ Software:
 • Lighttpd
 • Nginx

‣ Used for:
 • static files: css/javascript/images/...
 • user content:...
Other servers



‣ OpenSocial:
 • Shindig
 • Tomcat

‣ Search:
 • Sphinx
Scaling Databases
Database & Scalability



‣ Database pools

‣ Replication

‣ Partitioning
Database Pools



‣ Different data on different database pools:
 • messaging
 • friendships
 • blogs
 • music
 • videos
 •...
Replication



‣ write to one master
‣ read from multiple slaves (and master)
‣ pros
 • easy to implement
 • read intensiv...
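
The replication rule above (write to one master, read from slaves and master) can be sketched as a small router. This is only an illustration: the class name, the server names and the "starts with SELECT" test are assumptions, not Netlog's actual code.

```python
import random


class ReplicatedPool:
    """Routes writes to the single master and reads to any server in the pool."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self.rng = rng or random.Random()

    def server_for(self, sql):
        # Assumption: anything that is not a plain SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read:
            return self.master
        # Reads may be served by a slave or by the master.
        return self.rng.choice(self.slaves + [self.master])


pool = ReplicatedPool("master1", ["slave1", "slave2"])
print(pool.server_for("UPDATE user SET nick = 'x' WHERE userid = 123"))
```

Adding slaves spreads the reads over more servers, but every write still lands on the one master, which is why write-heavy workloads need another approach.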
Partitioning (sharding)



‣ Divide data on primary key:
 • all user data for users with id 1 - 10 in database1
 • all use...
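
A minimal sketch of range-based partitioning "managed in code", using the slide's toy numbers (10 users per database). The function name and the `users_per_shard` parameter are hypothetical.

```python
def shard_for_user(user_id, users_per_shard=10):
    """Return the name of the database that holds all data for this user id."""
    if user_id < 1:
        raise ValueError("user ids start at 1")
    # ids 1-10 -> database1, ids 11-20 -> database2, and so on.
    shard_number = (user_id - 1) // users_per_shard + 1
    return f"database{shard_number}"


print(shard_for_user(7), shard_for_user(11))
```

Because the mapping depends only on the primary key, any web server can work out where a user's rows live without asking a central directory.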
Analyse, analyse, analyse!


‣ Tag your queries
 •   SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */


‣ Ana...
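
Query tagging could look roughly like this: the caller is appended as a trailing SQL comment, and a list of running queries (for example taken from a process list) is tallied per tag. The helper names and the regular expression are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches a trailing /* ... */ comment and captures its content.
TAG_PATTERN = re.compile(r"/\*\s*(.*?)\s*\*/\s*$")


def tag_query(sql, tag):
    """Append the calling location to the query as a comment."""
    return f"{sql} /*{tag} */"


def count_tags(running_queries):
    """Count queries per tag, most frequent first."""
    counts = Counter()
    for sql in running_queries:
        match = TAG_PATTERN.search(sql)
        counts[match.group(1) if match else "(untagged)"] += 1
    return counts.most_common()


print(tag_query("SELECT * FROM USER WHERE userid = 123", "User::getUser():11"))
```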
Caching
Introduction to memcached



‣ Developed by Danga Interactive:
  • http://www.danga.com/


‣ Initially developed for LiveJ...
Introduction to memcached



‣ Least Recently Used
‣ Fast!
‣ Distributed
‣ Automatic failover
‣ Big Hash table: set/add/ge...
What to cache?



‣ sessions
‣ query caching
‣ processed data
‣ generated html
Session Cache



‣ 99% hit ratio
‣ Time to live is 20 minutes
‣ Faster than session database
Query Cache



‣ Why memcache and not MySQL query cache?
 • MySQL invalidates cached queries on a table on
     every upda...
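
A rough sketch of a query cache inside a generic database helper, keyed on the query text. `FakeCache` is an in-memory stand-in for a memcached client, and `cached_query`, `query_key` and the 300-second TTL are hypothetical choices, not the talk's actual code.

```python
import hashlib


class FakeCache:
    """In-memory stand-in for a memcached client (illustration only, ignores TTL)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl=0):
        self.data[key] = value


def query_key(sql):
    # memcached keys are limited to 250 bytes and may not contain whitespace,
    # so the query text is hashed instead of being used directly.
    return "query:" + hashlib.md5(sql.encode("utf-8")).hexdigest()


def cached_query(cache, run_query, sql, ttl=300):
    """Return cached rows for this query, running it only on a cache miss."""
    key = query_key(sql)
    rows = cache.get(key)
    if rows is None:
        rows = run_query(sql)
        cache.set(key, rows, ttl)
    return rows
```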
Processed data



‣ Better to cache processed data than query
 results
HTML Caching
HTML Caching



‣ Profile blocks are fully cached
‣ Data needed to generate html is also cached
‣ When data changes, html i...
3 ways of caching



‣ Cache with TTL

‣ Cache forever with invalidate

‣ Cache forever with update
Cache with TTL



‣ The good:
 • Quickly achieve better performance on existing code

‣ The bad:
 • Users see outdated inf...
Cache with TTL


‣ Cache friends for 5 minutes
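
Caching a friends list for 5 minutes might look like the sketch below. `FakeCache` is an in-memory stand-in for a memcached client with an injectable clock, and `get_friends` and `load_friends_from_db` are hypothetical names.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client; ttl=0 means never expire."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}  # key -> (value, expires_at or None)

    def get(self, key):
        value, expires_at = self.data.get(key, (None, None))
        if expires_at is not None and self.clock() >= expires_at:
            del self.data[key]
            return None
        return value

    def set(self, key, value, ttl=0):
        self.data[key] = (value, self.clock() + ttl if ttl else None)


FRIENDS_TTL = 5 * 60  # seconds


def get_friends(cache, load_friends_from_db, user_id):
    """Read through the cache; entries silently expire after FRIENDS_TTL."""
    key = f"friends:{user_id}"
    friends = cache.get(key)
    if friends is None:
        friends = load_friends_from_db(user_id)
        cache.set(key, friends, FRIENDS_TTL)
    return friends
```

A friendship added inside the 5-minute window stays invisible until the entry expires, which is the "users see outdated information" drawback.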
Cache forever with invalidate



‣ The Good:
 • fairly easy to implement
 • user never sees outdated data
Cache friends forever


‣ For memcached this means ttl=0
Invalidate Cache
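
One way to sketch "cache forever with invalidate": the friends list is stored with ttl=0 and deleted whenever the underlying data changes. `FakeCache` is an in-memory stand-in for a memcached client, a plain dict plays the role of the friendships table, and the function names are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl=0):
        self.data[key] = value  # ttl=0: keep until deleted

    def delete(self, key):
        self.data.pop(key, None)


def get_friends(cache, db, user_id):
    key = f"friends:{user_id}"
    friends = cache.get(key)
    if friends is None:
        friends = sorted(db.get(user_id, set()))
        cache.set(key, friends, 0)
    return friends


def add_friend(cache, db, user_id, friend_id):
    db.setdefault(user_id, set()).add(friend_id)
    # Invalidate: the next get_friends() call rebuilds the entry from the database.
    cache.delete(f"friends:{user_id}")
```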
Cache forever with update



‣ The Good:
 • Best caching possible
 • Can reduce your select queries to the minimum
Update Cache (array)


‣ Only update cache when no db queries needed
Update Cache (simple value)



‣ No need to check cache
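
The two update styles can be sketched side by side. For an array, the cached copy is extended only if it already exists, so no extra database query is needed; for a simple value, the new value is written straight to the cache. `FakeCache` is an in-memory stand-in for a memcached client, and all names are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl=0):
        self.data[key] = value


def add_friend(cache, friends_db, user_id, friend_id):
    """Array case: update the cached list only when it is already cached."""
    friends_db.setdefault(user_id, []).append(friend_id)
    key = f"friends:{user_id}"
    friends = cache.get(key)
    if friends is not None:
        cache.set(key, friends + [friend_id], 0)
    # Not cached: do nothing. The next read loads the full list from the database.


def set_nickname(cache, nickname_db, user_id, nickname):
    """Simple value case: overwrite the cache without checking it first."""
    nickname_db[user_id] = nickname
    cache.set(f"nickname:{user_id}", nickname, 0)
```

The get-then-set in `add_friend` is not atomic on a real memcached server, so concurrent writers would need some form of locking around it.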
Global Locking



‣ Use memcache as locking mechanism
Global Locking: Chat Example

‣ Example: add new message to cached shared
 chat thread
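
A sketch of the locking idea, relying on the fact that memcached's `add` only stores a key that does not exist yet. `FakeCache` mimics that behaviour in memory; the key names, retry counts and lock TTL are assumptions chosen for illustration.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client; ttl=0 means never expire."""

    def __init__(self, clock=time.time):
        self.clock, self.data = clock, {}

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.data[key]
            return None
        return value

    def set(self, key, value, ttl=0):
        self.data[key] = (value, self.clock() + ttl if ttl else None)

    def add(self, key, value, ttl=0):
        # Like memcached's add: store only if the key does not exist yet.
        if self.get(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def delete(self, key):
        self.data.pop(key, None)


def acquire_lock(cache, name, ttl=5, retries=20, wait=0.01):
    """Try to take the lock; the TTL frees it if the holder crashes."""
    for _ in range(retries):
        if cache.add(f"lock:{name}", 1, ttl):
            return True
        time.sleep(wait)
    return False


def add_chat_message(cache, thread_id, message):
    """Append a message to the cached chat thread while holding its lock."""
    name = f"chat:{thread_id}"
    if not acquire_lock(cache, name):
        return False
    try:
        messages = cache.get(name) or []
        cache.set(name, messages + [message], 0)
        return True
    finally:
        cache.delete(f"lock:{name}")
```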
Flooding detection



‣ User can only redo action A after a timeout
 • a guestbook message can only be posted once every
 ...
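
Both flooding rules can be sketched on top of a cache with `add` and `incr`. `FakeCache` is an in-memory stand-in for a memcached client, the function and key names are hypothetical, and the counter uses a fixed window that starts at the first counted action, which only approximates "X times in T minutes".

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client with add/incr and expiry."""

    def __init__(self, clock=time.time):
        self.clock, self.data = clock, {}

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.data[key]
            return None
        return value

    def add(self, key, value, ttl=0):
        # Store only if the key does not exist yet.
        if self.get(key) is not None:
            return False
        self.data[key] = (value, self.clock() + ttl if ttl else None)
        return True

    def incr(self, key, delta=1):
        # Increment an existing counter, keeping its expiry; None if missing.
        value = self.get(key)
        if value is None:
            return None
        self.data[key] = (value + delta, self.data[key][1])
        return value + delta


def may_post_guestbook(cache, user_id, timeout=120):
    """Rule 1: allowed only if no marker from the last `timeout` seconds exists."""
    return cache.add(f"flood:guestbook:{user_id}", 1, timeout)


def register_failed_login(cache, user_id, limit=12, window=3600):
    """Rule 2: count a failed login; True while the user is within the limit."""
    key = f"flood:login:{user_id}"
    if cache.add(key, 1, window):
        return True
    count = cache.incr(key)
    if count is None:  # counter expired between add() and incr()
        cache.add(key, 1, window)
        count = 1
    return count <= limit
```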
Flooding detection
Flooding detection



‣ User can only redo action A after a timeout
 • a guestbook message can only be posted once every
 ...
Search
MySQL full-text search



‣ Initially used for our search
  • can be very slow
  • extra load on most of our databases, si...
Sphinx Features



‣ very fast indexing
‣ very fast searching
 • 0.04 seconds average
 • 5 million searches / day
 • 60 se...
Sphinx Indexer



‣ Index is read-only (except for attributes)
‣ Build new index while searching old one
‣ How we index:
 ...
Sphinx Search



‣ Search query returns list of ids
‣ For every result page shown, we fetch data
 associated with ids
 • d...
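
Turning the ids returned by a search into displayable results might look like this sketch: each id is looked up in the cache and loaded from the database only on a miss. `FakeCache` is an in-memory stand-in for a memcached client; `fetch_results`, `load_from_db` and the key format are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ttl=0):
        self.data[key] = value


def fetch_results(cache, load_from_db, ids):
    """Return the data for each id, in the order the search engine ranked them."""
    results = []
    for item_id in ids:
        key = f"item:{item_id}"
        item = cache.get(key)
        if item is None:
            item = load_from_db(item_id)
            cache.set(key, item, 0)
        results.append(item)
    return results
```

Only the ids on the page being shown are fetched, so a search with thousands of hits still costs at most one page of lookups.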
Thank you!




             Questions?
Netlog: What we learned about scalability & high availability

Talk I did @ http://www.kingsofcode.nl about the things we learned over the last year about making http://www.netlog.com scalable and delivering high performance to our users...


  1. 1. 27 May 2008 Folke Lemaitre Director of Development http://nl.netlog.com/folke What we learned about scalability & high availability
  2. 2. Overview ‣ What is Netlog? ‣ Translations ‣ Network topology ‣ Scaling Databases ‣ Caching ‣ Search ‣ Q&A
  3. 3. What is Netlog?
  4. 4. Social Network ‣ Create your own profile ‣ Discover your friendsʼ activity ‣ Communicate ‣ Explore new content ‣ Applications
  5. 5. Your Profile
  6. 6. What: itʼs personal ‣ You rule: itʼs yours Music YOU ANOTHER Photos Games ANOTHER YOU Videos People Blogs Photos Relations.
  7. 7. Friend Activity ‣ Share & discover friendsʼ activity [Screenshot of a Dutch-language activity feed with entries such as: Jaak Noukens and Jo are now friends; Stijn Symons uploads a new photo; Kenny Gryp signs the guestbook of Lorenz Bogaert]
  8. 8. Communication: Shouts
  9. 9. Communication: Ratings & Comments
  10. 10. Communication: Private messaging
  11. 11. Communication: Chat
  12. 12. Communication: Clans
  13. 13. Explore Blogs Profiles Photos Clans Music Events Videos Applications Pages
  14. 14. Applications ‣ OpenSocial • sandbox: http://nl.netlog.com/go/developer/opensocial/sandbox=1 ‣ Officially announced tomorrow @ Google I/O • Stay tuned! ‣ Public launch in June
  15. 15. Developer Pages http://nl.netlog.com/go/developer
  16. 16. Itʼs going pretty good ‣ More than 35,000,000 unique members ‣ More than 4,000,000,000 pageviews/Month ‣ 19 languages and more coming up ‣ More than 20 countries ‣ Current Alexa Top-100 ranking (most visited web sites in the world) ‣ Current ComScore Europe Top-10 ranking
  17. 17. Itʼs going pretty good [Charts, January 2007 to April 2008: Monthly Visits, Monthly Unique Visitors, Monthly Page Requests; pie chart by region: Western Europe, Southern Europe, Eastern Europe, Northern Europe, Western Asia, Americas]
  18. 18. Itʼs going pretty good
  19. 19. Translations
  20. 20. 19 languages and a lot more coming! Slovenčina Español Català Svenska suomi česky slovenščina Deutsch Magyar Nederlands français Русский Italiano Afrikaans English Dansk Türkçe Polski Hrvatski Lietuvių kalba Eesti Latviešu valoda Português Română български Norsk (bokmål)
  21. 21. Translate Tool
  22. 22. Template
  23. 23. Parsed Template
  24. 24. Translated Template
  25. 25. Generated PHP code
  26. 26. Template Code
  27. 27. Template Output
  28. 28. Network Topology
  29. 29. Overview Netlog Datacenters Database Pools Slave Slave Master Master Slave Slave User Pool Activity Pool Web Cluster Slave Slave Master Master Slave Slave Friendships Pool ... Internet Web Load Balancer Firewall Memcache Pools Static Load Balancer Session Cache Slave Master General Cache Slave Html Cache Primary Pool CDN Storage Servers
  30. 30. Web Servers ‣ Software • Apache 2 • PHP 5.2.6 • eAccelerator 0.9.5.2 for bytecode caching • Keepalived for high availability ‣ 200 servers ‣ 450 000 requests per second
  31. 31. Database Servers ‣ MySQL Enterprise 4.1.22 ‣ 200 database servers ‣ 40 thousand tables ‣ 70 billion records ‣ 60 thousand queries per second
  32. 32. Memcache Servers ‣ Memcached 1.2.4 ‣ 60 servers ‣ 250 thousand requests/second ‣ 450 GB of memory
  33. 33. Static servers ‣ Software: • Lighttpd • Nginx ‣ Used for: • static files: css/javascript/images/... • user content: photos, videos ‣ Content Delivery Network: Akamai & Panther
  34. 34. Other servers ‣ OpenSocial: • Shindig • Tomcat ‣ Search: • Sphinx
  35. 35. Scaling Databases
  36. 36. Database & Scalability ‣ Database pools ‣ Replication ‣ Partitioning
  37. 37. Database Pools ‣ Different data on different database pools: • messaging • friendships • blogs • music • videos • ...
  38. 38. Replication ‣ write to one master ‣ read from multiple slaves (and master) ‣ pros • easy to implement • read intensive applications scale very well ‣ cons • write intensive applications donʼt scale
  39. 39. Partitioning (sharding) ‣ Divide data on primary key: • all user data for users with id 1 - 10 in database1 • all user data for users with id 11 - 20 in database2 • ... ‣ Best scaling possible ‣ How? • managed in code • MySQL partitioning (available from version 5.1)
  40. 40. Analyse, analyse, analyse! ‣ Tag your queries • SELECT * FROM USER WHERE userid = 123 /*User::getUser():11 */ ‣ Analyse MySQL slow logs ‣ Analyse process lists ‣ Analyse based on tags • 1023 User::getUser():230 • 512 User::isOnline():124 • 10 Activities::getActivity():320 ‣ Cron job that runs every minute and checks for “too many connections” • if “too many connections”, log process list
  41. 41. Caching
  42. 42. Introduction to memcached ‣ Developed by Danga Interactive: • http://www.danga.com/ ‣ Initially developed for LiveJournal: • http://www.livejournal.com/ ‣ OpenSource
  43. 43. Introduction to memcached ‣ Least Recently Used ‣ Fast! ‣ Distributed ‣ Automatic failover ‣ Big Hash table: set/add/get/delete
  44. 44. What to cache? ‣ sessions ‣ query caching ‣ processed data ‣ generated html
  45. 45. Session Cache ‣ 99% hit ratio ‣ Time to live is 20 minutes ‣ Faster than session database
  46. 46. Query Cache ‣ Why memcache and not MySQL query cache? • MySQL invalidates cached queries on a table on every update • different query cache for different replicated databases ‣ Add to generic database classes • Cache key is query
  47. 47. Processed data ‣ Better to cache processed data than query results
  48. 48. HTML Caching
  49. 49. HTML Caching ‣ Profile blocks are fully cached ‣ Data needed to generate html is also cached ‣ When data changes, html is invalidated, cached data updated ‣ High cache hit rate on profile pages
  50. 50. 3 ways of caching ‣ Cache with TTL ‣ Cache forever with invalidate ‣ Cache forever with update
  51. 51. Cache with TTL ‣ The good: • Quickly achieve better performance on existing code ‣ The bad: • Users see outdated information • TTL cannot be high • Caching efficiency is minimal
  52. 52. Cache with TTL ‣ Cache friends for 5 minutes
  53. 53. Cache forever with invalidate ‣ The Good: • fairly easy to implement • user never sees outdated data
  54. 54. Cache friends forever ‣ For memcached this means ttl=0
  55. 55. Invalidate Cache
  56. 56. Cache forever with update ‣ The Good: • Best caching possible • Can reduce your select queries to the minimum
  57. 57. Update Cache (array) ‣ Only update cache when no db queries needed
  58. 58. Update Cache (simple value) ‣ No need to check cache
  59. 59. Global Locking ‣ Use memcache as locking mechanism
  60. 60. Global Locking: Chat Example ‣ Example: add new message to cached shared chat thread
  61. 61. Flooding detection ‣ User can only redo action A after a timeout • a guestbook message can only be posted once every 2 minutes ‣ User cannot do action A more than X times in T minutes • only 12 failed login attempts per hour are allowed
  62. 62. Flooding detection
  63. 63. Flooding detection ‣ User can only redo action A after a timeout • a guestbook message can only be posted once every 2 minutes ‣ User cannot do action A more than X times in T minutes • only 12 failed login attempts per hour are allowed
  64. 64. Search
  65. 65. MySQL full-text search ‣ Initially used for our search • can be very slow • extra load on most of our databases, since most content is searchable ‣ Better search engine needed • Sphinx! • OpenSource search engine developed by Andrew Aksyonoff (http://sphinxsearch.com/)
  66. 66. Sphinx Features ‣ very fast indexing ‣ very fast searching • 0.04 seconds average • 5 million searches / day • 60 searches / second ‣ distributed ‣ document fields ‣ stopwords ‣ API available in many languages • PHP, Java, Python, Ruby, Perl, C++, ...
  67. 67. Sphinx Indexer ‣ Index is read-only (except for attributes) ‣ Build new index while searching old one ‣ How we index: • rebuild full index from data once in a while (daily, weekly) • generate delta indexes often (every minute, 5 minutes) • contains changes for search index since last full index merge • full index merge of previous index and delta (every hour)
  68. 68. Sphinx Search ‣ Search query returns list of ids ‣ For every result page shown, we fetch data associated with ids • data is cached with memcache for every id
  69. 69. Thank you! Questions?