Scalability at GROU.PS

    Scalability at GROU.PS - Presentation Transcript

    1. Scalability at GROU.PS
      Emre Sokullu
    2. Disclaimer
      We’re not fully there yet
      We’re hiring: jobs@groups-inc.com
    3. Challenges @ GROU.PS
      3M unique visitors per month
      120M page views
      1 PB of assets to be served every month
      Video, photos, files
      Support for 5 Gbit/s
      Very dynamic pages:
      On a social network, a page depends on the user and the time: p(u,t) = HTML
      On GROU.PS it also depends on the group: p(g,u,t) = HTML -> WHERE group_id = ? AND …
    4. What is GROU.PS ?
    5. Distributed Architecture
      25+ servers, S3 cloud, EdgeCast CDN
      4+ cores
      All Linux: mostly Red Hat
      Some Debian, Ubuntu, CentOS
    6. Amazon Technologies
      S3
      CloudFront
      EC2 (elastic IP and persistent storage)
      SimpleDB
      Queue technologies, distributed hadoop and more…
    7. Amazon Technologies
      Downside:
      Not so cheap
      Bad database performance
    8. Serving Content?
      Use MogileFS
      Distributed file serving
      Use CDN
      Hot content served from local servers
      sysctl tuning needed!
    9. Our typical sysctl additions
      net.ipv4.tcp_syncookies = 1
      net.ipv4.tcp_synack_retries = 2
      ## Emre edited
      # http://www.oracle-base.com/articles/11g/OracleDB11gR1InstallationOnFedora8.php
      kernel.shmall = 2097152
      kernel.shmmax = 2147483648
      kernel.shmmni = 4096
      # semaphores: semmsl, semmns, semopm, semmni
      kernel.sem = 250 32000 100 128
      net.ipv4.ip_local_port_range = 1024 65000
      net.core.rmem_default=4194304
      #net.core.rmem_max=4194304
      net.core.wmem_default=262144
      #net.core.wmem_max=262144
      fs.file-max=5049800
      vm.swappiness=10
      ## Emre edited
      # from http://forums.softlayer.com/showthread.php?t=3252
      net.ipv4.tcp_rmem = 4096 87380 8388608
      net.ipv4.tcp_wmem = 4096 87380 8388608
      net.core.rmem_max = 8388608
      net.core.wmem_max = 8388608
      net.core.netdev_max_backlog = 5000
      net.ipv4.tcp_window_scaling = 1
      net.ipv4.ip_nonlocal_bind=1
      # http://rackerhacker.com/2007/08/24/apache-no-space-left-on-device-couldnt-create-accept-lock/
      kernel.msgmni = 1024
      # overrides the kernel.sem value set earlier in this list
      kernel.sem = 250 256000 32 1024
      net.ipv4.ip_conntrack_max = 524288
      net.ipv4.netfilter.ip_conntrack_max = 524288
    10. MySQL
      Offload reads via memcache
      $memcache->set("group_by_name.jtpd", 1122, false, 0); // uncompressed, never expires
      $memcache->set("home_module_html.1122", …, true, 30); // compressed, expires after 30 s
      function getGroupID($group_name) {
        global $memcache;
        if (!isset($memcache) || ($res = $memcache->get("group_by_name.{$group_name}")) === false) {
          // cache miss: get it from MySQL and store it in memcache
        } else {
          return $res; // serve from memcache
        }
      }
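The lookup on this slide follows the cache-aside pattern. Here is a minimal, self-contained sketch of that pattern, not the GROU.PS implementation. The `ArrayCache` class is a hypothetical in-memory stand-in for the memcache client, and the `$loadFromDb` callback is a hypothetical placeholder for the MySQL query.

```php
<?php
// Hypothetical stand-in for a memcache client, for illustration only.
// Like Memcache::get(), get() returns false on a miss.
class ArrayCache {
    private $items = [];

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value) {
        $this->items[$key] = $value;
        return true;
    }
}

// Cache-aside lookup: try the cache first, fall back to the loader on a miss,
// then store the result so the next call is served from the cache.
function getGroupIdCached($cache, $groupName, callable $loadFromDb) {
    $key = "group_by_name.{$groupName}";
    $id = $cache->get($key);
    if ($id !== false) {
        return $id; // served from the cache
    }
    $id = $loadFromDb($groupName); // hypothetical database lookup
    if ($id !== null) {
        $cache->set($key, $id); // only cache groups that exist
    }
    return $id;
}

// Usage: the loader counts its calls so the cache hit is visible.
$dbCalls = 0;
$loader = function ($name) use (&$dbCalls) {
    $dbCalls++;
    return $name === 'jtpd' ? 1122 : null;
};
$cache = new ArrayCache();
$first = getGroupIdCached($cache, 'jtpd', $loader);
$second = getGroupIdCached($cache, 'jtpd', $loader);
```

The second call never reaches the loader, which is the load that memcache takes off MySQL.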
    11. MySQL
      Replication is easy
      Split reads across replicas
      What about writes?
      That’s where sharding comes into play
      Vertical sharding
      Horizontal sharding
      MMM (Multi-Master Replication Manager for MySQL)
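To make the read split and horizontal sharding concrete, here is a rough sketch under stated assumptions. The modulo rule, the shard count, and the host names are all hypothetical; the slides do not say how GROU.PS routes queries. Vertical sharding, by contrast, would place whole tables on different servers and is not shown.

```php
<?php
// Horizontal sharding: rows for one group always land on the same shard.
// The shard count and the modulo rule are assumptions for illustration.
function shardForGroup(int $groupId, int $shardCount): int {
    return $groupId % $shardCount;
}

// Read/write split: writes go to the master, reads rotate over the replicas.
// Host names are hypothetical placeholders.
function pickHost(string $sql, string $master, array $replicas, int $requestNo): string {
    $isRead = preg_match('/^\s*(SELECT|SHOW)\b/i', $sql) === 1;
    if (!$isRead || count($replicas) === 0) {
        return $master;
    }
    return $replicas[$requestNo % count($replicas)];
}

$replicas = ['db-replica-1', 'db-replica-2'];
$shard = shardForGroup(1122, 4);
$readHost = pickHost('SELECT * FROM posts WHERE group_id = 1122', 'db-master', $replicas, 3);
$writeHost = pickHost('UPDATE posts SET title = "x" WHERE id = 1', 'db-master', $replicas, 3);
```

A plain modulo keeps the example short, but it forces data to move whenever the shard count changes, so a lookup table or consistent hashing may be a better fit in practice.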
    12. MySQL
      Scales poorly across multiple cores
      query_cache_size = 0 # on master
      query_cache_type = 0 # on master
      thread_concurrency = 8 # total cores
      max_connections = 750 # shouldn’t exceed that
      innodb_buffer_pool_size = 10G # a little less than the total RAM
    13. MySQL Query Optimization
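      INDEX (group, user)
      Serves WHERE group = ? AND user = ?
      Does not serve WHERE user = ? alone (the order of the AND terms in the query does not matter; the order of the columns in the index does)
      B-tree: only a leftmost prefix of the index can be used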
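The effect of index column order can be sketched with a simplified model of the leftmost-prefix rule for composite B-tree indexes. This covers equality filters only and is not how the MySQL optimizer is implemented; the function name `usableIndexPrefix` is hypothetical.

```php
<?php
// Counts how many leading columns of a composite B-tree index can be used,
// given the set of columns the query filters on by equality.
function usableIndexPrefix(array $indexColumns, array $whereColumns): int {
    $used = 0;
    foreach ($indexColumns as $column) {
        if (!in_array($column, $whereColumns, true)) {
            break; // gap in the prefix: later index columns cannot be used
        }
        $used++;
    }
    return $used;
}

$index = ['group', 'user'];
$both = usableIndexPrefix($index, ['group', 'user']); // both index columns usable
$groupOnly = usableIndexPrefix($index, ['group']);    // leftmost column only
$userOnly = usableIndexPrefix($index, ['user']);      // index not usable
```

A query that filters only on `user` would need its own index, or an index that lists `user` first.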
    14. MySQL Query Optimization
      SHOW PROCESSLIST
      Maatkit, mk-query-digest
      Percona builds
    15. NoSQL
      Voldemort (LinkedIn)
      Cassandra (Facebook)
      Tokyo Cabinet (mixi)
    16. Logging
      Database logging is not the solution
      File system is expensive too
      A legal necessity
    17. Logging
      Solution:
      Scribe & Thrift
      By Facebook
      Eventually consistent
    18. Nginx & libevent
    19. Nginx & libevent
      Handles 10,000 connections
      5 Gbit/s
      Used by:
      Rambler
      WordPress
      GROU.PS
    20. Postfix
      Run multiple instances
      Spam Clusters
    21. Monitoring
      Munin + monit
      Other alternatives:
      Cacti
      Nagios
      Hyperic – vmware
    22. PHP
    23. More to come on my blog
      http://emresokullu.com
      More fine tuning tips
      Become a member of my community
      Love grou.ps ;)
      Convert to PHP
      We’re hiring: jobs@groups-inc.com