Disclaimer: We’re not fully there yet. We hire: jobs@groups-inc.com
Challenges @ GROU.PS: 3M unique visitors per month; 120M page views; 1PB of assets to be served every month (video, photos, files); support for 5 Gbit/s. Very dynamic pages. With social networks: p(u,t) = HTML; p(g,u,t) = HTML -> WHERE group_id = ? AND …
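To put the 1PB figure next to the 5 Gbit/s requirement, here is a rough back-of-the-envelope calculation. It is sketched in Python rather than the deck's PHP, and it assumes a decimal petabyte and a 30-day month, neither of which is stated on the slide.

```python
# Average bandwidth implied by serving 1 PB of assets per month.
# Assumptions (not from the slides): 1 PB = 10**15 bytes, 30-day month.
PETABYTE_BYTES = 10**15
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def average_gbit_per_second(bytes_per_month: int) -> float:
    """Convert a monthly byte volume into an average rate in Gbit/s."""
    bits = bytes_per_month * 8
    return bits / SECONDS_PER_MONTH / 10**9


avg = average_gbit_per_second(PETABYTE_BYTES)
print(f"average: {avg:.2f} Gbit/s")  # average: 3.09 Gbit/s
```

The average comes out at roughly 3 Gbit/s, so a 5 Gbit/s capacity leaves limited headroom for peaks above the monthly mean.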
What is GROU.PS?
Distributed Architecture: 25+ servers, S3 cloud, EdgeCast CDN; 4+ cores; all Linux: Red Hat, some Debian, Ubuntu, CentOS
Amazon Technologies: S3; CloudFront; EC2 (elastic IP and persistent storage); SimpleDB; queue technologies, distributed Hadoop and more…
Amazon Technologies, downside: not so cheap; bad database performance
Serving Content? Use MogileFS: distributed file serving. Use a CDN; hot content is served from local servers. Sysctl tunings needed!
MySQL: Load off via memcache
$memcache->set("group_by_name.jtpd", 1122, false, 0);
$memcache->set("home_module_html.1122", …, true, 30);
function getGroupID($group_name) {
    global $memcache;
    if (!isset($memcache) || ($res = $memcache->get("group_by_name.{$group_name}")) === false) {
        // no cache or cache miss: get it from mysql and store it in memcache
    } else {
        return $res; // serve from memcache
    }
}
MySQL: Replication is easy; split reads. What about writes? That’s where sharding comes into play: vertical sharding, horizontal sharding. MMM
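As a rough illustration of how read splitting and horizontal sharding fit together, the sketch below routes a query to a host. It is in Python rather than PHP, and the modulo-on-group-id scheme, the `ShardRouter` class and the host names are all assumptions made for the example; the slides do not say how GROU.PS picks a shard.

```python
import itertools


class ShardRouter:
    """Pick a MySQL host: shard by group id, writes to master, reads to replicas."""

    def __init__(self, shards):
        # shards: list of {"master": str, "replicas": [str, ...]}
        self._shards = shards
        # Round-robin over replicas; fall back to the master if a shard has none.
        self._replica_cycles = [
            itertools.cycle(shard["replicas"] or [shard["master"]])
            for shard in shards
        ]

    def shard_for(self, group_id: int) -> int:
        # Horizontal sharding: all rows of one group live on the same shard.
        return group_id % len(self._shards)

    def host_for(self, group_id: int, is_write: bool) -> str:
        index = self.shard_for(group_id)
        if is_write:
            return self._shards[index]["master"]
        return next(self._replica_cycles[index])


# Hypothetical host names.
router = ShardRouter([
    {"master": "db0-master", "replicas": ["db0-replica-a", "db0-replica-b"]},
    {"master": "db1-master", "replicas": ["db1-replica-a"]},
])

print(router.host_for(1122, is_write=True))   # db0-master
print(router.host_for(1122, is_write=False))  # db0-replica-a
print(router.host_for(1122, is_write=False))  # db0-replica-b
print(router.host_for(1123, is_write=True))   # db1-master
```

Plain modulo keeps the example short but makes adding shards painful, since most groups would change shard; a lookup table from group to shard is a common alternative.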
MySQL: Runs poorly on multi-cores
query_cache_size = 0 # on master
query_cache_type = 0 # on master
thread_concurrency = 8 # total cores
max_connections = 750 # shouldn’t exceed that
innodb_buffer_pool_size = 10G # a little less than the total amount
MySQL Query Optimization: INDEX (group, user) is a B-tree, so it serves WHERE group = ? AND user = ? and WHERE group = ? alone, but not WHERE user = ? alone (only a leftmost prefix of the index can be used)
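The effect of column order in a composite B-tree index can be observed with a query planner. The sketch below uses Python's built-in SQLite as a stand-in for MySQL, which is an assumption: the two planners differ in detail, and on MySQL one would run EXPLAIN instead. The `memberships` table and `idx_group_user` index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memberships (group_id INTEGER, user_id INTEGER, role TEXT)"
)
# Composite index with group_id as the leading column.
conn.execute("CREATE INDEX idx_group_user ON memberships (group_id, user_id)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    params = (1,) * sql.count("?")  # dummy values for the placeholders
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


both = plan("SELECT role FROM memberships WHERE group_id = ? AND user_id = ?")
group_only = plan("SELECT role FROM memberships WHERE group_id = ?")
user_only = plan("SELECT role FROM memberships WHERE user_id = ?")

print("both columns:", both)        # index search on idx_group_user
print("group only:  ", group_only)  # index search on idx_group_user
print("user only:   ", user_only)   # full table scan, index not used
```

Filtering on the leading column, with or without the second one, lets the planner search the index; filtering on the second column alone falls back to scanning the table.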
MySQL Query Optimization: SHOW PROCESSLIST; Maatkit, mk-query-digest; Percona builds
NOSQL: Voldemort (LinkedIn); Cassandra (Facebook); Tokyo Cabinet (mixi)
Logging: Database logging is not the solution; the file system is expensive too; a legal necessity
Logging solution: Scribe & Thrift, by Facebook; eventually consistent
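"Eventually consistent" here presumably refers to store-and-forward logging: entries are buffered near the web server and reach the central log once the collector is reachable. The sketch below illustrates that idea only. It is Python rather than PHP, it does not use Scribe's or Thrift's actual API, and `BufferingLogger` and `send_to_collector` are hypothetical names.

```python
from collections import deque


class BufferingLogger:
    """Queue log entries locally and forward them in batches when possible."""

    def __init__(self, send_batch, batch_size=100):
        self._send_batch = send_batch  # callable that ships a list of entries
        self._batch_size = batch_size
        self._pending = deque()

    def log(self, category, message):
        # Cheap for the request path: no database or network write here.
        self._pending.append((category, message))

    def flush(self):
        """Try to forward pending entries; return how many were delivered."""
        delivered = 0
        while self._pending:
            count = min(self._batch_size, len(self._pending))
            batch = [self._pending[i] for i in range(count)]
            try:
                self._send_batch(batch)
            except ConnectionError:
                break  # keep the entries and retry on a later flush
            for _ in batch:
                self._pending.popleft()
            delivered += len(batch)
        return delivered


central_log = []      # stands in for the central log store
collector_up = False  # simulate an outage first


def send_to_collector(batch):
    if not collector_up:
        raise ConnectionError("collector unreachable")
    central_log.extend(batch)


logger = BufferingLogger(send_to_collector, batch_size=2)
for i in range(3):
    logger.log("pageview", f"group=1122 hit={i}")

delivered_while_down = logger.flush()      # 0: nothing is lost, just delayed
collector_up = True
delivered_after_recovery = logger.flush()  # 3: the backlog arrives in order
print(delivered_while_down, delivered_after_recovery, len(central_log))  # 0 3 3
```

The central log lags behind during the outage and catches up afterwards, which is the trade-off the slide accepts in exchange for keeping logging off the database.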
More to come on my blog: http://emresokullu.com. More fine tuning tips. Become a member of my community. Love grou.ps ;) Convert to PHP. We’re hiring: jobs@groups-inc.com
How does GROU.PS scale to serving 1PB of assets each month? memcache, nginx, gearman, tornado, libevent, kqueue, epoll, mysql, sharding, replication, memcached, tokyo cabinet