Scaling Traffic from 0 to 139 Million Unique Visitors

How Yelp scaled from 0 to 139 million unique visitors. Slides from LAUNCH Scale, 2014

Published in: Technology

  1. Michael Stoppelman, VP of Engineering, @stopman. SCALE: Traffic. October 23, 2014
  2. In the beginning of time… 2005 - 2007
  3. Traffic History at Yelp
     • 2005 - 2007
       – ~200k searches/day
       – ~850k reviews total
       – www.yelp.com
       – LAMP (P - Python)
       – Pushes were daily
       – Master-Slave MySQL setup, 1 main database, mix of InnoDB and MyISAM tables
       – Turned on gzip on Apache
       – Squid for image proxying
  4. As we grew so did our infrastructure… 2008 - 2009
  5. Traffic History at Yelp
     • 2008 - 2009
       – Search scaling
         • sharded by geography
         • index distribution: rsync w/ fadvise
       – Log aggregation:
         • Syslog + rsync + s3
         • scribe + s3
       – Gearman used for async queue processing
       – MySQL split vertically into 3 databases
       – Dirty session cookie
       – Mobile apps: iPhone, Android, Blackberry, etc…
       – 4 countries
  6. Scale scale scale… 2010 - 2011
  7. Infrastructure History at Yelp
     • 2010 - 2011
       – Introduced “read only” mode for the site
       – First CDN put into use
       – Photos migrated to s3
       – mrjob is built/open sourced
       – AWS EMR is used for mrjob processing
       – Managed DNS - DynDNS
       – 13 countries
  8. IPO 2012, more scale!
  9. Infrastructure History at Yelp
     • 2012 - 2013
       – Introduced “read only” datacenters
         • Cacheserv introduced
       – Load balance traffic between datacenters
       – Elasticsearch
       – Pre-IPO traffic, we added the ability to quickly reduce load
       – Direct connection with AWS
       – All schema changes are done, online
       – Moved to all FusionIO for DB hosts
       – 24 countries
  10. Biggest year yet… 2014 - present
  11. Traffic Infrastructure - current picture
      • 2014 - current
        – abusive scraping
        – Starting to serve traffic from EC2 (for Asia/Europe)
        – Elasticsearch, Logstash, and Kibana
        – Gearman w/ MySQL
        – Kafka == Scribe
        – Pyleus (doing work in real-time)
        – 29 countries
  12. Pyleus: A Python Framework for Storm Topologies
      ● Pyleus: Yelp’s super new Python Storm bindings
      ● Build topologies in Python
      ● Declaratively describe structure in YAML
      ● Respects requirements.txt
      ● Compose a topology from Python packaged components!
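
The last slide describes a topology as a declarative structure wired to Python components. The sketch below illustrates only that idea in plain Python and does not use the Pyleus or Storm APIs. The `TOPOLOGY` dict stands in for a YAML file, and every name in it (`line_spout`, `split_words`, `count_words`, `run_topology`) is hypothetical. It also runs in a single process over a finite input, which a real Storm topology would not.

```python
# Hypothetical sketch: NOT the Pyleus API. It only illustrates describing a
# topology declaratively and composing it from Python components.

# Stand-in for a YAML topology file: each node names its component and source.
TOPOLOGY = {
    "name": "word_count",
    "topology": [
        {"name": "lines", "type": "spout", "component": "line_spout"},
        {"name": "split", "type": "bolt", "component": "split_words", "source": "lines"},
        {"name": "count", "type": "bolt", "component": "count_words", "source": "split"},
    ],
}


def line_spout():
    """Spout: emits a small, fixed stream of lines."""
    yield "yelp scales search"
    yield "yelp scales logs"


def split_words(stream):
    """Bolt: splits each incoming line into words."""
    for line in stream:
        for word in line.split():
            yield word


def count_words(stream):
    """Bolt: emits (word, running_count) for each incoming word."""
    counts = {}
    for word in stream:
        counts[word] = counts.get(word, 0) + 1
        yield word, counts[word]


# Registry mapping component names used in the spec to Python callables.
COMPONENTS = {
    "line_spout": line_spout,
    "split_words": split_words,
    "count_words": count_words,
}


def run_topology(spec, components):
    """Wire components together in spec order and drain the final stream."""
    streams = {}
    last = None
    for node in spec["topology"]:
        fn = components[node["component"]]
        if node["type"] == "spout":
            streams[node["name"]] = fn()
        else:
            streams[node["name"]] = fn(streams[node["source"]])
        last = node["name"]
    return list(streams[last])


result = run_topology(TOPOLOGY, COMPONENTS)
```

Keeping the wiring in data and the logic in small components means a stage can be swapped by editing the spec rather than the code, which is the property the slide attributes to YAML-described topologies.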
