Gavin M
Presentation Transcript

  • myYearbook.com Architecture: Lessons Learned from the Trials of Scaling a High-Traffic Website
    • Founded in 2005
    • 3rd Largest Social Network in the United States
    • Teenage Demographic
    • 60+ Employees
  • January 2007
    • 100M Pageviews
    • 1 Database Server
    • 1 Web Application Server
    • Daily issues with load and site availability
  • September 2008
    • 2.5B Pageviews
    • 30 Database Servers
    • 120 Web Application Servers
    • 99.94% Uptime as measured by pingdom.com
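To put the two snapshots above in perspective, the short calculation below converts the uptime figure into allowed downtime and computes the pageview growth factor. The 30-day window is an assumption; the slides do not state the period over which pingdom.com measured uptime.

```python
def downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime implied by an uptime percentage over a period."""
    minutes_in_period = period_days * 24 * 60
    return minutes_in_period * (1 - uptime_percent / 100)


# Assumption: a 30-day measurement window (not stated in the slides).
monthly_downtime = downtime_minutes(99.94, 30)

# Pageview growth between January 2007 (100M) and September 2008 (2.5B).
growth_factor = 2_500_000_000 / 100_000_000

print(f"Downtime per 30 days: {monthly_downtime:.2f} minutes")
print(f"Pageview growth: {growth_factor:.0f}x")
```

Under that assumption, 99.94% uptime leaves roughly 26 minutes of downtime per 30 days, while traffic grew 25-fold.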
  • Key Architecture Components
    • PHP5, APC
    • Apache httpd
    • PostgreSQL
    • Memcached
    • Apache ActiveMQ
    • Lighttpd
    • Isilon IQ Clustered NAS
    • Message Systems eCelerity
    • Subversion
  • Web Application Architecture
    • 2005-2007: Monolithic Code Base
    • 2008: Migrating to a Service-Oriented Architecture
      • Applications get own resources
      • Loosely Coupled architecture
    • MVC Application using XSLT
  • Web Application Architecture
    • Why SOA?
      • Monolithic app wastes hardware
      • Cross Data-Center Operations
      • Selective Maintenance
  • Scaling Postgres
    • Rules for Scaling
      • Plan for Growth
      • Know the internals
      • Bigger Hardware is Better
  • Our Postgres Scaling History
    • Quarter 1, 2007
      • Monolithic database with one schema, many complex joins and poor optimization
      • No plan for growth
      • No DBA
  • Our Postgres Scaling History
    • Quarter 3, 2008
      • Horizontal “Sharded” Data
      • Vertical Partitioning
      • 5000 Connections/sec Avg
  • Scaling Postgres: Lessons Learned
    • Scaling web servers means many database connections, so connection pooling was needed
      • Started with pgPool, then moved to pgBouncer
    • Started with Slony replicating read-only slaves
      • High IO/CPU Overhead
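The pooling idea from the slide above can be illustrated with a minimal in-process sketch. This is not how pgBouncer is implemented (pgBouncer is a separate proxy process that sits between the web servers and PostgreSQL); the `ConnectionPool` class and the `open_connection` factory are hypothetical names used only to show why many short-lived requests can share a small, fixed set of connections.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hands out a fixed set of connections and takes them back after use."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is checked out.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)


opened = []


def open_connection():
    """Hypothetical stand-in for opening a real database connection."""
    conn = object()
    opened.append(conn)
    return conn


pool = ConnectionPool(open_connection, size=2)

# 100 simulated page requests share the same 2 connections.
for _ in range(100):
    with pool.connection() as conn:
        pass  # run queries here

print(f"requests served: 100, connections opened: {len(opened)}")
```

The database sees two connections rather than one per request, which is the effect a pooler provides when the web tier grows.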
  • Scaling Postgres: Lessons Learned
    • Began scaling vertically by separating application data onto different database servers, and removed the read-only slaves
    • Needed a few small tables replicated that could be slightly inaccurate and eventually consistent (BASE)
  • Scaling Postgres: Lessons Learned
    • Enter plProxy
      • Database partitioning language by Skype utilizing PostgreSQL functions
      • Trigger-based plProxy functions replicate the needed tables without the queue overhead
      • NOT TRANSACTION SAFE
  • Scaling Postgres: Lessons Learned
    • Standard Use of plProxy
      • Horizontal partitioning of data by ID across multiple servers
      • Example: Messaging System
        • 8 Servers store actual partitioned message data
        • Rule #1 – Plan for Growth
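A minimal sketch of partitioning by ID, and of why the partition count matters for growth, is shown below. It assumes a power-of-two partition count, which is what PL/Proxy clusters are normally configured with, and it uses `zlib.crc32` purely as a stand-in hash; PL/Proxy functions typically hash with PostgreSQL's `hashtext()`. The function name `shard_for` is hypothetical.

```python
import zlib


def shard_for(user_id: int, partitions: int) -> int:
    """Map an ID to a partition number using a hash and a bit mask."""
    if partitions <= 0 or partitions & (partitions - 1):
        raise ValueError("partition count must be a power of two")
    # Stand-in hash (assumption); not the hash PostgreSQL itself uses.
    key_hash = zlib.crc32(str(user_id).encode("ascii"))
    return key_hash & (partitions - 1)


# With 8 partitions, every ID lands on one of servers 0..7.
before = {user_id: shard_for(user_id, 8) for user_id in range(1000)}

# After doubling to 16, each ID either stays on partition p or moves to p + 8,
# so each old partition can be split in two without reshuffling the others.
after = {user_id: shard_for(user_id, 16) for user_id in range(1000)}
moves_are_local = all(after[u] in (before[u], before[u] + 8) for u in before)
print("doubling only splits existing partitions:", moves_are_local)
```

Choosing the mask-based scheme up front is one way to apply "Plan for Growth": going from 8 to 16 servers becomes a series of independent partition splits.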
  • Scaling Postgres: Lessons Learned
    • Knowing internals
      • pg_catalog
        • pg_stat_user_tables
        • pg_stat_user_indexes
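As an illustration of what these two statistics views can reveal, the sketch below pairs two simple queries with helper functions that flag tables dominated by sequential scans and indexes that have never been used. The column names (`relname`, `indexrelname`, `seq_scan`, `idx_scan`) are real columns of the views; the thresholds, helper names and sample rows are hypothetical, and the rows would normally come from a database cursor.

```python
# Queries against the standard statistics views.
TABLE_STATS_SQL = """
SELECT relname, seq_scan, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
"""

INDEX_STATS_SQL = """
SELECT relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC
"""


def seq_scan_heavy(table_rows, min_scans=1000, max_index_ratio=0.5):
    """Return tables where fewer than max_index_ratio of scans use an index."""
    flagged = []
    for relname, seq_scan, idx_scan in table_rows:
        seq_scan = seq_scan or 0
        idx_scan = idx_scan or 0  # may be NULL for a table without indexes
        total = seq_scan + idx_scan
        if total >= min_scans and idx_scan / total < max_index_ratio:
            flagged.append(relname)
    return flagged


def unused_indexes(index_rows):
    """Return (table, index) pairs that have never been scanned."""
    return [(rel, idx) for rel, idx, scans in index_rows if scans == 0]


# Hypothetical sample rows, shaped like the query results above.
table_rows = [("messages", 5000, 100), ("users", 10, 900000), ("tiny", 5, None)]
index_rows = [("messages", "messages_old_idx", 0), ("users", "users_pkey", 12345)]

print(seq_scan_heavy(table_rows))
print(unused_indexes(index_rows))
```

A table that shows up in the first list is a candidate for a new index or a query rewrite; an index in the second list costs write overhead without serving reads.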
  • Scaling Postgres: Knowing Internals
  • Scaling Postgres: Lessons Learned
    • Database Ecosystem
      • Performance Factors
        • Index bloat
        • Usage changes
          • Abuse
        • Cache utilization contention
  • Scaling Postgres: Lessons Learned
    • Bigger is Better
      • More RAM
      • More Disks
      • Faster and More CPU
  • Scaling Postgres: Lessons Learned
    • Scaling Across CPU Cores
    • PostgreSQL Scales to 32 Cores
    • Extensive Benchmarking @ MYB
    • Before and After Upgrade
  • Scaling Postgres: Future Plans
    • More Partitioning
    • SOA Data Distribution
      • Golconde
        • Python Based
        • Apache ActiveMQ
  • Apache ActiveMQ
    • Java-based Message Broker software
    • Client language neutral
    • Implements JMS 1.1, STOMP, XMPP, REST and others
  • ActiveMQ @ myYearbook.com
    • Out-of-band Processing
    • Uploaded content processing
      • Image Resize
      • Content analysis (R&D)
      • Anti-Virus Scans
    • Comment and Message processing
      • Spam Processing
    • Email spooling from web application
    • Anywhere we can that makes sense
    • Targeted Workload
    • Message Queues allow for the right server for the job
    • Better distribution of CPU intensive tasks without negatively impacting the user experience
    • Clusterable, Scalable
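The out-of-band pattern described above can be sketched with Python's standard library. An in-process `queue.Queue` stands in for the ActiveMQ broker here, and the job format, `handle_upload` and `resize_worker` are hypothetical; in a real deployment the web application would publish to a broker queue and the worker would run on a separate machine suited to the task.

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a broker queue (assumption)
results = []


def resize_worker():
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        # Placeholder for the CPU-intensive work (image resize, virus scan).
        results.append(f"resized {job['path']} to {job['width']}px")
        jobs.task_done()


def handle_upload(path):
    """Web request handler: enqueue the work and return immediately."""
    jobs.put({"path": path, "width": 200})
    return "upload accepted"


worker = threading.Thread(target=resize_worker, daemon=True)
worker.start()

response = handle_upload("photo-1.jpg")  # the user gets an answer right away

jobs.put(None)  # tell the worker to stop
jobs.join()     # wait until every queued job has been processed
worker.join()

print(response)
print(results)
```

The request handler never waits for the resize, which is the point of the slide: expensive work is moved off the page-rendering path and onto whichever server is best suited for it.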
  • Memcached: Key for Success
    • Valuable Scaling Tool
      • Over 250k get requests per second during peak
      • Over 750GB of cached data
      • Easy to Deploy
      • The more distributed the cache, the less impact any single cache failure has; more boxes are better than fewer
  • Memcached: Potential Problems
    • Large scale implementations can have some hidden problems
      • Lots of network traffic
      • Non-partitioned or unevenly distributed data
    • What to do for data that is not evenly distributed?
      • Implemented a round-robin cluster of memcache servers that contain the same data
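One way to picture the round-robin cluster is the sketch below: writes go to every server so that all copies hold the same data, and reads rotate across the servers so that a heavily requested key does not overload a single box. Plain dictionaries stand in for memcached servers, and the `ReplicatedCache` class is hypothetical; the slides do not describe the actual implementation.

```python
import itertools


class ReplicatedCache:
    """Write to all servers; spread reads across them in round-robin order."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = itertools.cycle(range(len(self._servers)))

    def set(self, key, value):
        # Every server holds a full copy of the data.
        for server in self._servers:
            server[key] = value

    def get(self, key):
        # Each read goes to the next server in the rotation.
        index = next(self._next)
        return self._servers[index].get(key), index


# Three dicts stand in for three memcached servers (assumption).
cache = ReplicatedCache([{}, {}, {}])
cache.set("site_config", "v1")

reads = [cache.get("site_config") for _ in range(6)]
servers_hit = [index for _, index in reads]
print(servers_hit)
```

The trade-off is memory for throughput: each key is stored once per server, so this suits a small set of hot keys rather than the whole cache.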
  • Research and Development
    • Copyr
      • Copy-on-Write Filesystem Replication
    • Framewerk
      • PHP5 OO Development Framework
    • Golconde
      • Queue Based Data Distribution for PostgreSQL
    • Lightr
      • PHP5 XMPP Class Library
    • mod_xsltd
      • Lighttpd XSL Transformation module
    • Playr
      • PostgreSQL Log Replay
    • Staplr
      • STAtistical Package Logically engineered Right
  • Tools for Success
    • Operations Portal
      • Executive Level Overview of Operational Status and Production Change Log
    • Staplr
      • Trending & Analytics System
  • Operations Portal
  • Trending and Analysis: Staplr
    • Version 0.6
      • PHP Based
      • Process forking
      • Shelled RRD Commands
    • Version 2.0
      • Python Based
      • Threaded
      • Python wrappers to librrd
  • Trending and Analysis: Staplr
    • Polls for:
      • Apache httpd
      • Apache ActiveMQ
      • lighttpd
      • memcached
      • MySQL
      • pgBouncer
      • PostgreSQL
      • SNMP Data
        • APC, Isilon, F5, Xiotech, Others
      • SysStat
  • Questions?