Clustered PHP - DC PHP 2009

    Your application became incredibly popular overnight, and you need to be able to service more users.

    Clustered PHP - DC PHP 2009 - Presentation Transcript

    1. Clustered PHP - DC PHP 2009
      by Marcel Esser – marcel.esser@croscon.com
      Senior Developer & Systems Analyst, CROSCON
    2. what is clustering?
      “A computer cluster is a group of linked computers, working together
      closely so that in many respects they form a single computer.”
      - Wikipedia
    3. what is clustering?
      “A computer cluster is a group of linked computers, working together
      closely so that in many respects they form a single computer.”
      - Wikipedia
      …in other words, it’s really hard.
    4. what is clustering not?
      Clustering is not high availability.
      … but you often achieve high availability with clustering.
    5. what is clustering not?
      Clustering is not high availability.
      … but you often achieve high availability with clustering.
      Clustering is not high volume.
      … but you often achieve high volume with clustering.
    6. what is clustering not?
      Clustering is not high availability.
      … but you often achieve high availability with clustering.
      Clustering is not high volume.
      … but you often achieve high volume with clustering.
      We want both.
    7. also…
      clustering is really,
      really hard
    8. some do it well
    9. some … eh
      well, never mind.
    10. why cluster?
      service more users
    11. why cluster?
      service more users
      service users faster
    12. why cluster?
      service more users
      service users faster
      increase reliability
    13. why cluster?
      service more users
      service users faster
      increase reliability
      get rich
    14. objectives
      linear capacity increase
    15. 1 server - 100 lolcats per second
    16. 2 servers - 200 lolcats per second
    17. 3 servers - 300 lolcats per second
    18. objectives
      linear capacity increase
      linear cost increase
    19. 100 lolcats - $100
    20. 200 lolcats - $200
    21. 300 lolcats - $300
    22. objectives
      linear capacity increase
      linear cost increase
      exponential reliability increase
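One way to read "exponential reliability increase": if every node is down independently with probability p, all n nodes are down at the same time with probability p^n. A rough sketch of that arithmetic follows. The independence assumption and the function name `cluster_availability` are illustrative, not from the talk; real failures are often correlated (shared power, shared switch, shared bad deploy).

```python
def cluster_availability(node_downtime: float, nodes: int) -> float:
    """Chance that at least one node is up, ASSUMING independent failures.

    node_downtime is the fraction of time a single node is down (0.0 to 1.0).
    """
    if not 0.0 <= node_downtime <= 1.0:
        raise ValueError("node_downtime must be between 0 and 1")
    if nodes < 1:
        raise ValueError("need at least one node")
    # All nodes down at once: p * p * ... * p = p ** nodes
    return 1.0 - node_downtime ** nodes
```

Under these assumptions, each extra node multiplies the chance of a total outage by p, which is where the "exponential" comes from.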
    23. common topics in clustering php
      • Load Balancing
      • Database Scaling
      • Replicated Storage
      • Backups
      • Data Caches
      • Distributed Sessions
      • Staging Strategies
      • Debugging
      • Background Services
    24. load balancing
      Client → Load Balancer → Servers
    25. load balancing
      Your load balancer may or may not…
      • remove bad nodes from the pool
      • balance by performance
      • balance by weight
      • route by geolocation
      • support sticky sessions
      • have 1 million other features
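Two of the features above, removing bad nodes and balancing by weight, can be sketched in a few lines. This is a toy model to show the idea, not how any particular load balancer does it; the class name `Pool` and the node names are made up.

```python
import random


class Pool:
    """Toy weighted pool: picks a healthy node, favoring higher weights."""

    def __init__(self, weights, rng=None):
        self.weights = dict(weights)      # node name -> positive weight
        self.healthy = set(self.weights)  # every node starts healthy
        self.rng = rng or random.Random()

    def mark_down(self, node):
        # Called by a (hypothetical) health check when a node fails.
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.weights:
            self.healthy.add(node)

    def pick(self):
        candidates = sorted(self.healthy)
        if not candidates:
            raise RuntimeError("no healthy nodes in the pool")
        weights = [self.weights[n] for n in candidates]
        # Weighted random choice among healthy nodes only.
        return self.rng.choices(candidates, weights=weights, k=1)[0]
```

With `Pool({"web1": 3, "web2": 1})`, web1 should receive roughly three of every four requests while both nodes are healthy.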
    26. load balancing tools
      some among thousands…
      • DNS servers
      • BIG-IP
      • Perlbal
      • nginx
      • Varnish
    27. database scaling
      common things you can do:
      partitioning
      replication
      sharding
    28. database partitioning
      • every user is assigned to a database server
      • users don’t share data between each other (between servers)
      • when you need more capacity, add another database server
      • works for some apps, doesn’t work for others
      implementation example: invoice and timesheet management app
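A minimal sketch of the assignment step, assuming new users go to whichever server currently holds the fewest users and then stay there. The class name `UserPartitioner` and the in-memory dict are illustrative; a real app would persist the assignment somewhere durable.

```python
class UserPartitioner:
    """Assigns each user to one database server and remembers the choice."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignment = {}  # user_id -> server name (hypothetical store)

    def add_server(self, server):
        # More capacity: new users start landing here, old users stay put.
        self.servers.append(server)

    def server_for(self, user_id):
        if user_id not in self.assignment:
            counts = {s: 0 for s in self.servers}
            for server in self.assignment.values():
                counts[server] += 1
            # Least-loaded server wins; ties go to the earliest server.
            self.assignment[user_id] = min(self.servers, key=lambda s: counts[s])
        return self.assignment[user_id]
```

Because users never share data, no query has to cross servers, which is why this works for something like an invoicing app and not for a social graph.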
    29. database replication (mysql)
      master - master
      master - slave
      master - many slaves
    30. database replication (mysql)
      master - master
      server 1 replicates (as master) to server 2 (acting as slave)
      server 2 replicates (as master) to server 1 (acting as slave)
      • works well to a point
      • complete nightmare when replication gets desynchronized
      • doesn’t actually improve write performance
      • good for basic high availability
    31. database replication (mysql)
      master - slave
      server 1 replicates (as master) to server 2 (acting as slave)
      • good first step
      • makes you re-write your application to consider slave queries
      • doesn’t increase write performance
      • de-synchronization is relatively painless
      • replication lag
    32. database replication (mysql)
      master - many slaves
      server 1 replicates (as master) to many servers (acting as slaves)
      • thundering read performance
      • makes you re-write your application to consider slave queries
      • doesn’t increase write performance
      • de-synchronization is relatively painless
      • replication lag
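"Re-write your application to consider slave queries" roughly means: writes go to the master, reads go to a slave, and reads that cannot tolerate replication lag go back to the master. A naive sketch of that routing follows. The class name `ReplicaRouter`, the server names, and the `needs_fresh_data` flag are all made up for illustration, and the string check on the SQL is deliberately simplistic.

```python
import itertools


class ReplicaRouter:
    """Naive read/write splitter for one master and any number of slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through slaves in order; None means "no slaves configured".
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql, needs_fresh_data=False):
        # Crude classification: only plain SELECT/SHOW count as reads.
        is_read = sql.lstrip().lower().startswith(("select", "show"))
        if not is_read or needs_fresh_data or self._slaves is None:
            return self.master
        return next(self._slaves)
```

A typical use of `needs_fresh_data=True` would be reading a row right after writing it, when a lagging slave might not have it yet.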
    33. database sharding
      • data is split between multiple database servers
      • logical index is kept of what data is where (for example, a mathematical index or a lookup chart)
      • you have to grab, parse, and correlate data across servers
      • theoretically limitless scalability
      • complicated
      implementation example: digg, facebook, etc
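A "mathematical index" can be as simple as a stable hash of the key modulo the number of shards. The sketch below assumes four shards with made-up names and uses CRC32 only because it gives the same answer on every machine; it is not a claim about how digg or facebook do it.

```python
import zlib

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to a shard using a hash that is stable across processes."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def group_by_shard(keys, shards=SHARDS):
    """Group keys by shard so each server is queried once, not once per key."""
    grouped = {}
    for key in keys:
        grouped.setdefault(shard_for(key, shards), []).append(key)
    return grouped
```

The catch with plain modulo is that changing the number of shards moves most keys to a different server, which is one reason to prefer a lookup chart when you expect to grow.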
    34. replicated storage
      common things you can do:
      replicated file system
      lookup tables
      storage services
      huge NAS arrays
    35. replicated file system (glusterfs)
      • very affordable
      • various replication modes
      • nothing to keep track of in your app
      • easy to implement
      • can cause massive failures if poorly configured
    36. lookup tables
      • very affordable
      • limitless modes; entirely up to you
      • entirely dependent on your application logic
      • can cause massive failures if poorly implemented
    37. storage services
      • very expensive
      • theoretically limitless capacity
      • easy to use
      • data must be pulled back first if used locally
      • costs and bandwidth usage can be mitigated (for example, by putting a proxy in front of it)
    38. large NAS arrays
      • insanely expensive
      • insanely expensive
      • insanely expensive
      • bullet-proof fault tolerance… at a price
      • easy to use… for a price
    39. backups
      common methods:
      all-RAID (doesn’t work)
      snapshots
      copying from slaves
    40. backups
      all-RAID doesn’t work
      Why?
      RAID won’t keep your application from
      deleting data everywhere
    41. snapshots
      use a mechanism to take a snapshot of the partition
      e.g. LVM partitions
      • works really well
      • easy if you do it from the beginning
      • requires some planning
      • should be used with RAID drives
    42. copying from slaves
      take a slave out of rotation and copy from it
      e.g. MySQL databases
      • works really well
      • easy if you do it from the beginning
      • requires some planning
      • backups can be out of date
    43. data caches (memcached)
      PHP doesn’t have cross-request persistence, so
      someone added it: memcached
      • in-memory
      • fast
      • scalable
      • proven
      • use it
      Got configuration data? Small, high-TTL data sets?
      Use APC.
      Large, high-TTL data sets? Use files.
      Mind the race condition.
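One race that bites file caches: a reader opens the file while a writer is halfway through it. A common way around that, sketched here under the assumption of a local filesystem, is to write to a temporary file in the same directory and rename it over the real one. The function names and the JSON format are illustrative choices.

```python
import json
import os
import tempfile


def write_cache(path, data):
    """Write the cache so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_cache(path):
    """Return the cached data, or None if nothing has been cached yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

This does not stop several processes from regenerating the same expired entry at once; it only guarantees nobody reads a half-written file.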
    44. replicated sessions
      pick your poison:
      • memcached with redundancy
      • database
      • shared file system (don’t actually do this)
    45. staging strategy
      If you value your free time:
      Dev → Test → Stage → Live
    46. staging strategy (dev)
      • do use source control systems (subversion, etc)
      • do profile your code to look for obvious performance issues
      • do use phpdoc tags
      • do make your dev environment as similar to live as practical (e.g., don’t develop on Windows and run live on UNIX)
      • do document all your changes
      • do use TDD (test-driven development)
    47. staging strategy (test)
      • do make test functionally identical to live, except for data
      • do create data fixtures that are representative of real-life data
      • do create functional tests for the user interface (Selenium)
      • do not push anything to stage that did not pass unit tests
    48. staging strategy (stage)
      • do make stage identical to a live node
      • do connect to the live database
      • do have test ‘users’ to perform destructive operations against
      • do have a mechanism to automate pushing stage to live
    49. staging strategy (live)
      • do not ever make changes by hand on live
      • do automate pushing updates
      • do take nodes out of rotation when you push updates
      • do not allow ssh access to live except when really needed
    50. debugging
      • do use xdebug on dev, test, and stage
      • do prepare an automated action that can turn xdebug and profiling on/off on 1 of the live nodes. you can and will run into errors that only exist on live.
      • do write a test case to replicate the bug first and then fix the bug, whenever possible
      • do first look if bugs are explainable by platform differences between development and production systems (e.g., don’t develop on Windows and deploy on UNIX)
      • do go to my talk at ZendCon in October, “It Works on Dev”
    51. background services
      • do avoid launching background processes from the web app
      • PHP doesn’t have a native message queue, so (many) people wrote some, for example gearmand. do use a message queue.
      • do check for memory leaks in background tasks! many php libraries and also many php versions themselves still leak memory. try to write a loop in bash for a background task rather than in php. recycle the process often.
      • do plan your message format carefully
      • do persist important messages
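The "recycle the process often" advice can be sketched as a worker that handles a bounded number of messages and then returns, so the process can exit and an outer loop (the bash loop mentioned above, or a supervisor) starts a fresh one. The in-process `queue.Queue` is only a stand-in for a real message queue, and `run_worker` and `max_jobs` are made-up names.

```python
import queue


def run_worker(jobs, handle, max_jobs=100):
    """Process up to max_jobs messages, then return the number handled.

    Returning (and letting the process exit) caps how much memory a leaky
    library can accumulate before the worker is restarted.
    """
    processed = 0
    while processed < max_jobs:
        try:
            message = jobs.get_nowait()
        except queue.Empty:
            break  # nothing left to do right now
        handle(message)
        processed += 1
    return processed
```

Whatever restarts the worker should treat "exited after max_jobs" as normal, not as a crash.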
    52. questions
      …anybody?
    53. shameless plug
      CROSCON
      Bespoke Application Development
      Consulting
      Security
      Marcel Esser
      marcel.esser@croscon.com
      (202) 730-9728
      Personal? marcel.esser@gmail.com

    Presentation from DC PHP 2009
