PHP at Density and Scale (Lone Star PHP 2014)

Mixing performance, configurability, density, and security at scale has historically been hard with PHP. Early approaches involved CGI execution, Suhosin, or multiple Apache instances. Then came PHP-FPM. At Pantheon, we've taken PHP-FPM and integrated it with cgroups, namespaces, and systemd socket activation, and we use it to deliver all of our goals at unheard-of densities: thousands and thousands of isolated pools per box.

  1. PHP at Density and Scale
     ...with security and consistent performance
  2. About Me
     ● Four Kitchens
     ● Drupal.org
     ● Pressflow
     ● Pantheon
     ● systemd
  3. Broadly Defining Security
     Your data...
     1. Is accessible to the right people (access)
     2. Isn’t accessible to anyone else (access)
     3. Is usable (quality of service)
  4. Topics
     ● Performance
       ○ Socket activation
       ○ Automount/autofs
       ○ cgroups
       ○ “Customer Experience Monitor”
       ○ Migration
     ● Security
       ○ Users
       ○ Namespaces
       ○ Defense-in-depth
       ○ Non-disruptive fixes
  5. Challenge: PHP-FPM Overhead
     ● Using a full PHP-FPM instance per stack
       ○ Isolated opcode cache space
       ○ Defense-in-depth against PHP issues
       ○ Low-impact reconfiguration
     ● Idle PHP-FPMs take ~0.5% of a core each
       ○ At 10k dense, that’s over six cores
     ● Initial solution used error capture in nginx
       ○ Masked real failures to connect to PHP-FPM
       ○ Slower than necessary
       ○ Production use of HTTP 418 (arguably a bonus)
  6. Traditional server sockets: overview
     [Diagram: clients connect directly to nginx daemons listening on TCP 80 and TCP 81]
     If you want a service available, the daemon has to be running.
  7. Socket activation: overview
     [Diagram: clients connect to the TCP 80 and TCP 81 listeners held by systemd, which hands them to nginx as fd=3]
     Only a socket in systemd has to exist for service availability.
  8. Socket activation: details
     ● systemd squats on all listeners
       ○ Looks for incoming traffic with epoll
       ○ Starts the services/containers on demand
       ○ Passes the socket to the daemon as fd=3+ (unit sketch below)
     ● Not a proxy (same performance)
     ● No client awareness
     ● No CPU or memory overhead when idle
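
As a minimal sketch of the mechanism (unit names, the port, and the daemon path are hypothetical, not Pantheon's configuration), socket activation is a .socket/.service pair; the daemon must accept connections on the inherited descriptor instead of binding its own listener:

    # example.socket: systemd owns the listener and watches it with epoll
    [Unit]
    Description=Example socket-activated listener

    [Socket]
    ListenStream=8080

    [Install]
    WantedBy=sockets.target

    # example.service: started on the first connection; the listener is
    # passed in as fd 3 (fd=3+ when there are several sockets), announced
    # through the $LISTEN_FDS environment variable
    [Unit]
    Description=Example socket-activated daemon

    [Service]
    ExecStart=/usr/local/bin/example-daemon

Because the kernel queues connections on the already-open socket while the daemon starts, clients never see a refused connection, which is why no client awareness is needed.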
  9. Socket activation: Pantheon’s use
     ● nginx and PHP-FPM
     ● MariaDB soon
       ○ Using an alternative now
     ● Allows 90%+ of containers to be idle
     ● Makes bootup sensible
     ● Reconfiguration pattern is stop, not restart (see the sketch below)
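
The reconfiguration-by-stop pattern follows directly: with the hypothetical units above, stopping the service leaves the socket listening, and the next request launches the daemon against whatever configuration is then on disk:

    systemctl stop example.service   # daemon exits; example.socket keeps listening
    # ...edit the daemon or pool configuration...
    # the next incoming connection starts example.service with the new config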
  10. Socket Activation Demo
      Demoed this at NYC Camp a few weeks ago
  11. Automount/autofs
      ● Like socket activation for file system mounts
        ○ Kernel squats on the mount path and looks for traffic
        ○ Brings up the file mount lazily (unit sketch below)
      ● Used for FuseDAV (Valhalla client)
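
The same two-unit pattern applies; here is a minimal sketch (unit names and the ext4 backing store are placeholders, not the FuseDAV setup). systemd mount units must be named after their mount path, and the .automount unit is what arms the kernel trap on the directory:

    # mnt-example.automount: the kernel squats on /mnt/example until first access
    [Unit]
    Description=Example lazy mount point

    [Automount]
    Where=/mnt/example

    [Install]
    WantedBy=local-fs.target

    # mnt-example.mount: performed only when something touches the path
    [Unit]
    Description=Example backing mount

    [Mount]
    What=/dev/disk/by-label/example
    Where=/mnt/example
    Type=ext4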
  12. Automount Demo
  13. Challenge: Resource Availability
      ● Per-site load isn’t predictable
      ● Different sites compete for resources
        ○ Between customers
        ○ Among customers’ own sites
      ● Traditional prioritization isn’t adequate
        ○ VMs are too heavyweight
        ○ Tools like “nice” can cause starvation
        ○ Generally want burstability
  14. cgroups
      ● Many options
        ○ Pantheon uses CPUShares and BlockIOWeight (see the sketch below)
      ● Keeps things fair under contention
        ○ Kind of like adding purple ropes when people are queueing
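
In unit-file terms this is a couple of directives (unit name and values are hypothetical; CPUShares= and BlockIOWeight= are the cgroup-v1-era names systemd exposed at the time). Both are relative weights rather than hard caps, so an uncontended container can still burst to the whole machine:

    # site-container-example.service: fair-share weights, not hard caps
    [Service]
    ExecStart=/usr/local/bin/run-site-container
    CPUShares=1024        # relative CPU weight under contention (default 1024)
    BlockIOWeight=500     # relative block I/O weight, range 10-1000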
  15. Contention with cgroups Demo
  16. Customer Experience Monitor
      ● Runs a representative Drupal site on every container host
      ● Reports scores to the API and monitoring
      ● Influences migration and container placement
  17. Migration
      ● At density, rebalancing is important
      ● Keep state lightweight
        ○ No OS
        ○ No runtime
      ● Mutiny: migration as replication + promotion
  18. Challenge: Security Isolation
      ● Many users
      ● One kernel
      ● VMs too heavyweight
      ● Users run their own code
      ● Can’t betray expectations
        ○ Many users develop locally and push code
        ○ Some customers import existing, working sites
  19. Isolation for security
      ● Users
      ● Namespaces
      ● Seccomp filters (sketch of all three layers below)
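
A sketch of those layers as per-unit systemd directives (unit name and user are hypothetical; PrivateNetwork= is shown only for illustration and would be too strict for a container that serves traffic; SystemCallFilter= seccomp support landed in systemd around this era):

    # site-container-example.service: one directive per isolation layer
    [Service]
    ExecStart=/usr/local/bin/run-site-container
    User=site-12345            # dedicated, unprivileged Linux user per container
    PrivateTmp=yes             # mount namespace: private /tmp and /var/tmp
    PrivateNetwork=yes         # network namespace (illustrative: loopback only)
    SystemCallFilter=~ptrace   # seccomp: block syscalls the workload never needs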
  20. Defense in depth
      ● Application
        ○ Drupal
      ● Runtime
        ○ nginx, PHP-FPM, FuseDAV
      ● Container: “binding” certificate
        ○ Linux user, namespaces, etc.
      ● Container host: “endpoint” certificate
        ○ Only trusted for the containers assigned to it
      ● Platform: root certificate
  21. Challenge: Security Responses
      ● Traditional approach is too big a hammer
        ○ Rebooting hundreds of hosts with 10k+ containers each would be a fail-over storm
        ○ Basic customers don’t have fail-over
        ○ Not going to pack things less densely
      ● Customers can run their own code
        ○ May load executables and libraries themselves
  22. Non-disruptive fixes
      ● Kernel upgrades via migration
      ● Rolling daemon and library upgrades (see the sketch below)
        ○ Heartbleed
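
For the library case, socket activation makes the roll cheap. A sketch (package and unit names are hypothetical): upgrade the shared library, then stop rather than restart each affected daemon, and its still-listening socket relaunches it against the fixed library on the next request:

    yum update openssl               # or the distribution's equivalent
    systemctl stop example.service   # socket stays up; the next request restarts
                                     # the daemon linked against the fixed library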
  23. Heartbleed Fix Demo
