Scaling PHP apps

Scaling PHP apps: web server, file-system, sessions, async tasks, database and some more

Published in: Software

  1. 1. 9 January 2018 Scaling PHP apps
  2. 2. WHO AM I?
  3. 3. Matteo Moretti
  4. 4. CTO @ website: madisoft.it tech blog: labs.madisoft.it
  5. 5. Scalability
  6. 6. It’s from experience. There are no lessons.
  7. 7. Nuvola scuoladigitale.info ● > 3M HTTP requests / day ● > 1000 databases ● ~ 0.5 TB of MySQL data ● ~ 180M queries / day ● ~ 105M media files ● ~ 18 TB of media files ● From ~5k to ~200k sessions in 5 minutes
  8. 8. Scalability Your app is scalable if it can adapt to support an increasing amount of data or a growing number of users.
  9. 9. “But… I don’t have an increasing load” (http://www.freepik.com/free-photos-vectors/smile - Smile vector designed by Freepik)
  10. 10. “Scalability doesn’t matter to you.” (http://www.freepik.com/free-photos-vectors/smile - Smile vector designed by Freepik)
  11. 11. “I do have an increasing load” (http://www.freepik.com/free-photos-vectors/smile - Smile vector designed by Freepik)
  12. 12. Your app is growing
  13. 13. But… suddenly…
  14. 14. Ok, we need to scale
  15. 15. Scaling… what? PHP code? Database? Sessions? Storage? Async tasks? Logs?
  16. 16. Everything?
  17. 17. Can Node.js scale?
  18. 18. Can Symfony scale?
  19. 19. Can PHP scale?
  20. 20. Scaling is about app architecture
  21. 21. App architecture How can you scale your web server if you put everything inside? Database, user files, sessions, ...
  22. 22. App architecture / Decouple ● Decouple services ● Service: do one thing and do it well
  23. 23. App architecture / Decouple
  24. 24. 6 main areas 1. web server 2. sessions 3. database 4. filesystem 5. async tasks 6. logging There are some more (http caching, frontend, etc): next talk!
  25. 25. Web server A single big web server (scale up)
  26. 26. Web server Many small web servers (scale out)
  27. 27. Web server Many small web servers behind a load balancer
  28. 28. Web server Many small web servers behind a load balancer, inside an auto-scaling group
  29. 29. Web server NGINX + php-fpm PHP 7 (Symfony 4)
  30. 30. Web server Avoid micro-optimization Single quotes vs double quotes? It doesn’t matter
  31. 31. Web server / Cache PHP CACHE APPLICATION CACHE DOCTRINE CACHE
  32. 32. Web server / PHP cache OPcache OPcache improves PHP performance by storing precompiled script bytecode in shared memory, thereby removing the need for PHP to load and parse scripts on each request. http://php.net/manual/en/intro.opcache.php
  33. 33. Web server / PHP cache
  34. 34. Web server / PHP cache OPcache Bytecode caching opcache.enable = On opcache.validate_timestamps = 0 Need to manually reset OPcache on deploy! https://tideways.io/profiler/blog/fine-tune-your-opcache-configuration-to-avoid-caching-suprises
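
With `opcache.validate_timestamps = 0` PHP no longer checks whether a file changed, so the cache has to be cleared as part of each deploy. The following is a minimal sketch of such a reset hook, not the author's deploy script: the function name `resetOpcache` and the shared-token check are assumptions made for illustration. It has to run inside php-fpm (for example through an internal HTTP request), because the CLI keeps its own separate OPcache.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical deploy hook: clears the OPcache of the SAPI that executes it.
 * Returns true only when the token matches and the cache was actually reset.
 */
function resetOpcache(string $providedToken, string $expectedToken): bool
{
    // Refuse to run when no token has been configured at all.
    if ($expectedToken === '') {
        return false;
    }

    // Timing-safe comparison of the configured token with the supplied one.
    if (!hash_equals($expectedToken, $providedToken)) {
        return false;
    }

    // The extension may be missing on this SAPI.
    if (!function_exists('opcache_reset')) {
        return false;
    }

    // opcache_reset() returns false when the opcode cache is disabled.
    return opcache_reset();
}
```

Reloading php-fpm during the deploy is another common way to get the same effect without exposing an endpoint.
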
  35. 35. PHP code / Application cache ● Put the application cache in RAM ● Use cache warmers during deploy releaseN/var/cache -> /var/www/project/cache/releaseN “/etc/fstab” tmpfs /var/www/project/cache tmpfs size=512m
  36. 36. PHP code / Doctrine cache ● Configure Doctrine to use cache ● Disable Doctrine logging and profiling on prod doctrine.orm.default_metadata_cache: type: apcu doctrine.orm.default_query_cache: type: apcu doctrine.orm.default_result_cache: type: apcu
  37. 37. PHP code / Cache DISK I/O ~ 0%
  38. 38. Monitor Measure Analyze
  39. 39. PHP code / Profiling Blackfire New Relic Tideways
  40. 40. PHP code / Recap ● Easy ● No need to change your PHP code ● It’s mostly configuration and tuning ● You can apply the changes one by one and measure how each affects performance ● Need to monitor and profile: New Relic for PHP ● Don’t waste time on micro-optimization Take away: use cache!
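
The caches in this section all follow the same idea: look the value up in memory first and only do the expensive work on a miss. Below is a rough sketch of that cache-aside pattern. The helper name `cached` is hypothetical, and the per-process array fallback is an assumption added so the sketch also runs where APCu is not enabled; Doctrine's own cache drivers are more elaborate.

```php
<?php
declare(strict_types=1);

/**
 * Cache-aside helper: return the cached value for $key, or compute and store it.
 * Uses APCu when it is loaded and enabled, otherwise a per-process array.
 */
function cached(string $key, int $ttlSeconds, callable $compute)
{
    static $fallback = [];

    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        // $hit tells a real miss apart from a cached value that is false.
        $value = apcu_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
    } elseif (array_key_exists($key, $fallback)) {
        return $fallback[$key];
    }

    // Miss: do the expensive work once, then remember the result.
    $value = $compute();

    if ($useApcu) {
        apcu_store($key, $value, $ttlSeconds);
    } else {
        $fallback[$key] = $value; // no TTL here: the array dies with the process
    }

    return $value;
}
```

A call such as `cached('school_42_settings', 300, $loadSettings)` would then hit the database at most once per TTL window on each server, where `$loadSettings` is whatever callable performs the query.
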
  41. 41. Sessions ● Think of session management as a service ● Use centralized Memcached or Redis (EC2 or ElastiCache on AWS) ● Avoid sticky sessions (load balancer set up)
  42. 42. Session / Memcached No bundle required https://labs.madisoft.it/scaling-symfony-sessions-with-memcached
  43. 43. Session / Redis https://github.com/snc/SncRedisBundle https://labs.madisoft.it/scaling-symfony-sessions-with-redis
  44. 44. Session / Redis config.yml framework: session: handler_id: snc_redis.session.handler
  45. 45. Session / Redis Bundle config snc_redis: clients: session_client: dsn: '%redis_dsn_session%' logging: false # https://github.com/snc/SncRedisBundle/issues/161 type: phpredis session: client: session_client locking: false prefix: session_prefix_ ttl: '%session_ttl%'
  46. 46. Session / Redis parameters.yml redis_db: 3 redis_dsn_session: 'redis://%redis_ip%/%redis_db%' redis_ip: redis-cluster.my-lan.com session_ttl: 86400
  47. 47. Session / Recap ● Very easy ● No need to change your PHP code ● Redis is better than Memcached: it has persistence and many other features ● Let AWS scale for you and deal with failover and sysadmin stuff Take away: use Redis
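
To see why a central store removes the need for sticky sessions, here is a small sketch of a session handler that keeps nothing on the web server. It is not what SncRedisBundle does internally: `SessionStore`, `InMemorySessionStore` and `CentralSessionHandler` are hypothetical names, and the in-memory class only stands in for Redis or Memcached. The sketch assumes PHP 7.1 or later.

```php
<?php
declare(strict_types=1);

/** Hypothetical key-value contract; in production Redis or Memcached sits behind it. */
interface SessionStore
{
    public function get(string $key): ?string;
    public function set(string $key, string $value, int $ttlSeconds): void;
    public function delete(string $key): void;
}

/** In-memory stand-in for the central store, only for this sketch. */
final class InMemorySessionStore implements SessionStore
{
    private $data = [];

    public function get(string $key): ?string
    {
        return $this->data[$key] ?? null;
    }

    public function set(string $key, string $value, int $ttlSeconds): void
    {
        $this->data[$key] = $value; // TTL ignored here; a real store would expire the key
    }

    public function delete(string $key): void
    {
        unset($this->data[$key]);
    }
}

/** Session handler that delegates every operation to the shared store. */
final class CentralSessionHandler implements SessionHandlerInterface
{
    private $store;
    private $prefix;
    private $ttlSeconds;

    public function __construct(SessionStore $store, string $prefix, int $ttlSeconds)
    {
        $this->store = $store;
        $this->prefix = $prefix;
        $this->ttlSeconds = $ttlSeconds;
    }

    public function open($path, $name): bool
    {
        return true;
    }

    public function close(): bool
    {
        return true;
    }

    public function read($id): string
    {
        // An unknown session id must yield an empty string.
        return $this->store->get($this->prefix . $id) ?? '';
    }

    public function write($id, $data): bool
    {
        $this->store->set($this->prefix . $id, $data, $this->ttlSeconds);
        return true;
    }

    public function destroy($id): bool
    {
        $this->store->delete($this->prefix . $id);
        return true;
    }

    public function gc($maxLifetime): int
    {
        return 0; // expiry is left to the store's TTL
    }
}
```

A handler like this would be registered with `session_set_save_handler($handler, true)`. Because every web server talks to the same store, a request can land on any of them and still find its session.
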
  48. 48. Database Aka “The bottleneck”
  49. 49. Database Relational databases
  50. 50. Database NOSQL db?
  51. 51. Database If you need data integrity do not replace your SQL db with NOSQL to scale
  52. 52. Database How to scale SQL db?
  53. 53. Database When to scale?
  54. 54. Database If dbsize < 10 GB dont_worry();
  55. 55. Database / Big db problems ● Very slow backup. High lock time ● If mysql crashes, restart takes time ● It takes time to download and restore in dev ● You need expensive hardware (mostly RAM)
  56. 56. Database / Short-term solutions Use a managed db service like AWS RDS ● It scales for you ● It handles failover and backup for you But: ● It’s expensive for big db ● Problems are only mitigated but they are still there
  57. 57. Database / Long-term solutions Sharding
  58. 58. Database / Sharding Split a single big db into many small dbs (multi-tenant)
  59. 59. Database / Sharding ● Very fast backup. Low lock time ● If mysql crashes, restart takes little time ● Fast to download and restore in dev ● No need for expensive hardware ● You arrange your dbs on many machines
  60. 60. Database / Sharding ● How can Symfony deal with them? ● How to execute a cli command on one of them? ● How to apply a migration (e.g. add a column) to 1000 dbs? ● ...
  61. 61. Database / Sharding Doctrine DBAL & ORM
  62. 62. Database / Sharding Define a DBAL connection and an ORM entity manager for each db https://symfony.com/doc/current/doctrine/multiple_entity_managers.html
  63. 63. Database / Sharding doctrine: orm: entity_managers: global: connection: global shard1: connection: shard1 shard2: connection: shard2 doctrine: dbal: connections: global: ….. shard1: …… shard2: …... default_connection: global
  64. 64. Database / Sharding This works only for a few dbs (fewer than ~5)
  65. 65. Database / Sharding Doctrine sharding http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/sharding.html
  66. 66. Database / Doctrine sharding ● Suited for multi-tenant applications ● Global database to store shared data (e.g. user data) ● Need to use UUIDs http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/sharding.html
  67. 67. Database / Sharding Configuration doctrine: dbal: default_connection: global connections: default: shard_choser_service: vendor.app.shard_choser shards: shard1: id: 1 host / user / dbname shard2: id: 2 host / user / dbname
  68. 68. Database / Sharding ShardManager Interface $shardManager = new PoolingShardManager($conn); /* $conn: the PoolingShardConnection */ $currentCustomerId = 3; $shardManager->selectShard($currentCustomerId); // all queries after this call hit the shard where customer // with id 3 is on $shardManager->selectGlobal(); // the global db is selected
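
Something still has to decide which shard a given customer lives on. The sketch below shows one possible lookup that a shard chooser service could delegate to. `TenantShardMap` is a hypothetical class, not part of Doctrine, and the plain array stands in for a table that would normally live in the global database.

```php
<?php
declare(strict_types=1);

/** Hypothetical "customer id => shard id" lookup; not part of Doctrine. */
final class TenantShardMap
{
    /** @var array<int, int> customer id => shard id */
    private $shardByCustomer;

    public function __construct(array $shardByCustomer)
    {
        $this->shardByCustomer = $shardByCustomer;
    }

    /** Return the shard that holds the given customer. */
    public function shardFor(int $customerId): int
    {
        if (!isset($this->shardByCustomer[$customerId])) {
            throw new OutOfBoundsException("No shard registered for customer $customerId");
        }

        return $this->shardByCustomer[$customerId];
    }

    /**
     * Place a new customer on the shard that currently holds the fewest
     * customers; ties go to the shard listed first in $shardIds.
     */
    public function assign(int $customerId, array $shardIds): int
    {
        if (isset($this->shardByCustomer[$customerId])) {
            return $this->shardByCustomer[$customerId];
        }
        if ($shardIds === []) {
            throw new InvalidArgumentException('At least one shard id is required');
        }

        // Count how many customers each candidate shard already holds.
        $counts = array_fill_keys($shardIds, 0);
        foreach ($this->shardByCustomer as $shardId) {
            if (isset($counts[$shardId])) {
                $counts[$shardId]++;
            }
        }

        $target = null;
        foreach ($counts as $shardId => $count) {
            if ($target === null || $count < $counts[$target]) {
                $target = $shardId;
            }
        }

        $this->shardByCustomer[$customerId] = $target;

        return $target;
    }
}
```

Keeping the mapping explicit, rather than hashing the customer id, makes it possible to move a single tenant to another shard later by updating one row.
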
  69. 69. Database / Sharding ● It works but it’s complex to manage ● Documentation is scarce ● Need to manage shard configuration: adding a new shard? ● Need to parallelize shard migrations: Gearman? ● Deal with sharding in test environment
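
Applying one migration to many shards boils down to a loop that must not stop at the first broken database. Here is a minimal sequential sketch. `migrateAllShards` is a hypothetical helper, and what the callback does for each shard (for example running the migration command against that database) is left to the caller.

```php
<?php
declare(strict_types=1);

/**
 * Run $migrate once per shard and collect failures instead of aborting.
 *
 * @param int[]    $shardIds
 * @param callable $migrate  receives the shard id
 * @return array<int, string> shard id => error message; empty when all succeeded
 */
function migrateAllShards(array $shardIds, callable $migrate): array
{
    $failures = [];

    foreach ($shardIds as $shardId) {
        try {
            $migrate($shardId);
        } catch (Throwable $e) {
            // Remember the failure and keep going with the remaining shards.
            $failures[$shardId] = $e->getMessage();
        }
    }

    return $failures;
}
```

To parallelize, the loop body could publish one job per shard id to a queue instead of running the migration inline, which is where a job server such as Gearman would come in.
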
  70. 70. Database / Recap ● NOSQL is not used to scale SQL: they have different purposes. You can use both. ● Sharding is difficult to implement ● Need to change your code ● Short-term solution is to use AWS to offload some maintenance ● Doctrine ORM sharding works well but you need to write code and wrappers. Best suited for multi-tenant apps ● When it’s done, you can scale without any limit Take away: do sharding if you REALLY need it
  71. 71. Filesystem Users upload files: documents, media, etc How to handle them?
  72. 72. Filesystem ● You need a filesystem abstraction ● Use external object storage like S3 ● Avoid using a NAS: it’s tricky to set up correctly
  73. 73. Filesystem / Abstraction ● FlysystemBundle ● KnpGaufretteBundle https://github.com/1up-lab/OneupFlysystemBundle
  74. 74. Filesystem / Abstraction https://github.com/1up-lab/OneupFlysystemBundle ● AWS S3 ● Dropbox ● FTP ● Local filesystem ● ...
  75. 75. Filesystem / Abstraction Configuration oneup_flysystem: adapters: s3_adapter: awss3v3: client: s3_client bucket: "%s3_bucket%" oneup_flysystem: adapters: local_adapter: local: directory: 'myLocalDir'
  76. 76. Filesystem / Abstraction Configuration prod.yml oneup_flysystem: filesystems: my_filesystem: adapter: s3_adapter dev.yml oneup_flysystem: filesystems: my_filesystem: adapter: local_adapter
  77. 77. Filesystem / Abstraction Usage /** @var \League\Flysystem\FilesystemInterface $filesystem */ $filesystem = $container->get('oneup_flysystem.my_filesystem'); $path = 'myFilePath'; $filesystem->has($path); $filesystem->read($path); $filesystem->write($path, $contents);
  78. 78. Filesystem / Recap ● Easy ● Need to change your PHP code ● Ready-made bundles ● Avoid local filesystem and NAS Take away: use FlysystemBundle with S3
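
The point of the abstraction is that application code never knows where the bytes end up. The sketch below illustrates this with a hand-written interface; it is not Flysystem's API. `FileStorage`, `LocalFileStorage`, `InMemoryFileStorage` and `storeUpload` are hypothetical names, and the in-memory class merely stands in for a remote object store such as S3.

```php
<?php
declare(strict_types=1);

/** Hypothetical minimal contract, loosely mirroring has/read/write. */
interface FileStorage
{
    public function has(string $path): bool;
    public function read(string $path): string;
    public function write(string $path, string $contents): void;
}

/** Stores files under a root directory on the local disk (dev environment). */
final class LocalFileStorage implements FileStorage
{
    private $root;

    public function __construct(string $root)
    {
        $this->root = rtrim($root, '/');
    }

    public function has(string $path): bool
    {
        return is_file($this->root . '/' . $path);
    }

    public function read(string $path): string
    {
        if (!$this->has($path)) {
            throw new RuntimeException("File not found: $path");
        }
        $contents = file_get_contents($this->root . '/' . $path);
        if ($contents === false) {
            throw new RuntimeException("Cannot read: $path");
        }

        return $contents;
    }

    public function write(string $path, string $contents): void
    {
        $target = $this->root . '/' . $path;
        $dir = dirname($target);
        // Create missing parent directories before writing.
        if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
            throw new RuntimeException("Cannot create directory: $dir");
        }
        if (file_put_contents($target, $contents) === false) {
            throw new RuntimeException("Cannot write: $path");
        }
    }
}

/** Stand-in for a remote object store; keeps objects in memory. */
final class InMemoryFileStorage implements FileStorage
{
    private $objects = [];

    public function has(string $path): bool
    {
        return isset($this->objects[$path]);
    }

    public function read(string $path): string
    {
        if (!$this->has($path)) {
            throw new RuntimeException("File not found: $path");
        }

        return $this->objects[$path];
    }

    public function write(string $path, string $contents): void
    {
        $this->objects[$path] = $contents;
    }
}

/** Application code depends only on the interface, never on the backend. */
function storeUpload(FileStorage $storage, string $userId, string $name, string $contents): string
{
    $path = "uploads/$userId/$name";
    $storage->write($path, $contents);

    return $path;
}
```

Switching from local disk in dev to object storage in prod then becomes a wiring decision, which is what the per-environment adapter configuration achieves.
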
  79. 79. Async tasks Sync vs Async Can synchronous systems scale?
  80. 80. Async tasks We need a system able to manage some queues, deliver, store and route messages. And some PHP code able to consume those messages...
  81. 81. RabbitMQ Open source message broker https://www.rabbitmq.com
  82. 82. RabbitMQ
  83. 83. RabbitMQ How can PHP talk to RabbitMQ? AMQP library for PHP https://github.com/php-amqplib/php-amqplib
  84. 84. RabbitMQ RabbitMQ lets you scale the queue
  85. 85. RabbitMQ You can scale consumers: how?
  86. 86. RabbitMQ Putting some machines (containers) inside an auto-scaling group! They can scale based on: ● Hardware parameters: cpu / memory ● Number of queue items ● Add your custom metrics!
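
Scaling on the number of queue items can be as simple as a bounded division. The following sketch computes a target consumer count from the queue depth. The function name `desiredConsumers` and its parameters are assumptions for illustration; how the result is fed to the auto-scaling group depends on the platform.

```php
<?php
declare(strict_types=1);

/**
 * Target number of consumers for a given backlog, kept within [$min, $max].
 *
 * @param int $queueDepth          messages currently waiting in the queue
 * @param int $messagesPerConsumer backlog one consumer is expected to absorb
 */
function desiredConsumers(int $queueDepth, int $messagesPerConsumer, int $min, int $max): int
{
    if ($messagesPerConsumer < 1 || $min < 0 || $max < $min) {
        throw new InvalidArgumentException('Invalid scaling bounds');
    }

    // Round up so a partial batch still gets a consumer.
    $needed = (int) ceil($queueDepth / $messagesPerConsumer);

    // Never drop below the floor or grow past the ceiling.
    return max($min, min($max, $needed));
}
```

With a capacity of 100 messages per consumer and bounds of 1 to 10, a backlog of 250 messages gives 3 consumers, an empty queue gives 1, and a backlog of 5000 is capped at 10.
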
  87. 87. Async tasks / Recap ● You need an external system and some new machines / containers ● Need to change your PHP code ● Ready-made bundles and libraries ● Avoid blocking sync tasks. Put the message on the queue and move on. Take away: use RabbitMQ with auto-scaling consumers
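
"Put the message on the queue and move on" can be sketched as two independent pieces of code: a producer inside the web request and a consumer in a worker. This is not php-amqplib code. `MessageQueue`, `InMemoryQueue`, `requestReport` and `consume` are hypothetical names, and the in-memory queue only stands in for RabbitMQ.

```php
<?php
declare(strict_types=1);

/** Hypothetical queue contract; a broker-backed class would offer the same methods. */
interface MessageQueue
{
    public function publish(string $body): void;

    /** Returns the next message body, or null when the queue is empty. */
    public function pop(): ?string;
}

/** In-memory stand-in for the broker, only for this sketch. */
final class InMemoryQueue implements MessageQueue
{
    private $queue;

    public function __construct()
    {
        $this->queue = new SplQueue();
    }

    public function publish(string $body): void
    {
        $this->queue->enqueue($body);
    }

    public function pop(): ?string
    {
        return $this->queue->isEmpty() ? null : $this->queue->dequeue();
    }
}

/** Web request side: describe the job, enqueue it, return immediately. */
function requestReport(MessageQueue $queue, int $userId): void
{
    $queue->publish(json_encode(['task' => 'build_report', 'user_id' => $userId]));
}

/** Worker side: drain the queue, passing each decoded message to $handler. */
function consume(MessageQueue $queue, callable $handler): int
{
    $handled = 0;

    while (($body = $queue->pop()) !== null) {
        $handler(json_decode($body, true));
        $handled++;
    }

    return $handled;
}
```

A real consumer would block waiting for new messages and acknowledge each one after processing, rather than returning once the queue is empty.
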
  88. 88. Logging Logs have to be collected and managed in a centralized way...
  89. 89. Logging ELK
  90. 90. Logging ● You need an external system ● Take a look at managed ones: loggly.com, logz.io, scalyr.com ● Don’t need to change your PHP code ● You can’t avoid it in a distributed system Take away: use a managed service
  91. 91. Scaling / Recap ● Sessions and filesystem: easy. Do it ● PHP code: not difficult. Think about it. Save money. ● Database: very hard. Think a lot ● Async tasks: think about it if you have many of them. ● Logging: necessary. Easy to implement if you choose a managed service
  92. 92. THANK YOU
  93. 93. We are always looking for talented people...!
  94. 94. QUESTIONS?
