
Magento scalability from the trenches (Meet Magento Sweden 2016)



Here you can find some general rules on how to design high-performance web applications and build them in simple, iterative steps.

Published in: Internet


  1. MAGENTO SCALABILITY from the trenches. Piotr Karwatka
  2. AGENDA
     1. General scalability rules
     2. Action plan: scalability framework
     3. Magento B2B case
        1. EAV and indexes
        2. Cache
        3. Replication
        4. Fine-tuning
     4. Magento 2.0
  3. THE CHALLENGE
     - Good architecture is a rare good
     - There is no holy grail of scalability
     - Always take a custom approach: measure before optimizing
     - Start "cheap", scale fast: risky
     - Process-driven over improvisation
     - Redundancy: scalability goes with availability
     - Divide and conquer using layers
     - Measure and examine bottlenecks
     - Scale only overloaded layers
     - Good news: Magento is scalable by design
     (Diagram layers: middleware, cache, storage, app, db)
  4. HARDWARE APPROACH
     At the start: optimize code and use cache (New Relic and collectd to catch bottlenecks); try HHVM, nginx, OPcache
     Vertical: more RAM, more CPUs
     + no code changes required, fast gain
     - technology barriers
     - at some point very expensive
     Horizontal: more cheap servers
     + high availability when done right
     + cloud ready
     - often requires code refactoring
     - challenging configuration and DevOps
     (Chart: cost at scale)
  5. ACTION PLAN
     First go vertical, then go horizontal.
     Step 1
     - use vertical scaling as far as it's reasonable
     - optimize code to avoid bottlenecks
     - use caching where possible
     - separate the database server
     - separate static files and/or use a CDN
     Step 2
     - add additional app servers
     - establish a cache cluster
     - use a reverse proxy (Varnish)
     Step 3
     - use database replication
     - scale up using horizontal scaling
  6. MAGENTO CASE – THE CHALLENGE
     TIM.PL: largest B2B site in Poland. About 100,000,000 EUR / year
     Platform for customers: offers/inquiries, bulk orders, near real-time CRM/WMS integration
     - B2B e-commerce site with external integrations (CRM, PIM, ERP, WMS)
     - Up to 1.5M SKUs
     - Up to 2K active concurrent users, average session time: 4h+
     - About 6000 attributes
     - About 2189 attribute sets
     - 1M+ website calls / day
     - Challenging read/write ratio: 50/50
     - B2B features, site used as a tool/platform; browse/checkout scenario
  7. We called it an MVP. It worked well up to a point...
  8. FIRST APPROACH – 3 years ago
     - Cache for blocks enabled
     - FLAT enabled, but at 5000+ attributes we reached InnoDB limits
     - The code was optimized quite well (we used Ivan's tips: http://www. valuable-tips-for-developers)
     - Separate DB server + master-master replication (for backup purposes)
     - SSD disks (APP + DB), a lot of RAM (16GB / server): vertical scaling approach
     - MySQL tuning (IO buffers, InnoDB buffers)
     - Apache tuning (connection limits, FPM)
     - HHVM tested: about +50% boost, but no profiling
  9. OPTIMIZE AND PROFILE!
     Always measure the impact of a change before deploying it to production
     - JMeter: we used it to emulate throughput and run load tests after each change
     - New Relic: to analyze application speed and track slow queries and method calls; it can be used on production servers as well because of its near-zero overhead
     - collectd: installed on both app and db servers; with it we discovered IO bottlenecks and db-locking during Magento's product indexation
     - Logs: we used ELK (Kibana) and a custom New Relic integration to diagnose web-service response times
     - htop, iotop: during IO problems they help find exactly what generates the load
     - Xdebug/XHProf profiler: on staging servers to debug and profile code and discover cache gaps
     (Diagram cycle: JMeter 2h load tests, fine-tuning, JMeter 24h load tests. Optimize one piece at a time.)
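The "measure after each change" rule on this slide can be made concrete by comparing a latency percentile between a baseline load-test run and a run with the change applied. The following is a minimal sketch, not the tooling used in the talk: the function names and the idea of feeding it response times exported from a load test are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(baseline_ms, candidate_ms, pct=95):
    """Compare one latency percentile between two load-test runs.

    Both arguments are lists of response times in milliseconds
    (hypothetical input, e.g. exported from a load-testing tool).
    A negative change_pct means the candidate run was faster.
    """
    base = percentile(baseline_ms, pct)
    cand = percentile(candidate_ms, pct)
    return {
        "baseline": base,
        "candidate": cand,
        "change_pct": (cand - base) / base * 100,
    }
```

Comparing a high percentile rather than the mean keeps a handful of very slow requests from hiding behind many fast ones, which matters when the bottleneck is intermittent locking or IO.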
  10. High availability is crucial: we switched to a 2N model
      (Diagram: HAProxy + Varnish as load balancer, providing load balancing and a reverse proxy for caching and static files; app servers + GlusterFS, where both servers can handle user requests; master-master database)
  11. APP & CACHE
      - Redis is faster than memcached as a cache backend
      - Varnish (with ESI) is a must for both static files and page caching (we used Turpentine and Phoenix on some projects; both are fine)
      - VCL can be challenging
      - We managed to use HAProxy as a load balancer (with automatic failover)
      - We added cache to Mage_Catalog_Model_Product::load
      - Consider adding cache to Mage_Eav_Model_Entity_Abstract to avoid EAV altogether; we couldn't use FLAT because of the attribute count
      - We enabled FLAT for the 900 most frequently used attributes (InnoDB limits)
      - Sessions were moved to Redis
      - We discovered a lot of queries to core_url_rewrite; cache should help here
      - We used the Fast-Async Reindexing module on Magento 1.x to avoid database locking
      - GlusterFS is used to handle uploads and replication
  12. VARNISH IMPACT
  13. APP & CACHE - Remarks
      - GlusterFS/network file systems: stat() and open() without local caching are IO exhausting
      - We had some issues with APC on PHP 5.4 (segfaults); now everybody uses OPcache ☺
      - At some point we switched from Apache to nginx + php-fpm to gain req/s throughput and lower memory usage (read more here: http://info.magento.com/rs/magentocommerce/images/MagentoECG-PoweringMagentowithNgnixandPHP-FPM.pdf)
      - We had problems with the Magento API (really slow responses: 0.5s); after optimizations: 0.2s; with HHVM added: 0.1s; next step: a fast-responding façade without Magento overhead
      - We had problems with Redis clogging with cache keys (http://divante.co/blog/magento-clogged-redis-cache/)
  14. HHVM IMPACT
  15. THE HARD WAY
      - Most challenging issues: EAV and indexing
      - It would be great to use a NoSQL DB (MongoDB, SOLR)
      - At this point we use only model-level cache
      - We disabled Magento logs and reports: fewer queries, less useless data to store
      - Small configuration tips make a big difference:
        - query_cache_size: up to 128MB works well; furthermore, cache cleaning can be really, REALLY slow
        - innodb_thread_concurrency: setting it to 0 prevents MySQL from clogging worker threads (it looks like locking but it isn't)
      - We switched from MySQL to PerconaDB/XtraDB
        - Great performance gain at peaks (queries count vs. response time): up to +275%
        - No code / SQL changes required: 100% compatible with MySQL
      - MemSQL: looks really promising, not tested yet
  16. DATABASE CAVEATS
      Without FLAT in place there are a lot of EAV-related queries, and also a lot of URL-redirect related queries. Those queries are unnecessary.
  17. HOW TO DISABLE EAV?
      For when you cannot use FLAT (categories + products are a must): it's too slow or you have too many attributes
      - It would be great if we could switch to a NoSQL DB (like MongoDB, SOLR, Sphinx Search)
      - One can overwrite the EAV->FLAT indexers, but it's extremely hard (relations, some modules work on raw SQL)
      - Suggestions:
        - Add cache to the Product::load method; invalidation is extremely important (you can use the modification date in the cache key or an observer-based mechanism to clear it)
        - Add cache for loading EAV attributes, for products and product categories
        - Overwrite/refactor Mage_Catalog for searching and browsing products; some search modules do this partially
        - Great knowledge base about EAV: http://www.
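The suggestion of putting the modification date into the cache key can be sketched in a language-neutral way as follows. This is an illustrative Python sketch rather than Magento code: `VersionedProductCache`, `loader`, and `version_of` are hypothetical names, and the in-memory dict stands in for a real backend such as Redis.

```python
class VersionedProductCache:
    """Cache-aside wrapper that keys entries by (product id, version marker)."""

    def __init__(self, loader, version_of):
        # loader: hypothetical expensive full load (the EAV path in the talk).
        # version_of: hypothetical cheap lookup of the product's modification date.
        self._loader = loader
        self._version_of = version_of
        self._store = {}  # stand-in for an external cache backend

    def load(self, product_id):
        # A changed modification date yields a new key, so a stale entry
        # is simply never read again; no explicit invalidation call is needed.
        key = (product_id, self._version_of(product_id))
        if key not in self._store:
            self._store[key] = self._loader(product_id)
        return self._store[key]
```

The trade-off of this design is that outdated entries stay in the store until something evicts them, so a real backend would need a TTL or an eviction policy. The observer-based alternative mentioned on the slide deletes the entry on save instead, at the cost of having to catch every write path.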
  18. DATABASE SCALABILITY - REPLICATION
      - With replicas one gets high availability and more req/s
      - It doesn't fit all cases. Caution: replication lag
      - It's possible to move selected tables to external servers (like product catalogs)
      - Always consider using cache first!
      (Diagrams: master-slave and master-master topologies; tables split across separate masters, e.g. TB: users, TB: photos)
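One common way to live with the replication lag mentioned on this slide is to send reads to replicas but pin a session to the master for a short window after it writes. The sketch below illustrates that idea only; it is not how Magento routes connections, and `ReplicaRouter`, `pin_seconds`, and the session-id parameter are assumptions made for the example.

```python
import time


class ReplicaRouter:
    """Route writes to the master and reads to replicas, with a lag guard."""

    def __init__(self, master, replicas, pin_seconds=2.0, clock=time.monotonic):
        self.master = master
        self.replicas = list(replicas)
        self.pin_seconds = pin_seconds  # assumed upper bound on replication lag
        self._clock = clock             # injectable so the logic can be tested
        self._last_write = {}           # session id -> time of the last write
        self._next = 0                  # round-robin position

    def for_write(self, session_id):
        # Remember when this session last wrote, then use the master.
        self._last_write[session_id] = self._clock()
        return self.master

    def for_read(self, session_id):
        last = self._last_write.get(session_id)
        recently_wrote = last is not None and self._clock() - last < self.pin_seconds
        if not self.replicas or recently_wrote:
            # Read-your-writes: a replica may not have the new rows yet.
            return self.master
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica
```

With a 50/50 read/write ratio like the one described earlier in the talk, many sessions would stay pinned to the master most of the time, which is one reason the slide recommends considering cache first.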
  19. INDEXATION VS. REPLICATION
      - Master-slave replication should help with the db-locking issue
      - MySQL replicates only write operations (e.g. UPDATE/INSERT) through binlogs; this is extremely fast and doesn't lock replicas
      Excerpt from the Magento 1.x indexing code (the event is indexed inside a database transaction):
          public function processEntityAction(Varien_Object $entity, $entityType, $eventType)
          ...
          $resourceModel = Mage::getResourceSingleton('index/process');
          $resourceModel->beginTransaction();
          $this->_allowTableChanges = false;
          try {
              $this->indexEvent($event);
              $resourceModel->commit();
          } catch (Exception $e) {
              $resourceModel->rollBack();
              if ($allowTableChanges) {
                  $this->_allowTableChanges = true;
                  $this->_changeKeyStatus(true);
                  $this->_currentEvent = null;
              }
              throw $e;
          }
  20. DATABASE – NEXT STEPS
      - We tested app-local master-slave replication to avoid network latency and database locking; Magento supports this kind of replication out of the box
      - Next step: move the catalog database to a separate server
      - Route Admin panel requests to separate servers (using the Magento 2 multi-master feature)
      (Diagram: HAProxy & Varnish as load balancer + proxy; app servers + GlusterFS + PerconaDB local db-slaves for read access, each server can handle user requests; master-master RDBMS for indexing, updates, imports)
  21. INTEGRATIONS
      - We use queuing to avoid bottlenecks
      - On each app server there are Gearman workers (PHP processes) responsible for getting prices and stocks and transferring orders
      - Workers exchange data with CRM, WMS, ERP, PIM in both async and sync modes, using priorities
      - We used the Command/Task design pattern
      - We log everything using ELK, especially Kibana, and New Relic to analyze external systems
      - The Magento API can be very challenging (it's extremely slow)
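The combination of queuing, priorities, and the Command/Task pattern described on this slide can be sketched as below. This is an in-process Python illustration only: it does not use Gearman's API, and `Task`, `PriorityTaskQueue`, and the task names in the usage note are hypothetical.

```python
import heapq
import itertools


class Task:
    """Command object: bundles a callable with its arguments for later execution."""

    def __init__(self, name, action, *args):
        self.name = name
        self.action = action
        self.args = args

    def execute(self):
        return self.action(*self.args)


class PriorityTaskQueue:
    """Runs tasks in priority order; a lower number means more urgent."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority tasks run in submission
        # order and Task objects themselves are never compared.
        self._counter = itertools.count()

    def submit(self, task, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def run_all(self):
        results = []
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            results.append((task.name, task.execute()))
        return results
```

For example, submitting a hypothetical `push_order` task with priority 1 after two priority-5 price and stock tasks still runs the order transfer first. In the setup from the talk, each worker process would pull such tasks from the queue server instead of a local heap.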
  22. MONITORING
      - We use Kibana (ELK stack) and custom New Relic metrics to monitor real-time integrations (CRM, WMS, ERP)
      - Zabbix with Selenium scripts is used to monitor website availability and send alerts
  23. FINAL ARCHITECTURE
      (Diagram: web requests and API calls enter through HAProxy & Varnish as load balancer + proxy; app servers + GlusterFS + PerconaDB local db-slaves for read access, each server can handle user requests; master-master database; Gearman queue workers handle background jobs and external integrations, including calls to external systems)
  24. WHAT I'VE MISSED + MAGENTO 2
      - Search: we used FactFinder / SOLR
      - Details about Varnish and HHVM
      - Life is going to be easier. What excites me in Magento 2?
        - Materialized views engine: smarter indexation
        - Full page caching in the community edition
        - Multi-master DB contexts
        - Checkout optimizations
  25. THANK YOU! QUESTIONS?
      Technical or scalability challenges? Contact me to consult your case for free!
      Piotr Karwatka ( Divante –