This document discusses strategies for scaling a Magento e-commerce platform. It recommends first using vertical scaling by optimizing code and enabling caching before adding additional application and database servers through horizontal scaling. Specific optimizations discussed include using Redis for caching, Varnish for page caching, separating the database to its own server, enabling flat catalog indexing, and implementing master-slave database replication. Proper monitoring tools like New Relic and load testing are also emphasized for identifying bottlenecks during the scaling process.
2. AGENDA
1. General scalability rules
2. Action Plan – scalability framework
3. Magento B2B case
   1. EAV and indexes
   2. Cache
   3. Replication
   4. Fine-tuning
4. Magento 2.0
3. THE CHALLENGE
- Good architecture – a rare good,
- There is no holy grail of scalability,
- Always take a custom approach – measure before optimizing,
- Starting “cheap” and scaling fast is risky,
- Process-driven over improvisation,
- Redundancy – scalability goes with availability,
- Divide and conquer – using layers,
- Measure and examine bottlenecks,
- Scale only overloaded layers,
- Good news: Magento is scalable by design
[Diagram: layers – middleware, cache, storage, app, db]
4. HARDWARE APPROACH
At start – optimize code & use cache (New Relic and collectd to catch bottlenecks); try HHVM, nginx, OpCache
Vertical: more RAM, more CPUs
+ no code changes required, fast gain
- technology barriers,
- at some point very expensive
Horizontal: more cheap servers
+ high availability when done right,
+ cloud ready,
- often requires code refactoring,
- challenging configuration and dev-ops
[Chart: cost at scale]
5. ACTION PLAN
Step 1
- use vertical scaling as far as it’s reasonable,
- optimize code to avoid bottlenecks,
- use caching where it’s possible,
- separate database server
- separate static files and/or use a CDN,
Step 2
- add additional app servers,
- establish cache cluster,
- use reverse proxy (Varnish)
Step 3
- use database replication,
- scale out using horizontal scaling
First go vertical
Then go horizontal
6. MAGENTO CASE – THE CHALLENGE
TIM.PL – largest B2B site in Poland. About EUR 100,000,000 / year
Platform for customers – offers/inquiries, bulk orders, near real-time CRM/WMS integration
- B2B e-commerce site with external integrations (CRM, PIM, ERP, WMS),
- Up to 1.5M SKUs,
- Up to 2K active concurrent users, average session time: 4h+,
- About 6000 attributes,
- About 2189 attribute sets,
- 1M+ website calls / day,
- Challenging read/write ratio: 50/50,
- B2B features, site used as a tool/platform; browse/checkout scenario
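To turn the daily call count above into a target for load tests, a rough back-of-the-envelope sketch may help. Only the 1M calls/day figure comes from the slide; the traffic shape (80% of calls within 8 working hours) is an assumption made purely for illustration, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(calls_per_day):
    """Average request rate if traffic were spread evenly over the day."""
    return calls_per_day / SECONDS_PER_DAY


def busy_hours_rps(calls_per_day, busy_hours, share_in_busy_hours):
    """Request rate if `share_in_busy_hours` of the calls land in `busy_hours`."""
    return calls_per_day * share_in_busy_hours / (busy_hours * 3600)


calls = 1_000_000  # "1M+ website calls / day" from the slide

print(f"average: {average_rps(calls):.1f} req/s")
# Assumed traffic shape, not a figure from the talk.
estimate = busy_hours_rps(calls, busy_hours=8, share_in_busy_hours=0.8)
print(f"busy-hours estimate: {estimate:.1f} req/s")
```

The real distribution should come from access logs or monitoring; the point is only that the sustained rate during working hours can be several times the daily average.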
7. We called it MVP.
It worked well up to a point...
8. FIRST APPROACH – 3 years ago
- Cache for blocks enabled,
- FLAT enabled – but at 5000+ attributes we hit InnoDB limits,
- The code was optimized quite well (we used Ivan’s tips: http://www.slideshare.net/ivanchepurnyi/making-magento-flying-like-a-rocket-a-set-of-valuable-tips-for-developers),
- Separate DB server + master-master replication (for backup purposes),
- SSD disks (APP + DB), a lot of RAM (16GB / server) – vertical scaling approach,
- MySQL tuning (IO buffers, InnoDB buffers),
- Apache tuning (connection limits, FPM),
- HHVM tested – about +50% boost, but no profiling
9. OPTIMIZE AND PROFILE!
Always measure the impact of a change before deploying it to production
- JMeter – we used it to emulate throughput and run load tests after each change,
- New Relic – to analyze application speed and track slow queries and method calls; it can be used on production servers as well because of its near-zero overhead,
- Collectd – installed on both app and db servers; with it we discovered IO and db-locking bottlenecks during Magento’s product indexation,
- Logs – we used ELK (Kibana) and a custom New Relic integration to diagnose web-service response times,
- htop, iotop – useful during IO problems to find out exactly what generates the load,
- Xdebug/XHProf profilers – on staging servers, to debug and profile code and discover cache gaps
[Diagram: JMeter 2h load tests, fine-tuning, JMeter 24h load tests; optimize one piece at a time]
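A minimal sketch of the "measure, change one piece, measure again" loop: compare response-time percentiles from two load-test runs. The sample values are made up for illustration (they are not measurements from this project), and the helper names are hypothetical; in practice the samples would come from the load-test result files.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of response times."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(before_ms, after_ms, pcts=(50, 95, 99)):
    """Return {pct: (before, after, relative_change)} for two runs."""
    report = {}
    for pct in pcts:
        b = percentile(before_ms, pct)
        a = percentile(after_ms, pct)
        report[pct] = (b, a, (a - b) / b)
    return report


# Hypothetical response times in milliseconds, before and after one change.
before = [120, 130, 125, 900, 140, 135, 128, 132, 1500, 138]
after = [100, 105, 98, 400, 110, 102, 99, 104, 650, 108]

for pct, (b, a, change) in compare_runs(before, after).items():
    print(f"p{pct}: {b} ms -> {a} ms ({change:+.0%})")
```

Looking at high percentiles rather than the mean makes it easier to see whether a change helped the slow requests, which is usually where locking and IO bottlenecks show up.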
10. High availability is crucial – we switched to 2N model
[Diagram: HAProxy + Varnish (load balancing, reverse proxy for caching and static files) in front of two app servers + GlusterFS, labelled master / master; both servers can handle user requests]
11. APP & CACHE
- Redis is faster than memcached as a cache backend,
- Varnish (with ESI) is a must for both static files and page caching (we used Turpentine and Phoenix on some projects – both are fine); VCL can be challenging,
- We used HAProxy as the load balancer (with automatic failover),
- We added a cache to Mage_Catalog_Model_Product::load,
- Consider adding a cache to Mage_Eav_Model_Entity_Abstract to avoid EAV altogether – we couldn’t use FLAT because of the attribute count,
- We turned on FLAT for the 900 most frequently used attributes (InnoDB limits),
- Sessions were moved to Redis,
- We discovered a lot of queries to core_url_rewrite – a cache should help here,
- We used the Fast-Async Reindexing module on Magento 1.x to avoid database locking,
- GlusterFS was used to handle uploads and replication
13. APP & CACHE
- Remarks
- GlusterFS/network file systems – stat() and open() without local caching are IO-exhausting,
- we had some issues with APC on PHP 5.4 (segfaults) – now everybody uses OpCache ☺
- at some point we switched from Apache to nginx + php-fpm to gain req/s throughput and lower memory usage (read more here: http://info.magento.com/rs/magentocommerce/images/MagentoECG-PoweringMagentowithNgnixandPHP-FPM.pdf),
- We had problems with the Magento API (really slow responses – 0.5s); after optimizations = 0.2s, + HHVM = 0.1s; next step – a fast-responding façade without Magento overhead - http://divante.co/blog/magento-1-9-1-0-page-load-time-0-3s/
- We had problems with Redis clogging with cache keys (http://divante.co/blog/magento-clogged-redis-cache/)
15. THE HARD WAY
- Most challenging issues: EAV and indexing,
- It would be great to use a NoSQL store (MongoDB, SOLR),
- At this point we use only model-level cache,
- We’ve disabled Magento logs and reports – fewer queries, less useless data to store,
- Small configuration tips make a big difference:
  - query_cache_size - up to 128MB works well; with larger values, cache cleaning can be really, REALLY slow
  - innodb_thread_concurrency - setting it to 0 prevents MySQL from clogging worker threads (it looks like locking but it isn’t)
- We switched from MySQL to PerconaDB/XtraDB
  - Great performance gain on peaks – queries count vs. response time – up to +275%,
  - No code / SQL changes required – 100% compatible with MySQL,
- MemSQL – looks really promising, not tested yet
16. DATABASE CAVEATS
Without FLAT in place – a lot of EAV-related queries, and also a lot of URL-redirect-related queries. Those queries are unnecessary.
17. HOW TO DISABLE EAV?
– it would be great to switch to a NoSQL store (like MongoDB, SOLR, Sphinx Search),
– one can override the EAV->FLAT indexers, but it’s extremely hard (relations, some modules work on raw SQL),
– suggestions:
- Add a cache to the Product::load method – invalidation is extremely important (you can use the modification date in the cache key or an observer-based mechanism to clear it),
- Add a cache for loading EAV attributes – for products and product categories,
- Override/refactor Mage_Catalog – for searching and browsing products – some search modules do this partially,
- Great knowledge base about EAV: http://www.solvingmagento.com/magento-eav-system/
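The "modification date in the cache key" idea from the suggestions can be sketched as a language-neutral cache-aside wrapper. This is not the project's code and not Magento's API: the class, the two loader callables and the in-memory dict (standing in for Redis or another backend) are hypothetical, chosen only to show why a changed product never returns a stale entry.

```python
class VersionedProductCache:
    """Cache-aside wrapper whose keys embed the product's modification time."""

    def __init__(self, load_updated_at, load_product):
        # Hypothetical stand-ins: a cheap "updated_at" lookup and the
        # expensive full (EAV) product load.
        self._load_updated_at = load_updated_at
        self._load_product = load_product
        self._store = {}  # stand-in for Redis or another cache backend
        self.misses = 0

    def get(self, product_id):
        updated_at = self._load_updated_at(product_id)
        # A new modification date yields a new key, so stale data is
        # simply never read again; no explicit invalidation is needed.
        key = f"product:{product_id}:{updated_at}"
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._load_product(product_id)
        return self._store[key]


# Illustrative data only.
db = {42: {"name": "Cable 3x2.5", "updated_at": "2015-01-01 10:00:00"}}
cache = VersionedProductCache(
    load_updated_at=lambda pid: db[pid]["updated_at"],
    load_product=lambda pid: dict(db[pid]),
)

cache.get(42)
cache.get(42)  # served from the cache
db[42].update(name="Cable 3x4", updated_at="2015-01-02 09:30:00")
print(cache.get(42)["name"], "misses:", cache.misses)
```

The trade-off is that every read still needs the cheap modification-date lookup, and superseded keys stay in the backend until they expire or are evicted, so entries should carry a TTL. The observer-based alternative avoids both costs but has to catch every write path.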
If you cannot use FLAT (categories + products are a must) – it’s too slow or you have too many attributes
18. DATABASE SCALABILITY - REPLICATION
With replicas one gets high availability and more req/s.
It doesn’t fit all cases.
Caution: replication lag.
It’s possible to move selected tables (like product catalogs) to external servers.
Always consider using a cache first!
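To illustrate the replication-lag caution, here is a small sketch of read/write splitting that falls back to the master when replicas are too far behind. It is not Magento's connection handling; the class, the server names and the `lag_probe` callables are hypothetical. With MySQL, such a probe could for instance read `Seconds_Behind_Master` from `SHOW SLAVE STATUS`, which may be NULL when replication is not running.

```python
class ReplicaAwareRouter:
    """Send writes to the master and reads to a sufficiently fresh replica."""

    def __init__(self, master, replicas, max_lag_seconds=1.0):
        self.master = master
        self.replicas = replicas  # list of (name, lag_probe) pairs
        self.max_lag_seconds = max_lag_seconds

    def route(self, sql, needs_fresh_data=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Writes, and reads that must see the latest write, go to the master.
        if not is_read or needs_fresh_data:
            return self.master
        for name, lag_probe in self.replicas:
            lag = lag_probe()  # seconds behind the master, or None if unknown
            if lag is not None and lag <= self.max_lag_seconds:
                return name
        # Every replica is lagging or broken: reading from the master is safer.
        return self.master


# Hypothetical topology with fixed lag values for the demo.
router = ReplicaAwareRouter(
    master="db-master",
    replicas=[("db-replica-1", lambda: 5.0), ("db-replica-2", lambda: 0.2)],
)

print(router.route("SELECT * FROM catalog_product_entity"))
print(router.route("UPDATE sales_flat_order SET status = 'complete'"))
print(router.route("SELECT * FROM sales_flat_quote", needs_fresh_data=True))
```

The `needs_fresh_data` flag covers read-your-own-writes cases such as the cart or checkout, where even a short lag would show the customer outdated data.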
[Diagram: replication topologies (master-slave, master-master) and selected tables moved to separate masters (TB: users, TB: photos)]
19. INDEXATION VS. REPLICATION
- Master-slave replication should help with the db-locking issue;
- MySQL replicates only write operations (such as UPDATE/INSERT) using binlogs; this is extremely fast and doesn’t lock replicas
public function processEntityAction(Varien_Object $entity, $entityType, $eventType)
{
    // ... (elided on the slide)
    $resourceModel = Mage::getResourceSingleton('index/process');
    // The whole index event runs inside a single transaction
    $resourceModel->beginTransaction();
    $this->_allowTableChanges = false;
    try {
        $this->indexEvent($event);
        $resourceModel->commit();
    } catch (Exception $e) {
        // Roll back and restore the indexer state before re-throwing
        $resourceModel->rollBack();
        if ($allowTableChanges) {
            $this->_allowTableChanges = true;
            $this->_changeKeyStatus(true);
            $this->_currentEvent = null;
        }
        throw $e;
    }
    // ... (rest of the method is not shown on the slide)
}
20. DATABASE – NEXT STEPS
- We’ve tested app-local master-slave replication to avoid network latency and database locking
– Magento supports this kind of replication out of the box,
– Next step – move the catalog database to a separate server,
– Route Admin panel requests to separate servers (using the multi-master feature of Magento 2)
[Diagram: HAProxy & Varnish (load balancer + proxy) in front of app servers + GlusterFS + PerconaDB with local db slaves for read access; each server can handle user requests; master / master handle indexing, updates, imports, RDBM]
21. INTEGRATIONS
- We use queuing to avoid bottlenecks,
- On each app server there are Gearman workers (PHP processes) responsible for getting prices and stocks and transferring orders,
- Workers exchange data with CRM, WMS, ERP and PIM in both async and sync modes, using priorities,
- We used the Command/Task design pattern,
- We log everything using ELK – especially Kibana – and New Relic to analyze external systems,
- The Magento API can be very challenging (it’s extremely slow)
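The combination of queuing, priorities and the Command/Task pattern can be sketched with an in-process priority queue. This is only a stand-in for the Gearman setup described above and does not use the Gearman API; the command classes, priorities and return strings are hypothetical.

```python
import heapq
import itertools


class Command:
    """Base class: each integration task knows how to execute itself."""

    def execute(self):
        raise NotImplementedError


class FetchPrice(Command):
    def __init__(self, sku):
        self.sku = sku

    def execute(self):
        # A real worker would call the external system (ERP/CRM) here.
        return f"price fetched for {self.sku}"


class TransferOrder(Command):
    def __init__(self, order_id):
        self.order_id = order_id

    def execute(self):
        return f"order {self.order_id} transferred"


class PriorityCommandQueue:
    """Lower number means higher priority; ties run in submission order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, command, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), command))

    def run_next(self):
        _, _, command = heapq.heappop(self._heap)
        return command.execute()

    def __len__(self):
        return len(self._heap)


queue = PriorityCommandQueue()
queue.submit(FetchPrice("SKU-1"), priority=5)
queue.submit(TransferOrder(1001), priority=1)  # orders jump ahead of price sync
queue.submit(FetchPrice("SKU-2"), priority=5)

while len(queue):
    print(queue.run_next())
```

Keeping each task as a self-contained command object means the same class can be executed synchronously in a request or handed to a background worker, which matches the "both async and sync modes" point above.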
22. MONITORING
We use Kibana (ELK stack) and custom New Relic metrics to monitor real-time integrations (CRM, WMS, ERP)
Zabbix with Selenium scripts is used to monitor website availability and alert on it
23. FINAL ARCHITECTURE
[Diagram: web requests and API calls pass through HAProxy & Varnish (load balancer + proxy) to app servers + GlusterFS + PerconaDB, labelled master / master, with local db slaves for read access; each server can handle user requests; Gearman queue workers handle background jobs and external integrations (external system calls)]
24. WHAT I’VE MISSED + MAGENTO 2
- Search – we used FactFinder / SOLR,
- Details about Varnish and HHVM
- Life is going to be easier: What excites me in Magento2?
– Materialized views engine – smarter indexation,
– Full page caching in community,
– Multi master DB contexts,
– Checkout optimizations
25. THANK YOU! QUESTIONS?
Technical or scalability challenges?
Contact me to consult your case for free!
Piotr Karwatka (pkarwatka@divante.pl)
Divante – http://divante.co