High-performance high-availability Plone

Guido Stevens
guido.stevens@cosent.nl
www.cosent.nl
Social Knowledge Technology

Please wave, to improve my speech

Plone as usual
● Aspeli: über-buildout for a
production Plone server
● Regebro:
Plone-Buildout-Example
– nginx frontend
– varnish cache
– haproxy balancer
– 4x plone instance
– zeo backend

Plone as usual
balancing across Plone instances

Plone as usual
Plone instances

Meet the client
● High-profile internet technology NGO
● Slashdot traffic levels
– 0.4 million page views / peak day
– 4 million page views / month
– 40 million hits / month
● Mission-critical web presence
● 100% uptime previous 5 years
● Non-Plone sysadmins
● High security

Architecture Goals
● Must convince “file-based 100% uptime” sysadmins
● No SPOF
– eliminate all Single Points Of Failure
● Automated failover
– no manual intervention
● Extreme performance
● Extreme resilience
– killall -9 Plone

Meet Paul Stevens
● My brother
● mod_wodan + DBmail
● Plone developer
● pjstevns on irc/github/etc
NFG Net Facilities Group
● premium hosting
● 24/7 MySQL HA
– since stone age
● www.nfg.nl

Load Balancer
● Client provided hardware load balancer
● Alternative: Linux Virtual Server + HAproxy
– 2x HAproxy in active/passive config
● this would be an EXTRA layer of HAproxy not shown in diagram
– use highly available “virtual” IP address
– monitor with Heartbeat or comparable
– failover virtual IP address with arping broadcasts
● Alternative: AWS
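
The active/passive alternative above can be sketched as follows. This is only an illustration of the failover decision, not Heartbeat or Linux Virtual Server configuration; the node names, the `Node` class and the `choose_vip_holder` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str      # hypothetical balancer name
    healthy: bool  # result of the most recent health check


def choose_vip_holder(current: Optional[str], nodes: List[Node]) -> Optional[str]:
    """Return the node that should hold the virtual IP address.

    The choice is sticky: the current holder keeps the address while it
    is healthy, so a recovered node does not trigger a second switch.
    """
    by_name = {n.name: n for n in nodes}
    if current in by_name and by_name[current].healthy:
        return current
    for node in nodes:  # first healthy node in list order takes over
        if node.healthy:
            return node.name
    return None  # no healthy node: nobody holds the address


nodes = [Node("lb1", True), Node("lb2", True)]
holder = choose_vip_holder("lb1", nodes)                  # lb1 stays active
nodes[0].healthy = False
after_failure = choose_vip_holder(holder, nodes)          # lb2 takes over
nodes[0].healthy = True
after_recovery = choose_vip_holder(after_failure, nodes)  # lb2 keeps the address
```

In a real setup the new holder would then announce the address on the network, which is what the arping broadcasts in the slide are for.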

Ensure physical separation
● Ensure redundancy across physical servers
– no use to fail over on same machine
– separate machines in separate data centers
● Gotcha: moving virtuals around
– Disable HA facilities of virtualization platform
– We'll do our own HA

ZEO versus Relstorage
● ZEO
– ZEO protocol
– filestorage
– object pickles
● ZRS Replication
– $$$ at the time
– later opensourced
● No hot-failover
– slave → master reconfig
● Relstorage
– SQL connection (no ZEO server)
– MySQL or PostgreSQL
– object pickles: no alchemy!
● MySQL replication
– done that 24/7 since 2001
– widely used
● Hot failover
– multi-master
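
A minimal sketch of what "object pickles" in a relational database means: the database stores each object as one opaque blob, with no mapping of attributes to columns, which is presumably what "no alchemy" refers to. Here `sqlite3` stands in for MySQL or PostgreSQL so the example runs anywhere, and the table layout is hypothetical, not Relstorage's actual schema.

```python
import pickle
import sqlite3

# sqlite3 is a stand-in for MySQL/PostgreSQL; the schema is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object_state (zoid INTEGER PRIMARY KEY, state BLOB)")


def store(zoid: int, obj: object) -> None:
    """Store the object as a single pickle blob keyed by object id."""
    conn.execute(
        "INSERT OR REPLACE INTO object_state (zoid, state) VALUES (?, ?)",
        (zoid, pickle.dumps(obj)),
    )


def load(zoid: int) -> object:
    """Fetch the blob for an object id and unpickle it."""
    row = conn.execute(
        "SELECT state FROM object_state WHERE zoid = ?", (zoid,)
    ).fetchone()
    if row is None:
        raise KeyError(zoid)
    return pickle.loads(row[0])


page = {"title": "Front page", "tags": ["plone", "ha"]}
store(1, page)
restored = load(1)
```

Because the rows are plain blobs, ordinary database replication can copy them without knowing anything about Plone.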

Blobstorage
● Not shown in diagram
● Client provided Netapp Metrocluster NFS disks
– no need to care about replication and HA for those
● Alternatives:
– DRBD + NFS
– AWS Elastic Block Store (EBS)
– fsniper + rsync + NFS
● Why not run database on that?
– disk replication + NFS + ZEO
– what can possibly go wrong?

mod_wodan
● Caching module for Apache
– C
– Originally by ICS for nu.nl
– Now maintained by NFG
● Store response body + headers on disk
● BOFH attitude to caching policies
● Used in anger
● Alternative: stxnext.staticdeployment

Varnish ↔ Wodan
Varnish
● Proxy process
● RAM cache
– restart → empty cache
– expired → gone
● Plays nice
– request + response headers
– etag split-view
● purge API
– plone.app.caching
Wodan
● Apache module
● Persistent disk cache
– restart → full cache
– expired → keep fallback
● BOFH
– my way or the highway
– single cache file per page
● Cronjobs maintenance
– crawl sitemap
– delete removed pages
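
The cronjob maintenance step (crawl the sitemap, delete removed pages) could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, cached pages are identified by URL path, and the actual deletion of cache files is left out because mod_wodan's on-disk layout is not described here.

```python
import xml.etree.ElementTree as ET
from typing import Set
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_paths(sitemap_xml: str) -> Set[str]:
    """Return the URL paths listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        urlparse(loc.text.strip()).path or "/"
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def removed_pages(cached_paths: Set[str], sitemap_xml: str) -> Set[str]:
    """Cached paths that no longer appear in the sitemap."""
    return cached_paths - sitemap_paths(sitemap_xml)


SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/</loc></url>
  <url><loc>http://example.org/news</loc></url>
</urlset>"""

cached = {"/", "/news", "/old-event"}  # hypothetical cache contents
to_delete = removed_pages(cached, SITEMAP)
```

The same list of sitemap paths can drive the crawl that keeps the disk cache complete.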

Varnish plus Wodan
Varnish
● offload Plone
● plone.app.caching policies
– pages 1 hour
– resources longer
– purge on edit
● etag split-view
– per-user page versions
– cache authenticated
Wodan
● failsafe content delivery
● hard policy config
– pages 1 minute
– resources longer
– edit → 1-minute refresh
● Gotcha: anonymous only
– editors bypass Wodan
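
The "failsafe content delivery" behaviour (refresh a page after its one-minute lifetime, but keep serving the old copy when the backend is down) can be sketched like this. It is not mod_wodan's implementation: the class name is hypothetical, a dict stands in for the disk cache, and `fetch` is an assumed callable that raises when the backend is unreachable.

```python
import time
from typing import Callable, Dict, Tuple


class StaleFallbackCache:
    """Cache that refreshes expired entries but keeps them as fallback."""

    def __init__(self, fetch: Callable[[str], str], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # raises an exception when the backend is down
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # path -> (stored_at, body)

    def get(self, path: str) -> str:
        entry = self._entries.get(path)
        now = self._clock()
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]           # fresh hit
        try:
            body = self._fetch(path)  # expired or missing: ask the backend
        except Exception:
            if entry is not None:
                return entry[1]       # backend down: serve the stale copy
            raise                     # never cached: nothing to fall back on
        self._entries[path] = (now, body)
        return body


state = {"up": True, "version": 1}  # simulated backend
now = [0.0]                         # simulated clock, in seconds


def fetch(path: str) -> str:
    if not state["up"]:
        raise ConnectionError("backend down")
    return f"{path} v{state['version']}"


cache = StaleFallbackCache(fetch, ttl=60.0, clock=lambda: now[0])
first = cache.get("/news")        # fetched from the backend
state["version"] = 2
now[0] = 30.0
still_fresh = cache.get("/news")  # within one minute: old copy
now[0] = 61.0
refreshed = cache.get("/news")    # expired: new copy fetched
state["up"] = False
now[0] = 200.0
stale = cache.get("/news")        # expired and backend down: stale copy served
```

A cache that drops expired entries would have nothing to serve in the last step, which is the difference the two columns describe.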

Multi Master MySQL
● multi-master
– cross replication
● each slaves the other
– any can be master
● hot failover and failback
● Gotcha: use only 1 master at a time
– Relstorage is not multi-master
– avoid replication errors
● mmm_agent server (not shown in diagram)
– monitors mysql health and replication
– manages virtual MySQL HA IP address
● think: Heartbeat for MySQL
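
The "only 1 master at a time" rule can be illustrated with a small sketch of the monitor's decision. This is not the mmm_agent's actual logic; the `Master` class, the `assign_writer` function and the rule of only promoting a master whose replication is healthy are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Master:
    name: str             # hypothetical server name
    alive: bool           # MySQL answers health checks
    replication_ok: bool  # replication from the other master is running


def assign_writer(current: Optional[str], masters: List[Master]) -> Optional[str]:
    """Pick the single master that holds the virtual write address."""
    by_name = {m.name: m for m in masters}
    holder = by_name.get(current) if current else None
    if holder is not None and holder.alive:
        return holder.name  # never move writes away from a working master
    for m in masters:
        if m.alive and m.replication_ok:
            return m.name   # promote a master that has caught up
    return None             # no safe candidate: refuse to promote


masters = [Master("db1", True, True), Master("db2", True, True)]
writer = assign_writer("db1", masters)                # db1 keeps the writes
masters[0].alive = False
after_failure = assign_writer(writer, masters)        # db2 is promoted
masters[0].alive = True
after_recovery = assign_writer(after_failure, masters)  # db2 stays the writer
```

Since all clients follow the one virtual address, they write to one master at any moment, even though either server can take the role.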

Plone as usual
file-based content delivery

Readonly Rescue Mode
● File-based content delivery
– mod_wodan
– full cache of all pages + resources
– cached search results (Subject / tag cloud)
● AJAX-driven graceful degradation
– detect backend down via non-cached lightweight view
● @@ipaddress: not a full page, minimal rendering overhead
– disable interactive elements via CSS
● search bar, personal tools → display:none
● Gotcha: anonymous only
– down for authenticated users until manual reconfig
● Gotcha: ErrorDocument
– pre-cache nice page but preserve HTTP error status code
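
The ErrorDocument gotcha can be illustrated with a small sketch: the friendly page replaces the response body, while the status line still reports the original error. This is not Apache or mod_wodan configuration; the function name and the page content are hypothetical.

```python
from http import HTTPStatus
from typing import List, Tuple

# Hypothetical pre-cached friendly error page.
FRIENDLY_ERROR_PAGE = "<html><body><h1>Sorry, please try again later</h1></body></html>"


def error_document(status_code: int) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Build an error response that keeps the original status code."""
    status = HTTPStatus(status_code)
    body = FRIENDLY_ERROR_PAGE.encode("utf-8")
    status_line = f"{status.value} {status.phrase}"  # not rewritten to 200
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    return status_line, headers, body


not_found = error_document(404)
unavailable = error_document(503)
```

Keeping the error status matters for crawlers and monitoring, which would otherwise treat the friendly page as real content.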
