Symfony2 for legacy app rejuvenation: the eZ Publish case study
1. It’s all about eXperience
Gaetano Giunta
Daniel Clements
The UK Symfony meetup, July 2015
3. Do you speak eeazee publish?
An Enterprise-grade Web Content Management System
Figures
•20K commits, first one in 2002
•3200 php files in default install
•Licensed both under GPL and commercially
•45K community members, 200 partners, 500+ Paying customers!
•Full-time employees!
•Not your average php project
( This is not Product Placement, I swear )
( And it’s not going to last more than 3 slides, I promise )
7. Do you speak eeazee publish? II
Features
•Flexible content model
• Versioning
• Translation
•Separation of content and presentation (templates & themes)
•WYSIWYG editor; in-site editing; import from Office documents
•Advanced search engine (based on Solr)
•Built-in webshop
•Workflow engine
•Fine-grained RBAC
•Cluster-mode for HA
•RSS feeds, SEO, multimedia, geolocation, and much, much more…
9. Do you speak eeazee publish? III
Patterns
• Entity-Attribute-Value
•ActiveRecord
•DBAL
• Supports MySQL, PostgreSQL, Oracle
•MVC
• Data can be fetched from templates, making it HMVC in practice
•Templates
• Bespoke language
10. If it works… Why change?
Existing codebase is 10 years old
High maintenance cost
• Started with no unit tests
• Layers and roles not properly defined / documented
OOP code written before PHP had:
• Private/protected/static
• Closures
• Namespaces
• Late static binding
• And much more
Not built for an Ajax and REST world
12. If it works… Why change?
Existing codebase is 10 years old
Widely deployed
• Well debugged
• Pitfalls have probably been uncovered by now
Proven to scale
Well known:
• Documentation improved over years
• Tutorials, forums, blogs, aggregators
• Active community of practitioners
• Official training courses and consulting
14. Making (good) decisions
The first rewrite phase
• Started beginning of 2011
• Focused on Content Engine
• Using in-house components (Apache Zeta Components)
The second rewrite phase
• Started beginning of 2012
• Base the CMS on an existing framework
15. Prerequisites - functional
• Durable Architecture
• API stability guaranteed
• Battle tested / not (only) the latest trend
• Should still be there in 10 years
• Speed and Scalability
• Lively Community
• Documentation available
• Evolving and getting bugfixes
16. Prerequisites - technical
• Simple Integration with existing API
• HMVC (Hierarchical Model View Controller) stack
• Decoupled Components
• Dependency Injection
• Good Template Engine
• Extensible, Open, Reliable (TM ;-)
21. Backwards compatibility
Product Management SCRUM Story:
«As an existing user, I don’t want to be pissed off by a new #@!$% version!»
• 100% Data Compatible (same DB schema)
• Routing fallback to legacy controllers
• Possibility to include legacy templates in the new ones
• Access legacy code from Symfony controllers
• Settings compatibility
• Bonus: access Symfony services from legacy modules
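The routing fallback can be sketched in plain PHP. This is only an illustration under stated assumptions, not the actual eZ Publish implementation: `FallbackRouter`, its route table and the string results are hypothetical names invented for the example. New routes are tried first; anything that does not match is handed to the legacy side.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: try the new routes first, fall back to legacy.
 */
final class FallbackRouter
{
    /** @var array<string, callable> exact path => new-stack controller */
    private array $newRoutes;

    /** @var callable handler standing in for the legacy kernel */
    private $legacyHandler;

    public function __construct(array $newRoutes, callable $legacyHandler)
    {
        $this->newRoutes = $newRoutes;
        $this->legacyHandler = $legacyHandler;
    }

    public function dispatch(string $path): string
    {
        if (isset($this->newRoutes[$path])) {
            return ($this->newRoutes[$path])();
        }

        // No new-stack route matched: hand the whole path to the legacy side.
        return ($this->legacyHandler)($path);
    }
}

$router = new FallbackRouter(
    ['/hello' => fn (): string => 'new: hello'],
    fn (string $path): string => 'legacy: ' . $path
);

$handledByNew = $router->dispatch('/hello');
$handledByLegacy = $router->dispatch('/some/legacy/module');
```

Because the fallback is the last step, existing URLs keep working while controllers are migrated one at a time.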
23. And while the devs are busy…
Things the boss has to account for
• having to maintain 2 codebases for a long time =>
• dev resources split =>
• not a lot of new features for a while
24. And while the devs are busy…
More things the boss has to account for
• Double infrastructure:
• Testing / Support / Packages system / …
• Training your support team / consulting team / sales team / partners
• Evangelization of the community
• Possibility of a fork of the legacy version ('hostile' community takeover)
• Customers wanting to pay A LOT to stay on old version and be supported longer
25. Making (good) decisions, part II
• Coming up with a perfect-out-of-the-box v2 means dedicating the whole company to the reimplementation
• By the time that version is ready, the market might have moved on to something else
Solution: take the Minimum Viable Product approach
• Deliver v2 before reaching feature parity
• Iterate quickly
• Downside: keep the v1 around for longer
• But you have to provide the compatibility layer anyway, right?
26. Making (good) decisions, part II
Product Management SCRUM Story:
«As an existing user, I don’t want to be pissed off by a new #@!$% version!»
Challenge accepted
31. Directory layout
New Kernel: a standard Symfony App
Legacy Kernel is isolated in a subdirectory*
* = symlinks still necessary for making assets available
33. Integrated plugin management
New “Kernel”: Bundles are installed via Composer + config editing
Legacy Kernel: Extensions are installed via tarball unzip (manual) + config editing
Solutions:
1. Write a custom Composer installer plugin for Legacy Extensions
2. Allow an eZP Bundle to include a Legacy Extension
36. The old Front Controller
Not the nicest code ever written
• All logic in the php file – no classes
• 1100 lines of code
• Sets up a huge amount of global variables
• Incorporates tons of logic
• module redirection / re-execution logic
• permission checking
• running the setup wizard if needed
• checking for pending database transactions
• etc…
• Not clearly structured in setup / execute / teardown phases
37. The new Front Controller
use Symfony\Component\HttpFoundation\Request;
require_once __DIR__ . '/../ezpublish/autoload.php'; // set up class autoloading
require_once __DIR__ . '/../ezpublish/EzPublishKernel.php';
$kernel = new EzPublishKernel( 'dev', true ); // extends the Sf Kernel
$kernel->loadClassCache(); // a method from parent
$request = Request::createFromGlobals();
$response = $kernel->handle( $request );
$response->send();
$kernel->terminate( $request, $response );
The Kernel extends the Symfony HttpKernel
• It adds configuration for the Service Container
• It allows registering bundles via registerBundles()
38. Refactoring
Previous index.php had to be refactored
• All logic moved into a class
• 20 lines of code in index.php!
• Better separation of setup and teardown of the environment from execution of application logic
=>
• Can now be used as sub-controller
• Or to run any legacy code from any New Kernel context
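Running legacy code from a new context can be sketched as a helper that wraps a callback in a "legacy execution environment". This is a minimal sketch, not the real eZ Publish API: the function name `runInLegacyEnvironment`, its parameters and the `legacyMode` global are assumptions made for illustration. It switches the working directory, exposes the requested globals, and restores both afterwards, even if the callback throws.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run $callback with the working directory and a set
 * of global variables prepared for legacy code, then restore the old state.
 */
function runInLegacyEnvironment(string $legacyRootDir, array $legacyGlobals, callable $callback)
{
    $previousDir = getcwd();
    $previousGlobals = [];

    chdir($legacyRootDir);
    foreach ($legacyGlobals as $name => $value) {
        // Remember whether the global existed, and its value, for the restore step.
        $existed = array_key_exists($name, $GLOBALS);
        $previousGlobals[$name] = [$existed, $existed ? $GLOBALS[$name] : null];
        $GLOBALS[$name] = $value;
    }

    try {
        return $callback();
    } finally {
        foreach ($previousGlobals as $name => [$existed, $value]) {
            if ($existed) {
                $GLOBALS[$name] = $value;
            } else {
                unset($GLOBALS[$name]);
            }
        }
        if ($previousDir !== false) {
            chdir($previousDir);
        }
    }
}

// Usage: the temp directory stands in for the legacy root directory.
$result = runInLegacyEnvironment(
    sys_get_temp_dir(),
    ['legacyMode' => true],
    function (): array {
        return ['cwd' => getcwd(), 'mode' => $GLOBALS['legacyMode']];
    }
);
```

Keeping setup and teardown in one place is what makes it safe to call the legacy code as a sub-controller: the caller never sees the changed directory or the extra globals.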
41. Legacy Routing
eZPublish 4 uses a custom MVC implementation
•Front controller: index.php
•Controllers are “plain php” files, properly declared
•Url syntax: http:// site / module / controller / parameters
•Parameters use a custom format instead of the query string
•Virtual aliases can be added on top
•For all content nodes, a “nice” alias is always generated by the system
• Good for SEO
Technical debt
•No DIC anywhere (registry pattern used)
•No nested controllers
•No provision for REST / AJAX
• Implemented ad-hoc in many plugins (code/functionality duplication)
•Policies are tied to controllers, not to the underlying content model
45. Legacy caches
eZ Publish 4 has a complicated (a.k.a. advanced) caching system
For viewing content, cache is generated on access, invalidated on editing
• TTL = infinite
• When a content item is edited, the cache is also invalidated for all related content
• Extra invalidation rules can be configured
• Can be set up to be pregenerated at editing time (tradeoff: editing speed)
• Cache keys include policies of current user, query string, custom session data
“Cache-blocks” can also be added anywhere in the templates
• Expiry rules can be set on each block, TTL-based or content-editing based
• Breaks the MVC principle
Most powerful AND misunderstood feature in the CMS
46. Legacy caches: up to eleven
Historically: built-in “full-page cache” (stores html on disk)
Currently deprecated, in favour of using a caching Reverse Proxy
• Performance is the same, if not better
• Delegate maintenance of part of the stack (Varnish, Squid)
Holy grail of caching: high TTL and support for PURGE command
1. When the RP requests a page from the server, it gets a high TTL => it caches the page forever
2. When the page changes, the server tells the RP to purge that URL from its cache
• Best reduction in number of requests to server while always showing fresh data
• Downside: extremely hard to cache pages for connected users
ESI support as well
• Hard to make efficient, as eZ cannot regenerate an ESI block without the full page
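The "high TTL plus PURGE" idea can be illustrated with a toy in-memory stand-in for a reverse proxy. This is a sketch under stated assumptions: `ToyReverseProxy` and the `/news` URL are hypothetical, and a real proxy such as Varnish does this over HTTP rather than through method calls.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory stand-in for a caching reverse proxy. */
final class ToyReverseProxy
{
    /** @var array<string, string> cached bodies keyed by URL (TTL treated as infinite) */
    private array $cache = [];

    /** @var callable backend that renders a page for a URL */
    private $backend;

    public int $backendHits = 0;

    public function __construct(callable $backend)
    {
        $this->backend = $backend;
    }

    public function get(string $url): string
    {
        if (!array_key_exists($url, $this->cache)) {
            // Cache miss: ask the backend once, then keep the page forever.
            $this->backendHits++;
            $this->cache[$url] = ($this->backend)($url);
        }

        return $this->cache[$url];
    }

    /** What an HTTP PURGE request would trigger on a real proxy. */
    public function purge(string $url): void
    {
        unset($this->cache[$url]);
    }
}

$content = ['/news' => 'version 1'];
$proxy = new ToyReverseProxy(function (string $url) use (&$content): string {
    return $content[$url];
});

$first = $proxy->get('/news');   // miss: goes to the backend
$second = $proxy->get('/news');  // hit: served from the proxy

$content['/news'] = 'version 2'; // an editor publishes a change...
$proxy->purge('/news');          // ...and the server purges that URL
$third = $proxy->get('/news');   // miss again, fresh content
```

The backend is only hit when a page is first requested or right after a purge, which is why this combination gives the best reduction in requests while still showing fresh data.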
48. Integration, or rather: Wholesale Replacement
Symfony has one of the nicest caching systems for web apps
Because it adopts the HTTP model
HTTP Expiration and Validation are used
• By setting caching headers on response object
Integrates with a Gateway Cache (a.k.a. Reverse Proxy)
• Native (built-in, php)
$kernel = new Kernel('prod', false);
$kernel = new HttpCache($kernel);
• External (Varnish, Squid, ...)
Native support for ESI
• Using {{ render_esi() }} in twig
50. REST API
eZ4 has an incomplete REST API
•Only functionality available: reading content
•Based on Zeta Components MVC component
=>
A new API has been implemented
•Full reading and writing of content is possible
•All “dictionary” data is also available
•Content-type for response can be JSON or XML (with an XSD!)
•Fully RESTful
• Usage of all HTTP verbs (and then some: PATCH)
• Respects the HTTP headers of the request (e.g. “Accept”)
• HATEOAS: uses URLs as resource IDs
•No separate request handling framework needed: pure Symfony routing
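Respecting the “Accept” header can be sketched as a small format negotiation function. This is a deliberately simplified illustration, not the eZ Publish REST implementation: `negotiateFormat`, the default format and the `application/vnd.example.Content+json` media type are assumptions, and q-values and wildcards are ignored.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick 'json' or 'xml' from an Accept header.
 * Simplified on purpose: q-values and wildcards are ignored, and the
 * first recognised media type wins.
 */
function negotiateFormat(string $acceptHeader, string $default = 'xml'): string
{
    foreach (explode(',', $acceptHeader) as $part) {
        // Drop parameters such as ";q=0.8" and surrounding whitespace.
        $mediaType = strtolower(trim(explode(';', $part)[0]));

        if ($mediaType === 'application/json' || substr($mediaType, -5) === '+json') {
            return 'json';
        }
        if ($mediaType === 'application/xml' || substr($mediaType, -4) === '+xml') {
            return 'xml';
        }
    }

    return $default;
}

$vendorJson = negotiateFormat('application/vnd.example.Content+json');
$plainXml = negotiateFormat('text/html, application/xml;q=0.9');
$fallback = negotiateFormat('text/plain');
```

A production implementation would also honour q-values; the point here is only that the response format is driven by the request headers rather than by the URL.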
51. Database Access Layer
• eZPublish 4 comes with its own DBAL
• The Content Engine was rewritten on top of Zeta Components Database (before Symfony adoption)
• Dwindling development
• Shaky support for Oracle, MS SQL Server, etc
• Most Symfony apps use Doctrine
• Fast enough (not the ORM)
• Good support for many databases
=>
• Adopt Doctrine
• Write a stub layer implementing Zeta DB API on top of it
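The stub layer is an instance of the adapter pattern: existing code keeps calling the old interface, while the work is delegated to the newly adopted library. The sketch below assumes nothing about the real Zeta Components or Doctrine APIs; `OldDbHandler`, `NewConnection`, their methods and the `content` table are hypothetical stand-ins.

```php
<?php
declare(strict_types=1);

/** Hypothetical: the API that existing code was written against. */
interface OldDbHandler
{
    public function arrayQuery(string $sql, array $params = []): array;
}

/** Hypothetical: the API offered by the newly adopted library. */
interface NewConnection
{
    public function fetchRows(string $sql, array $params): array;
}

/** The stub layer: old interface on the outside, new library on the inside. */
final class OldHandlerOnNewConnection implements OldDbHandler
{
    private NewConnection $connection;

    public function __construct(NewConnection $connection)
    {
        $this->connection = $connection;
    }

    public function arrayQuery(string $sql, array $params = []): array
    {
        // Pure delegation: callers never learn that the backend changed.
        return $this->connection->fetchRows($sql, $params);
    }
}

/** In-memory fake so the sketch runs without a database. */
final class FakeConnection implements NewConnection
{
    /** @var string[] SQL statements received, for inspection */
    public array $log = [];

    public function fetchRows(string $sql, array $params): array
    {
        $this->log[] = $sql;

        return [['id' => $params['id'] ?? null, 'name' => 'demo']];
    }
}

$fake = new FakeConnection();
$handler = new OldHandlerOnNewConnection($fake);
$rows = $handler->arrayQuery('SELECT id, name FROM content WHERE id = :id', ['id' => 42]);
```

The benefit of this design is that code written against the old API does not have to be ported all at once.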
52. Improving performance
• The queries generated by eZPublish 4 have been hand-tweaked for years
• The new Content Engine not so much
• Designed to run on non-sql storage
• Many more layers of separation
• Generated queries are suboptimal
=>
• Introduce a Content Cache
• Built using Stash (www.stashphp.com)
• Multiple storage backends
• Fully tested
54. Templating
• eZPublish 4 comes with its own template language
• Syntax similar to Smarty
• Compiler has very limited capabilities
• Support in IDEs is spotty
• Symfony comes with the Twig template engine
• Fast enough
• Good way to support inheritance
• Extensible
=>
• Adopt Twig
• Implement custom helpers (filters / tags / functions)
55. Managing Assets
• eZPublish 4 comes with its own assets plugin
• Minifies and compresses CSS, JS
• Doubles as core plugin for all things AJAX
• Hard dependency on jQuery and YUI versions
• No support for LESS, SASS, etc…
=>
• Adopt Assetic
• Currently working on: support for multiple site themes
56. Community Bundles
Replacing - and improving - existing functionality
• CSRF tokens: Symfony
• Reverse Proxy integration: FOSHttpCacheBundle
• Image manipulation: LiipImagineBundle
• Pagination: WhiteOctoberPagerFanta
• Menu building: KnpMenuBundle
• Breadcrumbs: BreadcrumbsBundle
• IO/storage: League Flysystem
57. Testing
58
eZ4 has an incomplete test suite
• No unit testing
• Selenium for functional testing
• Jenkins as CI server
• Manual labour still involved
• High maintenance cost
=>
• PHPUnit for unit testing
• 100% API Coverage is the goal
• Behat for all the rest
• Travis for good platform coverage
59. The story so far
• First release: November 2012
• 8 major releases
• Legacy Kernel not included anymore since 2015
• New modern Administration UI in beta
• Will be launched as eZ Platform this fall
• Good collaboration with the Symfony ecosystem
60. We’ll be here all night
THANK YOU
<link to slideshare>
• @gggeek
• @declemo
• www.kaliop.co.uk
• www.ez.no
• share.ez.no
• doc.ez.no
• github.com/ezsystems
Why so many slides instead of deep dive into code? 1: Sf docs are very good; 2: developers like to focus on code, sometimes they miss the big picture
What do paying customers want? Two things: features and stability. Regardless of any contradictions.
And that is Risky Business
Enterprise icecream: everything AND the kitchen sink
Pretty advanced concepts for the time it was designed
This is what people think about when you say “old code”
But what are the real problems?
Also:
Hard to use it as content repository (integrating it where it is not the frontend app)
Multiple caching layers implementations
Multiple rest-ish implementations
In one word: solid (and oldie)
At the same time, it is hard to onboard new developers (island syndrome)
Battle tested == good security track record
The first point rules out any other development language than PHP, really.
Funnily enough, the choice to stay with PHP was challenged as recently as this year
Home made: Why would we do that? So much work for what? Making the same mistakes as in the past, just because otherwise it would be «not invented here»? No
Zeta Components: eZ has a long history with them. Back in 2008-2009, their destiny was to become the next generation of eZ Publish. For several reasons it didn’t happen. And to be pragmatic, it would have been a lot of work to adapt them to work with DI or HMVC
ZF2: Still immature at the time
Then Symfony2 looked like an obvious and reasonable choice. Furthermore, it is heavily used, has a very active and friendly community, and is easy to learn. Let’s do it!
eZ Publish 3 business case (2003): a major change. eZ Publish lost two thirds of its users, community members... because there was no BC at all
For reference: eZP came with 2 releases per year, supported for 3 years each. That makes 6 versions in support at any given time
More layers on the right: enforcing stricter separation
Note that we use ‘ezpublish’ directory instead of ‘app’. This was discussed with and approved by SensioLabs
The Legacy Kernel had a clear separation of directory structure between the Core and the Extensions. This turned out to be unnecessary and a hurdle for newcomers: one more thing to learn
Keeps our friend Jordi Boggiano happy
This is slightly simplified, the actual code has a few more lines for dealing with caches, environments and reverse proxies
The closure is needed to set up the ‘legacy execution environment’: moving the execution to a different run directory, and setting up all the global variables (there are many of those)
Compatibility requirement: be able to execute legacy modules
Bonus: added collaboration between projects
Keeping Lukas Kahwe Smith happy
If all of this sounds weird, it is because it is weird
Bonus points: a client for the REST API, implements the same interfaces exposed by the local PHP API – network transparency!!!
And of course, the sql queries are getting improved as well in the meantime