REST In Action: The Live Coverage Platform at the New York Times
WordCamp US
Philadelphia, PA - December 4, 2015
Scott Taylor
• Core Developer, WordPress
Release Lead for 4.4
• Sr. Software Engineer, The
New York Times
• @wonderboymusic on Twitter/
Instagram/Swarm et al
• I like Music, NYC, and
Mexican food
WordPress at the NYT now
• some “legacy” blogs
• Lens - Photography blog
• First Draft
• some internal corporate sites
• NYT Co.
• Women of the World
• Times Journeys
• The Live Coverage platform
• some forthcoming
International projects
WordPress 4.4 Lead
• REST API (Phase 1)
• Term Meta
• Responsive Images
• WordPress as oEmbed Provider
• Tons of under the hood stuff
Blogs at the NYTimes
• NYT used WordPress very early
• NYT was an early investor in Automattic
• Multisite
• ~80 blogs at the height of blog mania - the 00s
were the glory days for Blogs and WordPress
• Many blogs used Live Blogging
When I arrived:
Legacy Blogs Codebase
• Separate from the rest of the NYT’s PHP codebase
• Global NYTimes CSS and JS, CSS for all Blogs,
custom CSS per-blog
• A universe that assumed jQuery AND Prototype
were loaded on every page in global scope
• Challenging amounts of what could generously be
called “technical debt”
Things Like…
• Inline HTML from 2008 that assumes Prototype will
still be a thing in 2015, stored in post_content
• Widgets and inline code that add their own version
of jQuery/Prototype, because YOLO
• Even better: widgets/modules from other teams that
use a different version of jQuery … at times there
could be 4 jQuerys on the page (and 4 different
versions at that)
No shared modules
• Code/HTML markup can get out of sync with other
projects regularly: header, footer, navigation
• The CSS and JS files were split across multiple
SVN repos - changes to global assets can affect us
without us knowing. Fixing the code requires
scouring through multiple repos.
At the NYT
• No WordPress Comments: There is an entire team that deals
with these for the site globally, in a different system called CRNR
• No Media: There is another CMS at the Times, Scoop, which
stores the images, videos, slideshows, etc
• WordPress native post-locking: This only landed in WordPress
core in version 3.6 (we have yet to reconcile the differences)
• There is a layer for Bylines, which is separate from Users: Our
users are employees authenticated via LDAP, most post authors
don’t actually enter the content themselves
NYT5: The New Frontier
My arrival at the New York Times coincided with
the NYT5 project, already in progress
NYT5
• Development requires Vagrant environment
• “apps” are Git repos that require Grunt to transpile the
codebase - you can’t run your repo as a website: it has
to be built
• Impossible to create a “theme” this time with shared JS
and CSS. CSS is SASS build, JS is Require build - both
dynamically built for each app.
• PHP has Composer dependencies and uses
namespaces - the directories are expanded via Grunt in
accordance with PSR-0 Autoloading Standard
require( ['jquery'], function ($) {
    $('#cool-link').click(...);
} );
require( ['jquery/1.9'], function ($) {
    $('#cool-link').click(...);
} );
require( ['jquery/2.0'], function ($) {
    $('#cool-link').click(...);
} );
Require.js fixed the jQuery problem
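Why versioned module IDs help can be shown with a toy registry. This is a minimal sketch, not the RequireJS implementation: `defineModule` and `requireModules` are hypothetical stand-ins for `define()` and `require()`, and the two "jQuery" objects are placeholders. The point is only that each caller asks for a version by name instead of trusting whatever `$` is in global scope.

```javascript
// Toy named-module registry; NOT how RequireJS works internally.
// defineModule/requireModules are hypothetical stand-ins for define()/require().
const registry = new Map();

function defineModule(id, factory) {
  registry.set(id, { factory, instance: undefined, loaded: false });
}

function requireModules(ids, callback) {
  const resolved = ids.map((id) => {
    const entry = registry.get(id);
    if (!entry) {
      throw new Error('Module not defined: ' + id);
    }
    if (!entry.loaded) {
      entry.instance = entry.factory(); // each factory runs only once
      entry.loaded = true;
    }
    return entry.instance;
  });
  return callback(...resolved);
}

// Placeholder objects registered under versioned IDs (assumption for illustration).
defineModule('jquery/1.9', () => ({ version: '1.9' }));
defineModule('jquery/2.0', () => ({ version: '2.0' }));

// Each caller gets exactly the version it asked for; nothing leaks into globals.
const legacyVersion = requireModules(['jquery/1.9'], ($) => $.version);
const modernVersion = requireModules(['jquery/2.0'], ($) => $.version);
```

Two widgets that need different versions can now share a page, because neither one overwrites a global `$`.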
NYT5 Dealbreakers
• We can’t just point at WordPress on every request
and have our code figure out routing. Routing
happens in Apache in NYT5 - most requests get
piped to app.php
• Because PHP Namespaces are used, WP has to
load early and outside of them (global scope)
• On the frontend, WP cannot exit prematurely before
hitting the framework, which returns the response
to the server via Symfony HttpFoundation
NYT5 Advantages
• “shared” modules - we inherit the “shell” of the page,
which includes: navigation, footer, login, etc.
• our nyt5 theme doesn’t need to produce an entire
HTML document, just the “content” portion
• With WP in global scope, all of its code is available even
when we hit the MVC parts of the NYT5 framework.
• WP output is captured via an output buffer on load - it’s
accessible downstream when the app logic is running.
$wp_query = new WP_Query();
$GLOBALS['wp_query'] = ...
function wp_thing() {
    global $wp_query;
    ...
}
GLOBALS!!!
Overall: Bad News for Blogs
• Blogs were duplicating Section Fronts, Columns:
Mark Bittman has a column in the paper.
The column also exists on the web as an article.
He contributes to the Diner’s Journal blog.
There is a section front for dining.
He also has his own NYTimes blog. Why?
• Blogs and WordPress were combined in everyone’s
mind. So whenever WordPress was mentioned as a
solution for anything, the response was: aren’t blogs
going away? #dark
2008: Live Blogs at the Times
• A Blog would create a post and check “Start Live
Blogging”
• the updates related to the post were stored in custom
tables in the database
• the APIs for interacting with these tables duplicated tons
of WordPress functionality
• Custom Post Types didn’t exist until WordPress 3.0 (June
2010) - the NYT code was never rewritten to leverage
them (would have required porting the content as well)
Live (actual) Blogs:
Dashboards/Dashblogs
• A Live Blog would be its own blog in the network, its own
set of tables
• A special dashboard theme that had hooks to add
custom JS/CSS for each individual blog, without baking
them into the theme
• Making an entirely new site in the network for a 4-hour
event is overkill
• For every 10 or so new blogs that are added, you are
adding 100 new database tables - gross!
What if…
• Instead of custom tables and
dupe’d API code, new object
types: events and updates!
• To create a new “Live Blog”: create
an event, then go to a Backbone-
powered screen to add updates
• If WP isn’t desired for the front end,
it could be the backend for
anything that wants a JSON feed
for live event data
• Using custom post types, building
a Live Event UI that looks like the
NYT5 theme would be nominal
What we did
• Built an admin interface with Backbone to quickly
produce content - which in turn could be read from
JSON feeds
• When saving, the updates post into a service we
have called Invisible City (wraps Redis/Pusher)
• Our first real foray into using the REST API
• Our plan was just to be an admin to produce data
via self-service URLs
Live Events, the new Live Blogs:
Complete Rewrite of 2008 code
• nytimes.com/live/{event} and
nytimes.com/live/{event}/{update}
• Brand new admin interface: Backbone app that uses the
REST API. Constantly updated filterable stream -
Backbone collections that re-fetch on Heartbeat tick
• Custom REST endpoints that handle processes that need
to happen on save
• Front end served by WordPress for SEO, but data is
received by web socket from Invisible City and rendered
via React
Guess What?
• Would rather use Docker instead of Vagrant
• PSR-0 is now PSR-4
• Grunt is now eschewed in favor of Gulp
• RequireJS is ok, but I’d rather use Browserify
• PHP is cool, but why don’t we use Node and
React?
Interactive News Team
• Mostly independent
• Mostly special News projects
• Mostly use whatever language you want
• Mostly does not use NYT5
• Is open to using WordPress, but just as often, does
not
nytimes.com/live/{event}
Request is served by WordPress,
PHP generates markup
React wraps the "posts" area
JS listens to Web Socket
Updates are added on the backend (OR via SLACK!)
React updates the content
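The last two steps of this flow (a socket message arrives, the rendered list changes) can be sketched as a pure function that the React layer would call before re-rendering. This is an illustration under stated assumptions, not the NYT code: the message shape `{ type, update }` and the three message types are invented here, and the socket and React wiring are left out.

```javascript
// Sketch: apply one incoming socket message to the current list of updates.
// The message shape ({ type, update }) and the type names are assumptions.
function applyMessage(updates, message) {
  const { type, update } = message;
  switch (type) {
    case 'created':
      // Skip duplicates, e.g. an update already present in the PHP-rendered markup.
      if (updates.some((u) => u.id === update.id)) {
        return updates;
      }
      return [update, ...updates]; // newest first
    case 'updated':
      return updates.map((u) => (u.id === update.id ? { ...u, ...update } : u));
    case 'deleted':
      return updates.filter((u) => u.id !== update.id);
    default:
      return updates; // unknown message types leave the list untouched
  }
}

// Initial state stands in for the updates PHP rendered on page load.
let state = [{ id: 1, text: 'Doors open' }];
state = applyMessage(state, { type: 'created', update: { id: 2, text: 'Speaker on stage' } });
state = applyMessage(state, { type: 'updated', update: { id: 1, text: 'Doors open at 9' } });
```

Keeping this step free of side effects means the same function works whether an update came from the admin or from Slack.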
Most plugins only handle POST
• WP-API and Backbone speak REST
• REST will send you requests via
GET, PUT, DELETE, POST and friends
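$hook = add_menu_page( ... );
add_action( "load-$hook", 'callback' );

// Old: bails on anything but POST, so PUT and DELETE requests are ignored.
function old_custom_load() {
    if ( 'POST' !== $_SERVER['REQUEST_METHOD'] ) {
        return;
    }
    ...
}

// New: bails only on GET, so POST, PUT, DELETE and friends are all handled.
function new_custom_load() {
    if ( 'GET' === $_SERVER['REQUEST_METHOD'] ) {
        return;
    }
    ...
}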
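Seen from the client side, the problem is that Backbone does not only send POST. The sketch below reproduces the default mapping `Backbone.sync` uses from CRUD operation to HTTP verb, then checks which operations a POST-only handler would silently drop. The two helper functions are hypothetical mirrors of the PHP guards above, not part of Backbone or WordPress.

```javascript
// Default operation-to-verb mapping used by Backbone.sync.
const methodMap = {
  create: 'POST',
  read: 'GET',
  update: 'PUT',
  patch: 'PATCH',
  delete: 'DELETE',
};

// Hypothetical mirror of old_custom_load(): only POST gets through.
function isPostOnly(method) {
  return method.toUpperCase() === 'POST';
}

// Hypothetical mirror of new_custom_load(): everything except GET gets through.
function isWriteRequest(method) {
  return method.toUpperCase() !== 'GET';
}

// Operations that write data but would be ignored by a POST-only handler.
const missedByPostOnly = Object.keys(methodMap).filter(
  (operation) => isWriteRequest(methodMap[operation]) && !isPostOnly(methodMap[operation])
);
```

Here `missedByPostOnly` contains `update`, `patch`, and `delete`: every model save after the first, and every removal.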
WordPress becomes
a web service
• The monolithic mindset needs to shift toward making
WordPress a bare-metal service provider
• The serving of requests should be loosely coupled
from objects like WP_Query
• WordPress needs to become supportive of
concurrency
Custom JSON Endpoints for GET
• We do not hit these endpoints on the front-end
• We have a storage mount that is fronted via Varnish
and Akamai
• JSON feeds can show up on the homepage of the
NYT to dynamically render “promos” - these have
to massively scale
HTTP is time-consuming
• It is easy to lose track of how many things are
happening on the 'save_post' hook
• Admin needs to be fast
• The front end is typically cached, but page generation
shouldn’t be bogged down by HTTP requests
• Anything that is time-consuming should be
offloaded to a separate “process” or request whose
response you don’t need to handle
Custom REST Endpoints for POST
• Use fire-and-forget technique on 'save_post',
instead of waiting for responses inline. You can still
log/handle/re-try responses in the separate request.
• Most things that happen on 'save_post' only
need to know $post_id for context, the endpoint
handler can call get_post() from there
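The shape of fire-and-forget is the same in any language, so here is a small JavaScript sketch of it. In the talk this happens in PHP on 'save_post'; everything below is an illustration under assumptions. `send` is an injected, hypothetical transport that returns a Promise, `logFailure` is a hypothetical logger, and the URL is a placeholder.

```javascript
// Sketch of fire-and-forget: start the request, return without waiting.
// `send`, `logFailure`, and the URL are hypothetical (assumptions for illustration).
function notifyOnSave(postId, send, logFailure) {
  // Only the ID travels; the receiving handler can look the post up itself.
  send('/hypothetical/on-save', { post_id: postId })
    .catch((error) => logFailure(postId, error)); // failures are handled off the save path
  // No await and no return value: the caller is never blocked on the response.
}

// Usage with a fake transport whose response never arrives.
const sent = [];
const neverResolves = () => new Promise(() => {});
const result = notifyOnSave(
  42,
  (url, body) => {
    sent.push({ url, body });
    return neverResolves();
  },
  () => {}
);
```

Even though the fake response never arrives, `notifyOnSave` returns at once, which is the property that keeps the admin fast.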