2. Moneyball for Performance Metrics
Jeff Lembeck
@jefflembeck
npm, inc.
Hey everybody, my name is Jeff Lembeck and I’m a web developer over at npm.
I write, as you might imagine a whole heck of a lot of JavaScript, and a lot of CSS too.
In my free time though — I’m mostly into sports, and the sport that’s had the biggest hooks in me since I was just a little kid is baseball.
3. Now, baseball is an interesting sport for me because I grew up near Seattle - which makes my team the Seattle Mariners.
4. This is not a great team to be a fan of if you enjoy watching the sport, as they are, historically, one of the worst teams of all time.
It’s not just that they’ve never won the World Series, which is the US’s baseball championship, but they’ve never even been to it. It’s heartbreaking.
Year after year. *click*
5. Now, while I _could_ spend my time up on stage going over the many times I’ve been let down, as an adult and as a child, by my local baseball
team - I’m not gonna do that. Instead, I’ll talk about some of the great baseball that I have been able to see, which in this case, of course, was
by another team and, well, that’s what brings me here today
6. Let’s talk about our division rivals, the Oakland A’s.
In 2002, a man named Billy Beane was the General Manager for the Oakland Athletics, a professional baseball team based in Oakland, California.
Oakland has a disadvantage, as far as teams go, of being a so-called small-market team.
7. 💸 😢
This means that the team, normally due to location, doesn’t have as large a fanbase and doesn’t make as much money as some of the
bigger teams.
They, in turn, don’t have as much money to spend on big-name players.
8. Now, in baseball, the General Manager of a team controls the contracts/hiring/firing of players.
Since he was the GM of a small market team, Billy Beane had the difficult challenge of attracting the most talented players while not being able to pay them
as much as a popular team, say — the New York Yankees, could.
9. Side note, no matter how little or how much you know about the sport of baseball, if you’re going to take anything away from this talk, let it be that the
Yankees suck.
10. hits
runs batted in
batting average
Fortunately for the A’s, Billy came up with a plan.
He decided that the traditional ways of measuring the quality of a player did not paint the entire picture
and were not helpful for building a winning baseball team, especially in the case of a team that couldn’t afford to pay the biggest players the most
money. *continue click*
11. OPS = (AB × (H + BB + HBP) + TB × (AB + BB + SF + HBP)) / (AB × (AB + BB + SF + HBP))
hits
runs batted in
batting average
Billy, instead, used *click* newer aggregated statistics and formulas to put together a list of players who *click*, when measured against these new metrics,
*click*became far more valuable than their contracts showed.
This allowed Billy to get players to help him win for far less money than normal.
12. The strategy ended up being very successful and brought the A’s to the playoffs multiple years in a row, competing on the same level as teams who
spent more than double the money. This new strategy spread throughout the league and became famous enough to spawn a book, and eventually a movie
was made about it, starring Brad Pitt as Billy, which - if Brad Pitt plays you in a movie, I think you did pretty OK.
13. So, we’re here at devfest.asia - on the CSSConf day, and this guy up on stage is blabbing about some baseball strategy put into action 13 years ago. What
gives? Well, I really like baseball, so - there’s that, but also, I think that Billy’s ideas can be applied to all sorts of other fields - traditional tactics for
measurement need to be re-analyzed from time to time and tested against new metrics. *CLICK - CONT*
14. I think this is especially true for one of my other great interests in life, web performance. We’ve spent a very long time focusing on a few key indicators to
tell us how fast our sites are, but it’s become clear lately that they barely paint half the picture.
15. so picture me like I’m the web version of Billy Beane - which should be very easy for you to do, especially if you’re really really far in the back - and let’s
talk about web performance.
16. slow websites lose
To find out where we can start, we have to know what we’re up against. We have to know and understand the enemies. And the enemies, in this case, are
the things that make up a slow-ass website, because slow-ass websites lose.
So, what are we up against? Let’s take a look at what hell looks like for a web developer.
18. But seriously, Android devices get a lot of heat for lagging on performance, as they should. Especially on the JavaScript end, as they doubly should. But it’s
not just Android that hurts out here, it’s all of them.
19. We have these little computers in our pockets. And there are a whole heck of a lot of them. And they’ve taken over. This chart is the data (and voice) usage
for the past 5 years.
20. The overall growth of mobile device use for browsing isn’t something new, heck - Responsive Design has been “the way” for around five years now. In
2013, 21% of all cell phone owners used their phone as their primary device for internet access. This number has only been rising.
21. And we don’t just assume they’ll do things with their devices while they’re on the go, we know they’ll do basically anything on them: dog-sitting, dating,
making terrible comments on youtube, buying food, buying a car, buying a house? *click - cont*
22. So we have these devices and we’re stuck with them. We have the knowledge that they’re going to be used everywhere, consistently, for some generally
weird stuff, but ya know - they’re super convenient, who cares if they’re fast?
23. people expect mobile to be fast
It turns out, basically everybody.
People expect mobile to be _fast_
24. Etsy: +160KB of images, +12% bounce rate
Edmunds: -77% load time, +20% page views, -4% bounce rate, 3% ad impression variance
Etsy increased the weight of their images by 160KB and got a 12% higher bounce rate.
Edmunds dropped their load time by 77% and well… look.
25. And, as you might have experienced, getting your site to be fast on mobile is difficult. Mobile traffic isn’t by default very fast. Latency on a bad network can
bite you extremely hard and it’s rarely the case that somebody has access to a network where latency isn’t an issue.
26. what do you mean by “win” and “lose”?
And that’s where winning and losing comes into play. And what do I mean by that? Let’s talk numbers.
27. Etam: load time from 1.2s to 500ms
conversions up 20%
time on site up 21%
pages viewed per visit up 28%
I could do this all day
28. Walmart:
every 1s of load time improvement: conversions up 2%
every 100ms of load time improvement: revenue up 1%
I could do this all day
29. Obama for America: load time down 60%, conversions up 14%
I could do this all day
30. Removing one client-side redirect from Google's DoubleClick
resulted in a 12% improvement in click-through rate.
No seriously, we could be here for a while. There are a plethora of performance-related stories out there for you to convince the
money-holders in your company that you need to work on this stuff
*click*
31. A one second delay for Bing, turns into a 2.8% drop in revenue. A
two second delay results in 4.3% drop.
Removing one client-side redirect from Google's DoubleClick
resulted in a 12% improvement in click-through rate.
Mozilla cut load time by 2.2 seconds and saw download
conversions increase by 15.4%
Amazon sees a 1% decrease in revenue for every 100ms
increase in load time.
Click, click, click
32. But despite all of the knowledge we have about the benefits of faster pages -
features, frameworks, design, etc. are bloating up our sites. The average size is now somewhere around 2.14MB, which is a 12.7% growth over
last year.
33. so how?
So we have increasing use of underpowered devices on shaky networks, and those users are being delivered bigger websites all of the time.
These same users are growing less and less patient over time with how slow our websites are. How are we supposed to make a good
experience happen?
34. measure things
My favorite way to handle problems is to find definitive ways to measure them, and then focus on improving those measurements. We need to
find out what we want and find different ways of gathering quantitative values by which we can solve this problem.
35. Big warning about this. Just because something is difficult to measure, does not mean it should be disregarded. If you find something nearly
impossible to measure, keep it in mind at all times. Try to approach it from other angles, make it part of other measurements if it can’t be
broken out yet.
36. Daniel Yankelovich had a great quote about this: The first step is to measure whatever can be easily measured. This is OK as far as it goes. *pause* The second
step is to disregard that which can't be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. *pause*The third step is to
presume that what can't be measured easily really isn't important. This is blindness. *pause* The fourth step is to say that what can't be easily measured really
doesn't exist. This is [gross negligence].
39. DOM Complete
<html>
  <head>
    <script>
      // Record a start timestamp as early as possible.
      var time = new Date();
      // DOMContentLoaded fires once the DOM tree has been built.
      window.addEventListener("DOMContentLoaded", function(e){
        var now = new Date();
        console.log(now - time);
      });
    </script>
  </head>
  <body>
    <img src="./office-southeast-1000.jpg"/>
  </body>
</html>
DOM Complete is when the document object model tree has been completely built - measured here with the DOMContentLoaded event. This is frequently known as the point in time at which you can query for
elements.
41. Onload
<html>
  <head>
    <script>
      // Record a start timestamp as early as possible.
      var time = new Date();
      // The load event fires once every asset has finished downloading.
      window.addEventListener("load", function(e){
        var now = new Date();
        console.log(now - time);
      });
    </script>
  </head>
  <body>
    <img src="./office-southeast-1000.jpg"/>
  </body>
</html>
Onload is the point in time at which every asset referenced by the page has loaded.
45. request/response
New Relic
Calibre
Skylight
Request/response timing is the amount of time from when your server receives the request until it responds - fully encapsulated within the
server, with no latency taken into account.
There are plenty of options available for backend measurement and I’ve had a good experience with these.
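Tools like these hook in at a much deeper level, but the core idea can be sketched in a few lines of Node. The wrapper function and the log format here are illustrative, not any particular product’s API:

```javascript
// A minimal sketch of server-side request/response timing in Node.js:
// time from the moment the handler is invoked to the moment the
// response has been fully written.
function withTiming(handler) {
  return function (req, res) {
    var start = process.hrtime();
    // "finish" fires when the response has been handed to the OS.
    res.on("finish", function () {
      var diff = process.hrtime(start);
      var ms = diff[0] * 1e3 + diff[1] / 1e6;
      console.log(req.url + " served in " + ms.toFixed(1) + "ms");
    });
    handler(req, res);
  };
}
```

Wrapping your root handler with this before passing it to `http.createServer` would log one duration line per response.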
46. DomComplete: 19ms
Onload: 2.4s
Page Weight: 1.4MB
Request/Response: 243ms
These measurements, combined, can paint part of a picture for us, but if you only pay attention to them, you’re missing out on crucial pieces of performance and
this can absolutely sink you. Don’t get me wrong, these metrics are useful and I actually pay attention to them, but they are just part of what we’re looking for
when we’re trying to measure speed. *click cont*
47. DomComplete: 19ms
Onload: 2.4s
Page Weight: 1.4MB
Request/Response: 243ms
So, what’s the new way? What’s the new strategy? How do we fill in the blank spaces that our traditional measurements leave behind? How do we find the best
way to give our users what they want in the way we want to give it to them as quickly as possible? Well, the answer to that is… complicated.
48. first usable time
We need to focus on First Usable Time
If, instead of monitoring how long it takes for an entire page to load, we measure how long it takes for the user to use the page for what they want, we
can get a more accurate gauge of general usability.
49. *click* Because it’s incredibly frustrating to get to a page that clearly has all of the content downloaded, but the text is blank until the font loads. This is the NY
Times - yesterday, on Chrome
- And it’s incredibly frustrating to get to a page that looks visually complete, but has so many different scripts on it, that you can’t scroll it.
- So what kinds of things are people using to find out if their site is usable?
50. The most popular measurement right now is Speed Index.
Segue: Speed Index was invented by the fine folks who bring you webpagetest, a fantastic tool that allows you to see video strips of your site and how it loads.
You can break it down to the 10th of a second and for those of us that will nerd out for days, you can roll through and really see how the browser puts your page
together.
51. It’s a fantastic tool and I strongly recommend using it, and maybe even buying the book about using it.
52. Anyway, the Speed Index metric is based upon visual completeness and how quickly your site can get there.
So, let’s talk about the formula
It’s the integral, from 0 to the end of loading (in ms), of (1 - VC(t)/100), where VC(t) is the percentage of visual completeness at time t. The lower the number, the better.
53. This chart gives a good indicator of what is meant by the formula. Percentage of visual completeness is on the y-axis, time is on the x -> where the visual
incompleteness is indicated in the shaded area, so you can see it approaching zero
54. This gives you something measurable and you can use the webpagetest api to run several tests against your page and return median results, which is something
you can use as a benchmark to make sure you’re not having serious performance regressions.
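Fetching results from the webpagetest API is out of scope here, but once you have Speed Index values from several runs, taking the median is simple. A small sketch (the run results are made up):

```javascript
// Given Speed Index values from several WebPageTest runs against the
// same page, report the median as your benchmark number.
function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  // Even number of runs: average the two middle values.
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// e.g. Speed Index results from five runs:
median([2900, 3100, 2700, 3400, 3000]); //=> 3000
```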
55. Speed Index isn’t brand new but it’s becoming accepted as another reliable data point to track. Heck, it’s not just accepted, it’s suggested by Google and is a fan
favorite amongst the performance crowd.
56. !
*click to get !* So this is a great data point to add to your repertoire. It’s easy for you to measure, and it gives you a legitimate target to optimize for. But what about
when it doesn’t quite capture what you need? What if its detector for visual completeness is way off? What else can we measure?
57. render blocking
How about the amount of time spent blocking rendering? Lowering this could be the first key to making sure your users’ browsers are able to start
rendering the page as soon as they can.
59. find the files that block
How? Start by finding files that block rendering. These include any CSS on the page and also any JavaScript that exists before the content. (cont)
60. find the files that block
Once you’ve found these, you can use the network tab in your devtools to read the total time you spent downloading each of those files.
61. But that might be difficult for automation, so let’s have PhantomJS do it! Did you know that you can use PhantomJS to write a HAR file?
62. HAR file?
Side note: HAR file stands for HTTP ARchive file. They can be used to demonstrate the network traffic and assets downloaded when visiting a page, just
like what the network tab will give you.
OK, back to phantom -
*click* By timing each asset’s request/response cycle, including start time, end time, and size of the files, you can do exactly what your network tab does. In
this case, I ran a script that created a HAR file, *double-click* which is data in JSON format, *click* and then opened it in Charles to inspect.
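Once you have a HAR file, mining it doesn’t require Charles or a network tab at all. As a rough sketch - the HAR below is a hand-made sample using the standard `log.entries` shape, and the “render-blocking” test is just a file-extension check:

```javascript
// Sum the download time of render-blocking assets from a HAR file
// (e.g. one written by PhantomJS). A HAR is just JSON: each entry in
// log.entries has a request.url and a total time in milliseconds.
function blockingTime(har, isBlocking) {
  return har.log.entries
    .filter(function (entry) { return isBlocking(entry.request.url); })
    .reduce(function (total, entry) { return total + entry.time; }, 0);
}

// Hand-made sample HAR; treat CSS and JS as render-blocking here.
var har = {
  log: {
    entries: [
      { request: { url: "/styles.css" }, time: 120 },
      { request: { url: "/app.js" },     time: 340 },
      { request: { url: "/hero.jpg" },   time: 900 }
    ]
  }
};
blockingTime(har, function (url) { return /\.(css|js)$/.test(url); });
//=> 460
```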
65. round trips
How about how many Round Trips it takes to view your content? Is it over 1? Let’s talk about how that works.
66. TCP slow start
Did you know that new TCP connections cannot use the full bandwidth available to them? In order to prevent dropped packets, TCP starts slow, as it
doesn’t know the quality of the network it’s sending data over and wants to avoid congestion of that network. Therefore, it’s the standard to send, at a
maximum, 10 TCP packets on a new connection in the first round trip.
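That packet limit is where the ~14.6KB figure you often see quoted comes from. Assuming the common initial congestion window of 10 packets (RFC 6928) and a typical TCP payload of 1460 bytes per packet:

```javascript
// Where the ~14.6KB first-round-trip budget comes from.
var initialWindow = 10;  // initcwnd of 10 packets, per RFC 6928
var mssBytes = 1460;     // typical TCP payload (MSS) on Ethernet
var firstTripBytes = initialWindow * mssBytes;
console.log(firstTripBytes); //=> 14600 bytes, i.e. ~14.6KB
```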
68. At this point, the client sends an acknowledgement to the server that it has received this data, so the server will send more. The server will slowly ramp up the
amount with each round trip, but this can take a while for a huge first file.
70. monitor whether or not your
site is usable in < 14.65K
If you can keep everything that is needed to use the site out of the gates in 1 request that is <= ~14.6KB, you’re cutting the number of round trips that need to
happen for your site to be usable down to the bare minimum. Even over high-latency, low-bandwidth networks, this will feel snappy.
71. timing differences?
What else can we measure?
What about timing differences on every event under the sun?
Have you used the PerformanceTiming API before? It’s awesome, let’s do this.
Let’s bring phantomjs back out.
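As a sketch of what’s available, here is one way to turn a PerformanceTiming-style object into named intervals. The field names are from the Navigation Timing API; in a browser you’d pass `window.performance.timing`, and the sample numbers below are made up:

```javascript
// Derive common intervals from a Navigation Timing-style object.
function timingDiffs(t) {
  return {
    dns:      t.domainLookupEnd - t.domainLookupStart,
    connect:  t.connectEnd - t.connectStart,
    ttfb:     t.responseStart - t.requestStart,
    domReady: t.domContentLoadedEventStart - t.navigationStart,
    load:     t.loadEventStart - t.navigationStart
  };
}

// Illustrative values only; pass window.performance.timing in a browser.
timingDiffs({
  navigationStart: 0,
  domainLookupStart: 5,  domainLookupEnd: 30,
  connectStart: 30,      connectEnd: 80,
  requestStart: 80,      responseStart: 230,
  domContentLoadedEventStart: 600,
  loadEventStart: 1400
});
//=> { dns: 25, connect: 50, ttfb: 150, domReady: 600, load: 1400 }
```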
74. Not exact enough?
What about any of these options?
From almost every measured point, you can record and report back your data. This should push you nicely along the way to making your own Real
User Monitoring.
75. different websites need
different measurements
What about things that aren’t so cut and dry?
And this is where the big caveat comes in: Different Websites Need Different Measurements *click - cont*
76. different websites need
different measurements
It’s great to line up your sites and compete over median Speed Indexes, and page weights, and load times, seriously. It makes a better web for all of us.
But, what if your page could not possibly be considered complete until the hero image is loaded?
77. different websites need
different measurements
What if you couldn’t even think of using your page until your menu can not only be clicked on, but can be used well?
78. different websites need
different measurements
And this is where we start building something of our own. We can have all of the well-vetted formulas and approaches to performance out there, but to
really attack our problems at their source, we need something that fits our personal site. For that, we’re going to need Real User Monitoring and
we’re also going to need custom metrics. Luckily, we’ve got those too.
79. UserTiming API
Enter, the UserTiming API!
The User Timing API is a W3C Recommendation, but it isn’t supported by Safari (including iOS) or Opera Mini yet - there’s a perfectly good
polyfill out there for that, though! So no fear, let’s get going on this.
80. UserTiming API
The User Timing API provides us a couple of really great methods that can help us better track what’s going on on our page. They attach right to the
performance interface.
81. performance.mark
performance.measure
*CLICK* These methods include `mark`, which allows you to take a quick time snapshot that is saved,
*CLICK* `measure`, which will give you a measurement between two `mark`s.
With these, you can very accurately time what’s happening and just how long it takes these things to happen.
82. Let’s use an example. Say I have a page that just isn’t considered ready until this image is front and center.
84. <html>
  <head>
    <meta charset="utf8" />
  </head>
  <body>
    <img src="./source-file.jpg"
         onload="performance.mark('source-file-1')"/>
    <script>
      performance.mark('source-file-2');
    </script>
  </body>
</html>
We need to see when the image was loaded and seen on the page. For that, we can borrow a trick from Steve Souders, and combine a few different
methods.
We can start with an inline `onload` on the img itself, and then also put an inline script right behind the img tag, so it will execute while the page is being
rendered.
85. var startTimes = performance.getEntriesByType('mark')
  .map(function(mark){
    return mark.startTime;
  });
Math.max.apply(null, startTimes);
//=> 1301.78ms
Then we can check what the start time is for each of these marks. The highest, in this case, will give us the actual time that the image has been rendered
on the page. This is immensely useful for a hero image that your page relies upon to be considered usable (such as, if you have a site where people are
purchasing items).
Neat huh?
86. Hopefully, by this point, you have an idea of something that you can measure that will dramatically increase the actual visibility you have into your site’s
performance, but never be satisfied with your measurements. New techniques will continue to be developed, and with them will come better insight along
the way. *click* Pay attention to your statistics and test across the board and you should have a lot of success — and then you can dance
87. now what?
Now that you have your measurements in order, maybe we can focus on what you can do to speed things up a bit.
88. latency
Latency is the amount of time it takes for your request to make it from the client to the server. *click cont*
89. latency
This transmission is limited by the speed of light with resistance provided by the copper used in the wire and the path taken from routing station to routing
station.
Since the path is such a factor in this case, using a CDN can greatly limit the amount of latency your users incur by shortening the distance their request
has to go.
90. critical path
Another way to avoid latency issues is to cater to your critical path.
As I mentioned earlier, the first request the client makes to a server will be limited by TCP slow start.
This limit is roughly 14.6KB. *click cont*
91. critical path
With this in mind,
if you can inline your CSS that is critical for the page to load and then
asynchronously load the full CSS file along with any necessary JavaScript,
you can make sure little to no render-blocking that relies on a network request occurs and your first round trip will have everything a user needs to use a
site.
This makes for ultra-fast sites on even the worst of networks.
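As a sketch of that pattern - the file names are placeholders, and the preload/onload fallback is one common way to load the full stylesheet asynchronously:

```html
<head>
  <!-- Critical rules inlined so the first round trip can paint the page. -->
  <style>
    /* ...only the CSS needed for above-the-fold content... */
  </style>
  <!-- Full stylesheet requested without blocking render. -->
  <link rel="preload" href="/full.css" as="style"
        onload="this.onload=null;this.rel='stylesheet'">
  <noscript><link rel="stylesheet" href="/full.css"></noscript>
</head>
```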
92. One of my favorite examples for this is the Filament Group website.
In this case, we’re throttling down to 2G, and it’s still usable in less than one second.
93. server-side rendering
While sending an empty body and waiting for a script to load all of your assets may feel cleaner, it guarantees that there will be a minimum of two requests
before you can even start building the content for your page - and once that happens, if your user has an underpowered device, it can take even
longer.
94. server-side rendering
Rendering your site on the server first and sending the HTML on the first response will, almost always, provide a faster first page load. In the past, we’ve
been able to achieve this with progressive enhancement, which I’m still a huge advocate for, but now, JS frameworks/libraries/whatever you want to call
them are catering to this performance necessity by allowing your first request to be served HTML.
95. “best practices”
You can also use “Best Practices” - now, I’ve never really liked the term “Best Practices” - it tends to mean “hacks that involve tribal knowledge so we can
work around limitations of our technology.” With HTTP/1.1 - we have plenty of these. So let’s talk about _why_ they’re actually recommended, instead of
hand-waving around them.
96. concatenate
For example *CLICK*, due to the number of concurrent requests a browser can make to a single host - 6, a completely arbitrary number that we all need to
memorize - we suggest you concatenate all of your CSS and JS files so as to limit the number of requests the browser makes and keep it from stalling.
97. minification
Since we’re sending this big file of CSS or JavaScript, we want to make sure that we can make it as syntactically small as possible. We want to strip
comments, we want to make variable names as small as possible, etc. Minification makes this possible by parsing your file and then recreating the code in
the smallest way.
98. *click* Then there’s gzip. Hey, I’m a huge fan of this, actually. gzip works like the video you see on the screen. *PAUSE* It looks for repetition in the text
that’s being sent and writes to file something that references said repetition. This is indicated by the red text in this video. *pause* The compression is
incredibly fast and makes for some immensely smaller files for transfer. *pause* Always gzip when you can, you’ll save money on bandwidth and provide a
better experience for users.
99. original: 247597 bytes
minified: 84380 bytes
minified + gzip: 29607 bytes
Combining gzip and minification can be huge for dropping your file size; for example, here is jQuery.
100. The Future
is now
So, as I mentioned, best practices are normally artifacts that come from limitations of our current ecosystem. HTTP/2 helps address these issues in a lot of
ways, and best of all, *click* you can use it right now - delivering your site based on what the client asks for.
101. how we keep this up
So, hopefully now you have some ways in mind to measure performance on your site. With these measurements, you can concentrate on the pain points in
your site by focusing on methods to speed everything up. This is great and this is wonderful, but let’s bring it around to the last part. Never settling.
102. performance budget
Set a performance budget and stick to it. Know what you want your users to experience. Measure increases and decreases in your times and see how that
affects your traffic, conversions, and sales. Make sure your continuous integration system tests if your budget is being met.
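A budget check in CI can be as simple as comparing measured values against a budget map and failing the build on any overage. A sketch, with made-up metric names and numbers:

```javascript
// Compare measured values against a performance budget.
// Returns a list of violations; empty means the budget is met.
function checkBudget(budget, measured) {
  var failures = [];
  Object.keys(budget).forEach(function (metric) {
    if (measured[metric] > budget[metric]) {
      failures.push(metric + ": " + measured[metric] + " > budget " + budget[metric]);
    }
  });
  return failures;
}

var failures = checkBudget(
  { speedIndex: 3000, pageWeightKB: 800 },  // the budget
  { speedIndex: 3450, pageWeightKB: 640 }   // this run's measurements
);
// In CI, fail the build when failures.length > 0.
console.log(failures); //=> [ 'speedIndex: 3450 > budget 3000' ]
```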
103. Here’s how Etsy handles this. They keep a video showing how their site currently loads displayed front and center on a wall. Developers of the site see
where the numbers currently are, so they are empowered to act upon problems with what they’re building and to see their successes first hand.
104. final thought
I’ve talked a lot up here about how performance affects the bottom line. Heck, I even named this talk after a baseball method of extracting the most you
can out of your team without spending more money than necessary. But performance, web performance at least, is about more than that. Building a faster
website makes for more money, sure, but it also increases the number of people who can visit your site. Faster sites tend to be faster all the way down, and
in being so, they make what you have built a more accessible site for everybody - and isn’t that what the web is all about?
Thank you.