The businesses that depend on us depend on us to be fast!
It would not work to have a performance management system that is slow!
We need to scope this engagement to a manageable size. We already have dozens or more people monitoring and managing our systems internally, but few looking at what the users actually experience. This comes from Cary Millsap, formerly Vice President of Performance at Oracle.
UC 10 is a very important use case for us – we have many relationships with merchants, and performance is an important part of why they choose us.
Every Business Intelligence system has these three parts.
This is one of the many ways we test our products to provide a better user experience.
I was asked not to provide detailed figures on our data systems, so forgive me if everything is order-of-magnitude here.
Heresy! Normalized Dates and Times!
Architecture should also be unobtrusive, an enabler. Architecture makes the hard things buildable. This is a good metaphor for PayPal altogether: our role is to unobtrusively enable users to exchange money while getting out of the way. We are support, not front and center. Our job is to make merchants look good and work well.
Managing Performance Globally with MySQL
Daniel Austin, PayPal, Inc.
MySQL Connect 2013
Sept 22nd, 2013
Why Are We Here?
We needed a comprehensive system for performance management
“Anytime Anywhere” implies a significant commitment to the user experience, especially performance and service reliability.
So we designed a fast real-time analytics system for performance data using MySQL 5.1.
And then we built it.
Overture: Architecture Principles
1. Design and build for scale
2. Only build to differentiate
3. Everything we use or create must have a managed lifecycle
4. Design with systemic qualities in mind
5. Adopt industry standards
What Do You Mean "Web Performance"?
• Performance is response time
– In this case, we are scoping the discussion to include only end-user response time for PayPal activities
• Only outside the PayPal
– Inside, it's monitoring, complementary but different
– We are concerned with real people, not machines
• For our purposes, we treat PayPal's systems as a black box
The Vision: 3 Big Ideas
Performance engineering is a
Bake It In Up
We are focused on
the experiences of
end users of PayPal,
Establish one shared,
toolkit and testing
• Model Driven Architecture – no code!
• Data Driven
– Real data products
– Fast, efficient data model for HTTP
• Up-to-date global dataset provides low MTTR
• Flexible fast reporting for performance analytics
The Big Picture
Data Collection Summary
• Multiple sources for synthetic and RUM performance testing data
• Large-scale dataset with very long (10 yrs+) retention time
– Need to build for the ages
• Requires some effort to design a flexible methodology when devices and networks are changing quickly
Advanced ETL With Talend
• MODEL-DRIVEN = FAST
• LETS US DEVELOP
• METADATA DRIVEN
• MODEL IN, JAVA
GLeaM Data Products
• Level 0
– Raw data at measurement-level
– Field-level Syntactic & Semantic
• Level 1
– 3NF 5D Data Model
– concrete aggregates while retaining record-level resolution
• Level 2
– User-defined and derived
– Time & Space-based aggregates
– Longitudinal and bulk reporting
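A time-based aggregate of the kind listed under Level 2 could be sketched as follows. This is only an illustration: the record layout (integer epoch seconds plus an integer response time in milliseconds), the hourly bucket size, and the function name `hourly_aggregates` are assumptions, not details from the GLeaM system. The input records are left untouched, so record-level resolution is retained alongside the aggregate.

```python
from collections import defaultdict
from statistics import median


def hourly_aggregates(records):
    """Group (epoch_seconds, response_ms) pairs into hourly buckets.

    Hypothetical record layout: both values are plain integers.
    Returns a dict keyed by the epoch second at which each hour starts.
    """
    buckets = defaultdict(list)
    for epoch_seconds, response_ms in records:
        # Integer arithmetic only: truncate the timestamp to the hour.
        hour_start = epoch_seconds - (epoch_seconds % 3600)
        buckets[hour_start].append(response_ms)
    return {
        hour: {
            "count": len(values),
            "median_ms": median(values),
            "max_ms": max(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Illustrative measurements, not real data.
SAMPLE_RECORDS = [(0, 100), (10, 300), (3600, 200)]
SAMPLE_AGGREGATES = hourly_aggregates(SAMPLE_RECORDS)
```

A space-based aggregate would follow the same pattern with a location key in place of the hour.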
A data product is a well-defined data set that has data types, a data dictionary, and validation criteria. It should be possible to rebuild the system from a functional viewpoint based entirely on the data
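One way to picture a data product with data types, a data dictionary, and validation criteria is the minimal sketch below. The class names `FieldSpec` and `DataProduct`, the example product `LEVEL0_EXAMPLE`, and its two fields are all hypothetical, chosen only to illustrate the definition; they are not taken from the actual system.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldSpec:
    """One data-dictionary entry: name, type, description, validation rule."""
    name: str
    dtype: type
    description: str
    is_valid: Callable[[Any], bool] = lambda value: True


@dataclass(frozen=True)
class DataProduct:
    """A named data set described entirely by its field specifications."""
    name: str
    fields: tuple[FieldSpec, ...]

    def validate(self, record: dict) -> list[str]:
        """Return a list of problems found in one record (empty if none)."""
        errors = []
        for spec in self.fields:
            if spec.name not in record:
                errors.append(f"{spec.name}: missing")
                continue
            value = record[spec.name]
            if not isinstance(value, spec.dtype):
                errors.append(f"{spec.name}: expected {spec.dtype.__name__}")
            elif not spec.is_valid(value):
                errors.append(f"{spec.name}: failed validation")
        return errors


# Hypothetical product definition for illustration only.
LEVEL0_EXAMPLE = DataProduct(
    name="level0_example",
    fields=(
        FieldSpec("url", str, "Measured URL"),
        FieldSpec("response_ms", int, "End-user response time in ms",
                  lambda v: v >= 0),
    ),
)
```

Calling `LEVEL0_EXAMPLE.validate(record)` on each incoming record gives a field-level check before the record is accepted into the product.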
• Split at path segment
• We used a simple SHA-1 key to index secondary URL tables
• We need a defined URI data type in MySQL!
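The two URL-handling ideas above, splitting at path segments and deriving a SHA-1 key for indexing, might look roughly like this. It is a sketch under assumptions: the function names are invented for illustration, and the original system presumably did this inside the ETL or in MySQL rather than in Python.

```python
import hashlib
from urllib.parse import urlsplit


def split_path_segments(url: str) -> list[str]:
    """Return the non-empty path segments of a URL, in order."""
    path = urlsplit(url).path
    return [segment for segment in path.split("/") if segment]


def url_key(url: str) -> str:
    """Return a fixed-width (40 hex characters) SHA-1 digest of the URL.

    A fixed-width key like this can index a secondary URL table
    regardless of how long the original URL string is.
    """
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


# Illustrative URL, not a real PayPal endpoint.
EXAMPLE_URL = "https://www.example.com/checkout/review/confirm?token=abc"
EXAMPLE_SEGMENTS = split_path_segments(EXAMPLE_URL)
EXAMPLE_KEY = url_key(EXAMPLE_URL)
```

SHA-1 is used here only as a compact lookup key, not for security; any stable digest with a low collision rate would serve the same indexing purpose.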
Some Best Practices
– URIs: Handle with care
• Encode text strings in lexical order
• Use sequential bitfields for searching
– Integer arithmetic only
– Combined fields for per-row consistency checks in every table
– Don't skip the supporting jobs – sharding,
– Don't trade ETL time for integrity risk!
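Two of these practices can be illustrated together: keeping timings in integers, and storing a value computed over combined fields so each row can be checked for consistency. The sketch below is one possible reading of the slide, not the documented implementation. The use of CRC-32, the unit-separator delimiter, and the names `to_micros`, `row_check`, and `row_is_consistent` are all assumptions.

```python
import zlib


def to_micros(seconds_text: str) -> int:
    """Convert a non-negative decimal-seconds string to integer microseconds.

    Parses the text directly so that no floating-point value is involved.
    Digits beyond microsecond precision are truncated.
    """
    whole, _, frac = seconds_text.partition(".")
    frac = (frac + "000000")[:6]  # pad or truncate to exactly 6 digits
    return int(whole or "0") * 1_000_000 + int(frac)


def row_check(*fields) -> int:
    """Compute a CRC-32 over the row's fields joined with a separator."""
    joined = "\x1f".join(str(field) for field in fields)
    return zlib.crc32(joined.encode("utf-8"))


def row_is_consistent(fields, stored_check: int) -> bool:
    """Recompute the check value and compare it with the stored one."""
    return row_check(*fields) == stored_check


# Illustrative row: (url_id, dns_micros, total_micros).
EXAMPLE_ROW = (1, to_micros("0.0002"), to_micros("0.35"))
EXAMPLE_CHECK = row_check(*EXAMPLE_ROW)
```

Storing `EXAMPLE_CHECK` in the same row lets a later job detect a field that was altered or truncated after load, at the cost of one extra integer column per table.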
GLeaM Data Reporting
• GLeaM is intended to be agnostic and flexible with respect to reporting tools
• We chose Tableau for dynamic analytics
• We also use several enterprise-level reporting tools to produce aggregate
WEB AND DESKTOP
We designed initial reports for 3 sets of stakeholders:
• High-level overviews for
• Diagnostic reports for operations teams to identify
• Deep-dive analytical reports to identify opportunities for
What We Learned
• Paying attention to design patterns pays off
• MySQL rewards detailed optimization
• Trade-offs around normalization can lead to 10x or even 100x query time reduction
• Sharding remains an issue
• We believe we can easily achieve petabyte scales with additional slaves
CODA: THE LAST ARCHITECTURE PRINCIPLE
…A PLAYER IS SAID TO BE SHIBUI WHEN HE OR SHE MAKES NO SPECTACULAR
PLAYS ON THE FIELD, BUT CONTRIBUTES TO THE TEAM IN AN UNOBTRUSIVE WAY.