2. Why Are We Here?
We needed a comprehensive system for performance management
at PayPal
Vision->Goals->Plan->Execution->Delighted User
“Anytime Anywhere” implies a significant commitment to the user
experience, especially performance and service reliability.
So we designed a fast real-time analytics system for performance
data using MySQL 5.1.
And then we built it.
3. Overture: Architecture Principles
1. Design and build for scale
2. Only build to differentiate
3. Everything we use or create must have a managed lifecycle
4. Design with systemic qualities in mind
5. Adopt industry standards
4. What Do You Mean ‘Web Performance’?
• Performance is response time
– In this case, we are scoping
the discussion to include
only end-user response
time for PayPal activities
• Only outside the PayPal
system boundary
– Inside, it’s monitoring,
complementary but different
– We are concerned with real
people not machines
• For our purposes, we treat
PayPal’s systems as a black
box
5. The Vision: 3 Big Ideas
• Bake It In Up Front
– Performance engineering is a design-time activity.
• End2End Performance for End Users
– We are focused on the experiences of end users of PayPal, anywhere, anyway, anytime.
• One Consistent View
– Establish one shared, consistent performance toolkit and testing methodology.
7. Architecture: Features
• Model Driven Architecture – no code!
• Data Driven
– Real data products
– Fast, efficient data model for HTTP
• Up-to-date global dataset provides low MTTR
• Flexible fast reporting for performance analytics
14. Data Collection Summary
• Multiple sources for synthetic and RUM
performance testing data
• Large-scale dataset with very long (10
yrs+) retention time
– Need to build for the ages
• Requires some effort to design a flexible
methodology when devices and networks
are changing quickly
16. Advanced ETL With Talend
• MODEL-DRIVEN = FAST
DEVELOPMENT
• LETS US DEVELOP
COMPONENTS
FAST
• METADATA DRIVEN
• MODEL IN, JAVA
OUT
17. GLeaM Data Products
• Level 0
– Raw data at measurement-level
resolution
– Field-level Syntactic & Semantic
Validation
• Level 1
– 3NF 5D Data Model
– concrete aggregates while
retaining record-level resolution
• Level 2
– User-defined and derived
measures
– Time & Space-based aggregates
– Longitudinal and bulk reporting
A data product is a well-defined data set
that has data types, a data dictionary, and
validation criteria. It should be possible to
rebuild the system from a functional
viewpoint based entirely on the data
product catalog.
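As an illustration of Level 0 field-level validation, here is a minimal Python sketch. The field names (`url`, `timestamp`, `response_ms`) and the specific rules are assumptions made for this example; the slides do not list the actual Level 0 schema or validation criteria.

```python
from datetime import datetime

# Hypothetical field names; the real Level 0 schema is not given in the slides.
REQUIRED_FIELDS = ("url", "timestamp", "response_ms")


def validate_measurement(record):
    """Return a list of validation errors for one raw measurement (empty if valid)."""
    # Syntactic check: every required field must be present.
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    if errors:
        return errors

    # Syntactic check: the timestamp must parse as ISO 8601.
    try:
        datetime.fromisoformat(record["timestamp"])
    except (TypeError, ValueError):
        errors.append("timestamp is not ISO 8601")

    # Syntactic check (integer type), then semantic check (non-negative duration).
    ms = record["response_ms"]
    if not isinstance(ms, int) or isinstance(ms, bool):
        errors.append("response_ms is not an integer")
    elif ms < 0:
        errors.append("response_ms is negative")

    # Semantic check: the measurement must refer to an HTTP(S) resource.
    url = record["url"]
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        errors.append("url is not an http(s) URL")

    return errors
```

Records that return an empty list would be admitted to Level 0; anything else would be rejected or quarantined before it reaches the Level 1 model.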
19. GLeaM Data Storage
• Modeling HTTP in SQL
• MySQL 5.1, Master & multi-slave config
• 3rd Normal Form, Codd compliance
• Fast, efficient analytical data model for
HTTP Sessions
20. 3NF Level 1 Data Model for HTTP
• NO xrefs
• 5D User Narrative Model
• High levels of normalization are costly up
front…
• …but pay for themselves later when you
are making queries!
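The normalization trade-off can be sketched in a few lines of Python. This is not the GLeaM schema: the two tables and their columns are hypothetical, and SQLite (via the standard `sqlite3` module) stands in for MySQL 5.1 so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE host (
    host_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE request (
    request_id  INTEGER PRIMARY KEY,
    host_id     INTEGER NOT NULL REFERENCES host(host_id),
    response_ms INTEGER NOT NULL
);
CREATE INDEX request_host ON request(host_id);
""")


def host_id_for(name):
    """Up-front cost: every insert must resolve the host name to its integer key."""
    conn.execute("INSERT OR IGNORE INTO host (name) VALUES (?)", (name,))
    return conn.execute("SELECT host_id FROM host WHERE name = ?", (name,)).fetchone()[0]


def record_request(host, response_ms):
    """Store one measurement, referencing the host by key instead of repeating its name."""
    conn.execute(
        "INSERT INTO request (host_id, response_ms) VALUES (?, ?)",
        (host_id_for(host), response_ms),
    )


def mean_response_ms(host):
    """Later payoff: the query joins and filters on small indexed integer keys."""
    row = conn.execute(
        "SELECT SUM(r.response_ms) / COUNT(*) FROM request r "
        "JOIN host h ON h.host_id = r.host_id WHERE h.name = ?",
        (host,),
    ).fetchone()
    return row[0]


for host, ms in [("www.example.com", 400), ("www.example.com", 600), ("m.example.com", 900)]:
    record_request(host, ms)
```

Each host name is stored once, so the large `request` table holds only integers, which keeps rows narrow and indexes small.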
23. Managing URLs
• VARCHAR(4096)?
• Split at path segment
• We used a simple
SHA-1 key to index
secondary URL tables
• We need a defined URI
data type in MySQL!
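A minimal Python sketch of the two ideas on this slide: splitting a URL at its path segments, and deriving a fixed-length SHA-1 key that can be indexed in place of a long VARCHAR. The function names are hypothetical, and hashing the full URL string is an assumption; the slides do not say exactly which part of the URL was hashed.

```python
import hashlib
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into scheme, host, a list of path segments, and the query string."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    return parts.scheme, parts.netloc, segments, parts.query


def url_key(url):
    """Return a 40-character hex SHA-1 digest, usable as a fixed-length index key."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()
```

For example, `split_url("https://www.example.com/cgi-bin/webscr?cmd=_login")` yields the segments `["cgi-bin", "webscr"]`, each of which could be stored once in a secondary table, while `url_key` always produces a 40-character key regardless of URL length.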
24. Some Best Practices
– URIs: Handle with care
• Encode text strings in lexical order
• Use sequential bitfields for searching
– Integer arithmetic only
– Combined fields for per-row consistency
checks in every table
– Don’t skip the supporting jobs – sharding,
rollover, logging
– Don’t trade ETL time for integrity risk!
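Two of these practices can be sketched in Python under stated assumptions. "Integer arithmetic only" is read here as storing timings as integer milliseconds rather than floating-point seconds, and "combined fields for per-row consistency checks" is read as a checksum column computed over the other fields in the row. Both readings, the CRC32 choice, and all names below are illustrative, not taken from the slides.

```python
import zlib


def to_ms(seconds_text):
    """Convert a non-negative decimal seconds string such as '0.412' to integer
    milliseconds without going through floating point."""
    whole, _, frac = seconds_text.partition(".")
    frac = (frac + "000")[:3]  # pad or truncate to millisecond precision
    return int(whole or "0") * 1000 + int(frac)


def row_check(*fields):
    """CRC32 over the row's fields, stored in the row to detect later corruption."""
    joined = "\x1f".join(str(field) for field in fields)
    return zlib.crc32(joined.encode("utf-8"))


def make_row(host_id, started_epoch_s, response_ms):
    """Build a row tuple whose last element is the check over the other fields."""
    return (host_id, started_epoch_s, response_ms,
            row_check(host_id, started_epoch_s, response_ms))


def row_is_consistent(row):
    """Recompute the check from the data fields and compare it with the stored one."""
    *fields, check = row
    return row_check(*fields) == check
```

An ETL job could call `row_is_consistent` on a sample of rows after each load, so that integrity is verified rather than assumed.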
26. GLeaM Data Reporting
• GLeaM is intended to be agnostic and
flexible w.r.t. reporting tools
• We chose Tableau for dynamic analytics
• We also use several enterprise-level
reporting tools to produce aggregate
reports
28. GLeaM Reports
We designed initial reports for 3 sets of stakeholders:
• Executives
– High-level overviews for busy decision-makers
• Operations
– Diagnostic reports for operations teams to identify
• Analytics
– Deep-dive analytical reports to identify opportunities for improvements
30. What We Learned
• Paying attention to design patterns pays off
• MySQL rewards detailed optimization
• Trade-offs around normalization can lead to
10x or even 100x query time reduction
• Sharding remains an issue
• We believe we can easily achieve petabyte
scales with additional slaves
31. CODA: THE LAST ARCHITECTURE PRINCIPLE
SHIBUI
SIMPLE
ELEGANT
BALANCED
…A PLAYER IS SAID TO BE SHIBUI WHEN HE OR SHE MAKES NO SPECTACULAR
PLAYS ON THE FIELD, BUT CONTRIBUTES TO THE TEAM IN AN UNOBTRUSIVE WAY.
The businesses that depend on us, depend on us to be fast!
It would not work to have a performance management system that is slow!
We need to scope this engagement to a manageable size. We already have dozens or more people monitoring and managing our systems internally, but few looking at what the users actually experience. This comes from Cary Millsap, formerly Vice President of Performance at Oracle.
UC 10 is a very important use case for us – we have many relationships with merchants, and performance is an important part of why they choose us.
Every Business Intelligence system has these three parts
This is one of the many ways we test our products to provide a better user experience
I was asked not to provide detailed figures on our data systems, so forgive me if everything is order-of-magnitude here.
Heresy! Normalized Dates and Times!
Architecture should also be unobtrusive, an enabler. Architecture makes the hard things buildable. This is a good metaphor for PayPal as a whole – our role is to unobtrusively enable users to exchange money while getting out of the way. We are support, not front and center. Our job is to make merchants look good and work well.