SlideShare a Scribd company logo
1 of 50
Story of
How LinkedIn Used RUM & PoPs to Make
the Site Faster
Ritesh Maheshwari
Performance Engineer
LinkedIn
@ritesh
2013
2015
Story
PoP RUM
Story: In a nutshell
1. We built PoPs
2. …used RUM to assign users to Optimal PoPs
3. …found DNS based assignment is suboptimal
4. …evaluated Anycast as a solution using RUM
5. …now using Anycast to assign users to PoPs
6. …new PoPs locations chosen using RUM
What is RUM?
JavaScript (client code) to measure performance
 Navigation Timing API
 LinkedIn uses
https://github.com/lognormal/boomerang/
Metrics like:
• DNS Time
• Connection Time
• First Byte Time
• Download Time
• Page Load Time
What are PoPs?
Point of Presence / PoP
• Small-scale data centers
– Cheap, Many
• Proxy servers at LinkedIn (ATS)
World Without PoPs
Browser Data Center
connection time 250ms
World Without PoPs
Browser Data Center
connection time
server
compute
time
250ms
500ms
World Without PoPs
Browser Data Center
connection time
first
byte
time
+
page
download
time
5 RTTs = 5x250ms = 1250ms
server
compute
time
250ms
Total = 2000ms
500ms
With PoPs
Browser Data CenterPoP
100ms
250ms
With PoPs
Browser Data CenterPoP
100msconnection time
With PoPs
Browser Data CenterPoP
100msconnection time
first
byte
time
+
page
download
time
server
compute
time
500ms
With PoPs
Browser Data CenterPoP
100msconnection time
5 RTTs = 5x100ms = 500ms
Total = 1100ms
900 ms gain!first
byte
time
+
page
download
time
500ms
server
compute
time
How are users assigned to PoPs?
Through 3rd party DNS services
IP handed based on user’s resolver location
# Spain
$ dig @109.69.8.51 +short www.linkedin.com
91.225.248.80
# California
$ dig +short www.linkedin.com
216.52.242.80
Should India connect to Singapore or
Dublin?
How to assure optimal PoPs assignment?
• Geographical Distance
• Knowledge about Internet Connectivity
Lets Make Data-based Decisions
Synthetic measurements?
No realistic agent distribution across networks
No realistic agent connectivity distribution
Can our users become measurements agent?
• RUM!
RUM beacons
Fetch a tiny object from each candidate PoP
For each pop_name,
1. Start timer
2. Fetch {pop_name}.perf.linkedin.com/pop/admin
3. Stop timer
Send data back to our servers
• Millions of agents!
• Analyze data to find “optimal” PoP per country
We can assign countries to new PoPs!
Country PoP Median Beacon Time
China Hong Kong 434
China Dublin 1216
China Singapore 515
India Hong Kong 1368
India Dublin 1042
India Singapore 898
We can audit current assignment!
Country Is PoP optimal? Current PoP Optimal PoP
India TRUE Singapore Singapore
Pakistan FALSE Singapore Dublin
Spain TRUE Dublin Dublin
Brazil FALSE US West Coast US East Coast
Netherlands TRUE Dublin Dublin
UAE FALSE US West Coast Dublin
Italy TRUE Dublin Dublin
Mexico TRUE US West Coast US West Coast
Russia FALSE US West Coast Dublin
0%
5%
10%
15%
20%
25%
30%
India Pakistan Singapore Russia Brazil
PercentageImprovement LinkedIn Homepage Download Time Improvement
Median Improvement 90th Percentile Improvement
Trust, but Verify
Is 100% of India connecting to Singapore PoP?
Can RUM identify which PoP served this page view?
1. RUM is JavaScript
2. PoP selected through DNS
3. JavaScripts can’t know DNS answers
 JavaScripts can read response headers
Can RUM identify which PoP served
this page view?
After page load:
1. Fetch www.linkedin.com/pop/admin
2. Read X-Li-Pop header
3. Add that data to RUM and send it back
Offline analysis:
1. What % of traffic in India going to Singapore PoP?
2. A/B testing PoP assignment.
Plot Twist:
Assignment far from optimal
• About 31% of US traffic gets assigned to a
suboptimal PoP.
– 45% of East Coast
• About 10% of traffic globally gets assigned to a
suboptimal PoP.
DNS PoP assignment is suboptimal
• Assignment based on Resolver IP, not Client IP
DNS
Resolver
PoP
US
East
PoP
US
West
New YorkCalifornia
DNS PoP assignment is suboptimal
• Assignment based on Resolver IP, not Client IP
• Bad IP to Geo databases
– Resolver really in NY, but database says CA
Story so far
1. We built PoPs
2. …used RUM to assign users to Optimal PoPs
3. …found DNS based assignment is suboptimal
Accurate PoP assignment Problem
• Bug our DNS providers (31% -> 27%)
• Run our own DNS
How about Anycast?
Anycast – One IP, Multiple Servers
PoP A
PoP B
PoP C
Bob
1.1.1.1
1.1.1.1
1.1.1.1
• Unicast – one IP per server
• Seamless support at routing layer
• Packets routed to closest server by
# of hops
 Client IP, not Resolver IP used!
 No Geo-IP Databases
How does Anycast compare to DNS?
Will anycast send more users to optimal PoP?
Lets test it!
RUM to rescue
For each PoP:
1. Announce same anycast IP (108.174.13.10)
2. Configure a domain
ac.perf.linkedin.com to point to
108.174.13.10
RUM to rescue
For each page view:
1. RUM downloads a tiny object :
ac.perf.linkedin.com/pop/admin
2. Read X-Li-Pop response header to record which PoP served
the object
3. Send this back to LinkedIn with RUM data
Data:
1. For each user, the anycast PoP
2. For each user, the optimal PoP (from pop beacons)
Results 
Region or
Country
DNS % Optimal
Assignment
Anycast % Optimal
Assignment
Illinois 70 90
Florida 73 95
Georgia 75 93
Pennsylvania 85 95
Results 
Region or
Country
DNS % Optimal
Assignment
Anycast % Optimal
Assignment
Arizona 60 39
Brazil 88 33
New York 77 74
Fewer hops != Lower Latency
• Carriers prefer to haul packets within
their own network
• Peering can create inter-continental
short cuts
Z
X
Alice
Y
1.1.1.1
1.1.1.1
1.1.1.1
Maybe DNS wasn’t so bad
Continent level identification
City / State level identification
“Regional” Anycast
1 anycast IP per continent
Ran a RUM experiment,
all was fine Z
X
Alice
Y
2.2.2.2
1.1.1.1
1.1.1.1
50.00
55.00
60.00
65.00
70.00
75.00
80.00
85.00
90.00
95.00
100.00
20141206 20141208 20141210 20141212 20141214 20141216 20141218
%TrafficgoingtoOptimalPoP
Date
Illinois
Florida
North Carolina
Indiana
New York
New Jersey
Virginia
West Virginia
Louisiana
USA Ramp Results
Ramp outside
USA
In progress
Story so far
1. We built PoPs
2. …used RUM to assign users to Optimal PoPs
3. …found DNS based assignment is suboptimal
4. …evaluated Anycast as a solution using RUM
5. …now using Anycast to assign users to PoPs
Next play:
• Build more PoPs!
Build More PoPs: But where?
• Previously, based on
– Intuition
– Knowledge
• Can we use RUM data?
– Get a list of candidate PoP locations
– Build a machine learning model
– Rank locations based on most performance benefit
Plot Twist: Miami vs Sao Paolo
• Can we use RUM again?
– Fetch a tiny object and measure latency
– But we have no servers there
 Our CDNs do!
1. Get IP address of CDN servers
2. Configure a domain to point to the IP
3. RUM to fetch a tiny object ( /cdo/rum/id )
4. Measure time and report back
Miami vs Sao Paolo - Results
 Sao Paolo is better than US East Coast
 Miami is better than Sao Paolo for many in South America
Country Median beacon time
Miami Sao Paolo US East Coast
Columbia 648 970 778
Argentina 918 970 1213
Brazil 868 549 1074
Story: Recap
1. We built PoPs
2. …used RUM to assign users to Optimal PoPs
3. …found DNS based assignment is suboptimal
4. …evaluated Anycast as a solution using RUM
5. …now using Anycast to assign users to PoPs
6. …new PoPs locations chosen using RUM
10-25% gain
30% suboptimal in US
“Regional” Anycast
30% -> 10%
Story: The End
• Clients are your measurement agents
• Make data-driven decisions
• Trust, but verify
• Collaboration leads to bigger impact
Questions?
in/RiteshMaheshwari@ritesh
©2014 LinkedIn Corporation. All Rights Reserved.©2014 LinkedIn Corporation. All Rights Reserved.
What about Mobile?
• Packet Gateways are likely to be in the same
part of the country as users
• Wired measurements should be applicable for
Mobile
– India wired users are closer to Singapore, Mobile
should be similar
Is Synthetic Monitoring dead?
No, not yet
• Much more powerful than RUM
• Not everything can be done in production
Next Play
• Keep evaluating Anycast
• Keep building new PoPs
• Dynamic RUM-based PoP assignment

More Related Content

Viewers also liked

Linked in data to power sales - dreamforce nov 18 2013 - vfinal w. appendix
Linked in   data to power sales - dreamforce nov 18 2013 - vfinal w. appendixLinked in   data to power sales - dreamforce nov 18 2013 - vfinal w. appendix
Linked in data to power sales - dreamforce nov 18 2013 - vfinal w. appendixAndres Bang
 
LinkedIn Presentation Plainfield Library 2016
LinkedIn Presentation Plainfield Library 2016LinkedIn Presentation Plainfield Library 2016
LinkedIn Presentation Plainfield Library 2016Denis Curtin
 
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...LinkedIn Talent Solutions
 
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...LinkedIn Talent Solutions
 
LinkedIn Member Segmentation Platform: A Big Data Application
LinkedIn Member Segmentation Platform: A Big Data ApplicationLinkedIn Member Segmentation Platform: A Big Data Application
LinkedIn Member Segmentation Platform: A Big Data ApplicationDataWorks Summit
 
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...Cloudera, Inc.
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Shirshanka Das
 
The latest in LinkedIn talent pool reports | Talent Connect Anaheim
The latest in LinkedIn talent pool reports  | Talent Connect AnaheimThe latest in LinkedIn talent pool reports  | Talent Connect Anaheim
The latest in LinkedIn talent pool reports | Talent Connect AnaheimLinkedIn Talent Solutions
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemShirshanka Das
 
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...David Chen
 
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016Carl Steinbach
 
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Yahoo Developer Network
 
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a Pro
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a ProLive Webinar: Advanced Strategies for Leveraging Linkedin Like a Pro
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a ProLinkedIn
 
Jorge Lascas - Workshop linkedin successful strategies - Amsterdam
Jorge Lascas - Workshop linkedin successful strategies - AmsterdamJorge Lascas - Workshop linkedin successful strategies - Amsterdam
Jorge Lascas - Workshop linkedin successful strategies - AmsterdamJorge Lascas
 
Data-Driven Best Practices for LinkedIn Sponsored InMail
Data-Driven Best Practices for LinkedIn Sponsored InMailData-Driven Best Practices for LinkedIn Sponsored InMail
Data-Driven Best Practices for LinkedIn Sponsored InMailLinkedIn
 

Viewers also liked (16)

Linked in data to power sales - dreamforce nov 18 2013 - vfinal w. appendix
Linked in   data to power sales - dreamforce nov 18 2013 - vfinal w. appendixLinked in   data to power sales - dreamforce nov 18 2013 - vfinal w. appendix
Linked in data to power sales - dreamforce nov 18 2013 - vfinal w. appendix
 
LinkedIn Presentation Plainfield Library 2016
LinkedIn Presentation Plainfield Library 2016LinkedIn Presentation Plainfield Library 2016
LinkedIn Presentation Plainfield Library 2016
 
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...
By the Numbers: Leveraging LinkedIn Data to Become a Strategic Talent Advisor...
 
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...
Leveraging Data: LinkedIn Recruiter Jobs and Talent Pool Analysis | Talent Co...
 
Talent Pools: Using insights to power your talent acquisition strategy
Talent Pools: Using insights to power your talent acquisition strategyTalent Pools: Using insights to power your talent acquisition strategy
Talent Pools: Using insights to power your talent acquisition strategy
 
LinkedIn Member Segmentation Platform: A Big Data Application
LinkedIn Member Segmentation Platform: A Big Data ApplicationLinkedIn Member Segmentation Platform: A Big Data Application
LinkedIn Member Segmentation Platform: A Big Data Application
 
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...
Hadoop World 2011: LeveragIng Hadoop to Transform Raw Data to Rich Features a...
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
 
The latest in LinkedIn talent pool reports | Talent Connect Anaheim
The latest in LinkedIn talent pool reports  | Talent Connect AnaheimThe latest in LinkedIn talent pool reports  | Talent Connect Anaheim
The latest in LinkedIn talent pool reports | Talent Connect Anaheim
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
Hadoop Summit 2014: Building a Self-Service Hadoop Platform at LinkedIn with ...
 
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016
LinkedIn's Logical Data Access Layer for Hadoop -- Strata London 2016
 
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
 
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a Pro
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a ProLive Webinar: Advanced Strategies for Leveraging Linkedin Like a Pro
Live Webinar: Advanced Strategies for Leveraging Linkedin Like a Pro
 
Jorge Lascas - Workshop linkedin successful strategies - Amsterdam
Jorge Lascas - Workshop linkedin successful strategies - AmsterdamJorge Lascas - Workshop linkedin successful strategies - Amsterdam
Jorge Lascas - Workshop linkedin successful strategies - Amsterdam
 
Data-Driven Best Practices for LinkedIn Sponsored InMail
Data-Driven Best Practices for LinkedIn Sponsored InMailData-Driven Best Practices for LinkedIn Sponsored InMail
Data-Driven Best Practices for LinkedIn Sponsored InMail
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

How LinkedIn used RUM to drive optimizations and make the site faster

  • 1.
  • 2. Story of How LinkedIn Used RUM & PoPs to Make the Site Faster Ritesh Maheshwari Performance Engineer LinkedIn @ritesh
  • 6. Story: In a nutshell 1. We built PoPs 2. …used RUM to assign users to Optimal PoPs 3. …found DNS based assignment is suboptimal 4. …evaluated Anycast as a solution using RUM 5. …now using Anycast to assign users to PoPs 6. …new PoPs locations chosen using RUM
  • 7. What is RUM? JavaScript (client code) to measure performance  Navigation Timing API  LinkedIn uses https://github.com/lognormal/boomerang/ Metrics like: • DNS Time • Connection Time • First Byte Time • Download Time • Page Load Time
  • 8. What are PoPs? Point of Presence / PoP • Small-scale data centers – Cheap, Many • Proxy servers at LinkedIn (ATS)
  • 9. World Without PoPs Browser Data Center connection time 250ms
  • 10. World Without PoPs Browser Data Center connection time server compute time 250ms 500ms
  • 11. World Without PoPs Browser Data Center connection time first byte time + page download time 5 RTTs = 5x250ms = 1250ms server compute time 250ms Total = 2000ms 500ms
  • 12. With PoPs Browser Data CenterPoP 100ms 250ms
  • 13. With PoPs Browser Data CenterPoP 100msconnection time
  • 14. With PoPs Browser Data CenterPoP 100msconnection time first byte time + page download time server compute time 500ms
  • 15. With PoPs Browser Data CenterPoP 100msconnection time 5 RTTs = 5x100ms = 500ms Total = 1100ms 900 ms gain!first byte time + page download time 500ms server compute time
  • 16. How are users assigned to PoPs? Through 3rd party DNS services IP handed based on user’s resolver location # Spain $ dig @109.69.8.51 +short www.linkedin.com 91.225.248.80 # California $ dig +short www.linkedin.com 216.52.242.80
  • 17. Should India connect to Singapore or Dublin? How to assure optimal PoPs assignment? • Geographical Distance • Knowledge about Internet Connectivity
  • 18. Lets Make Data-based Decisions Synthetic measurements? No realistic agent distribution across networks No realistic agent connectivity distribution Can our users become measurements agent? • RUM!
  • 19. RUM beacons Fetch a tiny object from each candidate PoP For each pop_name, 1. Start timer 2. Fetch {pop_name}.perf.linkedin.com/pop/admin 3. Stop timer Send data back to our servers • Millions of agents! • Analyze data to find “optimal” PoP per country
  • 20. We can assign countries to new PoPs! Country PoP Median Beacon Time China Hong Kong 434 China Dublin 1216 China Singapore 515 India Hong Kong 1368 India Dublin 1042 India Singapore 898
  • 21. We can audit current assignment! Country Is PoP optimal? Current PoP Optimal PoP India TRUE Singapore Singapore Pakistan FALSE Singapore Dublin Spain TRUE Dublin Dublin Brazil FALSE US West Coast US East Coast Netherlands TRUE Dublin Dublin UAE FALSE US West Coast Dublin Italy TRUE Dublin Dublin Mexico TRUE US West Coast US West Coast Russia FALSE US West Coast Dublin
  • 22. 0% 5% 10% 15% 20% 25% 30% India Pakistan Singapore Russia Brazil PercentageImprovement LinkedIn Homepage Download Time Improvement Median Improvement 90th Percentile Improvement
  • 23. Trust, but Verify Is 100% of India connecting to Singapore PoP? Can RUM identify which PoP served this page view? 1. RUM is JavaScript 2. PoP selected through DNS 3. JavaScripts can’t know DNS answers  JavaScripts can read response headers
  • 24. Can RUM identify which PoP served this page view? After page load: 1. Fetch www.linkedin.com/pop/admin 2. Read X-Li-Pop header 3. Add that data to RUM and send it back Offline analysis: 1. What % of traffic in India going to Singapore PoP? 2. A/B testing PoP assignment.
  • 25. Plot Twist: Assignment far from optimal • About 31% of US traffic gets assigned to a suboptimal PoP. – 45% of East Coast • About 10% of traffic globally gets assigned to a suboptimal PoP.
  • 26. DNS PoP assignment is suboptimal • Assignment based on Resolver IP, not Client IP DNS Resolver PoP US East PoP US West New YorkCalifornia
  • 27. DNS PoP assignment is suboptimal • Assignment based on Resolver IP, not Client IP • Bad IP to Geo databases – Resolver really in NY, but database says CA
  • 28. Story so far 1. We built PoPs 2. …used RUM to assign users to Optimal PoPs 3. …found DNS based assignment is suboptimal
  • 29. Accurate PoP assignment Problem • Bug our DNS providers (31% -> 27%) • Run our own DNS How about Anycast?
  • 30. Anycast – One IP, Multiple Servers PoP A PoP B PoP C Bob 1.1.1.1 1.1.1.1 1.1.1.1 • Unicast – one IP per server • Seamless support at routing layer • Packets routed to closest server by # of hops  Client IP, not Resolver IP used!  No Geo-IP Databases
  • 31. How does Anycast compare to DNS? Will anycast send more users to optimal PoP? Lets test it!
  • 32. RUM to rescue For each PoP: 1. Announce same anycast IP (108.174.13.10) 2. Configure a domain ac.perf.linkedin.com to point to 108.174.13.10
  • 33. RUM to rescue For each page view: 1. RUM downloads a tiny object : ac.perf.linkedin.com/pop/admin 2. Read X-Li-Pop response header to record which PoP served the object 3. Send this back to LinkedIn with RUM data Data: 1. For each user, the anycast PoP 2. For each user, the optimal PoP (from pop beacons)
  • 34. Results  Region or Country DNS % Optimal Assignment Anycast % Optimal Assignment Illinois 70 90 Florida 73 95 Georgia 75 93 Pennsylvania 85 95
  • 35. Results  Region or Country DNS % Optimal Assignment Anycast % Optimal Assignment Arizona 60 39 Brazil 88 33 New York 77 74
  • 36. Fewer hops != Lower Latency • Carriers prefer to haul packets within their own network • Peering can create inter-continental short cuts Z X Alice Y 1.1.1.1 1.1.1.1 1.1.1.1
  • 37. Maybe DNS wasn’t so bad Continent level identification City / State level identification
  • 38. “Regional” Anycast 1 anycast IP per continent Ran a RUM experiment, all was fine Z X Alice Y 2.2.2.2 1.1.1.1 1.1.1.1
  • 39. 50.00 55.00 60.00 65.00 70.00 75.00 80.00 85.00 90.00 95.00 100.00 20141206 20141208 20141210 20141212 20141214 20141216 20141218 %TrafficgoingtoOptimalPoP Date Illinois Florida North Carolina Indiana New York New Jersey Virginia West Virginia Louisiana USA Ramp Results Ramp outside USA In progress
  • 40. Story so far 1. We built PoPs 2. …used RUM to assign users to Optimal PoPs 3. …found DNS based assignment is suboptimal 4. …evaluated Anycast as a solution using RUM 5. …now using Anycast to assign users to PoPs Next play: • Build more PoPs!
  • 41. Build More PoPs: But where? • Previously, based on – Intuition – Knowledge • Can we use RUM data? – Get a list of candidate PoP locations – Build a machine learning model – Rank locations based on most performance benefit
  • 42. Plot Twist: Miami vs Sao Paolo • Can we use RUM again? – Fetch a tiny object and measure latency – But we have no servers there  Our CDNs do! 1. Get IP address of CDN servers 2. Configure a domain to point to the IP 3. RUM to fetch a tiny object ( /cdo/rum/id ) 4. Measure time and report back
  • 43. Miami vs Sao Paolo - Results  Sao Paolo is better than US East Coast  Miami is better than Sao Paolo for many in South America Country Median beacon time Miami Sao Paolo US East Coast Columbia 648 970 778 Argentina 918 970 1213 Brazil 868 549 1074
  • 44. Story: Recap 1. We built PoPs 2. …used RUM to assign users to Optimal PoPs 3. …found DNS based assignment is suboptimal 4. …evaluated Anycast as a solution using RUM 5. …now using Anycast to assign users to PoPs 6. …new PoPs locations chosen using RUM 10-25% gain 30% suboptimal in US “Regional” Anycast 30% -> 10%
  • 45. Story: The End • Clients are your measurement agents • Make data-driven decisions • Trust, but verify • Collaboration leads to bigger impact
  • 47. ©2014 LinkedIn Corporation. All Rights Reserved.©2014 LinkedIn Corporation. All Rights Reserved.
  • 48. What about Mobile? • Packet Gateways are likely to be in the same part of the country as users • Wired measurements should be applicable for Mobile – India wired users are closer to Singapore, Mobile should be similar
  • 49. Is Synthetic Monitoring dead? No, not yet • Much more powerful than RUM • Not everything can be done in production
  • 50. Next Play • Keep evaluating Anycast • Keep building new PoPs • Dynamic RUM-based PoP assignment

Editor's Notes

  1. Here's a slide summarizing the high level story of how it went. There were ups and downs, success and failure and interesting plot twists on the way. Overall it was a bumpy ride.
  2. Simplified example
  3. That’s a 900ms gain! This is huge! Just by adding a proxy server closer to the user, we have managed to gain 900ms in download time.
  4. So we know now that PoPs are great, lets get into the mechanics of how a user gets assigned to a PoP. At LinkedIn, our primary methodology of doing this is through DNS.
  5. There are two obvious ways to do this:….. But we know that network distance is not equal to geographical distance
  6. Median time to download the tiny object
  7. This is a slide that makes me very happy. This is the .....
  8. Not so fast...... I always like to verify major decisions with data. In this case, we wanted to verify whether the changes we made were effective.
  9. So, what did we find?
  10. We found that ...
  11. Digging deeper
  12. 15 mins more
  13. And how do we test it? Lets use RUM
  14. But of course, there was another plot twist. Can you add graph instead?
  15. So why was it happening?
  16. So we came up with what we “regional” anycast concept. Basically, we will still use DNS, but only to decide which continent is this user from. Within a continent, we will use anycast
  17. 5 min more
  18. Pulled some strings.