Last.fm vs. Xbox


David Singleton
last.fm/user/underpangs
twitter.com/dsingleton
Music discovery powered by Scrobbling

Personalised radio, social network, events, a
“wikipedia of music”

High traffic

  Monthly visitors: 40 million

  Monthly page views: 500,000 million
Last.fm Architecture
       Load balancer



        HTTP Cache



        Web Server



        Object Cache



         Database
Xbox Live Platform,
millions of users

Last.fm Radio App

Built by Microsoft

Powered by our API

Launched alongside
Facebook & Twitter
Last.fm vs. Xbox
So, what’s there to
    talk about?
Good Things ™

  New users. It’s really cool.

Bad Things™

  Lots of new users, traffic spikes

  A very important, high profile, launch

How did Last.fm approach this?
Xbox Live: 15 million users
  assuming a 10% take-up rate = 1,500,000 users

Per user: startup (5 requests) + starting radio (5 requests) + 15 minutes of radio (60 requests)
  1 hour of radio = 5 + 5 + 4 x 60 = 250 requests per user
  an hour of radio per user is a rough averaged guess

1,500,000 users = 375,000,000 requests over 24 hours
  assuming an even distribution = roughly 4,500 requests / second (375,000,000 / 86,400 is about 4,340)
  likely peaking at more than triple = 15,000 requests / second

Last.fm: 2,000 requests / second
  based on number of servers and Apache configuration
  estimated max capacity of 3,500 requests / second
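
The arithmetic behind this estimate can be reproduced in a few lines. This is only a sketch using the figures quoted on the slide; the function name and the flat peak factor of 3 are assumptions for illustration (the slide says "more than triple" and rounds up to 15,000).

```python
# Back-of-envelope launch estimate using the figures from the slide.
XBOX_LIVE_USERS = 15_000_000
TAKE_UP_RATE = 0.10
REQUESTS_PER_USER = 5 + 5 + 4 * 60  # startup + starting radio + 4 x 15 minutes of radio
SECONDS_PER_DAY = 24 * 60 * 60
PEAK_FACTOR = 3                     # assumption: the slide only says "more than triple"
MAX_CAPACITY_RPS = 3_500            # estimated max capacity from the slide


def estimate(users, take_up, requests_per_user, peak_factor):
    """Return (active users, daily requests, average req/s, peak req/s)."""
    active_users = round(users * take_up)
    daily_requests = active_users * requests_per_user
    average_rps = daily_requests / SECONDS_PER_DAY
    peak_rps = average_rps * peak_factor
    return active_users, daily_requests, average_rps, peak_rps


active, daily, average_rps, peak_rps = estimate(
    XBOX_LIVE_USERS, TAKE_UP_RATE, REQUESTS_PER_USER, PEAK_FACTOR
)
print(f"{active:,} users, {daily:,} requests/day")
print(f"average {average_rps:,.0f} req/s, peak {peak_rps:,.0f} req/s")
print("exceeds capacity" if peak_rps > MAX_CAPACITY_RPS else "within capacity")
```

Even the evenly distributed average is above the estimated maximum capacity, which is the point of the slide that follows.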
Oh fuck
What next?

Picked a metric: requests per second

Estimated traffic increase vs capacity

Selected our goals:

  Serve requests faster

  Reduce number of requests
Profiling traffic

Used traffic generated during beta testing

Web server request logs

  Common, widely supported format

  Hundreds of existing tools

We generated some stats using AWK...
Which API requests
     were made?
        Method

71638   track.getInfo
53941   artist.getImages
15150   radio.getPlaylist
 7308   library.getArtists
 5020   user.getRecentStations
 4979   ads.getVideos
 4205   radio.tune
 3155   track.love
 1507   artist.getInfo
 1258   user.getRecommendedArtists
 1135   user.getInfo
 1130   geo.getTopArtists
 1128   radio.gamerStations
 1102   tag.getTopArtists
 1021   track.ban
 1006   user.getLovedTracks
  340   library.addArtist
  206   auth.getMobileSession
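
The talk used AWK for this; the same per-method count can be sketched in Python. The log line layout below (a common-log-style request line carrying a `method=` query parameter) and the sample lines are assumptions for illustration, not Last.fm's real log data.

```python
import re
from collections import Counter

# Assumption: the API method appears as a "method=" query parameter in the request line.
METHOD_RE = re.compile(r"[?&]method=([A-Za-z]+\.[A-Za-z]+)")


def count_methods(log_lines):
    """Count how often each API method appears in web server log lines."""
    counts = Counter()
    for line in log_lines:
        match = METHOD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Jan/2010:00:00:01 +0000] "GET /2.0/?method=track.getInfo&artist=a HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2010:00:00:02 +0000] "GET /2.0/?method=artist.getImages&artist=a HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Jan/2010:00:00:03 +0000] "GET /2.0/?format=json&method=track.getInfo HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Jan/2010:00:00:04 +0000] "GET /favicon.ico HTTP/1.1" 404 0',
]

counts = count_methods(sample_log)
for method, calls in counts.most_common():
    print(f"{calls:6d}  {method}")
```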
Which API requests
   were made?
Raw data from beta (Total and Average are request times, units not stated; Average is roughly Total / Calls)
Calls   Method                        Total   Average

53941   artist.getImages              19647    0.36
71638   track.getInfo                 15789    0.22
15150   radio.getPlaylist              6962    0.46
 7308   library.getArtists             2402    0.33
 4979   ads.getVideos                  1810    0.36
 5020   user.getRecentStations         1674    0.33
 1102   tag.getTopArtists              1488    1.35
 1258   user.getRecommendedArtists     1457    1.16
 4205   radio.tune                      923    0.22
 1130   geo.getTopArtists               575    0.51
 1507   artist.getInfo                  440    0.29
 1128   radio.gamerStations             298    0.26
 1006   user.getLovedTracks             271    0.27
 1135   user.getInfo                    171    0.15
  206   auth.getMobileSession            38    0.19
  136   user.signUp                      32    0.24
  123   user.terms                       16    0.13
 3155   track.love                        0    0.00
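
To find where the time goes, the rows above can be ranked by total time and by average time per call. A minimal sketch, with the figures copied from the table; the function and variable names are made up for illustration.

```python
# (calls, method, total time) copied from the beta table above.
ROWS = [
    (53941, "artist.getImages", 19647),
    (71638, "track.getInfo", 15789),
    (15150, "radio.getPlaylist", 6962),
    (7308, "library.getArtists", 2402),
    (4979, "ads.getVideos", 1810),
    (5020, "user.getRecentStations", 1674),
    (1102, "tag.getTopArtists", 1488),
    (1258, "user.getRecommendedArtists", 1457),
    (4205, "radio.tune", 923),
    (1130, "geo.getTopArtists", 575),
    (1507, "artist.getInfo", 440),
    (1128, "radio.gamerStations", 298),
    (1006, "user.getLovedTracks", 271),
    (1135, "user.getInfo", 171),
    (206, "auth.getMobileSession", 38),
    (136, "user.signUp", 32),
    (123, "user.terms", 16),
    (3155, "track.love", 0),
]


def rank_by_total(rows):
    """Return (method, total, share of all time, average per call), largest total first."""
    grand_total = sum(total for _, _, total in rows)
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    return [
        (method, total, total / grand_total, total / calls)
        for calls, method, total in ordered
    ]


ranked = rank_by_total(ROWS)
for method, total, share, average in ranked[:5]:
    print(f"{method:28s} {total:6d}  {share:5.1%}  avg {average:.2f}")

slowest = max(ranked, key=lambda row: row[3])
print("slowest per call:", slowest[0])
```

Ranking by total highlights the high-volume calls worth caching or removing; ranking by average highlights individual calls worth profiling.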
How long did each
 method take?
Why so many
track.getInfo calls?
A tiny UI tweak...

...responsible for 25% of calls.

Arrggghhhhhh

Added that information to a sensible API call

Microsoft kindly updated the app
What next?
What about the
  getImages calls?

Powers an artist slideshow visualisation

Results of this call won’t change often

  Set an HTTP cache timeout

Set caching on a few other calls too
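
Setting a cache timeout per API method might look like the sketch below, which picks a `Cache-Control` header for each response. The lifetimes, and the choice of `geo.getTopArtists` as one of the "other calls", are assumptions for illustration; the talk does not say which values or calls were used.

```python
# Hypothetical cache lifetimes in seconds; the real values are not given in the talk.
CACHE_SECONDS = {
    "artist.getImages": 24 * 60 * 60,  # results of this call won't change often
    "geo.getTopArtists": 60 * 60,      # assumed example of "a few other calls"
}


def cache_headers(method):
    """Return the HTTP caching header for a response to the given API method."""
    max_age = CACHE_SECONDS.get(method)
    if max_age is None:
        # Anything not explicitly listed is never stored by the HTTP cache.
        return [("Cache-Control", "no-store")]
    return [("Cache-Control", f"public, max-age={max_age}")]


print(cache_headers("artist.getImages"))
print(cache_headers("track.love"))
```

With headers like these, the HTTP cache in front of the web servers can answer repeat requests without touching the application.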
Cached Requests



4
Request generation
Calls Method                         Total    Average

53941   artist.getImages              19647   0.36
71638   track.getInfo                 15789   0.22
15150   radio.getPlaylist              6962   0.46
 7308   library.getArtists             2402   0.33
 4979   ads.getVideos                  1810   0.36
 5020   user.getRecentStations         1674   0.33
 1102   tag.getTopArtists              1488   1.35
 1258   user.getRecommendedArtists     1457   1.16
 4205   radio.tune                      923   0.22
 1130   geo.getTopArtists               575   0.51
 1507   artist.getInfo                  440   0.29
 1128   radio.gamerStations             298   0.26
 1006   user.getLovedTracks             271   0.27
 1135   user.getInfo                    171   0.15
  206   auth.getMobileSession            38   0.19
  136   user.signUp                      32   0.24
  123   user.terms                       16   0.13
 3155   track.love                        0   0.00
kcachegrind
 http://kcachegrind.sourceforge.net




  webgrind
 http://code.google.com/p/webgrind/
What happens if
   things break?

Simulated failing calls

Highlighted essential calls

Acted as a dry-run for launch day failures

  Informed our backup plans
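
Simulating failing calls can be as simple as wrapping the API handler so that chosen methods raise an error, then watching which failures the client survives. A sketch under assumptions: the client flow below, and which calls it treats as essential, are hypothetical and not taken from the real Xbox app.

```python
class SimulatedFailure(Exception):
    """Raised in place of a real response for methods we are pretending are down."""


def make_flaky(handler, failing_methods):
    """Wrap an API handler so that the named methods always fail."""
    def wrapped(method, **params):
        if method in failing_methods:
            raise SimulatedFailure(method)
        return handler(method, **params)
    return wrapped


def fake_api(method, **params):
    """Stand-in for the real API: returns a one-item list naming the method."""
    return [method]


def start_radio(call):
    """Hypothetical client flow: tuning and the playlist are essential, images are not."""
    call("radio.tune", station="example")
    playlist = call("radio.getPlaylist")
    try:
        images = call("artist.getImages", artist="example")
    except SimulatedFailure:
        images = []  # degrade gracefully: play music without the slideshow
    return playlist, images


# Non-essential call down: radio still starts.
print(start_radio(make_flaky(fake_api, {"artist.getImages"})))

# Essential call down: the whole flow fails, so this call needs a backup plan.
try:
    start_radio(make_flaky(fake_api, {"radio.tune"}))
except SimulatedFailure as failure:
    print("essential call failed:", failure)
```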
Only essential
  requests
Prepare for the worst

 Unexpected problems we’ve had:

  Servers overheating (twice)

  Hardware (almost) stolen from data-centers

  Power outage in the office
Backup plans, AKA
     The “Kill List”
Plan                                 Effect                            Severity

Disable radio DB-backing             Faster calls                      Minor
Disable Flash Player                 Save 200 req/sec                  Major
Drop non-essential Xbox API calls    Reduce Xbox traffic by 0 - 50%    Extreme
Drop X% of radio tune calls          Reduce Xbox traffic by X%         Nuclear
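
The last entry on the kill list, dropping a percentage of radio tune calls, amounts to probabilistic load shedding. A minimal sketch, assuming the decision is made per request with a random draw; how Last.fm actually implemented it is not described in the talk.

```python
import random


def should_drop(method, drop_percent, rng=random.random):
    """Decide whether to reject a request; only radio.tune calls are ever dropped."""
    if method != "radio.tune":
        return False
    return rng() * 100 < drop_percent


# Example: shed roughly 30% of tune calls, using a seeded generator for repeatability.
seeded = random.Random(0).random
dropped = sum(should_drop("radio.tune", 30, rng=seeded) for _ in range(10_000))
print(f"dropped {dropped} of 10000 tune calls")
```

Keeping the percentage as a single tunable value means it can be raised or lowered on launch day without a code change.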
Communication
Last.fm: Launch Day
    (When traffic attacks)
How did it go?

Our estimate was about 50% over

Didn’t exceed capacity (but got quite close)

Profiling and caching were essential

Without them we would have gone down
What did we learn?

Use timezones to roll out slowly

Traffic will follow daily trends

Live monitoring is essential

Backup plans are comforting

Pre-fill caches before launch
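
Pre-filling a cache means requesting the most common calls yourself before real users arrive. A sketch, assuming a plain dictionary as the cache and a hypothetical `fetch` function standing in for the real backend; the list of common calls is invented.

```python
def prime_cache(cache, fetch, common_calls):
    """Fetch each (method, params) pair not yet cached; return how many were added."""
    primed = 0
    for method, params in common_calls:
        key = (method, tuple(sorted(params.items())))
        if key not in cache:
            cache[key] = fetch(method, params)
            primed += 1
    return primed


def fake_fetch(method, params):
    """Stand-in for an expensive backend call."""
    return {"method": method, "params": params}


# Invented examples of calls likely to be popular at launch.
common_calls = [
    ("geo.getTopArtists", {"country": "GB"}),
    ("geo.getTopArtists", {"country": "GB"}),  # duplicate: fetched only once
    ("tag.getTopArtists", {"tag": "rock"}),
]

cache = {}
print("primed", prime_cache(cache, fake_fetch, common_calls), "entries")
```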
So, how does this
    help me?
1. Estimate

Choose your metric

Estimate launch traffic

Compare against capacity

Make performance targets

Know your limitations
2. Profile requests

Start with a sample of traffic

Extract data for your metric

Visualise the results

Identify expensive requests for your metric

Use profiling tools on individual requests
3. Optimise
Reduce number of requests

Set the right HTTP caching headers

  Combine with reverse web proxy

  Prime caches for common calls

Use an object cache

Avoid language level optimisation
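
An object cache sits between the web server and the database and stores computed results for a limited time. The sketch below is an in-process stand-in with a time-to-live, for illustration only; the class and its interface are assumptions, and a production setup would normally use a shared cache service rather than a Python dictionary.

```python
import time


class TTLCache:
    """Tiny in-memory object cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


database_hits = []


def load_images():
    """Stand-in for an expensive database query."""
    database_hits.append("query")
    return ["image-1", "image-2"]


images_cache = TTLCache(ttl_seconds=60)
images_cache.get_or_compute("artist.getImages:example", load_images)
images_cache.get_or_compute("artist.getImages:example", load_images)
print("database queries:", len(database_hits))
```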
Web Request



Load balancer



HTTP Cache



 Web Server



Object Cache



  Database
4. Plan for failure

Simulate failures

Know your weak spots

Prepare backup plans

Communicate with users and partners
5. Launch it!
Roll out slowly, if you can

Set up live monitoring

If something goes wrong:

  Don’t panic

  Keep people updated

Have some champagne on ice
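
Live monitoring of the chosen metric, requests per second, could be sketched as a sliding-window counter with a warning threshold. The class, the 10-second window and the 80% threshold are assumptions for illustration; only the 3,500 requests per second capacity figure comes from the talk.

```python
from collections import deque

MAX_CAPACITY_RPS = 3_500  # estimated max capacity quoted earlier in the talk


class RequestRateMonitor:
    """Track request timestamps and report requests per second over a sliding window."""

    def __init__(self, window_seconds=10.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Register one request seen at the given time (in seconds)."""
        self._timestamps.append(timestamp)

    def rate(self, now):
        """Discard requests older than the window, then return the average req/s."""
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds


def near_capacity(rate, capacity=MAX_CAPACITY_RPS, threshold=0.8):
    """True when the rate has reached the chosen fraction of capacity."""
    return rate >= capacity * threshold


monitor = RequestRateMonitor(window_seconds=10.0)
for second in range(20):
    monitor.record(float(second))
print("current rate:", monitor.rate(now=20.0), "req/s")
```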
1. Start with an estimate
2. Profile your traffic
3. Make optimisations
4. Prepare for the worst
5. Launch it!
Last.fm vs. Xbox

Questions?
David Singleton
last.fm/user/underpangs
twitter.com/dsingleton
