Building big things in Java
Mark Pope and Tom Coupland

Nokia Entertainment

Search & Buy

               Mix Radio
[Diagram, top to bottom: Internet / Entertainment API / Search, Library, Radio, Gigs, Delivery]
Search




http://mhm.hud.ac.uk/newsroom/image/record-store-3
[Diagram, top to bottom: Entertainment API / Search / Solr, Solr, Solr]
Search

• 200 requests per second
• 300ms response time
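
The speaker notes mention running two Search servers for failover. A minimal sketch of that idea from the caller's side, assuming a client that simply tries each server in turn; the class name, method name and server names are hypothetical and not taken from the real service:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: try each server in order, fall over to the next on failure.
class FailoverSketch {

    // Returns the first successful response; throws if every server fails.
    static <R> R callWithFailover(List<String> servers, Function<String, R> call) {
        RuntimeException last = null;
        for (String server : servers) {
            try {
                return call.apply(server);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next server
            }
        }
        throw new IllegalStateException("all servers failed", last);
    }

    public static void main(String[] args) {
        // "search-1" is simulated as down, so the call lands on "search-2".
        String result = callWithFailover(List.of("search-1", "search-2"), server -> {
            if (server.equals("search-1")) {
                throw new RuntimeException("connection refused");
            }
            return "response from " + server;
        });
        System.out.println(result);
    }
}
```

In practice a load balancer would usually do this job; the sketch only shows why a second server keeps requests flowing when the first one fails.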
[Diagram, top to bottom: Entertainment API (EAPI) / Search / Solr, Solr, Solr]
Solr

• 3 servers (96GB RAM, 12 cores)
• 420 requests per second
• 15ms response time
• 25 million documents
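
The speaker notes describe documents being broken into individual words and indexed, so that a search for "stone" finds Stone Roses, Joss Stone and Rolling Stones. A toy inverted index along those lines, as an illustration only: Solr does far more (configurable analysis, ranking, replication), and the trailing-"s" stripping here is a crude stand-in for real stemming, assumed so that "stone" also matches "Stones".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy index; not how Solr is implemented.
class InvertedIndexSketch {

    // Maps each normalized word to the documents (here: artist names) containing it.
    private final Map<String, List<String>> postings = new HashMap<>();

    // Lower-case the word and drop one trailing "s" (crude stand-in for stemming).
    private static String normalize(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        return (w.length() > 1 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }

    // Break a document into words and record the document under each word.
    void add(String document) {
        for (String word : document.split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            List<String> docs = postings.computeIfAbsent(normalize(word), k -> new ArrayList<>());
            if (!docs.contains(document)) {
                docs.add(document);
            }
        }
    }

    // Look up a single word; matches come back in the order they were added.
    List<String> search(String word) {
        return postings.getOrDefault(normalize(word), Collections.emptyList());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("Stone Roses");
        index.add("Joss Stone");
        index.add("Rolling Stones");
        System.out.println(index.search("stone")); // prints [Stone Roses, Joss Stone, Rolling Stones]
    }
}
```

The lookup is a single map access regardless of how many documents are indexed, which is the main reason to reach for an inverted index rather than scanning rows with SQL pattern matching.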
You found it!
Now what?
http://news.bbc.co.uk/sport2/hi/olympics/7564645.stm
http://winncollier.com/the-frugal-side-of-me/
Library



http://kraftylibrarian.com/?p=1596
[Diagram, top to bottom: Entertainment API / Library / Sharding / shards a-d, e-j, k-o, p-s, t-u, v-z]
Library


• 1,000,000,000 rows
• 800,000 requests per day
• 300ms response time
• 26 servers
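
The letter ranges on the Library slides (a-d through v-z) suggest range-based sharding. A minimal sketch of how a request could be routed to a shard, assuming the shard is chosen by the first letter of some user key; the slides do not say what the key actually is, so `userKey` and the class name are hypothetical:

```java
import java.util.Locale;
import java.util.TreeMap;

// Hypothetical sketch of range-based shard selection using the slide's six ranges.
class ShardRouterSketch {

    // Lower bound of each letter range mapped to a label for that shard.
    private static final TreeMap<Character, String> SHARDS = new TreeMap<>();
    static {
        SHARDS.put('a', "a-d");
        SHARDS.put('e', "e-j");
        SHARDS.put('k', "k-o");
        SHARDS.put('p', "p-s");
        SHARDS.put('t', "t-u");
        SHARDS.put('v', "v-z");
    }

    // Picks the shard whose range contains the first letter of the key.
    static String shardFor(String userKey) {
        if (userKey == null || userKey.isEmpty()) {
            throw new IllegalArgumentException("key must not be empty");
        }
        char first = userKey.toLowerCase(Locale.ROOT).charAt(0);
        if (first < 'a' || first > 'z') {
            throw new IllegalArgumentException("key must start with a letter a-z: " + userKey);
        }
        // floorEntry finds the greatest lower bound that is <= the first letter.
        return SHARDS.floorEntry(first).getValue();
    }

    public static void main(String[] args) {
        System.out.println(shardFor("mark")); // prints k-o
        System.out.println(shardFor("tom"));  // prints t-u
    }
}
```

Because every lookup for one user lands on a single shard, each database only holds a slice of the 1,000,000,000 rows, and capacity can be added by adding shards rather than buying one larger machine.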
Halfway
Mix Radio




http://alyssahsadie.deviantart.com/art/Old-School-BoomBox-24949091
[Diagram, top to bottom: Entertainment API / Personalisation, Catalogue / db]
Entertainment API

• 35 developers, 6 teams, 2 countries
• 19,000 lines of code
• 155 deployments in 2011
• 160 deployments in 2012
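
The speaker notes describe the Entertainment API as providing auth, a proxy for routing, and a common API in front of the services. A rough sketch of the routing part, assuming path-prefix routing; the paths, backend names and the "any non-empty token" auth check are placeholders invented for illustration, not the real EAPI:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a thin, stateless API layer that authenticates and routes.
class ApiRouterSketch {

    // Path prefix mapped to a backend service name (all names are made up).
    private final Map<String, String> routes = new LinkedHashMap<>();

    ApiRouterSketch() {
        routes.put("/search", "search-service");
        routes.put("/library", "library-service");
        routes.put("/radio", "radio-service");
        routes.put("/gigs", "gigs-service");
        routes.put("/delivery", "delivery-service");
    }

    // Returns the backend for a request, or empty if unauthenticated or unknown.
    Optional<String> route(String path, String authToken) {
        // Placeholder auth: a real layer would validate the token properly.
        if (authToken == null || authToken.isEmpty()) {
            return Optional.empty();
        }
        for (Map.Entry<String, String> route : routes.entrySet()) {
            String prefix = route.getKey();
            // Match the prefix exactly or as a whole leading path segment.
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return Optional.of(route.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ApiRouterSketch router = new ApiRouterSketch();
        System.out.println(router.route("/search/tracks", "token")); // prints Optional[search-service]
    }
}
```

The router holds no per-request state, which is what lets a layer like this be scaled by adding servers.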
Show me the music!
Delivery




                           LOUD NOISES!
http://pursuitbikes.com/blog/?p=9279
[Diagram: Entertainment API above Delivery and Packaging, with NAS alongside]
Delivery

• 1 petabyte of music and books
• 16 million tracks
• 6 formats
• 1 million tracks served per day
So...
That’s how we built (part of) a big thing in Java
Questions?
Thanks!
• Mark Pope
  • twitter.com/scobal
  • github.com/scobal

• Tom Coupland
  • twitter.com/tcoupland
  • github.com/mantree


Editor's Notes

  • #2 Welcomes. Thanks for coming. We’re Tom and Mark, from Nokia. Here to talk about how we deliver a global music service and how we built it with Java. (Do we add in a story throughout to keep the ideas concrete?)
  • #3 Located in central Bristol. Design, build and maintain the service (pretty much exclusively). Provide music to millions of users around the world. Scaled to serve around 1 million downloads a day. Java, Windows Phone, web.
  • #4 Talk through parts of our service. Two use cases: Search and adding to Library (find and own); Mix Radio and Delivery (new product called Mix Radio). First going to talk about high-level architecture...
  • #5 Multiple client platforms (Windows Phone, web, Symbian). Adopted SOA 3 years ago: were struggling with a single monolith, lots of clashing. Teams work on groups of services. Break a use case down into a set of responsibilities, with a service for each one. Scale single components. EAPI provides: auth, a proxy for routing, a common API. JSON over HTTP with REST. Talk more about it later...
  • #6 Think about how you buy music from a store, and how you might model the process in software. So, we ended up with something a bit like this:
  • #7 EAPI routes requests to the Search service.
  • #8 So we’re now going to talk about Search.
  • #9 Why do we need this layer? Stateless. Built for high throughput. Returns track metadata, artist information, track names. Layer of abstraction above Solr. Talk internal language. Java, Spring (open source tool for building Java software [IoC {break out acronyms}, DI]), REST API over HTTP. Two servers for failover - key concept when building big things.
  • #12 What is Solr? Solr is an Apache open source full-text search server built on an inverted index. So... why use Solr instead of SQL? Put in documents containing artist and song information; they get broken down into individual words, which are then indexed. Search for “stone” -> Stone Roses, Joss Stone, Rolling Stones. 3 servers: failover and scale requests.
  • #13 So, this is how our search architecture works. So...
  • #14 You’ve found what you’re looking for. What’s the next step?
  • #15 Well, you have to give us some money. We’re not going to concentrate on this.
  • #16 We have to know what you own. Personal collection of music/books. How to model this in software (when you have to store millions of users’ collections). (Collections can be ma-hou-sive.) Clients use this to obtain what a user owns, so you can see what you own on the clients.
  • #17 The familiar clients and API layer. Routes to Library sub-system.
  • #18 The familiar clients and API layer. Routes to Library sub-system.
  • #19 Why do we need to shard? Why not one big database (horizontal vs vertical [expensive] scaling)? Clone, copy, master-master. Each shard replicated twice - failover and redundancy when building big things. Hot failover. Thin service runs on 8 servers, 12 db servers, 6 sharding servers. Spring and Hibernate, open source. Open source MySQL.
  • #20 Not really an intermission. We’re just halfway through our slides.
  • #21 Mix Radio - product on Nokia Lumia phones for streaming free music. Hand crafted - created by experts to suit mood and taste. Mobile optimised - codec with great sound and small file size. How might you do this with software?
  • #22 Back to our old friend EAPI. Can see the two paths - not going to talk about personalisation today. Looking for curated radio stations. These are stored in a service called Catalogue.
  • #23 Catalogue is a big list of all the music and books we offer. Metadata store. Also stores curated radio stations. EAPI retrieves the radio station you selected (which the client will then play). However, take a quick look at this EAPI thing. What is it?
  • #24 Conceptually a thin layer - actually a bit more to it (because of its centrality). Client agnostic. Stateless. Continuous change, integration and delivery. Highly modular, allowing collaboration and reducing check-in conflicts (with Git). Pairing, TDD. Danny and Neil talking in more detail later about how we work.
  • #25 Ok
  • #26 How do we make noise come out of your phone?
  • #27 Sub-systems that make noise come out of your phone. Responsible for: making sure you’re licensed to download your selection; actually serving up the location of the music or book; adding images and metadata to your download. NAS for storing music + books - massive.
  • #28 So how big is this thing?
  • #29 Mention VCDN. What is a CDN (Content Delivery Network)? A big cache out in the cloud. CDNs power ‘most’ of the internet, acting as caches for content. Located all over the world, reduce load. Great Firewall of China. Stores our eaac32 formatted tracks.