Couchbase_UK_2013_Betfair_and_Couchbase

Published in: Technology
  • Hello there everyone. Those who registered early (Tim and Abe) were gazumped by business. My name is Martin Anderson. I'm currently a Technical Consultant working for Betfair Australia out of Sydney, but before that I was the Chief Site Architect at Betfair for 2 years, and I've been with the business for almost 4 years. My main responsibility has been heading up the complete replacement of our web tier for Betfair.com. That was a brand new platform for all our web channels, both desktop and mobile, including the introduction of Continuous Delivery and NoSQL.
  • So this is what I hope to cover in this talk: some background on Betfair; why Couchbase was chosen at Betfair; some thoughts on adopting NoSQL in enterprises. It's worth mentioning that there is a Q&A session at 5pm, so you can catch me there or feel free to grab me during the conference.
  • Before we go into why Betfair selected Couchbase and how we use it, we need to know a bit more about who Betfair are, what they do and what technologies they use to do it. So who are Betfair? Betfair was created in 1999 as a startup between a developer and a city trader around the concept of a betting exchange: exactly like a stock exchange but with bets rather than shares. Since then the company has grown to be one of the largest online gambling companies in the world, offering not just the exchange but also sportsbook betting, casino, arcade, bingo and poker. It is very much a dot-com success and very much a British one, with the headquarters in Hammersmith, although there are development offices in Romania, Portugal and Australia.
  • We have a lot of products, but the main one that we are known for is the betting exchange. Unlike a normal bookmaker, where you can only back an outcome (like "I want …"), you are able to lay it too. Laying is effectively just taking a back bet from another person. Size-wise, we do a fair bit of business, and that means that there is a fair amount of data flying around. Here are some numbers for you.
  • This all comes from a volume of bets that exceeds the combined volumes of all the stock exchanges in Europe. My favourite is that 20% of customers admitted that they have used their mobile to bet at a wedding. We are practically a bank: we deal with massive volumes of money, so people are very interested in our site staying up, being secure and being fast. The company has development centres in the UK, US, Portugal, Romania and Australia. We have a whole host of products, not just the exchange, and of course our products have very strict rules from regulators. There is a massive amount of complexity. The complexity is not just around data volumes and the speed at which we have to process them, but also that we offer multiple products across multiple channels in multiple jurisdictions with oversight from multiple regulators. But we are going to focus on data.
  • So what sort of data are we looking at? Pretty much the full gamut. Market data: new markets are created all the time and they need to be surfaced on the site when they are. Market pricing data: this is the one that changes every 100ms. The navigational hierarchy data is actually a Directed Acyclic Graph that needs to be correct for each user. As an example, Italian users will have specific markets for only them, while Danish users cannot be offered events like horse racing since an animal is involved. Transactional data: of course, since people are placing bets. Operational monitoring: we are big exponents of DevOps and making sure that we know what's happening in the business. Because the system is not simple, this is the only way we can know what is going on. Over 500 GB of data per month just from logging, not including the rest of operational monitoring.
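A minimal sketch of the jurisdiction-sensitive navigation idea described above, assuming the hierarchy is held as a parent-to-children mapping and that visibility rules are simple per-node sets. The node names, the `BLOCKED` and `ONLY` rule tables and the function names are all hypothetical and only illustrate the technique; they are not taken from Betfair's system.

```python
from collections import deque

# Hypothetical navigation DAG: parent -> children. A node may have several parents.
NAV = {
    "root": ["football", "horse_racing", "today"],
    "football": ["serie_a_it_only", "premier_league"],
    "horse_racing": ["ascot"],
    "today": ["premier_league", "ascot"],
    "serie_a_it_only": [],
    "premier_league": [],
    "ascot": [],
}

# Hypothetical rules: nodes hidden in some jurisdictions, nodes shown only in others.
BLOCKED = {"horse_racing": {"DK"}, "ascot": {"DK"}}
ONLY = {"serie_a_it_only": {"IT"}}


def visible(node, jurisdiction):
    """Return True if this node may be shown to a user in the jurisdiction."""
    if jurisdiction in BLOCKED.get(node, set()):
        return False
    allowed = ONLY.get(node)
    return allowed is None or jurisdiction in allowed


def navigation_for(jurisdiction, root="root"):
    """Breadth-first walk returning the sub-graph visible to one jurisdiction."""
    result, seen, queue = {}, {root}, deque([root])
    while queue:
        node = queue.popleft()
        children = [c for c in NAV[node] if visible(c, jurisdiction)]
        result[node] = children
        for child in children:
            if child not in seen:  # a DAG node can be reached via several parents
                seen.add(child)
                queue.append(child)
    return result
```

Calling `navigation_for("DK")` drops the horse racing branch everywhere it appears, while `navigation_for("IT")` keeps the Italy-only market. Filtering at read time like this is only one possible design; precomputing a hierarchy per jurisdiction would be another.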
  • Java is not cool, but:
      • Good knowledge already at Betfair
      • Real concurrency: great for heavy server workload
      • Large community
      • Great toolset
      • Operations teams understand Java: stats, GC logs, deployment process
    Oracle
  • WE LOVE ORACLE! The lifeblood of our transaction system: in fact, our core exchange business is based around Oracle. Highly performant: this might be surprising, but just because something is an RDBMS it is not necessarily slow. Well understood: we have a lot of experience with this. We are comfortable using it and know what to do when things don't go the way we planned. Resilient: given that we are a bank in many ways, how happy would you be if your bank went down? We are in the business of staying up.
  • Impedance mismatch with object-oriented languages: the rise (and fall?) of technology like Hibernate and other ORMs highlights this. When you are developing, there is a clear break between your application logic and the persistence technology with an RDBMS. Object models possible in RDBMS, but at what cost? You can solve this issue, but what are the costs, both in development and then in the ongoing maintenance, as you fit a square peg into a round hole? Must have serious skills at this scale: we are one of the top 5 hottest Oracle databases in the world. Scaling not easy: clustering and sharding are easy to say, not so easy to do. Integration with Continuous Delivery? We deploy at least once a week. I don't want to go on too much about Continuous Delivery, but I firmly believe that it is no longer an optional requirement for software development. One of the fundamental tenets of CD is that your process is automated. For this to happen it needs to be deterministic, and one of the easiest ways to guarantee this is to make the process simple. Unfortunately, things like database migrations and green/blue deployments are inherently complex, even with tooling like DBDeploy.
  • So why should we use NoSQL? Well, the reasons are these… The time from concept to cash.
  • Coherence, Memcached, CouchDB, Cassandra, MongoDB, HBase, Redis
  • So why so many? From one perspective, since NoSQL is an umbrella term, you would naturally expect to have multiple types. Some of these technologies are dependencies of other technologies: for example, the deployment tool Chef uses CouchDB, and OpenTSDB, which we use extensively for monitoring, uses HBase under the hood. So what about direct usage, where our applications are directly using these technologies?
      • Coherence: distributed caching in various tiers
      • Memcached: distributed caching in web tier
      • Cassandra: storage in web and service tier
      • MongoDB: storage in prototypes and caching in Australia
      • Redis: high speed sorted set delivery in US Exchange
    Do we understand why we chose that solution?
  • It's fairly common for large organizations to cycle between product delivery and then delivering efficiencies/optimisations on those products, especially in an Agile world. We were just coming off the back of not just the delivery of a new web platform but actually a raft of new deliveries across multiple products, channels and even countries. This means that sometimes our technology is chosen based on what is currently supported rather than
  • Good news: we've had experience with K/V, document and columnar stores and seen how these things break. Bad news: Cassandra is a great piece of tech. Very good for high writes, but not optimal for read-heavy or even equal read/write workloads, especially when you want strong consistency. Since the client is unaware of the server topology, you need to have quorum (explain) reads/writes to achieve this. You get intermittent high p95 unless you go to SSDs or front it with Memcached. http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
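The quorum point can be made concrete with the usual overlap rule: with N replicas, a read touching R of them and a write touching W of them are guaranteed to share at least one replica exactly when R + W > N. The sketch below is generic arithmetic, not Cassandra client code, and the function names are illustrative assumptions. It also brute-forces small cases to confirm the rule.

```python
from itertools import combinations


def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def is_strongly_consistent(n, r, w):
    """Overlap rule: every read set intersects every write set iff r + w > n."""
    return r + w > n


def always_overlap(n, r, w):
    """Brute-force check: does every r-sized read set share a node with every w-sized write set?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )
```

With three replicas, `quorum(3)` is 2, so quorum reads plus quorum writes satisfy the rule, while reading and writing a single replica does not. The cost is that every operation waits on more than one node, which is one plausible source of the latency tail mentioned above.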
  • Pretty simple really, since there are only two questions. The trick is to remember that your use cases need to include your future use cases, not just the ones you have now, and that your optimum solution needs to answer a whole raft of questions, not just the obvious ones.
  • Web Tier Persistence. Session storage: easy to do with Memcached or Coherence, but we want cross-session as well, so we need persistence. Strong consistency at scale with high performance. Known issues of cold cache and deployment complexity. Too many people try to use and abuse cookies for this: not secure, size constraints, so pipe-separated magic values that quickly become just magic numbers. Mobile usage: avoids issues with things like low-powered devices.
  • User Preferences. Simple settings for how you want your experience to be. This is effectively a map of keys to values, but can be a multimap. Dates back to when there was a single product on a single channel. Now multi-product on multiple channels, so an obvious need to be separate. Why? They move at different speeds with different deployment frequencies. Also, feature throttles for rollout and A/B testing: lazy load the data and let the application be in control.
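One way to picture a feature throttle layered on a preferences map is a deterministic percentage rollout with a per-user override. This is a minimal sketch under stated assumptions: the `feature.<name>` key convention, the hashing scheme and both function names are hypothetical, not the mechanism used in the talk.

```python
import hashlib


def bucket(account_id, feature, buckets=100):
    """Map an account to a stable bucket in [0, buckets) for one feature."""
    digest = hashlib.sha256(f"{feature}:{account_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def feature_enabled(account_id, feature, rollout_percent, preferences=None):
    """Decide whether a feature is on for this account.

    An explicit entry in the (hypothetical) preferences map wins; otherwise
    the account's bucket is compared with the rollout percentage.
    """
    prefs = preferences or {}
    override = prefs.get(f"feature.{feature}")
    if override is not None:
        return bool(override)
    return bucket(account_id, feature) < rollout_percent
```

Because the bucket depends only on the account and feature name, the same user sees the same variant on every channel without any stored state, and raising `rollout_percent` only ever adds users to the enabled group.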
  • Server side rendered content. SOA Data Services exposed. Supports >200,000 concurrent users.
  • 22 MINUTE CHECK. The importance of the blocking calls at the top being fast. Remember to loop back and talk about task deduplication.
  • Mention the performance work done by Altoros. Client being aware of the server topology means that you have more deterministic behaviour. Ejection onto disk allows great overflow from RAM. Memcached replacement. Re Cassandra: mention the two Netflix articles on their website; I will tweet them afterwards.
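To illustrate why a topology-aware client behaves more deterministically, here is a simplified sketch of key-to-node routing through a partition map held by the client. Couchbase calls its partitions vBuckets; the hash, the partition count and the round-robin map below are assumptions chosen for illustration and are not the actual client algorithm.

```python
import zlib

NUM_PARTITIONS = 1024  # illustrative assumption, not read from a real cluster


def partition_for(key):
    """Hash a key to a fixed partition number (simplified CRC32-based scheme)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_map(nodes):
    """Hypothetical cluster map: assign partitions to nodes round-robin."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]


def node_for(key, partition_map):
    """Resolve the single node that owns a key, with no coordinator hop."""
    return partition_map[partition_for(key)]
```

Since every key resolves to exactly one owning node on the client side, a read or write is a single network hop to a known server, rather than a request to an arbitrary node that then has to coordinate with replicas.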
  • Very important for developers. Empowers them to own the entire stack from top to bottom.
  • We tend to have dedicated stacks for various reasons, including independence, compliance and regulation, but we also have multi-tenancy. We've found it very stable: for example, we have had no examples of data loss with Couchbase, which is not something I can say for other solutions (both the HBase NameNode SPOF and a Cassandra read-only VM with no hinted handoff blew the stack). We've had no trouble getting the devs up to speed, or the ops guys who support it. For example, it's been great that we had some work in Australia where the full environment was not yet ready, so the guys spun up some local instances so they could just get to work. For a large organisation like ourselves, having experts we can call on is a great help.
  • Here are some examples of Couchbase in use. This is for our sportsbook. You can see the spiky nature of the demand, as it is skewed towards events that happen from midday to early evening and especially on the weekend. Total doc data size is around 2.5 GB.
  • Here's another bucket for the same application. This one has slightly higher ops per second and a data set of 3 GB.
  • Smaller data set here: it's under 1 GB. These are just a sample of our Couchbase usage, but it's fairly representative. If you have any specific questions on our Couchbase instances, please come and find me later.
  • Session and cross-session storage. You can do funky things like share session data across channels, e.g. add a bet to your betslip on your desktop and then access it on your mobile device. Storage: like user preferences.
  • K/V, Document, Structured Data, Columnar, Graph: each has its own use case, the sweet spot where it works best. For us it was delivering faster with fewer resources, e.g. DB devs, DBAs. This should just be putting down on paper your thoughts on the topic, so it's not a wasted exercise. Ideally, find something with ephemeral data where going bang does not bring down your site. For us, Couchbase has shown itself to be the best document NoSQL store for our business.
  • Any questions that have interesting answers, I will either tweet the answer or tweet a link. Same goes for links that relate to what I've covered today. Thank you very much.

    1. 1. BETFAIR + COUCHBASE
          Martin Anderson
    2. 2. AGENDA
          • Who Betfair are
          • Why Couchbase was chosen at Betfair
          • What it is being used for
          • Some thoughts on adopting NoSQL in Enterprises
          • Q&A session later on today
    3. 3. BETFAIR
    4. 4.
    5. 5. THE EXCHANGE
    6. 6. IN NUMBERS
          • 4.0m+ funded accounts
          • 140 locations
          • 30,000 bets placed one minute
          • 120,000+ requests per second
          • £288m funds on deposit
          • £2.2bn Mobile FY12
    7. 7. DATA AT BETFAIR
    8. 8. DATA AT BETFAIR
          • >30,000 markets that can change every 100ms
          • Jurisdictionally sensitive navigation
          • Multiple web applications for multiple channels
          • Large volumes of data from other products
          • Transactional data
          • Operational monitoring too - large amount of logging data
    9. 9. TECHNOLOGY AT BETFAIR
          Application Stack
          • JVM heavy
          • Linux on commodity hardware
          • Heavy use of Virtualisation/Private Cloud
          Data Storage Stack
          • Oracle
          • Some Informix & MySQL
          • NoSQL
    10. 10. RELATIONAL DATABASES
          We love Oracle!
          The lifeblood of our transaction system
          Highly performant
          Well understood
          Resilient
          Other databases but they are effectively integrated products
    11. 11. BUT…
          Impedance mismatch with object orientated languages
          Object models possible in RDBMS but at what cost?
          Must have serious skills at this scale
          Scaling not easy
          Often very heavyweight solution
          Integration with Continuous Delivery?
          So what about NoSQL?
    12. 12. NOSQL AT BETFAIR
    13. 13. NOSQL
          Matches well to object orientated languages
          Inherently scalable
          Very fast look ups
          Integrates very well with Continuous Delivery
          Combines to give a lower time to delivery
    14. 14. NOSQL AT BETFAIR
    15. 15. WHY SO MANY?
          Different categories of NoSQL, therefore different usage: K/V, Document, Columnar
          Some are wrapped by other products
          • CouchDB & Chef
          • HBase & OpenTSDB
          But what about cases where we have direct usage?
          What was the selection criteria for these solutions?
    16. 16. THE PRESSURE OF DELIVERY
          Just finished a cycle of high product delivery focus
          Time to step back and reassess the selections
          But without negatively affecting current product delivery!
    17. 17. STRATEGIC REVIEW
          Good News
          We had a fair amount of experience with different NoSQL solutions
          Bad News
          Fairly certain that some of the uses were less than optimal
    18. 18. ADOPTION AND ASSESSMENT PROCESS
          • What were our use cases?
          • What would be the optimum solutions?
    19. 19. NOSQL ASSESSMENT PROCESS
          • Background/Maturity of the technology
          • Data Model Category
          • Consistency Model Requirements
          • Performance
          • Replication strategy (inc. Concurrency Control)
          • Caching Model
          • Query Model
          • Integration with Continuous Delivery
    20. 20. NOSQL ASSESSMENT PROCESS
          • Operational Maintenance (inc. Backup)
          • Operational Monitoring
          • Support
          • Scalability
          • Reliability
          • Security
          • Cost
    21. 21. INITIAL USE CASES FOR NOSQL
          Web Tier Persistence
          • Session and Cross session storage – e.g. Betslip
          • Memcached
          • Strong consistency
          • Cookie abuse
          • Cassandra as current solution
    22. 22. INITIAL USE CASES FOR NOSQL
          User Preferences
          • Historically tied to customer account
          • Map of keys and values
          • Multiple channels with multiple applications
          • RDBMS as current solution
    23. 23. CURRENT ARCHITECTURE
          Server side rendered content
          SOA Data Services exposed
          Supports >200,000 concurrent users
    24. 24. WEB APPLICATION CALL STRUCTURE
    25. 25. OUR SELECTION CRITERIA
          • Performance - especially deterministic performance on VMs
          • Strong consistency
          • Scaling
          • Schema flexibility
          • Multi-tenancy when required
          • Simplicity
          • Enterprise support
          • Consider the future uses
    26. 26. COUCHBASE
    27. 27. COUCHBASE PERFORMANCE
          • Seriously fast
          • Highly deterministic
          • Cache ejection/eviction
          • Avoids Cold Cache on offlined instances
          • Ideal for our architecture – virtualisation/private cloud
          • Far better option than our current solution
    28. 28. COUCHBASE SCALING
          • Inherently scalable
          • Impressive ability to add nodes under load
          • Manual rebalance gives control for highly loaded applications
          • Replica promotion avoids failure cascades under load
    29. 29. COUCHBASE SCHEMA FLEXIBILITY
          • Giving the developers ownership of the data storage
          • Decouples data migration from application deployment
          • Important requirement for Feature Throttles
          • Removes many of the requirements for having DB devs/DBAs
          • Allows preferences to deal with A/B tests
    30. 30. OTHER COUCHBASE FEATURES
          • Multi-tenancy when required
          • Stable and Resilient
          • Great ease of use for both Devs and Ops
          • Enterprise support
          • Elastic Search integration
          • Secured with a Service Layer
    31. 31. COUCHBASE IN CONTINUOUS DELIVERY
    32. 32. COUCHBASE DEPLOYMENTS
          • Version 1.8 in production, some 2.0 in pre-prod
          • 3 instance clusters for individual web applications
          • Larger (4-6) instance clusters for service storage
          • We are about 6 months in with our production instances
    33. 33. COUCHBASE IN PRODUCTION
    34. 34. COUCHBASE IN PRODUCTION
    35. 35. COUCHBASE IN PRODUCTION
    36. 36. COUCHBASE AT BETFAIR
          Couchbase is now our strategic document NoSQL solution
          • Session state
          • Cross session state
          • Service Persistence for key-based Entities
          • Familiarity will likely see this extend out into other areas
    37. 37. INTRODUCING NOSQL IN ENTERPRISE, AKA CULTURE HACKING WITH NOSQL
          • Remember it's an umbrella term - non-experts will ask why we need so many different types of NoSQL
          • Remember the business benefits
          • Present the business with both the use cases you want to adopt NoSQL for and the assessment of the candidates
          • When you can use it, get it out there ASAP in a low risk way
          • It's not about choosing what's cool, it's about choosing what's best for the business
    38. 38. THANK YOU!
          Martin Anderson @mdjanderson
          http://betfair.jobs
