Couchbase Connect 2016

How LinkedIn has evolved its use of Couchbase and built an ecosystem around the Couchbase platform

Published in: Technology

  1. 1. Michael Kehoe Staff Site Reliability Engineer LinkedIn Going all in: From single use-case to many
  2. 2. 2 Overview • The LinkedIn Story • Couchbase Use-Cases • Development & Operations • Conclusions • Questions
  3. 3. $ whoami 3 Michael Kehoe • Staff Site Reliability Engineer (SRE) • Production-SRE team • Funny accent = Australian • Contact • linkedin.com/in/michaelkkehoe • @matrixtek
  4. 4. $ whatis SRE 4 Michael Kehoe • Site Reliability Engineering • Operations for the production application environment • Responsibilities include • Architecture design • Capacity planning • Operations • Tooling
  5. 5. $ whatis CBVT 5 Michael Kehoe • Couchbase Virtual Team • ~10 SREs • 2 Software Engineers • Sponsored by SRE Director • Members spend 5-90% of their time supporting Couchbase • Encourage as many people to contribute as possible • What do we do? • Operational work on Couchbase clusters • Evangelize the use of Couchbase within LinkedIn • Develop tools for the Couchbase Ecosystem
  6. 6. 6 The LinkedIn Story • Founded in 2002, LinkedIn has grown into the world’s largest professional social media network • 30 offices in 24 countries, available in 24 languages • More than 450 million members worldwide
  7. 7. 7 The LinkedIn Story • Growth in Products • Profiles • Groups • Recruiter • Sales Navigator • Growth in Internet Traffic • Billions of page-hits per day • 100k+ QPS to production services
  8. 8. In-Memory Storage Needs 8 The LinkedIn Story • LinkedIn started as an Oracle shop • Hyper-growth = Scaling challenges • Read-Scaling becomes important • Applicable use-cases • Simple cache store • Pre-warmed • Read through • Potential for Source of Truth (SoT) store
  9. 9. Enter Couchbase 9 The LinkedIn Story • Until 2012, we were only using Memcached as a non-SoT in-memory store • Drawbacks • Difficult to pre-warm • No partitioning/sharding (had to write our own) • Cold-cache restarts • Difficult to move data across hosts/clusters/data-centers
  10. 10. Enter Couchbase 10 The LinkedIn Story • Evaluated replacement systems for Memcached: Mongo, Redis, and others • Couchbase had distinct advantages: • Simple replacement for Memcached • Built-in replication and cluster expansion • Automatic partitioning • Low latency • Async writes to disk • Building tooling is simple
  11. 11. Enter Couchbase 11 The LinkedIn Story • Today we run Couchbase in our Corporate, Staging and Production environments • Production/Staging statistics: • 148 buckets • 2821 hosts • 10M+ QPS • Largest Clusters: • By Hosts: 72 Hosts • By Documents: 1.4B Documents • By QPS: 2.5M QPS
  12. 12. Summary 12 Use-Cases Today’s use-cases: • Simple read-through cache • Ephemeral Counter Store • Temporary de-duping store • SoT data-store for internal tooling
  13. 13. Simple read-through cache 13 Use-Cases • Drop-in replacement for Memcached • Read-scaling • Protecting backend database from large amounts of traffic • E.g. 3rd party ingestion credential cache
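
The read-through pattern on this slide can be sketched as follows. This is a minimal illustration, not LinkedIn's implementation: a plain dictionary stands in for the Couchbase bucket, and the names `ReadThroughCache`, `loader`, and `ttl_seconds` are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve hits from the cache; on a miss, load from the backend and store the result."""

    def __init__(
        self,
        loader: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # hypothetical backend lookup, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock
        # A dict stands in for the cache bucket: key -> (value, expiry time).
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: the backend is not touched
        value = self._loader(key)  # cache miss or expired entry: go to the backend
        self._store[key] = (value, now + self._ttl)
        return value
```

Because only misses reach `loader`, repeated reads of a hot key are absorbed by the cache, which is the backend-protection effect the slide describes.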
  14. 14. Counter Store 14 Use-Cases • In certain places, we simply need to increment counters from multiple systems and store them • E.g. Anti-abuse/Anti-scraping systems (Fuse)
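
A counter store of this kind boils down to an atomic increment with an expiry. The sketch below assumes an in-process stand-in guarded by a lock rather than a real cluster; `CounterStore`, `incr`, `is_abusive`, and the key format are hypothetical names chosen for illustration.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class CounterStore:
    """In-memory stand-in for a shared counter store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._clock = clock
        self._counters: Dict[str, Tuple[int, float]] = {}  # key -> (count, expiry time)

    def incr(self, key: str, delta: int = 1, ttl_seconds: float = 60.0) -> int:
        """Atomically add delta to the counter, restarting it once its window has expired."""
        with self._lock:
            now = self._clock()
            count, expires_at = self._counters.get(key, (0, 0.0))
            if expires_at <= now:
                count, expires_at = 0, now + ttl_seconds
            count += delta
            self._counters[key] = (count, expires_at)
            return count


def is_abusive(store: CounterStore, client_id: str, limit: int) -> bool:
    """Count one request for the client and report whether it exceeded the limit."""
    return store.incr(f"requests:{client_id}") > limit
```

Many application hosts can call `incr` on the same key, so the count reflects traffic across the whole fleet rather than a single process.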
  15. 15. Temporary De-duping store 15 Use-Cases • Need to de-dup data over a large application cluster • E.g. Email systems – Ensure we don’t send the same email twice
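
De-duplication across a cluster is typically built on an "add only if absent" operation with a time-to-live. The following sketch assumes an in-process stand-in for the shared store; `DedupStore`, `add_if_absent`, `send_once`, and the key format are hypothetical.

```python
import threading
import time
from typing import Callable, Dict


class DedupStore:
    """In-memory stand-in for a shared store that supports add-if-absent with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        self._clock = clock
        self._expiry: Dict[str, float] = {}  # key -> expiry time

    def add_if_absent(self, key: str, ttl_seconds: float) -> bool:
        """Return True if the key was newly added, False if it is already present."""
        with self._lock:
            now = self._clock()
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False
            self._expiry[key] = now + ttl_seconds
            return True


def send_once(store: DedupStore, message_id: str, send: Callable[[str], None],
              ttl_seconds: float = 86400.0) -> bool:
    """Send the message only if no worker has claimed this message_id yet."""
    if not store.add_if_absent(f"email:{message_id}", ttl_seconds):
        return False  # another worker already handled it
    send(message_id)
    return True
```

The expiry keeps the store temporary: markers age out on their own, so the data set stays bounded without a cleanup job.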
  16. 16. SoT Store for Internal Tools 16 Use-Cases • For non-member-facing tools, we use Couchbase as a SoT store • Benefits: • Schema-less • Short setup time • Couchbase Python Client works easily in our environment • Use views for simple map-reduce • Example Uses: • Nurse – Autoremediation system • TrafficshiftIn – Global traffic automation system • Availability – Storing and tracking LinkedIn availability data
  17. 17. Couchbase Ecosystem 17 The LinkedIn Story
  18. 18. 18 Developing around Couchbase • Java – li-couchbase-client • Wrapper around standard Java Couchbase Client • Custom metrics emission • Using Spring interface • Storing data as Java serialized objects • Python – couchbase-python-client
  19. 19. 19 Operational Tooling In order to efficiently use Couchbase as SREs, we need the following: • Provisioning • Installation • Monitoring & Alerting • Infrastructure Visibility
  20. 20. Provisioning 20 Operational Tooling • Provisioning Flow • Seek estimated usage statistics for cluster • Size of data to be stored • QPS • Redundancy Needs • Calculate cluster sizing • Currently done with a template • Couchbase has a simple calculator available online: http://docs.couchbase.com/prebuilt/calculators/sizing-calc.html • Request hardware for cluster(s)
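
To illustrate how the three inputs (data size, QPS, redundancy) can feed a sizing estimate, here is a rough back-of-the-envelope sketch. It is not the template used at LinkedIn nor the formula behind the Couchbase calculator; the function `estimate_nodes` and every default (metadata overhead, usable RAM fraction, per-node QPS) are assumptions made for illustration.

```python
import math


def estimate_nodes(
    data_gb: float,
    replicas: int,
    peak_qps: float,
    ram_per_node_gb: float = 64.0,   # assumed RAM per host
    usable_fraction: float = 0.8,    # assumed share of RAM given to the bucket
    overhead: float = 0.25,          # assumed metadata/fragmentation overhead
    qps_per_node: float = 20000.0,   # assumed sustainable QPS per host
) -> int:
    """Return a rough node count that satisfies memory, throughput, and redundancy."""
    # Every replica is a full extra copy of the data set.
    total_gb = data_gb * (1 + replicas) * (1 + overhead)
    nodes_for_memory = math.ceil(total_gb / (ram_per_node_gb * usable_fraction))
    nodes_for_qps = math.ceil(peak_qps / qps_per_node)
    # Each copy of the data needs its own node, so replicas + 1 is the floor.
    return max(nodes_for_memory, nodes_for_qps, replicas + 1)


if __name__ == "__main__":
    print(estimate_nodes(data_gb=100, replicas=1, peak_qps=50000))
```

Taking the maximum of the three constraints means the tightest one (memory, throughput, or redundancy) drives the hardware request.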
  21. 21. Installation 21 Operational Tooling • Process • Enter cluster metadata into our management system (Range) • Use Salt States to install and configure cluster • See Issa Fattah’s post for more information: • https://engineering.linkedin.com/blog/2016/04/leveraging-saltstack-to-scale-couchbase • Benefits • Ability to perform ‘state enforcement’ • Using Salt Pillars to encrypt cluster/bucket passwords end-to-end
  22. 22. Monitoring & Alerting 22 Operational Tooling • We run a daemon on each Couchbase Server that collects metrics every minute via Couchbase APIs • Use cluster metadata from Range to build dashboards with our own system, InGraphs • See: ‘Monitoring production deployments’: 4pm - Great America 1
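
The collection daemon described here can be sketched as a simple polling loop. This is an assumption-laden outline rather than the actual daemon: `fetch_stats` stands in for whatever call retrieves per-bucket statistics from the Couchbase APIs, `emit` stands in for the metrics pipeline, and the metric naming scheme is hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Sequence


def collect_once(
    buckets: Sequence[str],
    fetch_stats: Callable[[str], Dict[str, float]],
    emit: Callable[[str, float], None],
) -> int:
    """Fetch stats for each bucket and emit one metric per stat; return the number emitted."""
    emitted = 0
    for bucket in buckets:
        try:
            stats = fetch_stats(bucket)
        except Exception:
            # A failing bucket must not stop collection for the others.
            emit(f"couchbase.{bucket}.collect_errors", 1.0)
            emitted += 1
            continue
        for name, value in stats.items():
            emit(f"couchbase.{bucket}.{name}", float(value))
            emitted += 1
    return emitted


def run_collector(
    buckets: Sequence[str],
    fetch_stats: Callable[[str], Dict[str, float]],
    emit: Callable[[str, float], None],
    interval_seconds: float = 60.0,
    iterations: Optional[int] = None,   # None means run forever
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Collect on a fixed interval, e.g. once a minute."""
    done = 0
    while iterations is None or done < iterations:
        collect_once(buckets, fetch_stats, emit)
        done += 1
        if iterations is None or done < iterations:
            sleep(interval_seconds)
```

Keeping `fetch_stats`, `emit`, and `sleep` injectable makes the loop testable without a live cluster or a real one-minute wait.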
  23. 23. Monitoring & Alerting 23 Operational Tooling
  24. 24. Management 24 Operational Tooling • We want to see a world-view of all the clusters we run • Having bucket/cluster/server-level statistics is useful • Having a global view of who owns and operates each cluster/bucket is useful
  25. 25. Management 25 Operational Tooling
  26. 26. 26 Conclusions • Couchbase was a natural fit into our existing infrastructure • Building an ecosystem around Couchbase was important to us and has helped Couchbase be successful at LinkedIn • Expanding use of Couchbase • In the past year we’ve grown the number of buckets by over 50% • Starting to use Views in production • Moving Couchbase into LinkedIn standard deployment infrastructure
  27. 27. 27 Thank You Questions?
  28. 28. ©2014 LinkedIn Corporation. All Rights Reserved.
