Netflix Open Source Meetup Season 4 Episode 2

In this episode, we take a close look at two different approaches to high-throughput, low-latency data stores developed by Netflix.

The first, EVCache, is a battle-tested distributed memcached-backed data store, optimized for the cloud. You will also hear about the road ahead for EVCache as it evolves into an L1/L2 cache over RAM and SSDs.

The second, Dynomite, is a framework that makes any non-distributed data store distributed. Netflix's first implementation of Dynomite is based on Redis.

Come learn about the products' features and hear from Thomson Reuters, Diego Pacheco from ilegra, and other speakers, internal and external to Netflix, on how these products fit into their stacks and roadmaps.

  1. High Throughput Low Latency
  2. https://github.com/Netflix/EVCache
  3. The Hidden Microservice (diagram: user request flow)
  4. Ephemeral Volatile Cache: Distributed Memcached, Tunable Replication, High Resilience, Topology Aware, Data Chunking, Additional Functionality
  5. Architecture (diagram): the server side runs Memcached, Prana (sidecar), and monitoring & other processes, registered with Eureka (service discovery); the client application talks to it through the client library.
  6. Architecture ● Complete bipartite graph between clients and servers ● Sets fan out, gets prefer closer servers ● Multiple full copies of data
  7. Use Case: Lookaside Cache (data-flow diagram: application/microservice with service client library and Ribbon client in front of EVCache clients and servers)
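As a rough illustration of the lookaside pattern above, here is a minimal sketch using the EVCache Java client. The builder and method names follow the EVCache README, but treat the exact signatures as approximate; the app name, cache prefix, key layout, and TTL are illustrative rather than taken from the slides.

```java
import com.netflix.evcache.EVCache;
import com.netflix.evcache.EVCacheException;

public class LookasideCacheSketch {
    // Builder settings as shown in the EVCache client README; values are illustrative.
    private final EVCache cache = new EVCache.Builder()
            .setAppName("EVCACHE_APP1")
            .setCachePrefix("homepage")
            .build();

    public String getHomePageRow(String memberId, int row) throws EVCacheException {
        String key = memberId + ":row:" + row;

        // 1. Try the cache first.
        String cached = cache.<String>get(key);
        if (cached != null) {
            return cached;
        }

        // 2. On a miss, fall back to the service of record...
        String fresh = loadRowFromService(memberId, row);

        // 3. ...and repopulate the cache for later requests (TTL in seconds).
        cache.set(key, fresh, 3600);
        return fresh;
    }

    private String loadRowFromService(String memberId, int row) {
        // Placeholder for the real microservice call (e.g. via a Ribbon client).
        return "row-" + row + "-for-" + memberId;
    }
}
```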
  8. Use Case: Primary Store (data-flow diagram: offline/nearline precomputes for recommendation are written by offline services and read by online services via the client library in the online client application)
  9. Usage @ Netflix: 11,000+ AWS server instances; 30+ million ops/sec globally; over 1.5 million cross-region replications/sec; 130+ billion objects stored globally; 70+ distinct caches; 170+ terabytes of data stored
  10. MAP: Bruce Wobbe, Senior Software Engineer
  11. What is Map? ● Merchandising Application Platform ● It is the final aggregation point for the data before it is sent to the device. ● The home page is assembled here.
  12. Why Map Uses EVCache ● Map utilizes EVCache to cache the home page for each customer. ● The home page is cached for 1-10 hours depending on the device (some rows are updated more often). ● EVCache provides extremely quick and reliable access to data.
  13. How Is The Data Stored? ● Map stores the data as individual records in EVCache; each home page row is a record in EVCache. ● The normal access pattern is to request 1-6 rows at a time. ● Each record is given a TTL.
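A sketch of that access pattern, reusing the client from the lookaside example: one EVCache record per home-page row, each written with its own TTL, and a bulk read for the handful of rows a device asks for. The getBulk call and its signature are an assumption about the EVCache client; key names and TTLs are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import com.netflix.evcache.EVCache;
import com.netflix.evcache.EVCacheException;

public class HomePageRowsSketch {
    private final EVCache cache;

    public HomePageRowsSketch(EVCache cache) {
        this.cache = cache; // built as in the previous sketch
    }

    // One record per home-page row, with a device-dependent TTL (e.g. 1-10 hours).
    public void storeRow(String memberId, int row, String rowJson, int ttlSeconds)
            throws EVCacheException {
        cache.set(memberId + ":row:" + row, rowJson, ttlSeconds);
    }

    // Typical read: fetch 1-6 rows in a single bulk request.
    public Map<String, String> readRows(String memberId, int firstRow, int lastRow)
            throws EVCacheException {
        List<String> keys = new ArrayList<>();
        for (int row = firstRow; row <= lastRow; row++) {
            keys.add(memberId + ":row:" + row);
        }
        return cache.getBulk(keys); // assumed bulk-get API
    }
}
```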
  14. Advantages of EVCache For Map ● High hit rate: 99.99% ● Clean-up is automatic (let the TTL handle it for you) ● Solid (Map was using Cassandra as a backup for the data, but EVCache proved so solid we removed the Cassandra backup) ● Super fast (average latency under 1 ms)
  15. Confidence ● We've never seen cases of data corruption. ● When we have problems, EVCache is the last place I'd look for a cause.
  16. The Stats: peak per-region RPS ● Total (reads + writes + touches) = 616K ● Reads = 275K ● Touches = 131K ● Writes = 210K ● Peak data per region: 7.7 TB
  17. Future ● A common use case is for devices to never hit the cache after the initial load, so MAP would like to see a cheaper way to store the data for infrequently accessing users.
  18. Moneta: Scott Mansfield & Vu Nguyen
  19. Moneta ● Moneta: The Goddess of Memory ● Juno Moneta: The Protectress of Funds for Juno ● Evolution of the EVCache server ● EVCache on SSD ● Cost optimization ● Ongoing lower EVCache cost per stream ● Takes advantage of global request patterns
  20. Old Server ● Stock Memcached and Prana (Netflix sidecar) ● Solid, worked for years ● All data stored in RAM (Memcached) ● Became more expensive with expansion / N+1 architecture
  21. Optimization ● Global data means many copies ● Access patterns are heavily region-oriented ● In one region: ○ Hot data is used often ○ Cold data is almost never touched ● Keep hot data in RAM, cold data on SSD ● Size RAM for working set, SSD for overall dataset
  22. Cost Savings (3-year heavy-utilization reservations, list price) ● r3.2xlarge x 100 nodes ≅ $204K/yr ● Working set ~30%: i2.xlarge x 60 nodes ≅ $111K/yr ● ~46% savings
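The savings figure follows directly from the listed prices: ($204K - $111K) / $204K ≈ 45.6% per year, which the slide rounds to ~46%.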
  23. New Server ● Adds Rend and Mnemonic ● Still looks like Memcached ● Unlocks cost-efficient storage & server-side intelligence (external/internal architecture diagram)
  24. Rend https://github.com/netflix/rend
  25. Rend ● High-performance Memcached proxy & server ● Written in Go ○ Powerful concurrency primitives ○ Productive and fast ● Manages the L1/L2 relationship ● Server-side data chunking ● Tens of thousands of connections
  26. Rend ● Modular to allow future changes / expansion of scope ○ Set of libraries and a default ● Manages connections, request orchestration, and backing stores ● Low-overhead metrics library ● Multiple orchestrators ● Parallel locking for data integrity
  27. Rend in Production ● Serving some of our most important personalization data ● Two ports ○ One for regular users (read heavy or active management) ○ Another for "batch" uses: replication and precompute ● Maintains working set in RAM ● Optimized for precomputes ○ Smartly replaces data in L1 (external/internal diagram)
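Since Rend still speaks the memcached protocol, any standard memcached client can talk to it unchanged. A minimal sketch with the spymemcached Java client; the localhost endpoint, port, key, and TTL are placeholders rather than anything from the slides.

```java
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class RendClientSketch {
    public static void main(String[] args) throws Exception {
        // Rend looks like memcached to clients; Netflix runs separate ports for
        // regular traffic and for "batch" (replication/precompute) traffic.
        MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));

        // Set with a TTL in seconds; the server decides whether the value
        // lives in L1 (RAM) or L2 (SSD).
        client.set("member:42:homepage-row:0", 3600, "serialized-row");

        // Reads check L1 first and fall back to L2 on a miss.
        Object row = client.get("member:42:homepage-row:0");
        System.out.println(row);

        client.shutdown();
    }
}
```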
  28. Mnemonic
  29. Mnemonic ● Manages data storage to SSD ● Reuses Rend server libraries ○ These handle the Memcached protocol ● Mnemonic core logic ○ Implements Memcached operations on top of RocksDB (Mnemonic stack diagram)
  30. Why RocksDB for Moneta ● Fast at medium to high write load ○ Goal: 99% read latency ~20-25ms ● LSM tree design minimizes random writes to SSD ○ Data writes are buffered in memtables (diagram: Record A, Record B, ...) ● SST: Static Sorted Table
  31. How we use RocksDB ● FIFO compaction style ○ More suitable for our precompute use cases ○ Level compaction generated too much traffic to SSD ● Bloom filters and indices kept in memory
  32. How we use RocksDB ● Records sharded across many RocksDB instances on each AWS instance ○ Reduces the number of SST files checked, decreasing latency (diagram: keys such as ABC and XYZ hash to different RocksDB instances)
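A rough sketch of that configuration using the RocksJava bindings: FIFO compaction, bloom filters and index blocks kept in memory, and records hashed across several RocksDB instances per node. Option and class names follow recent RocksJava releases and may differ slightly by version; the directory layout and shard count are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.BloomFilter;
import org.rocksdb.CompactionStyle;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class ShardedRocksSketch {
    private final List<RocksDB> shards = new ArrayList<>();

    public ShardedRocksSketch(String baseDir, int shardCount) throws RocksDBException {
        RocksDB.loadLibrary();
        for (int i = 0; i < shardCount; i++) {
            // FIFO compaction: old SST files are simply dropped when the size
            // budget is exceeded, which suits TTL'd precompute data and avoids
            // the extra SSD write traffic of level compaction.
            Options options = new Options()
                    .setCreateIfMissing(true)
                    .setCompactionStyle(CompactionStyle.FIFO)
                    // Keep bloom filters and index blocks in memory so a read
                    // rarely touches an SST file that cannot contain the key.
                    .setTableFormatConfig(new BlockBasedTableConfig()
                            .setFilterPolicy(new BloomFilter(10))
                            .setCacheIndexAndFilterBlocks(true));
            shards.add(RocksDB.open(options, baseDir + "/shard-" + i));
        }
    }

    // Hash-pick one RocksDB instance per key, mirroring the "many RocksDBs per
    // AWS instance" layout: fewer SST files to check per lookup, lower latency.
    private RocksDB shardFor(byte[] key) {
        return shards.get(Math.floorMod(java.util.Arrays.hashCode(key), shards.size()));
    }

    public void put(byte[] key, byte[] value) throws RocksDBException {
        shardFor(key).put(key, value);
    }

    public byte[] get(byte[] key) throws RocksDBException {
        return shardFor(key).get(key);
    }
}
```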
  33. FIFO Limitation ● FIFO-style compaction is not suitable for all use cases ○ Very frequently updated records may prematurely push out other valid records (diagram: repeated updates to records A and B fill successive SST files over time, pushing out records C through H) ● Future: custom compaction or level compaction
  34. Moneta Perf Benchmark ● 1.7ms 99%ile read latency ○ Server-side latency ○ Not using the batch port ● Load: 1K writes/sec, 3K reads/sec ● Instance type: i2.xlarge
  35. Open Source https://github.com/Netflix/EVCache https://github.com/Netflix/rend
  36. https://github.com/Netflix/dynomite
  37. Problems ● Cassandra is not a speed demon for reads ● Needed a data store: o Scalable & highly available o High throughput, low latency o Active-active multi-datacenter replication o Advanced data structures o Polyglot client
  38. Observations ● Usage of Redis is increasing, but Redis: o Is not fault tolerant o Does not have bi-directional replication o Cannot withstand a Chaos Monkey attack
  39. What is Dynomite? ● A framework that makes non-distributed data stores distributed o Can be used with many key-value storage engines ● Features: highly available, automatic failover, node warmup, tunable consistency, backups/restores
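Since Dynomite with the Redis engine exposes the Redis (RESP) protocol on its client port, a plain Redis client is enough to try it; in production the Dyno client adds topology-aware load balancing and failover on top. A minimal sketch with the Jedis Java client, assuming a local Dynomite node; the host, the 8102 port (as used in Dynomite's sample configuration), and the keys are illustrative.

```java
import redis.clients.jedis.Jedis;

public class DynomiteClientSketch {
    public static void main(String[] args) {
        // Dynomite fronts Redis with the RESP protocol, so Jedis can talk to it
        // directly. Check your dynomite.yml for the actual client listen port.
        try (Jedis jedis = new Jedis("localhost", 8102)) {
            // Writes are replicated by Dynomite to other racks/datacenters
            // according to the configured (tunable) consistency.
            jedis.set("customer:42:profile", "{\"plan\":\"premium\"}");
            String profile = jedis.get("customer:42:profile");
            System.out.println(profile);
        }
    }
}
```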
  40. Dynomite @ Netflix ● Running >1 year in PROD ● ~1000 nodes ● 1M OPS at peak ● Largest cluster: 6TB source-of-truth data store ● Quarterly production upgrades
  41. Completing the Puzzle: Dyno, Dynomite-manager
  42. Dynomite Ecosystem
  43. Features ● Multi-layered health check of the Dynomite node ● Token management and node configuration ● Dynomite cold bootstrap (warm up) o after AWS or Chaos Monkey terminations ● Backups/restores ● Exposes operational tasks through a REST API ● Integration with the Netflix OSS ecosystem
  44. Where can I find it? https://github.com/Netflix/dynomite-manager https://github.com/Netflix/dynomite https://github.com/Netflix/dyno
  45. Thomson Reuters
  46. About us ● Sam Sgro (@samsgro, samsgro, https://www.linkedin.com/in/sam-sgro-4217845): Canadian Head of Architecture. SOA, cloud, DevOps practitioner. Drummer v1, Wine v0.5. Working on all aspects of NetflixOSS and AWS. ● Diego Pacheco (@diego_pacheco, diegopacheco, http://diego-pacheco.blogspot.com.br/, ilegra.com): Brazilian Principal Software Architect, SOA expert, DevOps practitioner, blogger, Terran SC2 player. Working with chaos/perf engineering and NetflixOSS.
  47. Timeline: 2015, technology POC; 2016, first commercial apps and "Project Neon" first release; 2017..., "Project Neon" platform
  48. Business Use Case
  49. Why Dynomite?
  50. Performance (chart: accumulated latency per 1k operations)
  51. Infrastructure Changes ● Dynomite Manager: submitted 2 Dynomite PRs to Netflix to improve integration with the Dynomite Manager ● Eiddo (TR's git-based network property server): converted it to talk to the Dynomite Manager ● Ribbon (Eureka-integrated client load balancer): cache service instrumentation around Dyno, observables ● Docker (fun with containers): simple image, ease of use, developer testing
  52. Issues and Challenges (Dynomite-manager)
  53. DEMO Architecture
  54. Acknowledgements: Diego Pacheco, Maksim Likharev, Yurgis Baykshtis, Sam Sgro
  55. Latency Performance
  56. Database development: 2 to 3 years
  57. Dynomite architecture (diagram): RESP server, Redis API, hash conf, sharding, snitch, anti-entropy, RESP API, storage APIs / RESP, Redis client, gossip, telemetry, RESP protocol, central command, partitioning
  58. Database development: 2 to 3 years vs. 3 weeks
  59. Dynomite + RocksDB
  60. Dynomite + RocksDB
  61. DevOps (Dynomite architecture diagram) ● Infrastructure ● Tooling ● Automation ● Deployment
  62. Application Development ● Redis API: simple, composable, powerful
  63. Single Server Redis vs. Distributed Cache (diagram)
  64. Key/Value + Key/Data Structures: String, Integer, Float, List, Set, Sorted Set, Hash, Commands
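The same Redis data-structure commands work unchanged when Redis sits behind Dynomite. A small illustrative sketch with Jedis; the key names and the localhost:8102 endpoint are made up for the example.

```java
import java.util.List;
import java.util.Map;
import redis.clients.jedis.Jedis;

public class DataStructuresSketch {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 8102)) {
            // Plain key/value
            jedis.set("page:home:title", "Trending Now");

            // List: most recently viewed titles
            jedis.lpush("member:42:recent", "title:1001", "title:2002");
            List<String> recent = jedis.lrange("member:42:recent", 0, -1);

            // Sorted set: titles ranked by score
            jedis.zadd("member:42:ranked", 0.97, "title:1001");
            jedis.zadd("member:42:ranked", 0.85, "title:2002");

            // Hash: a small record with named fields
            jedis.hset("title:1001", "name", "Example Show");
            jedis.hset("title:1001", "genre", "Drama");
            Map<String, String> title = jedis.hgetAll("title:1001");

            System.out.println(recent + " " + title);
        }
    }
}
```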
  65. Data model and query reuse
  66. Function and benefits ● Database engineers: order-of-magnitude faster development; framework for rapid development ● DevOps: efficiency gains; common infrastructure; reuse of tools, scripts, and more ● Application developers: increased development velocity; single API for cache and database; query and data model reuse
  67. Beyond Key/Value (diagram: relational-style columns, e.g. record 001 is Joe Smith with value 10, record 002 is Mary Jones with value 12)
  68. Thank you @DynomiteDB www.dynomitedb.com
