NetflixOSS Meetup season 3 episode 1

NetflixOSS Meetup S3 E1, covering the latest components in distributed databases, telemetry systems, big data tools and more. Speakers from Netflix, IBM Watson, Pivotal and Nike Digital.

Published in: Technology

  1. 1. Season 3 Episode 1 Feb 11, 2015
  2. 2. Ruslan Meshenberg - @rusmeshenberg Introduction
  3. 3. Agenda - Lightning Talks ● Eight new projects ○ Atlas ○ Prana ○ Raigad ○ Genie 2 ○ Inviso ○ Dynomite ○ Nicobar ○ MSL ● One new way to evaluate ○ Zero To Docker ● Three community users ○ IBM Watson ○ Nike Digital ○ Pivotal
  4. 4. Atlas Roy Rapoport - @royrapoport
  5. 5. In-House Telemetry? Inconceivable! ● Crowded OSS field! ○ Cacti, InfluxDB, OpenTSDB, Nagios, Icinga, NeDi, Zabbix, Observium, Sensu, Zenoss, OpenNMS, Bosun, Prometheus, etc ● Not to mention commercial products ● Some shortcomings ...
  6. 6. In-House Telemetry? Inconceivable! ● Agility Mismatch ● Cloud (and Netflix Ecosystem) Integration ● Multiple Data Sources ● Scale ● No, seriously. Scale ○ 2011: 10M/minute ○ ~2x Increase per quarter ○ Now up to 1.3B/minute
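
As a rough arithmetic check on the scale figures in slide 6, the sketch below works out how many doublings separate 10M and 1.3B metrics per minute. It assumes clean doublings from the 2011 figure, which the slide only states approximately ("~2x").

```python
import math

start_per_minute = 10_000_000        # 2011 figure from the slide
current_per_minute = 1_300_000_000   # "now" figure from the slide

# Number of doublings needed to grow from the start rate to the current rate.
doublings = math.log2(current_per_minute / start_per_minute)

# Rate reached after exactly seven doublings, for comparison with the slide.
after_seven = start_per_minute * 2 ** 7

print(f"doublings needed: {doublings:.2f}")
print(f"after 7 doublings: {after_seven:,} per minute")
```

Seven doublings of 10M give 1.28B per minute, which is close to the 1.3B quoted.
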
  7. 7. If You Build It …
  8. 8. Also … ● Decent UIs ● Alerting ○ And alert threshold analysis and recommendations ● Real-Time Analytics ● Integration with Hive and EMR ● Dashboards frameworks
  9. 9. Also … ● Composable ● So we can change our minds later …
  10. 10. For now … ● Query layer ● Back end
  11. 11. Soon … ● Improved deployment ● Publish client ● Alerting ● Better UI
  12. 12. Prana Diptanu Choudhury - @diptanu
  13. 13. Motivations ● The Netflix Platform stack is JVM based ● Platform features are provided to developers via client libraries ○ Service Discovery ○ Client Side Load Balancing ○ Monitoring and alerting client libraries
  14. 14. The Netflix Ecosystem
  15. 15. Meet Prana ● Prana provides the same set of features to non-jvm or non-netflix-platform based software ● It allows applications to gel with the Netflix Ecosystem
  16. 16. Prana Features ● Easy to use http based api ○ Load Balancing via Ribbon ○ Service discovery via Eureka client ○ Monitoring via Atlas Client ● Extensible via a plugin framework ● Highly Configurable
  17. 17. Raigad Sagar Loke - @sagar_loke
  18. 18. Raigad - Motivation ● Elasticsearch sidecar: a co-process that runs alongside the ES process ● Helps to automate ES deployment ○ ~50 Clusters in test -- ~180TB data ○ ~45 Clusters in prod -- ~780TB data ● Node Discovery and Tracking ● Automatic Index Management ● Scheduled Backup and Restore ● Geared towards running in AWS Environment
  19. 19. Auto ES Deployments ● Tunes the elasticsearch.yml file based on configuration parameters ● Multi-region support ● Currently follows dedicated Master-Data-Search deployment based on ASG Names
  20. 20. Node Discovery and Tracking ● Sample implementation using Cassandra ● C* keeps track of metadata information of ES Clusters ● ES instance reads C* to discover other nodes during bootstrap ● Storing metadata in C* helps in Multi-Region deployments
  21. 21. Auto Index Management ● Provides configuration properties for Auto Index Management ● Based on specific index date suffix (YYYYMMDD), old indices are cleaned and new indices are created ● Before running Index Manager
  22. 22. Auto Index Management … continued ● After running Index Manager ● Index Manager job can be scheduled
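
The date-suffix cleanup described in slides 21 and 22 can be sketched roughly as follows. This is not Raigad's code: the function names, the `retention_days` parameter, and the `logs` prefix in the usage note are all hypothetical, and the sketch only decides which index names to drop or create rather than talking to Elasticsearch.

```python
from datetime import date, datetime, timedelta


def split_indices(index_names, prefix, retention_days, today):
    """Return (keep, delete) lists for indices named '<prefix>YYYYMMDD'.

    Names that do not follow the pattern are always kept, since this
    rule does not manage them.
    """
    cutoff = today - timedelta(days=retention_days)
    keep, delete = [], []
    for name in index_names:
        suffix = name[len(prefix):] if name.startswith(prefix) else ""
        if len(suffix) != 8 or not suffix.isdigit():
            keep.append(name)
            continue
        try:
            day = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:  # eight digits that are not a real date
            keep.append(name)
            continue
        (delete if day < cutoff else keep).append(name)
    return keep, delete


def index_name_for(prefix, day):
    """Name of the index to pre-create for the given day."""
    return f"{prefix}{day:%Y%m%d}"
```

For example, with a seven-day retention and `today = date(2015, 2, 11)`, an index named `logs20150101` lands in the delete list while `logs20150210` is kept, and `index_name_for("logs", date(2015, 2, 12))` gives the name of the next index to create.
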
  23. 23. Running in AWS ● Automatic updates to Security Groups when new nodes are added or removed ● Supports IAM Credentials ● Scheduled Snapshot Backup to S3 -- uses elasticsearch-cloud-aws plugin ● Publish ES Metrics to Servo - Centralized Monitoring System
  24. 24. Genie 2 Tom Gianos
  25. 25. Our Current Architecture [architecture diagram; labels include Data Warehouse, Processing Clusters, Prod, Test, VPC, Bonus, Query, Clients, Service, Tools, CLIs]
  26. 26. Goals For Genie 2 ● Develop a generic data model, which would let jobs run on any multi-tenant distributed processing cluster. ● Implement a flexible cluster and command selection algorithm for running a job. ● Provide richer API support. ● Implement a more flexible, extensible and robust codebase.
  27. 27. { "user": "tgianos", "name": "PrestoJob.1421807841069", "commandArgs": "-f script.presto ", "clusterCriterias": [ { "tags": [ "presto", "prod" ] }, { "tags": [ "adhoc" ] } ], "commandCriteria": [ "presto" ], "tags": [ "headers", "presto", "BigDataPortal" ] }
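
To illustrate how a request like the one in slide 27 could drive cluster and command selection, here is a minimal sketch. It assumes that `clusterCriterias` is an ordered list of tag sets tried in priority order, and that a cluster qualifies when it carries every tag in a set and offers a command carrying every tag in `commandCriteria`. That reading, the `CLUSTERS` registry, and the `select_cluster` function are assumptions made for illustration, not Genie 2's implementation.

```python
# Job request taken from the slide (only the fields used for selection).
JOB = {
    "clusterCriterias": [{"tags": ["presto", "prod"]}, {"tags": ["adhoc"]}],
    "commandCriteria": ["presto"],
}

# Hypothetical registry of clusters and the commands each one offers.
CLUSTERS = [
    {"name": "adhoc-hadoop", "tags": ["adhoc", "hadoop"],
     "commands": [{"name": "hive", "tags": ["hive"]}]},
    {"name": "presto-prod", "tags": ["presto", "prod"],
     "commands": [{"name": "presto-cli", "tags": ["presto"]}]},
    {"name": "adhoc-presto", "tags": ["adhoc"],
     "commands": [{"name": "presto-cli", "tags": ["presto"]}]},
]


def select_cluster(job, clusters):
    """Return (cluster name, command name) for the first satisfiable criteria."""
    command_tags = set(job["commandCriteria"])
    for criteria in job["clusterCriterias"]:      # earlier criteria win
        wanted = set(criteria["tags"])
        for cluster in clusters:
            if not wanted <= set(cluster["tags"]):
                continue                          # cluster lacks a required tag
            for command in cluster["commands"]:
                if command_tags <= set(command["tags"]):
                    return cluster["name"], command["name"]
    return None                                   # nothing can run this job


print(select_cluster(JOB, CLUSTERS))
```

If the `presto-prod` cluster is removed from the registry, the same request falls through to the second criteria and picks `adhoc-presto`, skipping `adhoc-hadoop` because it has no Presto command.
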
  28. 28. Our Current Deployment ● 19 i2.2xlarge nodes in prod cluster ○ Configured to allow room to scale up as needed ● 34 max jobs per node ● ~17,000 Jobs Per Day
  29. 29. Inviso Daniel Weeks
  30. 30. Dynomite Minh Do - @timiblossom
  31. 31. What is Dynomite? ● Dynamo layer on top of a non-distributed system (Redis/Memcache) ○ Peer-to-peer ○ Replication ○ Sharding ○ Gossiping ○ Multi-datacenter and rack awareness ○ Encryption ○ Linear scale
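
The sharding and replication bullets in slide 31 can be pictured with a small token-ring sketch. It assumes each rack holds a full copy of the data split across its nodes by token range, so a key has one owner per rack. The MD5-based `token_for` hash, the `Rack` class, and the node names are illustrative assumptions and are not Dynomite's actual hashing or token assignment.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32


def token_for(key):
    """Map a key to a position on the ring (illustrative hash choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class Rack:
    def __init__(self, name, node_tokens):
        # node_tokens maps a node name to the ring token that node owns.
        self.name = name
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key):
        # First node whose token is >= the key's token, wrapping to the start.
        index = bisect.bisect_left(self._tokens, token_for(key))
        return self._ring[index % len(self._ring)][1]


def replicas(key, racks):
    """One owning node per rack, i.e. one replica per rack."""
    return {rack.name: rack.owner(key) for rack in racks}


racks = [
    Rack("rack-a", {"a1": 0, "a2": 2 ** 31}),
    Rack("rack-b", {"b1": 0, "b2": 2 ** 31}),
]
print(replicas("user:42", racks))
```

Adding a rack adds a replica of every key, while adding nodes to a rack narrows the token range each node owns.
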
  32. 32. Dynomite Node
  33. 33. Network Topology
  34. 34. Operation features ● Florida - sidecar application to manage Dynomite clusters (like Priam for Cassandra) ● Data backup (Redis only) ● Node replacement ● Data warm-up (Redis only) ● Client failover strategy - Dyno (our Java client) ● Atlas/Servo integration for operation metrics
  35. 35. Incoming features ● Higher read/write consistencies ● Data reconciliation or data repair ● Other data stores besides Redis and Memcache ● Better/more generic warm-up method ● Spark driver integration ● and others
  36. 36. Performance ● AWS: ○ 126 nodes total in us-east-1, us-west-2, eu-west-1 ○ r3.xlarge ○ 1 KB data payload ● 250K Write RPS, 250K Read RPS ● Client observed latencies ○ average less than 1ms ○ 99th at ~1.5ms ○ 99.5th at ~2.5ms
  37. 37. Nicobar Dynamic Scripting Library for Java Vasanth Asokan
  38. 38. What is Nicobar? Mainly, two things: 1. A Pluggable, Dynamic Scripting Framework for Java (powered by) 2. A Modular Classloading System
  39. 39. Traditional Java Classloader Hierarchy
  40. 40. Powered by JBoss Modules Nicobar Module Classloader Hierarchy
  41. 41. Putting it all together
  42. 42. MSL Mitch Zollinger
  43. 43. What is MSL? MSL = Message Security Layer MSL is a modern security protocol which enables arbitrary application protocols to be secured over arbitrary transport protocols.
  44. 44. Performance ● sub-second playback start ● MSL messaging stacked with app protocol ○ request can have: device authentication, user authentication, key exchange & application message ○ response can have: key exchange, authentication renewal, application message ● Netflix streaming should start faster than changing channels on your cable box!
  45. 45. Reliability ● We need 4-5 “9s” of reliability ● MSL has automatic error recovery ● We had to remove reliance on 3rd party PKI ● Client time: not needed by MSL
  46. 46. Modern Protocol Design ● Human readable JSON vs. complex binary format ○ ASN.1 security issues go away ● Multiple implementations: Java, JS, C#, … ○ JS: updateable in-field
  47. 47. Flexibility ● Pluggable ○ authentication ○ crypto algorithms ● Standard porting API ○ Can use W3C WebCrypto, for example
  48. 48. Deployment Models ● Trusted Services Network ○ All servers share a common master key allowing the same level of trust across the network ● Peer-to-Peer ○ Every pair of entities shares connection-specific keys & credentials
  49. 49. Security / Feature List ● encrypt / decrypt ● device authentication ● user authentication ● integrity protection ● key exchange ● anti-replay protection ● compression ● chunked messaging
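
Two items from the feature list in slide 49, integrity protection and anti-replay protection, can be illustrated over a JSON message with the sketch below. This is not the MSL wire format or API: the envelope fields (`body`, `tag`, `id`), the HMAC-SHA256 choice, and the `protect` and `Receiver` names are assumptions for illustration, and encryption and key exchange are left out.

```python
import hashlib
import hmac
import json


def protect(payload, key, message_id):
    """Wrap a payload in a JSON envelope with an HMAC integrity tag."""
    body = json.dumps({"id": message_id, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return json.dumps({"body": body, "tag": tag})


class Receiver:
    def __init__(self, key):
        self._key = key
        self._last_id = -1  # highest message id accepted so far

    def accept(self, message):
        envelope = json.loads(message)
        body = envelope["body"]
        expected = hmac.new(self._key, body.encode("utf-8"), hashlib.sha256).hexdigest()
        # Constant-time comparison avoids leaking how much of the tag matched.
        if not hmac.compare_digest(expected, envelope["tag"]):
            raise ValueError("integrity check failed")
        parsed = json.loads(body)
        # Reject any id that is not strictly newer than the last accepted one.
        if parsed["id"] <= self._last_id:
            raise ValueError("replayed message")
        self._last_id = parsed["id"]
        return parsed["payload"]
```

A receiver built with the same key accepts a message once, rejects the same message if it is sent again, and rejects any message whose body was altered after the tag was computed.
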
  50. 50. Zero To Docker Andrew Spyker - @aspyker
  51. 51. Netflix OSS on your laptop ● Up and running in minutes ● Before - Documented technology that we expected you to assemble ● Now - Running technology that we assembled and validated ● Not - Production Ready. Examples only, not run this way at Netflix (security, HA, monitoring, etc.) [Diagram: Docker Host (ex. VirtualBox on OS X), Ubuntu 14.04, single kernel; each container = filesystem + process: Container #1, Eureka Container, Zuul Container, Another Container, ...]
  52. 52. Trusted and Transparent Builds ● Start with the Docker Hub registry ○ Pull images that you know were built securely ○ All you need if you just want to run them ● Inspect the linked GitHub Dockerfile ○ Want to know how NetflixOSS was configured? ○ Want to know how NetflixOSS code was built? ○ All code and configuration explicitly documented
  53. 53. What is available? From https://hub.docker.com/u/netflixoss/ ● asgard ● eureka ● edda ● sketchy ● security monkey ● exhibitor ● sample karyon application ● zuul ● atlas
  54. 54. Nike Digital Alan Scherger - @flyinprogrammer
  55. 55. Where we started... > Datacenter 2 Cloud (AWS) > Cloud native architecture using microservices > Defined a Cloud Blueprint > Pioneered a REST application bootstrap > Maintain a boilerplate to define transitive dependencies.
  56. 56. How do we do metrics across billions (or 100s) of microservices?
  57. 57. ● Instrumentation to JMX ● Graphite Observer to capture metrics
  58. 58. Observer Modifications ● Use Eureka to find a Graphite node ● Use a healthcheck to timeout the tcp socket
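
As a rough picture of what a Graphite observer sends once it has found a node, the sketch below formats metrics in Carbon's plaintext line protocol (`path value timestamp`) and writes them over a TCP socket with a timeout. The function names and the two-second default are hypothetical, and the Eureka lookup and healthcheck from the slide are not shown.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


def send_metrics(host, port, lines, timeout_seconds=2.0):
    """Send pre-formatted lines to a Carbon plaintext listener.

    The timeout applies to the connect call and to later socket
    operations, so an unresponsive node raises instead of blocking.
    """
    with socket.create_connection((host, port), timeout=timeout_seconds) as sock:
        sock.sendall("".join(lines).encode("ascii"))
```

A caller would resolve the host from service discovery and pass it in, for example `send_metrics(host, 2003, [carbon_line("app.requests.count", 42)])`, where 2003 is Carbon's default plaintext port and the metric name is made up.
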
  59. 59. How are we going to store all of these metrics?
  60. 60. ● Graphite Carbon compliant ● Cassandra metric storage ● Elasticsearch metric search ● C* and ES cross-region replication enable a global view of the metrics Cyanite https://github.com/pyr/cyanite
  61. 61. How do we make these tools Blueprint compliant?
  62. 62. Sidecars to the rescue!
  63. 63. Priam + Cassandra = Done
  64. 64. Raigad + Elasticsearch ; Prana + Cyanite
  65. 65. So those didn’t exist - sour.
  66. 66. Generic Sidecar ● Application daemon ● Convention-over-configuration Groovy scripts.
  67. 67. configure.groovy Generates 3 config files from Eureka data.
  68. 68. Add the ingredients that produce code.
  69. 69. Atlas Jr.
  70. 70. Kevin Haverlock kbh@us.ibm.com Aroop Pandya apandya@us.ibm.com Kelly Abuelsaad kna@us.ibm.com Susan Diamond lsyang@us.ibm.com IBM Watson Developer Cloud
  71. 71. … http://www.msnbc.com/msnbc/how-supercomputer-sees-the-state-the-union
  72. 72. Visual Recognition Image/Video recognition and classification service to provide assessment of a user from their images Extract information from text: People, Organizations, Locations, Events, and the relationships between them User Modeling Improved understanding of people's preferences to help engage users on own terms Language Identification Machine Translation Concept Expansion Message Resonance Question and Answer Relationship Extraction Text to Speech The conversion of text to outputted audio stream Speech to Text Converts speech into text Tradeoff Analytics helps people make better choices while taking into account multiple, often conflicting, goals that matter Concept Analytics Links documents that you provide with a pre- existing graph of concepts
  75. 75. Joshua Long - @starbuxman Pivotal “Bootiful” Microservices with Spring Cloud & Netflix OSS
  76. 76. @starbuxman @Grab("spring-boot-starter-actuator") @RestController class GreetingsController { @RequestMapping("/hi/{name}") def hi(@PathVariable String name){ [ greeting: "Hello, " + name +"!" ] } } > spring run greeting.groovy > spring jar greeting.groovy greeting.jar
  77. 77. @starbuxman import org.springframework.cloud.config.server.EnableConfigServer; @SpringBootApplication @EnableConfigServer public class ConfigurationServerApplication { public static void main(String[] args) throws Exception { SpringApplication.run(ConfigurationServerApplication.class, args); } } spring: cloud: config: server: uri: ${MY_CONF:https://github.com/some/git-repository} @Value("${some.property}") private String someProperty ;
  78. 78. @starbuxman @SpringBootApplication @EnableEurekaClient public class DogeApplication { // … // src/main/resources/bootstrap.yml spring: application: name: doge-service
  79. 79. @starbuxman @Component class ReliableClient { @HystrixCommand( fallbackMethod = "defaultDogeLink") public Link buildDogeLink() { // insert volatile // service-to-service call here } } @SpringBootApplication @EnableHystrixDashboard public class HystrixApplication { public static void main(String[] args) { SpringApplication.run(HystrixApplication.class, args); } }
  80. 80. @starbuxman zuul: proxy: mapping: /api addProxyHeaders: true route: account-service: /accounts doge-service: /doges zuul: proxy: mapping: /api //: true route: /api/accounts /api/doges
  81. 81. @starbuxman spring: oauth2: client: clientId: acme clientSecret: acmesecret resource: tokenInfoUri: http://localhost:8002/auth/oauth/check_token id: openid serviceId: resource @SpringBootApplication @RestController @EnableOAuth2Resource public class SsoResourceApplication { public static void main(String[] args) { SpringApplication.run(SsoResourceApplication.class, args); } @RequestMapping("/hi") String hi(@RequestParam Optional<String> name) { return "Hello" + name.map(n -> ", " + n).orElse("") + "! "; } }
  82. 82. Josh Long (龙之春) @starbuxman @springcentral jlong@pivotal.io github.com/joshlong References spring.io/guides github.com/spring-cloud/ github.com/spring-cloud-samples/ github.com/joshlong/spring-doge github.com/joshlong/spring-doge-microservice docs.spring.io/spring-boot/ Questions?
  83. 83. Please join us next door for mingling, drinks and food! @netflixoss Thank you!
