SlideShare a Scribd company logo
1 of 38
Download to read offline
Netflix Open Source &
What I have done in a year?
Andrew Spyker
Senior Software Engineer, Netflix
Back to the Past
Previous talks at @TriangleDevops
● 10/16/2013 - Learn about NetflixOSS
● 6/18/2014 - Learn about Docker
About Netflix
● 69M members
● 2000+ employees (1400 tech)
● 80+ countries
● > 100M hours watch per day
● > ⅓ NA internet download traffic
● 500+ Microservices
● Many 10’s of thousands VM’s
● 3 regions across the world
About the Speaker
● Cloud platform technologies
○ Distributed configuration, service discovery, RPC, application
frameworks, non-Java sidecar
● Container cloud
○ Resource management and scheduling, making Docker containers
operational in Amazon EC2/ECS
● Open Source
○ Organize @NetflixOSS meetups & internal group
● Performance
○ Assist across Netflix, but focused mainly on cloud platform perf
With Netflix for ~ 1 year. Previously at IBM here in Raleigh/Durham (RTP)
@aspyker
ispyker.
blogspot.
com
Agenda
● NetflixOSS
Netflix Cloud Architecture
Getting Started
Personal Projects
Why does Netflix open source?
● Allows engineers to gather feedback
○ Openly talk, through code, on our approach
○ Collaboration on key projects with the world
○ Happily use proven outside open source
■ And improve it for Netflix scale and availability
● Netflix culture of freedom and responsibility
○ Want to open source?
○ Go for it, be responsible!
● Recruiting and Retention
○ Candidates know exactly what they can work on
○ NetflixOSS engineers choose to stay at Netflix
NetflixOSS is widely used
● The architecture has shaped public cloud usage
○ Immutability, Red/Black Deploys, Chaos,
Regional and worldwide high availability
● Offerings
○ Pivotal Spring Cloud
● Large usage
○ IBM Watson as a Service (on IBM Cloud)
○ Nike Digital is hiring NetflixOSS experts
● Interesting usage
○ “To help locate new troves of data claiming to be the files stolen from
AshleyMadison, the company’s forensics team has been using a tool
that Netflix released last year called Scumblr”
NetflixOSS Website Relaunch
http://netflix.github.io
Key aspects of NetflixOSS website
● Show how the pieces fit together
○ Projects now discussed with each other in context
● OSS categories mirror internal teams
○ No artificial categories, focal points for each area
● Focus on projects that are core to Netflix
○ Projects mentioned are core and strategic
Agenda
NetflixOSS
● Netflix Cloud Architecture
Getting Started
Personal Projects
Elastic, Web and Hyper Scale
Doing this
Not doing that
Elastic, Web and Hyper Scale
Front end
API
Another
Microservice
Temporal
caching
Durable
Storage
Load
Balancers
…
Strategy Benefit
Automate everything Less errors, more consistency than manual runbooks
Expose well designed API to users Offloads presentation complexity to clients
Remove state for mid tier services Allows easy elastic scale out
Push temporal state to client and caching tier Leverage clients, avoids data tier overload
Use partitioned data storage Data design and storage scales with HA
…
…
…
…
…
Recommendation
Microservice
HA and Automatic Recovery
Feeling This
Not Feeling That
Micro service
Implementation
Call microservice #2
Highly Available Service Runtime Recipe
Ribbon REST client
with Eureka
Microservice #1
(REST services)
App Service
Microservice #2
Execute
call
Hystrix
Eureka
Server(s)
Eureka
Server(s)
Eureka
Server(s)
Karyon
Fallback
Implementation
Implementation Detail Benefits
Decompose into micro services
• Key user path always available
• Failure does not propagate across service boundaries
Karyon /w automatic Eureka registration
• New instances are quickly found
• Failing individual instances disappear
Ribbon client with Eureka awareness
• Load balances & retries across instances with “smarts”
• Handles temporal instance failure
Hystrix as dependency circuit breaker
• Allows for fast failure
• Provides graceful cross service degradation/recovery
IaaS High Availability
Region (us-east-1)
us-east-1e
us-east-1c
Eureka
Web App Service1 Service2
Cluster Auto Recovery and Scaling Services (Auto Scaling Groups)
…
ELB’s
Rule Why?
Always > 2 of everything 1 is SPOF, 2 doesn’t scale, slow DR recovery, majority consensus not possible
Including IaaS and cloud services You’re only as strong as your weakest dependency
Use auto scaler/recovery monitoring Clusters guarantee availability and service latency
Use application level health checks Instance on the network != healthy
Worldwide availability Data replication, global front-end routing, cross region traffic
us-east-1d
A truly global service
● Replicate data across
regions
● Be able to redirect traffic
from region to region
● Be able to migrate
regional traffic to other
regions
● Have automated control
across regions Flux Demo
Testing is only way to prove HA
● Chaos Monkey
○ Kill instances in production - runs regularly
● Chaos Gorilla
○ Kills availability zones (single datacenter)
○ Also testing for split brain important
● Chaos Kong
○ Kill entire region and shift traffic globally
○ Run frequently but with prior scheduling
Continuous Delivery
Reading This
Not This
v
Continuous Delivery
Cluster v1 Canary v2 Cluster V2
Step Technology
Developers test locally Unit test frameworks
Continuous build Continuous build server based on gradle builds
Build “bakes” full instance image Aminator and deployment pipeline bake images from build artifacts
Developer work across dev and test Archaius allows for environment based context
Developers do canary tests, red/black
deployments in prod
Asgard console provides app cluster common devops approach,
security patterns, and visibility
Continuous
Build Server
Baked to images
(AMI’s)
… …
From Asgard to Spinnaker
● Spinnaker is our CI/CD solution
○ CI/CD solution including baking and Jenkins integration
○ Workflow engine for the continuous delivery
○ Pipeline based deployment including baking
○ Global visibility across all of our AWS regions
○ Provides an API first design
○ A microservices runtime HA architecture
○ More flexible cloud model so the community can contribute back
improvements not related to AWS
● Asgard continues to work side-by-side
● Spinnaker is this new end to end CI/CD tool
Spinnaker Examples
Works at
Netflix
scale
Views of
global
pipelines
From simple Asgard
like deployment to
advanced CI/CD
pipelines
Operational Visibility
If you can’t see it, you can’t improve it
Operational Visibility
Microservice #1 Microservice #2
Visibility Point Technology
Basic IaaS instance monitoring Not enough (not scalable, not app specific)
User like external monitoring SaaS offerings or OSS like Uptime
Targeted performance, sampling Vector performance and app level metrics
Service to service interconnects Hystrix streams ➔Turbine aggregation ➔Hystrix dashboard
Application centric metrics Servo/Spectator gauges, counters, timers sent to metrics store like Atlas
Remote logging Logstash/Kibana or similar log aggregation and analysis frameworks
Threshold monitoring and alerts Services like Atlas and PagerDuty for incident management
Servo/
Spectator
Hystrix/Turbine
External
Uptime
Monitoring Metric/Event
Repositories
LogStash/Elastic
Search/Kibana
Incidents
……
…
…
Atlas
Vector
Security
Dynamic
Security
Done in new ways
NOT
Dynamic, Web Scale & Simpler Security
Security Monkey
● Monitors security policies, tracks changes, alerts on situations
Scumblr
● Searches internet for security “nuggets” (credentials, hacking discussions)
Sketchy
● A safe way to collect text and screenshots from websites
FIDO
● Automated event detection, analysis, enrichment & and enforcement
Sleepy Puppy
● Delayed cross site scripting propagation testing framework
Lemur
● x.509 certificate orchestration framework
What did we not cover?
Over 50 github projects
● NetflixOSS is “Technical indigestion as a service”
Big Data, Data Persistence and UI Engineering
● Big Data tools used well beyond Netflix
● Ephemeral, semi and fully persistent data systems
● Recent addition of UI OSS and Falcor
Agenda
NetflixOSS
Netflix Cloud Architecture
● Getting Started
Personal Projects
How do I get started?
● All of the previous slides shows NetflixOSS components
○ Code: http://netflix.github.io
○ Announcements: http://techblog.netflix.com/
● Want to get running a bit faster?
● ZeroToCloud
○ Workshop for getting started with build/bake/deploy in Amazon EC2
● ZeroToDocker
○ Docker images that containing running Netflix technologies (not production
ready, but easy to understand)
ZeroToDocker Demo
Mac OS X
Virtual Box
Ubuntu 14.04
single kernel
Container#1
Filesystem+
process
Eureka
Container
ZuulContainer
Another
Container
...
● Docker running instances
○ Single kernel
○ Contained processes
● Zookeeper and Exhibitor
● A Microservices app and
surrounding NetflixOSS
services (Zuul to Karyon
with Eureka)
Agenda
NetflixOSS
Netflix Cloud Architecture
Getting Started
● Personal Projects
Performance Focus
● Reduced Karyon startup time by ⅔
○ Removal of classpath scanning
○ Moved eureka “UP” registration to be event based
○ Java 8 (faster startup was focus)
● Investigated other opportunities now being
considered for Karyon 3
○ Loading components asynchronously (console)
● Beyond platform startup time - key service
○ Fixes to platform that saved 3 minutes
■ library version tracking, ribbon connection priming
○ Fixes to application logic (distributed indexing/filtering)
Performance Focus - Eureka
● Identified issues w/ OOM’s & eureka client
○ For a “full update” we used 2G of memory
○ Was crashing discovery for our EVCache nodes
● Helped prototype the following
○ XStream - required 370M of heap
○ Jackson V1 (first attempt) - down to 260M
○ Jackson V2 (current) - down to 130M
○ Jackson V2 (+compact for future scenarios) - down to 64M
Performance Automation
● Implemented automated performance measurement
● Jenkins pipeline as part of every platform candidate
● Uses Elastic (search) and Kibana dashboards
● Measures
○ Boot to tomcat start time
○ Tomcat start to up in discovery
○ Profiles the startup
○ Number of dependencies
○ Used/unused dependencies
○ Jacoco code coverage
● In our face monitoring
dashboard
Platform Sidecar (Prana)
● Prana started as an edge focused “what was
needed”, then wider Netflix usage
● Created release management
○ User oriented smoke tests - Acme Air NodeJS
○ Now releases can be done with confidence
● Supported the Netflix desktop experience
○ Uses isomorphic JavaScript on NodeJS + Prana
○ Added circuit breaker, LB & dist config support
○ Caused my first partial outage (insert story here)
● Supported the EVCache clusters
Strategy - Platform Direction
● Helped define some of the platform direction
● Improvements in Eureka to ensure its
continued scalability
● Key improvements needed in Karyon 3
○ Performance improvements (footprint/startup)
○ Focus on mocks needed in dev, unit test, CI envs
○ Ability to narrow features for infrastructural services
○ Rework of Prana to be on same platform base
Open Source
● Led internal & external meetups on OSS
● Web site redesign to help external users
● Implemented ZeroToDocker
○ Implemented the platform focused aspects
○ Helped other teams onboard into ZeroToDocker
● Worked to operationalize prod deployments
○ Separate dev stack, metrics, consistent pipelines
○ Built up teams (existing impl, strategic work)
● Created strategy for going forward
○ Increase leverage of “Mantis” technology for
scheduling and job management
○ Increase leverage of ECS for Docker AWS
integration & resource management
● Working on strategy of non-runtime components
○ Changes to Netflix build/bake/deploy
Container cloud
Questions
?

More Related Content

What's hot

Timed Text At Netflix
Timed Text At NetflixTimed Text At Netflix
Timed Text At NetflixRohit Puri
 
20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers SoftwareDevOps Chicago
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Programaspyker
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4aspyker
 
NetflixOSS Meetup S6E1 - Titus & Containers
NetflixOSS Meetup S6E1 - Titus & ContainersNetflixOSS Meetup S6E1 - Titus & Containers
NetflixOSS Meetup S6E1 - Titus & Containersaspyker
 
Netflix Architecture and Open Source
Netflix Architecture and Open SourceNetflix Architecture and Open Source
Netflix Architecture and Open SourceAll Things Open
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemonsaspyker
 
Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientistsaspyker
 
Container World 2018
Container World 2018Container World 2018
Container World 2018aspyker
 
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, KayentaNetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, Kayentaaspyker
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflixaspyker
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2aspyker
 
Matt Chung (Independent) - Serverless application with AWS Lambda
Matt Chung (Independent) - Serverless application with AWS Lambda Matt Chung (Independent) - Serverless application with AWS Lambda
Matt Chung (Independent) - Serverless application with AWS Lambda Outlyer
 
Modern Monitoring - SysAdminDay 2017
Modern Monitoring - SysAdminDay 2017Modern Monitoring - SysAdminDay 2017
Modern Monitoring - SysAdminDay 2017Opsta
 
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldLeonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldOutlyer
 
Cncf storage-final-filip
Cncf storage-final-filipCncf storage-final-filip
Cncf storage-final-filipJuraj Hantak
 
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Outlyer
 
Cairo Kubernetes Meetup - October event Talk #1
Cairo Kubernetes Meetup - October event Talk #1Cairo Kubernetes Meetup - October event Talk #1
Cairo Kubernetes Meetup - October event Talk #1omehelba
 

What's hot (20)

Timed Text At Netflix
Timed Text At NetflixTimed Text At Netflix
Timed Text At Netflix
 
20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Program
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4
 
NetflixOSS Meetup S6E1 - Titus & Containers
NetflixOSS Meetup S6E1 - Titus & ContainersNetflixOSS Meetup S6E1 - Titus & Containers
NetflixOSS Meetup S6E1 - Titus & Containers
 
Netflix Architecture and Open Source
Netflix Architecture and Open SourceNetflix Architecture and Open Source
Netflix Architecture and Open Source
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
 
Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientists
 
Container World 2018
Container World 2018Container World 2018
Container World 2018
 
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, KayentaNetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflix
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2
 
Matt Chung (Independent) - Serverless application with AWS Lambda
Matt Chung (Independent) - Serverless application with AWS Lambda Matt Chung (Independent) - Serverless application with AWS Lambda
Matt Chung (Independent) - Serverless application with AWS Lambda
 
Modern Monitoring - SysAdminDay 2017
Modern Monitoring - SysAdminDay 2017Modern Monitoring - SysAdminDay 2017
Modern Monitoring - SysAdminDay 2017
 
The new Netflix API
The new Netflix APIThe new Netflix API
The new Netflix API
 
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldLeonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
 
Cncf storage-final-filip
Cncf storage-final-filipCncf storage-final-filip
Cncf storage-final-filip
 
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
 
Cairo Kubernetes Meetup - October event Talk #1
Cairo Kubernetes Meetup - October event Talk #1Cairo Kubernetes Meetup - October event Talk #1
Cairo Kubernetes Meetup - October event Talk #1
 

Viewers also liked

Ibm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalIbm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalaspyker
 
Netflix s2e1lightningtalk
Netflix s2e1lightningtalkNetflix s2e1lightningtalk
Netflix s2e1lightningtalkaspyker
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013aspyker
 
Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014aspyker
 
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@PulseGoing Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulseaspyker
 
Acme airnetflix 20130717
Acme airnetflix 20130717Acme airnetflix 20130717
Acme airnetflix 20130717aspyker
 
Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016aspyker
 
Cloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSCloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSaspyker
 
Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3aspyker
 
Re:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS IntegrationRe:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS Integrationaspyker
 
NetflixOSS season 2 episode 2 - Reactive / Async
NetflixOSS   season 2 episode 2 - Reactive / AsyncNetflixOSS   season 2 episode 2 - Reactive / Async
NetflixOSS season 2 episode 2 - Reactive / AsyncRuslan Meshenberg
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Jeremy Edberg
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016aspyker
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
Beyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapBeyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapJosh Evans
 

Viewers also liked (17)

Ibm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalIbm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinal
 
Netflix s2e1lightningtalk
Netflix s2e1lightningtalkNetflix s2e1lightningtalk
Netflix s2e1lightningtalk
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013
 
Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014
 
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@PulseGoing Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
 
Acme airnetflix 20130717
Acme airnetflix 20130717Acme airnetflix 20130717
Acme airnetflix 20130717
 
Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016
 
Cloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSCloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSS
 
Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3
 
Re:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS IntegrationRe:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS Integration
 
NetflixOSS season 2 episode 2 - Reactive / Async
NetflixOSS   season 2 episode 2 - Reactive / AsyncNetflixOSS   season 2 episode 2 - Reactive / Async
NetflixOSS season 2 episode 2 - Reactive / Async
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
Beyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapBeyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the Gap
 

Similar to Triangle Devops Meetup 10/2015

Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixAll Things Open
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_MicroservicesJason Varghese
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsMonal Daxini
 
Successful DevOps implementation for small teams a true story
Successful DevOps implementation for small teams  a true storySuccessful DevOps implementation for small teams  a true story
Successful DevOps implementation for small teams a true storyJakub Paweł Głazik
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014Hojoong Kim
 
NetflixOSS Meetup season 3 episode 2
NetflixOSS Meetup season 3 episode 2NetflixOSS Meetup season 3 episode 2
NetflixOSS Meetup season 3 episode 2Ruslan Meshenberg
 
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018iguazio
 
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureMRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureNGINX, Inc.
 
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...Srijan Technologies
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthNicolas Brousse
 
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsDisenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsC4Media
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017Patrick Chanezon
 
Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)Martin Spier
 
The journey to Native Cloud Architecture & Microservices, tracing the footste...
The journey to Native Cloud Architecture & Microservices, tracing the footste...The journey to Native Cloud Architecture & Microservices, tracing the footste...
The journey to Native Cloud Architecture & Microservices, tracing the footste...Mek Srunyu Stittri
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesAdrian Cockcroft
 
Kubernetes is all you need
Kubernetes is all you needKubernetes is all you need
Kubernetes is all you needVishwas N
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachNicola Ferraro
 
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebula Project
 

Similar to Triangle Devops Meetup 10/2015 (20)

Building a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at NetflixBuilding a Distributed & Automated Open Source Program at Netflix
Building a Distributed & Automated Open Source Program at Netflix
 
2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices2016_04_04_CNI_Spring_Meeting_Microservices
2016_04_04_CNI_Spring_Meeting_Microservices
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data Problems
 
Successful DevOps implementation for small teams a true story
Successful DevOps implementation for small teams  a true storySuccessful DevOps implementation for small teams  a true story
Successful DevOps implementation for small teams a true story
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014
 
NetflixOSS Meetup season 3 episode 2
NetflixOSS Meetup season 3 episode 2NetflixOSS Meetup season 3 episode 2
NetflixOSS Meetup season 3 episode 2
 
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
 
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureMRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
 
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and DaemonsDisenchantment: Netflix Titus, Its Feisty Team, and Daemons
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons
 
What's New in Docker - February 2017
What's New in Docker - February 2017What's New in Docker - February 2017
What's New in Docker - February 2017
 
Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)
 
The journey to Native Cloud Architecture & Microservices, tracing the footste...
The journey to Native Cloud Architecture & Microservices, tracing the footste...The journey to Native Cloud Architecture & Microservices, tracing the footste...
The journey to Native Cloud Architecture & Microservices, tracing the footste...
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
 
Kubernetes is all you need
Kubernetes is all you needKubernetes is all you need
Kubernetes is all you need
 
Cloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps ApproachCloud Native Applications on Kubernetes: a DevOps Approach
Cloud Native Applications on Kubernetes: a DevOps Approach
 
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Triangle Devops Meetup 10/2015

  • 1. Netflix Open Source & What I have done in a year? Andrew Spyker Senior Software Engineer, Netflix
  • 2. Back to the Past Previous talks at @TriangleDevops ● 10/16/2013 - Learn about NetflixOSS ● 6/18/2014 - Learn about Docker
  • 3. About Netflix ● 69M members ● 2000+ employees (1400 tech) ● 80+ countries ● > 100M hours watch per day ● > ⅓ NA internet download traffic ● 500+ Microservices ● Many 10’s of thousands VM’s ● 3 regions across the world
  • 4. About the Speaker ● Cloud platform technologies ○ Distributed configuration, service discovery, RPC, application frameworks, non-Java sidecar ● Container cloud ○ Resource management and scheduling, making Docker containers operational in Amazon EC2/ECS ● Open Source ○ Organize @NetflixOSS meetups & internal group ● Performance ○ Assist across Netflix, but focused mainly on cloud platform perf With Netflix for ~ 1 year. Previously at IBM here in Raleigh/Durham (RTP) @aspyker ispyker. blogspot. com
  • 5. Agenda ● NetflixOSS Netflix Cloud Architecture Getting Started Personal Projects
  • 6. Why does Netflix open source? ● Allows engineers to gather feedback ○ Openly talk, through code, on our approach ○ Collaboration on key projects with the world ○ Happily use proven outside open source ■ And improve it for Netflix scale and availability ● Netflix culture of freedom and responsibility ○ Want to open source? ○ Go for it, be responsible! ● Recruiting and Retention ○ Candidates know exactly what they can work on ○ NetflixOSS engineers choose to stay at Netflix
  • 7. NetflixOSS is widely used ● The architecture has shaped public cloud usage ○ Immutability, Red/Black Deploys, Chaos, Regional and worldwide high availability ● Offerings ○ Pivotal Spring Cloud ● Large usage ○ IBM Watson as a Service (on IBM Cloud) ○ Nike Digital is hiring NetflixOSS experts ● Interesting usage ○ “To help locate new troves of data claiming to be the files stolen from AshleyMadison, the company’s forensics team has been using a tool that Netflix released last year called Scumblr”
  • 9. Key aspects of NetflixOSS website ● Show how the pieces fit together ○ Projects now discussed with each other in context ● OSS categories mirror internal teams ○ No artificial categories, focal points for each area ● Focus on projects that are core to Netflix ○ Projects mentioned are core and strategic
  • 10. Agenda NetflixOSS ● Netflix Cloud Architecture Getting Started Personal Projects
  • 11. Elastic, Web and Hyper Scale Doing this Not doing that
  • 12. Elastic, Web and Hyper Scale Front end API Another Microservice Temporal caching Durable Storage Load Balancers … Strategy Benefit Automate everything Less errors, more consistency than manual runbooks Expose well designed API to users Offloads presentation complexity to clients Remove state for mid tier services Allows easy elastic scale out Push temporal state to client and caching tier Leverage clients, avoids data tier overload Use partitioned data storage Data design and storage scales with HA … … … … … Recommendation Microservice
  • 13. HA and Automatic Recovery Feeling This Not Feeling That
  • 14. Micro service Implementation Call microservice #2 Highly Available Service Runtime Recipe Ribbon REST client with Eureka Microservice #1 (REST services) App Service Microservice #2 Execute call Hystrix Eureka Server(s) Eureka Server(s) Eureka Server(s) Karyon Fallback Implementation Implementation Detail Benefits Decompose into micro services • Key user path always available • Failure does not propagate across service boundaries Karyon /w automatic Eureka registration • New instances are quickly found • Failing individual instances disappear Ribbon client with Eureka awareness • Load balances & retries across instances with “smarts” • Handles temporal instance failure Hystrix as dependency circuit breaker • Allows for fast failure • Provides graceful cross service degradation/recovery
  • 15. IaaS High Availability Region (us-east-1) us-east-1e us-east-1c Eureka Web App Service1 Service2 Cluster Auto Recovery and Scaling Services (Auto Scaling Groups) … ELB’s Rule Why? Always > 2 of everything 1 is SPOF, 2 doesn’t scale, slow DR recovery, majority consensus not possible Including IaaS and cloud services You’re only as strong as your weakest dependency Use auto scaler/recovery monitoring Clusters guarantee availability and service latency Use application level health checks Instance on the network != healthy Worldwide availability Data replication, global front-end routing, cross region traffic us-east-1d
  • 16. A truly global service ● Replicate data across regions ● Be able to redirect traffic from region to region ● Be able to migrate regional traffic to other regions ● Have automated control across regions Flux Demo
  • 17. Testing is only way to prove HA ● Chaos Monkey ○ Kill instances in production - runs regularly ● Chaos Gorilla ○ Kills availability zones (single datacenter) ○ Also testing for split brain important ● Chaos Kong ○ Kill entire region and shift traffic globally ○ Run frequently but with prior scheduling
  • 19. v Continuous Delivery Cluster v1 Canary v2 Cluster V2 Step Technology Developers test locally Unit test frameworks Continuous build Continuous build server based on gradle builds Build “bakes” full instance image Aminator and deployment pipeline bake images from build artifacts Developer work across dev and test Archaius allows for environment based context Developers do canary tests, red/black deployments in prod Asgard console provides app cluster common devops approach, security patterns, and visibility Continuous Build Server Baked to images (AMI’s) … …
  • 20. From Asgard to Spinnaker ● Spinnaker is our CI/CD solution ○ CI/CD solution including baking and Jenkins integration ○ Workflow engine for the continuous delivery ○ Pipeline based deployment including baking ○ Global visibility across all of our AWS regions ○ Provides an API first design ○ A microservices runtime HA architecture ○ More flexible cloud model so the community can contribute back improvements not related to AWS ● Asgard continues to work side-by-side ● Spinnaker is this new end to end CI/CD tool
  • 21. Spinnaker Examples Works at Netflix scale Views of global pipelines From simple Asgard like deployment to advanced CI/CD pipelines
  • 22. Operational Visibility If you can’t see it, you can’t improve it
  • 23. Operational Visibility Microservice #1 Microservice #2 Visibility Point Technology Basic IaaS instance monitoring Not enough (not scalable, not app specific) User like external monitoring SaaS offerings or OSS like Uptime Targeted performance, sampling Vector performance and app level metrics Service to service interconnects Hystrix streams ➔Turbine aggregation ➔Hystrix dashboard Application centric metrics Servo/Spectator gauges, counters, timers sent to metrics store like Atlas Remote logging Logstash/Kibana or similar log aggregation and analysis frameworks Threshold monitoring and alerts Services like Atlas and PagerDuty for incident management Servo/ Spectator Hystrix/Turbine External Uptime Monitoring Metric/Event Repositories LogStash/Elastic Search/Kibana Incidents …… … … Atlas Vector
  • 25. Dynamic, Web Scale & Simpler Security Security Monkey ● Monitors security policies, tracks changes, alerts on situations Scumblr ● Searches internet for security “nuggets” (credentials, hacking discussions) Sketchy ● A safe way to collect text and screenshots from websites FIDO ● Automated event detection, analysis, enrichment & and enforcement Sleepy Puppy ● Delayed cross site scripting propagation testing framework Lemur ● x.509 certificate orchestration framework
  • 26. What did we not cover? Over 50 github projects ● NetflixOSS is “Technical indigestion as a service” Big Data, Data Persistence and UI Engineering ● Big Data tools used well beyond Netflix ● Ephemeral, semi and fully persistent data systems ● Recent addition of UI OSS and Falcor
  • 27. Agenda NetflixOSS Netflix Cloud Architecture ● Getting Started Personal Projects
  • 28. How do I get started? ● All of the previous slides shows NetflixOSS components ○ Code: http://netflix.github.io ○ Announcements: http://techblog.netflix.com/ ● Want to get running a bit faster? ● ZeroToCloud ○ Workshop for getting started with build/bake/deploy in Amazon EC2 ● ZeroToDocker ○ Docker images that containing running Netflix technologies (not production ready, but easy to understand)
  • 29. ZeroToDocker Demo Mac OS X Virtual Box Ubuntu 14.04 single kernel Container#1 Filesystem+ process Eureka Container ZuulContainer Another Container ... ● Docker running instances ○ Single kernel ○ Contained processes ● Zookeeper and Exhibitor ● A Microservices app and surrounding NetflixOSS services (Zuul to Karyon with Eureka)
  • 31. Performance Focus ● Reduced Karyon startup time by ⅔ ○ Removal of classpath scanning ○ Moved eureka “UP” registration to be event based ○ Java 8 (faster startup was focus) ● Investigated other opportunities now being considered for Karyon 3 ○ Loading components asynchronously (console) ● Beyond platform startup time - key service ○ Fixes to platform that saved 3 minutes ■ library version tracking, ribbon connection priming ○ Fixes to application logic (distributed indexing/filtering)
  • 32. Performance Focus - Eureka ● Identified issues w/ OOM’s & eureka client ○ For a “full update” we used 2G of memory ○ Was crashing discovery for our EVCache nodes ● Helped prototype the following ○ XStream - required 370M of heap ○ Jackson V1 (first attempt) - down to 260M ○ Jackson V2 (current) - down to 130M ○ Jackson V2 (+compact for future scenarios) - down to 64M
  • 33. Performance Automation ● Implemented automated performance measurement ● Jenkins pipeline as part of every platform candidate ● Uses Elastic (search) and Kibana dashboards ● Measures ○ Boot to tomcat start time ○ Tomcat start to up in discovery ○ Profiles the startup ○ Number of dependencies ○ Used/unused dependencies ○ Jacoco code coverage ● In our face monitoring dashboard
  • 34. Platform Sidecar (Prana) ● Prana started as an edge focused “what was needed”, then wider Netflix usage ● Created release management ○ User oriented smoke tests - Acme Air NodeJS ○ Now releases can be done with confidence ● Supported the Netflix desktop experience ○ Uses isomorphic JavaScript on NodeJS + Prana ○ Added circuit breaker, LB & dist config support ○ Caused my first partial outage (insert story here) ● Supported the EVCache clusters
  • 35. Strategy - Platform Direction ● Helped define some of the platform direction ● Improvements in Eureka to ensure its continued scalability ● Key improvements needed in Karyon 3 ○ Performance improvements (footprint/startup) ○ Focus on mocks needed in dev, unit test, CI envs ○ Ability to narrow features for infrastructural services ○ Rework of Prana to be on same platform base
  • 36. Open Source ● Led internal & external meetups on OSS ● Web site redesign to help external users ● Implemented ZeroToDocker ○ Implemented the platform focused aspects ○ Helped other teams onboard into ZeroToDocker
  • 37. ● Worked to operationalize prod deployments ○ Separate dev stack, metrics, consistent pipelines ○ Built up teams (existing impl, strategic work) ● Created strategy for going forward ○ Increase leverage of “Mantis” technology for scheduling and job management ○ Increase leverage of ECS for Docker AWS integration & resource management ● Working on strategy of non-runtime components ○ Changes to Netflix build/bake/deploy Container cloud