Scheduling a Fuller House: Container Management At Netflix
Customers from all over the world streamed forty-two billion hours of Netflix content last year. Various Netflix batch jobs and an increasing number of service applications use containers for their processing. In this talk, Netflix will present a deep dive on the motivations and the technology powering container deployment on top of the AWS EC2 service. The talk will cover our approach to cloud resource management and scheduling with the open source Fenzo library, along with details on the Docker execution engine that is part of Project Titus. The talk will also share some of the results so far and lessons learned, and end with a brief look at the developer experience for containers.
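To make the scheduling idea concrete, here is a minimal sketch of fitness-based bin packing, the general style of placement a scheduling library can apply when assigning tasks to hosts. It is an illustration only: the `Host` class, `cpu_fitness`, `place`, and the task names are hypothetical and are not Fenzo's API, and it assumes CPU is the only resource and every task requests more than zero CPUs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Host:
    """Hypothetical host with a fixed CPU capacity (not a Fenzo type)."""
    name: str
    cpus: float
    used: float = 0.0
    tasks: List[str] = field(default_factory=list)

    def free(self) -> float:
        return self.cpus - self.used


def cpu_fitness(host: Host, task_cpus: float) -> float:
    """Score a placement: 0.0 if the task does not fit, otherwise the
    host's CPU utilization after placement (higher means tighter packing)."""
    if task_cpus > host.free():
        return 0.0
    return (host.used + task_cpus) / host.cpus


def place(hosts: List[Host], task: str, task_cpus: float) -> Optional[Host]:
    """Assign the task to the host with the highest fitness, or return None
    if no host has room. Ties go to the earliest host in the list."""
    best = max(hosts, key=lambda h: cpu_fitness(h, task_cpus), default=None)
    if best is None or cpu_fitness(best, task_cpus) == 0.0:
        return None
    best.used += task_cpus
    best.tasks.append(task)
    return best


# Two 8-CPU hosts and four illustrative tasks (names are made up).
hosts = [Host("host-a", 8), Host("host-b", 8)]
placements: Dict[str, Optional[str]] = {}
for task, cpus in [("encode-1", 4), ("encode-2", 2), ("report-1", 4), ("svc-1", 2)]:
    chosen = place(hosts, task, cpus)
    placements[task] = chosen.name if chosen else None

print(placements)
```

Preferring the fullest host that still fits keeps other hosts empty for longer, which is what makes it possible to scale idle hosts down in a cloud environment.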
Netflix and Containers: Not A Stranger Thing - aspyker
Customers from all over the world streamed forty-two billion hours of Netflix content last year. The Netflix streaming service had been powered by the Amazon cloud with virtual machines for over five years, blazing a trail for similar architectures. In the last year, it invested in containers for batch-style jobs and service-style applications. Andrew Spyker will explain the potential containers have to help Netflix create a more productive development experience while simultaneously deepening its control over resource management. Join Andrew to see why Netflix is moving forward with containers, how it can leverage its existing operational machinery, and how it’s running containers with a similar guarantee of high availability as current Netflix infrastructure provides.
Running Containers at Scale at Netflix. An update on the usage of containers at Netflix. Technical discussions on new features and concepts we've added across container scheduling and execution.
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017 - Codemotion
The container orchestrator ecosystem is moving fast, a galaxy of platforms and frameworks. How do you choose the right one for your needs? We look at all the orchestrators on the market, with their pros and cons: DC/OS, Kubernetes, Docker, and also the lesser-known ones that show promise, as well as the dynamics at play and the choices made.
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard - Docker, Inc.
Container technology is being used to answer some of the biggest questions in science today - what is the Universe made of? How has it evolved over time? Scientists use vast quantities of data to study these questions, and analyzing this data requires Big Data solutions on high performance computing resources. In this talk we discuss why containers are being deployed on the Cori supercomputer at NERSC (the National Energy Research Scientific Computing center) to answer fundamental scientific questions. We will give examples of the use of Docker in simulating complex physical processes and analyzing experimental data in fields as diverse as particle physics, cosmology, astronomy, genomics and material science. We will demonstrate how container technology is being used to facilitate access to scientific computing resources by scientists from around the globe. Finally, we will discuss how container technology has the potential to revolutionize scientific publishing, and could solve the problem of scientific reproducibility.
How Docker EE Helps Open Doors at Assa Abloy - Docker, Inc.
Over the past 20 years, Assa Abloy has transformed from a mechanical lock producer to the global leader in door-opening solutions. Today, Assa Abloy is at the forefront of innovation when it comes to digital access solutions.
During this talk, we will discuss how Assa Abloy is using Docker EE to build a Common Access Technology platform based on microservices running in containers. We will share the architectural decisions that were made and how those resulted in deploying Docker EE on AWS. We will discuss both the technical challenges Assa Abloy encountered and the organizational changes that affected the way they develop their software. Next, we will share how Assa Abloy plans to roll out on a global scale.
Proactive ops for container orchestration environments - Docker, Inc.
Break -> inspect -> fix is the Ops workflow for infrastructure stacks of the past. Distributed infrastructure and applications claim to be the new generation, but why is it so much more painful to maintain and troubleshoot them? Much of the pain comes from outdated operational models relying on reactive or, worse yet, manual monitoring and Ops.
This talk lays out a proactive Ops model for container infrastructure. By focusing on event monitoring, infrastructure state monitoring, trend analysis, and distributed log collection, a proactive Ops model delivers observability for distributed apps that was not possible before. Using real-world examples from Swarm and Kubernetes, we'll demonstrate the tools used and how we relieve Ops pain in container orchestration.
Join us to learn how to deploy your first containerized application on the most popular orchestration engine. You will understand the basic concepts of Kubernetes along with the terminology and the deployment architecture. We will show you everything from building a Docker image to going live with your application. Each attendee gets $300 credit to start using Google Container Engine!
Driving Business and Technical Agility in the Enterprise!
Container World 2017 is the only independent conference offering an exploration of the entire container ecosystem. Over 3 days, you’ll hear from the innovative enterprises, tech giants and startups who are transforming enterprise IT and driving business innovation on such topics as:
Containers and legacy infrastructure
Operations/DevOps
Orchestration & Workloads
Security
Storage/Persistent storage
Standardization and Certification
Emerging technology like serverless, unikernel and beyond
View the brochure for more information: https://goo.gl/OpnoEr
Andrew Spyker
Senior Software Engineer for Netflix
Find more by Andrew Spyker: http://www.slideshare.net/aspyker
All Things Open
October 26-27, 2016
Raleigh, North Carolina
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons - aspyker
Disenchantment is a Netflix show following the medieval misadventures of a hard-drinking princess, her feisty elf, and her personal demon. In this talk, we will follow the story of Netflix’s container management platform, Titus, which powers critical aspects of the Netflix business (video encoding & streaming, big data, recommendations & machine learning, and other workloads). We’ll cover the challenges growing Titus from 10’s to 1000’s of workloads. We’ll talk about our feisty team’s work across container runtimes, scheduling & control plane, and cloud infrastructure integration. We’ll talk about the demons we’ve found on this journey covering operability, security, reliability and performance.
NetflixOSS Meetup S3 E1, covering latest components in Distributed Databases, Telemetry systems, Big Data tools and more. Speakers from Netflix, IBM Watson, Pivotal and Nike Digital
NetflixOSS Meetup S6E1 - Titus & Containers - aspyker
Come hear about our container management platform, Titus. Titus launches over 2 million containers per week for service and batch workloads. Come learn which applications are powered by Titus and what value developers are getting from containers. We will also cover some of the unique aspects of Titus' reliability, control plane, scheduling, and container runtime technologies, as well as our integrations with Netflix systems such as Spinnaker and Amazon concepts such as VPC and IAM.
https://www.meetup.com/Netflix-Open-Source-Platform/events/247776324/
Kubernetes @ Squarespace: Kubernetes in the Datacenter - Kevin Lynch
This talk was presented at SRE NYC Meetup on August 16, 2017 at Squarespace HQ.
https://www.youtube.com/watch?v=UJ1QAKprVr4
As the engineering teams at Squarespace grow, we have been building more and more microservices. However, this has added operational strain as we try to shoehorn a growing, complex, dynamic environment into our static data center infrastructure. We needed to rethink how we handle deployments, dependency management, resource allocation, monitoring, and alerting. Docker containerization and Kubernetes orchestration help us tackle many of these problems, but the journey has been challenging. In this talk, we’ll discuss the challenges of running Kubernetes in a datacenter and how we used Prometheus and AlertManager to switch from per-instance health checks to a more SLA-focused alert structure.
How we have used Ansible for real-time industry use cases and integration with enterprise tools: infrastructure provisioning and configuration management using Ansible, and automating routine tasks.
Kubernetes @ Squarespace (SRE Portland Meetup October 2017) - Kevin Lynch
In this presentation I talk about our motivation for converting our microservices to run on Kubernetes. I discuss many of the technical challenges we encountered along the way, including networking issues, Java issues, monitoring and alerting, and managing all of our resources!
Watch this Tech Talk: https://do.co/video_pgupta
An introduction into the world of containers and the orchestration ecosystem, and how Kubernetes can help software developers and cloud infrastructure engineers be more agile, efficient, and productive.
Containers and Kubernetes have changed the infra world for good, bringing agility, efficiency, and more productivity. Still thinking about how to get started with Kubernetes? This talk is designed to give you an introduction into the world of containers and the orchestration ecosystem.
What You'll Learn
- Introduction to containers and microservices
- Introduction to Kubernetes and how it can help
- Essential Kubernetes building blocks (“primitives”) for getting started
About the Presenter
Peeyush Gupta is a cloud enthusiast with 5+ years of experience in developing cloud platforms and helping customers migrate their legacy applications to cloud. He has also been a speaker at multiple meetups and serves the developer community as part of Kubernetes contributor experience group. He is currently working with DigitalOcean as a Senior Developer Advocate.
Presented at DevOpsDays Boston. Over the past few years, massive open online courses (MOOCs) powered by Open edX have become wildly popular, bringing free or low-cost education to millions of students around the world. Such success, however, presents a slew of challenging problems in terms of providing a scalable, robust, and secure platform.
At Appsembler, we offer customers a fully managed and supported Open edX stack, all the way from the frontend web application to the backend services like ElasticSearch, MySQL, and MongoDB. With so many moving parts, we have come to realize the value of a multi-container, microservices-oriented architecture using Docker.
In contrast to a single-container deployment of the Open edX stack, a multi-container approach allows us to scale different services independently; improves robustness since we can simply spin up new copies of containers if they go down; and results in improved security through greater segmentation and isolation. In addition to discussing these benefits, we'll also cover how we're managing deployments using Kubernetes for orchestration and service discovery along with Google Cloud infrastructure.
Nate Aune is a developer and entrepreneur with over 15 years of professional experience building highly scalable web applications. Nate is also the founder and CEO of Appsembler. Morgan Robertson is a DevOps Engineer at Appsembler with experience in Docker, Ansible, Python, and automation tools.
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month - Nicolas Brousse
TubeMogul grew from a few servers to over two thousand servers handling over one trillion HTTP requests a month, each processed in less than 50 ms. To keep up with the fast growth, the SRE team had to implement an efficient Continuous Delivery infrastructure that allowed over 10,000 Puppet deployments and 8,500 application deployments in 2014. In this presentation, we will cover the nuts and bolts of the TubeMogul operations engineering team and how they overcome challenges.
DevOps Days Boston 2017: Real-world Kubernetes for DevOps - Ambassador Labs
DevOps Days Boston 2017
Microservices is an increasingly popular approach to building cloud-native applications. Dozens of new technologies that streamline adopting microservices development such as Docker, Kubernetes, and Envoy have been released over the past few years. But how do you actually use these technologies together to develop, deploy, and run microservices?
In this presentation, we’ll cover the nuances of deploying containerized applications on Kubernetes, including creating a Kubernetes manifest, debugging and logging, and how to build an automated continuous deployment pipeline. Then, we’ll do a brief tour of some of the advanced concepts related to microservices, including service mesh, canary deployments, resilience, and security.
Disenchantment: Netflix Titus, Its Feisty Team, and Daemons - C4Media
Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2Gmuwlg.
Andrew Spyker talks about Netflix's feisty team’s work across container runtimes, scheduling & control plane, and cloud infrastructure integration. He also talks about the demons they’ve found on this journey covering operability, security, reliability and performance. Filmed at qconsf.com.
Andrew Spyker worked to mature the technology base of Netflix Container Cloud (Project Titus) within the development team. Recently, he moved into a product management role collaborating with supporting Netflix infrastructure dependencies as well as supporting new container cloud usage scenarios including user on-boarding, feature prioritization/delivery and relationship management.
Agenda:
What is Software Defined Storage?
What is Ceph?
What is Rook?
Storage for Kubernetes
Storage Classes
Storage on Kubernetes
Operator Pattern
Custom Resource Definition
Rook Operator
Rook architecture
Ceph on Kubernetes with Rook
Demo
Rook Framework for Storage solutions
How to Get Involved?
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes - SeungYong Oh
Session Video: https://youtu.be/7MPH1mknIxE
In this talk, we share Devsisters' journey of migrating its internal data platform including Spark to Kubernetes, with its benefits and issues.
We share the benefits and issues we encountered while moving data platform components to Kubernetes at Devsisters.
Conference session page:
- English: https://sched.co/WIRK
- Korean: https://sched.co/WYRc
Free GitOps Workshop + Intro to Kubernetes & GitOps - Weaveworks
Follow along in this free workshop and experience GitOps!
AGENDA:
Welcome - Tamao Nakahara, Head of DX (Weaveworks)
Introduction to Kubernetes & GitOps - Mark Emeis, Principal Engineer (Weaveworks)
Weave Gitops Overview - Tamao Nakahara
Free Gitops Workshop - David Harris, Product Manager (Weaveworks)
If you're new to Kubernetes and GitOps, we'll give you a brief introduction to both and how GitOps is the natural evolution of Kubernetes.
Weave GitOps Core is a continuous delivery product to run apps in any Kubernetes. It is free and open source, and you can get started today!
https://www.weave.works/product/gitops-core
If you’re stuck, also come talk to us at our Slack channel! #weave-gitops http://bit.ly/WeaveGitOpsSlack (If you need to invite yourself to the Slack, visit https://slack.weave.works/)
OSDC 2018 | Three years running containers with Kubernetes in Production by T... - NETWAYS
This talk gives a state-of-the-art update on experiences deploying applications in Kubernetes at scale. Whether in the cloud or on premises, Kubernetes has taken over the leading role as a container operating system. The central paradigm of stateless containers connected to storage and services is the core of Kubernetes. However, it can be extended to distributed databases, machine learning, and Windows VMs in Kubernetes. All of these applications were considered edge cases a few years ago but are becoming more and more mainstream today.
Similar to Netflix Container Scheduling and Execution - QCon New York 2016 (20)
Herding Kats - Netflix’s Journey to Kubernetes Public - aspyker
An update from Netflix Compute's container management platform, Titus, covering the work to move from Mesos to Kubernetes. Lessons learned, next steps, and challenges.
Season 7 Episode 1 - Tools for Data Scientists - aspyker
Metaflow (Ville Tuulos)
Data scientists at Netflix are expected to develop and operate large machine learning workflows autonomously. However, we do not expect that all our scientists are deeply experienced with distributed systems and data engineering. Metaflow was created to make it delightfully easy to build and operate ML workflows in the cloud using idiomatic Python and off-the-shelf ML libraries, covering the whole lifecycle of an ML project from prototype to production.
Polynote (Jeremy Smith)
Polynote is a new notebook tool we created from scratch to address some of the pain points we've run into while using Scala in machine-learning notebooks at Netflix. It provides essential code editing features other tools lack like interactive auto-completes, support for mixing multiple languages and sharing data between them within a single notebook, and encourages reproducible notebooks with its immutable data model.
Papermill (Matthew Seal)
Nteract is an open source organization under which there are several libraries and applications that Netflix and many other companies and individuals contribute to. One of these libraries is Papermill, a library used to programmatically parameterize and execute Jupyter Notebooks. Papermill provides a CLI and Python interface that we'll explore during the session to see how it can be used and what value it adds. Using this pattern we'll also briefly talk about how we've integrated papermill at Netflix and how it interfaces with other Jupyter and nteract services.
CMP376 - Another Week, Another Million Containers on Amazon EC2 - aspyker
Netflix’s container management platform, Titus, powers critical aspects of the Netflix business, including video streaming, recommendations, machine learning, big data, content encoding, studio technology, internal engineering tools, and other Netflix workloads. Titus offers a convenient model for managing compute resources, enables developers to maintain just their application artifacts, and provides a consistent developer experience from a developer’s laptop to production by leveraging Netflix container-focused engineering tools.
In this episode, we will focus on continuous delivery and how Netflix uses Spinnaker and Kayenta to safely deliver changes to the cloud and beyond. Kayenta is a platform for Automated Canary Analysis (ACA). It is used by Spinnaker to enable automated canary deployments. We will also discuss how Spinnaker is used at Netflix to deploy targets beyond cloud VMs and containers --- batch jobs, CDNs, fast properties and Open Connect appliances.
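As a rough illustration of what an automated canary judgment involves, the sketch below compares metrics from a baseline deployment against a canary and turns the result into a pass/fail score. This is a deliberately simplified stand-in, not Kayenta's algorithm or API: the function name, the metric names, the sample values, the 10% tolerance, and the pass threshold of 75 are all assumptions made for the example, and it treats every metric as "lower is better".

```python
from statistics import mean
from typing import Dict, List


def canary_score(
    baseline: Dict[str, List[float]],
    canary: Dict[str, List[float]],
    tolerance: float = 0.10,
) -> float:
    """Return the percentage of metrics for which the canary's mean is no
    more than `tolerance` (as a fraction) above the baseline's mean."""
    passed = 0
    for name, base_values in baseline.items():
        allowed = mean(base_values) * (1 + tolerance)
        if mean(canary[name]) <= allowed:
            passed += 1
    return 100.0 * passed / len(baseline)


# Made-up samples: error rate is unchanged, latency regresses by about 30%.
baseline = {"error_rate": [0.010, 0.012, 0.011], "latency_ms": [100, 105, 98]}
canary = {"error_rate": [0.011, 0.012, 0.010], "latency_ms": [130, 128, 135]}

score = canary_score(baseline, canary)
verdict = "pass" if score >= 75 else "fail"  # hypothetical threshold
print(score, verdict)
```

A production system would typically use statistical tests over time series rather than a comparison of means, but the shape is the same: per-metric comparisons roll up into a single score that a delivery pipeline can gate on.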
Topics:
• RepoKid
Netflix’s Open-source Strategy to Rightsizing Cloud Permissions at Scale
• BetterTLS
A test suite for HTTPS clients implementing verification of the Name Constraints certificate extension
• Authorization at Netflix
Netflix’s architecture for implementing Authorization at scale
• Open Policy Agent
An open source, general-purpose policy engine that enables unified, context-aware policy enforcement across the entire stack. (www.openpolicyagent.org)
• Introducing PADME (Policy Access Decision Management Engine)
Modern policy management for distributed heterogeneous systems. (www.padme.io)
Demo Stations:
• Stethoscope
Personalized, user-focused recommendations for employee information security.
• HubCommander
Slack bot for GitHub organization management -- and other things too!
• Open Policy Agent
An open source, general-purpose policy engine that enables unified, context-aware policy enforcement across the entire stack.
Series of Unfortunate Netflix Container Events - QConNYC17 - aspyker
Project Titus is Netflix's container runtime on top of Amazon EC2. Titus powers algorithm research through massively parallel model training, media encoding, data research notebooks, ad hoc reporting, NodeJS UI services, stream processing and general micro-services. As an update from last year's talk, we will focus on the lessons learned operating one of the largest container runtimes on a public cloud. We'll cover the migration we've seen of applications and frameworks from VM's to containers. We will cover the operational issues with containers that only showed after we reached the large scale (1000's of container hosts, 100's of thousands of containers launched weekly) we are currently supporting. We'll touch base on the unique features we have added to help both batch and microservices run across a variety of runtimes (Java, R, NodeJS, Python, etc) and how higher level frameworks have taken advantage of Titus's scheduling capabilities.
In this episode, we will focus on open sourcing how we run Netflix's open source program. Netflix has been using and contributing to open source for several years. Over the years, Netflix has released over one hundred Netflix Open Source (aka NetflixOSS) libraries, servers, and technologies. Netflix engineers benefit by accepting contributions and gathering feedback with key collaborators around the world. Users of NetflixOSS from many industries benefit from our solutions including Big Data, Build and Delivery Tools, Runtime Services and Libraries, Data Persistence, Insight, Reliability and Performance, Security and User Interface. With such a large and mature open source program, Netflix has worked on approaches and tools that help manage and improve the NetflixOSS source offerings and communities. Netflix has taken a different approach to building support for open source as compared to other Internet scale companies. Come to this session to learn about the unique approaches Netflix has taken to both distribute and automate the responsibilities of building a world-class open source program.
Re:invent 2016 Container Scheduling, Execution and AWS Integration - aspyker
Members from all over the world streamed over forty-two billion hours of Netflix content last year. Various Netflix batch jobs and an increasing number of service applications use containers for their processing. In this session, Netflix presents a deep dive on the motivations and the technology powering container deployment on top of Amazon Web Services. The session covers our approach to resource management and scheduling with the open source Fenzo library, along with details of how we integrate Docker and Netflix container scheduling running on AWS. We cover the approach we have taken to deliver AWS platform features to containers such as IAM roles, VPCs, security groups, metadata proxies, and user data. We want to take advantage of native AWS container resource management using Amazon ECS to reduce operational responsibilities. We are delivering these integrations in collaboration with the Amazon ECS engineering team. The session also shares some of the results so far, and lessons learned throughout our implementation and operations.
Netflix Open Source: Building a Distributed and Automated Open Source Program - aspyker
Netflix has been using and contributing to open source for several years. Over the years, Netflix has released over one hundred Netflix Open Source (aka NetflixOSS) libraries, servers, and technologies. Netflix engineers benefit by accepting contributions and gathering feedback with key collaborators around the world. Users of NetflixOSS from many industries benefit from our solutions including Big Data, Build and Delivery Tools, Runtime Services and Libraries, Data Persistence, Insight, Reliability and Performance, Security and User Interface. With such a large and mature open source program, Netflix has worked on approaches and tools that help manage and improve the NetflixOSS source offerings and communities. Netflix has taken a different approach to building support for open source as compared to other Internet scale companies. Come to this session to learn about the unique approaches Netflix has taken to both distribute and automate the responsibilities of building a world-class open source program.
Netflix Open Source Meetup Season 4 Episode 3 - aspyker
In this episode, we will focus on security in the cloud at scale. We’ll have Netflix speakers discussing existing and upcoming security-related OSS releases, and we’ll also have external speakers from organizations that are using and contributing to Netflix security OSS.
First, Patrick Kelley from Netflix’s Security Operations team will speak about RepoMan, an upcoming OSS release designed to right-size AWS permissions. Then, Wes Miaw from Netflix’s Security Engineering team will discuss MSL (Message Security Layer).
We have two external speakers for this event - Chris Dorros from OpenDNS/Cisco will talk about his use of and contributions to Lemur, and Ryan Lane from Lyft will talk about their use of BLESS.
After the talks, we’ll have OSS authors at demo stations to answer questions and provide demos of Netflix security OSS, including Lemur, MSL, and Security Monkey.
Netflix Open Source Meetup Season 4 Episode 2 - aspyker
In this episode, we will take a close look at two different approaches to high-throughput, low-latency data stores developed by Netflix.
The first, EVCache, is a battle-tested distributed memcached-backed data store, optimized for the cloud. You will also hear about the road ahead for EVCache as it evolves into an L1/L2 cache over RAM and SSDs.
The second, Dynomite, is a framework to make any non-distributed data-store, distributed. Netflix's first implementation of Dynomite is based on Redis.
Come learn about the products' features and hear from Thomson and Reuters, Diego Pacheco from Ilegra and other third party speakers, internal and external to Netflix, on how these products fit in their stack and roadmap.
Netflix Open Source Meetup Season 4 Episode 1aspyker
Learn more about how we are evolving our open source. In our evolution we’ll discuss how we are approaching project lifecycles, metrics we are tracking that give us insight into the health of our key projects, and how we are working to make this clear to the communities involved with our projects.
Also, we will discuss one of our most recent key open source releases: Spinnaker (http://spinnaker.io/). Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence. Spinnaker powers thousands of deployments per day across the Netflix service.
Triangle Devops Meetup covering Netflix open source, cloud architecture, and what Andrew did in his first year working as a senior software engineer in the cloud platform group.
A presentation on the Netflix Cloud Architecture and NetflixOSS open source. For the All Things Open 2015 conference in Raleigh 2015/10/19. #ATO2015 #NetflixOSS
Netflix Container Scheduling and Execution - QCon New York 2016
1. Scheduling a Fuller House: Container Management
Sharma Podila, Andrew Spyker - Senior Software Engineers
2. About Netflix
● 81.5M members
● 2000+ employees (1400 tech)
● 190+ countries
● > 100M hours watched per day
● > ⅓ NA internet download traffic
● 500+ Microservices
● Many 10’s of thousands of VM’s
● 3 regions across the world
3. Agenda
● Why containers at Netflix?
● What did we build and what did we learn?
● What are our current and future workloads?
4. Why a 2nd edition of virtualization?
● Given our resilient cloud native, CI/CD devops enabled, elastically scalable virtual machine based architecture, did we really need containers?
5. Motivating factors for containers
● Simpler management of compute resources
● Simpler deployment packaging artifacts for compute jobs
● Need for a consistent local developer environment
6. Simpler compute, Management & Packaging
Batch/stream processing jobs
● Here are the files to run my process
● I need m cores, n disk, and o memory
● Please just run it for me!
Service style jobs (VM’s)
● Use tested/secure base AMI
● Bake an AMI
● Define launch config
● Choose t-shirt sized instance
● Canary & red/black ASG’s
7. Consistent developer experience
● Many years focused on
○ Build, bake / cloud deploy / operational experience
○ Not as much time focused on developer experience
● New Netflix local developer experience based on Docker
● Has had a benefit in both directions
○ Cloud like local development environment
○ Easier operational debugging of cloud workloads
8. What about resource optimization?
● Not absolutely required and easier to get wins at larger scale across larger virtual machine fleet
● However, potential benefits to
○ Elastic resource pool for scaling batch & adhoc jobs
○ Reliable smaller instance sizes for NodeJS
○ Cross Netflix resource optimizations
■ Trough usage, instance type migration
9. Agenda
● Why containers at Netflix?
● What did we build and what did we learn?
● What are our current and future workloads?
12. Lesson: Buy vs. Build, Why build our own?
● Looking across other container management solutions
○ Mesos, Kubernetes, and Swarm
● Proven solutions are focused on the datacenter
● Newer solutions are
○ Working to abstract datacenter and cloud
○ Delivering more than cluster manager
■ PaaS, Service discovery, IPC
■ Continuous deployment
■ Metrics
○ Not yet at our level of scale
● Not appropriate for Netflix
13. “Project Titus” (Firehose peek)
[Architecture diagram. Labeled components: Titus UI, Titus API, Rhea, Docker Registry, Cassandra, Titus Master (Job Management & Scheduler, Fenzo), S3, Zookeeper, EC2 Autoscaling API, Mesos Master, and the Titus Agent on Amazon VM’s (mesos agent, Titus executor, docker, containers, metrics agent, logging agent, zfs, Pod & VPC net drivers, AWS container metadata proxy), plus CI/CD Integration.]
15. Container Execution
[Architecture diagram: the Project Titus overview from slide 13, shown again to introduce the container execution components.]
16. Lesson: What you lose with Docker on EC2
● Networking: VPC
● Security: Security Groups, IAM Roles
● Context: Instance Metadata, User Data / Env Context
● Operational Visibility: Metrics, Health checking
● Resource Isolation: Networking, Local Storage
MULTI-TENANT
17. Lesson: Making Containers Act Like VM’s
● Built: EC2 Metadata Proxy
○ Provide overridden scheduled IAM role, instance id
○ Proxy other values
● Provided: Environmental Context
○ Titus specific job and task info
○ ASG app, stack, sequence, other EC2 standard
● Why? Now:
○ Service discovery registration works
○ Amazon service SDK based applications work
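The override-and-proxy behavior listed on this slide can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not Titus code: `TaskContext`, `resolve_metadata`, and the `fetch_from_host` callback are hypothetical names, and only the two overridden paths follow the public EC2 instance metadata layout. Serving the credentials for the role itself is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TaskContext:
    """Per-container identity chosen at scheduling time (hypothetical type)."""
    instance_id: str  # id reported to the container in place of the host's
    iam_role: str     # IAM role scheduled for this container


def resolve_metadata(path: str, task: TaskContext,
                     fetch_from_host: Callable[[str], str]) -> str:
    """Answer one metadata request on behalf of a container.

    Overridden keys come from the task context; every other path is
    passed through to the host's real metadata service.
    """
    path = path.rstrip("/")
    if path == "/latest/meta-data/instance-id":
        return task.instance_id
    if path == "/latest/meta-data/iam/security-credentials":
        # Role listing: expose only the container's role, never the host's.
        return task.iam_role
    # Proxy all other values unchanged.
    return fetch_from_host(path)
```

Because the answers look like ordinary instance metadata, code that reads 169.254.169.254 inside the container does not need to know it is talking to a proxy.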
18. Lesson: Networking will continue to evolve
● Started with batch
○ Started with “bridge” with port mapping
○ Added “host” with port resource mapping (for performance?)
○ Continue to use “bridge” without port mapping
● Service style apps added
○ Added “nfvpc” VPC IP/container with libnetwork plugin
○ Removed Host (no value over VPC IP/container)
○ Changed “nfvpc” VPC IP/container
■ Pod based with custom executor (no plugin)
○ Added security groups to “nfvpc”
19. Plumbing VPC Networking into Docker
[Network diagram of one EC2 VM. Labeled elements: eth0 / eni0 (SG=Titus Agent), eth1 / eni1 (SecGrp=X), eth2 / eni2 (SG=Y); Task 0 (no IP needed) on docker0; Task 1, Task 2, and Task 3, each a pod (pod root, veth<id>, app) with IP 1, IP 2, and IP 3 and security groups X or Y; Linux policy based routing between the veths and the ENIs; IPTables NAT sending 169.254.169.254 to the EC2 Metadata Proxy.]
20. Lesson: Secure Multi-tenancy is Hard
Common to VM’s and tiered security needed
● Protect the reduced host IAM role, Allow containers to have specific IAM roles
● Needed to support same security groups in container networking as VM’s
User namespacing
● Docker 1.10 - Introduced User Namespaces
● Didn’t work w/ shared networking NS
● Docker 1.11 - Fixed shared networking NS’s
● But, namespacing is per daemon
● Not per container, as hoped
● Waiting on Linux
● Considering mass chmod / ZFS clones
21. Operational Visibility Evolution
● What is “node” - containers on VM’s
● Soft limits / bursting a good thing?
○ Until percent util and outliers are considered
● System level metrics
○ Currently - hand coded cgroup scraping
○ Considering Intel Snap replacement
● Pollers - Metrics, Health, Discovery
○ Created Edda common “server group” view
23. Job Management and Resource Scheduling
[Architecture diagram: the Project Titus overview from slide 13, shown again to introduce the job management and scheduling components.]
24. Lesson: Complexity in scheduling
● Resilience
○ Balance instances across EC2 zones, instances within a zone
● Security
○ Two level resource for ENIs
● Placement optimization
○ Resource affinity
○ Task locality
○ Bin packing (Auto Scaling)
25. Lesson: Keep resource scheduling extensible
Fenzo - Extensible Scheduling Library
Features:
● Heterogeneous resources & tasks
● Autoscaling of mesos cluster
○ Multiple instance types
● Plugins based scheduling objectives
○ Bin packing, etc.
● Plugins based constraints evaluator
○ Resource affinity, task locality, etc.
● Scheduling actions visibility
https://github.com/Netflix/Fenzo
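The plugin idea on this slide, with hard constraints filtering hosts and a scheduling objective ranking the survivors, can be sketched as follows. Fenzo itself is a Java library; this Python sketch does not reproduce its API, and `Host`, `Task`, `cpu_bin_packing`, and `pick_host` are hypothetical names chosen for illustration. Only CPU is modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    name: str
    zone: str
    total_cpus: float
    used_cpus: float = 0.0


@dataclass
class Task:
    name: str
    cpus: float


# A constraint is a hard yes/no check; a fitness function scores 0.0 to 1.0.
Constraint = Callable[[Task, Host], bool]
Fitness = Callable[[Task, Host], float]


def cpu_bin_packing(task: Task, host: Host) -> float:
    """Score hosts by how full they would be, so idle hosts can be scaled down."""
    return (host.used_cpus + task.cpus) / host.total_cpus


def pick_host(task: Task, hosts: List[Host], fitness: Fitness,
              constraints: List[Constraint]) -> Optional[Host]:
    """Return the best-scoring host that fits and passes every constraint."""
    candidates = [
        h for h in hosts
        if h.used_cpus + task.cpus <= h.total_cpus
        and all(c(task, h) for c in constraints)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: fitness(task, h))
    best.used_cpus += task.cpus  # record the assignment
    return best
```

Swapping `cpu_bin_packing` for another scoring function, or passing a zone or locality check in `constraints`, changes placement behavior without touching `pick_host`, which is the extensibility the slide argues for.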
27. Resources assigned in Titus
● CPU, memory, disk capacity
● Per container AWS EC2 Security groups, IP, and network bandwidth via custom driver
● Abstracting out EC2 instance types
28. Security groups and their resources
A two level resource per EC2 Instance: N ENIs, each with M IPs
ENI 0
Assigned Security Group: SG1 Used IPs Count: 2 of 7
ENI 1
Assigned Security Group: SG1,SG2 Used IPs Count: 1 of 7
ENI 2
Assigned Security Group: SG3 Used IPs Count: 7 of 7
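The two-level resource above can be modeled as a list of ENIs, each with a set of security groups and a count of used IPs. The sketch below is an illustration under assumptions, not the Titus implementation: `Eni` and `assign_ip` are hypothetical names, and it assumes a task may share an ENI only when the security group sets match exactly.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Eni:
    index: int
    max_ips: int
    security_groups: Optional[FrozenSet[str]] = None  # None = not yet assigned
    used_ips: int = 0


def assign_ip(enis: List[Eni], wanted: FrozenSet[str]) -> Optional[int]:
    """Reserve one IP for a task that needs the `wanted` security groups.

    Returns the index of the ENI used, or None if this instance cannot
    host the task.
    """
    # First level: find an ENI already carrying exactly these groups.
    # Second level: it must still have a free IP.
    for eni in enis:
        if eni.security_groups == wanted and eni.used_ips < eni.max_ips:
            eni.used_ips += 1
            return eni.index
    # Otherwise claim an ENI that has no groups assigned yet.
    for eni in enis:
        if eni.security_groups is None:
            eni.security_groups = wanted
            eni.used_ips = 1
            return eni.index
    return None
```

With the state shown on the slide, a task requesting SG1 lands on ENI 0, while a task requesting SG3 cannot be placed on this instance because ENI 2 has used 7 of 7 IPs and no unassigned ENI remains.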
29. Lesson: Scheduling Vs. Job Management
Scheduling resources to tasks is common.
Lifecycle management is not.
30. Lesson: Scheduling Vs. Job Management
Task scheduling concerns
● Assign resources to tasks
● Cluster wide optimizations
○ Bin packing
○ Global constraints, like SLAs
● Task preferences and constraints
○ Locality with other tasks
○ Resource affinity
Job manager concerns
● Managing task/instance counts
● Creating metadata, defining constraints
● Lifecycle management
○ Replace failed task executions
● Handle failures
○ Rate limit requeuing & relaunching
○ Time out tasks in transitionary states
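Two of the job manager concerns above, rate limiting relaunches and timing out tasks stuck in transitional states, can be sketched as a pure decision function. The state names, the exponential backoff policy, and the default timeouts are assumptions made for this example; the slides do not say how Titus implements either.

```python
from dataclasses import dataclass
from typing import Optional

# States a task should only pass through briefly (assumed names).
TRANSITIONAL_STATES = {"LAUNCHING"}


@dataclass
class TaskRecord:
    task_id: str
    state: str               # e.g. "QUEUED", "LAUNCHING", "RUNNING", "FAILED"
    state_entered_at: float  # seconds on any monotonic clock
    failures: int = 0


def relaunch_delay(failures: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... up to cap seconds."""
    return min(cap, base * (2 ** max(0, failures - 1)))


def next_action(task: TaskRecord, now: float,
                launch_timeout: float = 120.0) -> Optional[str]:
    """Decide what the job manager should do with one task right now."""
    elapsed = now - task.state_entered_at
    if task.state in TRANSITIONAL_STATES and elapsed > launch_timeout:
        return "fail"      # stuck in a transitional state too long
    if task.state == "FAILED" and elapsed >= relaunch_delay(task.failures):
        return "relaunch"  # backoff period has passed
    return None            # nothing to do yet
```

Keeping this logic in the job manager rather than the scheduler matches the split on the slide: the scheduler only assigns resources, while lifecycle policy decides when a task is offered to it again.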
31. Future Job Management & Scheduling Focus
31
● More resources to track: GPUs
● Automatic resource affinity with heterogeneous instances
● SLAs
○ Latencies for services
○ Throughput for batch
○ Task preemptions
32. Things we didn’t cover in this talk
● Overall integration
○ Chaos, continuous delivery, performance insight
● Container Execution
○ Logging (live log access & S3 log rotation)
○ Liveness and health checking
○ Isolation (disk usage, networking, block I/O)
○ Image registry (metrics, security scanning)
● Scheduling
○ Autoscaling heterogeneous pools
○ Host-task fitness criteria
● API
○ Extensibility, polymorphic, SLA and job/container ownership
33. Agenda
● Why containers at Netflix?
● What did we build and what did we learn?
● What are our current and future workloads?
34. Current Titus Production Usage
● Autoscaling
○ 100’s of r3.8xl’s
○ Each 32 vCPU, 244G
● Peak
○ Thousands of cores
○ Tens of TB’s memory
● Thousands of containers/day
○ ~ 100 different images
35. Workloads, Past
● Most current usage is batch
○ Algorithm training, adhoc reporting jobs
● Sampling:
○ Training of “sims” and A/B test models
○ Open Connect Device/IX reporting
○ Web security scanning and analysis
○ Social media analytics updates
36. Workloads, Now
● Spent last five months adding service style support
● First line of fire customer requests already received
● Larger scale shadow and trickle traffic throughout 2Q
● First service style apps
○ Finer grained instances - NodeJS
○ Docker provided local developer experience
37. Workloads, Coming
● Media Encoding
○ Thousands of VM’s
○ VM based resource scheduling
○ Considering containers to have faster start-up
○ Internal spot-market - trough borrowing
● SPaaS
○ 10’s of thousands of containers
○ Stream Processing as a Service
○ Convert scheduling systems to Titus
39. Other Netflix QCon Talks
Title | Time | Speaker(s)
The Netflix API Platform for Server-Side Scripting | Monday 10:35 | Katharina Probst
Scheduling A Fuller House: Container Mgmt @ Netflix | Tuesday 10:35 | Andrew Spyker & Sharma Podila
Chaos Kong - Endowing Netflix with Antifragility | Tuesday 11:50 | Luke Kosewski
The Evolution of the JavaScript | Wednesday 4:10 | Jafar Husain
Async Programming in JS: The End of the Loop | Friday 9:00 | Jafar Husain