The STUPS platform is a set of tools and components to provide a convenient and audit-compliant Platform-as-a-Service (PaaS) for multiple autonomous teams on top of Amazon Web Services (AWS).
More information: http://stups.io
Docker Meetup San Francisco: Radical Agility with Docker & AWS (Volker Pilz)
This slide deck is about Zalando's open-source PaaS framework STUPS (stups.io), which was built in-house to enable multiple teams to use the full power of AWS without sacrificing vital aspects like security, traceability and architectural standards. Docker plays a key role in this setup and helps us to realize an easy and robust deployment process.
The corresponding meetup, organized by Docker, took place on Oct 28, 2015 at the Microsoft Reactor space in San Francisco.
STUPS by Zalando @ AWS User Group Ireland Meet Up September 2015 (Henning Jacobs)
Zalando's STUPS is a set of open source tools and components to provide a convenient and audit-compliant platform for multiple autonomous teams on top of Amazon Web Services. The talk covers how Zalando does Docker-based deployments.
How Zalando runs Kubernetes clusters at scale on AWS - AWS re:Invent (Henning Jacobs)
Many clusters, many problems? Having many clusters has benefits: reduced blast radius, less vertical scaling of cluster components, and a natural trust boundary. In this session, Zalando shows its approach for running 140+ clusters on AWS, how it does continuous delivery for its cluster infrastructure, and how it created open-source tooling to manage cost efficiency and improve developer experience. The company openly shares its failures and the learnings collected during three years of Kubernetes in production.
AWS re:Invent session OPN211 on 2019-12-05
Why I love Kubernetes Failure Stories and you should too - GOTO Berlin (Henning Jacobs)
Talk held on 2019-10-24 at GOTO Berlin:
Everybody loves failure stories, but maybe for the wrong reasons: Schadenfreude and Internet comment threads are the dark side; continuous improvement through blameless postmortems, sharing incidents, and documenting learnings is what motivated me to compile the list of Kubernetes Failure Stories. Kubernetes gives us an infrastructure platform to talk in the same "language" and foster collaboration across organizations. In this talk, I will walk you through our horror stories of operating 100+ clusters and share the insights we gained from incidents, failures, user reports and general observations. I will highlight why Kubernetes makes sense despite its perceived complexity. Our failure stories will be sourced from recent and past incidents, so the talk will be up-to-date with our latest experiences.
https://gotober.com/2019/sessions/1129/why-i-love-kubernetes-failure-stories-and-you-should-too
Why Kubernetes? Cloud Native and Developer Experience at Zalando - Enterprise... (Henning Jacobs)
Kubernetes has established itself as the de facto standard for cloud native platforms. But why? What advantages and pitfalls are there in practice? Using Zalando as an example, Henning Jacobs shows how Kubernetes serves as infrastructure for 1200+ developers, which aspects make Kubernetes unique despite its complexity, and what this means for the developer experience.
Why Kubernetes? Cloud Native and Developer Experience at Zalando - OWL Tech &... (Henning Jacobs)
Talk held on 2019-09-26 in Paderborn:
The keynote:
Why Kubernetes? Cloud Native and Developer Experience at Zalando
Kubernetes has established itself as the de facto standard for cloud native platforms. Why? What advantages and pitfalls are there in practice?
Using Zalando as an example, Henning Jacobs shows how Kubernetes serves as infrastructure for 1200+ developers, which aspects make Kubernetes unique despite its complexity, and what this means for the developer experience.
Henning Jacobs is Head of Developer Productivity at Zalando and thus responsible for the developer experience of more than 200 Zalando delivery teams.
With his list of Kubernetes Failure Stories, Henning shows that Kubernetes is an excellent platform for sharing experience.
https://teuto.net/owl-tech-innovation-day/
While Go is the language-of-choice in the cloud-native world, Python has a huge community and makes it really easy to extend Kubernetes in only a few lines of code.
This talk shows examples of how to use Python to query the Kubernetes API, how to write simple controllers in only 10 lines of Python, how to build complete web UIs, and how to test everything with py.test and Kind.
Some of the open-source projects which will be covered: pykube-ng, Kubernetes Web View, kube-janitor, and Kopf (Kubernetes Operator Pythonic Framework).
Talk held in Prague on 2019-09-05:
https://www.meetup.com/Cloud-Native-Prague/events/263802447/
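To give a feel for how small the core of a Python controller can be, here is a minimal sketch of a cleanup-style reconcile step, loosely in the spirit of kube-janitor. It is not code from the talk or from any of the listed projects: the annotation name `janitor/ttl`, the supported time units, and the helper names (`parse_ttl`, `expired`, `reconcile`) are assumptions made for illustration. Resources are plain dictionaries shaped like Kubernetes object metadata; a real controller would list and delete them through an API client such as pykube-ng.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed annotation name and units, chosen for this illustration only.
TTL_ANNOTATION = "janitor/ttl"
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_ttl(ttl: str) -> timedelta:
    """Parse a TTL such as '30m' or '24h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])", ttl)
    if not match:
        raise ValueError(f"invalid TTL: {ttl!r}")
    return timedelta(seconds=int(match.group(1)) * UNIT_SECONDS[match.group(2)])


def expired(resource: dict, now: datetime) -> bool:
    """Return True if the resource carries a TTL annotation and has outlived it."""
    metadata = resource["metadata"]
    ttl = (metadata.get("annotations") or {}).get(TTL_ANNOTATION)
    if ttl is None:
        return False  # resources without the annotation are never touched
    created = datetime.strptime(
        metadata["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now - created > parse_ttl(ttl)


def reconcile(resources: list, now: datetime) -> list:
    """Return the names of resources that a cleanup controller would delete."""
    return [r["metadata"]["name"] for r in resources if expired(r, now)]
```

A controller would call `reconcile` periodically with the current list of objects and delete whatever it returns; keeping the decision logic free of API calls also makes it easy to unit-test with py.test.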
Kubernetes Failure Stories, or: How to Crash Your Cluster - ContainerDays EU ... (Henning Jacobs)
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 100+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Our failure stories will be sourced from recent and past incidents, so the talk will be up-to-date with our latest experiences.
Why we don’t use the Term DevOps: the Journey to a Product Mindset - DevOpsCo... (Henning Jacobs)
While the adoption of DevOps makes teams move faster with reduced dependency on central operations, it can constrain teams who lack the skills to self-manage the full application and infrastructure stack. The way to overcome this challenge is creating an internal platform and treating it as a world-class product offering. “Applying product management to internal platforms means establishing empathy with internal consumers (read: developers) and collaborating with them on the design. Platform product managers establish roadmaps and ensure the platform delivers value to the business and enhances the developer experience”, via ThoughtWorks Technology Radar. In this talk, we will walk you through how Zalando adopted a customer-first mindset with regards to its developer tooling. We will show the effect on developer satisfaction when internal platforms are given the same respect as external product offerings. We will tell our story on how we moved from a classical infrastructure team to a product mindset with strong focus on building a world-class developer experience. We will share both our learnings and challenges going through this transition, and the impact it has on the daily life of our customers (developers).
Why we don’t use the Term DevOps: the Journey to a Product Mindset - Destinat... (Henning Jacobs)
While the adoption of DevOps makes teams move faster with reduced dependency on central operations, it can constrain teams who lack the skills to self-manage the full application and infrastructure stack.
The way to overcome this challenge is creating an internal platform and treating it as a world-class product offering. “Applying product management to internal platforms means establishing empathy with internal consumers (read: developers) and collaborating with them on the design. Platform product managers establish roadmaps and ensure the platform delivers value to the business and enhances the developer experience”, via ThoughtWorks Technology Radar.
In this talk, Henning Jacobs will walk you through how Zalando adopted a customer-first mindset with regards to its developer tooling. He will show the effect on developer satisfaction when internal platforms are given the same respect as external product offerings. Henning will furthermore tell his story about how Zalando moved from a classical infrastructure team to a product mindset with strong focus on building a world-class developer experience. Henning shares both their learnings and challenges going through this transition, and the impact it has on the daily life of Zalando’s customers (developers).
This talk was given in Aarhus on 4 June 2019.
Kubernetes Failure Stories - KubeCon Europe Barcelona (Henning Jacobs)
Talk given on 2019-05-21 at KubeCon Barcelona: https://kccnceu19.sched.com/event/MPcM/kubernetes-failure-stories-and-how-to-crash-your-clusters-henning-jacobs-zalando-se
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 100+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Our failure stories will be sourced from recent and past incidents, so the talk will be up-to-date with our latest experiences.
Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ...) as well. This talk strives to reduce the audience's unknown unknowns about running Kubernetes in production.
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc... (Henning Jacobs)
Talk given at JAX DevOps London on 2019-05-15.
Kubernetes has the concept of resource requests and limits. Pods get scheduled on the nodes based on their requests and are optionally limited in how much of the resource they can consume. Understanding and optimizing resource requests/limits is crucial both for reducing resource "slack" and for ensuring application performance and low latency. This talk shows our approach to monitoring and optimizing Kubernetes resources for 90+ clusters to achieve cost-efficiency and reduce the impact on latency-critical applications. All shown tools are open source and can be applied to most Kubernetes deployments. Topics covered in the talk include: understanding resource requests and limits, cgroups and CFS quota behavior, contributing factors to cluster costs (in public clouds), and best practices for managing Kubernetes resources.
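The relationship between a CPU limit, the CFS quota and resource "slack" can be sketched in a few lines of Python. This is a minimal illustration rather than code from the talk: it assumes the default CFS period of 100 ms, treats slack as the requested CPU minus the observed usage, and the helper names (`parse_cpu`, `cfs_quota_us`, `cpu_slack`) are hypothetical.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period: 100 ms, in microseconds


def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '500m' or '2' into cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


def cfs_quota_us(cpu_limit: str, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time (in microseconds) a container may use per CFS period."""
    return int(round(parse_cpu(cpu_limit) * period_us))


def cpu_slack(cpu_request: str, usage_cores: float) -> float:
    """Requested but unused CPU in cores; never negative."""
    return max(parse_cpu(cpu_request) - usage_cores, 0.0)
```

With a limit of `500m`, the container may consume 50 ms of CPU time in every 100 ms period and is throttled for the rest of the period once that budget is used up, which is why tight CPU limits can hurt latency-critical applications. The slack figure, summed over all pods, gives a rough view of how much requested capacity is paid for but not used.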
Talk held at DevOps Gathering 2019 in Bochum on 2019-03-13.
Abstract: This talk will address one of the most common challenges of organizations adopting Kubernetes on a medium to large scale: how to keep cloud costs under control without babysitting each and every deployment and cluster configuration? How to operate 80+ Kubernetes clusters in a cost-efficient way for 200+ autonomous development teams?
This talk provides insights on how Zalando approaches this problem with central cost optimizations (e.g. Spot), cost monitoring/alerting, active measures to reduce resource slack, and automated cluster housekeeping. We will focus on how to ingrain cost efficiency in tooling and developer workflows while balancing rigid cost control with developer convenience and without impacting availability or performance. We will show our use case running Kubernetes on AWS, but all shown tools are open source and can be applied to most other infrastructure environments.
Developer Experience at Zalando - Handelsblatt Strategisches IT-Management 2019 (Henning Jacobs)
Talk given at the 25th Handelsblatt Jahrestagung Strategisches IT-Management in Munich on 2019-01-23. Original title (German): "Developer Experience bei Zalando: Entwicklerproduktivität steigern mit Cloud Native Infrastruktur" (English: "Developer Experience at Zalando: Increasing Developer Productivity with Cloud Native Infrastructure")
- How do you make more than 1100 developers happy and effective?
- Developers as customers: product management for platform teams
- You build it – you run it: self-service infrastructure with Kubernetes and AWS
- The path from a classical infrastructure team to Developer Productivity as a department
Running Kubernetes in Production: A Million Ways to Crash Your Cluster - DevO... (Henning Jacobs)
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base.
We will walk you through our horror stories of operating 80+ clusters and share the insights we gained from incidents, failures, user reports and general observations.
Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ...) as well.
This talk strives to reduce the audience’s unknown unknowns about running Kubernetes in production.
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc... (Henning Jacobs)
Kubernetes has the concept of resource requests and limits. Pods get scheduled on the nodes based on their requests and are optionally limited in how much of the resource they can consume. Understanding and optimizing resource requests/limits is crucial both for reducing resource "slack" and for ensuring application performance and low latency. This talk shows our approach to monitoring and optimizing Kubernetes resources for 80+ clusters to achieve cost-efficiency and reduce the impact on latency-critical applications. All shown tools are open source and can be applied to most Kubernetes deployments.
Running Kubernetes in Production: A Million Ways to Crash Your Cluster - Cont... (Henning Jacobs)
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 80+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ...) as well. This talk strives to reduce the audience’s unknown unknowns about running Kubernetes in production.
https://2018.container.camp/uk/schedule/running-kubernetes-in-production-a-million-ways-to-crash-your-cluster/
Connexion is an open source API first REST framework for Python, built on top of Flask and based on OpenAPI/Swagger, targeted for microservice development. Connexion automagically handles request routing, oauth2 security, request validation and response serialization based on an OpenAPI 2.0 Specification file in YAML, so you don’t have to care about boilerplate anymore.
Because it is based on Flask it supports everything that Flask does, including deployment options and extensions.
At Zalando we’ve adopted “API First” as one of our key engineering principles, to ensure our APIs are robust, consistent, general and abstracted from specific implementation and use cases. But when we tried to implement this principle for the first time, we were faced with the lack of a Python framework to achieve it in an easy fashion. There were several frameworks that produce a Swagger definition from the implementation but none that do it the other way around, so we decided to fill that gap.
Henning will show how to get started with OpenAPI+Connexion, present some real-world use cases and deployment options such as Kubernetes.
Developer Journey at Zalando - Idea to Production with Containers in the Clou... (Henning Jacobs)
Talk held at R-ETAIL:CODE in London on 2018-03-15.
- The history of how DevOps evolved at Zalando: from on-premise data centers to autonomous teams, microservices and cluster management in the cloud
- What the developer experience looks like for the application lifecycle from idea to production and what our vision for the future is
- Challenges and learnings from our past experiences: why architecture principles and constraints are important to lead 200+ engineering teams
Large Scale Kubernetes on AWS at Europe's Leading Online Fashion Platform - C... (Henning Jacobs)
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando Technology department. We will highlight in the context of Kubernetes: AWS service integrations, our IAM/OAuth infrastructure, cluster autoscaling, continuous delivery and general developer experience. The talk will cover our most important learnings and we will openly share failure stories.
Talk given at Container Days HH (https://containerdays.io/) on 2017-06-20.
From AWS/STUPS to Kubernetes on AWS @Zalando - Berlin Kubernetes Meetup (Henning Jacobs)
This talk will highlight our challenges while migrating from our STUPS infrastructure (Docker on EC2, CloudFormation) to Kubernetes on AWS.
Talk was held at Berlin Kubernetes Meetup on 2017-05-18: https://www.meetup.com/Berlin-Kubernetes-Meetup/events/239313998/
Kubernetes on AWS @Zalando - Berlin AWS User Group 2017-05-09 (Henning Jacobs)
In this talk we share our learnings from running Kubernetes on AWS in production and how we are migrating 200+ engineering teams from AWS/STUPS to Kubernetes.
This talk was given at the Berlin AWS User Group meetup on 2017-05-09 hosted by NewStore (https://www.meetup.com/aws-berlin/events/236795816/).
More information on http://kubernetes-on-aws.readthedocs.io/en/latest/admin-guide/kubernetes-in-production.html
Search and Society: Reimagining Information Access for Radical Futures (Bhaskar Mitra)
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Why Kubernetes? Cloud Native and Developer Experience at Zalando - OWL Tech &...Henning Jacobs
Talk held on 2019-09-26 in Paderborn:
Die Keynote:
Warum Kubernetes? Cloud Native und Developer Experience bei Zalando
Kubernetes hat sich als defacto Standard for Cloud Native Plattformen durchgesetzt. Warum? Welche Vorteile und Fallstricke gibt es in der Praxis?
Henning Jacobs zeigt am Beispiel von Zalando wie Kubernetes als Infrastruktur für 1200+ Entwickler dient, welche Aspekte Kubernetes trotz seiner Komplexität einzigartig machen, und was das für die Developer.
Experience bedeutet.
Henning Jacobs ist der Head of Developer Productivity bei Zalando und damit verantwortlich für die Developer Experience von mehr als 200 Zalando Delivery Teams.
Das Kubernetes eine hervorragende Plattform für den Erfahrungsaustausch darstellt, zeigt Henning mit seiner Liste von Kubernetes Failure Stories.
https://teuto.net/owl-tech-innovation-day/
While Go is the language-of-choice in the cloud-native world, Python has a huge community and makes it really easy to extend Kubernetes in only a few lines of code.
This talk shows examples on how to use Python to query the Kubernetes API, how to write simple controllers in only 10 lines of Python, how to build complete web UIs, and how to test everything with py.test and Kind.
Some of the open-source projects which will be covered: pykube-ng, Kubernetes Web View, kube-janitor, and Kopf (Kubernetes Operator Pythonic Framework).
Talk held in Prague on 2019-09-05:
https://www.meetup.com/Cloud-Native-Prague/events/263802447/
Kubernetes Failure Stories, or: How to Crash Your Cluster - ContainerDays EU ...Henning Jacobs
Bootstrapping a Kubernetes cluster is easy, rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 100+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Our failure stories will be sourced from recent and past incidents, so the talk will be up-to-date with our latest experiences.
Why we don’t use the Term DevOps: the Journey to a Product Mindset - DevOpsCo...Henning Jacobs
While the adoption of DevOps makes teams move faster with reduced dependency on central operations, it can constrain teams who lack the skills to self-manage the full application and infrastructure stack. The way to overcome this challenge is creating an internal platform and treating it as a world-class product offering. “Applying product management to internal platforms means establishing empathy with internal consumers (read: developers) and collaborating with them on the design. Platform product managers establish roadmaps and ensure the platform delivers value to the business and enhances the developer experience”, via ThoughtWorks Technology Radar. In this talk, we will walk you through how Zalando adopted a customer-first mindset with regards to its developer tooling. We will show the effect on developer satisfaction when internal platforms are given the same respect as external product offerings. We will tell our story on how we moved from a classical infrastructure team to a product mindset with strong focus on building a world-class developer experience. We will share both our learnings and challenges going through this transition, and the impact it has on the daily life of our customers (developers).
Why we don’t use the Term DevOps: the Journey to a Product Mindset - Destinat...Henning Jacobs
While the adoption of DevOps makes teams move faster with reduced dependency on central operations, it can constrain teams who lack the skills to self-manage the full application and infrastructure stack.
The way to overcome this challenge is creating an internal platform and treating it as a world-class product offering. “Applying product management to internal platforms means establishing empathy with internal consumers (read: developers) and collaborating with them on the design. Platform product managers establish roadmaps and ensure the platform delivers value to the business and enhances the developer experience”, via ThoughtWorks Technology Radar.
In this talk, Henning Jacobs will walk you through how Zalando adopted a customer-first mindset with regards to its developer tooling. He will show the effect on developer satisfaction when internal platforms are given the same respect as external product offerings. Henning will furthermore tell his story about how Zalando moved from a classical infrastructure team to a product mindset with strong focus on building a world-class developer experience. Henning shares both their learnings and challenges going through this transition, and the impact it has on the daily life of Zalando’s customers (developers).
This talk was given in Aarhus on 4th of June 2019.
Kubernetes Failure Stories - KubeCon Europe BarcelonaHenning Jacobs
Talk given on 2019-05-21 at KubeCon Barcelona: https://kccnceu19.sched.com/event/MPcM/kubernetes-failure-stories-and-how-to-crash-your-clusters-henning-jacobs-zalando-se
Bootstrapping a Kubernetes cluster is easy, rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 100+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Our failure stories will be sourced from recent and past incidents, so the talk will be up-to-date with our latest experiences.
Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ..) as well. This talk strives to reduce the audience's unknown unknowns about running Kubernetes in production.
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
Talk given at JAX DevOps London on 2019-05-15.
Kubernetes has the concept of resource requests and limits. Pods get scheduled on the nodes based on their requests and optionally limited in how much of the resource they can consume. Understanding and optimizing resource requests/limits is crucial both for reducing resource "slack" and ensuring application performance/low-latency. This talk shows our approach to monitoring and optimizing Kubernetes resources for 90+ clusters to achieve cost-efficiency and reducing impact for latency-critical applications. All shown tools are open source and can be applied to most Kubernetes deployments. Topics covered in the talk include: understanding resource requests and limits, cgroups and CFS quota behavior, contributing factors to cluster costs (in public clouds), and best practices for managing Kubernetes resources.
Talk held at DevOps Gathering 2019 in Bochum on 2019-03-13.
Abstract: This talk will address one of the most common challenges of organizations adopting Kubernetes on a medium to large scale: how to keep cloud costs under control without babysitting each and every deployment and cluster configuration? How to operate 80+ Kubernetes clusters in a cost-efficient way for 200+ autonomous development teams?
This talk provides insights on how Zalando approaches this problem with central cost optimizations (e.g. Spot), cost monitoring/alerting, active measures to reduce resource slack, and automated cluster housekeeping. We will focus on how to ingrain cost efficiency in tooling and developer workflows while balancing rigid cost control with developer convenience and without impacting availability or performance. We will show our use case running Kubernetes on AWS, but all shown tools are open source and can be applied to most other infrastructure environments.
Developer Experience at Zalando - Handelsblatt Strategisches IT-Management 2019Henning Jacobs
Talk given at 25. Handelsblatt Jahrestagung Strategisches IT-Management in Munich on 2019-01-23. Original title (German): "Developer Experience bei Zalando: Entwicklerproduktivität steigern mit Cloud Native Infrastruktur"
- Wie macht man mehr als 1100 Entwickler glücklich und effektiv?
- Entwickler als Kunde: Produktmanagement für Plattformteams
- You build it – you run it: Self-Service-Infrastruktur mit Kubernetes und AWS
- Der Weg vom klassischen Infrastrukturteam zu Developer Productivity als Abteilung
Running Kubernetes in Production: A Million Ways to Crash Your Cluster - DevO...Henning Jacobs
Bootstrapping a Kubernetes cluster is easy, rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base.
We will walk you through our horror stories of operating 80+ clusters and share the insights we gained from incidents, failures, user reports and general observations.
Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ..) as well.
This talk strives to reduce the audience’s unknown unknowns about running Kubernetes in production.
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
Kubernetes has the concept of resource requests and limits. Pods get scheduled on the nodes based on their requests and optionally limited in how much of the resource they can consume. Understanding and optimizing resource requests/limits is crucial both for reducing resource "slack" and ensuring application performance/low-latency. This talk shows our approach to monitoring and optimizing Kubernetes resources for 80+ clusters to achieve cost-efficiency and reducing impact for latency-critical applications. All shown tools are Open Source and can be applied to most Kubernetes deployments.
Running Kubernetes in Production: A Million Ways to Crash Your Cluster - Cont...Henning Jacobs
Bootstrapping a Kubernetes cluster is easy, rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we are presenting our approach to Kubernetes provisioning on AWS, operations and developer experience for our growing Zalando developer base. We will walk you through our horror stories of operating 80+ clusters and share the insights we gained from incidents, failures, user reports and general observations. Most of our learnings apply to other Kubernetes infrastructures (EKS, GKE, ..) as well. This talk strives to reduce the audience’s unknown unknowns about running Kubernetes in production.
https://2018.container.camp/uk/schedule/running-kubernetes-in-production-a-million-ways-to-crash-your-cluster/
Connexion is an open source, API-first REST framework for Python, built on top of Flask and based on OpenAPI/Swagger, targeted at microservice development. Connexion automagically handles request routing, OAuth2 security, request validation and response serialization based on an OpenAPI 2.0 Specification file in YAML, so you don’t have to care about boilerplate anymore.
Because it is based on Flask it supports everything that Flask does, including deployment options and extensions.
At Zalando we’ve adopted “API First” as one of our key engineering principles, to ensure our APIs are robust, consistent, general and abstracted from specific implementations and use cases. But when we first tried to implement this principle, we were faced with the lack of a Python framework to achieve it in an easy fashion: there were several frameworks that produce a Swagger definition from the implementation, but none that do it the other way around. So we decided to fill that gap.
Henning will show how to get started with OpenAPI+Connexion, present some real-world use cases and deployment options such as Kubernetes.
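The spec-first idea described above can be sketched in plain Python. This is not Connexion's API or implementation: `SPEC` is a hypothetical, heavily simplified stand-in for an OpenAPI document, and `dispatch`, `HANDLERS` and `get_greeting` are illustrative names. The sketch only shows routing and parameter validation being driven by the specification, so the handler holds business logic alone.

```python
# Hypothetical, heavily simplified stand-in for an OpenAPI document:
# each (method, path) maps to an operationId and its required parameters.
SPEC = {
    ("GET", "/greeting"): {"operationId": "get_greeting", "required": ["name"]},
}


def get_greeting(name: str) -> dict:
    """Handler that contains only business logic."""
    return {"message": f"Hello {name}"}


# Registry of handler functions, keyed by operationId.
HANDLERS = {"get_greeting": get_greeting}


def dispatch(method: str, path: str, params: dict) -> tuple[int, dict]:
    """Route a request and check required parameters using only the spec."""
    operation = SPEC.get((method, path))
    if operation is None:
        return 404, {"error": "not found"}
    missing = [p for p in operation["required"] if p not in params]
    if missing:
        return 400, {"error": f"missing parameters: {', '.join(missing)}"}
    handler = HANDLERS[operation["operationId"]]
    return 200, handler(**params)


print(dispatch("GET", "/greeting", {"name": "Zalando"}))
```

A framework that generates the specification from the implementation works the other way around: there the code is the source of truth, whereas here the specification is.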
Developer Journey at Zalando - Idea to Production with Containers in the Clou... (Henning Jacobs)
Talk held at R-ETAIL:CODE in London on 2018-03-15.
- The history of how DevOps evolved at Zalando: from on-premise data centers to autonomous teams, microservices and cluster management in the cloud
- What the developer experience looks like for the application lifecycle from idea to production, and what our vision for the future is
- Challenges and learnings from our past experiences: why architecture principles and constraints are important to lead 200+ engineering teams
Large Scale Kubernetes on AWS at Europe's Leading Online Fashion Platform - C... (Henning Jacobs)
Bootstrapping a Kubernetes cluster is easy; rolling it out to nearly 200 engineering teams and operating it at scale is a challenge. In this talk, we present our approach to Kubernetes provisioning on AWS, operations, and developer experience for our growing Zalando Technology department. In the context of Kubernetes, we will highlight AWS service integrations, our IAM/OAuth infrastructure, cluster autoscaling, continuous delivery and general developer experience. The talk will cover our most important learnings, and we will openly share failure stories.
Talk given at Container Days HH (https://containerdays.io/) on 2017-06-20.
From AWS/STUPS to Kubernetes on AWS @Zalando - Berlin Kubernetes Meetup (Henning Jacobs)
This talk will highlight our challenges while migrating from our STUPS infrastructure (Docker on EC2, CloudFormation) to Kubernetes on AWS.
Talk was held at Berlin Kubernetes Meetup on 2017-05-18: https://www.meetup.com/Berlin-Kubernetes-Meetup/events/239313998/
Kubernetes on AWS @Zalando - Berlin AWS User Group 2017-05-09 (Henning Jacobs)
In this talk we share our learnings from running Kubernetes on AWS in production and how we are migrating 200+ engineering teams from AWS/STUPS to Kubernetes.
This talk was given at the Berlin AWS User Group meetup on 2017-05-09 hosted by NewStore (https://www.meetup.com/aws-berlin/events/236795816/).
More information on http://kubernetes-on-aws.readthedocs.io/en/latest/admin-guide/kubernetes-in-production.html
STUPS by Zalando @ AWS Berlin User Group Meetup May 2015
1. STUPS
STUPS To Unleash Penguin Swarms
AWS Berlin Meetup 2015-05-21
henning.jacobs@zalando.de @try_except_
2. 15 countries
14+ million active customers
2.2 billion € revenue 2014
640+ million visits in Q1/2 2014
One of Europe's largest
online fashion retailers
3. What is STUPS?
The STUPS platform is
a set of tools and components
to provide a convenient and audit-compliant
Platform-as-a-Service (PaaS)
for multiple autonomous teams
on top of Amazon Web Services (AWS).
4. One AWS account per Team
● Every team gets its own
isolated AWS account
● Every team gets its own team domain
*.<teamid>.example.org
5. [Diagram: Public Internet and isolated AWS accounts. Team “Foo” (*.foo.example.org) and Team “Bar” (*.bar.example.org) each have their own account with an ELB and EC2 instances.]
6. Isolated AWS Accounts..
● All cross-team traffic via public Internet
● All cross-team APIs as REST
● Endpoints need to be secured
via SSL and OAuth
● No firewall/network “magic” needed
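As an illustration of the points above, the following Python sketch builds a cross-team REST call that is only allowed over HTTPS and carries an OAuth bearer token. The helper name `build_team_api_request`, the endpoint URL and the token value are hypothetical; how a real token is obtained is outside the scope of this sketch.

```python
import urllib.request


def build_team_api_request(url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS request carrying an OAuth bearer token."""
    if not url.startswith("https://"):
        raise ValueError("cross-team endpoints must be called via HTTPS")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


# Hypothetical endpoint in team "foo"'s domain; the token is a placeholder.
request = build_team_api_request(
    "https://orders.foo.example.org/api/orders", "example-token"
)
print(request.full_url, request.get_header("Authorization"))
```

Because the caller authenticates with a token on every request, the receiving team does not need to know which network or account the request came from.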
7. Autonomy
Teams..
● can choose technologies
as they think fit
● own their AWS Account
● are end-to-end responsible
for their applications
8. Autonomy and Compliance
STUPS offers
maximum freedom for developers
while enabling
near-real-time audit compliance
for every single application.
9. STUPS Policy TL;DR
● Use the Taupage base AMI
⇒ Docker
● Register all applications
in the Kio application registry
● Use REST+OAuth
to expose services to other teams
10. Application Deployment
● Build your application
● Create a Docker image
● Deploy a new immutable stack with Senza
● Route traffic to the new stack
Try it out yourself: http://docs.stups.io/en/latest/user-guide/standalone-deployment.html
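The last two deployment steps can be modelled with a small Python sketch. It assumes that each application version gets its own stack which is never modified after creation, and that a release only changes traffic weights. The `Application` class and its methods are hypothetical and do not represent Senza's interface.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """Tracks immutable stacks (one per version) and their traffic weights."""
    name: str
    weights: dict[str, int] = field(default_factory=dict)  # version -> percent

    def deploy(self, version: str) -> None:
        """Create a new stack next to the existing ones; it gets no traffic yet."""
        if version in self.weights:
            raise ValueError(f"stack {self.name}-{version} already exists")
        self.weights[version] = 0

    def route_all_traffic_to(self, version: str) -> None:
        """Send 100% of traffic to one stack and 0% to all others."""
        if version not in self.weights:
            raise KeyError(f"unknown stack version: {version}")
        for v in self.weights:
            self.weights[v] = 100 if v == version else 0


app = Application("hello-world")
app.deploy("1")
app.route_all_traffic_to("1")
app.deploy("2")                 # new stack runs alongside version 1
app.route_all_traffic_to("2")   # switch traffic; version 1 stays for rollback
print(app.weights)
```

In this model a rollback is just another traffic switch back to the old stack, which is one reason to keep stacks immutable.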
12. What is Senza?
● Command line tool
● Generator of CloudFormation templates
● Management tool for CF stacks
● Convenience high-level CF “components”
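The "generator" role can be illustrated with a short Python sketch that expands a few high-level settings into a CloudFormation-style template. This is not Senza's code or its definition format: `generate_template` and its parameters are hypothetical, and the resulting template is deliberately incomplete (a real Auto Scaling group needs a launch configuration and network settings, among other properties).

```python
import json


def generate_template(app_name: str, version: str, image: str, instance_count: int) -> dict:
    """Expand a few high-level settings into a CloudFormation-style template dict."""
    stack_name = f"{app_name}-{version}"
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"{stack_name} running {image}",
        # Free-form metadata; the key name is an assumption for this sketch.
        "Metadata": {"DockerImage": image},
        "Resources": {
            "AppServer": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "MinSize": str(instance_count),
                    "MaxSize": str(instance_count),
                    "Tags": [
                        {"Key": "StackName", "Value": stack_name, "PropagateAtLaunch": True}
                    ],
                },
            }
        },
    }


# Hypothetical application name and Docker image reference.
template = generate_template("hello-world", "1", "registry.example.org/hello-world:1", 2)
print(json.dumps(template, indent=2))
```

Keeping the high-level input small and generating the verbose template from it is the same division of labour the slide describes with its high-level CF “components”.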
18. OAuth Infrastructure
● Central IAM Provider
(ForgeRock Open Identity Stack)
● Registered Apps get OAuth
credentials automatically
● Credential Distribution via S3 Buckets