Pull, don’t push: Architectures for monitoring and configuration in a microservices era


Applications today are increasingly designed using a shared-nothing, microservices architecture that is resilient to the failure of individual components, even when built atop cloud infrastructure that can suffer infrequent-but-massive outages. Yet many supporting tools for application monitoring, observability, configuration management and release management still use a centralized “orchestration” approach that depends on pushing changes to unreliable distributed systems. In this Sensu Summit 2018 talk, Chef's Julian Dunn & Fletcher Nichol give a primer on promise theory and the autonomous actor model that underlie the design of products like Sensu and Habitat, and explain why this design leads not only to higher overall system reliability but also to systems that humans can comprehend and operate more easily. They argue that you should consider designing all of your applications and supporting systems in this way. They may even show a demo or two to illustrate how inverting the design radically changes the notion of “application release orchestration”, so that you can retain orchestration-type semantics even with an eventually-consistent system design.


Pull, don’t push!
Architectures for monitoring and configuration in a microservices era
Julian Dunn, Director of Product Marketing, Chef
@julian_dunn
Fletcher Nichol, Senior Software Development Engineer, Chef
@fnichol

• Modular, self-contained, pre-fabricated components
• Neighbors share components
• Complex shares services as a whole

An ordered set of operations
Across a set of independent machines
Connected to an orchestrator only via a network.

Humans acting on Microsoft Visio acting on machines
Humans acting on code acting on machines

An ordered set of operations
Defined in code
Across a set of independent machines
Connected to an orchestrator only via a network.

mylaptop:~$ ./disable-load-balancer.sh
mylaptop:~$ ssh db01 do-database-migration.sh
mylaptop:~$ for i in app01 app02; do
> ssh $i do-deployment.sh
> done
mylaptop:~$ ./enable-load-balancer.sh

Problems with Orchestration
Resilience: Deployment, Operational
Scalability: Technical, Cognitive

Deployment Resilience
for i in app01 app02 app03; do
  do-deploy.sh --server "$i"
done

Deployment Resilience
for i in app01 app02 app03; do
  # stop at the first server whose deployment fails
  if ! do-deploy.sh --server "$i"; then
    failed=$i
    break
  fi
done
# what goes down here?
# roll back $failed?
# roll back all others?
# ignore it?

Operational Resilience
Orchestration Backplane – must be up at all times!
Application Plane – delegated resilience to the backplane

Operational Resilience
Orchestration Backplane
Application Plane
Orchestration Backplane

The Future Is Distributed
[Figure: computing eras plotted against time on a centralized-to-distributed axis: Mainframes, Time Sharing, Client/Server, Web 1.0, Web 2.0, Cloud, Internet of Things, Edge]

Distributed Devices Need Distributed Management
• Adaptive Learning
• Configuration Updates
• Software Updates
Distributed, Autonomous Systems
Make progress towards promised desired state
Expose interfaces to allow others to verify promises
Can promise to take certain behaviors in the face of failure of others
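
A minimal sketch of how these three properties might look on a single node, assuming a hypothetical layout in which the desired version is published to a file and each node pulls it on its own schedule. The file names, the `converge` and `verify_promise` functions, and the version strings are all illustrative assumptions; none of them is a Sensu or Habitat interface.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: a node that pulls desired state and converges itself.
# Nothing here is a real Sensu or Habitat interface.

state_dir=$(mktemp -d)
echo "1.0.0" > "$state_dir/current"   # what this node is running now
echo "1.1.0" > "$state_dir/desired"   # what has been published for it

# Make progress towards the promised desired state.
converge() {
  local desired current
  # If the desired state cannot be read, keep running what we have.
  desired=$(cat "$state_dir/desired" 2>/dev/null) || return 0
  current=$(cat "$state_dir/current")
  if [ "$desired" != "$current" ]; then
    echo "$desired" > "$state_dir/current"
  fi
}

# Expose an interface that lets others verify the promise.
verify_promise() {
  [ "$(cat "$state_dir/current")" = "$(cat "$state_dir/desired" 2>/dev/null)" ]
}

converge
verify_promise && echo "converged to $(cat "$state_dir/current")"
```

No central loop decides when this node changes. If the source of desired state disappears, `converge` leaves the node running its last known version, which is one way to read "promise to take certain behaviors in the face of failure of others".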

The Design of Sensu and The Design of Habitat

The Design of Sensu vs. Traditional “Monitoring”
Traditional: the Nagios master (1) polls (orchestrates) Agent 1 and Agent 2, which then (2) run checks.
Sensu: Agent 1 and Agent 2 (1) run checks, then (2) post data to the Sensu Backend.
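
The agent-driven side of the slide above can be sketched roughly as follows. This is an illustration under stated assumptions, not Sensu's implementation: a temporary file stands in for the backend, the real transport is not modeled, and `run_and_post` plus the check names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: each agent runs its own checks, then posts the result.
# A file stands in for the backend; the real transport is not modeled.

backend_inbox=$(mktemp)

# Run one check locally and post "agent check status output" to the backend.
run_and_post() {
  local agent=$1 check=$2 output status
  shift 2
  # Capture both the check's output and its exit status.
  if output=$("$@" 2>&1); then status=0; else status=$?; fi
  printf '%s %s %s %s\n' "$agent" "$check" "$status" "$output" >> "$backend_inbox"
}

run_and_post agent1 disk_ok true
# Exit status 2 conventionally means "critical" for Nagios-style checks.
run_and_post agent2 disk_full sh -c 'echo "disk at 97%"; exit 2'

cat "$backend_inbox"
```

The backend never has to reach out to an agent here; an agent that is down simply stops posting, and the others carry on.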

Habitat supervisor in a nutshell
• Network-connected supervision system
• Like systemd+consul/etcd (process supervision with lifecycle hooks + shared state for reactive realtime change management)
• Eventually-consistent global state using SWIM masterless (peer-to-peer) membership protocol

[Diagram: three sensu-backend services, each running under its own hab-sup, form the service group backend.default. A sensu-agent running under hab-sup, in service group agent.default, is started with --bind sensu:backend.default.]
Resolve symbol “sensu” in configs to properties of service group backend.default
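
To illustrate the idea of resolving a bind name to a service group, here is a rough sketch. It does not use Habitat's actual templating or census data: the `census_dir` files, the `resolve_bind` function and the member addresses are all made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: resolve a bind name to the current members of a
# service group. The member addresses below are made up for illustration.

census_dir=$(mktemp -d)
# Stand-in for what the supervisors currently know about each service group.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.3 > "$census_dir/backend.default"

# A bind maps a local name to a service group, e.g. "sensu:backend.default".
resolve_bind() {
  local bind=$1
  local name=${bind%%:*} group=${bind#*:}
  local members
  # Join the group's current members into one comma-separated value.
  members=$(paste -sd, "$census_dir/$group")
  printf '%s=%s\n' "$name" "$members"
}

resolve_bind sensu:backend.default
```

Because the name is resolved against current group membership rather than a fixed host list, resolving it again after a backend joins or leaves yields the updated set without anyone pushing new configuration to the agent.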

Let’s See it in Action!
Demo: Sensu running under Habitat

• Modern architectures demand a choreographed rather than an orchestrated approach
• At scale, fleet management and cognitive complexity are the biggest problems
• Habitat and Sensu are both examples of edge-centric, autonomous actor systems, and they work well together 😺


Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl

Peter Udo Diehl

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf

FIDO Alliance

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...

Product School

To Graph or Not to Graph Knowledge Graph Architectures and LLMs

Paul Groth

Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place. Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects. Here’s what you’ll gain: - Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows. - Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy. - Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency. - Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity. We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic. Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.

Essentials of Automations: Optimizing FME Workflows with Parameters

Safe Software

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

Sri Ambati

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf

91mobiles

Recently uploaded (20)

Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...

Key Trends Shaping the Future of Infrastructure.pdf

Quantum Computing: Current Landscape and the Future Role of APIs

FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf

Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx

Designing Great Products: The Power of Design and Leadership by Chief Designe...

Transcript: Selling digital books in 2024: Insights from industry leaders - T...

Demystifying gRPC in .Net by John Staveley

IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx

JMeter webinar - integration with InfluxDB and Grafana

"Impact of front-end architecture on development cost", Viktor Turskyi

Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...

To Graph or Not to Graph Knowledge Graph Architectures and LLMs

Essentials of Automations: Optimizing FME Workflows with Parameters

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf

Pull, don’t push: Architectures for monitoring and configuration in a microservices era

1. Pull, don’t push! Architectures for monitoring and configuration in a microservices era Julian Dunn, Director of Product Marketing, Chef @julian_dunn Fletcher Nichol, Senior Software Development Engineer, Chef @fnichol

3. • Modular, self-contained, pre-fabricated components • Neighbors share components • Complex shares services as a whole

6. Orchestration

7. An ordered set of operations Across a set of independent machines Connected to an orchestrator only via a network.

9. Humans acting on Microsoft Visio acting on machines Humans acting on code acting on machines

10. An ordered set of operations Defined in code Across a set of independent machines Connected to an orchestrator only via a network.

11.
    mylaptop:~$ ./disable-load-balancer.sh
    mylaptop:~$ ssh db01 do-database-migration.sh
    mylaptop:~$ for i in app01 app02; do
    > ssh $i do-deployment.sh
    > done
    mylaptop:~$ ./enable-load-balancer.sh

12. Problems with Orchestration • Resilience: Deployment, Operational • Scalability: Technical, Cognitive

13. Deployment Resilience
    for i in app01 app02 app03; do
      do-deploy.sh --server $i
    done

14. Deployment Resilience
    for i in app01 app02 app03; do
      # stop at the first host whose deploy fails
      if ! do-deploy.sh --server "$i"; then
        failed=$i
        break
      fi
    done
    # what goes down here?
    # roll back $failed?
    # roll back all others?
    # ignore it?

15.

16. Operational Resilience

17. Operational Resilience Orchestration Backplane – must be up at all times! Application Plane – delegated resilience to the backplane

18. Operational Resilience Orchestration Backplane Application Plane Orchestration Backplane

19. Cognitive Scalability

20. Cognitive Scalability

21. Technical Scalability

22. The Future Is Distributed [Chart: axes are Time and Centralized vs. Distributed; eras plotted in order: Mainframes, Time Sharing, Client/Server, Web 1.0, Web 2.0, Cloud, Internet of Things, Edge]

23.

24. Distributed Devices Need Distributed Management • Adaptive Learning • Configuration Updates • Software Updates

25. Distributed, Autonomous Systems • Make progress towards promised desired state • Expose interfaces to allow others to verify promises • Can promise to take certain behaviors in the face of failure of others

26. The Design of Sensu and The Design of Habitat

27. The Design of Sensu vs. Traditional “Monitoring” • Traditional (Nagios master, Agent 1, Agent 2): 1. Poll (orchestrate) 2. Run checks • Sensu (Agent 1, Agent 2, Sensu Backend): 1. Run checks 2. Post data
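As a rough sketch of the Sensu half of this slide: each agent runs its own checks on its own schedule and then posts the result to the backend, so nothing has to poll it. The names run_check, post_result, agent_loop and BACKEND_URL, and the key=value result format, are hypothetical and chosen for illustration only; they are not Sensu's actual API or wire format.

```shell
#!/bin/sh
# Sketch of an agent that schedules and runs its own checks, then reports.
# BACKEND_URL and the key=value result format are hypothetical placeholders.
BACKEND_URL=${BACKEND_URL:-http://backend.example.com/results}

# run_check NAME COMMAND [ARGS...]
# Step 1: run the check locally and print one result line.
run_check() {
    name=$1
    shift
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf 'check=%s status=%s output=%s\n' "$name" "$status" "$output"
}

# post_result LINE
# Step 2: the agent sends its data to the backend.
post_result() {
    curl -s -X POST -d "$1" "$BACKEND_URL"
}

# agent_loop INTERVAL NAME COMMAND [ARGS...]
# The agent, not a central master, decides when the check runs.
agent_loop() {
    interval=$1
    shift
    while true; do
        post_result "$(run_check "$@")"
        sleep "$interval"
    done
}
```

For example, `agent_loop 60 disk df -h /` would run `df` every 60 seconds and post each result. If the backend is briefly unreachable, only that one post fails; the agent keeps checking.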

28. Habitat supervisor in a nutshell • Network-connected supervision system • Like systemd + consul/etcd (process supervision with lifecycle hooks + shared state for reactive realtime change management) • Eventually consistent global state using the SWIM masterless (peer-to-peer) membership protocol

29. [Diagram: three sensu-backend services, each under its own hab-sup, form service group backend.default; one sensu-agent under hab-sup forms service group agent.default, started with --bind sensu:backend.default] Resolve symbol “sensu” in configs to properties of service group backend.default

30. Let’s See it in Action! Demo: Sensu running under Habitat

31. • Modern architectures demand a choreographed rather than an orchestrated approach • At scale, fleet management and cognitive complexity is the biggest problem • Habitat and Sensu are both examples of edge-centric, autonomous actor systems, and they work well together 😺

Editor's Notes

Fletcher and I were part of the original team that launched Habitat by Chef in 2016; I was the product manager and Fletcher was one of the lead engineers. We both have technical backgrounds, except that we do different jobs now. Fletcher’s computer boots into Linux and mine boots into PowerPoint.
So this is a talk about architecture and systems design, and if we’re going to talk about architecture, maybe a good way to think about good architecture is via, well, actual architecture. One of the most famous buildings in the world is the Habitat 67 complex in Montreal, built, as you can see, for Expo 67, which marked Canada’s 100th anniversary. Shout-out, by the way, to the Canadians in the room, including Sean Porter, Sensu’s CTO; Fletcher and I are both Canadians, so we have to make a pitch for the Great White North anytime we're up here. Universal health care! One year of paid maternity leave! Super-hot prime minister! OK, that's enough of that. Anyway, Habitat 67 was such an iconic building that Canada Post put it on the stamp for Canada’s 150th anniversary last year.
Here’s another picture, in its full glory. It probably would have used actual shipping containers today, but remember, TEU (standardized) containerization didn’t arrive until the late 1960s. But the components were standardized, as you can see from the middle versus the right. One unit’s roof is the other neighbor’s garden. Shopping, schools, and common services are built into the ground floor of each complex. These things sound a lot like software architectural principles: • Every component is responsible for its own resiliency (like Bezos’ infamous memo) • Components declare peer-to-peer level dependencies • All components share a base substrate of services and management (e.g. deployment, monitoring, observability, etc.)
The Habitat 67 complex is actually quite large
I wanted to put the big pictures up of Habitat 67 because, well, architecture starts to look a lot like architecture, right? These are visual diagrams (probably several years old) of microservice architectures at Amazon and Netflix. When you have complex systems this big, there are architectural patterns you’ll need to put in place to deal with them. Because when you get to something big and complex, your issue isn’t adding more to it – your issue becomes how you manage it. Today’s talk is really about how you design complex systems so that you can _manage_ them. It’s better to design systems with these characteristics built in up front rather than to try and bolt them on later.
Which brings me to the patterns of management for complex systems. Traditionally, and in many scenarios still, we try to manage things using a centralized approach, which I call “orchestration”. So does everyone else, unfortunately, so let me define what I mean by this.
IBM Cloud Orchestrator, HP Operations Orchestration, VMware vRealize Orchestrator
But since I’m in the orchestration track I’d better try to define it so that I actually have a talk, right? Here is the definition I'll be using for the rest of the talk. And then I’m still going to tell you how and why that breaks down.
This is a trivial example of orchestration. Last year I said I at least hope you’re doing your orchestration in code, if you’re doing orchestration, because this is pretty awful. And as you can see, it causes downtime because you need to wait for the previous thing to complete before you can proceed with the next one. You can add more fancy error checking and branching to orchestration to try and handle no-downtime deploys, but that orchestration gets really complicated – more complexity means more error conditions means more things that need to be handled.
Resilience: Deployment, Operational. Scalability: Technical, Cognitive.
Treating machines all connected via an unreliable network as an atomic unit to which updates must be applied in full, or not at all. This *used* to work when you had a small fleet and/or your network was mostly reliable (e.g. on a LAN) - not so good in a cloud.
An atomic set that is assumed to succeed as a whole or not. What happens when it doesn't? A lot of complexity in failure conditions that need to be encapsulated and dealt with. Or more commonly, the approach is to drop this all off on the operator's lap and have them deal with it.
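To make that complexity concrete, here is a minimal sketch of just one of the possible policies: stop at the first failure and roll back the hosts that were already updated, in reverse order. The rollout function and the deploy and rollback commands handed to it are hypothetical. A real script would still have to decide what to do about the failed host itself, and about rollbacks that fail.

```shell
#!/bin/sh
# Sketch: push-style rollout that undoes earlier hosts when a later one fails.
# rollout DEPLOY_CMD ROLLBACK_CMD HOST...
# DEPLOY_CMD and ROLLBACK_CMD are hypothetical commands taking one host name.
rollout() {
    deploy_cmd=$1
    rollback_cmd=$2
    shift 2
    done_hosts=""
    for host in "$@"; do
        if "$deploy_cmd" "$host"; then
            # Prepend, so the list stays in reverse deployment order.
            done_hosts="$host $done_hosts"
        else
            echo "deploy failed on $host; rolling back: $done_hosts" >&2
            for h in $done_hosts; do
                # A failed rollback is only reported; the operator gets the rest.
                "$rollback_cmd" "$h" || echo "rollback failed on $h" >&2
            done
            return 1
        fi
    done
    return 0
}
```

Even this small policy leaves open cases (the half-deployed host, a rollback that itself fails), and those usually end up with the operator.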
Modern orchestration systems try to get around this fundamental issue by creating more disposability and just throwing away larger and larger parts of the infrastructure. The theory goes, let’s get the exact right “new” setup first, and then cut over to it. The problem is that while this mostly works, it is an incredibly complicated and slow way to make changes – you’re saying that for every config change or deployment I have to stand up a whole new production environment and cut over everything to it? For example, how do I do things like quiesce writes to a database? I think this creates more complexity even though the interfaces seem really attractive.
Orchestration systems treat application components as dumb entities to be scheduled. Those entities don’t know about each other except through the orchestration system. This means that if components fail, they depend on the orchestration backplane (and here I’m picking on Kubernetes again) to manage their lifecycle. They also depend on the orchestration backplane to tell them where the other entities are (like where the database server is, if I’m the app server). The apps themselves are deliberately kept in the dark about their execution context.
Now remember, we’re running in the cloud now – a place where machines and networks can go down at any time. And we’re trying to build reliable applications on top of that unreliable fabric.
Now who does such a system design benefit? It only benefits the person or organization that is running the orchestration backplane – that is, if it’s external to the unreliable vagaries of the “cloud”. In other words, if it’s, say, a hosted service provided by your cloud vendor. Kubernetes and other orchestration systems soften you up for that approach so that when you run into the inherent resilience limitations, you outsource. Therefore I believe Google never intended that you run a Kubernetes cluster on your own, but rather that you buy it from someone (hopefully them) as a managed service. And don’t get me wrong, it’s an amazing business model, and if you can offer your developers an experience on top of all this that’s just “push a container and it runs”, then that’s great. This is why there has been this Cambrian explosion of hosted Kubernetes solutions: these vendors know that this architectural model locks you into building applications on their platform. When your app is operationally dumb and the backplane is operationally smart, they have your money forever.
I don’t have that much to say about this one other than that orchestration systems or operations become really difficult to understand the more entities you’re trying to address. In particular because an orchestration activity (“play”) is intended to run to completion, atomically, trying to debug failures halfway through and figure out what to do is really hard. When things go wrong, it’s easier for the human brain to try and understand a small part of the system – where the fault is – rather than the entire global state. We know this with computer programming (“locality of reference”) and that’s why we have techniques like “information hiding” (i.e. abstracting logic).
We used to show this slide as part of old Opscode training materials when I first started at Chef. I’m sure you’ve seen slides like this before, where we talk about the # of nodes running applications, etc, and how they grow over time. While this is all true, I think these graphs neglect one key thing, which is not that the *quantity* of machines increases over time, but the fact that systems as a whole tend towards becoming more *distributed*. By "distributed" I mean that more of the computing runs at the "edge" if you will and not in a centralized way.
It’s not a straight line, though. <Talk through the build> Cloud: ML, databases, etc. – now starting to centralize more stuff into the cloud. The more that our systems become distributed, the less a centralized approach makes sense. This is true not only for data processing (why can’t it happen at the edge?), but also for configuration updates and even software upgrades.
https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3 TensorFlow, Keras, React Native. First version was centralized – too much latency. So the final version runs an entire neural network on your phone.
Nike HyperAdapt shoe. Number of devices continues to increase. Machine Learning, Analytics, AI. Latency becomes currency. At-scale problems will re-emerge just like they did with Client/Server and the Web. Distributed devices need distributed management.
Sounds a lot like where we started, with convergent configuration management and this guy, right? Everything old is new again.
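A minimal sketch of that convergent model, assuming the "desired state" is simply the content of a file: the node checks its own promise and acts only when reality has drifted, so running it over and over is safe. The function names verify and converge are made up for this illustration.

```shell
#!/bin/sh
# verify FILE DESIRED
# Succeeds if the promise "FILE contains DESIRED" is currently kept.
# Other actors can call this to check the promise without changing anything.
verify() {
    [ -f "$1" ] && [ "$(cat "$1")" = "$2" ]
}

# converge FILE DESIRED
# Repairs the file only if the promise is not kept; otherwise does nothing.
converge() {
    if verify "$1" "$2"; then
        echo "unchanged"
    else
        printf '%s\n' "$2" > "$1"
        echo "converged"
    fi
}
```

Calling `converge /tmp/app.conf "port=3000"` twice prints `converged` the first time (if the file differed or was missing) and `unchanged` the second. No central orchestrator has to remember which step the node had reached.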
Using SWIM rather than something like Raft, because SWIM is masterless.
This slide will be a build to show some of Habitat’s terminology, specifically:
• Service group: contains one or more entities that share a configuration template and run the same workload. Leaders and followers are in the same group. Has a name.
• Supervisors are responsible for [re-]writing the configuration of the workload and restarting the process, possibly in coordination with other supervisors in that group.
• Supervisors have a REST interface that allows you to modify their config (inject new configs as rumors into the network and they will be propagated; you can use any authorized supervisor as an entrypoint, it doesn’t have to be in the group we care about).
• External service groups can be subscribed to the configuration of this service group using binding.
Talk about the communication protocol across the fleet: SWIM membership protocol/failure detector, with a gossip layer on top for distributed consensus. Because we get asked a lot of questions about the protocol: it is an implementation of SWIM+Infection+Suspicion for membership, and a ZeroMQ-based, newscast-inspired gossip protocol.
Goals:
• Eventually consistent. Over a long enough time horizon, every living member will converge on the same state.
• Reasonably efficient. The protocol avoids any back-chatter; messages are sent but never confirmed.
• Reliable. As a building block, it should be safe and reliable to use.
Config changes: injected into any peer, ACL is checked, and if accepted, gossiped around the network. No SPOF.
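One way to picture why gossiped config can converge without a single point of failure, as a simplified sketch rather than Habitat's actual implementation: assume every config rumor carries a version number and each peer keeps whichever rumor has the highest version it has seen. The apply_rumor function and the "version:value" format are assumptions made for this example.

```shell
#!/bin/sh
# apply_rumor STATE RUMOR
# STATE and RUMOR are both "version:value" strings (hypothetical format).
# Prints the state the peer should hold after hearing RUMOR: the rumor is
# accepted only if its version is strictly higher than the current one.
apply_rumor() {
    state=$1
    rumor=$2
    if [ "${rumor%%:*}" -gt "${state%%:*}" ]; then
        echo "$rumor"
    else
        echo "$state"
    fi
}

# Two peers hear the same rumors in a different order...
peer_a=$(apply_rumor "$(apply_rumor "0:" "1:port=80")" "2:port=8080")
peer_b=$(apply_rumor "$(apply_rumor "0:" "2:port=8080")" "1:port=80")
# ...and still end up holding the same state.
echo "peer_a=$peer_a peer_b=$peer_b"
```

Because the outcome does not depend on delivery order or on which peer heard a rumor first, it does not matter which supervisor the change was injected into.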
