Serverless on OpenStack with Docker Swarm, Mistral, and StackStorm - Dmitri Zimine
Intro to Serverless, 101 demo with StackStorm, and real world application of serverless solution.
Slides for OpenStack Summit Boston 2017 talk:
https://www.openstack.org/summit/boston-2017/summit-schedule/events/18325
Most of the talk was a demo; please stay tuned for the recording.
Serverless, devops, automation, operations, faas, @Stack_Storm.
Presentation at the International Industry-Academia Workshop on Cloud Reliability and Resilience. 7-8 November 2016, Berlin, Germany.
Organized by EIT Digital and Huawei GRC, Germany.
Twitter: @CloudRR2016
Failures happen. Building resilient cloud infrastructure requires an end-to-end automated approach to failure remediation. This approach must go beyond the current DevOps model of monitoring the system and getting engineers alerted when a failure condition occurs.
Recently, event driven automation and workflows re-emerged as a way to automate troubleshooting, remediation, and a variety of Day-2 operations. Facebook famously uses FBAR to "save 16,000 engineer-hours, a day, in ops". Similar approaches have been reported by other hyper-scale cloud providers. Open-source auto-remediation platforms like StackStorm are replacing legacy runbook automation products, and have been successfully used to automate applications, networks, security, and cloud infrastructure.
In this presentation we give a brief history of workflow automation, overview the common architecture ingredients of a typical event driven automation framework, compare and contrast alternative approaches to day-2 automation, and, most importantly, share real-world use cases and examples of applying event driven automation in operations.
Dmitri Zimine's slides from the Design Summit at OpenStack Barcelona 2016. The talk covered the history of the two projects and their technical differences, discussed the overlap, and drafted a path forward.
https://www.openstack.org/summit/barcelona-2016/summit-schedule/events/16999/mistral-mistral-and-stackstorm
StackStorm: If-This-Then-That for DevOps Automation - Dmitri Zimine
Slides for my talk at Scale15x: https://www.socallinuxexpo.org/scale/15x/presentations/stackstorm-if-devops-automation
DevOps automation, open-source.
A demo was at the core of the talk; the video is at https://youtu.be/3TjhBGshvvY?t=3h31m5s
This document provides an overview of Mistral, an OpenStack workflow service. It describes Mistral's architecture and capabilities for defining, executing, and monitoring workflows. Workflows in Mistral are graphs of tasks with control and data flow. The current version supports basic task types like SSH and REST calls. Future plans include improving the workflow definition syntax, adding standard OpenStack actions, and developing the Horizon dashboard interface.
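The task-graph model described above can be sketched as a short workflow definition. The following is a hypothetical sketch in the Mistral v2 DSL; the workflow name, URL, host, and service are illustrative, not taken from the slides:

```yaml
# Hypothetical Mistral workflow: ping a service over HTTP,
# restart it over SSH if the health check fails.
version: '2.0'

check_and_restart:
  description: Health-check a service and remediate on failure.
  type: direct
  tasks:
    check_service:
      action: std.http url="http://example.com/health"
      on-error:
        - restart_service
    restart_service:
      action: std.ssh host="example.com" username="ops" cmd="systemctl restart myservice"
```

The `on-error` clause is what encodes control flow in the task graph: the second task runs only when the first one fails.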
The document discusses operational patterns and automation tools for DevOps. It covers monitoring patterns such as treating monitoring as a service and addressing pager fatigue. Remediation patterns discussed include automating responses to alerts like Facebook's FBAR system. Common tools mentioned are New Relic, Splunk, Puppet, Chef, Ansible and Salt. Best practices for automation include keeping things simple, separating concerns, and providing context to humans when they are required to intervene. ChatOps and using chat interfaces for operations are also discussed.
The document discusses the open source automation tool StackStorm. It references an IRC channel on Freenode for StackStorm discussion and notes that the conversation on November 3rd focused on it being open source. StackStorm is described as being built for purpose automation and having components like actions, triggers, rules, sensors, and the ability to automate infrastructure, cloud applications, tools, and processes while also providing auditing capabilities.
Workflows are a powerful tool to help automate many operational tasks. During an outage there are a number of tasks that are normally performed that can be turned into workflows. We will dive into some common use cases and show how workflows can be leveraged to help cut down time to resolution and provide a consistent response during an outage. See how to facilitate a collaborative environment through the use of ChatOps.
StackStorm is an open source event driven automation platform targeted at automating many of the tasks performed by engineers. Essentially, an If-This-Then-That for IT operations, it allows users to stitch together atomic actions into complex workflows and run these workflows based on events from external systems.
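As a rough illustration of the if-this-then-that model, a StackStorm rule binds a trigger to an action through matching criteria. This is a minimal sketch only; the pack, trigger reference, and remediation command are hypothetical:

```yaml
# Hypothetical StackStorm rule: when a monitoring trigger reports a
# CPU alert, run a remediation command on the affected host.
name: restart_on_cpu_alert
pack: examples
description: Remediate a high-CPU alert automatically.
enabled: true
trigger:
  type: nagios.service_state_change
criteria:
  trigger.service:
    type: equals
    pattern: cpu_load
action:
  ref: core.remote
  parameters:
    hosts: "{{ trigger.host }}"
    cmd: "systemctl restart myservice"
```

The rule is the "if"; the action reference is the "that". Events arrive via sensors, are matched against criteria, and fire actions or full workflows.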
Google Cloud Platform monitoring with Zabbix - Max Kuzkin
This presentation describes how to configure Zabbix (https://zabbix.com/) to collect Google Cloud Platform metrics through its Monitoring API, using the gcpmetrics (https://github.com/odin-public/gcpmetrics/) command-line tool.
The document discusses KrakenJS, an open source JavaScript framework built on Node.js and Express. It summarizes PayPal's transition from Java to Node.js architectures, which resulted in benefits like smaller teams, increased performance, and faster development. It then provides an overview of KrakenJS and some of its core features like Makara for internationalization, Lusca for security, and generators for quickly generating app components.
OSMC 2017 | Monitoring MySQL with Prometheus and Grafana by Julien Pivotto - NETWAYS
Database monitoring is not a new topic, so what can we still improve? With Prometheus, you can collect a lot of data at a high frequency and decide later which metrics are useful. Grafana, with Percona graphs, offers a very efficient dashboard solution. We will see how to glue everything together and get the best way to monitor your databases using open source tools only.
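As a minimal sketch of the glue involved, Prometheus scrapes MySQL metrics from a mysqld_exporter instance. The target hostname is illustrative, and the port assumes the exporter's default of 9104:

```yaml
# Fragment of prometheus.yml: scrape a mysqld_exporter target.
scrape_configs:
  - job_name: mysql
    scrape_interval: 15s
    static_configs:
      - targets: ['db1.example.com:9104']  # hypothetical host, default exporter port
```

Grafana then points at Prometheus as a data source, and prebuilt dashboards such as the Percona graphs query these metrics directly.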
Cloud: From Unmanned Data Center to Algorithmic Economy using OpenStack - Andrew Yongjoon Kong
This document discusses Kakaocorp's transition from an unmanned datacenter to using algorithms and OpenStack for resource management. It describes the development of an integrated monitoring system called "Crow" to gather metrics from physical and virtual resources. This system analyzes resource usage to identify unused virtual machines, allowing Kakaocorp to automatically garbage collect over 40% of potential candidates. The monitoring data also helps create a new controlled volume subsystem for improved resource allocation.
With more than 140 million users, KakaoTalk is the most popular mobile messaging platform in South Korea. The team at daumkakao has been using OpenStack with the intention of transforming the current legacy infrastructure into a scale-out cloud to build and offer new services for its users. In this session, we'd like to share our experiences with the OpenStack community, specifically in regards to meeting our needs for networking with Neutron.
OpenStack Neutron offers a lot of methods to implement networking for VMs and containers. For production operations, VM migration is a common activity to manage resources and improve uptime. It's not hard using shared storage like Ceph, but network settings, such as IP addresses, need to be preserved. With a shared storage environment, an image can be attached anywhere inside a data center, but a service IP for a virtual machine is a different story. And when you don't use floating IPs, keeping the same IP across a data-center-wide set of VLANs is a hard job.
To maintain a virtual machine's IP settings and balance IPs between VLANs, we tried several options including overlay, SDN, and NFV technologies. In the end we came to use a route-only network for our virtual machine networks, leveraging technologies like Quagga for RIP, OSPF, and BGP, integrated with Neutron.
Performance and Debugging with the Diagnostics Hub in Visual Studio 2013 - Sasha Goldshtein
The document discusses the various diagnostic and debugging tools available in the Diagnostics Hub in Visual Studio 2013. It describes tools for profiling and analyzing CPU performance like the sampling profiler and instrumentation profiler. It also covers the concurrency visualizer, UI responsiveness tool, memory usage analysis, and memory dump analysis. It encourages using the different tools available in the Diagnostics Hub to debug performance issues, memory leaks, and non-responsive user interfaces.
This document discusses simplifying, standardizing, and automating application deployment processes before moving to the cloud. It recommends using central configuration repositories and automation tools like Chef to deploy identical environments for development, staging, and production. This allows using the same processes and tools across environments. AWS services like OpsWorks can then be used to deploy production using the same Chef configurations. The key is treating the cloud as a tool to deploy standardized, automated applications at scale.
The document discusses Parse's process for benchmarking MongoDB upgrades by replaying recorded production workloads on test servers. They found a 33-75% drop in throughput when upgrading from 2.4.10 to 2.6.3 due to query planner bugs. Working with MongoDB, they identified and helped fix several bugs, improving performance in 2.6.5 but still below 2.4.10 levels initially. Further optimization work increased throughput above 2.4.10 levels when testing with more workers and operations.
Vladimir Ulogov - Large Scale Simulation | ZabConf2016 Lightning Talk - Zabbix
The document discusses the need for a Zabbix Proxy Simulator to test Zabbix monitoring configurations at scale. It describes how the current Zabbix Agent Simulator is not scalable and that implementing simulations at the Zabbix Proxy level is the right approach. The Zabbix Proxy Simulator would interface with the Zabbix Server through the proxy interface and use the CLIPS expert system to apply rules to simulated facts and exchange normalized configuration and historical data. While not fully completed, the Python-CLIPS integration provides an experimental foundation.
Trouble Ticket Integration with Zabbix in Large Environment - Alain Ganuchaud
Large environments rely on trouble-ticket and help-desk tools for managing IT issues. Bridging Zabbix with over 5000 servers and a help desk manually is a painful, if not impossible, project. In this presentation we will cover how to integrate Zabbix with a help desk, the architecture involved, and the issues that arise, especially in large environments.
As an example, we will cover the case study of the Zabbix-ServiceNow integration, as it was developed for SwissLife and released as open source.
The document discusses automated infrastructures and provides a case study of MonkeyNews, a small startup news site about monkeys. It describes how MonkeyNews built an automated infrastructure using tools like Puppet, EC2, iClassify, and Capistrano. This allowed them to quickly scale infrastructure, deploy new applications, and address issues without manual configuration by treating infrastructure as code.
The document discusses DevOps practices at Kakaocorp. It provides background on Andrew Yongjoon Kong and describes some key metrics of Kakaocorp's OpenStack deployment. It then covers concepts like collaboration, affinity, tools, and scaling in a DevOps context. Specific examples at Kakaocorp include using GitHub, Jira, Chef, and Jenkins. It also introduces initiatives like KEMI for integration and DKOS for container management.
This document introduces Nova, an open source cloud computing fabric controller. It describes Nova's core files, binaries, and services. Nova uses a flags module to manage configurable parameters. It has a scheduler module that uses drivers and algorithms to schedule virtual machines across compute hosts. Services include APIs for user interaction and internal RPC for communication between modules.
ChinaNetCloud - The Zabbix Database - Zabbix Conference 2014 - ChinaNetCloud
Overview of the Zabbix monitoring system database and how to use or customize it for reporting and integration.
Originally given at Zabbix Global Conference in Riga, Latvia in Sept, 2014
Windows Configuration Management: Managing Packages, Services, & Power Shell-... - Puppet
This document discusses using Puppet to manage Windows configuration. It covers installing packages using Chocolatey, managing services like WSUS, and using PowerShell with Puppet. Puppet works by defining the desired configuration, simulating changes, enforcing the configuration, and reporting differences. The presenters demonstrate installing packages, managing services, and using PowerShell modules with Puppet. They also discuss Puppet support for Windows, including supported modules for tasks like SQL Server management, patching with WSUS, and using DSC resources.
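The declare-then-enforce pattern described above might look roughly like the following minimal manifest. The package name and the service are illustrative, not from the presentation:

```puppet
# Hypothetical Puppet manifest for a Windows node:
# install a package via Chocolatey and keep a service running.
package { 'git':
  ensure   => installed,
  provider => 'chocolatey',
}

service { 'wuauserv':   # Windows Update service, as an example
  ensure => running,
  enable => true,
}
```

Running the agent in `--noop` mode simulates the changes (the "simulating changes" step mentioned above) before Puppet enforces them and reports any drift.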
Crowbar and OpenStack: Steve Kowalik, SUSE - OpenStack
Crowbar and OpenStack
Audience: Intermediate
Topic: Operations
Abstract: One of the greatest challenges in implementing OpenStack is the complexity in deploying and maintaining all of its many components on what can be a wide range of different hardware platforms. To mitigate this problem, SUSE has developed Crowbar, an open source deployment tool that has led to SUSE winning the “Rule the Stack” deployment competition every time it has been run.
This presentation will take you through the basics of Crowbar as well as a demonstration of some of its features.
Speaker Bio: Steven Kowalik, SUSE
Steven Kowalik is a Sydney-based open source developer with over two decades experience contributing to major projects, including over 15 years with Debian GNU/Linux, as well as significant involvement with upstream OpenStack.
Steven is currently a senior developer at SUSE, working primarily on SUSE OpenStack Cloud and related projects.
OpenStack Australia Day Government - Canberra 2016
https://events.aptira.com/openstack-australia-day-canberra-2016/
Storm-on-YARN: Convergence of Low-Latency and Big-Data - DataWorks Summit
Hadoop plays a central role for Yahoo! in providing personalized experiences for our users and creating value for our advertisers. In this talk, we will discuss the convergence of low-latency processing and the Hadoop platform. To enable this convergence, we have developed Storm-on-YARN, which allows Storm streaming/microbatch applications and Hadoop batch applications to be hosted in a single cluster. Storm applications can leverage YARN for resource management and apply Hadoop-style security to Hadoop datasets on HDFS and HBase. In Storm-on-YARN, YARN is used to launch the Storm application master (Nimbus) and enables Nimbus to request resources for Storm workers (Supervisors). The YARN resource manager and the Storm scheduler work together to support multi-tenancy and high availability. HDFS enables Storm to achieve higher availability of Nimbus itself. We are introducing Hadoop-style security into Storm through JAAS authentication (Kerberos and Digest). Storm servers (Nimbus and DRPC) will be configured with authorization plugins for access control and audit. The security context enables Storm applications to access only authorized datasets (including those created by Hadoop applications). Yahoo! is making our contributions to Storm and YARN available as open source. We will work with industry partners to foster the convergence of low-latency processing and big data.
This document discusses 101 mistakes that FINN.no learned from in running Apache Kafka. It begins with an introduction to Kafka and why FINN.no chose to use it. It then discusses FINN.no's Kafka architecture and usage over time as their implementation grew. The document outlines several common mistakes made including not distinguishing between internal and external data, lack of external schema definition, using a single configuration for all topics, defaulting to 128 partitions, and running Zookeeper on overloaded nodes. Each mistake is explained, potential consequences are given, better solutions are proposed, and what FINN.no has done to address them.
Nurse Tech Presentation given on May 14 2015 at the AutoRemediation meetup.
http://www.meetup.com/Auto-Remediation-and-Event-Driven-Automation/events/222051597/
An overview of 20 automation projects within OpenStack. The presentation for OpenStack online meetup www.meetup.com/OpenStack-Online-Meetup/ Recording is at https://plus.google.com/u/0/events/ca0d20climslpjgm8dml1lft0p8
Déployer des applications à n'importe quelle échelle, facilement. C'est la promesse faite par Nomad, le dernier né de la famille HashiCorp, déjà auteur à succès de Vagrant, Consul ou bien Terraform. Lors de ce tour d'horizon de l'outil, ponctué de nombreuses démos, nous parlerons déploiement, mise à jour, contraintes et passage à l'échelle. Nous verrons en quoi Nomad apporte une réponse à la délicate question de l'optimisation des ressources d'un SI, d'un point de vue capacitif, mais aussi temporel.
Automation and Orchestration - Harnessing Threat Intelligence for Better Inci...Chris Ross
The document discusses automation and orchestration in incident response. It notes that early attempts focused on vendor mergers but that never fully delivered. Standards like APIs and REST have enabled products to integrate more easily. The future of automation and orchestration likely involves products communicating directly through APIs rather than going through a centralized system. This allows for customized solutions and speed while still using best-of-breed products. It requires security teams to develop these integrations themselves.
This document describes Mistral, an OpenStack service for task orchestration and scheduling. Mistral implements the Convection workflow service and uses the TaskFlow library to execute tasks in a distributed manner. It aims to provide an easy and flexible mechanism for executing workflows consisting of interrelated tasks. Key features include high availability, scalability, scheduling capabilities, and observability of workflow states. Mistral's DSL allows users to define workflows as a graph of tasks that can be visualized and analyzed.
Couchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedInMichael Kehoe
Good monitoring can be the difference between a great night's sleep or hearing your phone go off at 2:37 a.m. because of a production outage. Couchbase Server provides a large number of metrics which can be overwhelming if you do not know the critical things to focus on or how to expose that information to your monitoring system. In this talk we will look at example production incidents, going in depth around specific things to monitor, and how this information can be used to find issues, work out root cause, and discover trends.
The Cloud Convergence: OpenStack and Kubernetes.Ihor Dvoretskyi
Murano is an OpenStack project that introduces an application catalog for publishing and deploying ready-to-use applications. It provides a way to abstract applications from underlying infrastructure resources and supports both Linux and Windows. Murano uses a dashboard, API, engine, and agents to deploy and manage applications. Kubernetes is an open source container orchestration tool that improves on Borg to manage containers across clusters. It uses components like pods, replication controllers, and services to deploy and scale containerized applications reliably. Together, Murano and Kubernetes provide application cataloging and deployment abilities while abstracting applications from infrastructure resources managed by OpenStack.
This document discusses LinkedIn's use of Couchbase as an in-memory data store. It describes how LinkedIn has grown to rely heavily on Couchbase, now running it in production, staging, and corporate environments. It also outlines some of the key use cases Couchbase supports at LinkedIn, such as serving as a read-through cache, storing counters, and acting as a source of truth datastore for some internal tools. Finally, it discusses the operational tooling and processes LinkedIn has developed to support Couchbase at scale.
LinkedIn serves traffic for its 467 million members from four data centers and multiple PoPs spread geographically around the world. Serving live traffic from from many places at the same time has taken us from a disaster recovery model to a disaster avoidance model where we can take an unhealthy data center or PoP out of rotation and redistribute its traffic to a healthy one within minutes, with virtually no visible impact to users. The geographical distribution of our infrastructure also allows us to optimize the end-user's experience by geo routing users to the best possible PoP and datacenter.
This talk provide details on how LinkedIn shifts traffic between its PoPs and data centers to provide the best possible performance and availability for its members. We will also touch on the complexities of performance in APAC, how IPv6 is helping our members and how LinkedIn stress tests data centers verify its disaster recovery capabilities.
This document discusses automating DevOps processes through orchestration and workflows. It introduces Petsy, a pet art company that needs to automate deployments. Common DevOps workflows like deployment, infrastructure upgrades, and scaling are described. The document then introduces the Cloudify project for defining application topologies and workflows through a TOSCA-inspired DSL. A live demo of installing Mezzanine is shown to demonstrate how workflows can install applications and dependencies. The document concludes by discussing how Cloudify fits into the OpenStack ecosystem.
Query processing and Query OptimizationNiraj Gandha
This presentation is made with many efforts and I believe that it will be proven as good presentation to clear the basic of query processing and optimization under the DBMS subject. The topics covered in this presentation are the basic fundamentals of the topic as suggested.
Using SaltStack to Auto Triage and Remediate Production SystemsMichael Kehoe
LinkedIn created an auto-remediation system named Nurse which leverages SaltStack and the CherryPy API to auto-triage and remediate issues with production systems. See how LinkedIn uses SaltStack with Nurse in its production environment and learn how to architect your own auto-triage and remediation system.
Deploying and managing container-based applications with OpenStack and Kubern...Ihor Dvoretskyi
Linux containers have recently taken the industry by storm, offering a lightweight, powerful, portable and upgradeable alternative to traditional app deployment on a host OS/VM.
Managing Docker containers on OpenStack VMs is possible today with Mirantis OpenStack, with the Murano Application Catalog radically simplifying the job of placing multiple application containers in an environment, installing apps in them from public resources such as Docker Hub, and deploying the environment on VMs for use. For managing containers at large scales, Mirantis and Google are now working jointly to enable Murano to configure and deploy Kubernetes — the Google-initiated open source project to build and refine cluster orchestration for containers on infrastructure.
In this presentation the core concepts of OpenStack, Docker and Kubernetes will be described, as well as demonstrated abilities to deploy containerized applications, managed by Kubernetes on above of OpenStack cloud.
This document summarizes a presentation about Hexadite's security orchestration and automation solution. Hexadite was founded in 2014 to address shortcomings in traditional incident response, which can take weeks to investigate and resolve alerts. Hexadite's Automated Incident Response Solution (AIRS) automatically investigates and resolves all cybersecurity alerts within minutes instead of weeks through intelligent automation. The presentation reviews why traditional incident response is failing, demonstrates AIRS' high level architecture and integration with security tools, and provides an example use case of AIRS automatically containing a threat across multiple systems.
Monitoring at Facebook - Ran Leibman, Facebook - DevOpsDays Tel Aviv 2015DevOpsDays Tel Aviv
This document summarizes Ran Leibman's presentation on monitoring tools, components, and mentality at Facebook. It describes Facebook's monitoring architecture including the operational data store (ODS) for storing metrics, Scuba for real-time log monitoring, the alarm system for creating alerts, Facebook Auto-Remediation (FBAR) for automating issue resolution, notifications and subscriptions for alerting engineers, and dashboards for visualizing data. The presentation emphasizes treating metrics as important data, empowering developers to monitor, automating problem resolution, and using monitoring to surface previously unknown issues.
The document discusses query processing and optimization. It describes the basic concepts including query processing, query optimization, and the phases of query processing. It also explains relational algebra operations like selection, projection, joins, and additional operations. The document then covers topics like query decomposition, analysis, normalization, simplification, and restructuring during query optimization. It discusses cost estimation and algorithms for implementing relational algebra operations and file organization.
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
This document summarizes a presentation about using DTrace on OS X. It introduces DTrace as a dynamic tracing tool for user and kernel space. It discusses the D programming language used for writing DTrace scripts, including data types, variables, operators, and actions. Example one-liners and scripts are provided to demonstrate syscall tracking, memory allocation snooping, and hit tracing. The presentation outlines some past security work using DTrace and similar dynamic tracing tools. It concludes with proposing future work like more kernel and USDT tracing as well as Python bindings for DTrace.
Digdag can automate large-scale data processing and handle errors. It provides constructs like operators, parameters, and task groups to organize workflows. Operators package tasks to run queries or process data. Parameters allow passing variables between tasks. Task groups modularize and organize workflows. Digdag supports error handling, monitoring, parallelization, versioning, and reproducing workflows across environments.
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
The document discusses using the PERFORMANCE_SCHEMA feature in MySQL 5.6 to diagnose and improve the performance of a query that is not scaling well. It provides an example query against large tables that does not scale beyond 10 threads. Various tools for performance analysis are discussed, but the PERFORMANCE_SCHEMA is presented as a potentially better option for getting detailed insight into where time is being spent and how to optimize the server. The talk will cover both the capabilities and limitations of the PERFORMANCE_SCHEMA.
1. The document discusses tools and techniques for solving performance issues in Python and PostgreSQL systems, including profiling Python code, logging PostgreSQL queries, and optimizing parallel query processing.
2. Key recommendations include reproducing performance issues in a reliable, isolated and repeatable way, and using load testing to prevent issues.
3. Analyzing tools like pg_activity and optimizing settings like max_worker_processes and max_parallel_workers can help improve query speed at the cost of higher CPU usage.
There are at least 40 to 50 different formats of GC logs. Here, we explained the commonly used GC log formats, tricks, patterns and tools to analyze them effectively.
Prometheus Everything, Observing Kubernetes in the CloudSneha Inguva
The document discusses using Prometheus and Alertmanager for monitoring Kubernetes clusters at DigitalOcean. It describes the company's transition from traditional monitoring tools to Kubernetes and Prometheus. Key points include setting up Prometheus scrapers to collect metrics from Kubernetes services, configuring alerts in manifest files, and addressing common issues like alert fatigue and unclear ownership of alerts. The presentation outlines next steps like automating alerts and using metrics for auto-scaling and autopilot functions.
ETL (production) use cases explored giving insights for the practical use of Hadoop by Bryan at a Hadoop User Group (HUG) Ireland event, which was hosted by Synchronoss in Dublin on January 11th, 2016.
Refactoring legacy code guided by tests in WordPressLuca Tumedei
Slides for the talk I've presented at WC Roma 2017 (https://2017.rome.wordcamp.org/).
"Because you can’t always start from scratch.
Modern Tribe took on this impervious task head-on.
In this speech I share the knowledge the team working on “The Events Calendar” plugin suite collected along the way, the practicalities, the “gotchas””, the pitfalls in human and development terms.
I will also go into the details of down-to-earth examples, findings and tools we used to do it."
The document discusses various strategies and techniques for capacity management of web operations, including forecasting future capacity needs, identifying ceilings for system resources, implementing safety factors, and performing diagonal scaling. It also provides examples of metrics used at Flickr for monitoring capacity and some "stupid capacity tricks" that can be employed in emergencies.
Building source code level profiler for C++.pdfssuser28de9e
1. The document describes building a source code level profiler for C++ applications. It outlines 4 milestones: logging execution time, reducing macros, tracking function hit counts, and call path profiling using a radix tree.
2. Key aspects discussed include using timers to log function durations, storing profiling data in a timed entry class, and maintaining a call tree using a radix tree with nodes representing functions and profiling data.
3. The goal is to develop a customizable profiler to identify performance bottlenecks by profiling execution times and call paths at the source code level.
Managing Large-scale Networks with Triggerjathanism
Trigger is a network automation toolkit that allows users to programmatically configure, monitor, and manage network devices. Written in Python, it uses SSH, Telnet, and Junoscript to remotely execute commands on network devices from all major vendors. Trigger handles tasks like command execution, change management, and metadata storage for network devices in a centralized, scalable, and reliable way. It aims to simplify network automation through an easy to use API and extensibility features like custom command classes.
This document discusses PM2, a production process manager for Node.js applications. It provides tools for deploying apps, managing processes, monitoring performance, and organizing microservice architectures. Key features include built-in load balancing, automatic restarting of crashed processes, process monitoring, and clustering for high availability. The document also covers using PM2 modules to extend its capabilities and writing custom modules.
Your admin toolbelt is not complete without Salesforce DXDaniel Stange
This document summarizes a presentation about using Salesforce DX (SFDX) for administrators. It discusses how SFDX can reduce the number of clicks needed to perform common admin tasks. The presentation covers what admins need to know about SFDX, including commands, scratch orgs, and connecting to orgs. It provides examples of using SFDX to assign permission sets, upload data, and run tests. It introduces the idea of "DX instant recipes" to easily accomplish tasks and shares a link to an open source SFDX command line cookbook repository on GitHub.
PVS-Studio is ready to improve the code of Tizen operating systemAndrey Karpov
Objective. Contract agreement with PVS-Studio team concerning the error fixing and regular code audit.
Currently, PVS-Studio detects more than 10% of errors that are present in the code of the Tizen project.
In the case of regular use of PVS-Studio on the new code, about 20% of errors can be prevented.
I predict that PVS-Studio team can detect and fix about 27 000 errors in the Tizen project.
How to measure everything - a million metrics per second with minimal develop...Jos Boumans
Krux is an infrastructure provider for many of the websites you
use online today, like NYTimes.com, WSJ.com, Wikia and NBCU. For
every request on those properties, Krux will get one or more as
well. We grew from zero traffic to several billion requests per
day in the span of 2 years, and we did so exclusively in AWS.
To make the right decisions in such a volatile environment, we
knew that data is everything; without it, you can't possibly make
informed decisions. However, collecting it efficiently, at scale,
at minimal cost and without burdening developers is a tremendous
challenge.
Join me in this session to learn how we overcame this challenge
at Krux; I will share with you the details of how we set up our
global infrastructure, entirely managed by Puppet, to capture over
a million data points every second on virtually every part of the
system, including inside the web server, user apps and Puppet itself,
for under $2000/month using off the shelf Open Source software and
some code we've released as Open Source ourselves. In addition, I’ll
show you how you can take (a subset of) these metrics and send them
to advanced analytics and alerting tools like Circonus or Zabbix.
This content will be applicable for anyone collecting or desiring to
collect vast amounts of metrics in a cloud or datacenter setting and
making sense of them.
User-data allows scripts to run on instance bootup, enabling automated configuration. IndexMedia improved deployment time from 30 minutes to 90 seconds by splitting scripts into static and instance-specific parts. AutoScale automatically launches and terminates instances to maintain performance within specified bounds based on metrics like CPU utilization. With just four commands, IndexMedia set up an AutoScale group with a scaling policy and alarm to dynamically scale their fleet based on load, solving their problem of maintaining consistent user experience.
Beyond php - it's not (just) about the codeWim Godden
Most PHP developers focus on writing code. But creating Web applications is about much more than just wrting PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.
22. Zoom to Workflow, and Get Practical
• From now on I focus on workflow
• Reminder: EDA != Workflow, but Workflow is a big part of it
23. Patterns vs Practice
• ~100 patterns
http://www.workflowpatterns.com/
• Practice – IMAO: only a few are sufficient
• Workflows do two things well:
– Keep state
– Carry data across systems
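Those two jobs can be sketched in a few lines of Python: a toy runner that records each task's state and feeds one task's output into the next. This is purely illustrative (not Mistral's implementation), and the task names are invented:

```python
# Toy workflow runner illustrating the two jobs above: it keeps
# per-task state and carries data from one task to the next.
def run_workflow(tasks, payload):
    state = {}
    for name, fn in tasks:
        state[name] = "RUNNING"
        payload = fn(payload)   # output of one task is input to the next
        state[name] = "SUCCESS"
    return state, payload

tasks = [
    ("fetch",     lambda d: {**d, "raw": [1, 5, 3]}),
    ("transform", lambda d: {**d, "max": max(d["raw"])}),
    ("notify",    lambda d: {**d, "sent": True}),
]
state, result = run_workflow(tasks, {})
print(state)          # → {'fetch': 'SUCCESS', 'transform': 'SUCCESS', 'notify': 'SUCCESS'}
print(result["max"])  # → 5
```

A real engine adds persistence, retries, and error states on top, but state tracking plus data passing is the core.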
36. And More Details…
• Nesting
– Nothing to say except
– Input and output
– Nested workflow is an action, not a task
• Retries, Waits, Pause/Resume
• Default task policies
37. Recap: Workflow Operations
• Sequence
• Data passing
• Conditions (on data)
• Parallel execution
• Joins
• Multiple Data Items
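One of those operations, conditions on data, can be mimicked with plain Python; a hedged sketch (task names and the error-rate threshold are invented) of choosing the next task based on a previous task's output, analogous to a conditional transition in a workflow DSL:

```python
# Condition on data: pick the next task based on a task's output,
# the way a workflow transition evaluates an expression on the result.
def check_health(metrics):
    return "ok" if metrics["error_rate"] < 0.05 else "degraded"

def next_task(status):
    return {"ok": "close_ticket", "degraded": "remediate"}[status]

print(next_task(check_health({"error_rate": 0.10})))  # → remediate
print(next_task(check_health({"error_rate": 0.01})))  # → close_ticket
```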
38. What else
• Other than pattern support:
• Reliability
• Manageability – API, CLI, DSL, infra as code…
• Good to have: good GUI
39. Summary
• Event Driven Automation is coming back
– with a new twist
• EDA > Workflow, but Workflow is a key component
• Shameless plug
StackStorm is covering it all
Why listen to me…
Created one of the legacy RunBook automation products
Currently, I am set to fix my past mistakes: core member of the Mistral team
All started with Business Process Automation
Applied software to business
BPM came to life
A body of computer science research on workflow dates back to the late '90s. Petri nets, the math, workflow nomenclature, definitions, patterns – it all started there.
Tibco applied the idea to IT systems – enterprise message bus… IT automation
Others picked up the idea,
Run Book Automation
Servers took days to deploy (and tickets were the way to go)
Docker deploys at split seconds
Speed is addictive – we now hate JIRA and love Slack and Chatops
Tools – way more of them now
Close the loop:
O.O.D.A (Observe, Orient, Decide, Act)
Why workflows are better than scripts – I leave the proof as an exercise to the reader; actually, Brian covered it
Walk you through these patterns, showing Mistral as an example
Pre-conditions, post conditions
For the simple case both work; for advanced patterns, one is more or less friendly than the other.
Example: run full deployment and e2e tests on 3 platforms
You can do it sequentially, but it takes forever.
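The parallel version of that example can be sketched with plain Python concurrency; the platform names and task bodies are invented for illustration, and the join-style report task runs exactly once, after all branches finish:

```python
# Parallel execution over multiple data items, then a join:
# deploy_and_test runs once per platform, concurrently, and the
# join task (report) runs exactly once after all branches complete.
from concurrent.futures import ThreadPoolExecutor

def deploy_and_test(platform):
    return f"{platform}: e2e passed"   # stand-in for real deploy + tests

platforms = ["ubuntu", "centos", "docker"]   # invented data items
with ThreadPoolExecutor() as pool:
    results = list(pool.map(deploy_and_test, platforms))

def report(all_results):               # the "join" task
    return f"{len(all_results)} platforms done"

print(report(results))  # → 3 platforms done
```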
How many times is t5 gonna run?
How many times is chatops_say gonna run?
How many times is t5 gonna run now? Once!
Cool: watch, ma – the multiple data items are running in parallel! And the final data is collected at the end.
Check concurrency
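A concurrency policy can be sketched the same way: process many items in parallel, but cap how many run at once. The items and the limit of 2 below are invented for illustration:

```python
# Concurrency policy: run tasks for many items, but at most 2 at a time,
# similar to a task-level concurrency limit in a workflow engine.
from concurrent.futures import ThreadPoolExecutor

items = ["vm-1", "vm-2", "vm-3", "vm-4", "vm-5"]
with ThreadPoolExecutor(max_workers=2) as pool:   # the concurrency cap
    results = list(pool.map(lambda vm: f"{vm} patched", items))
print(results)
```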
There are a few more nuances within these patterns, which, in the interest of time, I just mention in passing:
This is the minimal set that gives enough power but keeps it simple to create, track, and reason.