Stop using Nagios (so it can die peacefully)
1. Please stop using Nagios
(so it can die peacefully)
Andy Sykes
Devops @ Forward3D
@supersheep
andy@forward3d.com
2. Do you use Nagios?
Tell me why you picked it.
Go on.
If you don't, why don't you?
3. Reasons for choosing Nagios
• stupid simple plugin system
• billions* of existing plugins
• years of development behind it
• you can hire people who know it
"Everybody uses it."**
* may not actually be true
** except me. and maybe you. and that guy at the back, who really likes Zabbix. you know who you are.
5. So why did you pick Nagios?
Because it's the "safe", default choice.
Because we've grown accustomed to the things that really, really suck about it.
It's a little like we've all got Stockholm Syndrome.
6. What Nagios gets right
Incredibly simple plugin model.
Fairly secure (SSL between agents + master).
Very simple conceptually.
Reliable.
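For anyone who hasn't written one, here is roughly what "incredibly simple" means. This is a minimal sketch of a plugin-style check, assuming a hypothetical disk-usage reading and made-up thresholds; the only parts that come from the Nagios convention are the four exit codes and the one-line output with optional performance data after the pipe.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_disk_used(percent_used, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for a hypothetical disk-usage check."""
    if not 0.0 <= percent_used <= 100.0:
        return UNKNOWN, "DISK UNKNOWN - invalid reading"
    if percent_used >= crit:
        code = CRITICAL
    elif percent_used >= warn:
        code = WARNING
    else:
        code = OK
    # Text after the pipe is performance data: label=value;warn;crit
    line = "DISK {} - {:.1f}% used | used_pct={:.1f};{:.1f};{:.1f}".format(
        LABELS[code], percent_used, percent_used, warn, crit)
    return code, line


def main(argv):
    """Print the status line and return the exit code (reading taken from argv)."""
    try:
        reading = float(argv[1])
    except (IndexError, ValueError):
        print("DISK UNKNOWN - usage: check_disk_used PERCENT")
        return UNKNOWN
    code, line = check_disk_used(reading)
    print(line)
    return code
```

A real plugin would finish with `sys.exit(main(sys.argv))`; the exit code is the whole contract between the check and whatever schedules it.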
7. Nagios, I hate thee; let me count thy ways
Doesn't scale. At all.
World's second most horrible configuration*.
Horrendous interface**.
Assumes a static infrastructure.
No decent programmatic interfaces***.
Throws away perfdata.
Stupid wire format for clients (NRPE/NSCA).
* the world's most horrible configuration is, obviously, Sendmail.
** even the paid Nagios XI one is ugly as sin and unusable.
*** if I catch you parsing status.dat, I will beat your ass.
8. Expansion about config
Configuration has to be in two places:
Server has to know what checks to invoke via NRPE.
Client has to know what checks it will be asked to invoke with NRPE.
THIS IS MADNESS.
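To make the two-places problem concrete, here is a small sketch that compares what a server will ask for against what a client is willing to run. The host name, the two services, and the plugin path are hypothetical, and the `check_nrpe!<name>` form assumes the common setup where the server's `check_nrpe` command passes its first argument as the remote command name.

```python
import re

# Hypothetical server-side object config: what the master will ask for.
SERVER_CFG = """
define service {
    host_name            web01
    service_description  Load
    check_command        check_nrpe!check_load
}
define service {
    host_name            web01
    service_description  Disk
    check_command        check_nrpe!check_disk
}
"""

# Hypothetical client-side nrpe.cfg: what the agent is willing to run.
CLIENT_CFG = """
command[check_load]=/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
"""


def requested_commands(server_cfg):
    """Command names the server passes to check_nrpe."""
    return set(re.findall(r"check_command\s+check_nrpe!(\w+)", server_cfg))


def defined_commands(client_cfg):
    """Command names the client has a command[...] entry for."""
    return set(re.findall(r"^command\[(\w+)\]=", client_cfg, flags=re.MULTILINE))


def missing_on_client(server_cfg, client_cfg):
    """Checks the server will request but the client cannot run."""
    return sorted(requested_commands(server_cfg) - defined_commands(client_cfg))
```

Here `missing_on_client(SERVER_CFG, CLIENT_CFG)` reports `check_disk`: the server was told about it, the client was not, and nothing stops the two files from drifting apart.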
9. Scaling, or lack of it
No such thing as a Nagios cluster.
More checks = more work = longer before you know something's happened!
Every check increases your master's load average.
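A back-of-envelope sketch of that equation. It assumes a deliberately crude model that is not how Nagios actually schedules: a single master, every check taking the same time, and a fixed number of checks in flight at once. All the numbers are made up.

```python
import math


def full_pass_seconds(num_checks, seconds_per_check, concurrency):
    """Time for one master to get through every check once, in batches."""
    return math.ceil(num_checks / concurrency) * seconds_per_check


def overrun_seconds(num_checks, seconds_per_check, concurrency, interval):
    """How far a full pass runs past the intended check interval (0 if it fits)."""
    return max(0, full_pass_seconds(num_checks, seconds_per_check, concurrency) - interval)
```

With 1,000 two-second checks and 100 at a time, a pass takes 20 seconds and fits a 60-second interval. With 6,000 checks it takes 120 seconds, so every result is up to a minute staler than you think it is, and the only knob is a bigger master.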
10. Okay, yes, there’s mod_gearman
But it’s a hack at best.
No redundancy for the machine that distributes the checks, so it’s not a real cluster.
11. API poverty
Can't easily integrate with other systems.
Can't easily write custom dashboards.
Can't get information out again!
Assumes a static infrastructure:
Master has to be told about a client before things can happen.
12. The bandaids we make
Interface:
Opsview, Icinga, Shinken, others
API:
Parsing status.dat, NDO
Client wire format:
Opsview's NRPE, NRD
Config management:
Puppet types, Chef cookbooks
None of it is good enough.
13. The take-home point:
"If we keep using Nagios,
we'll never get anything
better."
(Writing monitoring systems is hard, and needs community involvement and real world adoption. Nagios steals mindshare by being just good enough. It's the monitoring system we deserve, but not the one we need right now.)
14. So, smart guy. What do we do?
Steal all the things that are great about Nagios.
(existing plugin investment, simplicity, security, reliability)
Strap them to something more awesome.
(scalable, API-ready, config management friendly, modern!)
19. Core:
Holds configuration about hosts / services
Distributed across X masters
Check execution (poke)
Results queue (poke response)
20. There’s something we can use for this.
Sensu!
Sensu is often described as the “monitoring router”.
22.
{
  "checks": {
    "chef_client": {
      "command": "check-chef-client.rb",
      "subscribers": ["production"],
      "interval": 60,
      "handlers": ["pagerduty", "irc"]
    }
  }
}
Only on the server
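The `subscribers` list is what replaces per-host configuration: a check is addressed to a group, and any client that says it belongs to that group picks it up. The sketch below only illustrates that matching idea using the definition from the slide; it is not Sensu's code, and `checks_for` is a hypothetical helper.

```python
import json

# The check definition from the slide, unchanged.
CHECKS_JSON = """
{
  "checks": {
    "chef_client": {
      "command": "check-chef-client.rb",
      "subscribers": ["production"],
      "interval": 60,
      "handlers": ["pagerduty", "irc"]
    }
  }
}
"""


def checks_for(client_subscriptions, config):
    """Names of checks addressed to at least one of the client's subscriptions."""
    subs = set(client_subscriptions)
    return sorted(
        name
        for name, check in config["checks"].items()
        if subs & set(check.get("subscribers", []))
    )


config = json.loads(CHECKS_JSON)
production_checks = checks_for(["production"], config)
staging_checks = checks_for(["staging"], config)
```

A client subscribed to `production` gets `chef_client`; one subscribed only to `staging` gets nothing. Nothing on the server names either machine.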
23. Client requires no registration for the server to know about it
Uses Nagios status return codes
Doesn’t talk to the server - talks to RabbitMQ
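The client's job can be sketched in a few lines: run the command, take the exit code as the status, and put the result on a queue instead of handing it to a server. This is an illustration under assumptions, not Sensu's client: `queue.Queue` stands in for RabbitMQ, and the result field names are made up for the example.

```python
import json
import queue
import subprocess
import time


def run_check(name, argv, timeout=10):
    """Run a plugin-style command and describe the outcome as a dict."""
    started = int(time.time())
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        status, output = proc.returncode, proc.stdout.strip()
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Could not run the check at all: report UNKNOWN (3).
        status, output = 3, "UNKNOWN - {}".format(exc)
    return {"check": name, "status": status, "output": output, "executed": started}


def publish(results, result):
    """Serialise a result and put it on the queue (stand-in for the message bus)."""
    results.put(json.dumps(result))
```

Because the client only ever talks to the queue, it does not care how many servers are consuming results on the other side, or whether any of them has heard of it before.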
24. Core:
Holds configuration about hosts / services
Distributed across X masters
Check execution (poke)
Results queue (poke response)
25. What we need:
Core
- Sensu-server
Agent
- Sensu-client
Graphing
Anomaly detection
Alerting
UI
26. Graphing is easy now.
If you’re not using Graphite, you should be.
Sensu “metric” checks can pump data to it.
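Part of why graphing is easy is that Graphite's plaintext protocol is one line per data point: metric path, value, Unix timestamp. A minimal sketch, assuming a Carbon listener on the default plaintext port 2003; the host and the metric name are hypothetical.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "{} {} {}\n".format(path, value, ts)


def send_to_graphite(lines, host="localhost", port=2003):
    """Send already formatted lines to a Carbon plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

For example, `graphite_line("servers.web01.load.1min", 0.42, 1400000000)` gives `servers.web01.load.1min 0.42 1400000000` followed by a newline. A metric check only has to print lines like that; getting them to Graphite is a handler's problem.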
27. What we need:
Core
- Sensu-server
Agent
- Sensu-client
Graphing - Graphite
Anomaly detection
Alerting
UI
28. Anomaly detection is hard.
We’ve got all this metric data, but how do we check it?
- Skyline/Oculus (Etsy)
- Grok (very early days)
- ???
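To show why this is hard, here is about the most naive check you can run over metric data: flag a point that sits more than three standard deviations from the recent mean. This is a toy, not what Skyline, Oculus, or Grok do, and the window and threshold are arbitrary assumptions.

```python
import statistics


def is_anomalous(history, latest, sigmas=3.0):
    """True if `latest` is more than `sigmas` standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to say anything
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat series: any change at all is unusual.
        return latest != mean
    return abs(latest - mean) > sigmas * stdev
```

It catches a load average jumping from around 10 to 30, and it falls over on anything with a daily cycle, a trend, or a deploy in the window, which is most real metrics. That gap is the "???".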
29. What we need:
Core
- Sensu-server
Agent
- Sensu-client
Graphing - Graphite
Anomaly detection - ???
Alerting
UI
30. Alerting is tricky, but mostly solved.
Flapjack! - flapjack.io
Alerting is not the concern of your monitoring tool.
Push all alerts at Flapjack
- define gateways (PagerDuty, email)
- create relationships between checks and gateways
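The gateway idea is small enough to sketch: alerts arrive from anywhere, and a table of relationships decides which gateways each check's alerts go to. This is a hypothetical illustration of the concept only; none of the names below are Flapjack's API.

```python
def route_alert(alert, routes, gateways):
    """Deliver an alert to every gateway related to its check; return the gateway names used."""
    delivered = []
    for name in routes.get(alert["check"], []):
        gateways[name](alert)  # each gateway is just a callable here
        delivered.append(name)
    return delivered


# Hypothetical wiring: record deliveries in a list instead of paging anyone.
sent = []
gateways = {
    "pagerduty": lambda alert: sent.append(("pagerduty", alert["check"])),
    "email": lambda alert: sent.append(("email", alert["check"])),
}
routes = {"chef_client": ["pagerduty", "email"]}
```

The monitoring tool's only responsibility is producing the alert dict. Who gets woken up is decided in one place, separately.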
32. User interfaces are hard.
What do we need from it?
- What’s broken
- When it broke, when it broke in the past
- Say “OK, I know it’s broken”
- View graphs to see how quickly it broke
- See every check everywhere, and filter the list
33. The Sensu Dashboard sucks.
No history!
Acknowledgements aren’t easy to do.
No graphing.
Can’t see anything that’s reporting an OK status.
This won’t do.