This document summarizes and compares several open source monitoring tools: Nagios, Graphite, StatsD, Logstash, and Sensu. Nagios is introduced as a commonly used tool that some love and some find frustrating. Graphite is described as a tool for storing and graphing time-series data. StatsD aggregates counters and timers and sends them to backend services like Graphite. Logstash is an tool for managing logs and events that can input, filter, and output data. Sensu is a monitoring router that connects check scripts to handler scripts to alert or process monitoring data. Examples are given for each tool and what types of metrics to collect.
A presentation on our experience at Ingram Content Group with Grafana and MySQL. In an enterprise environment it is sometimes necessary to keep data in a traditional, general purpose SQL database such as MySQL or PostgreSQL. These slides explore the challenges and benefits of using Grafana with an SQL database in a large enterprise production setting.
Most often Zabbix users will monitor Linux hosts using the Zabbix agent, however SNMP is not only an option, it's actually a very viable one. Andrew Nelson will describe his experience configuring Zabbix to monitor a Linux environment of over 500 systems using only SNMP.
Zabbix Conference 2015
A talk about Open Source logging and monitoring tools, using the ELK stack (ElasticSearch, Logstash, Kibana) to aggregate logs, how to track metrics from systems and logs, and how Drupal.org uses the ELK stack to aggregate and process billions of logs a month.
Talk given by Thomas Widhalm at Icinga Camp San Francisco 2016 - https://www.icinga.org/community/events/archive/2016-archive/icinga-camp-san-francisco/
A presentation on our experience at Ingram Content Group with Grafana and MySQL. In an enterprise environment it is sometimes necessary to keep data in a traditional, general purpose SQL database such as MySQL or PostgreSQL. These slides explore the challenges and benefits of using Grafana with an SQL database in a large enterprise production setting.
Most often Zabbix users will monitor Linux hosts using the Zabbix agent, however SNMP is not only an option, it's actually a very viable one. Andrew Nelson will describe his experience configuring Zabbix to monitor a Linux environment of over 500 systems using only SNMP.
Zabbix Conference 2015
A talk about Open Source logging and monitoring tools, using the ELK stack (ElasticSearch, Logstash, Kibana) to aggregate logs, how to track metrics from systems and logs, and how Drupal.org uses the ELK stack to aggregate and process billions of logs a month.
Talk given by Thomas Widhalm at Icinga Camp San Francisco 2016 - https://www.icinga.org/community/events/archive/2016-archive/icinga-camp-san-francisco/
Dave Williams - Nagios Log Server - Practical ExperienceNagios
Dave Williams - Nagios Log Server - Practical Experience. -
This session will detail the green field deployment of Nagios Log Server in a client environment consisting of HP LAN Switches, 3PAR disk storage, HP Blade Chassis with Flex Fabric using
VMware, Hyper-V, Exchange & Citrix.
Monitoring Big Data Systems - "The Simple Way"Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Demi Ben-Ari is a Co-Founder and CTO @ Panorays.
Demi has over 9 years of experience in building various systems both from the field of near real time applications and Big Data distributed systems.
Describing himself as a software development groupie, Interested in tackling cutting edge technologies.
Demi is also a co-founder of the “Big Things” Big Data community: http://somebigthings.com/big-things-intro/
Presentation given at Mongo SV conference in Mountain View on December 3, 2010. Covers reasons for logging to MongoDB, logging library basics and library options for Java, Python, Ruby, PHP and C#. Updated 1/1/2012 with more info on logging in Ruby and tailable cursors.
Talk I did on log aggregation with the ELK stack at Leeds DevOps. Covers how we process over 800,000 logs per hour at laterooms, and the cultural changes this has helped drive.
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
Think you have big data? What about high availability
requirements? At DataDog we process billions of data points every day including metrics and events, as we help the world
monitor the their applications and infrastructure. Being the world’s monitoring system is a big responsibility, and thanks to
Redis we are up to the task. Join us as we discuss how the DataDog team monitors and scales Redis to power our SaaS based monitoring offering. We will discuss our usage and deployment patterns, as well as dive into monitoring best practices for production Redis workloads
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
This talk covers the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. Topics include:
- Time-based indices and index templates to efficiently slice your data
- Different node tiers to de-couple reading from writing, heavy traffic from low traffic
- Tuning various Elasticsearch and OS settings to maximize throughput and search performance
- Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Dave Williams - Nagios Log Server - Practical ExperienceNagios
Dave Williams - Nagios Log Server - Practical Experience. -
This session will detail the green field deployment of Nagios Log Server in a client environment consisting of HP LAN Switches, 3PAR disk storage, HP Blade Chassis with Flex Fabric using
VMware, Hyper-V, Exchange & Citrix.
Monitoring Big Data Systems - "The Simple Way"Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Demi Ben-Ari is a Co-Founder and CTO @ Panorays.
Demi has over 9 years of experience in building various systems both from the field of near real time applications and Big Data distributed systems.
Describing himself as a software development groupie, Interested in tackling cutting edge technologies.
Demi is also a co-founder of the “Big Things” Big Data community: http://somebigthings.com/big-things-intro/
Presentation given at Mongo SV conference in Mountain View on December 3, 2010. Covers reasons for logging to MongoDB, logging library basics and library options for Java, Python, Ruby, PHP and C#. Updated 1/1/2012 with more info on logging in Ruby and tailable cursors.
Talk I did on log aggregation with the ELK stack at Leeds DevOps. Covers how we process over 800,000 logs per hour at laterooms, and the cultural changes this has helped drive.
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
Think you have big data? What about high availability
requirements? At DataDog we process billions of data points every day including metrics and events, as we help the world
monitor the their applications and infrastructure. Being the world’s monitoring system is a big responsibility, and thanks to
Redis we are up to the task. Join us as we discuss how the DataDog team monitors and scales Redis to power our SaaS based monitoring offering. We will discuss our usage and deployment patterns, as well as dive into monitoring best practices for production Redis workloads
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...Sematext Group, Inc.
This talk covers the basics of centralizing logs in Elasticsearch and all the strategies that make it scale with billions of documents in production. Topics include:
- Time-based indices and index templates to efficiently slice your data
- Different node tiers to de-couple reading from writing, heavy traffic from low traffic
- Tuning various Elasticsearch and OS settings to maximize throughput and search performance
- Configuring tools such as logstash and rsyslog to maximize throughput and minimize overhead
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Time to say goodbye to your Nagios based setup. Discover all the new cool tools out there to do some more efficient monitoring. A talk made at OSMC 2014.
https://www.youtube.com/watch?v=_BAWi9Zhmic
OSMC 2014: Using elasticsearch, logstash & kibana in system administration | ...NETWAYS
This talk will give an introduction into the ELK stack, which consists of Elasticsearch, Logstash and Kibana. Before giving a quick theoretical introduction about the stack we will talk about the challenges and problems when trying to extract information from logfiles, which are distributed and very different in nature.
After covering the theoritical groundwork we will dive into the practical parts of the talk. There will be several demonstrations of how to use the ELK stack to obtain useful information for system administrators from your production environment. The demonstrations will include parsing realtime streams, old fashioned logfiles as well as making sense of performance metrics.
At Etsy we are big fans of graphing and monitoring all the things. We deploy our main site several times a day and our monitoring provides us with the tight feedback loop we need to make this possible. The same goes for changes in the infrastructure which are deployed in a similar fashion of small and frequent changes. For this to work we have build up monitoring that tracks changes and possible problems in every nook and cranny of the Etsy stack, be it a network change, systems or application level performance or how bad the last week of on-call rotation was. The flipside of monitoring all the things however is that we have a myriad of graphs and alerts that can potentially be important and page the on-call engineer at any given time. The risk of running into alert fatigue and a normalization of deviance through not properly scoped checks is rising with this ever increasing size of the monitoring system. This is why we also continuously monitor our monitoring system and ask questions about whether we have all the information at hand when we get paged, if certain alerts actually need to wake someone up or if they are needed at all. I will give a quick overview of how our monitoring stack is built and then give insights into how we gather data about it and use it to make things better. For site operations and more importantly for the human getting paged when something does go wrong.
WhatsUp® Gold 2017 is IT monitoring reimagined
Advanced User Experience (UX) technology lets you visualise and interact with your network for faster time to answers.
Intuitive workflows speed troubleshooting and improve productivity across the IT team. Unified monitoring of network devices, VMs, wireless, apps, servers, traffic and configurations - all under one flexible license provides industry leading value.
Présentation donnée lors du séminaire LINAGORA, intitulé : « Superviser et administrer votre SI avec les Logiciels Libres ! Ça marche ! » du mois d'octobre 2009.
MySQL performance monitoring using Statsd and Graphite (PLUK2013)spil-engineering
MySQL performance monitoring using Statsd and Graphite (PLUK2013)
Note: this is a placeholder for the presentation next Tuesday at the Percona Live London
This talk evalutes some easy ways to extract useful trending and capacity planning out of your existing monitoring investment. Using Nagios performance data, we examine simple behaviors with PNP4Nagios and graduate on to more insightful analytics with Graphite. With metrics in hand we look at the questions that IT /should/ be asking, such as:
* What sort of data should I trend?
* Why do I need to trend it?
* How do Operational or Engineering trends relate to Business or Transactional monitoring?
* How does this data impact our customer relationship and/or their bottom-line?
Finally, we look at creative ways to get profiling data out of your production systems with a minimum amount of effort from your development team.
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
Once you start working with Big Data systems, you discover a whole bunch of problems you won’t find in monolithic systems. Monitoring all of the components becomes a big data problem itself. In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system using tools like: Web Services,Spark,Cassandra,MongoDB,AWS. Not only the tools, what should you monitor about the actual data that flows in the system? We’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Managing your black friday logs Voxxed LuxembourgDavid Pilato
Surveiller une application complexe n'est pas une tâche aisée, mais avec les bons outils, ce n'est pas si sorcier. Néanmoins, des périodes fortes telles que les opérations de type "Black Friday" (Vendredi noir) ou période de Noël peuvent pousser votre application aux limites de ce qu'elle peut supporter, ou pire, la faire crasher. Parce que le système est fortement sollicité, il génère encore davantage de logs qui peuvent également mettre à mal votre système de supervision.
Dans cette session, j'aborderai les bonnes pratiques d'utilisation de la suite Elastic pour centraliser et monitorer vos logs. Je partagerai également avec vous quelques trucs et astuces pour vous aider à passer sans souci vos Vendredis noirs !
Nous verrons :
* Les architectures de monitoring
* Trouver la taille optimale pour l'API _bulk
* Distribuer la charge
* Taille des index et des shards
* Optimiser les E/S disque
Vous ressortirez de la session avec : des bonnes pratiques pour bâtir son système de monitoring avec la suite Elastic, le tuning avancé pour optimiser les performances d'ingestion et de recherche.
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2l2Rr6L.
Doug Daniels discusses the cloud-based platform they have built at DataDog and how it differs from a traditional datacenter-based analytics stack. He walks through the decisions they have made at each layer, covers the pros and cons of these decisions and discusses the tooling they have built. Filmed at qconsf.com.
Doug Daniels is a Director of Engineering at Datadog, where he works on high-scale data systems for monitoring, data science, and analytics. Prior to joining Datadog, he was CTO at Mortar Data and an architect and developer at Wireless Generation, where he designed data systems to serve more than 4 million students in 49 states.
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Codemotion
Once you start working with Big Data systems, you discover a whole bunch of problems you won’t find in monolithic systems. Monitoring all of the components becomes a big data problem itself. In the talk, we’ll mention all of the aspects that you should take into consideration when monitoring a distributed system using tools like Web Services, Spark, Cassandra, MongoDB, AWS. Not only the tools, what should you monitor about the actual data that flows in the system? We’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden to monitor all of the components becomes a big data problem itself.
In the talk we’ll mention all of the aspects that you should take in consideration when monitoring a distributed system once you’re using tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Not only the tools, what should you monitor about the actual data that flows in the system?
And we’ll cover the simplest solution with your day to day open source tools, the surprising thing, that it comes not from an Ops Guy.
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Codemotion
Monitoring an entire application is not a simple task, but with the right tools it is not a hard task either. However, events like Black Friday can push your application to the limit, and even cause crashes. As the system is stressed, it generates a lot more logs, which may crash the monitoring system as well. In this talk I will walk through the best practices when using the Elastic Stack to centralize and monitor your logs. I will also share some tricks to help you with the huge increase of traffic typical in Black Fridays.
Managing your Black Friday Logs NDC OsloDavid Pilato
Monitoring an entire application is not a simple task, but with the right tools it is not a hard task either. However, events like Black Friday can push your application to the limit, and even cause crashes. As the system is stressed, it generates a lot more logs, which may crash the monitoring system as well. In this talk I will walk through the best practices when using the Elastic Stack to centralize and monitor your logs. I will also share some tricks to help you with the huge increase of traffic typical in Black Fridays.
Topics include:
* monitoring architectures
* optimal bulk size
* distributing the load
* index and shard size
* optimizing disk IO
Takeaway: best practices when building a monitoring system with the Elastic Stack, advanced tuning to optimize and increase event ingestion performance.
This presentation will be useful to those who would like to get acquainted with Apache Spark architecture, top features and see some of them in action, e.g. RDD transformations and actions, Spark SQL, etc. Also it covers real life use cases related to one of ours commercial projects and recall roadmap how we’ve integrated Apache Spark into it.
Was presented on Morning@Lohika tech talks in Lviv.
Design by Yarko Filevych: http://www.filevych.com/
Workshop: Big Data Visualization for SecurityRaffael Marty
Big Data is the latest hype in the security industry. We will have a closer look at what big data is comprised of: Hadoop, Spark, ElasticSearch, Hive, MongoDB, etc. We will learn how to best manage security data in a small Hadoop cluster for different types of use-cases. Doing so, we will encounter a number of big-data open source tools, such as LogStash and Moloch that help with managing log files and packet captures.
As a second topic we will look at visualization and how we can leverage visualization to learn more about our data. In the hands-on part, we will use some of the big data tools, as well as a number of visualization tools to actively investigate a sample data set.
Managing your black friday logs - Code EuropeDavid Pilato
Monitoring an entire application is not a simple task, but with the right tools it is not a hard task either. However, events like Black Friday can push your application to the limit, and even cause crashes. As the system is stressed, it generates a lot more logs, which may crash the monitoring system as well. In this talk I will walk through the best practices when using the Elastic Stack to centralize and monitor your logs. I will also share some tricks to help you with the huge increase of traffic typical in Black Fridays.
Topics include:
* monitoring architectures
* optimal bulk size
* distributing the load
* index and shard size
* optimizing disk IO
Takeaway: best practices when building a monitoring system with the Elastic Stack, advanced tuning to optimize and increase event ingestion performance.
Docker Service Registration and Discoverym_richardson
This talk covers some basic concepts of Service Registry and Discovery with Docker. Consul, Registrator and consul-template are discussed.
It was presented at the Sydney Docker meetup in April 2015
Node collaboration - Exported Resources and PuppetDBm_richardson
Node Collaboration - How can your servers share information with each other. Exploring Exported Resources, PuppetDB and other methods.
This talk was given at Sydney Puppet Users Meetup on 14/08/2014.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Open Source Monitoring Tools
1. The State of Open
Source Monitoring
Tools
Michael Richardson (@m_richo)
Energized Work
2. What tools are we currently using to
monitor and troubleshoot our systems?
3. What tools are we currently using to
monitor and troubleshoot our systems?
•
•
•
•
Nagios
ssh + grep <something_bad> /some/random/log/file.log
tail –f /some/random/log/file.log
Others?
19. Graphite
•
Everything stored in graphite has a path with
components delimited by dots. Eg
servers.HOSTNAME.METRIC
applications.APPNAME.METRIC
servers.database01.memfree
applications.trading.loginattempts
20. Graphite
•
•
No need to pre-define metric end-points
Determine granularity of data upfront.
/opt/graphite/conf/storage-schemas.conf
[stats]
pattern = ^stats.*
retentions = 10:2160,60:10080,600:262974
[catchall]
priority = 0
pattern = ^.*
retentions = 30:86400,300:525600
21. Graphite
What should I graph/trend?
1. Application Profiling Data
2. Operational Profiling Data
3. Regression Testing (releases)
Why should I Graph/trend?
1. Trends can tell you when something is about to break.
2. …instead of hearing from your customers that it’s broken
3. Data can tell you when something is already broken but
you don’t yet know it (regression).
Source: Jason Dixon (@obfuscurity)
25. StatsD
• Written in node.js
• ~400 lines of javascript
• Listens to statistics (counters & timers),
and sends aggregates to backend
services (like graphite).
• simple
28. StatsD
Don’t like Javascript or Node.js??
Google “statsd alternatives”…..
20+ rewrites/clones for you including..
Ruby, python, scala, python+twisted,
erlang, clojure, C, groovy
29. StatsD
Concepts
• Buckets (a name that translates to graphite end-point)
• Values
• Flush (default 10 seconds)
Counter metrics
successfullogins:1|c|@0.1
Timing metrics
apitimer:320|ms
31. StatsD
Timer examples
• How fast is our function blah()
• How fast is a database query
• How fast is our 3rd party API service
• How fast is our internet access
• How fast are our page response times.
34. LogStash
•
•
•
•
•
Tool for managing Events and logs
http://logstash.net
https://github.com/logstash/logstash
Apache 2.0 license
Created by Jordan Sissel
(@jordansissel)
44. LogStash
Kibana
• Web interface for viewing logstash
records stored in elastic search
• http://kibana.org/
• http://github.com/rashidkpc/Kibana
• Search for records
• Stream records (near realtime)
• Create RSS feeds based on search
results
• Score, trend data
52. Sensu
• Message oriented architecture
(messages are JSON objects)
• Described as a monitoring router
• Connects “check” scripts on Sensu
Clients to “handler” scripts on Sensu
Servers
53. Sensu
Checks can
• Determine if a service like apache up
and running? (check exit code)
• Collect metrics like page views or
database cache usage.
55. Sensu
Output of checks are router to 1 or more
handlers who determine what to do.
• Send alerts via email, pagerduty, IRC,
twitter, basecamp, xmpp, hipchat,
campfire, etc, etc
56. Sensu
Output of checks are router to 1 or more
handlers who determine what to do.
• Send alerts via email, pagerduty, IRC,
twitter, basecamp, xmpp, hipchat,
campfire, etc, etc
• Feed metrics to backend services like
graphite, librato, opentsdb, etc, etc
Anyone want a quick rundown of how it works?Fault detection, notifictations, escalations, acknowledgements, adding new nodes, no ajax
Graphite is a highly scalable real-time graphing systemwritten in pythonapache 2.0 license
Graphite is a highly scalable real-time graphing systemwritten in pythonapache 2.0 license
Web – djangoWhisper – metrics database format (similar to RRDTool). Accepts out-of-order data and supports pipelining of data in a single operation.Carbon – storage engine (agent + cache + persister)
Web – djangoWhisper – database for storing time series dataCarbon – listening service for capturing data
Web – djangoWhisper – database for storing time series dataCarbon – listening service for capturing data
Why Graphing and trendingApplication profiling dataOperational profiling data
Why Graphing and trendingApplication profiling dataOperational profiling data
Counter example add 1 to the particular bucket. Count is sent at flush interval and reset to 0tells statsd that counter is sampled every 1/10th of the time.Timing exampleAPI service took 320ms to completeStatsd determines percentiles, average (mean), standard deviation, sum, lower and upper bounds for the flush intervalCan support storing histogram of values too (not default)