Monitoring is not mainstream anymore; observability is. A 10-minute talk about why this shift is happening now and what its purpose is. Is it just another name for monitoring?
8. Why all this trouble?
● Cloud
● Containers
● Distributed systems
9. Build tools to have a better point of view on your system
10. Do you need more?
● @mipsytipsy (Charity Majors on Twitter)
● @gianarb (my Twitter username)
● https://gianarb.it/blog/observability
● https://www.youtube.com/watch?v=1wjovFSCGhE
Editor's Notes
Observability is more about how to learn a codebase. It is about debugging, not about dashboards or graphs. When you get paged and look at one of your dashboards, you troubleshoot quickly because you already know what the problem is. Dashboards don’t say enough to somebody who doesn’t know the app.
I am a Software Engineer; deploying my code to production is the best experience I have ever had, and it took me to the next level. A production environment is the best learning experience for developers to understand how their code really works, or does not, with real traffic and real customers, but you need to give them the right tools and the right skills to build a comfortable production environment.
All of them are useful
Instrumenting an application for logs, metrics, and traces can be expensive.
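As a rough illustration of what instrumenting for all three signals involves, here is a minimal sketch that uses only the Python standard library. The names `instrumented`, `METRICS`, `SPANS`, and `charge` are hypothetical and not from the talk; the in-memory structures stand in for a real metrics backend and tracer.

```python
import logging
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("checkout")

# Hypothetical in-memory stand-ins for a metrics backend and a tracer.
METRICS = defaultdict(int)   # counter name -> value
SPANS = []                   # finished spans, one dict each


@contextmanager
def instrumented(operation, trace_id=None):
    """Emit a log line, bump a counter, and record a span for one operation."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()
    status = "ok"
    try:
        yield trace_id
    except Exception:
        status = "error"
        raise
    finally:
        duration = time.perf_counter() - start
        # Metric: one counter per operation and outcome.
        METRICS[f"{operation}.{status}"] += 1
        # Trace: a span carrying the id that links it to other spans.
        SPANS.append({"trace_id": trace_id, "operation": operation,
                      "status": status, "duration_s": duration})
        # Log: a structured line carrying the same trace id.
        log.info("op=%s status=%s trace_id=%s duration_s=%.6f",
                 operation, status, trace_id, duration)


def charge(amount):
    """Example business function wrapped with all three signals."""
    with instrumented("charge"):
        if amount <= 0:
            raise ValueError("amount must be positive")
        return amount
```

Even in this toy version, every call to `charge` pays for a counter update, a span record, and a log write, which gives a feel for why instrumentation has a cost.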
You build dashboards from the answers you got after a page. A lot of pages, a lot of dashboards... It doesn’t scale, and it doesn’t look like work for engineers.
Distributed systems are complicated. Replicating them is hard: keeping databases in sync and reproducing traffic is hard. So the best place to test is production.