In this presentation, we’ll explore the basics of the three pillars of observability and what Spring offers to implement each of them: logging (SLF4J), metrics (Micrometer), and distributed tracing (Spring Cloud Sleuth, Zipkin/Brave, OpenTelemetry).
We’ll also look at how to take your system to the next level, what else Spring and related technologies provide for looking under the hood of your running system (Spring Boot Actuator, Logbook, Eureka, Spring Boot Admin, Swagger, Spring HATEOAS), and what our future plans are.
Observability: Beyond the Three Pillars with Spring
2. About Me
- @jonatan_ivanov
- develotters.com
- Seattle Java User Group
- Spring Team @ VMware
- Micrometer
- Spring Cloud Sleuth
- “Spring Observability”
3. Disclaimer
This presentation may contain product features or functionality that are currently under
development.
This overview of new technology represents no commitment from VMware to deliver these
features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or
sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new features/functionality/technology discussed or
presented, have not been determined.
The information in this presentation is for informational purposes only and may not be
incorporated into any contract. There is no commitment or obligation to deliver any items
presented herein.
4. Agenda
- What is Observability?
- Why do we need it?
- “The Three Pillars” (with examples)
- Logging
- Metrics
- Distributed Tracing
- How to implement it with Spring?
- “Non-conventional” Observability
- Q&A
6. What is Observability?
“In control theory, observability is a measure of how well
internal states of a system can be inferred from knowledge
of its external outputs.”
…
“A system is said to be observable if [...] the current state can
be estimated using only the information from outputs.”
(Wikipedia)
8. What is Observability?
How well we can understand the
internals of a system based on its
outputs
(Providing meaningful information about what happens inside)
9. What is Observability?
Being able to ask arbitrary questions
without knowing ahead of time what you want to ask
Turning data points and context into insights
Being able to quickly troubleshoot problems
with no prior knowledge (unknown unknowns)
10. Why do we need Observability?
Today's systems are insanely complex (cloud)
(Death Star Architecture, Big Ball of Mud)
11. Why do we need Observability?
Complexity (cloud): LAMP stack vs. Cloud Environments
We need to face unknown unknowns
We might not know where our apps are
We might not know how many instances we have (or what versions)
We can’t modify/debug/etc. it
Something is always broken (Fallacies of Distributed Computing)
Like sending rovers to Mars: You can’t touch/modify them after launch
12. Why do we need Observability?
Chaos
Environments can be chaotic
You turn a knob a little here and services go down over there
Unknown Unknowns
We can’t know everything, we need to deal with unknown unknowns
“This should be impossible!”, “That will never happen!”
Relativity
The same thing can be perceived differently by different observers
Everything is broken for the users but the server side seems ok
13. Why do we need Observability?
Continuous Improvement
If you want to improve something, you need to be able to measure it first
How many resources do you utilize (cpu, ram, io, etc.)?
What are your throughput/latency (max.) patterns?
How frequently do you deploy?
How long does it take for the code to go live?
How long does it take to troubleshoot an issue or recover from an outage?
How often are you paged?
14. Why do we need Observability?
Opens the door for advanced capabilities
Chaos Engineering
Anomaly Detection
Feature flags
A/B Testing
Auto-tuning
Adaptive Apps
16. Logging - Metrics - Distributed Tracing
Logging
What happened?
Emitting events
Easy to read (grep)
INFO/WARN/ERROR/…
Stacktraces
Metrics
What is the context?
Measure-and-Combine data
Aggregatable
Can identify trends
Not traffic-sensitive (usually)
Distributed Tracing
Why did it happen?
Recording events
With causal ordering
Can identify cause across apps
Context Propagation (later)
17. Example: Latency
Logging
“Processing a request took 140ms.”
Is it bad?
Is it good?
What is the context?
Metrics
“99.999% of the requests were faster than 140ms.”
“The max was 150ms.”
So it’s quite bad.
But why was this slow?
Distributed Tracing
“Service A called Service B.”
“Service B called the DB.”
“The services were ok.”
“The network was ok.”
“The DB was slow.”
“Because somebody requested a lot of data.”
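The Metrics statements in this example ("X% of the requests were faster than ...", "the max was ...") are percentile and maximum summaries over many latency samples. A minimal sketch of how such numbers can be derived, assuming a plain array of latencies in milliseconds; LatencySummary and its methods are hypothetical names for illustration and this is not how Micrometer computes percentiles internally:

```java
import java.util.Arrays;

// Illustrative only: a hypothetical helper, not part of Micrometer or Spring.
final class LatencySummary {

    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of all samples are less than or equal to it.
    static long percentile(long[] latenciesMs, double percentile) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile <= 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Largest observed latency.
    static long max(long[] latenciesMs) {
        return Arrays.stream(latenciesMs).max()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }
}
```

For the ten samples 10, 20, ..., 90 and 150, the 90th percentile is 90 and the max is 150, which puts a single 140ms log line into context.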
18. Example: Error
Logging
“Request processing failed.”
“Here’s the stacktrace.”
Is it bad?
(Well, it failed.) How bad?
How many of them failed?
What is the context?
Metrics
“The error rate is 0.001/sec.”
“We had 2 errors recently.”
So it’s not that bad.
But why did this happen?
Distributed Tracing
“Service A called Service B.”
“Service B called the DB.”
“The services were ok.”
“The network was ok.”
“The DB call failed.”
“Because of invalid input.”
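The Metrics statement in this example expresses errors as a rate. A small sketch of that arithmetic, assuming (this is not stated on the slide) that "recently" means a window of 2000 seconds; ErrorRate is a hypothetical helper:

```java
import java.time.Duration;

// Illustrative only: a hypothetical helper for the rate arithmetic.
final class ErrorRate {

    // Errors per second over an observation window.
    static double perSecond(long errorCount, Duration window) {
        if (window.isZero() || window.isNegative()) {
            throw new IllegalArgumentException("window must be positive");
        }
        return errorCount / (window.toMillis() / 1000.0);
    }
}
```

Under that assumption, 2 errors over 2000 seconds give 0.001 errors per second.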
20. Logging 101 - Types of logs
Application logs: classic DEBUG/INFO/WARN/ERROR events (+stacktraces)
Payload logs: Raw request and response pairs
GC logs: GC events (JEP 271 - Unified GC Logging)
Access logs: Logs from the underlying HTTP server (e.g.: Tomcat)
- Who called our service and when
- What request (HTTP method, headers, path, query)
- Response status, processing time, payload sizes
etc. (audit logs, metrics in logs, trace logs)
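To make the access log fields concrete, here is a minimal sketch that parses one line in the widely used "common" layout (host, identity, user, timestamp, request line, status, bytes). AccessLogLine is a hypothetical helper, the sample line is made up, and processing time is not part of this layout, so it is left out:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Tomcat or Spring): extracts who called,
// when, what was requested, and how the server responded.
final class AccessLogLine {
    // host identity user [timestamp] "METHOD path protocol" status bytes
    private static final Pattern COMMON = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"(\\S+) (\\S+) (\\S+)\" (\\d{3}) (\\d+|-)$");

    final String remoteHost;
    final String user;
    final String timestamp;
    final String method;
    final String path;
    final int status;
    final long bytesSent;

    private AccessLogLine(Matcher m) {
        remoteHost = m.group(1);
        user = m.group(3);
        timestamp = m.group(4);
        method = m.group(5);
        path = m.group(6);
        status = Integer.parseInt(m.group(8));
        // "-" means that no response body was sent
        bytesSent = "-".equals(m.group(9)) ? 0L : Long.parseLong(m.group(9));
    }

    static AccessLogLine parse(String line) {
        Matcher m = COMMON.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized access log line: " + line);
        }
        return new AccessLogLine(m);
    }
}
```

Usage: AccessLogLine.parse("10.0.0.7 - alice [10/Oct/2023:13:55:36 +0000] \"GET /orders?page=2 HTTP/1.1\" 200 2326") yields the user alice, the method GET, the path /orders?page=2 and the status 200.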
21. Logging with Spring: SLF4J + Logback
SLF4J with Logback comes pre-configured but you can replace Logback
SLF4J
- Simple Logging Façade for Java
- Simple API for various logging libraries
- Lets you plug in the desired logging library
Logback
- Modern logging library
- Natively implements the SLF4J API
If you want Log4j2 instead of Logback:
- exclude spring-boot-starter-logging
+ add spring-boot-starter-log4j2
22. Logging with Spring: Payload, Access, GC
Payload logs: Logbook
+ logbook-spring-boot-starter (auto-configured)
Access logs:
server.tomcat.accesslog.enabled=true
server.tomcat.basedir=logs
server.tomcat.accesslog.pattern=...
server.jetty.accesslog.enabled=true
server.undertow.accesslog.enabled=true
+ logback-access (if you want to use Logback, needs to be configured)
GC logs: JVM args (e.g. -Xlog:gc* on JDK 9+)
24. Metrics 101
Time series data: data that changes over time
Trends, context, anomaly detection, visualization, alerting
Various Backends
Publishing: Client Pushes vs. Server Polls
Dimensionality: Dimensional vs. Hierarchical
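The difference between hierarchical and dimensional metrics is easiest to see in how the same measurement is identified. A minimal sketch, assuming a made-up metric name and tags; MetricNaming is hypothetical and the rendered text format is only for illustration, not any backend's wire format:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: a hypothetical helper, not part of Micrometer.
final class MetricNaming {

    // Hierarchical: every dimension is flattened into the metric name,
    // so the position of each value carries its meaning.
    static String hierarchical(String name, String... values) {
        return values.length == 0 ? name : name + "." + String.join(".", values);
    }

    // Dimensional: the name stays stable and the dimensions are key-value
    // tags that a backend can filter and aggregate on.
    static String dimensional(String name, Map<String, String> tags) {
        String rendered = new TreeMap<>(tags).entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return name + "{" + rendered + "}";
    }
}
```

With the hierarchical form, adding a dimension changes the name of every series; with the dimensional form it only adds a tag.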
25. Metrics with Spring: Micrometer
Popular Metrics library on the JVM
Like SLF4J, but for metrics
Simple API
Supports the most popular metric backends
Comes with spring-boot-actuator
Spring projects are instrumented using Micrometer
A lot of third-party libraries use Micrometer
26. Micrometer - Like SLF4J, but for metrics
AppOptics
Atlas
Azure Monitor
CloudWatch (AWS)
Datadog
Dynatrace
Elastic
Ganglia
Graphite
Humio
InfluxDB
JMX
KairosDB
New Relic
OpenTSDB
OTLP
Prometheus
SignalFx
Stackdriver (GCP)
StatsD
Wavefront* (VMware)
(/actuator/metrics)
*VMware Tanzu Observability by Wavefront
31. Distributed Tracing 101 - Span and Trace
Span (basic unit of work)
SpanId, ParentSpanId, TraceId
Timestamps (start/stop)
Events (annotations) with timestamps
Tags (key-value pairs)
ProcessId
Local IP, Remote IP
+ Log correlation (and context propagation)
+ Visualization
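The relationship between spans and a trace can be sketched with a tiny data structure: every span has its own SpanId, child spans point to their parent through ParentSpanId, and all spans of one request share the same TraceId. SimpleSpan is a hypothetical class for illustration only; it is not the Brave or OpenTelemetry API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: a hypothetical span model, not a real tracer.
final class SimpleSpan {
    final String traceId;       // shared by every span in the trace
    final String spanId;        // unique for this unit of work
    final String parentSpanId;  // null for the root span
    final String name;
    final Map<String, String> tags = new LinkedHashMap<>();
    final long startNanos;
    private long endNanos = -1;

    private SimpleSpan(String traceId, String parentSpanId, String name) {
        this.traceId = traceId;
        this.spanId = newId();
        this.parentSpanId = parentSpanId;
        this.name = name;
        this.startNanos = System.nanoTime();
    }

    // Starts a new trace with a root span.
    static SimpleSpan newTrace(String name) {
        return new SimpleSpan(newId(), null, name);
    }

    // Starts a child span in the same trace; this span becomes the parent.
    SimpleSpan newChild(String childName) {
        return new SimpleSpan(traceId, spanId, childName);
    }

    SimpleSpan tag(String key, String value) {
        tags.put(key, value);
        return this;
    }

    void finish() {
        endNanos = System.nanoTime();
    }

    long durationNanos() {
        if (endNanos < 0) {
            throw new IllegalStateException("span not finished");
        }
        return endNanos - startNanos;
    }

    // Random 64-bit ID rendered as 16 hex characters.
    private static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }
}
```

A root span for an incoming request and a child span for its database call share a traceId, and that shared ID is what lets a tracing backend assemble and visualize the whole request.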
33. Distributed Tracing with Spring: Spring Cloud Sleuth
Distributed Tracing Support for Spring
Provides an abstraction layer on top of tracing libraries (3.x)
- Brave (OpenZipkin), default
- OpenTelemetry (CNCF), experimental
Log Correlation + Context Propagation
Instrumentation for Spring Projects (and your application)
Instrumentation for third-party libraries (through Brave and OTel)
Supports various backends (through Brave and OTel)
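Context propagation means the caller writes its trace context into the outgoing request and the callee reads it back, so both sides log and report under the same trace. A simplified sketch using the B3 header names known from OpenZipkin; TraceContextPropagation and its methods are hypothetical, the header map is case-sensitive for brevity, and a real tracer handles this (plus sampling decisions) for you:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hypothetical helper, not the Sleuth or Brave API.
final class TraceContextPropagation {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";

    // Caller side: copy the current context into the outgoing request headers.
    static Map<String, String> inject(String traceId, String spanId) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        return headers;
    }

    // Callee side: continue the caller's trace with a new span whose parent
    // is the caller's span. The returned map is the kind of context that
    // could be attached to log lines for correlation.
    static Map<String, String> continueTrace(Map<String, String> incomingHeaders, String newSpanId) {
        Map<String, String> context = new LinkedHashMap<>();
        String traceId = incomingHeaders.get(TRACE_ID);
        if (traceId == null) {
            // No caller context: start a new trace. Reusing the span ID as
            // the trace ID is a simplification made for this sketch.
            context.put("traceId", newSpanId);
            context.put("spanId", newSpanId);
            return context;
        }
        context.put("traceId", traceId);
        context.put("parentSpanId", incomingHeaders.get(SPAN_ID));
        context.put("spanId", newSpanId);
        return context;
    }
}
```

If every log line of both services includes the traceId from this context, a single search for that ID returns the log lines of the whole request.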
36. “Non-conventional” Observability
Is there anything else beyond Logging + Metrics + Tracing?
We are looking for:
- outputs (that provide)
- meaningful information
- about what’s inside of our system
45. Info Endpoint
How to contact the dev team, where is the repo of the project?
Cloud
instanceId and type
image version
region, account, cloud provider
TLS Certificate Chain
subject, issuer
validity (expiration date) -> health check?
signature algorithm
You can create your own endpoint
Dependencies used at runtime; dependency lock files
/whoami: username + roles
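The "validity (expiration date) -> health check?" idea above can be sketched as a small decision function. CertificateExpiryCheck and its Status values are hypothetical and are not Spring Boot's health API; in a real application the expiration instant could come from X509Certificate.getNotAfter():

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative only: a hypothetical check, not a Spring Boot HealthIndicator.
final class CertificateExpiryCheck {
    enum Status { UP, WARNING, DOWN }

    // DOWN once the certificate has expired, WARNING when it expires within
    // `warnBefore`, UP otherwise.
    static Status check(Instant notAfter, Instant now, Duration warnBefore) {
        if (!now.isBefore(notAfter)) {
            return Status.DOWN;
        }
        if (Duration.between(now, notAfter).compareTo(warnBefore) <= 0) {
            return Status.WARNING;
        }
        return Status.UP;
    }
}
```

Passing `now` in explicitly keeps the function easy to test; a 30-day warning window is just one possible choice.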
46. Service Registry/Discoverability
How many service instances do we have (by environment)?
What versions are deployed (by environment)?
Where are they?
host/ip, port
instanceId, region, account, cloud provider, etc.
Service starts/stops (deployments, restarts)?
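The first two questions above are simple aggregations once the registry's instance list is available. A minimal sketch over an in-memory list, assuming a made-up Instance shape; RegistrySnapshot is hypothetical and this is not the Eureka client API:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: hypothetical types, not a real service registry client.
final class RegistrySnapshot {

    static final class Instance {
        final String environment;
        final String version;
        final String host;
        final int port;

        Instance(String environment, String version, String host, int port) {
            this.environment = environment;
            this.version = version;
            this.host = host;
            this.port = port;
        }
    }

    // How many instances do we have, by environment?
    static Map<String, Long> countByEnvironment(List<Instance> instances) {
        Map<String, Long> counts = new TreeMap<>();
        for (Instance instance : instances) {
            counts.merge(instance.environment, 1L, Long::sum);
        }
        return counts;
    }

    // What versions are deployed, by environment?
    static Map<String, Set<String>> versionsByEnvironment(List<Instance> instances) {
        Map<String, Set<String>> versions = new TreeMap<>();
        for (Instance instance : instances) {
            versions.computeIfAbsent(instance.environment, k -> new TreeSet<>())
                    .add(instance.version);
        }
        return versions;
    }
}
```

Two distinct versions in the same environment usually indicate a rollout in progress or a stuck deployment, which is exactly the kind of insight these questions are after.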
52. API Discoverability
How can I call this service?
Spring REST Docs
Generates docs from tests and hand-written docs
Spring Cloud Contract + Pact Broker
Consumer Driven Contracts (test client-server contract)
You know when you break your clients
Swagger / OpenAPI + ReDoc
API spec, docs
API browser + client
Spring HATEOAS + HAL Explorer
Add links to your resources (other resources or operations)
API browser + client