How to Meta-Sumo - Using Logs for Agile Monitoring of Production Services
1. How to Meta-Sumo
Using Logs for Agile Monitoring of Production Services
Christian Beedgen
christian@sumologic.com
@raychaser
2. Who Am I?
Co-Founder & CTO, Sumo Logic since 2010
– Cloud-based log management and analytics
– Applications, operations, security
Server guy, Chief Architect, ArcSight, 2001 – 2009
– Major SIEM player in the enterprise space
– Log management for security and compliance
6. What are Logs?
The murmurings and whispers of your infrastructure
– Devices & Services (ex. Security, Email, Authentication)
– Applications (your code, or 3rd party applications)
Written to disk by applications
– Used during development by developers
– And then often ignored in production
Yes, Logs are Big Data
– There are a lot of logs (Volume)
– A large Variety of formats, plus it’s real-time (Velocity)
Free-form text, usually one message per line
– Scrolls by in your terminal
– Makes you feel like you’re in the matrix
7. Application Logging
Just do it*
– Use what’s common in your language (Log4J, …)
– If in doubt, log it - you might need it later, and disk is cheap
The obvious stuff
– Always log a precise timestamp
– Add a log level for filtering
Building a distributed system?
– Always add the process/service/module name
– Add the host ID if you can
Log the context
– When processing a request, remember who it is from
– Within the scope of the request, always log the context
Bring it all together in a single place
– Open Source options
– Commercial options
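The checklist above can be sketched with Python's standard logging module (the slide names Log4J; Python is used here only for illustration). This is a minimal sketch, not the speaker's setup: the field layout, the logger name "checkout", and the helpers build_logger and with_context are assumptions made for the example.

```python
import io
import logging
import socket
import time


class UtcFormatter(logging.Formatter):
    """Render timestamps in UTC so the trailing 'Z' in the format is accurate."""
    converter = time.gmtime


# Assumed layout: timestamp, level, host, module name, request context, message.
LOG_FORMAT = (
    "%(asctime)s.%(msecs)03dZ %(levelname)s "
    "host=%(host)s module=%(name)s "
    "user=%(user)s request=%(request_id)s %(message)s"
)


def build_logger(stream, module_name):
    """Create a logger that writes one line per message to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(UtcFormatter(LOG_FORMAT, datefmt="%Y-%m-%dT%H:%M:%S"))
    logger = logging.getLogger(module_name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def with_context(logger, user, request_id):
    """Bind host ID and request context so every line logged carries them."""
    extra = {"host": socket.gethostname(), "user": user, "request_id": request_id}
    return logging.LoggerAdapter(logger, extra)


# Usage: log within the scope of one (hypothetical) request.
stream = io.StringIO()
log = with_context(build_logger(stream, "checkout"), user="alice", request_id="r-42")
log.info("order accepted items=%d", 3)
line = stream.getvalue()
print(line, end="")
```

Binding the context once per request keeps individual log calls short while still satisfying "always log the context".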
8. Just Do It*
Actually, please do think first
– Do not log passwords
– Do not log credit card numbers
– Do not log anything that’s PII
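The slide's advice is to not log sensitive data in the first place. As a safety net on top of that, a logging filter can scrub messages before they are written. The sketch below is an illustration only: the patterns, the field names (password, token, and so on), and the RedactingFilter class are assumptions, and a crude digit pattern like this one is no substitute for real PII review.

```python
import io
import logging
import re

# Hypothetical patterns; real PII detection needs far more than this.
SENSITIVE_KV = re.compile(r"\b(password|passwd|secret|token)=\S+", re.IGNORECASE)
# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_NUMBER = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text):
    """Replace sensitive key=value pairs and card-like digit runs."""
    text = SENSITIVE_KV.sub(lambda m: m.group(1) + "=[REDACTED]", text)
    return CARD_NUMBER.sub("[REDACTED-CARD]", text)


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before any handler output."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = None  # message is already formatted
        return True


# Usage: attach the filter to a handler.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactingFilter())
logger = logging.getLogger("redaction-demo")
logger.handlers.clear()
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False
logger.info("login user=%s password=%s", "bob", "hunter2")
output = stream.getvalue()
```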
13. This is the Meta part
Our system to collect and analyze Logs is actually a distributed, cloud-based infrastructure that itself generates a ton of logs
We are running a second, smaller instance that receives all the logs from the Prod system – we call this our Shadow system
A majority of the management and monitoring of the Prod system is done via the Shadow system, which we are madly in love with
18. Example
Timestamp with time zone!
Log level
19. Example
Timestamp with time zone!
Log level
Host ID & module name (process/service)
20. Example
Timestamp with time zone!
Log level
Host ID & module name (process/service)
Code location or class
21. Example
Timestamp with time zone!
Log level
Host ID & module name (process/service)
Code location or class
Authentication context
22. Example
Timestamp with time zone!
Log level
Host ID & module name (process/service)
Code location or class
Authentication context
Key-value pairs
23. What about structure?
Should you rather use JSON, say?
– Hey, it’s really up to you – machines love it
– But do humans read the logs as well?
Modern tools extract structure if there is structure
– Even if the structure doesn’t conform to a protocol
– So you don’t strictly have to use a structured format
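To illustrate how structure can be pulled out of a line that does not follow a formal protocol, here is a minimal sketch that accepts either a JSON line or free-form text with key=value pairs. The function name extract_fields and the sample field names are assumptions for the example; this is not how any particular log tool does it.

```python
import json
import re

# key=value, where the value is either "quoted text" or a run of non-spaces.
KV_PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def extract_fields(line):
    """Return a dict of fields from a JSON line or a free-form key=value line."""
    line = line.strip()
    if line.startswith("{"):
        try:
            return json.loads(line)
        except json.JSONDecodeError:
            pass  # not valid JSON after all; fall back to key=value scanning
    return {key: value.strip('"') for key, value in KV_PAIR.findall(line)}


free_form = extract_fields(
    '2016-01-01T00:00:00Z INFO user=alice action="run search" took_ms=42'
)
structured = extract_fields('{"user": "alice", "took_ms": 42}')
```

The free-form line stays readable for humans, and the fields are still recoverable by machines, which is the trade-off the slide describes.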
25. Sumo Pet Trick #1 – Basic Troubleshooting
All errors in the application have an error instance ID
UI always tries very hard to display the ID
– User can easily copy/paste the error ID into a ticket
26. Sumo Pet Trick #1 – Basic Troubleshooting
Find AP2V3-HZICT-BP6UO in the logs!
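A minimal sketch of the error instance ID idea: generate an ID when the error happens, log it, show it to the user, and later search the logs for it. The ID shape (three groups of five uppercase letters or digits) is inferred from the example ID on the slide; the helper names and the error_id field are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def new_error_id(groups=3, group_len=5):
    """Build an ID shaped like the slide's example, e.g. XXXXX-XXXXX-XXXXX."""
    return "-".join(
        "".join(secrets.choice(ALPHABET) for _ in range(group_len))
        for _ in range(groups)
    )


def find_error(lines, error_id):
    """Return every log line that mentions the given error instance ID."""
    return [line for line in lines if error_id in line]


# Usage with hypothetical log lines.
error_id = new_error_id()
log_lines = [
    f"ERROR module=search error_id={error_id} query failed",
    "INFO module=search query ok",
]
matches = find_error(log_lines, error_id)
```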
27. Sumo Pet Trick #2 – Everything breaks
Distributed systems have lots of different components
– There will be fail
– The morning coffee routine…
There is beauty in numbers
– Pull out all the logs that contain the term “error” (or “fail”, or “exception”, or all of the above)
– Then count by process/service/module, or host ID
Even the most basic aggregation will create actionable insight
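The "filter, then count" step can be sketched in a few lines of Python. This assumes each log line carries a module=NAME field, which is an assumption of the example and not something the slides specify.

```python
import re
from collections import Counter

ERROR_TERMS = re.compile(r"error|fail|exception", re.IGNORECASE)
MODULE_FIELD = re.compile(r"\bmodule=(\S+)")


def count_errors_by_module(lines):
    """Count lines mentioning error/fail/exception, grouped by module name."""
    counts = Counter()
    for line in lines:
        if ERROR_TERMS.search(line):
            match = MODULE_FIELD.search(line)
            counts[match.group(1) if match else "unknown"] += 1
    return counts


# Usage with hypothetical log lines.
sample = [
    "ERROR module=search query failed",
    "INFO module=search query ok",
    "WARN module=ingest exception while parsing",
    "ERROR module=search timeout",
]
error_counts = count_errors_by_module(sample)
print(error_counts.most_common())
```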
30. Sumo Pet Trick #2 – Everything breaks
Old school can rock this too!
31. Sumo Pet Trick #3 – API Choke Points
What is my API doing?
– Which APIs are being called?
– Average execution times per API
– Who’s calling?
32. Sumo Pet Trick #3 – API Choke Points
Number of calls by API
33. Sumo Pet Trick #3 – API Choke Points
Average execution time by API
34. Sumo Pet Trick #3 – API Choke Points
Number of API calls by user
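The three API views above (calls by API, average execution time by API, calls by user) can be computed from logs in one pass. The sketch below assumes each API call is logged as api=NAME user=NAME took_ms=N; that line shape and the function name api_stats are assumptions for illustration.

```python
import re
from collections import Counter, defaultdict

API_CALL = re.compile(r"\bapi=(\S+) user=(\S+) took_ms=(\d+)")


def api_stats(lines):
    """Return (calls per API, average ms per API, calls per user)."""
    calls_by_api = Counter()
    calls_by_user = Counter()
    total_ms = defaultdict(int)
    for line in lines:
        match = API_CALL.search(line)
        if not match:
            continue  # not an API call line
        api, user, took_ms = match.group(1), match.group(2), int(match.group(3))
        calls_by_api[api] += 1
        calls_by_user[user] += 1
        total_ms[api] += took_ms
    avg_ms = {api: total_ms[api] / calls_by_api[api] for api in calls_by_api}
    return calls_by_api, avg_ms, calls_by_user


# Usage with hypothetical log lines.
sample_calls = [
    "INFO api=/search user=alice took_ms=100",
    "INFO api=/search user=bob took_ms=300",
    "INFO api=/login user=alice took_ms=50",
    "INFO module=ingest heartbeat",
]
by_api, avg_by_api, by_user = api_stats(sample_calls)
```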
35. Sumo Pet Trick #4 – Which searches are running
Log a session ID on start and stop
– When there are long-running operations
– When more than one node participates in generating the result
For each session in the time range, did you get both the start and the stop message?
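Pairing start and stop messages by session ID can be sketched as a set difference. The line shape (session=ID event=start|stop) and the function name are assumptions for the example. A session with a start but no stop inside the time range is either still running or ended without logging its stop.

```python
import re

SESSION_EVENT = re.compile(r"\bsession=(\S+) event=(start|stop)\b")


def unfinished_sessions(lines):
    """Return session IDs that logged a start but no matching stop."""
    started, stopped = set(), set()
    for line in lines:
        match = SESSION_EVENT.search(line)
        if match:
            target = started if match.group(2) == "start" else stopped
            target.add(match.group(1))
    return started - stopped


# Usage with hypothetical log lines.
session_log = [
    "INFO session=s1 event=start query=errors",
    "INFO session=s2 event=start query=latency",
    "INFO session=s1 event=stop rows=120",
]
still_open = unfinished_sessions(session_log)
```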