Object, measure thyself

The document discusses myths around instrumentation and monitoring in software systems. It presents the ERMA and Graphite open source projects for self-instrumentation and event aggregation/visualization. Instrumentation provides value through fixing production issues fast, capacity planning, and helping business teams. When objects can measure themselves through techniques like AOP, performance monitoring becomes easy.

Object, Measure Thyself
Greg Opaczewski – Orbitz Worldwide
Michael Ducy – BMC Software

Open Source
• ERMA Project :
http://launchpad.net/erma
• Graphite Project :
http://launchpad.net/graphite

Myths of Instrumentation
• No Time For Instrumentation
• No Value ($) in Instrumentation
• Instrumentation Causes Bugs

ERMA
TransactionMonitor monitor =
    new TransactionMonitor("HotelService.purchase");
try {
    response = hotelSupplier.reserve(hotel);
    monitor.succeeded();      // mark the transaction as successful
} catch (ServiceException e) {
    monitor.failedDueTo(e);   // record the cause of the failure
    throw e;
} finally {
    monitor.done();           // always complete the monitor
}
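
To make the try/catch/finally shape of that snippet concrete, here is a minimal stand-in monitor. It is a sketch only and not the ERMA implementation: the class names `SketchMonitor` and `SketchMonitorDemo`, the fields, and the `describe()` method are assumptions made for illustration.

```java
// Hypothetical stand-in for a transaction monitor; not the ERMA implementation.
final class SketchMonitor {
    private final String name;
    private final long startNanos = System.nanoTime();
    private long latencyNanos = -1;   // set by done()
    private boolean failed = false;
    private Throwable cause;

    SketchMonitor(String name) {
        this.name = name;
    }

    void succeeded() {
        failed = false;
    }

    void failedDueTo(Throwable t) {
        failed = true;
        cause = t;
    }

    // Stops the clock; call from a finally block so it always runs.
    void done() {
        latencyNanos = System.nanoTime() - startNanos;
    }

    boolean isFailed() {
        return failed;
    }

    long latencyNanos() {
        return latencyNanos;
    }

    String describe() {
        return name + " failed=" + failed
            + (cause == null ? "" : " cause=" + cause.getClass().getSimpleName());
    }
}

final class SketchMonitorDemo {
    // Runs a task under a monitor using the same try/catch/finally shape as the slide.
    static SketchMonitor runMonitored(String name, Runnable task) {
        SketchMonitor monitor = new SketchMonitor(name);
        try {
            task.run();
            monitor.succeeded();
        } catch (RuntimeException e) {
            monitor.failedDueTo(e);
            // The slide rethrows here; this sketch swallows the exception so
            // the caller can inspect the monitor afterwards.
        } finally {
            monitor.done();
        }
        return monitor;
    }
}
```

Calling `SketchMonitorDemo.runMonitored("HotelService.purchase", task)` returns a monitor whose failure flag and latency can be inspected, which is the kind of data an aggregation layer would consume.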

Self-Instrumentation by:
• Hooks – Interceptors and Listeners
• Abstraction – Abstract the details away from developers
• AOP – Aspect Oriented Programming

Self-Instrumentation by:
• Aspect Oriented Programming (AOP)
<aop:config>
  <aop:aspect id="transactionMonitorActionAspect"
              ref="transactionMonitorActionAdvice">
    <aop:pointcut id="transactionMonitorActionPointcut"
                  expression="target(org.springframework.webflow.execution.Action) and args(context)"/>
    <aop:around pointcut-ref="transactionMonitorActionPointcut"
                method="invoke"/>
  </aop:aspect>
</aop:config>
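
The idea behind the around advice configured above can be sketched without Spring. The example below swaps in a JDK dynamic proxy (`java.lang.reflect.Proxy`) in place of Spring AOP, so it is an illustration of the technique rather than what the configuration actually wires up. The `SketchAction` interface and the `MonitoringAdvice` class are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface standing in for the advised type.
interface SketchAction {
    String execute(String context);
}

// Plays the role of "around" advice: code runs before and after the target call.
final class MonitoringAdvice implements InvocationHandler {
    private final SketchAction target;
    private final List<String> events = new ArrayList<>();

    MonitoringAdvice(SketchAction target) {
        this.target = target;
    }

    List<String> events() {
        return events;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String name = method.getDeclaringClass().getSimpleName() + "." + method.getName();
        long start = System.nanoTime();
        try {
            Object result = method.invoke(target, args);
            events.add(name + " succeeded");
            return result;
        } catch (InvocationTargetException e) {
            Throwable cause = e.getCause();
            events.add(name + " failed: " + cause.getClass().getSimpleName());
            throw cause;   // rethrow the target's own exception
        } finally {
            long elapsedNanos = System.nanoTime() - start;
            events.add(name + " done in " + elapsedNanos + " ns");
        }
    }

    // Returns a proxy that routes every call through the advice.
    static SketchAction wrap(MonitoringAdvice advice) {
        return (SketchAction) Proxy.newProxyInstance(
            SketchAction.class.getClassLoader(),
            new Class<?>[] { SketchAction.class },
            advice);
    }
}
```

The target class contains no monitoring code at all; callers receive the proxy and the timing happens around every call, which is the property the slide is after.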

Value to the Business
• Fixing Production Problems Fast
• Capacity Planning
• Business Product teams rely on ERMA data

Avoid Boilerplate
@Monitored
public interface HotelService {
    void purchase(Itinerary itinerary);
    void cancel(Itinerary itinerary);
}

Avoid Boilerplate
public interface HotelService {
    @Monitored(includeArguments = true)
    void purchase(Itinerary itinerary);
    void cancel(Itinerary itinerary);
}
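
The two slides show the annotation at type level and at method level. A rough sketch of how such an annotation could be declared and looked up by reflection follows. The `Monitored` declaration here is a guess modeled on the slides, not the ERMA source; `CarService`, `MonitoredLookup`, and the use of `String` in place of `Itinerary` are assumptions for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical declaration modeled on the slides; the real one may differ.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Monitored {
    boolean includeArguments() default false;
}

// Method-level use: only purchase() is monitored.
interface HotelService {
    @Monitored(includeArguments = true)
    void purchase(String itinerary);

    void cancel(String itinerary);
}

// Type-level use: every method is monitored (hypothetical second service).
@Monitored
interface CarService {
    void reserve(String itinerary);
}

final class MonitoredLookup {
    // Returns the annotation that applies to the named method: the
    // method-level one if present, otherwise the type-level one, else null.
    static Monitored find(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                Monitored onMethod = m.getAnnotation(Monitored.class);
                return onMethod != null ? onMethod : type.getAnnotation(Monitored.class);
            }
        }
        return null;
    }
}
```

An interceptor could call a lookup like `MonitoredLookup.find` once per method and decide whether to wrap the call in a monitor, and whether to attach the arguments to it.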

Uncovers Bugs
• Allows you to baseline across builds
• MASF (Multivariate Adaptive Statistical Filtering) and SPC (Statistical Process Control)
• Event Pattern Monitoring

Baselining
• Compare present performance vs. historical performance
• Validate testing via theoretical models
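
Comparing present against historical performance can be illustrated with a basic statistical process control check: derive control limits from a baseline and flag observations outside mean ± 3 standard deviations. This is a simplified sketch, not the method used in the talk; the class name `ControlLimits` and the sample values are assumptions. MASF-style filtering would typically keep separate baselines per time segment, which is omitted here.

```java
// Simplified SPC sketch: three-sigma control limits from a historical baseline.
final class ControlLimits {
    final double mean;
    final double stdDev;

    private ControlLimits(double mean, double stdDev) {
        this.mean = mean;
        this.stdDev = stdDev;
    }

    // Builds limits from historical samples, e.g. latencies from earlier builds.
    static ControlLimits fromBaseline(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        double mean = sum / samples.length;
        double squares = 0;
        for (double s : samples) {
            squares += (s - mean) * (s - mean);
        }
        // Sample standard deviation (divides by n - 1).
        double stdDev = Math.sqrt(squares / (samples.length - 1));
        return new ControlLimits(mean, stdDev);
    }

    double upper() {
        return mean + 3 * stdDev;
    }

    double lower() {
        return mean - 3 * stdDev;
    }

    // True when the observation falls outside the control limits.
    boolean isOutOfControl(double observed) {
        return observed > upper() || observed < lower();
    }
}
```

With a baseline of 100, 102, 98, 101, and 99 ms, the mean is 100 ms and the limits sit a little under 5 ms on either side, so a new build averaging 120 ms would be flagged while 103 ms would not.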

Need for Abstraction
[Diagram labels: abstraction; Webapp, Travel Business Services, Switching Services, Transaction Services, Suppliers]

Event Pattern Monitoring
wl|httpIn.shop.search.air.redirect_searchFailure
wl|AirSearchExecuteAction.search
wl|com.orbitz.ojf.OJFClient.getInternal
wl|jiniOut_ShopService_createResultSet
tbs-shop|jiniIn_ShopService_createResultSet
tbs-shop|jiniOut_LowFareSearchService_execute
air-search|jiniIn_LowFareSearchService_execute
air-search|com.orbitz.afo.lib.SearchFilter
air-search|com.orbitz.afo.lib.LowFareSearchServiceImpl.execute
air-search|jiniOut_AirportLookupService_findLocationByIATACode
market|jiniIn_LocationService|DbPoolExhaustedException
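
A trace like the one above can be processed mechanically. The sketch below assumes each line has the shape `tier|event` with an optional third `|exception` field, which is an interpretation of the slide rather than a documented format. The `EventPattern` class and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class EventPattern {
    // One parsed line of the trace.
    static final class Event {
        final String tier;
        final String name;
        final String exception;   // null when the line has no third field

        Event(String tier, String name, String exception) {
            this.tier = tier;
            this.name = name;
            this.exception = exception;
        }
    }

    // Splits each line on '|' and skips lines that do not match the assumed shape.
    static List<Event> parse(List<String> lines) {
        List<Event> events = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\|");
            if (parts.length < 2) {
                continue;
            }
            events.add(new Event(parts[0], parts[1], parts.length > 2 ? parts[2] : null));
        }
        return events;
    }

    // Tiers in the order the request first reached them.
    static List<String> tierPath(List<Event> events) {
        Set<String> tiers = new LinkedHashSet<>();
        for (Event e : events) {
            tiers.add(e.tier);
        }
        return new ArrayList<>(tiers);
    }

    // First event that carries an exception, or null if none does.
    static Event firstFailure(List<Event> events) {
        for (Event e : events) {
            if (e.exception != null) {
                return e;
            }
        }
        return null;
    }
}
```

Applied to the slide's trace, `tierPath` would yield wl, tbs-shop, air-search, market, and `firstFailure` would point at the market tier's DbPoolExhaustedException, even though the user-visible symptom was a search failure in the webapp.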

Final Thought
Performance monitoring is easy when the objects practically measure themselves.

Thank You
• Special thanks to:
– Fellow Co-Authors – Matthew O’Keefe and Stephen Mullins
– Neil Gunther – Mentoring and Candid Editorial Review
– Lead Graphite Developer – Chris Davis

Websites
• ERMA Project :
http://launchpad.net/erma
• Graphite Project :
http://launchpad.net/graphite

?
michael@ducy.org
gopaczewski@orbitz.com

Object, measure thyself

1. Object, Measure Thyself Greg Opaczewski – Orbitz Worldwide Michael Ducy – BMC Software

2. Open Source • ERMA Project : http://launchpad.net/erma • Graphite Project : http://launchpad.net/graphite

3. Complex Environment

4. $10.8 Billion in Gross Bookings in 2007

5. Myths of Instrumentation • No Time For Instrumentation • No Value ($) in Instrumentation • Instrumentation Causes Bugs

6. Myth: No Time For Instrumentation

7. ERMA Extremely Reusable Monitoring API

8. TransactionMonitor monitor = new TransactionMonitor("HotelService.purchase"); try { response = hotelSupplier.reserve(hotel); monitor.succeeded(); } catch (ServiceException e) { monitor.failedDueTo(e); throw e; } finally { monitor.done(); } ERMA

9. Self-Instrumentation by: • Hooks – Interceptors and Listeners • Abstraction – Abstract the details away from developers • AOP – Aspect Oriented Programming

10. Frameworks - Hooks • Spring Framework

11. Frameworks - Abstraction

12. Self-Instrumentation by: • Aspect Oriented Programming (AOP) <aop:config> <aop:aspect id="transactionMonitorActionAspect" ref="transactionMonitorActionAdvice"> <aop:pointcut id="transactionMonitorActionPointcut" expression="target(org.springframework.webflow.execution.Action) and args(context)"/> <aop:around pointcut-ref="transactionMonitorActionPointcut" method="invoke"/> </aop:aspect> </aop:config>

13. Myth: No Time For Instrumentation

14. Myth: No Value ($) in Instrumentation

15. Event Aggregation

16. Event Aggregation

17. Storage and Visualization: Graphite

18. Graphite

19. Graphite

20. Graphite Demo

21. Value to the Business • Fixing Production Problems Fast • Capacity Planning • Business Product teams rely on ERMA data

22. Myth: No Value ($) in Instrumentation

23. Myth: Instrumentation Causes Bugs

24. Avoid Boilerplate @Monitored public interface HotelService { void purchase(Itinerary itinerary); void cancel(Itinerary itinerary); }

25. Avoid Boilerplate public interface HotelService { @Monitored(includeArguments = true) void purchase(Itinerary itinerary); void cancel(Itinerary itinerary); }

26. Uncovers Bugs • Allows you to base line across builds • MASF and SPC • Event Pattern Monitoring

27. Base Lining • Compare present performance vs. historical performance • Validate testing via theoretical models

28. MASF and SPC

29. Need for Abstraction • Webapp • Travel Business Services • Switching Services • Transaction Services • Suppliers

30. Event Pattern Monitoring
    wl|httpIn.shop.search.air.redirect_searchFailure
    wl|AirSearchExecuteAction.search
    wl|com.orbitz.ojf.OJFClient.getInternal
    wl|jiniOut_ShopService_createResultSet
    tbs-shop|jiniIn_ShopService_createResultSet
    tbs-shop|jiniOut_LowFareSearchService_execute
    air-search|jiniIn_LowFareSearchService_execute
    air-search|com.orbitz.afo.lib.SearchFilter
    air-search|com.orbitz.afo.lib.LowFareSearchServiceImpl.execute
    air-search|jiniOut_AirportLookupService_findLocationByIATACode
    market|jiniIn_LocationService|DbPoolExhaustedException

31. Myth: Instrumentation Causes Bugs

32. Final Thought Performance monitoring is easy when the objects practically measure themselves.

33. Thank You • Special thanks to: – Fellow Co-Authors – Matthew O’Keefe and Stephen Mullins – Neil Gunther – Mentoring and Candid Editorial Review – Lead Graphite Developer – Chris Davis

34. Websites • ERMA Project : http://launchpad.net/erma • Graphite Project : http://launchpad.net/graphite

35. ? michael@ducy.org gopaczewski@orbitz.com

Editor's Notes

MD/GAO Hello, I'm Mike Ducy… Hello, I'm Greg Opaczewski, a Tech Lead at Orbitz Worldwide. I'm part of a development team named Operations Architecture. We develop site health and performance monitoring tools, primarily for operations teams, but also for development teams that need to know how their applications are performing in production.
GO I'm excited to say that two of the major technologies in the Orbitz monitoring platform are now open source software. I encourage you to check out the project sites on Launchpad at the URLs listed on the screen. We welcome any feedback you might have for the projects. We will display these URLs again at the end of the presentation.
MD SOA and distributed architectures create problems for administration, support and development teams. Instrumentation of the various applications can provide valuable insights into how they interact and can ease the administration headaches. Orbitz Worldwide (OWW) operates dozens of applications running on hundreds of servers connected in a multi-layered Jini network. In this kind of environment it can be difficult to obtain consistent, uniform instrumentation at all application process boundaries. It is also not ideal to require each and every application development team to become experts at leveraging an instrumentation API and monitoring tools.
GO The Orbitz technology platform is very large and the business has grown as well. Orbitz Worldwide operates websites around the world in over a dozen locales. These point-of-sale and internationalization variables can make service operations even more challenging because of the number of key metrics that must be monitored. In 2007, sales of over $10 billion in travel products depended on the health of our technology platform. Therefore, OWW has made substantial investments in technology to detect problems early and minimize mean-time-to-repair.
MD It can be difficult for a technology organization to commit to providing the level of application instrumentation required to effectively monitor availability, reliability and performance. In this presentation we will examine several myths and explain how our technology overcame them. We're huge fans of the Mythbusters show on the Discovery Channel; hopefully we have some fans in the audience as well.
GO Some believe that applying monitoring code has to be a time-consuming process. In many approaches, every method call has to be wrapped with instrumentation code. Additionally, standards need to be defined for how the instrumentation will be applied consistently across a system. This obviously requires additional effort on the part of the development teams as well as the technical leaders responsible for ensuring the standards are being followed. At Orbitz we've observed that instrumentation (and monitoring concerns in general) is often the last concern of developers. Naturally, most or all of the development cycle is spent producing code for new features.
We addressed this need to make the process of applying instrumentation simple by creating the Extremely Reusable Monitoring API (ERMA). ERMA consists of an API used for instrumenting Java applications and a library used to process the data produced by the instrumentation. This separation of concerns makes it easy for developers to apply the instrumentation without needing to be concerned with the details of how the data will be consumed.
GO Monitor objects in ERMA are Plain Old Java Objects (POJOs). To instrument a transaction, you construct a TransactionMonitor. Upon construction a stopwatch is started, which is used to measure latency. The code to be monitored is surrounded with try/catch/finally blocks. If the business code executes without exception, succeeded is invoked on the TransactionMonitor. However, if an exception is caught, it is recorded by calling the failedDueTo method. In the finally block, done is invoked in order to stop the stopwatch and pass the Monitor to the MonitoringEngine for processing. This is the handoff point to the processors implemented in the ERMA library.
GO So what I've shown you on the previous slide is ERMA applied explicitly, wrapped around the business logic by a developer. The API is simple enough to use on its own. But we wanted to make the application of monitoring even easier. So we have implemented several techniques of self-instrumentation in order to achieve monitoring of the business objects with a minimal amount of effort.
GO We use the Spring Framework throughout our system. Spring is a popular open-source framework in the Java development community. Spring MVC and Spring Web Flow (SWF) are used in the web application architecture. Both of these frameworks provide hooks that can be used for monitoring. For example, there is a HandlerInterceptor interface in Spring MVC that we've implemented and configured such that each and every web request is intercepted. ERMA is applied to these requests in a consistent and reusable manner. Spring Web Flow acts as the controller: it allows you to define flows between components such as actions and views in a webapp. Web Flow provides the FlowExecutionListener. By implementing this listener interface, we provide detailed metrics on how users are interacting with these flows in production.
GO Abstraction is another important technique we have used. Orbitz applications are networked together using Jini technology. Jini provides for dynamic service discovery and remote invocation. In a service oriented architecture, applications need a way to find out where the services they depend on are running. Jini provides this as well as the ability to add and remove services from the network seamlessly. We created the Orbitz Jini Framework (OJF) in order to abstract away the details of our Jini service network from end developers. The abstraction layer contains a FilterChain facility that we have leveraged for monitoring. ERMA filters are executed on both the client and server side for each and every request. Because OJF is a shared library used consistently across our system, all developers get monitoring of remote method calls for free.
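The lifecycle described in this note can be sketched with a simplified stand-in. This is not ERMA source code: `SketchMonitor`, `PROCESSED` and the demo class are hypothetical names chosen for illustration, and the static list only stands in for the handoff to a monitoring engine. The sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an ERMA-style monitor; not ERMA source.
class SketchMonitor {
    // Stand-in for the engine handoff: finished monitors are collected here.
    static final List<SketchMonitor> PROCESSED = new ArrayList<>();

    final String name;
    private final long startNanos;   // stopwatch starts at construction
    private long latencyNanos = -1;  // set when done() is called
    private boolean failed;
    private Throwable cause;

    SketchMonitor(String name) {
        this.name = name;
        this.startNanos = System.nanoTime();
    }

    void succeeded() { failed = false; }

    void failedDueTo(Throwable t) { failed = true; cause = t; }

    // Stops the stopwatch and hands the monitor off for processing.
    void done() {
        latencyNanos = System.nanoTime() - startNanos;
        PROCESSED.add(this);
    }

    boolean isFailed() { return failed; }
    Throwable getCause() { return cause; }
    long getLatencyNanos() { return latencyNanos; }
}

class SketchMonitorDemo {
    // Wraps a unit of work in the try/catch/finally pattern from the slide.
    static void purchase(boolean supplierFails) {
        SketchMonitor monitor = new SketchMonitor("HotelService.purchase");
        try {
            if (supplierFails) {
                throw new IllegalStateException("supplier unavailable");
            }
            monitor.succeeded();
        } catch (IllegalStateException e) {
            monitor.failedDueTo(e);
            throw e;
        } finally {
            monitor.done();
        }
    }

    public static void main(String[] args) {
        purchase(false);
        try {
            purchase(true);
        } catch (IllegalStateException expected) {
            // The finally block has already recorded the failed monitor.
        }
        for (SketchMonitor m : SketchMonitor.PROCESSED) {
            System.out.println(m.name + " failed=" + m.isFailed());
        }
    }
}
```

Because `done()` sits in the finally block, a monitor is handed off whether the call succeeds or throws, which is why both calls in `main` show up in the processed list.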
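A filter chain of the kind mentioned in this note might look roughly like the following. All names here (`CallFilter`, `CallTarget`, `CallChain`, `TimingFilter`) are hypothetical; the sketch does not reproduce the OJF FilterChain API, and it assumes failures surface as unchecked exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the OJF FilterChain API.
interface CallFilter {
    Object invoke(String method, Object arg, CallChain chain);
}

// The real service call that sits at the end of the chain.
interface CallTarget {
    Object call(String method, Object arg);
}

// Runs each filter in order, then the target once all filters have run.
class CallChain {
    private final List<CallFilter> filters;
    private final CallTarget target;
    private final int position;

    CallChain(List<CallFilter> filters, CallTarget target) {
        this(filters, target, 0);
    }

    private CallChain(List<CallFilter> filters, CallTarget target, int position) {
        this.filters = filters;
        this.target = target;
        this.position = position;
    }

    Object proceed(String method, Object arg) {
        if (position == filters.size()) {
            return target.call(method, arg);
        }
        CallChain rest = new CallChain(filters, target, position + 1);
        return filters.get(position).invoke(method, arg, rest);
    }
}

// Monitoring filter: times the call and records whether it succeeded.
class TimingFilter implements CallFilter {
    final List<String> events = new ArrayList<>();
    long lastLatencyNanos = -1;

    @Override
    public Object invoke(String method, Object arg, CallChain chain) {
        long start = System.nanoTime();
        try {
            Object result = chain.proceed(method, arg);
            events.add(method + " ok");
            return result;
        } catch (RuntimeException e) {
            events.add(method + " failed: " + e.getClass().getSimpleName());
            throw e;
        } finally {
            lastLatencyNanos = System.nanoTime() - start;
        }
    }
}
```

Since the filter lives in the shared chain rather than in each service, application code that calls `proceed` gets timing and failure recording without containing any monitoring code of its own.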
GO In the absence of hooks in the form of APIs that can be leveraged for monitoring, Aspect Oriented Programming (AOP) is another good option for providing reusable monitoring code. Spring provides integration with the popular AspectJ AOP framework. We have implemented an ERMA aspect that applies monitoring to all Action component invocations with just a few lines of reusable XML configuration. Spring creates a dynamic proxy for each Action object once at startup, and overhead at runtime is minimal as just one extra method invocation through the proxy is involved.
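The proxy mechanism mentioned in this note can be illustrated with the JDK's own `java.lang.reflect.Proxy`, without Spring or AspectJ. This is a sketch under that substitution, not the ERMA aspect itself; `SketchAction`, `MonitoringProxy` and `LOG` are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a web flow action interface.
interface SketchAction {
    String execute(String context);
}

// Wraps any interface implementation so that every call is recorded.
class MonitoringProxy implements InvocationHandler {
    // Stand-in for a monitoring engine; not thread-safe.
    static final List<String> LOG = new ArrayList<>();

    private final Object target;

    private MonitoringProxy(Object target) {
        this.target = target;
    }

    // Creates the proxy once; callers then use it like the original object.
    static <T> T wrap(T target, Class<T> iface) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new MonitoringProxy(target));
        return iface.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String name = method.getDeclaringClass().getSimpleName() + "." + method.getName();
        try {
            Object result = method.invoke(target, args);
            LOG.add(name + " succeeded");
            return result;
        } catch (InvocationTargetException e) {
            // Unwrap so callers see the exception the target actually threw.
            LOG.add(name + " failed: " + e.getCause().getClass().getSimpleName());
            throw e.getCause();
        }
    }
}
```

A caller would write something like `SketchAction monitored = MonitoringProxy.wrap(realAction, SketchAction.class);` once at startup, after which each call costs one extra reflective invocation through the proxy.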
GO As a result of these techniques for applying reusable instrumentation, a developer at Orbitz needs to spend almost no time at all to get basic monitoring coverage. The frameworks that we use were instrumented by a small group of platform developers, and many other development teams benefit without needing to spend any additional development time. So we have BUSTED this myth of no time for instrumentation.
MD From a standard ROI perspective, instrumentation does not provide real dollars back for the money invested in its development. The value provided is often in reduced downtime, a better understanding of code performance, a better understanding of code dependencies and system interactions, and opportunities to increase application performance and enhance the customer experience. While these enhancements can provide increased revenue in the long term, the return is not as immediate as that of implementing something like a new feature.
MD
MD ERMA-instrumented applications send monitoring data back to the Event Processor engine. The data is sent by a background thread in the instrumented application, which prevents latency from being added to incoming remote service calls. Because event processing is done outside the instrumented application, its cost does not add latency there either. The event processor aggregates and summarizes the various metrics, computing summary statistics (average, standard deviation, % fail, % success, etc.), and sends these metrics to Graphite for storage and visualization. The event processor is also capable of sending SNMP alarms when aggregated data points exceed certain thresholds (e.g. latency is high, or the rate of failures is high).
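The summary statistics named in this note can be computed incrementally, one event at a time. The following is a minimal sketch, assuming each event carries a latency in milliseconds and a failure flag; `MetricSummary` and its method names are hypothetical and do not come from the ERMA event processor.

```java
// Hypothetical aggregator; names are illustrative, not from the ERMA library.
class MetricSummary {
    private long count;
    private long failures;
    private double sum;
    private double sumOfSquares;

    // Folds one event into the running totals.
    void record(double latencyMillis, boolean failed) {
        count++;
        if (failed) {
            failures++;
        }
        sum += latencyMillis;
        sumOfSquares += latencyMillis * latencyMillis;
    }

    double average() {
        return count == 0 ? 0.0 : sum / count;
    }

    // Population standard deviation computed from the running totals.
    double standardDeviation() {
        if (count == 0) {
            return 0.0;
        }
        double mean = average();
        double variance = sumOfSquares / count - mean * mean;
        return Math.sqrt(Math.max(0.0, variance));
    }

    double percentFailed() {
        return count == 0 ? 0.0 : 100.0 * failures / count;
    }

    double percentSucceeded() {
        return count == 0 ? 0.0 : 100.0 - percentFailed();
    }

    // Threshold check of the kind that could trigger an alarm.
    boolean failureRateExceeds(double maxPercentFailed) {
        return percentFailed() > maxPercentFailed;
    }
}
```

Keeping only running totals means the aggregator needs constant memory per metric, at the cost of some numerical precision compared with a two-pass calculation.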
MD Graphite consists of several components. The 2 primary components are Carbon and the Web Application. The Carbon component is responsible for reading data into the system and storing it in fixed size database files (similar to RRD files). The web application then reads these files to graphically represent the data for the end user.
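The idea of a fixed-size, RRD-like store can be illustrated in memory. This sketch is not Graphite's file format or Carbon's code; `RoundRobinSeries` is a hypothetical class that only shows how a fixed number of slots can be reused as time advances.

```java
import java.util.Arrays;

// Hypothetical in-memory illustration of a fixed-size, round-robin series.
class RoundRobinSeries {
    private final int stepSeconds;
    private final long[] slotTimes;   // interval start stored in each slot, -1 if empty
    private final double[] values;

    RoundRobinSeries(int stepSeconds, int slots) {
        this.stepSeconds = stepSeconds;
        this.slotTimes = new long[slots];
        this.values = new double[slots];
        Arrays.fill(slotTimes, -1L);
    }

    // Rounds a timestamp down to the start of its interval.
    private long align(long epochSeconds) {
        return epochSeconds - (epochSeconds % stepSeconds);
    }

    private int slotFor(long alignedSeconds) {
        return (int) ((alignedSeconds / stepSeconds) % values.length);
    }

    // Stores a value, overwriting whatever older interval used the same slot.
    void write(long epochSeconds, double value) {
        long aligned = align(epochSeconds);
        int slot = slotFor(aligned);
        slotTimes[slot] = aligned;
        values[slot] = value;
    }

    // Returns NaN if the interval was never written or has been overwritten.
    double read(long epochSeconds) {
        long aligned = align(epochSeconds);
        int slot = slotFor(aligned);
        return slotTimes[slot] == aligned ? values[slot] : Double.NaN;
    }
}
```

The storage size is fixed by the slot count chosen at construction, so retention is a trade-off between resolution (`stepSeconds`) and how far back the series reaches.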
MD The Graphite composer interface allows you to browse the metrics available for reporting in a hierarchical tree. When a metric is selected, a graph of that metric's data is drawn in the composer interface. The user can manipulate the graph by selecting its size, the duration of the data to be graphed, and other elements.
MD The Graphite Command Line Interface allows a user to draw graphs in individual windows. These windows can be arranged and sized within the browser window. The window layout can also be saved, which allows a user to create a “dashboard” of commonly used graphs.
MD
MD Value is not in the instrumentation itself, but in the data that the instrumentation provides. Gartner estimates that an hour of downtime costs an organization $42,000 on average. Instrumentation data can help reduce the length of outages by making it easier for operators to locate the problem, both through SNMP alarms and through the tools used to visualize the data.
MD Instrumentation provides a Return On Investment by maximizing the ROI of the applications that are monitored.
GO Another myth that we'd like to address is the belief that instrumentation only causes bugs. The boilerplate code often used to apply instrumentation makes code harder to read and maintain. This has a direct effect on developer productivity. It also gives developers an argument for not adding the instrumentation at all. We use several techniques that allow our developers to avoid writing boilerplate code. We provide reusable, well tested instrumentation packaged in libraries and applied via hooks, abstraction and AOP as described previously.
GO Another good option to avoid boilerplate code with ERMA is Annotations. This feature, supported with Java 5 and above, applies instrumentation at build time and requires no ERMA code to be mixed with business code. The example shown here will apply an ERMA TransactionMonitor to each method in this service.
GO This example will wrap a TransactionMonitor around only the purchase method. Setting includeArguments to true will include the method parameters in the monitor object as an attribute. The nice thing about this approach is how cleanly the business code is separated from the monitoring: it is declarative monitoring rather than intrusive instrumentation.
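To show how an annotation can drive monitoring, here is a self-contained sketch. Note the substitution: the notes say ERMA applies annotation-based instrumentation at build time, whereas this sketch reads the annotation at runtime through a JDK dynamic proxy, which is simpler to demonstrate. `SketchMonitored`, `SketchHotelService` and `AnnotationMonitor` are hypothetical names, and the itinerary is reduced to a String.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical marker annotation, readable at runtime.
@Retention(RetentionPolicy.RUNTIME)
@Target({ ElementType.TYPE, ElementType.METHOD })
@interface SketchMonitored {
    boolean includeArguments() default false;
}

// Hypothetical service: only purchase is marked for monitoring.
interface SketchHotelService {
    @SketchMonitored(includeArguments = true)
    void purchase(String itinerary);

    void cancel(String itinerary);
}

class AnnotationMonitor {
    // Stand-in for a monitoring engine; not thread-safe.
    static final List<String> EVENTS = new ArrayList<>();

    // Returns a proxy that records calls to annotated methods only.
    static <T> T monitor(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            // A method-level annotation wins; otherwise fall back to the type.
            SketchMonitored marker = method.getAnnotation(SketchMonitored.class);
            if (marker == null) {
                marker = iface.getAnnotation(SketchMonitored.class);
            }
            if (marker != null) {
                String event = iface.getSimpleName() + "." + method.getName();
                if (marker.includeArguments()) {
                    event += Arrays.toString(args);
                }
                EVENTS.add(event);
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(proxy);
    }
}
```

With this arrangement the service interface carries only a declaration of intent; moving the annotation from the method to the interface would extend monitoring to every method without touching any implementation.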
MD Our use of ERMA has introduced very few bugs. In fact, far more bugs have been uncovered using the ERMA data.
MD Instrumentation data allows you to baseline your current application performance against historical data. You can also use instrumentation data to build theoretical models that help verify that testing tools are correctly measuring application performance.
MD Historical instrumentation data can be used to build models based on Multivariate Adaptive Statistical Filtering and Statistical Process Control. This allows you to determine if your current application is performing within historical bounds and if something has changed.
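A basic SPC-style check can be sketched as control limits derived from a historical baseline. This is an illustration only: it uses the common mean plus or minus a multiple of the sample standard deviation, does not attempt MASF's adaptive filtering, and `ControlLimits` is a hypothetical class rather than part of ERMA or Graphite.

```java
// Hypothetical SPC-style control limits built from historical samples.
class ControlLimits {
    final double mean;
    final double lower;
    final double upper;

    private ControlLimits(double mean, double lower, double upper) {
        this.mean = mean;
        this.lower = lower;
        this.upper = upper;
    }

    // Limits are mean +/- sigmas * sample standard deviation of the baseline.
    static ControlLimits fromBaseline(double[] baseline, double sigmas) {
        if (baseline.length < 2) {
            throw new IllegalArgumentException("need at least two baseline samples");
        }
        double sum = 0.0;
        for (double v : baseline) {
            sum += v;
        }
        double mean = sum / baseline.length;

        double squaredDeviations = 0.0;
        for (double v : baseline) {
            squaredDeviations += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(squaredDeviations / (baseline.length - 1));

        return new ControlLimits(mean, mean - sigmas * stdDev, mean + sigmas * stdDev);
    }

    // True when the observation falls inside the historical bounds.
    boolean isWithin(double observation) {
        return observation >= lower && observation <= upper;
    }
}
```

For example, limits built from the latencies of previous builds could be applied to the same metric in a new build, flagging any observation outside the bounds as a change worth investigating.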
GO For a large system there is a need to provide an abstraction for monitoring so that developers, operators and business analysts can all share the same language for describing system functionality. Example abstractions from our domain are “hotel search”, “air purchase”, “package selection”, etc. ERMA has some unique design features that enable detailed monitoring put in the context of these abstractions. ERMA assembles hierarchies of events transparently within its MonitoringEngine component. It does this by maintaining a stack of Monitors for each application thread. Whenever a new monitor is created during request processing, a parent-child relationship is introduced with the Monitor previously on top of the stack. At the completion of request processing, the result is a tree data structure that can be analyzed to find event patterns. Our Jini framework passes monitoring data back and forth, allowing these event patterns to span the boundaries of all applications involved in servicing a user request. As a result, we can accelerate root cause analysis by delivering alarms to our operations teams that contain both the low-level root cause of a problem and the impact on our customers.
GO What you are looking at here is an example of an ERMA event pattern captured from an air search request in our system. We present these patterns to the operator in such a way that it is obvious where an exception originated and how it bubbled up through the stack. Yellow represents any monitor that has recorded a failure and red represents the lowest-level failure. We use this information to zero in on the application and component that is contributing most significantly to a site issue. An example alarm that may be sent to our operations center based on this data would read “Air search is failing at 80% due to a market application DbPoolExhaustedException.” So it is very clear as to the top-level impact (air search is failing) and it points to the underlying issue (likely that there are no available database connections). Before we implemented this approach, an alarm would be generated for every failure in this pattern and our operators would be left trying to figure out the bigger picture. The improved alarms help ensure that the proper development resources are engaged quickly when support teams are troubleshooting production issues. They also help to prioritize action on alarm conditions by making clear the impact on our customers. ERMA patterns also enable you to drill down into latency metrics in order to see which components are contributing the most to latency. We are working on a user interface that will make it easier to visualize this kind of data, for example by generating dynamic UML sequence diagrams based on the runtime behavior of the system.
GO So in our experience many more bugs are uncovered using the data produced by instrumentation than are caused by it. These are bugs that would otherwise be difficult or impossible to diagnose. So the myth that instrumentation only causes bugs is busted.
MD ??? GO We have been pragmatic in the design of our monitoring platform and tools. We acknowledge that developers are focused on implementing new features and site improvements, so monitoring of all core metrics is already in place. These include JVM and machine-level statistics such as CPU, memory and threads; resource pools such as database connections; and network connections to external suppliers. Our frameworks contain monitoring of our business services and detailed monitoring of every request into the web application. Lastly, we have invested in tools that allow us to get tremendous value out of the instrumentation. These tools translate detailed metric data into an improved customer experience.
GAO I want to thank CMG for allowing us to share our story with everyone here. I too want to thank Neil Gunther again for all of his help in putting the paper and this presentation together.
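The per-thread stack described in this note can be sketched as follows. This is an illustration of the technique only, not the MonitoringEngine implementation; `EventNode` and `MonitorStack` are hypothetical names, and cross-process propagation is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical node in the event tree built during one request.
class EventNode {
    final String name;
    final List<EventNode> children = new ArrayList<>();
    boolean failed;

    EventNode(String name) {
        this.name = name;
    }
}

// Keeps one stack of open monitors per thread and links parents to children.
class MonitorStack {
    private static final ThreadLocal<Deque<EventNode>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Opens a monitor; the monitor currently on top of the stack becomes its parent.
    static EventNode begin(String name) {
        EventNode node = new EventNode(name);
        EventNode parent = STACK.get().peek();
        if (parent != null) {
            parent.children.add(node);
        }
        STACK.get().push(node);
        return node;
    }

    // Closes a monitor; monitors must finish in the reverse order they were opened.
    static void end(EventNode node, boolean failed) {
        if (STACK.get().peek() != node) {
            throw new IllegalStateException("monitors must finish in LIFO order");
        }
        STACK.get().pop();
        node.failed = failed;
    }

    // Number of monitors still open on the current thread.
    static int depth() {
        return STACK.get().size();
    }
}
```

Callers never pass a parent explicitly; nesting falls out of call order on the thread, which is what lets the hierarchy be assembled transparently.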
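Finding the lowest-level failure in such a pattern can be sketched as a walk down the failed branch of the tree. `PatternNode` and `describe` are hypothetical names, the alarm text is only modeled on the example quoted in this note, and the sketch follows the first failed child at each level rather than ranking multiple failed branches.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical node of a completed event pattern.
class PatternNode {
    final String name;
    final boolean failed;
    final List<PatternNode> children;

    PatternNode(String name, boolean failed, PatternNode... children) {
        this.name = name;
        this.failed = failed;
        this.children = Arrays.asList(children);
    }

    // Follows failed children downward; returns null if this node did not fail.
    PatternNode rootCause() {
        if (!failed) {
            return null;
        }
        for (PatternNode child : children) {
            PatternNode cause = child.rootCause();
            if (cause != null) {
                return cause;
            }
        }
        return this; // no failed child, so the failure originated here
    }

    // Combines the top-level impact with the lowest-level failure in one message.
    String describe() {
        PatternNode cause = rootCause();
        if (cause == null) {
            return name + " is healthy";
        }
        return name + " is failing due to " + cause.name;
    }
}
```

One message per pattern, built this way, replaces the separate alarm that each failed monitor in the chain would otherwise raise.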
MD Both ERMA and Graphite have been open sourced by Orbitz Worldwide and the teams welcome your feedback and contributions.
