ITB2015 - Monitoring and Tracking Your Web Applications
Application Performance Management
CFML and ColdBox
Darren Pywell / Joel Watson
● CTO at Intergral (The FusionReactor people…)
● 18 yrs CF experience (CF released 20 years ago!)
● Over 33 years in Software
● Worked in HP’s OpenView Network + System
Management Software Division before Intergral
● Background in Network and System Management
● Responsible for all Fusion(X) products
● Based in Stuttgart, Germany for last 25 years :-)
• The need for monitoring
• Gartner Application Performance Model
• Core APM
• When things go wrong
• World Premiere!
• Monitoring ProfileBox and FusionReactor
The Need for APM
Modern IT solutions need to be monitored and managed
in a complete, end-to-end manner
Detail remains important and has to be set into a
well-understood overall picture of system behavior
Five distinct dimensions of application performance
exist, each one complementary to the others
Gartner's APM Model
End-user experience monitoring
Runtime application architecture
Component deep-dive monitoring
UEM in Action
● Blocked Threads
Almost all stability issues relate to Blocked Threads eventually.
Caused by locks, synchronizers, resource waits, exhaustion
● Chain Reaction
Blocked threads on one server increase load on others. This
slows them down, causing more blocked threads...
● Integration Point
Exit points from the platform. Typical systems today may touch
8 or more on average. You're at the mercy of someone else...
● Cascade Failure
Occurs when problems in one layer cause problems in the
previous one. Cracks jump from system to system. Be paranoid
about integration and stay up even if they go down.
● Circuit Breaker
Protects callers by not calling if the Integration Point has failed.
Fast-fail when the breaker is open.
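A minimal sketch of the Circuit Breaker idea, for illustration only. The class name, the `max_failures` and `reset_after` parameters and the `CircuitOpenError` exception are assumptions made for this example, not part of FusionReactor, ColdBox or any other product mentioned in the talk.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after repeated failures, fast-fails while open."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None             # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fast-fail: do not touch the integration point at all.
                raise CircuitOpenError("integration point unavailable")
            # Cool-down has elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the breaker
            raise
        # A success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping every call to a remote service as `breaker.call(fetch_orders)` (where `fetch_orders` stands for any function that touches an Integration Point) means callers get an immediate error while the service is down instead of piling up as blocked threads.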
● Steady State
System must run without you touching it. Anything that grows a
resource (DB, files) must have something that cleans it up. Use
caching to maintain performance.
● Bulkheads
Partitions capacity to preserve functionality. Use pools to protect
● Timeouts
Use timeouts to prevent integration points becoming blocked
threads. Consider (delayed) retries.
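The timeout and delayed-retry advice can be sketched roughly as follows. The helper name `call_with_timeout` and its parameters are hypothetical, chosen for this example. The sketch only frees the caller: the worker thread keeps running until the slow call returns, so in practice it is usually better to also set the timeout options of the client library in use.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_timeout(pool, func, timeout, retries=2, delay=0.5, sleep=time.sleep):
    """Run func on the pool, waiting at most `timeout` seconds per attempt.

    Failed attempts are retried after a growing delay (delay, 2*delay, ...).
    The last error is re-raised once all attempts are used up.
    """
    last_error = None
    for attempt in range(retries + 1):
        future = pool.submit(func)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout as exc:
            # The caller stops waiting; cancel() only helps if the task
            # has not started running yet.
            future.cancel()
            last_error = exc
        except Exception as exc:
            last_error = exc
        if attempt < retries:
            sleep(delay * (2 ** attempt))  # delayed retry with simple backoff
    raise last_error
```

A bounded `ThreadPoolExecutor(max_workers=...)` dedicated to one integration point also caps how many threads a slow dependency can tie up.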
When things go wrong
• Avoid Blame!!!
• Reduce Service instead of Outage
• Monitor and Gather Data
• Mean Time to Restore Service (MTRS)
• Always generate a test for every bug you find
• Tools are critical (ProfileBox)
• How can you debug a production problem?
Unattended Production Debugging
What if you could…
debug when you’re not there?
safely debug a production system?
fix a problem without changing code?
Now you can!!!
Thanks for listening...
More information on: