In pre-production, there are lots of tools that help you optimize your code: debuggers, CI/CD, load tests, etc. There are even tools that automatically deploy it into production. Plus, engineers usually have a whole sprint as a time frame.

Once you’re in production, things are a bit different, and none of that convenience is there for operators. They need to pinpoint trouble spots within minutes. They have to identify the handful of bad requests out of thousands that allow the problem to be reproduced. And then they have to hand all that information over to the developers as conveniently and as quickly as possible. None of that is automated.

Performance monitoring, call tracing, and visualization are concepts every developer should know in order to provide as much insight as possible into running systems. This session introduces open-source tools that allow devs and ops to work together much more closely. To name just a few:

* statsd / collectd
* Zipkin
* Spring Cloud Sleuth
* and some more

For the sake of completeness, and to cover enterprise users as well, the main commercial vendors in that space will also be mentioned briefly. After the session, you’ll have new ideas popping up in your head and all the knowledge you need to jump directly into planning and implementation.
Pushing the hassle from production to developers. Easily
1. DEV
OPS
Push the hassle
from production
to developers.
Easily
DevOpsDays Ghent 2016
October 27th
@MartinGoodwell
Dynatrace
2. About me
Passionate about life,
technology, and the people
behind both of them.
• Started with Commodore 8-bit (VC-20 and C-64)
• Built Null-modem connections for playing Doom and WarCraft
• Built IPX/SPX networks between MS-DOS 5.0 and Windows 3.1
• Did DevOps before it was called that (mainly Java and web)
for about 10 years
• Now at Dynatrace Innovation Lab
• Tech Lead for Microsoft Technologies
and Software Architecture
• Find me on Twitter: @MartinGoodwell
3. Agenda
• The Rules
• Warm-up
• The Ops dilemma (I call it that)
• The second Ops dilemma
• On Monitoring ...
• ... and Logging
• ... and Call Tracing
• ... and Databases
• Commercial offerings
4. The Rules
• Please, ask or interrupt anytime
• But keep ideas for open space discussions
• Or track me down whenever I'm around
5. Warm up
• What's your occupation?
• Dev, Ops, BinExec?
• What's your technology stack?
• Node.js
• Go
• Java
• .NET
• Which of you do
• Monitoring
• Logging
• Call-Tracing
• Application performance management/monitoring (APM)
6. The Ops dilemma (1)
Dev
• Single transaction
• Deal with a specific problem
• No impact on real users and business
• Concentrate on single component
• Deadlines refer to sprints
• Weeks, usually
Ops
• 100s or 1000s of txns
• No idea what the cause is
• Real user impact
• Lots of moving parts
• Deadlines usually mean SLAs
• Hours, maybe just minutes
9. The Ops dilemma (2)
Automation
• Continuous {Integration/Deployment/Delivery} pipeline
• triggering unit tests for fast feedback
• Build servers
• Repositories
• Automatic deployments
• Helps devs get stuff into production
• Does nothing for the opposite direction
10. DevOps is about collaboration.
Collaboration requires documentation.
Automation is implicit documentation.
But there is no automation for
supporting Ops with troubleshooting.
12. Host metrics
• CPU usage
• Memory usage
• Disk I/O
• Network performance
• No insight into the app's problems and performance
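To make the limitation concrete, here is a minimal sketch of what a host-metrics probe typically sees, using only the Python standard library. The function name `host_snapshot` and the chosen fields are assumptions made for illustration; a real agent such as collectd gathers far more (memory and network figures are left out here because the standard library does not expose them portably).

```python
import os
import shutil
import time


def host_snapshot(path=os.path.abspath(os.sep)):
    """Collect a few host-level numbers for the filesystem containing `path`."""
    disk = shutil.disk_usage(path)
    snapshot = {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "disk_used_fraction": disk.used / disk.total,
    }
    # os.getloadavg() only exists on Unix-like systems, so guard the call.
    if hasattr(os, "getloadavg"):
        snapshot["load_1min"] = os.getloadavg()[0]
    return snapshot


if __name__ == "__main__":
    print(host_snapshot())
```

Nothing in this snapshot says which request was slow or which component failed, which is exactly the gap the slide points out.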
16. Downsides?
• "Polluting" business logic with monitoring code
• Code introspection (e.g. AOP) requires advanced skills
• Not using something like statsd leads to cluttered metrics
• Great for single component insight
• what about called 3rd parties?
• what about microservices (i.e. distributed transactions)?
• what about calls to databases, queues, etc.?
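One way to limit the "pollution" mentioned above is to keep the timing code in a decorator instead of inside the business logic. The sketch below sends timings in the statsd line format (`name:value|ms`) over UDP. The address, the metric name `checkout.place_order`, and the helpers `format_timing`, `send_metric`, and `timed` are all assumptions for illustration, not part of the talk or of any statsd client library.

```python
import functools
import socket
import time

# Assumption: a statsd agent listens locally on the conventional UDP port 8125.
STATSD_ADDR = ("127.0.0.1", 8125)

_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def format_timing(name, millis):
    """Build a statsd timing line, e.g. 'checkout.place_order:12|ms'."""
    return f"{name}:{millis}|ms"


def send_metric(line, addr=STATSD_ADDR):
    """Fire-and-forget UDP send; monitoring must never break the request."""
    try:
        _sock.sendto(line.encode("ascii"), addr)
    except OSError:
        pass


def timed(name):
    """Decorator that reports the wrapped function's duration in milliseconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, so failures are timed too.
                elapsed_ms = int((time.perf_counter() - start) * 1000)
                send_metric(format_timing(name, elapsed_ms))
        return wrapper
    return decorator


@timed("checkout.place_order")  # hypothetical metric name
def place_order(order_id):
    # The business logic stays free of monitoring code.
    return f"order {order_id} placed"
```

This still only covers a single component: the decorator knows nothing about what happens inside called third parties, other services, or the database, which is where call tracing comes in.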
21. Logging learnings
• Use a logging server (e.g. the ELK stack)
• directly log as JSON
• at least store as JSON
• Using logging for monitoring is expensive
• log analysis is a real resource hog
• works great for troubleshooting
• works great with limited problem scope
• For Java, use Logback via SLF4J
• to local logfiles
• to logstash
• to syslog
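The slide recommends Logback via SLF4J for Java. The same "log directly as JSON" idea can be sketched in Python with only the standard `logging` module, as a rough illustration rather than an equivalent of the Java setup. The class `JsonFormatter`, the helper `build_logger`, and the field names are assumptions chosen for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]  # replace any handlers from earlier setup
    logger.propagate = False     # avoid duplicate output via the root logger
    return logger


if __name__ == "__main__":
    log = build_logger("checkout", sys.stdout)
    log.info("order %s placed", 42)
```

Because every line is already a JSON object, a log shipper can forward it to a logging server without a separate parsing step. Swapping the stream handler for a file or syslog handler changes the destination without touching the formatter.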