Pushing the hassle from production to developers. Easily

DevOpsDays Ghent 2016
October 27th
@MartinGoodwell
Dynatrace

About me
Passionate about life, technology, and the people behind both of them.
• Started with Commodore 8-bit (VC-20 and C-64)
• Built null-modem connections for playing Doom and WarCraft
• Built IPX/SPX networks between MS-DOS 5.0 and Windows 3.1
• Did DevOps before it was called that (mainly Java and web), for about 10 years
• Now at Dynatrace Innovation Lab
• Tech Lead for Microsoft Technologies and Software Architecture
• Find me on Twitter: @MartinGoodwell

Agenda
• The Rules
• Warm-up
• The Ops dilemma (I call it that)
• The second Ops dilemma
• On Monitoring ...
• ... and Logging
• ... and Call Tracing
• ... and Databases
• Commercial offerings

The Rules
• Please ask or interrupt at any time
• But save ideas for the open space discussions
• Or track me down any time I'm around

Warm-up
• What's your occupation?
  • Dev, Ops, BinExec?
• What's your technology stack?
  • Node.js
  • Go
  • Java
  • .NET
• Which of you do
  • Monitoring
  • Logging
  • Call tracing
  • Application performance management/monitoring (APM)

The Ops dilemma (1)
Dev
• Single transaction
• Deal with a specific problem
• No impact on real users and business
• Concentrate on a single component
• Deadlines refer to sprints
  • Weeks, usually
Ops
• Hundreds or thousands of transactions
• No idea what the cause is
• Real user impact
• Lots of moving parts
• Deadlines usually mean SLAs
  • Hours, maybe just minutes

The Ops dilemma (2)
Automation
• Continuous {Integration/Deployment/Delivery} pipeline
• triggering unit tests for fast feedback
• Build servers
• Repositories
• Automatic deployments
• Helps devs get stuff into production
• Does nothing for the opposite direction

DevOps is about collaboration.
Collaboration requires documentation.
Automation is implicit documentation.
But there is no automation for supporting Ops with troubleshooting.

Monitoring

Host metrics
• CPU usage
• Memory usage
• Disk I/O
• Network performance
• No insight into the app's problems and performance

In your code

Use statsd

statsd real quick
http://www.slideshare.net/DatadogSlides/dev-opsdays-tokyo2013effectivestatsdmonitoring
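The statsd idea can be sketched in a few lines: the application formats a plain-text metric line (`<name>:<value>|<type>`, where `c` is a counter, `ms` a timer, and `g` a gauge) and fires it at a daemon over UDP. The metric names, the helper functions, and the default host and port below are assumptions made for illustration; this is a minimal sketch, not a full client.

```python
import socket


def format_metric(name: str, value: float, metric_type: str) -> str:
    """Build one statsd line: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; monitoring must never break the app."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(line.encode("ascii"), (host, port))
    except OSError:
        # No daemon or no network: drop the metric silently.
        pass


# Hypothetical metric names, chosen for illustration only.
counter_line = format_metric("shop.checkout.orders", 1, "c")     # counter
timer_line = format_metric("shop.checkout.duration", 320, "ms")  # timer
gauge_line = format_metric("shop.cart.open", 42, "g")            # gauge

for line in (counter_line, timer_line, gauge_line):
    send_metric(line)
```

Because UDP is connectionless, the calling code does not wait for the daemon, which is a large part of why this style of instrumentation is cheap to add.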

Downsides?
• "Polluting" business logic with monitoring code
• Code introspection (e.g., AOP) requires advanced skills
• Not using something like statsd leads to cluttered metrics
• Great for single-component insight
  • what about called 3rd parties?
  • what about microservices (i.e., distributed transactions)?
  • what about calls to databases, queues, etc.?
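One way to limit the "pollution" of business logic is to move the timing code into a wrapper, which is roughly what AOP does in Java. The sketch below uses a Python decorator; `timed`, `emit`, and the metric name are hypothetical, and `emit` only appends to a list where a real setup would forward to something like statsd.

```python
import functools
import time

recorded = []  # stand-in sink; a real setup would forward to a metrics daemon


def emit(name, millis):
    """Record one timing sample (hypothetical helper)."""
    recorded.append((name, millis))


def timed(metric_name):
    """Decorator that reports how long the wrapped function took."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions alike.
                emit(metric_name, (time.perf_counter() - start) * 1000.0)
        return wrapper
    return decorator


@timed("orders.total_price")
def total_price(items):
    # Pure business logic: no monitoring code in the body.
    return sum(price * qty for price, qty in items)


result = total_price([(10.0, 2), (5.0, 1)])
```

The function body stays free of monitoring calls, but the limits from the slide still apply: the wrapper only sees this one component, not the third parties, services, or databases it calls.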

Logging

http://theburningmonk.com/2015/05/a-consistent-approach-to-track-correlation-ids-through-microservices/

Logging learnings
• Use a logging server (e.g., the ELK stack)
  • directly log as JSON
  • at least store as JSON
• Using logging for monitoring is expensive
  • log analysis is a real resource hog
  • works great for troubleshooting
  • works great with limited problem scope
• For Java, use Logback via SLF4J
  • to local logfiles
  • to Logstash
  • to syslog
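Logging directly as JSON and carrying a correlation ID through every entry can be sketched with Python's standard `logging` module (the talk itself recommends Logback via SLF4J for Java; this is the same idea in another language). The field names, the logger name, and the `JsonFormatter` class are assumptions for illustration, and the output goes to an in-memory stream where a real setup would ship to a log server.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set via the `extra` argument; None if the caller omitted it.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


stream = io.StringIO()  # stand-in for a file, syslog, or log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")  # hypothetical component name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# One ID per incoming request, passed along to every downstream call.
correlation_id = str(uuid.uuid4())
logger.info("order accepted", extra={"correlation_id": correlation_id})

entry = json.loads(stream.getvalue())
```

With every service writing the same `correlation_id` field, a single search in the log server can pull together all entries that belong to one request.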

Call Tracing

Google Dapper paper
• The Dapper paper (2010)
  http://research.google.com/pubs/archive/36356.pdf
• OpenTracing
  http://opentracing.io/documentation/
• OpenZipkin (by Twitter)
  http://zipkin.io/
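The core idea shared by Dapper-style tracers is small: every request gets a trace ID, every unit of work gets a span ID, and each span remembers its parent. The sketch below shows how such a context could travel between services in HTTP headers. The header names follow Zipkin's B3 propagation convention; the `SpanContext` class and the helper functions are hypothetical and not part of any tracing library.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str                         # shared by every span of one request
    span_id: str                          # identifies this unit of work
    parent_span_id: Optional[str] = None  # None for the root span


def new_id() -> str:
    """Random 64-bit identifier as 16 hex characters."""
    return secrets.token_hex(8)


def start_trace() -> SpanContext:
    return SpanContext(trace_id=new_id(), span_id=new_id())


def child_of(parent: SpanContext) -> SpanContext:
    """New span in the same trace, pointing back at its parent."""
    return SpanContext(parent.trace_id, new_id(), parent.span_id)


def to_headers(ctx: SpanContext) -> Dict[str, str]:
    headers = {"X-B3-TraceId": ctx.trace_id, "X-B3-SpanId": ctx.span_id}
    if ctx.parent_span_id is not None:
        headers["X-B3-ParentSpanId"] = ctx.parent_span_id
    return headers


def from_headers(headers: Dict[str, str]) -> SpanContext:
    return SpanContext(
        headers["X-B3-TraceId"],
        headers["X-B3-SpanId"],
        headers.get("X-B3-ParentSpanId"),
    )


root = start_trace()              # created where the request enters
outgoing = child_of(root)         # span for a call to another service
headers = to_headers(outgoing)    # attached to the outgoing HTTP request
received = from_headers(headers)  # rebuilt on the receiving side
```

A collector that receives spans from all services can then rebuild the call tree by grouping on the trace ID and following the parent links.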

Zipkin architecture
http://zipkin.io/pages/architecture.html

https://github.com/openzipkin/zipkin

http://zipkin.io/

https://github.com/ordina-jworks/microservices-dashboard

https://github.com/spring-cloud/spring-cloud-sleuth
Spring Cloud Sleuth is a distributed tracing solution on top of Spring Cloud

http://trace.risingstack.com

Databases

Getting database insight
• Database automation
  • e.g., DbMaintain
  • https://dbmaintain.github.io/
• Database performance logging
  • log4jdbc
  • https://github.com/arthurblake/log4jdbc
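Database performance logging boils down to wrapping each statement with a timer and writing the SQL plus its duration to a log. The sketch below illustrates that idea with Python's built-in `sqlite3`; it is not log4jdbc and does not use its API. The `timed_query` helper, the table, and the logger name are assumptions for illustration.

```python
import logging
import sqlite3
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sql.timing")  # hypothetical logger name

timings = []  # (sql, elapsed milliseconds), kept for later inspection


def timed_query(conn, sql, params=()):
    """Run a query, then log the statement and how long it took."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    timings.append((sql, elapsed_ms))
    log.info("%.2f ms  %s", elapsed_ms, sql)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (25.5,)])

rows = timed_query(conn, "SELECT id, total FROM orders WHERE total > ?", (20,))
```

Logging the parameterized statement rather than the bound values keeps the log entries groupable by query shape, which makes slow statements easier to spot.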

The commercial hood

Broad technology support

Zero-conf and ready to run dashboards

Method level insight for code and database

Host, process and network metrics

Call-tracing across technologies

Including log analytics

Full Docker insight (zero-conf)

Dedicated support for most important technologies

Automated baselining, root-cause analysis, and problem correlation

You can't fight in here, Gentlemen.
This is the war room!

Let's talk
• From monolith to microservices
• Cloud migration
• Performance optimization
• Team culture
martin.goodwell@dynatrace.com

Thank you!
