2. Snyk is an open-source security company
We’re 3 years old, raised $32M total, engineering team of 26
SaaS offering on a NodeJS & Python microservices stack
Some context
3. My take on observability
Think about operating your
service
Care for all the lemmings
Care for the individual
lemming
4. With proper observability you have
Speed-of-light troubleshooting
Single source of truth for what happened in the system
Scientific approach to changes
5. How do we get there
Not cost-effective to start in a new code-base
Not cost-effective to start in a mature code-base
So… forever locked outside?
6. Logs to the rescue
Those write-once-when-debugging-then-forget strings
“Not sure what the problem is, added some logs, let’s see”
- Every developer, sometime in their professional lives
7. Step 0 - talk to your team
Is observability important to the team?
Does it fit your team’s methodology?
Definitely a team effort to get it right!
Our take - included in training, code reviews and oncall
8. Step 1 - where to keep your logs
Buy it if you can, build it if you must
Needs to serve end goal
Our angle - happy logz.io customers, pushing 15GB daily
9. Step 2 - start shipping your logs
11th of the 12 factors - don’t manage, just output
Choose a logging library
Adjust to indexing service
Our angle - fluentd daemonsets on a k8s cluster;
`bunyan` logging library with single-line JSONs
10. Step 3 - structure your logs
Decide on a few rules to make your logs behave
Use a context object for varying parameters
Add a constant label to identify the logged action
Use logging level as part of context
Special treatment for errors
11. Step 3 - structure your logs
logger.info({
temperature: measurement.temperature,
duration: Date.now() - startTime,
params: request.params,
}, 'Completed temperature measurement');
12. Our take -
Standard logged keys match common objects
Logging at specific checkpoints and on response
Logging level matches HTTP status code (2xx, 4xx, 5xx)
Reverse lookup from log to line of code using log label
Error message is the failure, log label is the action
Step 3 - structure your logs
13. Prevent sensitive data - it will leak!
Protect from size overflow
Your log library will become standard in your code-bases
Our angle - sanitising auth tokens and emails (:wave: GDPR)
huge logged objects halted our services with IO
Step 4 - protect your logs
14. 1 log per request
Collect ‘breadcrumbs’ during request handling
Log upon response with all collected context
Our angle - see https://github.com/snyk/koa2-bunyan-server
Step 5 - make logging easy
16. Skip logs when they carry little value
Sample logs with higher weight to errors
Constantly invest in team training and reviews
Share the joy with Customer Success and Sales Engineering
Our angle - training inside and outside of Engineering
Step 6 - watch out for scale
17. Align your team
Push logs to an external service
*Structure your logs* and sanitise them
Embed logging into your boilerplates
Reap the reward in how your team operates its software
Practical observability