For most ecommerce companies, software is not the final deliverable product. It is a research tool, to determine what customers will pay for. To be able to get good data from software, monitoring and analytics must be built into the system. Alerting must come from business requirements and be based on application generated data.
In the traditional operations world, we monitor what is easy, and avoid monitoring that which is difficult. This talk is an attempt to show people that monitoring must be driven by metrics from the CxO office, and then potentially involve technical metrics if needed.
This talk explains why functional and business level monitoring is crucial. We also cover the tradeoffs from a DTAP model to continuous deployment. There will be a brief introduction to a couple of useful monitoring tools for functional monitoring. No special technical skills are expected of the audience, but having a general overview of the monitoring world is a good thing. This talk is not limited to ecommerce companies, but is most applicable to that environment.
Unleash Your Potential - Namagunga Girls Coding Club
OSMC 2015: Testing in Production by Devdas Bhagat
1. I have a test network. You may know it as
production.
– Andreas Thienemann
Testing
2. Asking the right questions
● Information is not a scarce resource, attention
is
– Herb Simon
3. Asking the right questions
● Information is not a scarce resource, attention
is
– Herb Simon
● What do you know?
4. Asking the right questions
● Information is not a scarce resource, attention
is
– Herb Simon
● What do you know?
● What do you not know?
5. Asking the right questions
● Information is not a scarce resource, attention
is
– Herb Simon
● What do you know?
● What do you not know?
● What do you not know that you do not know?
9. Fast Feedback Loops
● Test driven business
– It's like Test Driven
Design
● Enable IT to speak
the language of
business
10. Fast Feedback Loops
● Test driven business
– It's like Test Driven
Design
● Enable IT to speak
the language of
business
– Show me the data!
11. Fast Feedback Loops
● Test driven business
– It's like Test Driven
Design
● Enable IT to speak
the language of
business
– Show me the data!
– HIPPO
26. Testing vs Reality
● Stable environment
● No humans
● Low latency
● Highly unstable
● Humans
● Potentially high
latency
27. Testing vs Reality
● Stable environment
● No humans
● Low latency
● Not always the same
size of dataset
● Highly unstable
● Humans
● Potentially high
latency
● Large, ever changing
datasets
30. Users
● Humans do strange things
● Or sometimes make mistakes
● They come up with different requirements
31. Users
● Humans do strange things
● Or sometimes make mistakes
● They come up with different requirements
● They change the world your software works in
32. Risk management
● Approach 1:
– Scope your problems well
– Test a lot
– Release stable code
– Avoid changing a working system
33. Risk management
● Approach 2:
– Accept that you have an ill-defined problem
– Iterate rapidly
– Make a large number of small changes
– Build software to be able to isolate these changes
– Test them in the real world
– Keep only what works
34. What if When it breaks?
● Fix fast (maybe)
– “Do it right the first time” does not apply
● Business process for handling failure
– Hardware will eventually fail, software will
eventually work
36. The lifetime of code
● How long does your code live?
– Hours or days?
● This should be most code out there
37. The lifetime of code
● How long does your code live?
– Hours or days?
● This should be most code out there
– Months?
● A little code, often “libraries” with a single application
38. The lifetime of code
● How long does your code live?
– Hours or days?
● This should be most code out there
– Months?
● A little code, often “libraries” with a single application
– Years?
● Very little. Just core libraries
39. The lifetime of code
● How long does your code live?
– Hours or days?
● This should be most code out there
– Months?
● A little code, often “libraries” with a single application
– Years?
● Very little. Just core libraries
● It is safe to delete code, if you are using version
control
40. Event processing
● Generate information about software use and
changes in realtime
● For more information and tooling:
– https://www.quora.com/Are-there-any-open-source-
CEP-tools?share=1
– https://en.wikipedia.org/wiki/Complex_event_proces
sing
41. Event processing
● Generate information about software use and
changes in realtime
● For more information and tooling:
– https://www.quora.com/Are-there-any-open-source-
CEP-tools?share=1
– https://en.wikipedia.org/wiki/Complex_event_proces
sing
44. Monitoring/Alerting
● Process events to generate graphs
● Riemann is an excellent tool for generating
alerts from event streams
● Generate graphs as close to realtime as
possible
– Developers doing rollouts know that something else
is changing
45. Monitoring/Alerting
● Process events to generate graphs
● Riemann is an excellent tool for generating
alerts from event streams
● Generate graphs as close to realtime as
possible
– Developers doing rollouts know that something else
is changing
– Any major problems will be caught really quickly
46. Monitoring/Alerting
● Process events to generate graphs
● Riemann is an excellent tool for generating alerts
from event streams
● Generate graphs as close to realtime as possible
– Developers doing rollouts know that something else is
changing
– Any major problems will be caught really quickly
● Isolation of changes means that you can track
longer term effects of each change