7. So what's the difference between
Observability and monitoring?
8. > In the old days, monitoring was the Ops domain.
> Most of the time it was limited to checks:
whether the machines are up, the
network is OK, and the applications are
running.
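To make the "only checks" idea concrete, here is a minimal sketch of a classic up/down check. It only asks whether something is listening on a TCP port. The function name `is_port_open` and the host and port in the usage note are illustrative assumptions, not something from the original slides.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or unresolvable: treat all as "down".
        return False
```

For example, `is_port_open("127.0.0.1", 8080)` tells you that a process accepts connections on that port, and nothing about whether it answers correctly or quickly. That gap is where the rest of this talk goes.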
9. These days applications are complex and are deployed with a
distributed-systems mindset.
How can you monitor all of these systems when they run on more than 10
instances?
Today’s Challenge
Across regions, across datacenters
Can you observe all of these things?
What kind of output do these systems generate?
Will that output be useful for the organization?
14. Observability is not about the tools.
Tools can change over time.
It is about the people: the people who build the
system, the people who put their love into their
products.
15. Are we there yet?
Yes? Good!
If not,
how do we change that, and how do we get there?
16. Build an engineering culture in the organization that cares
about the business and its surroundings.
First
Convince the team and get them on board for this journey.
Change the mindset and break old habits to establish a
new culture.
17. Engineers must broaden their thinking and
help others be great at their jobs.
What mindset or culture are we talking about?
18. Let’s break it down
As an engineer:
What do I want to improve?
Can I measure what I want to build?
Is this the best way to achieve it?
Can other engineers benefit from what I want to build?
20. To get better at Observability, you need tools or solutions that fit your
needs.
Tooling
- Logging
- Performance Monitoring
- Metrics
- Debugging
- System Tracing
21. New Relic, Datadog, PagerDuty, Grafana, Mochajs, testify,
Ginkgo, CircleCI, Travis CI, GitLab CI/Runner, Docker, ELK,
Zipkin, Finagle, Jenkins, Spinnaker, ATLAS, Kubernetes, Istio,
Prometheus, Mesos, DC/OS, Hadoop, R, Spark, etc.
Tools
These are tools that engineers commonly use: commercial, open source, or both.
22. Always practice open communication within the org, adopt
DevOps, and embrace an engineering culture.
Final Thought
The big question will be…
How can BTPN bring you a new way of life?
Present condition: we are regarded as a pensioners' bank.
Some time ago, fortune.com published its Fortune Change The World list.
Remember Nagios and rrdtool?
- With today's technology, apps are containerized, and on top of building, testing, and deployment orchestration we have Kubernetes, Swarm, and Mesos adding layers to our workflow.
- Sometimes applications and services are deployed across regions. If you use AWS, GCP, or any other cloud provider, that can add more complexity to the system and workflows.
Clarity & transparency
As the developer and operations team, I want to know whether my service is working properly, and what data and logs the system and this service generate, through monitoring, logs, visualization, alerting, and tracing.
As a product manager, I want to know through metrics analysis whether a given feature is useful to my users.
As a stakeholder, I want my business to run well and to be able to see its overall performance.
For users, we want them to have a wonderful experience using our service.
Prevention
We know that things can and eventually will go wrong in our system. When they do, we want to know beforehand, or at least know where to look and how to fix it.
We set up alerting and on-call rotation and, at a more advanced level, self-healing.
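A minimal sketch of what "alerting and rotation" can look like in code, under stated assumptions: the rule fires when the error rate in a time window crosses a threshold, and the on-call engineer is picked by week number. `AlertRule`, `should_alert`, `on_call`, and the numbers used are all hypothetical; real alerting tools have their own rule formats.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    name: str
    max_error_rate: float  # fire when errors / requests exceeds this fraction
    min_requests: int      # ignore windows too small to be meaningful


def should_alert(rule: AlertRule, requests: int, errors: int) -> bool:
    """Decide whether one time window of traffic should trigger the alert."""
    if requests < rule.min_requests:
        return False
    return errors / requests > rule.max_error_rate


def on_call(rotation: list, week_number: int) -> str:
    """Pick who gets paged this week by cycling through the rotation."""
    return rotation[week_number % len(rotation)]
```

The `min_requests` guard is a design choice: one failed request out of two should not wake anyone up at night.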
So, how do we build Observability in our DevOps-ified organization?
It doesn’t necessarily matter whether we use one tool or another; the main focus is the people,
because it means breaking habits and changing the way we think.
Think of developers who are going to build a microservice. They need a set of tools that provides what they need and what others need. Before that, platform engineers should build a pre-configured, predictable, and repeatable environment for the application to run in; whether it runs in containers or on bare metal doesn't matter.
Instead of waiting for these scenarios and then trying to figure out how to monitor and solve them, our line of thought should be how to catch them as soon as they happen.
No matter who you are (data platform engineer, data engineer, developer), everyone is in the same boat, and they complement each other.
In an organization adopting DevOps, there are no silos; no one says "this is your s*, not mine, clean it up." Instead the thinking is broader than that: platform engineers help devs, devs help data engineers, and engineers help other departments, marketing for example.
When you want to build something, ask yourself and the team: get their feedback, and ask what they think you or the team should improve in the area we are focusing on.
Through this
Since tools change and evolve over time, it’s the concept that we need to grasp; then we find the tools that are best for us and our organization or company.
To get to the previous points (Clarity & transparency, Measurement and Prevention, Stability & Optimization, Data Insight), logging is one way for us to peek into the system. There are many tools: Jenius uses Splunk + ELK (Elasticsearch, Logstash, Kibana) for this; other companies might use other solutions such as Papertrail or ArcSight. The tools range from commercial to open source; the choice is yours.
Logs can be collected from many sources:
microservices, servers, build testing, integration testing, etc.
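When logs come from many sources, it helps if each line says where it came from in a machine-readable way. Below is a minimal sketch using Python's standard `logging` module that emits one JSON object per line. The `source` field and the `JsonFormatter` class are assumptions for illustration, not the format any particular log pipeline requires.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            # "source" is a custom field supplied through logging's `extra` argument.
            "source": getattr(record, "source", "unknown"),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

Usage could look like `make_logger("payments").info("build finished", extra={"source": "integration-test"})`. A collector can then filter on `source` without parsing free text.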
Performance monitoring
Application performance monitoring (APM) is an essential tool for gaining insight into how well your system runs.
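At its core, much of what an APM agent does is measure how long operations take. A rough sketch of that idea as a decorator follows. `LATENCIES`, `timed`, and `checkout` are made-up names, and a real APM product does far more (sampling, aggregation, shipping data to a backend).

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory store: operation name -> list of durations in seconds.
LATENCIES = defaultdict(list)


def timed(operation: str):
    """Decorator that records how long each call of the wrapped function takes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the call raises.
                LATENCIES[operation].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("checkout")
def checkout(cart):
    """Stand-in for real work: sum the prices in the cart."""
    return sum(cart)
```

After a few calls to `checkout(...)`, `LATENCIES["checkout"]` holds one duration per call, which is the raw material for averages and percentiles.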
Metrics
Even though metrics are considered a derivative of logging, I think they have become a discipline of their own. It’s not always about logging; things like events are now popular among engineers.
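To show how events can stand on their own rather than being parsed out of logs, here is a small sketch of an in-memory event counter. `EventCounter` and the event names are hypothetical; a metrics system would add timestamps, persistence, and export.

```python
from collections import Counter


class EventCounter:
    """Counts named events, optionally broken down by a label."""

    def __init__(self):
        self._counts = Counter()

    def emit(self, name: str, label: str = "") -> None:
        """Record one occurrence of an event."""
        self._counts[(name, label)] += 1

    def total(self, name: str) -> int:
        """Total occurrences of an event across all labels."""
        return sum(n for (event, _), n in self._counts.items() if event == name)

    def by_label(self, name: str) -> dict:
        """Occurrences of an event, grouped by label."""
        return {label: n for (event, label), n in self._counts.items() if event == name}
```

A product manager's question such as "is this feature used?" then becomes a lookup, for example `events.total("feature_used")`, instead of a search through log text.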
Debugging
Engineers need debuggable software, in the sense that the software itself tells you what it was doing. The instrumentation runs from the design process through development and coding until the software is live on the servers. It all has to make sense to the devs themselves and to other engineers.
Examples are Bugsnag and OverOps, which give devs clarity about what was going on at the code level, such as the stack trace.
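As a rough illustration of the kind of data such error-reporting tools collect, the sketch below turns an exception into a report containing the stack trace and some context. It uses only the Python standard library and is not the Bugsnag or OverOps API; `build_error_report` and `parse_amount` are made-up names.

```python
import traceback
from datetime import datetime, timezone


def build_error_report(exc: BaseException, context: dict = None) -> dict:
    """Turn an exception into a dictionary that could be sent to an error tracker."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        # Full stack trace as text, including the function where the error was raised.
        "stacktrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "context": context or {},
        "time": datetime.now(timezone.utc).isoformat(),
    }


def parse_amount(raw: str) -> int:
    """Stand-in for application code that can fail on bad input."""
    return int(raw)


try:
    parse_amount("12x")
except ValueError as exc:
    report = build_error_report(exc, {"field": "amount"})
```

The `context` dictionary is where the software "tells what it was doing": which field, which request, which user action led to the failure.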
System tracing
This is also considered a derivative of logs,