SlideShare a Scribd company logo
1 of 52
The Art of Container Monitoring
Derek Chen
2016.9.22
Copyright 2016 Trend Micro Inc.2
About me
• DevOps Engineer at Trend Micro
– Agile transformation
– Micro service and cloud service
– Docker integration
– Monitoring system development
• Automate all the things
• Make everything smoother
• Find me at derekhound@gmail.com
Copyright 2016 Trend Micro Inc.3
• We want to know when things go wrong
• We want to know when things aren’t quite right
• We want to know in advance of problems
Why monitoring?
Microscope?
Magnifier?
Telescope?
Copyright 2016 Trend Micro Inc.4
Blackbox vs Whitebox
• Blackbox
– Requires no participation of the monitored system
– Observes external functionality, “what the user see”
– End to end test
• Whitebox
– Collects data internally provided by the target system
– Has more granular information about the system
– Can provide warning of problems before they occur
Copyright 2016 Trend Micro Inc.5
Blackbox tests measure…
• Can you ping the server
• Can you fetch a page from the server
• Does it have the correct contents
Whitebox monitoring collects…
• Network statistics (packets/bytes sent/received)
• System load (cpu, memory, disk usage)
• Application statistics (connection count, query per second)
Copyright 2016 Trend Micro Inc.6
Goal
Operating System
Application
Business Logic
Metrics
Events
Logs
Router
Destinations
Store
Visualize
Alert
Monitoring Framework
• Be metric, event, and log
• Allow us to easily visualize the
stat of our environment
• Provide contextual and useful
notification.
• Focus on whitebox monitoring
instead of blackbox monitoring
[image] https://www.artofmonitoring.com
Copyright 2016 Trend Micro Inc.7
• The practice of system admin is changing quickly
– DevOps
– Infrastructure as Code
– Visualization
– Cloud Platform
• Commercial solution can’t keep up!
Why Open Source Software Monitoring?
[image] http://devops.com/2014/05/05/meet-infrastructure-code
Copyright 2016 Trend Micro Inc.8
Monolithic
• If all-in-one gets the job done, then great
• Good for smaller scale, non-tech-focused
companies
Modular
• DevOps requires flexibility and innovation
• Good for tech driven and ops-focused
companies
Metric
We’ll rely most heavily on metrics to help us understand
what’s going on in our environment.
Copyright 2016 Trend Micro Inc.10
Metric
• Provide a dynamic, real-time picture of the state of
your infrastructure
Collect Store Visualize
Operating System
Application
Business Logic
Copyright 2016 Trend Micro Inc.11
Pull Mode
A central collector periodically
requests metrics from each
monitored system
Scrape
Metrics
Server #1
Service
Server #2
Service
Server #3
Service
Monitoring
Push Mode
Metrics are periodically sent
by each monitored system to a
central collector
Server #1
Agent
Server #2
Agent
Server #3
Agent
Monitoring
Push
Metrics
Collectd
The system statistics collection daemon
Copyright 2016 Trend Micro Inc.13
Database
Host Machine OS Metric
K8S Minion
Collectd
K8S Master
Collectd
Kubernetes Cluster
Elasticsearch
Monitor
GrafanaInfluxDB
Collectd Collectd
K8S Minion
Collectd
cAdvisor
Collects, aggregates, processes, and exports information
about running containers.
Copyright 2016 Trend Micro Inc.15
Database
Container OS Metric
K8S Minion
cAdvisor
Heapster
K8S Master
API Server
cAdvisor
Kubernetes Cluster
Elasticsearch
Monitor
GrafanaInfluxDB
K8S Minion
cAdvisor
Copyright 2016 Trend Micro Inc.16
http://127.0.0.1:4194
http://127.0.0.1:4194/metrics
Telegraf
Collects time series data from a variety of sources
Copyright 2016 Trend Micro Inc.18
Database
Application Metric – Pull Mode
K8S Minion
K8S Master
API Server
Kubernetes Cluster
Elasticsearch
Monitor
GrafanaInfluxDB
NginxPostgres ES
Telegraf
K8S Minion
Redis
Copyright 2016 Trend Micro Inc.19
Database
Application Metric – Push Mode
K8S Minion
Telegraf
K8S Master
API Server
Kubernetes Cluster
Elasticsearch
Monitor
GrafanaInfluxDB
Nginx
Telegraf
Postgres
Telegraf
ES
K8S Minion
Telegraf
Redis
Pod Pod
Copyright 2016 Trend Micro Inc.20
Telegraf
• An agent written in Go
• Easy to contribute or develop your plugins
• Supported Input Plugins
– System: cpu, mem, net, disk, processes and etc.
– Application: docker, nginx, postgresql, redis, and etc.
– Third party api: aws cloudwatch
• Supported output plugins
– influxdb, graphite, cloudwatch, datadog and etc.
Copyright 2016 Trend Micro Inc.21
Pull Mode
• Collector discoveries services
periodically
• Collector needs to talk to target
services
– Overlay network. Ex: Flannel,
Weave, etc.
– Forward proxy
Scrape
Metrics
Server #1
Service
Server #2
Service
Server #3
Service
Monitoring
Copyright 2016 Trend Micro Inc.22
Push Mode
• Agent sends metrics as soon as it
starts up
• Suitable for short-lived containers
or dynamic environments
Server #1
Agent
Server #2
Agent
Server #3
Agent
Monitoring
Push
Metrics
InfluxDB
Time series database stores all the metrics
Copyright 2016 Trend Micro Inc.24
• Similar scope to Graphite
• Written in Go
• No external dependency
• Ready for billions row
• Several client libraries
• SQL style queries
Copyright 2016 Trend Micro Inc.25
InfluxDB Schema
• Measurements (e.g. cpu, mem, disk, net, nginx)
• Timestamp (nano second)
• Tags (e.g. app=atom-apigw namespace=alpha)
• Fields (e.g. accepts=30925, active=1, handled=30925)
Copyright 2016 Trend Micro Inc.26
InfluxDB Schema
• Measurements (e.g. cpu, mem, disk, net, nginx)
• Timestamp (nano second)
• Tags (e.g. app=atom-apigw namespace=alpha)
• Fields (e.g. accepts=30925, active=1, handled=30925)
Copyright 2016 Trend Micro Inc.27
InfluxDB Schema
• Measurements (e.g. cpu, mem, disk, net, nginx)
• Timestamp (nano second)
• Tags (e.g. app=atom-apigw namespace=alpha)
• Fields (e.g. accepts=30925, active=1, handled=30925)
Copyright 2016 Trend Micro Inc.28
InfluxDB Schema
• Measurements (e.g. cpu, mem, disk, net, nginx)
• Timestamp (nano second)
• Tags (e.g. app=atom-apigw namespace=alpha)
• Fields (e.g. accepts=30925, active=1, handled=30925)
Copyright 2016 Trend Micro Inc.29
InfluxDB Schema
• Measurements (e.g. cpu, mem, disk, net, nginx)
• Timestamp (nano second)
• Tags (e.g. app=atom-apigw namespace=alpha)
• Fields (e.g. accepts=30925, active=1, handled=30925)
Grafana
Querying and visualizing time series and metrics
Finally! What we can to see!
Less talk more demo…
Copyright 2016 Trend Micro Inc.32
Copyright 2016 Trend Micro Inc.35
Copyright 2016 Trend Micro Inc.37
Event
We’ll generally use events to let us know about changes
and occurrences in our environment.
Icinga2
A monitoring system which checks the availability of
your resources, notifies users of outages
Copyright 2016 Trend Micro Inc.40
• Originally forked from Nagios in 2009
• Independent version Icinga2 since 2014
• Monitors everything
• Gathering status
• Collect performance data
/use/lib/nagios/plugins/plugin_name
Any program which returns
• 0 – OK
• 1 – WARNING
• 2 – CRITICAL
Message to STDOUT
Copyright 2016 Trend Micro Inc.41
Database
Event Monitoring
K8S Minion
K8S Master
API Server
Kubernetes Cluster
Elasticsearch
Monitor
GrafanaInfluxDBIcinga2
NginxPostgres ES
K8S Minion
Redis
Copyright 2016 Trend Micro Inc.43
Copyright 2016 Trend Micro Inc.44
Host
Pod
Copyright 2016 Trend Micro Inc.45
Server Monitoring
• About changes and occurrences in our environment
– Is cpu load too high?
– Is memory not enough?
– Is docker engine still alive?
External Monitoring
• End to end test from user’s perspective
• Can you connect ssh port 22?
• Can you browse web page?
• Can you request RESTful API successfully?
Dashboard
• Provides a clear view for current environment
Alerting
• Notifies using email, slack, pager and etc.
• Notification escalation
Others…
• Distributed Monitoring
• Reloads config without interrupt checks
Log
Logs are a subset of events. They’re often most useful
for fault diagnosis and investigation
Copyright 2016 Trend Micro Inc.48
• EFK stack is a mature logging solution
EFK
Shipping all of your log
to where it should go
The main part to store your
data with high availability
Visualize will power your data.
To know more about its value.
Summary
Copyright 2016 Trend Micro Inc.50
Collect Store Visualize
Monitoring Service Stack
Alert
InfluxDB
Elasticserach
Grafana
Kibana
Icinga2Collectd
Telegraf Fluentd
cAdvisor
Copyright 2016 Trend Micro Inc.51
Last but Not least
• Data Flow
– Blackbox vs Whitebox
– Pull Mode vs Push Mode
• Data Content
– Opeating System, Application,
Business Logic
• Data Type
– Metric, Events, Logs
Operating System
Application
Business Logic
Metrics
Events
Logs
Router
Destinations
Store
Visualize
Alert
Monitoring Framwork
[image] https://www.artofmonitoring.com
THANKS!
Any Questions?

More Related Content

What's hot

Learning Solidity
Learning SolidityLearning Solidity
Learning SolidityArnold Pham
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
DevSecOps at Agile 2019
DevSecOps at   Agile 2019 DevSecOps at   Agile 2019
DevSecOps at Agile 2019 Elizabeth Ayer
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...Splunk
 
Darktrace Proof of Value
Darktrace Proof of ValueDarktrace Proof of Value
Darktrace Proof of ValueDarktrace
 
Metrics that Matters in Software Engineering
Metrics that Matters in Software EngineeringMetrics that Matters in Software Engineering
Metrics that Matters in Software EngineeringPanji Gautama
 
cybercrimeandsecurityppt-140210064917-phpapp02.pptx
cybercrimeandsecurityppt-140210064917-phpapp02.pptxcybercrimeandsecurityppt-140210064917-phpapp02.pptx
cybercrimeandsecurityppt-140210064917-phpapp02.pptxRahulDhanware2
 
Token Engineering from an Economic Perspective
Token Engineering from an Economic Perspective Token Engineering from an Economic Perspective
Token Engineering from an Economic Perspective Token Engineering
 
Creating a Context-Aware solution, Complex Event Processing with FIWARE Perseo
Creating a Context-Aware solution, Complex Event Processing with FIWARE PerseoCreating a Context-Aware solution, Complex Event Processing with FIWARE Perseo
Creating a Context-Aware solution, Complex Event Processing with FIWARE PerseoFernando Lopez Aguilar
 
Understanding hd wallets design and implementation
Understanding hd wallets  design and implementationUnderstanding hd wallets  design and implementation
Understanding hd wallets design and implementationArcBlock
 
DevSecOps Implementation Journey
DevSecOps Implementation JourneyDevSecOps Implementation Journey
DevSecOps Implementation JourneyDevOps Indonesia
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Neo4j
 
How to Build a Telegraf Plugin by Noah Crowley
How to Build a Telegraf Plugin by Noah CrowleyHow to Build a Telegraf Plugin by Noah Crowley
How to Build a Telegraf Plugin by Noah CrowleyInfluxData
 
Threat Hunting with Data Science
Threat Hunting with Data ScienceThreat Hunting with Data Science
Threat Hunting with Data ScienceAustin Taylor
 
The New Pentest? Rise of the Compromise Assessment
The New Pentest? Rise of the Compromise AssessmentThe New Pentest? Rise of the Compromise Assessment
The New Pentest? Rise of the Compromise AssessmentInfocyte
 

What's hot (20)

Learning Solidity
Learning SolidityLearning Solidity
Learning Solidity
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
DevSecOps at Agile 2019
DevSecOps at   Agile 2019 DevSecOps at   Agile 2019
DevSecOps at Agile 2019
 
Power of Splunk Search Processing Language (SPL) ...
Power of Splunk Search Processing Language (SPL)                             ...Power of Splunk Search Processing Language (SPL)                             ...
Power of Splunk Search Processing Language (SPL) ...
 
DevOps or DevSecOps
DevOps or DevSecOpsDevOps or DevSecOps
DevOps or DevSecOps
 
Snort IDS/IPS Basics
Snort IDS/IPS BasicsSnort IDS/IPS Basics
Snort IDS/IPS Basics
 
Darktrace Proof of Value
Darktrace Proof of ValueDarktrace Proof of Value
Darktrace Proof of Value
 
Metrics that Matters in Software Engineering
Metrics that Matters in Software EngineeringMetrics that Matters in Software Engineering
Metrics that Matters in Software Engineering
 
Hyperledger Fabric
Hyperledger FabricHyperledger Fabric
Hyperledger Fabric
 
cybercrimeandsecurityppt-140210064917-phpapp02.pptx
cybercrimeandsecurityppt-140210064917-phpapp02.pptxcybercrimeandsecurityppt-140210064917-phpapp02.pptx
cybercrimeandsecurityppt-140210064917-phpapp02.pptx
 
Token Engineering from an Economic Perspective
Token Engineering from an Economic Perspective Token Engineering from an Economic Perspective
Token Engineering from an Economic Perspective
 
IronPort
IronPortIronPort
IronPort
 
Creating a Context-Aware solution, Complex Event Processing with FIWARE Perseo
Creating a Context-Aware solution, Complex Event Processing with FIWARE PerseoCreating a Context-Aware solution, Complex Event Processing with FIWARE Perseo
Creating a Context-Aware solution, Complex Event Processing with FIWARE Perseo
 
Understanding hd wallets design and implementation
Understanding hd wallets  design and implementationUnderstanding hd wallets  design and implementation
Understanding hd wallets design and implementation
 
DevSecOps Implementation Journey
DevSecOps Implementation JourneyDevSecOps Implementation Journey
DevSecOps Implementation Journey
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
How to Build a Telegraf Plugin by Noah Crowley
How to Build a Telegraf Plugin by Noah CrowleyHow to Build a Telegraf Plugin by Noah Crowley
How to Build a Telegraf Plugin by Noah Crowley
 
Threat Hunting with Data Science
Threat Hunting with Data ScienceThreat Hunting with Data Science
Threat Hunting with Data Science
 
The New Pentest? Rise of the Compromise Assessment
The New Pentest? Rise of the Compromise AssessmentThe New Pentest? Rise of the Compromise Assessment
The New Pentest? Rise of the Compromise Assessment
 
IDS and IPS
IDS and IPSIDS and IPS
IDS and IPS
 

Similar to The Art of Container Monitoring

ThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptxThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptxGrace Jansen
 
利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據Yu-Lun Chen
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
 
Leveraging Analytics for DevOps
Leveraging Analytics for DevOpsLeveraging Analytics for DevOps
Leveraging Analytics for DevOpsMichael Floyd
 
How to Monitor Microservices
How to Monitor MicroservicesHow to Monitor Microservices
How to Monitor MicroservicesSysdig
 
Splunk App for Stream
Splunk App for StreamSplunk App for Stream
Splunk App for StreamSplunk
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Integration in the Cloud, by Rob Davies
Integration in the Cloud, by Rob DaviesIntegration in the Cloud, by Rob Davies
Integration in the Cloud, by Rob DaviesJudy Breedlove
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructureFernando Lopez Aguilar
 
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...Flink Forward
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analyticsamesar0
 
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scale
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scaleMonitoring microservices: Docker, Mesos and Kubernetes visibility at scale
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scaleAlessandro Gallotta
 
Microservices @ Work - A Practice Report of Developing Microservices
Microservices @ Work - A Practice Report of Developing MicroservicesMicroservices @ Work - A Practice Report of Developing Microservices
Microservices @ Work - A Practice Report of Developing MicroservicesQAware GmbH
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Codemotion
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Demi Ben-Ari
 
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Docker, Inc.
 
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015Datadog
 
DevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBMDevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBMatSistemas
 

Similar to The Art of Container Monitoring (20)

ThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptxThroughTheLookingGlass_EffectiveObservability.pptx
ThroughTheLookingGlass_EffectiveObservability.pptx
 
利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
Leveraging Analytics for DevOps
Leveraging Analytics for DevOpsLeveraging Analytics for DevOps
Leveraging Analytics for DevOps
 
How to Monitor Microservices
How to Monitor MicroservicesHow to Monitor Microservices
How to Monitor Microservices
 
Splunk App for Stream
Splunk App for StreamSplunk App for Stream
Splunk App for Stream
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Integration in the Cloud, by Rob Davies
Integration in the Cloud, by Rob DaviesIntegration in the Cloud, by Rob Davies
Integration in the Cloud, by Rob Davies
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructure
 
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...
Flink Forward Berlin 2018: Yonatan Most & Avihai Berkovitz - "Anomaly Detecti...
 
Cloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark AnalyticsCloud Security Monitoring and Spark Analytics
Cloud Security Monitoring and Spark Analytics
 
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scale
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scaleMonitoring microservices: Docker, Mesos and Kubernetes visibility at scale
Monitoring microservices: Docker, Mesos and Kubernetes visibility at scale
 
Microservices @ Work - A Practice Report of Developing Microservices
Microservices @ Work - A Practice Report of Developing MicroservicesMicroservices @ Work - A Practice Report of Developing Microservices
Microservices @ Work - A Practice Report of Developing Microservices
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
Monitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp DockerMonitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp Docker
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
 
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
Using the SDACK Architecture on Security Event Inspection by Yu-Lun Chen and ...
 
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
 
DevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBMDevOps Spain 2019. Beatriz Martínez-IBM
DevOps Spain 2019. Beatriz Martínez-IBM
 

Recently uploaded

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 

Recently uploaded (20)

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 

The Art of Container Monitoring

  • 1. The Art of Container Monitoring Derek Chen 2016.9.22
  • 2. Copyright 2016 Trend Micro Inc.2 About me • DevOps Engineer at Trend Micro – Agile transformation – Micro service and cloud service – Docker integration – Monitoring system development • Automate all the things • Make everything smoother • Find me at derekhound@gmail.com
  • 3. Copyright 2016 Trend Micro Inc.3 • We want to know when things go wrong • We want to know when things aren’t quite right • We want to know in advance of problems Why monitoring? Microscope? Magnifier? Telescope?
  • 4. Copyright 2016 Trend Micro Inc.4 Blackbox vs Whitebox • Blackbox – Requires no participation of the monitored system – Observes external functionality, “what the user see” – End to end test • Whitebox – Collects data internally provided by the target system – Has more granular information about the system – Can provide warning of problems before they occur
  • 5. Copyright 2016 Trend Micro Inc.5 Blackbox tests measure… • Can you ping the server • Can you fetch a page from the server • Does it have the correct contents Whitebox monitoring collects… • Network statistics (packets/bytes sent/received) • System load (cpu, memory, disk usage) • Application statistics (connection count, query per second)
  • 6. Copyright 2016 Trend Micro Inc.6 Goal Operating System Application Business Logic Metrics Events Logs Router Destinations Store Visualize Alert Monitoring Framework • Be metric, event, and log • Allow us to easily visualize the stat of our environment • Provide contextual and useful notification. • Focus on whitebox monitoring instead of blackbox monitoring [image] https://www.artofmonitoring.com
  • 7. Copyright 2016 Trend Micro Inc.7 • The practice of system admin is changing quickly – DevOps – Infrastructure as Code – Visualization – Cloud Platform • Commercial solution can’t keep up! Why Open Source Software Monitoring? [image] http://devops.com/2014/05/05/meet-infrastructure-code
  • 8. Copyright 2016 Trend Micro Inc.8 Monolithic • If all-in-one gets the job done, then great • Good for smaller scale, non-tech-focused companies Modular • DevOps requires flexibility and innovation • Good for tech driven and ops-focused companies
  • 9. Metric We’ll rely most heavily on metrics to help us understand what’s going on in our environment.
  • 10. Copyright 2016 Trend Micro Inc.10 Metric • Provide a dynamic, real-time picture of the state of your infrastructure Collect Store Visualize Operating System Application Business Logic
  • 11. Copyright 2016 Trend Micro Inc.11 Pull Mode A central collector periodically requests metrics from each monitored system Scrape Metrics Server #1 Service Server #2 Service Server #3 Service Monitoring Push Mode Metrics are periodically sent by each monitored system to a central collector Server #1 Agent Server #2 Agent Server #3 Agent Monitoring Push Metrics
  • 12. Collectd The system statistics collection daemon
  • 13. Copyright 2016 Trend Micro Inc.13 Database Host Machine OS Metric K8S Minion Collectd K8S Master Collectd Kubernetes Cluster Elasticsearch Monitor GrafanaInfluxDB Collectd Collectd K8S Minion Collectd
  • 14. cAdvisor Collects, aggregates, processes, and exports information about running containers.
  • 15. Copyright 2016 Trend Micro Inc.15 Database Container OS Metric K8S Minion cAdvisor Heapster K8S Master API Server cAdvisor Kubernetes Cluster Elasticsearch Monitor GrafanaInfluxDB K8S Minion cAdvisor
  • 16. Copyright 2016 Trend Micro Inc.16 http://127.0.0.1:4194 http://127.0.0.1:4194/metrics
  • 17. Telegraf Collects time series data from a variety of sources
  • 18. Copyright 2016 Trend Micro Inc.18 Database Application Metric – Pull Mode K8S Minion K8S Master API Server Kubernetes Cluster Elasticsearch Monitor GrafanaInfluxDB NginxPostgres ES Telegraf K8S Minion Redis
  • 19. Copyright 2016 Trend Micro Inc.19 Database Application Metric – Push Mode K8S Minion Telegraf K8S Master API Server Kubernetes Cluster Elasticsearch Monitor GrafanaInfluxDB Nginx Telegraf Postgres Telegraf ES K8S Minion Telegraf Redis Pod Pod
  • 20. Copyright 2016 Trend Micro Inc.20 Telegraf • An agent written in Go • Easy to contribute or develop your plugins • Supported Input Plugins – System: cpu, mem, net, disk, processes and etc. – Application: docker, nginx, postgresql, redis, and etc. – Third party api: aws cloudwatch • Supported output plugins – influxdb, graphite, cloudwatch, datadog and etc.
  • 21. Copyright 2016 Trend Micro Inc.21 Pull Mode • Collector discoveries services periodically • Collector needs to talk to target services – Overlay network. Ex: Flannel, Weave, etc. – Forward proxy Scrape Metrics Server #1 Service Server #2 Service Server #3 Service Monitoring
  • 22. Copyright 2016 Trend Micro Inc.22 Push Mode • Agent sends metrics as soon as it starts up • Suitable for short-lived containers or dynamic environments Server #1 Agent Server #2 Agent Server #3 Agent Monitoring Push Metrics
  • 23. InfluxDB Time series database stores all the metrics
  • 24. Copyright 2016 Trend Micro Inc.24 • Similar scope to Graphite • Written in Go • No external dependency • Ready for billions row • Several client libraries • SQL style queries
  • 25. Copyright 2016 Trend Micro Inc.25 InfluxDB Schema • Measurements (e.g. cpu, mem, disk, net, nginx) • Timestamp (nano second) • Tags (e.g. app=atom-apigw namespace=alpha) • Fields (e.g. accepts=30925, active=1, handled=30925)
  • 26. Copyright 2016 Trend Micro Inc.26 InfluxDB Schema • Measurements (e.g. cpu, mem, disk, net, nginx) • Timestamp (nano second) • Tags (e.g. app=atom-apigw namespace=alpha) • Fields (e.g. accepts=30925, active=1, handled=30925)
  • 27. Copyright 2016 Trend Micro Inc.27 InfluxDB Schema • Measurements (e.g. cpu, mem, disk, net, nginx) • Timestamp (nano second) • Tags (e.g. app=atom-apigw namespace=alpha) • Fields (e.g. accepts=30925, active=1, handled=30925)
  • 28. Copyright 2016 Trend Micro Inc.28 InfluxDB Schema • Measurements (e.g. cpu, mem, disk, net, nginx) • Timestamp (nano second) • Tags (e.g. app=atom-apigw namespace=alpha) • Fields (e.g. accepts=30925, active=1, handled=30925)
  • 29. Copyright 2016 Trend Micro Inc.29 InfluxDB Schema • Measurements (e.g. cpu, mem, disk, net, nginx) • Timestamp (nano second) • Tags (e.g. app=atom-apigw namespace=alpha) • Fields (e.g. accepts=30925, active=1, handled=30925)
  • 30. Grafana Querying and visualizing time series and metrics
  • 31. Finally! What we can to see! Less talk more demo…
  • 32. Copyright 2016 Trend Micro Inc.32
  • 33.
  • 34.
  • 35. Copyright 2016 Trend Micro Inc.35
  • 36.
  • 37. Copyright 2016 Trend Micro Inc.37
  • 38. Event We’ll generally use events to let us know about changes and occurrences in our environment.
  • 39. Icinga2 A monitoring system which checks the availability of your resources, notifies users of outages
  • 40. Copyright 2016 Trend Micro Inc.40 • Originally forked from Nagios in 2009 • Independent version Icinga2 since 2014 • Monitors everything • Gathering status • Collect performance data /use/lib/nagios/plugins/plugin_name Any program which returns • 0 – OK • 1 – WARNING • 2 – CRITICAL Message to STDOUT
  • 41. Copyright 2016 Trend Micro Inc.41 Database Event Monitoring K8S Minion K8S Master API Server Kubernetes Cluster Elasticsearch Monitor GrafanaInfluxDBIcinga2 NginxPostgres ES K8S Minion Redis
  • 42.
  • 43. Copyright 2016 Trend Micro Inc.43
  • 44. Copyright 2016 Trend Micro Inc.44 Host Pod
  • 45. Copyright 2016 Trend Micro Inc.45 Server Monitoring • About changes and occurrences in our environment – Is cpu load too high? – Is memory not enough? – Is docker engine still alive? External Monitoring • End to end test from user’s perspective • Can you connect ssh port 22? • Can you browse web page? • Can you request RESTful API successfully?
  • 46. Dashboard • Provides a clear view for current environment Alerting • Notifies using email, slack, pager and etc. • Notification escalation Others… • Distributed Monitoring • Reloads config without interrupt checks
  • 47. Log Logs are a subset of events. They’re often most useful for fault diagnosis and investigation
  • 48. Copyright 2016 Trend Micro Inc.48 • EFK stack is a mature logging solution EFK Shipping all of your log to where it should go The main part to store your data with high availability Visualize will power your data. To know more about its value.
  • 50. Copyright 2016 Trend Micro Inc.50 Collect Store Visualize Monitoring Service Stack Alert InfluxDB Elasticserach Grafana Kibana Icinga2Collectd Telegraf Fluentd cAdvisor
  • 51. Copyright 2016 Trend Micro Inc.51 Last but Not least • Data Flow – Blackbox vs Whitebox – Pull Mode vs Push Mode • Data Content – Opeating System, Application, Business Logic • Data Type – Metric, Events, Logs Operating System Application Business Logic Metrics Events Logs Router Destinations Store Visualize Alert Monitoring Framwork [image] https://www.artofmonitoring.com

Editor's Notes

  1. 大家好,很開心今天可以在這裡跟大家分享 the art of container monitoring。開始之前,大家想看看有沒有遇這樣的困擾。 因為 aws 帶來的便利性,我們很習慣把機器搬到 aws 上面,但是機器發生問題的時候,打開 cloud watch,發現這些 metric 都不是我想看的,你有這種困擾嗎?我有。 所以我就把 custom metric 放到 cloud watch 上,卻發現 cloud watch 萬萬稅,新增 metric 要錢,設定 alarm 要錢,五分鐘監控一次改成一分鐘監控一次也要錢,你有這種困擾嗎?我有。 把 container 放到 kubernetes 上面跑的時候,卻發現不知道怎麼計算 real time total current connection,你有這種困擾嗎?我有。 如果你跟我一樣有這些困擾的話,希望今天的演講結束以後,人人都可以量身訂造一個屬於你自己的 monitoring system。
  2. 首先我要介紹一下我自己,我叫做 Derek ,目前任職於趨勢科技擔任 DevOps 的工作,主要將 CI/CD 導入生產團隊,對於 microservice 和 cloud service 有著濃厚的興趣。最近致力於將 kubernetes 整合進生產環境,並建構 container monitoring system。我喜歡將所有的事情盡可能的自動化,讓每件事情變得更簡單更順利。有問題想要找我的話,email在這裡。
  3. 在容器的世界中,依據系統的規模大小,可能有數百個到數千個容器在運行。我們到底是應該用顯微鏡觀察每一個 container 的運行狀態,還是用放大鏡觀察 micro service 的整體狀態,還是拿個望遠鏡,遠遠的觀望整個系統的狀態,取決於觀察的精細度是否能幫助我們容易找到問題和解決問題。
  4. 因為我們所處的環境不斷的在改變。近幾年提昌的 DevOps,需要團隊做敏捷的開發,需要整合 CI / CD 的 tool。最近開始要有一句響亮的口號 ‘Infrastructure as Code’,希望能夠自動化完成所有的部署工作。我們需要把數據視覺化。我們開始大量的採用 cloud platform, 例如 aws, google cloud, azure。 環境一直在改變,但是 commercial solution 在大部分的時候沒有這麼彈性,難以符合我們不斷變動的需求。
  5. 右上方的圖是 google nexus 手機,是一個 all in one solution 的代表,右下方的圖是 google開發的模組化手機,可以根據你的需求替換不同的模組。當你在選擇 open source software 時,你會選擇那一種呢,all in one solution 或是 modular solution 呢? 假如這個 all in one solution 剛好符合你的需求,那你真得非常幸運,Promethues 就是這類型的代表。 但是,大部份的時候我們沒有那麼幸運,我們必須將數個 software 放在一起去建構我們自己的 monitoring system。
  6. http://www.slideshare.net/icinga/why-favour-icinga-over-nagios-rootconf-2015?qid=13b9dfe7-0eee-45cc-b16a-fcdaa2b54b22&v=&b=&from_search=23