Follow the latest DevOps job listings
Partners:
Monitoring & Logging
Marian Marinov
mm@yuhu.biz
Who am I?
● Director of Engineering at Web Hosting Canada
● Former partner and Head of DevOps at SiteGround
● A SysAdmin and System Architect
What do I have to monitor?
● 13 physical Linux machines
○ Storage capacity (df/df -i)
○ S.M.A.R.T. of the drives
○ RAID (HW or Soft)
○ Network (routes, traffic and usage)
○ Performance (CPU, Mem, I/O, Processes)
○ Kernel logs
○ Service logs
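The capacity checks named on this slide (`df` for blocks, `df -i` for inodes) can also be collected from a script. A minimal sketch, assuming Python on Linux; `fs_usage`, `over_threshold` and the 90% limit are illustrative names and values, not something from the talk:

```python
import os

def fs_usage(path="/"):
    """Return (block_used_pct, inode_used_pct) for the filesystem holding path."""
    st = os.statvfs(path)
    # Block usage computed the way `df` shows Use%: used / (used + available to users).
    blocks_used = st.f_blocks - st.f_bfree
    blocks_usable = blocks_used + st.f_bavail
    block_pct = 100.0 * blocks_used / blocks_usable if blocks_usable else 0.0
    # Inode usage, the figure `df -i` reports; some filesystems report 0 total inodes.
    inodes_used = st.f_files - st.f_ffree
    inode_pct = 100.0 * inodes_used / st.f_files if st.f_files else 0.0
    return block_pct, inode_pct

def over_threshold(path="/", limit_pct=90.0):
    """True when either blocks or inodes are at or above limit_pct."""
    block_pct, inode_pct = fs_usage(path)
    return block_pct >= limit_pct or inode_pct >= limit_pct
```

Checking inodes separately matters because a filesystem full of small files can run out of inodes while still showing free space.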
What do I have to monitor?
● 1 UPS
● 2 APC PDUs
● 2 Switches (SNMP statistics)
● 2 Thermostats (traffic, temp, humidity)
● 40+ LXC containers
○ Performance (CPU, Mem, I/O, Processes)
○ Storage capacity (df/df -i)
○ Service logs
● 2-3 Wifi access points
○ number of attached devices
○ traffic per-device
What do I have to monitor?
● A few things for which I want traffic and power-on time
○ 3 TVs
○ 3 Amplifiers
○ 4 Cameras
○ 1 Washing machine
○ 1 Dryer
What I wanted
● Single solution for log and metrics collection
● Single central interface
What I ended up having
● multiple grafana dashboards
● monitor events, instead of reading logs
● a bunch of different log collectors
What I tested
● syslog-ng
● rsyslog
● Filebeat
● Prometheus node_exporter
● Loki
● Fluentd
● collectd
● StatsD
● Graylog
● PostgreSQL+timescale
● Grafana
Conclusions
● there is no one solution to rule them all
● SNMP is still the king for networking
● too many logging formats and DSLs
● collectd was the easiest
○ with the most metrics out-of-the-box
● ElasticSearch + Kibana require too many resources
○ Not usable for smaller setups
● Graylog uses a lot of CPU for the work it does
○ alerts can be based on number of events instead of parsing logs
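The idea of alerting on the number of events rather than on parsed log content can be sketched as a sliding-window counter. This is an illustration only, assuming Python; `EventRateAlert`, `limit` and `window` are hypothetical names, and it does not reproduce how Graylog implements its alerts:

```python
from collections import deque

class EventRateAlert:
    """Fire when more than `limit` events land inside a sliding `window` (seconds)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._stamps = deque()

    def record(self, ts):
        """Register one event at time ts (seconds); return True if the alert should fire."""
        self._stamps.append(ts)
        # Drop events that have fallen out of the window.
        while self._stamps and self._stamps[0] <= ts - self.window:
            self._stamps.popleft()
        return len(self._stamps) > self.limit
```

For example, with `limit=3` and `window=60`, the fourth event within a minute trips the alert, and a lone event several minutes later does not.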
Installation / Setup
● basic apt-get:
○ rsyslogd, syslog-ng, fluentd, collectd, filebeat, loki, node_exporter
○ statsd wanted full npm
Pros and Cons
● Syslog pros
○ can easily ingest netconsole kernel logging
○ very good performance
○ well documented and standardized interface
● Syslog cons
○ fire and forget
○ the syslog protocol
○ not enough parsing flexibility
○ syslog-ng was heavier than rsyslogd
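To make the "fire and forget" and "the syslog protocol" points concrete, here is a minimal sketch, assuming Python and classic syslog over UDP. The priority value (facility × 8 + severity) comes from the syslog RFCs; the function names, the simplified message layout without timestamp or hostname, and the `127.0.0.1:514` target are assumptions for illustration:

```python
import socket

def syslog_pri(facility, severity):
    """PRI value of a syslog message: facility * 8 + severity."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0..23 and severity 0..7")
    return facility * 8 + severity

def build_message(facility, severity, tag, text):
    """Simplified BSD-style line: '<PRI>tag: text' (no timestamp or hostname)."""
    return f"<{syslog_pri(facility, severity)}>{tag}: {text}".encode("utf-8")

def send_udp(message, host="127.0.0.1", port=514):
    """Send one datagram and return the byte count; UDP gives no acknowledgement."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(message, (host, port))
```

`send_udp` has no way to learn whether a collector received the line, which is the delivery weakness the slide refers to; on most systems the call returns normally even when nothing listens on the port.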
Pros and Cons
● Loki/Node_exporter/filebeat/fluentd pros
○ very good parsing capabilities
○ filebeat was the easiest for me
○ reliable log delivery
○ different integrations
○ ready made grafana dashboards
● Loki/Node_exporter/filebeat/fluentd cons
○ very heavy on CPU
○ Loki did not have sysv init script :)
Interesting
● OAIEvals Collector - by Nikolay Stankov
DB integrations
1. Prometheus node-exporter
2. Fluentd
3. filebeat
4. syslog
Not out of the box
● Custom local collectors still have to go directly to your metrics DB
● Having a producer/subscriber greatly reduces the performance hit
● Fluentd and Filebeat were the only ones supporting Kafka out of the box
○ https://github.com/hikhvar/mqtt2prometheus
○ https://github.com/toyokazu/fluent-plugin-mqtt-io
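The producer/subscriber point above can be illustrated with a small sketch, assuming Python. An in-process `queue.Queue` stands in for a real broker such as Kafka or MQTT, and `collector`, `subscriber`, `run_demo` and the batch size are hypothetical names and values chosen for the example:

```python
import queue
import threading

def collector(name, samples, bus):
    """Producer: push (source, metric, value) tuples and return immediately."""
    for metric, value in samples:
        bus.put((name, metric, value))

def subscriber(bus, write_batch, batch_size=3):
    """Consumer: drain the bus and hand samples to write_batch in groups."""
    batch = []
    while True:
        item = bus.get()
        if item is None:              # sentinel: flush what is left and stop
            break
        batch.append(item)
        if len(batch) >= batch_size:
            write_batch(batch)
            batch = []
    if batch:
        write_batch(batch)

def run_demo():
    bus = queue.Queue()
    written = []                      # stands in for the metrics DB
    worker = threading.Thread(target=subscriber, args=(bus, written.append))
    worker.start()
    collector("host1", [("cpu", 0.4), ("mem", 0.7)], bus)
    collector("host2", [("cpu", 0.1), ("mem", 0.2)], bus)
    bus.put(None)
    worker.join()
    return written
```

The collectors only pay the cost of an enqueue, while the single subscriber absorbs the slower, batched writes to the database; that decoupling is where the reduced performance hit would come from.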
Thank you!
NEXT EVENT
Speaker Date Language
Monitoring & Logging
Marian Marinov 19.Mar.2024 Bulgarian
Contacts:
Marian Marinov
Github profile
Facebook profile
What do I have on the containers?
● NextCloud
● Home Assistant
● Mirrors
● VPNs
● NetBox
● Monitoring (Grafana, StatPing)
● Games (Minecraft, CS, PVPGN)
● IRC (server, bouncers, bots)
● Matrix, Mattermost
● Backups
● Streaming (FOSDEM streamer setup)
● DBs (PostgreSQL, MySQL, Redis, DragonFly, Timescale, InfluxDB, Mongo)
● Vitess, ProxySQL
● MPI (Gearman, MQTT, Kafka, RabbitMQ)
● Web stuff - Wiki, HAproxy, Nginx, Varnish
● OpenShift, OpenStack, K8s on VMs and physical
● A lot of other experiments
What storage do I use?
● Local + LVM
● DRBD+OCFS2
● iSCSI
● cLVM + iSCSI
● GlusterFS
● OrangeFS
● I had in the past:
○ Ceph
○ NFS
○ cLVM + ATAoE
○ cLVM + NBD

Dev.bg DevOps March 2024 Monitoring & Logging
