Monitoring @ ACOnet


                           Robert Wein, ACOnet NOC

                           TF-NOC, Dublin, 2012-06-05






Tuesday, 5 June 2012
ACOnet

                          ■   ACOnet is the Austrian NREN, connecting
                               ■   (all) Universities & Academies
                               ■   Colleges & Research Institutes
                               ■   Austrian School Network (edunet), Dormitories
                               ■   Museums, educational and cultural institutions
                               ■   Hospitals
                               ■   Ministries, Federal Agencies
                               ■   Federal Chancellery, Presidential Offices
                               ■   Provincial Government and Administration
                               ■   …

                          ■ Legal entity & management: University of Vienna

                          ■ Operation: UniVie + other universities, fiber backbone by
                              telco




current topology




Vienna Internet Exchange (VIX)

                          ■ neutral and not-for-profit IXP
                          ■ founded 1996
                          ■ 107 participants (different AS-Numbers)
                          ■ 65 Gbps peak traffic in May 2012
                          ■ redundant setup - 2 sites




Monitoring status December 2010

                          ■ Nagios/Cacti
                              ■ integration in configuration authority database (ACOnetDB)
                              ■ integration in web portal
                              ■ (intensive) use of check_rrd
                              ■ outsourced maintenance and development, together with
                                UniVie Campus
                              ■ troubles
                                  ■ check_rrd causes heavy I/O load
                                  ■ integration of new backbone platform (Cisco ASR9k)
                                  ■ high CPU load from SNMP on Catalysts due to polling for
                                    both values/thresholds and statistics
                                  ■ outsourced maintenance and development
                          ■ flow sampling to Arbor boxes
                          ■ VIX: additional sFlow sampling, "VIXflow"



new monitoring setup

                          ■ Icinga
                             ■ Nagios fork
                             ■ Developer@ACOnet-Team
                          ■ pnp4nagios
                             ■ takes perfdata and puts it into rrds
                          ■ check_mk
                             ■ keeping inventory
                             ■ generates Icinga-config
                             ■ one active check for one device
                             ■ Python: writing your own checks is just a small job :)
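
To illustrate what "a small job" can look like, here is a minimal sketch of a threshold check written as a plain Python function. It is not the check_mk check API: the function name, the (state, text, perfdata) return shape, and the row layout of `info` are assumptions made for this example only.

```python
# Hypothetical sketch in plain Python; NOT the actual check_mk check API.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3


def check_temperature(item, params, info):
    """Return (state, text, perfdata) for one temperature sensor.

    item   -- sensor name to look for
    params -- (warn, crit) thresholds in degrees Celsius
    info   -- rows from one SNMP walk, assumed to be [name, value] string pairs
    """
    warn, crit = params
    for name, value in info:
        if name != item:
            continue
        temp = float(value)
        if temp >= crit:
            state = CRIT
        elif temp >= warn:
            state = WARN
        else:
            state = OK
        # The measured value is also returned as perfdata for graphing.
        perfdata = [("temp", temp, warn, crit)]
        return state, "%s: %.1f C" % (item, temp), perfdata
    return UNKNOWN, "sensor %s not found" % item, []
```

Calling `check_temperature("inlet", (40, 50), rows)` with rows from a walk yields a state code, a status line, and the perfdata tuple in one pass.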




Monitoring@ACOnet
                          ■ integration
                             ■ ACOnet Database/VIX Database
                                ■ configuration authority
                                ■ dispatcher writes dictionaries for check_mk and calls
                                  check_mk to generate the config
                             ■ display of statistics in portal (per participant)
                             ■ weathermap (standalone php)
                             ■ display of relevant status data/checkresults in portal
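
The dispatcher step above can be sketched roughly as follows: rows from the configuration database are rendered as Python source text that a check_mk configuration file could contain. This is an assumption-laden sketch, not the ACOnet dispatcher. The row keys (`name`, `address`, `tags`) and the function name are hypothetical, and the `all_hosts` / `ipaddresses` variable names follow the classic check_mk `main.mk` convention as far as I know it. Calling check_mk afterwards to generate the Icinga config is left out.

```python
import pprint


def render_check_mk_config(devices):
    """Render database rows as Python source text for a check_mk config file.

    devices -- list of dicts with hypothetical keys 'name', 'address', 'tags'
    """
    # Classic check_mk lists hosts as "hostname|tag1|tag2" strings (assumed here).
    all_hosts = ["|".join([d["name"]] + sorted(d["tags"])) for d in devices]
    ipaddresses = {d["name"]: d["address"] for d in devices}
    return "all_hosts = %s\nipaddresses = %s\n" % (
        pprint.pformat(all_hosts),
        pprint.pformat(ipaddresses),
    )
```

Because the output is ordinary Python, the database stays the single configuration authority and the generated file never needs manual editing.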




Monitoring@ACOnet

                          ■ characteristics
                            ■ one active check per device
                            ■ results used in many passive checks
                            ■ SNMPv2 (except older power-measurement-devices)
                            ■ no traps
                            ■ perfdata in RRDs
                            ■ OID cache
                            ■ SNMPv2 bulkwalks
                            ■ ido2db - postgresql
                            ■ one poll for statistics and threshold decision
                            ■ use of rrdcached speeds up the whole thing
                            ■ Icinga classic UI
                            ■ two monitoring hosts at different locations
                            ■ dedicated hardware for monitoring
                                ■ commodity HP hardware
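
The "one active check, many passive checks" idea can be sketched as below: a single poll of a device is fanned out into one passive result per service, and the same value serves both the threshold decision and the perfdata. The lines use the Nagios/Icinga 1.x external command format `PROCESS_SERVICE_CHECK_RESULT`. How check_mk actually submits its results may differ, and the function name and input layout are assumptions for illustration.

```python
def passive_results(host, poll, thresholds, now):
    """Turn one poll of a device into one passive check result per service.

    poll       -- {service_name: value} collected in a single SNMP poll
    thresholds -- {service_name: (warn, crit)}; services without an entry stay OK
    now        -- Unix timestamp to stamp on each result
    """
    no_limit = (float("inf"), float("inf"))
    lines = []
    for service, value in sorted(poll.items()):
        warn, crit = thresholds.get(service, no_limit)
        state = 2 if value >= crit else 1 if value >= warn else 0
        # Text after '|' is perfdata: the same polled value feeds the graphs.
        lines.append(
            "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s is %s|%s=%s"
            % (now, host, service, state, service, value, service, value)
        )
    return lines
```

The device is queried once per interval, regardless of how many services are derived from the answer.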

Monitoring@ACOnet




Monitoring@ACOnet
                          ■ what do we check/graph
                            ■ traffic/packets/errors/discards
                            ■ CoS (QoS) - basis for cost sharing model
                            ■ module status
                            ■ BGP
                                ■ incl. Prefix count
                                ■ @Cisco ASR9K also IPv6
                            ■ ICMP RTT in v4 and v6
                            ■ Memory/CPU usage
                            ■ temperatures
                            ■ DOM
                            ■ .....
                            ■ @VIX
                                ■ power consumption (for billing of RUs)
                                ■ bird BGP-daemon
                                ■ special: Proxy ARP check
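
Traffic, packet, error and discard figures come from SNMP interface counters that only ever increase and eventually wrap. As a hedged illustration (not taken from the talk), the rate between two readings can be computed like this. The function names are hypothetical, and in practice an RRD COUNTER data source performs a similar calculation itself.

```python
def counter_rate(prev, curr, seconds, bits=64):
    """Per-second rate between two readings of a wrapping SNMP counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # The modulo makes the delta correct across a single counter wrap.
    delta = (curr - prev) % (2 ** bits)
    return delta / seconds


def octets_to_bps(prev, curr, seconds, bits=64):
    """Convert two octet-counter readings into bits per second."""
    return 8 * counter_rate(prev, curr, seconds, bits)
```

With 32-bit counters on fast links more than one wrap can occur between polls, which this sketch cannot detect; 64-bit counters avoid that in practice.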
Monitoring@ACOnet
                          ■ Enhancements
                            ■ ASR9k integrated
                            ■ checks and statistics in <45 s per device
                                ■ check latency >200 s when using Cacti/Nagios
                            ■ less CPU consumed by SNMP on monitored devices
                            ■ load on monitoring host between 0.3 and 0.9
                                ■ compared to 5 (Nagios/Cacti)
                            ■ VIX route server (bird) monitoring established
                            ■ reduced I/O load due to rrdcached
                            ■ easy (?) implementation of new checks
                            ■ advantages of Icinga
                                ■ active development
                                   ■ e.g. flexible downtime, multiple acknowledgements, ...
                                   ■ new ideas are easy to bring in :)



Monitoring@ACOnet
                          ■ Future
                            ■ dependencies
                            ■ finer-grained notifications
                            ■ weathermap redesign




Monitoring@ACOnet




Monitoring@ACOnet




                                     Questions?





Icinga 2012 at ACOnet on 6th TF-NOC Meeting
