Monasca
Monitoring/Logging-as-a-Service (at-scale)
Speaker
Roland Hochmuth
Hewlett Packard Enterprise
Fort Collins, Colorado, USA
Agenda
• Describe how to build a highly scalable monitoring and logging as a
service platform
• Architectural and design principles
• Scale, HA
• Provide an overview of Monasca
• Features
• API
• Demo
What is Monitoring-as-a-Service?
• A Monitoring or Logging solution deployed as Software-as-a-Service
• E.g. CloudWatch, Datadog, New Relic, Librato, Loggly and many others
• First-class, preferably RESTful HTTP API
• Authentication
• Multi-tenancy
• Provides self-provisioning to users/tenants of the service
• Designed to be highly reliable and operate at scale
• Historically run by an operations team doing web services
What is OpenStack?
• OpenStack is a cloud operating system that controls large pools of
compute, storage, and networking resources
• Open-source alternative to AWS, Microsoft Azure, Google Cloud and
other cloud services
• Deployed in both public and private clouds
What is Monasca?
• Open-source Monitoring/Logging-as-a-Service platform for OpenStack
• Authentication currently via OpenStack Identity Service (Keystone)
• Microservices message-bus based architecture
• First-class RESTful API
• Push-based metrics
• Consolidates Operational Monitoring, Monitoring-as-a-Service, Metering &
Billing and more
• Designed for elastic cloud environments/deployments
• High-availability / clustering built-in
• Horizontally scalable and vertically 4 tiered/layered architecture
• Capable of long-term data retention to address metering, SLA, capacity
planning, trend analysis, post-hoc RCA, and other use cases
• Extensible and Composable
The Log
• The Log: What every software engineer should know about real-time data's
unifying abstraction
• https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
• Log: An append-only, totally-ordered sequence of records ordered by time
[Figure: "From" and "To" diagrams]
Monitoring Architecture
Kafka
• A performant, distributed, durable, publish/subscribe messaging and stream
processing system
• Metrics, logs and events are published to topics in Kafka
• Microservices register in a "consumer group" as a consumer
• Microservices "subscribe" to topics and consume metrics/logs and events
• Each consumer group receives its own copy of every message
• Messages are load-balanced across all consumers in a consumer group
• Can add/remove micro-services to handle load or mitigate problems
• As micro-services expand/contract the partitions are automatically re-balanced
• At-least-once semantic guarantees on message delivery
• Also used for domain events, notification retry events, periodic notifications,
grouping notifications and other areas
• Always accept data, never drop data, true elasticity
• Loggly: https://www.youtube.com/watch?v=LpNbjXFPyZ0
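The load-balancing and re-balancing behaviour described for consumer groups can be illustrated with a small simulation. This is only a sketch: it does not use a Kafka client, the round-robin strategy is an assumption rather than Kafka's actual assignor, and the persister names are hypothetical.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions round-robin over the consumers of one group."""
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = list(range(6))

# Three hypothetical persister instances share the six partitions.
before = assign_partitions(partitions, ["persister-1", "persister-2", "persister-3"])

# One instance fails; its partitions are redistributed to the survivors.
after = assign_partitions(partitions, ["persister-1", "persister-3"])
```

Every partition always has exactly one owner within the group, which is why adding or removing instances changes only how the load is split and not which messages get processed.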
CQRS
• Command Query Responsibility Segregation (CQRS)
• CQRS involves splitting an application into two parts internally:
1. Command side ordering the system to update state
2. Query side that gets information without changing state
• Advantages
• Decouples the read/write load. Allows each to be scaled independently
• Read store can be optimized for the query pattern of the application
• Reference
• Event sourcing, CQRS, stream processing and Apache Kafka
• https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/
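The command/query split can be sketched in a few lines. This is an in-memory illustration only: the list stands in for a Kafka topic, the dictionary for a read store, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Append-only log of (metric name, value) records; stands in for a Kafka topic.
log: List[Tuple[str, float]] = []

# Read store optimized for one query pattern: latest value per metric name.
latest: Dict[str, float] = {}


def handle_command(name: str, value: float) -> None:
    """Command side: record the state change and return nothing."""
    log.append((name, value))


def project(from_offset: int) -> int:
    """Fold log records from from_offset into the read store; return the next offset."""
    for name, value in log[from_offset:]:
        latest[name] = value
    return len(log)


def query_latest(name: str) -> float:
    """Query side: read-only lookup against the read store."""
    return latest[name]


handle_command("cpu.user_perc", 12.0)
handle_command("cpu.user_perc", 90.0)
next_offset = project(0)
```

Because writes only append to the log and reads only touch the projected store, the two sides can be scaled and tuned separately.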
Microservices
• Microservices are small, autonomous, decoupled services that are
deployed independently and work together as a single application
• Communication between services occurs via a network
• Services need to be able to change independently of each other, and be
deployed by themselves without requiring consumers to change
• Benefits:
• Resilience
• Scale
• Ease of deployment
• Organizational Alignment
• Optimized for Change/Replaceability
POST Metrics Sequence
Domain Events Sequence
Deployment Models (HA/Scale)
• Many ways to deploy Monasca
• Typically deployed in a clustered/HA configuration using three nodes
or greater
• If any node or microservice fails, the cluster remains operational
• Partitions in Kafka are redistributed among the remaining components
• Preferably, the database is run on a separate layer from the other
components/microservices
• Note, Monasca can also be deployed on a single node in a non-clustered configuration
• Has also been containerized and run in Kubernetes
Metrics Model
POST /v2.0/metrics
{
  "name": "http_status",
  "dimensions": {
    "url": "http://host.domain.com:1234/service",
    "cluster": "c1",
    "control_plane": "ccp",
    "service": "compute"
  },
  "timestamp": 0, /* milliseconds */
  "value": 1.0,
  "value_meta": {
    "status_code": 500,
    "msg": "Internal server error"
  }
}
• Simple, concise, flexible multi-dimensional description
• Name (string)
• Dimensions: Dictionary of user-defined (key, value)
pairs that are used to uniquely identify a metric
• Value meta: Optional dictionary of user-defined (key, value)
pairs that can be used to describe a measurement
• Normally used for errors and messages
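A short sketch of how name and dimensions together identify a metric, while value_meta only describes a single measurement. The `metric_id` helper is hypothetical and is not part of any Monasca library; the payload mirrors the fields shown on the slide.

```python
import json
from typing import Dict, Tuple


def metric_id(name: str, dimensions: Dict[str, str]) -> Tuple[str, Tuple[Tuple[str, str], ...]]:
    """Identity of a metric: its name plus its sorted dimension pairs."""
    return name, tuple(sorted(dimensions.items()))


measurement = {
    "name": "http_status",
    "dimensions": {"service": "compute", "cluster": "c1"},
    "timestamp": 0,  # milliseconds
    "value": 1.0,
    "value_meta": {"msg": "Internal server error"},  # not part of the identity
}

# Same name and dimensions (in any order) identify the same metric.
same_metric = metric_id("http_status", {"cluster": "c1", "service": "compute"})

# Changing any dimension value yields a different metric.
other_metric = metric_id("http_status", {"cluster": "c2", "service": "compute"})

body = json.dumps(measurement)  # request body for POST /v2.0/metrics
```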
Push vs Pull
• Monitoring-as-a-Service
• Can't always pull due to firewalls and network issues
• Low-latency: sub-second latency difficult for pull model
• Doesn't require service discovery and registration
• As entities are deployed, they can start sending metrics without having to be
discovered or registered
• Events
• Temporary caching/buffering of metrics/events while service
unreachable.
Monasca API
• Primary point for pushing metrics and handling queries
• Authenticates all requests against the Keystone identity service
• Note, auth tokens are cached to reduce the load on Keystone
• Resources: Metrics, Alarm Definitions, Alarms and Notification Methods
• API Specification:
• https://github.com/openstack/monasca-api/tree/master/docs
• Horizontally scalable
• Publishes metrics to Kafka
• Queries timeseries DB for measurements and statistics
• Queries Config DB for alarms, alarm definitions and notification methods
Persister
• Consumes both metrics and alarm state transition events from Kafka
• Stores temporarily in-memory and does batch writes to the TSDB, based on
batch size or time, to optimize write performance
• At-least-once message delivery semantics:
• No metrics or alarm state transition events are lost
• The Kafka consumer offset for each batch is only updated after successfully storing
the metric or alarm state transition event
• Note, duplicates are possible
• HA/fault-tolerance:
• Multiple persisters run simultaneously and balance load
• If a persister fails, the load is automatically re-balanced across the remaining
persisters.
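The at-least-once behaviour can be sketched as follows: the offset is committed only after the batch is stored, so a crash between the two steps causes a replay rather than a loss. This is a simplified in-memory model, not the Monasca persister; the class, its attributes, and the crash hook are hypothetical, and time-based flushing is omitted.

```python
from typing import List


class Persister:
    """Buffers messages and commits the offset only after the batch is stored."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        self.stored: List[str] = []        # stands in for the time-series database
        self.committed_offset = 0          # where consumption resumes after a restart
        self.crash_before_commit = False   # hook to simulate a failure

    def consume(self, log: List[str]) -> None:
        """Read the log from the committed offset, writing full batches."""
        buffer: List[str] = []
        for offset in range(self.committed_offset, len(log)):
            buffer.append(log[offset])
            if len(buffer) < self.batch_size:
                continue
            self.stored.extend(buffer)              # batch write succeeds
            if self.crash_before_commit:
                self.crash_before_commit = False
                raise RuntimeError("simulated crash between write and commit")
            self.committed_offset = offset + 1      # commit only after the write
            buffer = []


log = ["m0", "m1", "m2", "m3"]
persister = Persister(batch_size=2)
persister.crash_before_commit = True
try:
    persister.consume(log)
except RuntimeError:
    pass                     # the offset was never committed
persister.consume(log)       # restart: resumes from offset 0
```

After the restart the first batch is stored twice, which is the duplicate case noted above, but every message in the log ends up in the store.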
Time Series Databases
• Used for storing:
• Metrics
• Alarm state history
• Two databases supported:
1. Vertica
• Enterprise class, proprietary, closed-source, clustered, HA, analytics database
• Excels at time-series
2. InfluxDB
• Open-source single-node time-series DB
• Clustering is closed-source
• Note, can replicate to multiple instances of InfluxDB using Kafka
• Investigating support for additional databases
Config Database
• Stores all "transactional" data for Monasca such as
• Alarm Definitions
• Alarms
• Notification Methods
• MySQL and Postgres supported
• Typically deployed in a clustered or HA configuration
Threshold Engine
• Near real-time stream processing, clustered and highly available
threshold engine
• Based on Apache Storm
• Consumes metrics from Kafka
• Creates alarms based on metrics that match patterns specified in the
alarm definition
• Evaluates whether metrics exceed threshold
• Publishes alarm state transition events to Kafka
• Supports both simple and compound alarm expressions
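A minimal sketch of evaluating a simple sub-expression and combining sub-expressions with "or". The rule for combining UNDETERMINED with other states is an assumption made for illustration, and the function names are hypothetical; the real engine runs on Apache Storm and is not shown here.

```python
from statistics import mean
from typing import List


def sub_state(values: List[float], threshold: float) -> str:
    """Simple expression avg(metric) > threshold over one evaluation window."""
    if not values:
        return "UNDETERMINED"   # no measurements arrived in the window
    return "ALARM" if mean(values) > threshold else "OK"


def evaluate_or(states: List[str]) -> str:
    """Compound 'or': any ALARM wins, then any UNDETERMINED, otherwise OK."""
    if "ALARM" in states:
        return "ALARM"
    if "UNDETERMINED" in states:
        return "UNDETERMINED"
    return "OK"


cpu = sub_state([80.0, 95.0, 92.0], 85)   # average is 89
disk = sub_state([10.0, 20.0], 1000)      # average is 15
silent = sub_state([], 1000)              # nothing received

combined = evaluate_or([cpu, disk])
```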
Notification Engine
• Consumes "alarm state transition events" from Kafka produced by the
Threshold Engine
• Evaluates whether notifications should be sent based on actions specified
in the alarm definition.
• OK, ALARM and UNDETERMINED actions
• Supports email, PagerDuty, webhooks, HipChat, Slack and JIRA
• Dynamic plugins supported
• Supports both "one-shot" and "periodic" notifications
• If sending to the notification address fails, then notification is published to
retry topic in Kafka, and retried later
• Grouping notifications: In progress
Kafka Message Schema
• JSON messages published/consumed to/from Kafka by Monasca
micro-services
• Well-defined schema is published at:
• https://wiki.openstack.org/wiki/Monasca/Message_Schema
Metrics
Create, query and get statistics for metrics
• GET, POST /v2.0/metrics
• GET /v2.0/metrics/names
• Returns the unique metric names
• GET /v2.0/metrics/dimensions/names
• Returns the unique dimension names
• GET /v2.0/metrics/dimensions/names/values
• Returns the unique dimension values
Measurements
GET /v2.0/metrics/measurements
• Returns a list of measurements
• Query parameters
• Name and dimensions to filter by
• Start_time and end_time
• Offset and limit
• merge_metrics: allow multiple metrics to be combined into a single list
of measurements.
• group_by: list of columns to group the metrics to be returned. Allows
multiple unique metrics to be returned in a single query.
Statistics
GET /v2.0/metrics/statistics
• Query parameters
• Name and dimensions to filter by
• Start_time and end_time
• Statistics: avg, min, max, sum and count
• Period: The time period to aggregate measurements by
• Offset, limit
• merge_metrics: allow multiple metrics to be combined into a single list
of statistics
• group_by: list of columns to group the metrics to be returned. Allows
multiple unique metrics to be returned in a single query.
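What the period parameter does can be shown with a small aggregation sketch. It assumes timestamps in seconds and buckets aligned to multiples of the period; how the actual API aligns buckets is not stated in the slides, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def statistics(measurements: List[Tuple[int, float]], period: int) -> Dict[int, Dict[str, float]]:
    """Group (timestamp_seconds, value) pairs into period-sized buckets."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in measurements:
        buckets[timestamp - timestamp % period].append(value)
    return {
        start: {
            "avg": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


result = statistics([(0, 1.0), (30, 3.0), (60, 10.0)], period=60)
```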
Metrics Names
GET /v2.0/metrics/names
• Returns a list of the unique metric names
• Query parameters
• Dimensions
• Offset, limit
Metric Dimension Names
GET /v2.0/metrics/dimensions/names
• List the dimension names
• Query parameters
• Metric name
• Offset, limit
Metric Dimension Values
GET /v2.0/metrics/dimensions/names/values
• List the dimension values
• Query parameters
• Metric name
• Dimension name
• Offset, limit
Alarm Definitions
POST, GET /v2.0/alarm-definitions
• Alarm definitions are templates that are used to automatically and
dynamically create alarms based on matching metric names and
dimensions
• One alarm definition can result in zero or more alarms.
• Simple grammar for creating compound alarm expressions:
• avg(cpu.user_perc{}) > 85 or avg(disk.read_ops{device=vda}, 120) > 1000
• Alarm states (OK, ALARM and UNDETERMINED)
• Actions associated with alarms for state transitions
• User assigned severity (LOW, MEDIUM, HIGH, CRITICAL)
• Thresholds can be dynamically adjusted via PATCH
• Minimal lifecycle management, alarm_lifecycle_state and link.
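The statement that one alarm definition can result in zero or more alarms can be sketched as a matching step over the metrics seen so far. The grouping key (`match_by`) and the helper below are illustrative assumptions, not the Threshold Engine's implementation.

```python
from typing import Dict, List, Set, Tuple

Metric = Tuple[str, Dict[str, str]]  # (name, dimensions)


def alarms_for(definition_metric: str, match_by: str, metrics: List[Metric]) -> Set[str]:
    """Return one alarm key per distinct match_by value among matching metrics."""
    return {
        dims[match_by]
        for name, dims in metrics
        if name == definition_metric and match_by in dims
    }


seen = [
    ("cpu.user_perc", {"hostname": "node-1"}),
    ("cpu.user_perc", {"hostname": "node-2"}),
    ("disk.read_ops", {"hostname": "node-1", "device": "vda"}),
]

# A definition on cpu.user_perc yields one alarm per host that reports it.
cpu_alarms = alarms_for("cpu.user_perc", "hostname", seen)

# A definition on a metric nobody sends yields no alarms at all.
net_alarms = alarms_for("net.in_bytes", "hostname", seen)
```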
List Alarms
GET /v2.0/alarms
Query parameters:
• metric_name - Name of metric to filter by
• metric_dimensions
• State: OK, ALARM or UNDETERMINED.
• Severity: One or more severities to filter by, separated with |,
ex. severity=LOW|MEDIUM
• state_updated_start_time : The start time in ISO 8601 combined date and
time format in UTC.
• Offset, limit
• sort_by
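A sketch of building a list-alarms request URL with the query parameters above, using only the Python standard library. The host and port are placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "http://monasca.example.com:8070"  # hypothetical endpoint

params = {
    "metric_name": "cpu.user_perc",
    "state": "ALARM",
    "severity": "LOW|MEDIUM",  # multiple severities separated with |
    "state_updated_start_time": "2016-11-30T00:00:00Z",  # ISO 8601, UTC
    "limit": 50,
}

# urlencode percent-encodes reserved characters such as | and :
url = f"{BASE_URL}/v2.0/alarms?{urlencode(params)}"
```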
Alarms
GET, PUT, PATCH, DELETE /v2.0/alarms/{alarm-id}
• Alarms created by the Threshold Engine based on matching alarm
definitions.
• When new nodes or components are deployed, alarms are automatically created
• Alarms are resources within Monasca. They have a resource ID and
lifecycle.
• By default, three states: OK, ALARM and UNDETERMINED
• UNDETERMINED state occurs when metrics are no longer being received
• Deterministic alarms, two states: OK and ALARM
• Used for systems where metrics are sporadic. E.g., metrics are created when errors
occur in log files, and no metrics are sent when there aren't any errors.
Alarm Counts
GET /v2.0/alarms/count
• Query the total number of alarms in the OK, ALARM or
UNDETERMINED state, and their severities, grouped by
metrics dimension, such as OpenStack service, state and
severity.
• Used for summary dashboards
Example: Helion Ops Console
Alarm History
GET /v2.0/alarms/state-history
• Lists the alarm state history for alarms
• Query Parameters:
• Dimensions to filter on
• Start/end timestamp
• Offset, limit
GET /v2.0/alarms/{alarm-id}/state-history
• Lists the alarm state history for a specific alarm
Notification Methods
POST, GET, DELETE /v2.0/notification-methods
Notification methods are associated with Actions in alarm definitions.
Example:
POST /v2.0/notification-methods {
"name":"Name of notification method",
"type":"EMAIL",
"address":"john.doe@hp.com"
}
Monasca Agent
• System metrics (cpu, memory, network, filesystem, …)
• Service metrics
• MySQL, Kafka, and many others
• Application metrics
• Built-in Statsd daemon
• Python monasca-statsd library: Adds support for dimensions
• VM system metrics
• Open vSwitch metrics
• Active checks
• HTTP status checks and response times
• System up/down checks (ping and ssh)
• Runs any Nagios plugin or check_mk
• Extensible/Pluggable: Additional services can be easily added
Agent details
• The Agent Forwarder buffers metrics for a short time to increase the
size of the http request body (number of metrics) sent to the
Monasca API.
• The Agent requests an auth token from the Keystone Identity service,
which is supplied on all requests.
• The Monasca Agent and API cache auth tokens in memory to reduce
the round-trip authorization requests to Keystone
• If network connectivity between the Agent and API is lost, the Agent
will buffer metrics and send them when connectivity is restored
• Metrics are submitted using an "agent" role, which only allows metrics
to be POSTed to the metrics endpoint
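The buffering behaviour can be sketched with a small forwarder that batches metrics and keeps them while the API is unreachable. This is an illustration under assumptions: the `Forwarder` class and the `send` callback are hypothetical, and the real Agent's buffering limits and timing are not modelled.

```python
from typing import Callable, List


class Forwarder:
    """Collects metrics and sends them in batches, keeping them on failure."""

    def __init__(self, send: Callable[[List[dict]], bool], max_batch: int) -> None:
        self.send = send            # returns True when the API accepted the batch
        self.max_batch = max_batch
        self.pending: List[dict] = []

    def submit(self, metric: dict) -> None:
        self.pending.append(metric)

    def flush(self) -> int:
        """Send pending metrics in batches; stop at the first failure."""
        sent = 0
        while self.pending:
            batch = self.pending[: self.max_batch]
            if not self.send(batch):
                break               # API unreachable: keep metrics buffered
            del self.pending[: len(batch)]
            sent += len(batch)
        return sent


state = {"up": False}               # simulated connectivity to the API
received: List[dict] = []


def fake_send(batch: List[dict]) -> bool:
    if state["up"]:
        received.extend(batch)
    return state["up"]


forwarder = Forwarder(fake_send, max_batch=2)
for i in range(3):
    forwarder.submit({"name": "cpu.user_perc", "value": float(i)})

sent_while_down = forwarder.flush()     # nothing leaves the buffer
state["up"] = True
sent_after_restore = forwarder.flush()  # buffered metrics are delivered
```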
Grafana/Monasca Integration
• Datasource: A datasource that can be added to the Grafana
dashboard to enable Monasca
• https://github.com/openstack/monasca-grafana-datasource
• Keystone authentication
• https://github.com/twc-openstack/grafana
• Support for Alerting will be added in Grafana 4.
Grafana Monasca Data Source
Logging Architecture
Logging API
• POST /v3.0/logs
• Batch log messages in a single http request
• Global / local / mixed dimensions
• Similar to dimensions in metrics.
• JSON only
• Specification
• https://github.com/openstack/monasca-log-api/blob/master/docs/monasca-log-api-spec.md
• Queries not done via API, but via Tenantized version of Kibana
• https://github.com/FujitsuEnablingSoftwareTechnologyGmbH/fts-keystone
Log Model
{ "dimensions": {
    "hostname":"devstack",
    "service":"monitoring",
    "component":"monasca-api" },
  "logs":[
    { "message":"msg1",
      "dimensions": {
        "service":"compute",
        "component":"nova-api",
        "path":"/var/log/mysql.log" } },
    { "message":"msg2",
      "dimensions": {
        "path":"/var/log/monasca/monasca-api.log" } }
  ]
}
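How global and local dimensions combine for each log entry can be sketched as a dictionary merge. Letting local values override global ones on conflict is an assumption here, and `expand_logs` is a hypothetical helper rather than part of the Log API.

```python
from typing import Dict, List


def expand_logs(request: dict) -> List[Dict[str, object]]:
    """Merge global dimensions into each log entry; local values win on conflict."""
    global_dims = request.get("dimensions", {})
    expanded = []
    for entry in request["logs"]:
        dims = {**global_dims, **entry.get("dimensions", {})}
        expanded.append({"message": entry["message"], "dimensions": dims})
    return expanded


request = {
    "dimensions": {
        "hostname": "devstack",
        "service": "monitoring",
        "component": "monasca-api",
    },
    "logs": [
        {
            "message": "msg1",
            "dimensions": {
                "service": "compute",
                "component": "nova-api",
                "path": "/var/log/mysql.log",
            },
        },
        {
            "message": "msg2",
            "dimensions": {"path": "/var/log/monasca/monasca-api.log"},
        },
    ],
}

entries = expand_logs(request)
```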
Log Agents
• Logstash
• https://github.com/logstash-plugins/logstash-output-monasca_log_api/pull/1
• Beaver
• https://github.com/python-beaver/python-beaver/pull/406
• Logspout: Under Investigation
Kibana Integration
• Keystone authentication support for Kibana
• Authentication plugin:
• https://github.com/FujitsuEnablingSoftwareTechnologyGmbH/fts-keystone
• Note: In progress of moving to official OpenStack repo
Composability: Logging/Metrics
Transform and Analytics Engine
Monasca Transform
• A new micro-service in Monasca that aggregates and transforms metrics.
• Currently based on Apache Spark Streaming.
• Use Cases:
• Object Storage Disk Capacity
• Object Storage Capacity
• Compute Host Capacity
• VM Capacity
• More to come
• Metrics are aggregated and published every hour.
• Currently in deployment in HPE Helion OpenStack 4.0.
• OpenStack project/repo
• https://github.com/openstack/monasca-transform
Monasca Analytics
• A framework that adds data science tools (parsers, algorithms, etc).
• Features include:
• Algorithmic flow definition, enabling sharing of complex algorithmic recipes
• Thin orchestration layer that instantiates an execution environment.
• Focused on:
• Anomaly detection
• Reducing alert fatigue via alarm clustering (unsupervised machine learning).
• Example algorithms: One Class SVM and LiNGAM.
• Status: Under Development
• OpenStack project/repo
• https://github.com/openstack/monasca-analytics
Distributions & Deployments
• Charter Communications:
• Monasca and Grafana are currently deployed in a production private cloud
• Monitoring-as-a-Service Use cases supported with Grafana as the Visualization
Dashboard
• 2 datacenters, 600-700 compute nodes, 1000 VMs, 11,000 metrics/sec
• FIWARE Lab:
• http://superuser.openstack.org/articles/monitoring-a-multi-region-cloud-based-on-openstack/
• Hewlett Packard Enterprise: Cloud System, Helion OpenStack
• Supported and tested up to 65K metrics/sec ingest rates.
• Fujitsu:
• FUJITSU Software ServerView Cloud Monitoring Manager.
• NEC:
• Planning to include Monasca in "Cloud Solution Menus" solution.
• Others
Statistics: Mitaka/Newton Release
• Organizations: 31
• Contributors: 97
• Commits: 1075
• Reviews: 4080
• Lines of code: 215,370
Ecosystem
• Hewlett Packard Enterprise
• Fujitsu
• Charter Communications
• NEC
• Cisco
• Cloudbase Solutions
• SUSE
• SolidFire
• SAP
• Cray Inc.
• FIWARE Lab
• Mirantis
• Broadcom
Containers and Kubernetes
• New Monasca Agent Plugins
• Docker plugin
• cAdvisor plugin
• Kubernetes plugin: Monitors both Kubernetes control plane and containers
• Prometheus client plugin: Scrapes apps
• Mesos plugin
• Containerization of Monasca
• Heapster Monasca data sink
Next Steps
• Containerizing Monasca
• Monitoring containers and container managers, such as Kubernetes
• Grouping notifications

OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth

  • 2. Speaker Roland Hochmuth Hewlett Packard Enterprise Fort Collins, Colorado, USA
  • 3. Agenda • Describe how to build a highly scalable monitoring and logging as a service platform • Architectural and design principles • Scale, HA • Provide an overview of Monasca • Features • API • Demo
  • 4. What is Monitoring-as-a-Service? • A Monitoring or Logging solution deployed as Software-as-a-Service • E.g. CloudWatch, Datadog, New Relic, Librato, Loggly and many others • First-class, preferably RESTful HTTP API • Authentication • Multi-tenancy • Provides self-provisioning to users/tenants of the service • Designed to be highly reliable and operate at scale • Historically run by an operations team doing web services
  • 5. What is OpenStack? • OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources • Open-source alternative to AWS, Microsoft Azure, Google Cloud and other cloud services • Deployed in both public and private clouds
  • 6. What is Monasca? • Open-source Monitoring/Logging-as-a-Service platform for OpenStack • Authentication currently via OpenStack Identity Service (Keystone) • Microservices message-bus based architecture • First-class RESTful API • Push-based metrics • Consolidates Operational Monitoring, Monitoring-as-a-Service, Metering & Billing and more • Designed for elastic cloud environments/deployments • High-availability / clustering built-in • Horizontally scalable and vertically 4 tiered/layered architecture • Capable of long-term data retention to address metering, SLA, capacity planning, trend analysis, post-hoc RCA, and other use cases • Extensible and Composable
  • 7. The Log • The Log: What every software engineer should know about real-time data's unifying abstraction • https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying • Log: An append-only, totally-ordered sequence of records ordered by time
  • 9. Kafka • A performant, distributed, durable, publish/subscribe messaging and stream processing system • Metrics, logs and events are published to topics in Kafka • Microservices register in a "consumer group" as a consumer • Microservices "subscribe" to topics and consume metrics/logs and events • Messages are replicated per consumer group • Messages are load-balanced across all consumers in a consumer group • Can add/remove microservices to handle load or mitigate problems • As microservices expand/contract, the partitions are automatically re-balanced • At-least-once semantic guarantees on message delivery • Also used for domain events, notification retry events, periodic notifications, grouping notifications and other areas • Always accept data, never drop data, true elasticity • Loggly: https://www.youtube.com/watch?v=LpNbjXFPyZ0
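The load-balancing and re-balancing behaviour described on the Kafka slide can be illustrated with a small, self-contained simulation. This is a sketch only: it does not use a Kafka client, the round-robin strategy is an assumption chosen for clarity (Kafka's real assignors are configurable), and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions round-robin over the consumers of one consumer group.

    Illustrative only: this is not Kafka's actual assignment algorithm.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


partitions = list(range(6))

# Three hypothetical persister instances share the six partitions.
before = assign_partitions(partitions, ["persister-1", "persister-2", "persister-3"])

# One instance fails; its partitions are redistributed over the survivors.
after = assign_partitions(partitions, ["persister-1", "persister-3"])
```

Each partition is owned by exactly one consumer in the group at a time, which is why adding or removing instances changes how much of the topic each one handles.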
  • 10. CQRS • Command Query Responsibility Segregation (CQRS) • CQRS involves splitting an application into two parts internally: 1. Command side ordering the system to update state 2. Query side that gets information without changing state • Advantages • Decouples the read/write load. Allows each to be scaled independently • Read store can be optimized for the query pattern of the application • Reference • Event sourcing, CQRS, stream processing and Apache Kafka • https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/
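A minimal in-memory sketch of the command/query split may help here. The class and method names are hypothetical and are not taken from Monasca; the list stands in for a log such as a Kafka topic, and the dictionary stands in for a read store optimized for one query ("latest value per metric").

```python
class CommandSide:
    """Accepts writes and appends them to an ordered log; answers no queries."""

    def __init__(self):
        self.log = []

    def record_measurement(self, name, value):
        self.log.append({"name": name, "value": value})


class QuerySide:
    """Builds a read-optimized view from the log; never changes the log."""

    def __init__(self):
        self.latest = {}
        self.position = 0  # index of the next log entry to apply

    def catch_up(self, log):
        for event in log[self.position:]:
            self.latest[event["name"]] = event["value"]
        self.position = len(log)

    def latest_value(self, name):
        return self.latest.get(name)


commands = CommandSide()
queries = QuerySide()
commands.record_measurement("cpu.user_perc", 10.0)
commands.record_measurement("cpu.user_perc", 20.0)
queries.catch_up(commands.log)
```

Because the two sides only share the log, the write path and the read path can be scaled and stored differently.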
  • 11. Microservices • Microservices are small, autonomous, decoupled services that are deployed independently and work together as a single application • Communication between services occurs via a network • Services need to be able to change independently of each other, and be deployed by themselves without requiring consumers to change • Benefits: • Resilience • Scale • Ease of deployment • Organizational Alignment • Optimized for Change/Replaceability
  • 14. Deployment Models (HA/Scale) • Many ways to deploy Monasca • Typically deployed in a clustered/HA configuration using three or more nodes • If any node or microservice fails, the cluster remains operational • Partitions in Kafka are redistributed among the remaining components • Preferably, the database is run on a separate layer from the other components/microservices • Note, Monasca can also be deployed on a single node in a non-clustered configuration • Has also been containerized and run in Kubernetes
  • 15. Metrics Model POST /v2.0/metrics { "name": "http_status", "dimensions": { "url": "http://host.domain.com:1234/service", "cluster": "c1", "control_plane": "ccp", "service": "compute" }, "timestamp": 0, /* milliseconds */ "value": 1.0, "value_meta": { "status_code": 500, "msg": "Internal server error" } } • Simple, concise, multi-dimensional flexible description • Name (string) • Dimensions: Dictionary of user-defined (key, value) pairs that are used to uniquely identify a metric • Value meta: Optional dictionary of user-defined (key, value) pairs that can be used to describe a measurement • Normally used for errors and messages
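As an illustration of the metrics model, the request body from the slide can be assembled in Python as follows. The helper name `build_metric` is hypothetical, and the sketch only builds and serializes the body; it does not contact a Monasca API or handle authentication.

```python
import json
import time


def build_metric(name, value, dimensions, value_meta=None, timestamp_ms=None):
    """Build one metric in the shape shown on the Metrics Model slide."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # the slide gives milliseconds
    metric = {
        "name": name,
        "dimensions": dict(dimensions),
        "timestamp": timestamp_ms,
        "value": float(value),
    }
    if value_meta:
        metric["value_meta"] = dict(value_meta)  # optional, e.g. error details
    return metric


metric = build_metric(
    "http_status",
    1.0,
    {
        "url": "http://host.domain.com:1234/service",
        "cluster": "c1",
        "control_plane": "ccp",
        "service": "compute",
    },
    value_meta={"status_code": 500, "msg": "Internal server error"},
    timestamp_ms=0,
)
body = json.dumps(metric)  # would be sent as the body of POST /v2.0/metrics
```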
  • 16. Push vs Pull • Monitoring-as-a-Service • Can't always pull due to firewalls and network issues • Low-latency: sub-second latency difficult for pull model • Doesn't require service discovery and registration • As entities are deployed, they can start sending metrics without having to be discovered or registered • Events • Temporary caching/buffering of metrics/events while service unreachable.
  • 17. Monasca API • Primary point for pushing metrics and handling queries • Authenticates all requests against the Keystone identity service • Note, auth tokens are cached to reduce the load on Keystone • Resources: Metrics, Alarm Definitions, Alarms and Notification Methods • API Specification: • https://github.com/openstack/monasca-api/tree/master/docs • Horizontally scalable • Publishes metrics to Kafka • Queries timeseries DB for measurements and statistics • Queries Config DB for alarms, alarm definitions and notification methods
  • 18. Persister • Consumes both metrics and alarm state transition events from Kafka • Stores temporarily in-memory and does batch writes to the TSDB, based on batch size or time, to optimize write performance • At-least-once message delivery semantics: • No metrics or alarm state transition events are lost • The Kafka consumer offset for each batch is only updated after successfully storing the metric or alarm state transition event • Note, duplicates are possible • HA/fault-tolerance: • Multiple persisters run simultaneously and balance load • If a persister fails, the load is automatically re-balanced across the remaining persisters.
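The at-least-once behaviour of the Persister comes from the ordering "write the batch first, advance the offset second". The following sketch shows that ordering without a Kafka client or a real database. `BatchingPersister` and `write_batch` are hypothetical names, and the time-based flush mentioned on the slide is left out for brevity.

```python
class BatchingPersister:
    """Buffers records and advances the offset only after a successful write."""

    def __init__(self, write_batch, batch_size=3):
        self.write_batch = write_batch  # hypothetical callable storing a list of records
        self.batch_size = batch_size
        self.pending = []               # (offset, record) pairs not yet stored
        self.committed_offset = -1      # last offset known to be stored

    def consume(self, offset, record):
        self.pending.append((offset, record))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # If this raises, the offset is not advanced, so the same records are
        # read again after a restart: nothing is lost, duplicates are possible.
        self.write_batch([record for _, record in self.pending])
        self.committed_offset = self.pending[-1][0]
        self.pending = []


stored = []
persister = BatchingPersister(stored.extend, batch_size=2)
persister.consume(0, "metric-a")
offset_after_first = persister.committed_offset   # still -1, nothing written yet
persister.consume(1, "metric-b")
offset_after_second = persister.committed_offset  # 1, the batch was written
```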
  • 19. Time Series Databases • Used for storing: • Metrics • Alarm state history • Two databases supported: 1. Vertica • Enterprise class, proprietary, closed-source, clustered, HA, analytics database • Excels at time-series 2. InfluxDB • Open-source single-node time-series DB • Clustering is closed-source • Note, can replicate to multiple instances of InfluxDB using Kafka • Investigating support for additional databases
  • 20. Config Database • Stores all "transactional" data for Monasca such as • Alarm Definitions • Alarms • Notification Methods • MySQL and Postgres supported • Typically deployed in a clustered or HA configuration
  • 21. Threshold Engine • Near real-time stream processing, clustered and highly available threshold engine • Based on Apache Storm • Consumes metrics from Kafka • Creates alarms based on metrics that match patterns specified in the alarm definition • Evaluates whether metrics exceed threshold • Publishes alarm state transition events to Kafka • Supports both simple and compound alarm expressions
  • 22. Notification Engine • Consumes "alarm state transition events" from Kafka produced by the Threshold Engine • Evaluates whether notifications should be sent based on actions specified in the alarm definition. • OK, ALARM and UNDETERMINED actions • Supports email, PagerDuty, webhooks, HipChat, Slack and JIRA • Dynamic plugins supported • Supports both "one-shot" and "periodic" notifications • If sending to the notification address fails, then notification is published to retry topic in Kafka, and retried later • Grouping notifications: In progress
  • 23. Kafka Message Schema • JSON messages published/consumed to/from Kafka by Monasca micro-services • Well-defined schema is published at: • https://wiki.openstack.org/wiki/Monasca/Message_Schema
  • 24. Metrics Create, query and get statistics for metrics • GET, POST /v2.0/metrics • GET /v2.0/metrics/names • Returns the unique metric names • GET /v2.0/metrics/dimensions/names • Returns the unique dimension names • GET /v2.0/metrics/dimensions/names/values • Returns the unique dimension values
  • 25. Measurements GET /v2.0/metrics/measurements • Returns a list of measurements • Query parameters • Name and dimensions to filter by • Start_time and end_time • Offset and limit • merge_metrics: allow multiple metrics to be combined into a single list of measurements. • group_by: list of columns to group the metrics to be returned. Allows multiple unique metrics to be returned in a single query.
  • 26. Statistics GET /v2.0/metrics/statistics • Query parameters • Name and dimensions to filter by • Start_time and end_time • Statistics: avg, min, max, sum and count • Period: The time period to aggregate measurements by • Offset, limit • merge_metrics: allow multiple metrics to be combined into a single list of statistics • group_by: list of columns to group the metrics to be returned. Allows multiple unique metrics to be returned in a single query.
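To make the `period` and `statistics` parameters concrete, here is a small local sketch of period-based aggregation. It is not Monasca's implementation: the function name is hypothetical, timestamps are plain seconds, and aligning buckets to multiples of the period is an assumption of this example.

```python
def aggregate(measurements, period_s, statistic):
    """Aggregate (timestamp_seconds, value) pairs into fixed-size periods."""
    functions = {
        "avg": lambda values: sum(values) / len(values),
        "min": min,
        "max": max,
        "sum": sum,
        "count": len,
    }
    buckets = {}
    for timestamp, value in measurements:
        start = timestamp - (timestamp % period_s)  # start of the enclosing period
        buckets.setdefault(start, []).append(value)
    return [(start, functions[statistic](values))
            for start, values in sorted(buckets.items())]


measurements = [(0, 1.0), (30, 3.0), (60, 5.0)]
averages = aggregate(measurements, 60, "avg")
counts = aggregate(measurements, 60, "count")
```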
  • 27. Metrics Names GET /v2.0/metrics/names • Returns a list of the unique metric names • Query parameters • Dimensions • Offset, limit
  • 28. Metric Dimension Names GET /v2.0/metrics/dimensions/names • List the dimension names • Query parameters • Metric name • Offset, limit
  • 29. Metric Dimension Values GET /v2.0/metrics/dimensions/names/values • List the dimension values • Query parameters • Metric name • Dimension name • Offset, limit
  • 30. Alarm Definitions POST, GET /v2.0/alarm-definitions • Alarm definitions are templates that are used to automatically and dynamically create alarms based on matching metric names and dimensions • One alarm definition can result in zero or more alarms. • Simple grammar for creating compound alarm expressions: • avg(cpu.user_perc{}) > 85 or avg(disk.read_ops{device=vda}, 120) > 1000 • Alarm states (OK, ALARM and UNDETERMINED) • Actions associated with alarms for state transitions • User assigned severity (LOW, MEDIUM, HIGH, CRITICAL) • Thresholds can be dynamically adjusted via PATCH • Minimal lifecycle management, alarm_lifecycle_state and link.
  • 31. List Alarms GET /v2.0/alarms Query parameters: • metric_name - Name of metric to filter by • metric_dimensions • State: OK, ALARM or UNDETERMINED. • Severity: One or more severities to filter by, separated with |, ex. severity=LOW|MEDIUM • state_updated_start_time : The start time in ISO 8601 combined date and time format in UTC. • Offset, limit • sort_by
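A query for the List Alarms resource can be assembled with the standard library as sketched below. The base URL is a placeholder, the helper name is hypothetical, and the lower-case `state` parameter name is an assumption (the slide only shows `severity=LOW|MEDIUM` literally). No request is sent.

```python
from urllib.parse import urlencode


def list_alarms_url(base_url, **filters):
    """Build a GET /v2.0/alarms URL, skipping filters that are None."""
    query = urlencode({key: value for key, value in filters.items()
                       if value is not None})
    return f"{base_url}/v2.0/alarms" + (f"?{query}" if query else "")


url = list_alarms_url(
    "http://monasca.example",   # placeholder endpoint
    state="ALARM",
    severity="LOW|MEDIUM",      # several severities separated with |
    metric_name=None,           # omitted from the query string
)
```

Note that `urlencode` percent-encodes the `|` separator as `%7C`, which is the standard encoding for that character in a query string.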
  • 32. Alarms GET, PUT, PATCH, DELETE /v2.0/alarms/{alarm-id} • Alarms are created by the Threshold Engine based on matching alarm definitions. • When new nodes or components are deployed, alarms are automatically created • Alarms are resources within Monasca. They have a resource ID and lifecycle. • By default, three states: OK, ALARM and UNDETERMINED • UNDETERMINED state occurs when metrics are no longer being received • Deterministic alarms, two states: OK and ALARM • Used for systems where metrics are sporadic, e.g. metrics are created when errors occur in log files and no metrics are sent when there are no errors.
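The difference between default and deterministic alarms can be sketched as a single evaluation function. This is an illustration only, not the Threshold Engine's logic: the function is hypothetical, it handles just an `avg(...) > threshold` expression, and treating a deterministic alarm with no measurements as OK is an assumption of this sketch.

```python
def evaluate(values, threshold, deterministic=False):
    """Return the alarm state for avg(values) > threshold."""
    if not values:
        # No measurements arrived in the evaluation window.
        return "OK" if deterministic else "UNDETERMINED"
    average = sum(values) / len(values)
    return "ALARM" if average > threshold else "OK"


cpu_state = evaluate([90.0, 95.0], 85)                   # average above threshold
silent_state = evaluate([], 85)                          # metrics stopped arriving
log_error_state = evaluate([], 0, deterministic=True)    # sporadic metric, none seen
```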
  • 33. Alarm Counts GET /v2.0/alarms/count • Query the total number of alarms in the OK, ALARM or UNDETERMINED state, and their severities, grouped by metrics dimension, such as OpenStack service, state and severity. • Used for summary dashboards
  • 35. Alarm History GET /v2.0/alarms/state-history • Lists the alarm state history for alarms • Query Parameters: • Dimensions to filter on • Start/end timestamp • Offset, limit GET /v2.0/alarms/{alarm-id}/state-history • Lists the alarm state history for a specific alarm
  • 36. Notification Methods POST, GET, DELETE /v2.0/notification-methods Notification methods are associated with Actions in alarm definitions. Example: POST /v2.0/notification-methods { "name":"Name of notification method", "type":"EMAIL", "address":"john.doe@hp.com" }
  • 37. Monasca Agent • System metrics (cpu, memory, network, filesystem, …) • Service metrics • MySQL, Kafka, and many others • Application metrics • Built-in Statsd daemon • Python monasca-statsd library: Adds support for dimensions • VM system metrics • Open vSwitch metrics • Active checks • HTTP status checks and response times • System up/down checks (ping and ssh) • Runs any Nagios plugin or check_mk • Extensible/Pluggable: Additional services can be easily added
  • 38. Agent details • The Agent Forwarder buffers metrics for a short time to increase the size of the HTTP request body (number of metrics) sent to the Monasca API. • The Agent requests an auth token from the Keystone Identity service, which is supplied on all requests. • The Monasca Agent and API cache auth tokens in-memory to reduce the round-trip authorization requests to Keystone • If network connectivity between the Agent and API is lost, the Agent will buffer metrics and send them when connectivity is restored • Metrics are submitted using an “agent” role, which only allows metrics to be POST’d to the metrics endpoint
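The buffering behaviour of the Agent Forwarder can be sketched as follows. The class is hypothetical and does not reflect the monasca-agent code: `send` stands in for the HTTP POST to the Monasca API, and the bounded buffer that discards the oldest metrics when full is an assumption of this example.

```python
from collections import deque


class Forwarder:
    """Buffers metrics and keeps them until a send succeeds."""

    def __init__(self, send, max_buffered=1000):
        self.send = send  # hypothetical callable; raises ConnectionError if unreachable
        self.buffer = deque(maxlen=max_buffered)  # oldest entries dropped when full

    def submit(self, metric):
        self.buffer.append(metric)

    def flush(self):
        """Try to send everything buffered; return True on success."""
        if not self.buffer:
            return True
        batch = list(self.buffer)
        try:
            self.send(batch)
        except ConnectionError:
            return False  # keep the metrics and retry on the next flush
        self.buffer.clear()
        return True


sent_batches = []
network = {"up": False}


def fake_send(batch):
    if not network["up"]:
        raise ConnectionError("API unreachable")
    sent_batches.append(batch)


forwarder = Forwarder(fake_send)
forwarder.submit({"name": "cpu.user_perc", "value": 1.0})
first_attempt = forwarder.flush()    # fails, metric stays buffered
network["up"] = True
forwarder.submit({"name": "cpu.user_perc", "value": 2.0})
second_attempt = forwarder.flush()   # both metrics go out in one request
```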
  • 39. Grafana/Monasca Integration • Datasource: A datasource that can be added to the Grafana dashboard to enable Monasca • https://github.com/openstack/monasca-grafana-datasource • Keystone authentication • https://github.com/twc-openstack/grafana • Support for Alerting will be added in Grafana 4.
  • 42. Logging API • POST /v3.0/logs • Batch log messages in a single HTTP request • Global / local / mixed dimensions • Similar to dimensions in metrics. • JSON only • Specification • https://github.com/openstack/monasca-log-api/blob/master/docs/monasca-log-api-spec.md • Queries are not done via the API, but via a tenantized version of Kibana • https://github.com/FujitsuEnablingSoftwareTechnologyGmbH/fts-keystone
  • 43. Log Model • { "dimensions": { "hostname":"devstack", "service":"monitoring", "component":"monasca-api" }, "logs":[ { "message":"msg1", "dimensions": { "service":"compute", "component":"nova-api", "path":"/var/log/mysql.log" } }, { "message":"msg2", "dimensions": { "path":"/var/log/monasca/monasca-api.log" } } ] }
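The log model above mixes global dimensions (top level) with local dimensions (per message). The sketch below shows one way the effective dimensions of each message could be resolved. The `resolve_log_dimensions` helper is hypothetical, and the rule that a local dimension overrides a global one with the same key is an assumption to be verified against the Log API specification.

```python
from typing import Dict, List


def resolve_log_dimensions(payload: dict) -> List[Dict[str, str]]:
    """Merge global and local dimensions per log entry (local wins, by assumption)."""
    global_dims = payload.get("dimensions", {})
    return [
        {**global_dims, **entry.get("dimensions", {})}
        for entry in payload["logs"]
    ]
```

Applied to the payload on the slide, "msg1" would keep the global `hostname` but carry `service: compute` and `component: nova-api`, while "msg2" would inherit all three global dimensions and add only its own `path`.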
  • 44. Log Agents • Logstash • https://github.com/logstash-plugins/logstash-output-monasca_log_api/pull/1 • Beaver • https://github.com/python-beaver/python-beaver/pull/406 • Logspout: Under Investigation
  • 45. Kibana Integration • Keystone authentication support for Kibana • Authentication plugin: • https://github.com/FujitsuEnablingSoftwareTechnologyGmbH/fts-keystone • Note: In the process of moving to an official OpenStack repo
  • 48. Monasca Transform • A new micro-service in Monasca that aggregates and transforms metrics. • Currently based on Apache Spark Streaming. • Use Cases: • Object Storage Disk Capacity • Object Storage Capacity • Compute Host Capacity • VM Capacity • More to come • Metrics are aggregated and published every hour. • Currently in deployment in HPE Helion OpenStack 4.0. • OpenStack project/repo • https://github.com/openstack/monasca-transform
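The hourly aggregation mentioned above can be illustrated in plain Python. This is not how Monasca Transform is implemented (the slide says it is based on Apache Spark Streaming); the `aggregate_hourly` helper, the `(epoch_seconds, value)` input shape, and the choice of a sum as the aggregate are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_hourly(measurements: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Sum (epoch_seconds, value) pairs into buckets keyed by the start of each hour."""
    buckets: Dict[int, float] = defaultdict(float)
    for timestamp, value in measurements:
        hour_start = timestamp - timestamp % 3600  # truncate to the hour
        buckets[hour_start] += value
    return dict(buckets)
```

Each key in the result is the first second of an hour, so one aggregated value per hour could then be published as a new metric.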
  • 49. Monasca Analytics • A framework that adds data science tools (parsers, algorithms, etc). • Features include: • Algorithmic flow definition, enabling sharing of complex algorithmic recipes • Thin orchestration layer that instantiates an execution environment. • Focused on: • Anomaly detection • Reducing alert fatigue via alarm clustering (unsupervised machine learning). • Example algorithms: One Class SVM and LiNGAM. • Status: Under Development • OpenStack project/repo • https://github.com/openstack/monasca-analytics
  • 50. Distributions & Deployments • Charter Communications: • Monasca and Grafana are currently deployed in a production private cloud • Monitoring-as-a-Service use cases supported with Grafana as the visualization dashboard • 2 datacenters, 600-700 compute nodes, 1000 VMs, 11,000 metrics/sec • FIWARE Lab: • http://superuser.openstack.org/articles/monitoring-a-multi-region-cloud-based-on-openstack/ • Hewlett Packard Enterprise: Cloud System, Helion OpenStack • Supported and tested up to 65K metrics/sec ingest rates. • Fujitsu: • FUJITSU Software ServerView Cloud Monitoring Manager. • NEC: • Planning to include Monasca in "Cloud Solution Menus" solution. • Others
  • 51. Statistics: Mitaka/Newton Release • Organizations: 31 • Contributors: 97 • Commits: 1075 • Reviews: 4080 • Lines of code: 215,370
  • 52. Ecosystem • Hewlett Packard Enterprise • Fujitsu • Charter Communications • NEC • Cisco • Cloudbase Solutions • SUSE • SolidFire • SAP • Cray Inc. • FIWARE Lab • Mirantis • Broadcom
  • 53. Containers and Kubernetes • New Monasca Agent Plugins • Docker plugin • cAdvisor plugin • Kubernetes plugin: Monitors both Kubernetes control plane and containers • Prometheus client plugin: Scrapes apps • Mesos plugin • Containerization of Monasca • Heapster Monasca data sink
  • 54. Next Steps • Containerizing Monasca • Monitoring containers and container managers, such as Kubernetes • Grouping notifications