Logs/Metrics Gathering
With OpenShift EFK Stack
DevConf, Brno, January 27 2018
Josef Karásek, Software Engineer
Jan Wozniak, Software Engineer
@Pepe_CZ
ONE YEAR AGO
ADDITIONS TO THE TEAM
WE HAVE GROWN
● The project was officially added to Group 2 in the OpenShift organisation
● The Dev team grew in size:
○ Rich Megginson
○ Noriko Hosoi
○ Lukáš Vlček
○ Jeff Cantrill
○ Eric Wolinetz
○ Jan Wozniak
○ Josef Karásek
MAIN OBJECTIVES
WHAT WE WANT TO ACHIEVE
● Collecting Distributed Logs
● Common Data Model
● Security model - Multi-Tenancy
● Integration with Red Hat products and their upstream projects
● Scalability
● Enable “Big Data” Analysis
● All Open Source
Watch the talk on YouTube!
LOGGING SYSTEM - ABSTRACT COMPONENTS
[Diagram: on each host, guests, containers, services, and applications write to log files, the journal, Tlog, and syslog; a per-host collector gathers these and ships them through a load balancer into the logging system's data warehouse (cluster), which feeds visualization and monitoring.]
CURRENT OPENSHIFT LOGGING
[Diagram: in the OpenShift cluster, Fluentd on each node collects from journald and /var/log/containers/*.log (docker/CRI, OS, and audit logs) for project and openshift pods, optionally forwarding through Mux (Fluentd)*; data lands in the Elasticsearch cluster in the logging namespace, fronted by the ES service and a reencrypt route, and is consumed by Kibana (browser), Curator, Kopf, Prometheus, and ManageIQ.]
FLUENTD - COLLECTOR AND NORMALIZER
RUBY BASED LOG AGENT
● Configuration - Apache like, ruby based
● Scalable, secure: msgpack, secure_forward
● Hundreds of plugins
● Easy to write ruby plugins
● Kubernetes metadata plugin
● OpenStack reference architecture
● Use rsyslog via RELP plugin

<filter kubernetes.journal.container**>
  @type record_transformer
  enable_ruby
  <record>
    time ${Time.at((record["_SOURCE_REALTIME_TIMESTAMP"] || record["__REALTIME_TIMESTAMP"]).to_f / 1000000.0).utc.to_datetime.rfc3339(6)}
    ...
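The time expression in the filter converts journald's microsecond epoch timestamps to an RFC 3339 UTC string. A minimal sketch of the same conversion, written in Go for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// journald timestamps (__REALTIME_TIMESTAMP) are microseconds since
// the Unix epoch; the record_transformer filter above turns them into
// RFC 3339 UTC strings with microsecond precision. The same in Go:
func realtimeToRFC3339(usec int64) string {
	t := time.Unix(usec/1000000, (usec%1000000)*1000).UTC()
	return t.Format("2006-01-02T15:04:05.000000Z07:00")
}

func main() {
	// 1484689541000000 us = 2017-01-17 21:45:41 UTC,
	// the timestamp from the Elasticsearch document example
	fmt.Println(realtimeToRFC3339(1484689541000000))
}
```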
ELASTICSEARCH - DATA WAREHOUSE
WIDELY USED, JAVA BASED SEARCH ENGINE
● Based on Apache Lucene
● Great for full text log searching
● Very good for TSD (time series data)
● SearchGuard for security, authz
● OpenShift Elasticsearch plugin
● OpenStack, oVirt reference architecture
● Curator for log trimming

{
  "_id": "AVm4sS7SHNq31gLBPp4-",
  "_index": ".operations.2017.01.18",
  "_score": 1.0,
  "_source": {
    "@timestamp": "2017-01-17T21:45:41.000000-00:00",
    "Hostname": "os.rmeggins.test",
    "message": "Journal stopped",
    "systemd": {
      "t": {
        "PID": "109",
        ...
  },
  "_type": "com.redhat.viaq.common"
}
KIBANA - VISUALIZATION
Node.js Based - Tightly Coupled with Elasticsearch
ARCHITECTURE - LOGGING DETAIL
[Diagram: Fluentd enriches logs with K8s metadata from the OpenShift API and ships them to the Elasticsearch cluster (running the OpenShift ES plugin and SearchGuard plugin), exposed via an ES service/externalIP; the Kibana pod pairs the Kibana container with an auth proxy container, which adds token and userid headers after the browser authenticates against OpenShift OAuth; user project and role information comes from the OpenShift API.]
QUICKSTART - oc cluster up --logging
● Deploy OpenShift with oc cluster up
● Shutdown cluster
● Restart docker
● Bring cluster back up with existing configuration
There is currently a bug where pods cannot reach each other over the network, e.g. Fluentd cannot talk to Elasticsearch, unless docker is restarted while the cluster is down:
$ sudo oc cluster down
$ sudo systemctl restart docker
$ sudo oc cluster up --use-existing-config …
QUICKSTART - minishift start --logging
● Set up minishift [1]
● Start it with minishift start --logging

[1] https://github.com/MiniShift/minishift
ViaQ - LOGGING THE HARD WAY
● Follow directions on GitHub
● Uses openshift-ansible to set up an all-in-one cluster
● Configures logging for external access - similar to how oVirt uses
logging
● Extensible for more complex deployments
EXAMPLE ANSIBLE INVENTORY FILES
● deploy_cluster.yml playbook to deploy OpenShift and logging
● All-in-one inventory based on OpenShift Origin 3.7.1
# Make sure to set version and to install logging
[OSEv3:vars]
openshift_release=v3.7.1
openshift_logging_install_logging=true
openshift_image_tag=v3.7.1
openshift_logging_es_allow_external=true
TROUBLESHOOTING
● logging-dump.sh - an “sosreport” for logging [1],[2]
○ Contains pod logs, config
○ Look at the pod log files for errors
○ Good for attaching to bug reports
[1] https://github.com/openshift/origin-aggregated-logging/blob/master/hack/README-dump.md
[2] https://github.com/openshift/origin-aggregated-logging/blob/master/hack/logging-dump.sh
TROUBLESHOOTING
● Query Elasticsearch from command line - es_util

oc get pods | grep logging-es # get the pod name
espod=logging-es-.....
oc exec -c elasticsearch $espod -- es_util --query \
  "project.*/_search?sort=@timestamp:desc&q=<query>" \
  | python -mjson.tool | more

Where <query> could be something like level:error
Instead of project.* use .operations.* for system logs

● Get the list of indices

oc exec -c elasticsearch $espod -- indices
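The query string handed to es_util can also be assembled programmatically. A sketch in Go - note that url.Values percent-encodes ':' and '@', which the shell invocation above passes literally; the server decodes both forms:

```go
package main

import (
	"fmt"
	"net/url"
)

// esQuery builds the _search URL used with es_util above. Pass
// ".operations.*" instead of "project.*" as the pattern for system logs.
func esQuery(indexPattern, sortBy, query string) string {
	params := url.Values{}
	params.Set("sort", sortBy)
	params.Set("q", query)
	// Encode sorts keys alphabetically and percent-escapes values.
	return indexPattern + "/_search?" + params.Encode()
}

func main() {
	fmt.Println(esQuery("project.*", "@timestamp:desc", "level:error"))
}
```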
USING WITH oVirt
● oVirt uses Collectd to gather metrics and monitoring data
● Collectd writes to Fluentd using http input
● Fluentd also gathers oVirt engine logs
● Fluentd sends data to external Elasticsearch endpoint
● Logging is configured with ovirt-metrics-engine and
ovirt-logs-engine projects
● Links:
https://www.ovirt.org/blog/2017/12/ovirt-metrics-store/
https://www.ovirt.org/develop/release-management/features/metrics/metrics-store/
USING WITH OpenStack
● OpenStack can be configured with a Fluentd client
● OpenStack uses secure_forward to send logs to mux
● Upstream documentation is here[1]
● Downstream documentation is here[2]
[1] http://opstools-ansible.readthedocs.io/en/latest/tripleo_integration.html
[2] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/advanced_overcloud_customization/sect-monitoring_tools_configuration
LOGGING CUSTOM APPLICATION DATA
BEST PRACTICES
● Have a clear definition of fields in log messages
● Send logs to stdout
● Configure the application to output single-line JSON

{
  "hostname":"myhost.test",
  "level":"info",
  "message":"Server listening on 0.0.0.0:8080",
  "time":"2018-01-24T17:35:10+01:00"
}
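A minimal sketch of emitting such a record from Go with only the standard library (a logging framework such as logrus, shown on a later slide, does this for you):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logLine marshals a flat record as single-line JSON; json.Marshal
// never emits newlines, so each log record stays on one line.
func logLine(rec map[string]string) string {
	b, err := json.Marshal(rec)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Field values from the example record above.
	fmt.Println(logLine(map[string]string{
		"hostname": "myhost.test",
		"level":    "info",
		"message":  "Server listening on 0.0.0.0:8080",
		"time":     "2018-01-24T17:35:10+01:00",
	}))
}
```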
LOGGING CUSTOM APPLICATION DATA
BEST PRACTICES
● Or even:

{
  "application": {
    "accounts": {
      "hostname":"myhost.test",
      "level":"info",
      "message":"Server listening on 0.0.0.0:8080",
      "time":"2018-01-24T17:35:10+01:00"
    }
  }
}
LOGGING CUSTOM APPLICATION DATA
BEST PRACTICES
These things are easy...

import (
	"os"

	log "github.com/sirupsen/logrus"
)

func initLogger() *log.Entry {
	// JSONFormatter must be instantiated: &log.JSONFormatter{}
	log.SetFormatter(&log.JSONFormatter{})
	log.SetOutput(os.Stdout)
	return log.WithFields(log.Fields{
		"hostname": os.Getenv("HOSTNAME"),
	})
}
LOGGING CUSTOM APPLICATION DATA
JSON FORMATTED MESSAGE FIELD
Log line:

INFO[0000] 2018-01-24T17:35:10+01:00 message="{"level":"warn","message":"Function deprecated", "some_field":"some_value"}"

Becomes:

{
  "level":"warn",
  "some_field":"some_value",
  "message":"Function deprecated",
  ...
}
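What the pipeline does with such a line can be sketched as: parse the message field as JSON and promote the embedded fields into the record. An illustration in Go (a sketch only; in the real stack Fluentd's parsing does this, not application code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeJSONMessage parses a message field that holds single-line JSON
// and promotes its fields into the record; embedded fields override
// outer ones, matching the Becomes example above.
func mergeJSONMessage(record map[string]string, msg string) error {
	var embedded map[string]string
	if err := json.Unmarshal([]byte(msg), &embedded); err != nil {
		return err // not JSON: leave the record unchanged
	}
	for k, v := range embedded {
		record[k] = v
	}
	return nil
}

func main() {
	record := map[string]string{"level": "info"}
	msg := `{"level":"warn","message":"Function deprecated","some_field":"some_value"}`
	if err := mergeJSONMessage(record, msg); err != nil {
		panic(err)
	}
	fmt.Println(record["level"], record["message"], record["some_field"])
}
```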
LOGGING CUSTOM APPLICATION DATA
WORST PRACTICE
● Plain text messages
○ ...the default for most loggers
○ Searching such logs becomes a real CSI crime scene investigation

{
  "level":"info",
  "message":"ERROR[0000] 2018-01-24T17:35:10+01:00 NullPointerException in ...",
  ...
}
DEMO
FUTURE DIRECTIONS
● Support CRI log format - not docker json-file compatible
● Fluentd does not scale well - look for alternatives: rsyslog,
fluent-bit, Elastic Beats
● Fluentd RELP input - rsyslog to fluentd[1]
● More integration with Prometheus - fluentd metrics, other metrics
● Elasticsearch 5 (OpenShift 3.10), Elasticsearch 6 (OpenShift 3.11 or
later)
● Grafana - display metrics and log data on same dashboard -
aggregate from different sources
● Message Queue integration
[1] https://github.com/ViaQ/fluent-plugin-relp
ARCHITECTURE USING QUEUE
[Diagram: collectors on each host gather from log sources and publish raw records to a message queue; Mux normalizers consume the raw topic and publish normalized records back onto separate topics for raw and normalized data; Elasticsearch (cluster) with Kibana, “Big Data” analysis, archival, “tailing”, and monitoring all consume from the queue.]
WHERE TO FIND THE CODE?
SOURCE CODE & MAILING LIST
● OpenShift Aggregated Logging
○ https://github.com/openshift/origin-aggregated-logging
○ #openshift-dev FreeNode IRC
● ViaQ
○ https://github.com/ViaQ
○ #viaq FreeNode IRC
● CentOS OpsTools SIG
○ https://wiki.centos.org/SpecialInterestGroup/OpsTools
○ #centos-devel FreeNode IRC
○ centos-devel mailing list
Q & A
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
