Our Drupal 8 websites are true applications, often very complex ones.
More and more workload is being delegated to external systems, usually microservices, that are used for many different tasks.
Software architectures are becoming more distributed and fragmented.
To track down problems and optimize for performance, it will become mandatory to trace the lifecycle of a single request as it originates from a client, passes through all Drupal subsystems, reaches external (micro)services and comes back.
This is often time-consuming and, without the right tools, can become very difficult.
A simple, unstructured log stream isn't enough anymore; we need to find a way to observe the details of what is going on.
Observability is what it’s all about. This is based on structured logs, metrics and traces. In this talk you will see how to implement these techniques in Drupal, which tools and which modules to use to trace and log all requests that reach our website and how to expose and display useful metrics.
We will integrate Drupal with OpenTracing, Prometheus, Monolog, Grafana and many more.
5. Almost everyone is working with distributed systems.
There are microservices, containers, cloud, serverless,
and a lot of combinations of these technologies. All of
these increase the number of failures that systems will
have because there are too many parts interacting.
And because distributed systems are so diverse, it’s
hard to understand present problems and predict
future ones.
6. "observability is a measure of how well
internal states of a system can be
inferred from knowledge of its external
outputs"
7. We want to observe production environments, and
generic metrics like CPU and memory usage are not
sufficient anymore
8. 3 PILLARS OF OBSERVABILITY
1. Structured logs
2. Metrics
3. (Distributed) Traces
18. Then in settings.php we add monolog.services.yml to
the list of container yamls
settings.php
$settings['container_yamls'][] =
DRUPAL_ROOT . '/sites/default/monolog.services.yml';
20. Structured logs make it simple to query them for any
sort of useful information
We can write custom Monolog processors to add
application-specific data to our logs
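A minimal sketch of such a processor, assuming Monolog 1.x/2.x, where a processor is any callable that receives the log record as an array and returns it. The class name AppContextProcessor and the site/environment fields are hypothetical and not taken from the O11y module.

```php
<?php

// Hypothetical processor: adds application-specific fields to every record.
// Assumes Monolog 1.x/2.x array records (Monolog 3 passes a LogRecord object).
final class AppContextProcessor
{
    private $context;

    public function __construct(array $context)
    {
        $this->context = $context;
    }

    public function __invoke(array $record): array
    {
        // Monolog reserves the 'extra' key for data added by processors.
        $record['extra'] = array_merge($record['extra'] ?? [], $this->context);
        return $record;
    }
}

// Stand-alone demonstration with a hand-built record (no Monolog required).
$processor = new AppContextProcessor(['site' => 'shop', 'environment' => 'prod']);
$record = $processor([
    'message' => 'Order created',
    'level_name' => 'INFO',
    'context' => ['order_id' => 42],
    'extra' => [],
]);

// Emitting the record as JSON is what makes it easy to query later.
echo json_encode($record), PHP_EOL;
```

Once every record carries the same keys, a log backend can filter on them (for example, all records where extra.environment is "prod") instead of pattern-matching free text.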
22. 1. Logs are about storing specific events
2. Metrics are a measurement at a point in time for the
system
23. Examples of the sort of metrics you might have would
be:
the number of times you received HTTP requests
how much time was spent handling requests
how many requests are currently in progress
the number of errors that occurred
24. To instrument our application and record real-time
metrics we will use Prometheus (prometheus.io)
25. Prometheus was the second project to join the Cloud
Native Computing Foundation after Kubernetes
27. To gather information from our production
environment we need two things:
instrument our application
extract data from the system
29. We start by writing a simple module that implements
observability in Drupal 8:
https://www.drupal.org/sandbox/lussoluca/3054802
30. The O11y module uses the
jimdo/prometheus_client_php library to implement a
Prometheus client (so you have to install it using
Composer)
The APC storage loses all metrics when the server is
restarted; use the Redis storage to avoid this
$registry =
  new Prometheus\CollectorRegistry(
    new Prometheus\Storage\APC()
  );
31. The client library supports three Prometheus metric types:
1. Counters (represent a single monotonically
increasing counter)
$registry
  ->getOrRegisterCounter(
    $namespace,
    $name,
    $help,
    $labels
  );
32. 2. Gauges (represent a single numerical value that
can arbitrarily go up and down)
$registry
  ->getOrRegisterGauge(
    $namespace,
    $name,
    $help,
    $labels
  );
33. 3. Histograms (sample observations and count them
in buckets)
$registry
  ->getOrRegisterHistogram(
    $namespace,
    $name,
    $help,
    $labels
  );
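To make the difference between the three types concrete, here is a library-free sketch of what each one stores. It deliberately does not call the Prometheus client; the variable names, bucket boundaries and sample durations are illustrative assumptions. Histogram buckets are cumulative: an observation is counted in every bucket whose upper bound is greater than or equal to it.

```php
<?php

// Counter: only ever goes up (total requests handled).
$requestsTotal = 0;
// Gauge: can go up and down (requests currently in progress).
$requestsInProgress = 0;
// Histogram: cumulative bucket counts per upper bound, plus sum and count.
$bounds = [0.1, 0.5, 1.0, INF];
$bucketCounts = [0, 0, 0, 0];
$sum = 0.0;
$count = 0;

// Simulate three requests with made-up durations in seconds.
foreach ([0.05, 0.3, 2.0] as $seconds) {
    $requestsInProgress++;   // request starts
    $requestsTotal++;

    foreach ($bounds as $i => $bound) {
        // Cumulative: count the observation in every bucket it fits into.
        if ($seconds <= $bound) {
            $bucketCounts[$i]++;
        }
    }
    $sum += $seconds;
    $count++;

    $requestsInProgress--;   // request ends
}

printf("requests_total=%d in_progress=%d\n", $requestsTotal, $requestsInProgress);
printf("buckets=%s sum=%.2f count=%d\n", implode(',', $bucketCounts), $sum, $count);
```

The counter answers "how many requests so far", the gauge answers "how many right now", and the histogram answers "how were the durations distributed".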
35. The module exposes a URL with metrics in
Prometheus format (/metrics)
# HELP drupal_entity_insert Insert a new entity
# TYPE drupal_entity_insert counter
drupal_entity_insert{type="comment",bundle="comment"} 2
drupal_entity_insert{type="node",bundle="article"} 1
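The text format above is simple enough to produce by hand. The following sketch renders one metric family from an array of samples; it is an illustration of the format only, not the O11y module's or the client library's code, and the function name renderMetric and its parameters are hypothetical.

```php
<?php

// Renders one metric family in the Prometheus text exposition format.
// $samples is a list of [labels, value] pairs; all names are illustrative.
function renderMetric(string $name, string $type, string $help, array $samples): string
{
    $lines = ["# HELP $name $help", "# TYPE $name $type"];

    foreach ($samples as [$labels, $value]) {
        $pairs = [];
        foreach ($labels as $label => $labelValue) {
            // Label values are double-quoted; escape backslash, quote, newline.
            $escaped = str_replace(
                ["\\", "\"", "\n"],
                ["\\\\", "\\\"", "\\n"],
                $labelValue
            );
            $pairs[] = $label . '="' . $escaped . '"';
        }
        $lines[] = $name . '{' . implode(',', $pairs) . '} ' . $value;
    }

    return implode("\n", $lines) . "\n";
}

$output = renderMetric('drupal_entity_insert', 'counter', 'Insert a new entity', [
    [['type' => 'comment', 'bundle' => 'comment'], 2],
    [['type' => 'node', 'bundle' => 'article'], 1],
]);

echo $output;
```

In practice the client library takes care of this rendering; the sketch only shows that each sample is a metric name, a set of labels and a value.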
45. Grafana uses PromQL, the Prometheus query language, to
let the user select and aggregate time series data in real time
46. # PHP requests over the last 5 minutes
increase(php_request[5m])
# Entities of bundle "page" created over the last 5 minutes
increase(drupal_entity_insert{bundle="page"}[5m])
# Average PHP memory peak over the last 5 minutes
avg_over_time(php_memory_peak[5m])
# CPU usage (per-second rate of user-mode CPU time)
rate(node_cpu_seconds_total{mode="user"}[5m])
48. Now that we have Grafana up and running, we can also
use it to view logs
Loki is a new project from Grafana Labs, inspired by
Prometheus, that scrapes and aggregates logs
51. 1. Logs are about storing specific events
2. Metrics are a measurement at a point in time for the
system
3. Distributed traces deal with information that is
request-scoped
53. Per-process logging and metric monitoring have their
place, but neither can reconstruct the elaborate
journeys that transactions take as they propagate
across a distributed system. Distributed traces are
these journeys
54. Take, for example, a Drupal Commerce 2 website
whose product prices come from a remote
microservice
66. One last thing we need is to correlate traces with logs,
so that when we find a problem with a request we can go
from the trace to the logs (and vice versa)
67. The O11y module provides a new processor for
Monolog that adds a trace_id field to every log record
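A minimal sketch of how such a processor might look; it is not the O11y module's actual code. It assumes Monolog 1.x/2.x array records, and the class name TraceIdProcessor, the provider callable and the sample trace ids are all hypothetical.

```php
<?php

// Hypothetical processor: stamps every record with the id of the active trace.
final class TraceIdProcessor
{
    private $traceIdProvider;

    // $traceIdProvider is any callable returning the current trace id;
    // in a real setup it would ask the tracer for the active span's trace.
    public function __construct(callable $traceIdProvider)
    {
        $this->traceIdProvider = $traceIdProvider;
    }

    public function __invoke(array $record): array
    {
        $record['extra']['trace_id'] = ($this->traceIdProvider)();
        return $record;
    }
}

// Stand-in for the tracer: the "active" trace id changes between requests.
$currentTraceId = 'trace-a';
$processor = new TraceIdProcessor(function () use (&$currentTraceId): string {
    return $currentTraceId;
});

$records = [];
$records[] = $processor(['message' => 'Loading product', 'extra' => []]);
$currentTraceId = 'trace-b';
$records[] = $processor(['message' => 'Price service timed out', 'extra' => []]);

// Correlation: given a trace id from the tracing UI, find its log records.
$matching = array_values(array_filter($records, function (array $r): bool {
    return $r['extra']['trace_id'] === 'trace-b';
}));

echo json_encode($matching), PHP_EOL;
```

The same field works in the other direction: a trace_id read from a suspicious log line can be pasted into the tracing UI to open the full request.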