At Elastic, we recently added OpenTelemetry support to most of our OpenSource Elasticsearch clients. This talk will tell the story on how we got there and what we learned along the way.
Elasticsearch clients exist in multiple languages (Java, .NET, PHP, Ruby, etc.), therefore we also created Semantic Conventions to make sure all Elasticsearch client instrumentations behave in the same way.
Attendees will learn about how to instrument existing libraries with OpenTelemetry and they will also learn how to interact with the community and collaborate on creating Semantic Conventions for specific technologies.
2. Greg Kalapos
● Works at Elastic since 2018
● Created the Elastic .NET APM Agent
(completely OpenSource)
● Now focuses on OpenTelemetry
https://twitter.com/gregkalapos
3. Demo
Out of the box, built-in telemetry in
Elasticsearch clients based on
OpenTelemetry
4. What is OpenTelemetry?
● CNCF Project
● Collection of APIs, SDKs, and tools.
● To instrument, generate, collect, and
export telemetry data
● Metrics, logs, and traces
8. Anatomy of a modern application
public class HomeController
{
public async Task<IActionResult> Index()
{
return View();
}
}
9. Anatomy of a modern application
public class HomeController : Controller
{
public async Task<IActionResult> Index()
{
return View();
}
}
Frameworks and libraries
10. public class HomeController : Controller
{
ElasticsearchClient _client = new(CloudId, new ApiKey(ApiKey));
public async Task<IActionResult> Index()
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
var doc = response.Source;
return View(doc);
}
}
Anatomy of a modern application
Frameworks and libraries
11. public class HomeController : Controller
{
ElasticsearchClient _client = new(CloudId, new ApiKey(ApiKey));
IProducer<string, string> producer = new ProducerBuilder<string, string>(config).Build();
public async Task<IActionResult> Index()
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
var deliveryResult = await producer.ProduceAsync(topic, message);
var doc = response.Source;
return View(doc);
}
}
Anatomy of a modern application
Frameworks and libraries
12. Observability
Things we want to see
HTTP GET - Index 723 ms
Elasticsearch - Get - my_index 360 ms
Kafka - Send 280 ms
Traces
Logs /
Events
[2023-11-06T15:23:40:152][INFO] HomeController - GET - /index - 200
[2023-11-06T15:23:40:167][INFO] ES Client - GET - my_index - 1
[2023-11-06T15:23:40:167][DEBUG] ES Client - Selected node MyNode-2
[2023-11-06T15:23:40:167][INFO] Kafka - Sending record to topic XYZ
[2023-11-06T15:23:40:167][WARN] Kafka - Failed sending, retrying …
Metrics
13. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
var deliveryResult = await producer.ProduceAsync(topic, message);
var doc = response.Source;
return View(doc);
}
14. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
var deliveryResult = await producer.ProduceAsync(topic, message);
var doc = response.Source;
return View(doc);
}
}
HTTP GET - Index 723 ms
Traces
15. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
{
using (var esSpan = tracer.StartActiveSpan("ElasticSearch Get my_index"))
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
}
var deliveryResult = await producer.ProduceAsync(topic, message);
var doc = response.Source;
return View(doc);
}
}
HTTP GET - Index 723 ms
Elasticsearch - Get - my_index 360 ms
Traces
16. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
{
using (var esSpan = tracer.StartActiveSpan("ElasticSearch Get my_index"))
{
var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
}
using (var kafkaSpan = tracer.StartActiveSpan("Kafka Send"))
{
var deliveryResult = await producer.ProduceAsync(topic, message);
}
var doc = response.Source;
return View(doc);
}
}
HTTP GET - /index 723 ms
Elasticsearch - Get - my_index 360 ms
Kafka - Send 280 ms
Traces
17. Photo by Andrea Piacquadio - pexels.com
LET’S DO IT FOR ALL THE
DEPENDENCIES …
18. … YOU WOULD GET ALL OF THIS FOR FREE
IMAGINE …
imgflip.com
30. Semantic Conventions
● Common names for different kinds of operations and data.
● Common naming scheme that can be standardized across a
codebase, libraries, and platforms.
● https://opentelemetry.io/docs/concepts/semantic-conventions/
33. Implementation in specific language clients
● Adding reference to the OpenTelemetry API
● Implement built-in tracing
34. Implementing built-in tracing
Challenges
● Think from the perspective of the users of the
library
○ Don't over-engineer
● Be careful about span names
○ Pay attention to low cardinality
● Instrumentation always comes with overhead
○ Only instrument relevant code-path
○ Avoid code paths that are being called
excessively
● Do not collect sensitive data (by default)
37. Future of native instrumentation
● Realistically: there will always be a mix of native + external
instrumentation
● OTel registry: https://opentelemetry.io/ecosystem/registry/