At Elastic, we recently added OpenTelemetry support to most of our open source Elasticsearch clients. This talk tells the story of how we got there and what we learned along the way.
Elasticsearch clients exist in multiple languages (Java, .NET, PHP, Ruby, etc.), so we also created Semantic Conventions to make sure all Elasticsearch client instrumentations behave the same way.
Attendees will learn how to instrument existing libraries with OpenTelemetry, and how to interact with the community and collaborate on creating Semantic Conventions for specific technologies.
OSMC 2023 | Built-in OpenTelemetry support in Elasticsearch clients by Greg Kalapos
2. Greg Kalapos
● Has worked at Elastic since 2018
● Created the Elastic .NET APM Agent (completely open source)
● Now focuses on OpenTelemetry
https://twitter.com/gregkalapos
3. Demo
Out-of-the-box, built-in telemetry in Elasticsearch clients, based on OpenTelemetry
4. What is OpenTelemetry?
● CNCF project
● Collection of APIs, SDKs, and tools
● To instrument, generate, collect, and export telemetry data
● Metrics, logs, and traces
8. Anatomy of a modern application
public class HomeController
{
    public async Task<IActionResult> Index()
    {
        return View();
    }
}
9. Anatomy of a modern application
public class HomeController : Controller
{
    public async Task<IActionResult> Index()
    {
        return View();
    }
}
Frameworks and libraries
10. Anatomy of a modern application
public class HomeController : Controller
{
    ElasticsearchClient _client = new(CloudId, new ApiKey(ApiKey));

    public async Task<IActionResult> Index()
    {
        var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
        var doc = response.Source;
        return View(doc);
    }
}
Frameworks and libraries
11. Anatomy of a modern application
public class HomeController : Controller
{
    ElasticsearchClient _client = new(CloudId, new ApiKey(ApiKey));
    IProducer<string, string> producer = new ProducerBuilder<string, string>(config).Build();

    public async Task<IActionResult> Index()
    {
        var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
        var deliveryResult = await producer.ProduceAsync(topic, message);
        var doc = response.Source;
        return View(doc);
    }
}
Frameworks and libraries
12. Observability
Things we want to see
Traces
HTTP GET - Index 723 ms
  Elasticsearch - Get - my_index 360 ms
  Kafka - Send 280 ms
Logs / Events
[2023-11-06T15:23:40:152][INFO] HomeController - GET - /index - 200
[2023-11-06T15:23:40:167][INFO] ES Client - GET - my_index - 1
[2023-11-06T15:23:40:167][DEBUG] ES Client - Selected node MyNode-2
[2023-11-06T15:23:40:167][INFO] Kafka - Sending record to topic XYZ
[2023-11-06T15:23:40:167][WARN] Kafka - Failed sending, retrying …
Metrics
13. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
    var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
    var deliveryResult = await producer.ProduceAsync(topic, message);
    var doc = response.Source;
    return View(doc);
}
14. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
    using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
    {
        var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
        var deliveryResult = await producer.ProduceAsync(topic, message);
        var doc = response.Source;
        return View(doc);
    }
}
Traces
HTTP GET - Index 723 ms
15. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
    using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
    {
        // Declared outside the inner block so it is still in scope for View(doc).
        MyDoc? doc;
        using (var esSpan = tracer.StartActiveSpan("Elasticsearch Get my_index"))
        {
            var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
            doc = response.Source;
        }
        var deliveryResult = await producer.ProduceAsync(topic, message);
        return View(doc);
    }
}
Traces
HTTP GET - Index 723 ms
  Elasticsearch - Get - my_index 360 ms
16. How do we get the data?
Instrumentation!
public async Task<IActionResult> Index()
{
    using (var httpSpan = tracer.StartActiveSpan("HTTP GET Index"))
    {
        // Declared outside the inner block so it is still in scope for View(doc).
        MyDoc? doc;
        using (var esSpan = tracer.StartActiveSpan("Elasticsearch Get my_index"))
        {
            var response = await _client.GetAsync<MyDoc>(1, idx => idx.Index("my_index"));
            doc = response.Source;
        }
        using (var kafkaSpan = tracer.StartActiveSpan("Kafka Send"))
        {
            var deliveryResult = await producer.ProduceAsync(topic, message);
        }
        return View(doc);
    }
}
Traces
HTTP GET - Index 723 ms
  Elasticsearch - Get - my_index 360 ms
  Kafka - Send 280 ms
17. LET’S DO IT FOR ALL THE DEPENDENCIES …
Photo by Andrea Piacquadio - pexels.com
18. IMAGINE … YOU WOULD GET ALL OF THIS FOR FREE
30. Semantic Conventions
● Common names for different kinds of operations and data.
● Common naming scheme that can be standardized across a codebase, libraries, and platforms.
● https://opentelemetry.io/docs/concepts/semantic-conventions/
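To make the idea of common names concrete, here is a minimal sketch of a client span that carries convention-style attributes, written against .NET's built-in System.Diagnostics tracing API. The source name Example.ElasticsearchClient, the helper StartGetSpan, and the URL are hypothetical and exist only for this illustration. The attribute keys follow the Elasticsearch conventions as they looked around the time of the talk; treat them as assumptions, since later versions of the conventions may have renamed some of them.

```csharp
using System;
using System.Diagnostics;

public static class SemanticConventionsSketch
{
    // Hypothetical source name, chosen for this illustration only.
    public const string SourceName = "Example.ElasticsearchClient";
    private static readonly ActivitySource Source = new(SourceName);

    // Hypothetical helper: starts a client span for a "get" request.
    public static Activity? StartGetSpan(string index, string url)
    {
        // The span name is illustrative; it contains no document id.
        var activity = Source.StartActivity("get", ActivityKind.Client);
        if (activity is null) return null; // no listener is registered

        // Attribute keys are assumptions based on the 2023 conventions.
        activity.SetTag("db.system", "elasticsearch");
        activity.SetTag("db.operation", "get");
        activity.SetTag("http.request.method", "GET");
        activity.SetTag("url.full", url);
        activity.SetTag("db.elasticsearch.path_parts.index", index);
        return activity;
    }

    public static void Main()
    {
        // A listener stands in for an OpenTelemetry SDK and prints finished spans.
        using var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllData,
            ActivityStopped = a =>
            {
                Console.WriteLine(a.DisplayName);
                foreach (var tag in a.TagObjects)
                    Console.WriteLine($"  {tag.Key} = {tag.Value}");
            }
        };
        ActivitySource.AddActivityListener(listener);

        using (StartGetSpan("my_index", "https://localhost:9200/my_index/_doc/1")) { }
    }
}
```

Because every client language would emit the same keys, a backend can presumably query and group Elasticsearch spans without knowing which client produced them.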
33. Implementation in specific language clients
● Add a reference to the OpenTelemetry API
● Implement built-in tracing
34. Implementing built-in tracing
Challenges
● Think from the perspective of the users of the library
○ Don't over-engineer
● Be careful about span names
○ Keep their cardinality low
● Instrumentation always comes with overhead
○ Only instrument relevant code paths
○ Avoid code paths that are called excessively
● Do not collect sensitive data (by default)
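The points above can be sketched in a few lines of C#. This is an illustration under stated assumptions, not the Elasticsearch client's actual implementation: TinyClient, its Search and Execute methods, the CaptureRequestBody option, and the source name Example.TinyClient are all hypothetical, and the attribute keys are assumed from the 2023 conventions. Only the System.Diagnostics types are real.

```csharp
using System;
using System.Diagnostics;

// Hypothetical library class, used only to illustrate the guidelines.
public sealed class TinyClient
{
    public const string SourceName = "Example.TinyClient"; // assumed name
    private static readonly ActivitySource Source = new(SourceName);

    // Sensitive data is opt-in; it is never collected by default.
    public bool CaptureRequestBody { get; set; } = false;

    public string Search(string index, string queryJson)
    {
        // Low-cardinality span name: the operation only.
        // Variable parts such as the index go into attributes.
        using var activity = Source.StartActivity("search", ActivityKind.Client);

        // StartActivity returns null when nobody listens, so an
        // application without tracing skips the tagging work entirely.
        if (activity is { IsAllDataRequested: true })
        {
            activity.SetTag("db.operation", "search");
            activity.SetTag("db.elasticsearch.path_parts.index", index);
            if (CaptureRequestBody)
                activity.SetTag("db.statement", queryJson);
        }

        return Execute(index, queryJson);
    }

    // Stand-in for the real network call. Inner helpers like this one
    // are not given their own spans, to keep overhead low.
    private static string Execute(string index, string queryJson) => "{}";
}

public static class Program
{
    public static void Main()
    {
        var client = new TinyClient();
        Console.WriteLine(client.Search("my_index", "{\"query\":{\"match_all\":{}}}"));
    }
}
```

One span per logical request, with details in attributes rather than in the name, keeps both the overhead and the number of distinct span names small.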
37. Future of native instrumentation
● Realistically: there will always be a mix of native + external instrumentation
● OTel registry: https://opentelemetry.io/ecosystem/registry/