Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
The webinar was organized by GetinData on 2020. During the webinar we explaned the concept of monitoring and observability with focus on data analytics platforms.
Watch more here: https://www.youtube.com/watch?v=qSOlEN5XBQc
Whitepaper - Monitoring ang Observability for Data Platform: https://getindata.com/blog/white-paper-big-data-monitoring-observability-data-platform/
Speaker: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
The talk is focused on administration, development and monitoring platform with Apache Spark, Apache Flink and Kubeflow in which the monitoring stack is based on Prometheus stack.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
Prometheus is a next-generation monitoring system. Since being publicly announced last year it has seen wide-spread interest and adoption. This talk will look at the concepts behind monitoring with Prometheus, and how to use it with Kubernetes which has direct support for Prometheus.
MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema
This presentation introduces the concept of monitoring - focusing on why and how and finally on the tools to use. It introduces Prometheus (metrics gathering, processing, alerting), application instrumentation and Prometheus exporters and finally it introduces Grafana as a common companion for dashboarding, alerting and notifications. This presentations also introduces the handson workshop - for which materials are available from https://github.com/lucasjellema/monitoring-workshop-prometheus-grafana
Build cloud native solution using open source Nitesh Jadhav
Build cloud native solution using open source. I have tried to give a high level overview on How to build Cloud Native using CNCF graduated software's which are tested, proven and having many reference case studies and partner support for deployment
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
The talk is focused on administration, development and monitoring platform with Apache Spark, Apache Flink and Kubeflow in which the monitoring stack is based on Prometheus stack.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
Prometheus is a next-generation monitoring system. Since being publicly announced last year it has seen wide-spread interest and adoption. This talk will look at the concepts behind monitoring with Prometheus, and how to use it with Kubernetes which has direct support for Prometheus.
MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema
This presentation introduces the concept of monitoring - focusing on why and how and finally on the tools to use. It introduces Prometheus (metrics gathering, processing, alerting), application instrumentation and Prometheus exporters and finally it introduces Grafana as a common companion for dashboarding, alerting and notifications. This presentations also introduces the handson workshop - for which materials are available from https://github.com/lucasjellema/monitoring-workshop-prometheus-grafana
Build cloud native solution using open source Nitesh Jadhav
Build cloud native solution using open source. I have tried to give a high level overview on How to build Cloud Native using CNCF graduated software's which are tested, proven and having many reference case studies and partner support for deployment
Troubleshooting and Best Practices with WSO2 Enterprise IntegratorWSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
Monitoring Your AWS EKS Environment with DatadogDevOps.com
Join Datadog for a webinar on monitoring Kubernetes with a focus on Amazon EKS. You'll learn how to get the most out of Datadog's intuitive platform and EKS's unique capabilities, including:
How to monitor metrics, logs and traces from your EKS environment
How to test the usability of your environment with features such as adaptive Browser Tests and globally available Real User Monitoring
How to find and fix user-facing issues with synthetic monitoring features like adaptive Browser Tests and globally available Real User Monitoring
Troubleshooting and Best Practices with WSO2 Enterprise IntegratorWSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
Watch webinar here: https://wso2.com/library/webinars/2018/10/troubleshooting-and-best-practices-with-wso2-enterprise-integrator
Prometheus for Monitoring Metrics (Fermilab 2018)Brian Brazil
From its humble beginnings in 2012, the Prometheus monitoring system has grown a substantial community with a comprehensive set of integrations. This talk will give an overview of the core ideas behind Prometheus, its feature set and how it has grown to met the challenges of modern cloud-based systems.
Here is the PPT of our recently happened workshop. You can also watch on our youtube channel. here is the link -https://www.youtube.com/channel/UCeLma6SpNYH7jjYKSBNSexw
No production system is complete without a way to monitor it. In software, we define observability as the ability to understand how our system is performing. This talk dives into capabilities and tools that are recommended for implementing observability when running K8s in production as the main platform today for deploying and maintaining containers with cloud-native solutions.
We start by introducing the concept of observability in the context of distributed systems such as K8s and the difference with monitoring. We continue by reviewing the observability stack in K8s and the main functionalities. Finally, we will review the tools K8s provides for monitoring and logging, and get metrics from applications and infrastructure.
Between the points to be discussed we can highlight:
-Introducing the concept of observability
-Observability stack in K8s
-Tools and apps for implementing Kubernetes observability
-Integrating Prometheus with OpenMetrics
In this session, we will start with the importance of monitoring of services and infrastructure. We will discuss about Prometheus an opensource monitoring tool. We will discuss the architecture of Prometheus. We will also discuss some visualization tools which can be used over Prometheus. Then we will have a quick demo for Prometheus and Grafana.
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)Brian Brazil
A look at how Prometheus's instrumentation, data model, query language, manageability and reliability make it a next generation solution.
Video: https://www.youtube.com/watch?v=cwRmXqXKGtk
Contact us: prometheus@robustperception.io
Prometheus - Intro, CNCF, TSDB,PromQL,GrafanaSridhar Kumar N
https://www.youtube.com/playlist?list=PLAiEy9H6ItrKC5PbH7KiELiSEIKv3tuov
-What is Prometheus?
-Difference Between Nagios vs Prometheus
-Architecture
-Alertmanager
-Time series DB
-PromQL (Prometheus Query Language)
-Live Demo
-Grafana
How to Improve the Observability of Apache Cassandra and Kafka applications...Paul Brebner
As distributed cloud applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical.
Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works.
We’ll explore two complementary Open Source technologies:
Prometheus for monitoring application metrics, and
OpenTracing and Jaeger for distributed tracing.
We’ll discover how they improve the observability of
an Anomaly Detection application, deployed on AWS Kubernetes, and using Instaclustr managed Apache Cassandra and Kafka clusters.
OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike KlusikNETWAYS
Monitoring Kubernetes Clusters with Prometheus is state of the art. The difficulty is to find the significant metrics from the vast amount of available metrics. This talk shows a Monitoring Cockpit defined to get a quick overview of the cluster health and usage. It uses the Standard Metrics available for Kubernetes/OpenShift Clusters and their standard services. The monitoring solution is based on Prometheus, using InfluxDB for central long term storage and Grafana.
Monitoring at scale: Migrating to Prometheus at FastlyMarcus Barczak
Over the past 6 months at Fastly we've migrated away from our legacy monitoring systems and have deployed Prometheus as our primary system for infrastructure and application monitoring.
The Prometheus approach posed some unique challenges over traditional monitoring systems, whilst at the same time enabling us to easily scale our monitoring infrastructure alongside our global network growth.
It hasn't been completely smooth sailing and deploying Prometheus across a globe spanning network serving over 10% of the world's internet traffic has raised its fair share of challenges both technical and cultural.
In this presentation you will learn how we addressed these issues in ways that deviate slightly from conventional wisdom, the mistakes we made along the way, and how the new system has been received by our teams. We hope that our experiences can help you succeed in deploying Prometheus within your organization.
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
SoundCloud <3 Prometheus. However, when it comes down to hardware, getting data into Prometheus isn’t always straight-forward. In this talk, I will provide a look into how we managed to port all our infrastructure monitoring – including SNMP, IPMI and more – to Prometheus, and even improve it along the way.
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfGetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source ) Copilot?
How can we build one?
Architecture and evaluation
How do we work with customers on Big Data / ML / Analytics Projects using Scr...GetInData
How do we work with our customers ? How does it look? What do the meetings look like ? How do we structure the cooperation? Who does what and when ?
We receive these kinds of questions quite often. They are very important questions as the customer should know the details before we start the project and it’s important for GetInData to be transparent on this so the client is well informed.
During the webinar our Project Lead, Rafał Zalewski talked about Scrum Framework we use in cooperation with our customers.
Watch here:
https://www.youtube.com/watch?v=uOWrgcaKwWo&t=32s
Speaker: Rafał Zalewski, GetInData: https://www.linkedin.com/in/rafalzalewski/
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
More Related Content
Similar to Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Troubleshooting and Best Practices with WSO2 Enterprise IntegratorWSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
Monitoring Your AWS EKS Environment with DatadogDevOps.com
Join Datadog for a webinar on monitoring Kubernetes with a focus on Amazon EKS. You'll learn how to get the most out of Datadog's intuitive platform and EKS's unique capabilities, including:
How to monitor metrics, logs and traces from your EKS environment
How to test the usability of your environment with features such as adaptive Browser Tests and globally available Real User Monitoring
How to find and fix user-facing issues with synthetic monitoring features like adaptive Browser Tests and globally available Real User Monitoring
Troubleshooting and Best Practices with WSO2 Enterprise IntegratorWSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
Watch webinar here: https://wso2.com/library/webinars/2018/10/troubleshooting-and-best-practices-with-wso2-enterprise-integrator
Prometheus for Monitoring Metrics (Fermilab 2018)Brian Brazil
From its humble beginnings in 2012, the Prometheus monitoring system has grown a substantial community with a comprehensive set of integrations. This talk will give an overview of the core ideas behind Prometheus, its feature set and how it has grown to met the challenges of modern cloud-based systems.
Here is the PPT of our recently happened workshop. You can also watch on our youtube channel. here is the link -https://www.youtube.com/channel/UCeLma6SpNYH7jjYKSBNSexw
No production system is complete without a way to monitor it. In software, we define observability as the ability to understand how our system is performing. This talk dives into capabilities and tools that are recommended for implementing observability when running K8s in production as the main platform today for deploying and maintaining containers with cloud-native solutions.
We start by introducing the concept of observability in the context of distributed systems such as K8s and the difference with monitoring. We continue by reviewing the observability stack in K8s and the main functionalities. Finally, we will review the tools K8s provides for monitoring and logging, and get metrics from applications and infrastructure.
Between the points to be discussed we can highlight:
-Introducing the concept of observability
-Observability stack in K8s
-Tools and apps for implementing Kubernetes observability
-Integrating Prometheus with OpenMetrics
In this session, we will start with the importance of monitoring of services and infrastructure. We will discuss about Prometheus an opensource monitoring tool. We will discuss the architecture of Prometheus. We will also discuss some visualization tools which can be used over Prometheus. Then we will have a quick demo for Prometheus and Grafana.
Prometheus: A Next Generation Monitoring System (FOSDEM 2016)Brian Brazil
A look at how Prometheus's instrumentation, data model, query language, manageability and reliability make it a next generation solution.
Video: https://www.youtube.com/watch?v=cwRmXqXKGtk
Contact us: prometheus@robustperception.io
Prometheus - Intro, CNCF, TSDB,PromQL,GrafanaSridhar Kumar N
https://www.youtube.com/playlist?list=PLAiEy9H6ItrKC5PbH7KiELiSEIKv3tuov
-What is Prometheus?
-Difference Between Nagios vs Prometheus
-Architecture
-Alertmanager
-Time series DB
-PromQL (Prometheus Query Language)
-Live Demo
-Grafana
How to Improve the Observability of Apache Cassandra and Kafka applications...Paul Brebner
As distributed cloud applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical.
Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works.
We’ll explore two complementary Open Source technologies:
Prometheus for monitoring application metrics, and
OpenTracing and Jaeger for distributed tracing.
We’ll discover how they improve the observability of
an Anomaly Detection application, deployed on AWS Kubernetes, and using Instaclustr managed Apache Cassandra and Kafka clusters.
OSMC 2019 | Monitoring Cockpit for Kubernetes Clusters by Ulrike KlusikNETWAYS
Monitoring Kubernetes Clusters with Prometheus is state of the art. The difficulty is to find the significant metrics from the vast amount of available metrics. This talk shows a Monitoring Cockpit defined to get a quick overview of the cluster health and usage. It uses the Standard Metrics available for Kubernetes/OpenShift Clusters and their standard services. The monitoring solution is based on Prometheus, using InfluxDB for central long term storage and Grafana.
Monitoring at scale: Migrating to Prometheus at FastlyMarcus Barczak
Over the past 6 months at Fastly we've migrated away from our legacy monitoring systems and have deployed Prometheus as our primary system for infrastructure and application monitoring.
The Prometheus approach posed some unique challenges over traditional monitoring systems, whilst at the same time enabling us to easily scale our monitoring infrastructure alongside our global network growth.
It hasn't been completely smooth sailing and deploying Prometheus across a globe spanning network serving over 10% of the world's internet traffic has raised its fair share of challenges both technical and cultural.
In this presentation you will learn how we addressed these issues in ways that deviate slightly from conventional wisdom, the mistakes we made along the way, and how the new system has been received by our teams. We hope that our experiences can help you succeed in deploying Prometheus within your organization.
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
SoundCloud <3 Prometheus. However, when it comes down to hardware, getting data into Prometheus isn’t always straight-forward. In this talk, I will provide a look into how we managed to port all our infrastructure monitoring – including SNMP, IPMI and more – to Prometheus, and even improve it along the way.
Similar to Monitoring in Big Data Platform - Albert Lewandowski, GetInData (20)
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfGetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source ) Copilot?
How can we build one?
Architecture and evaluation
How do we work with customers on Big Data / ML / Analytics Projects using Scr...GetInData
How do we work with our customers ? How does it look? What do the meetings look like ? How do we structure the cooperation? Who does what and when ?
We receive these kinds of questions quite often. They are very important questions as the customer should know the details before we start the project and it’s important for GetInData to be transparent on this so the client is well informed.
During the webinar our Project Lead, Rafał Zalewski talked about Scrum Framework we use in cooperation with our customers.
Watch here:
https://www.youtube.com/watch?v=uOWrgcaKwWo&t=32s
Speaker: Rafał Zalewski, GetInData: https://www.linkedin.com/in/rafalzalewski/
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczGetInData
Watch video here: https://youtu.be/sfowpU90zFM
Piotr's presentation about GetInData’s Data-Driven Fast Track, the 3-step framework for data transformation.
You will learn:
➡ How to assess how data-driven your company is
➡ How to generate ideas for new initiatives to push your company towards better decisions
➡ How to think about implementing these initiatives to increase your chances of success
If you miss it live don't despair. Watch the video and feel free to diagnose your company by filling out the survey prepared by our team here: https://bit.ly/3fKcRrb! After completing the survey, you will receive a tailored summary report with insights from one of our experts.
Below you'll find links to all the materials mentioned in the workshop needed for exercises.
LINKS TO MATERIALS ABOUT DATA-DRIVEN:
Data-driven fast-track: 3 steps to make your company more data-driven: https://getindata.com/blog/data-drive...
Is my company data-driven? Here’s how you can find out: https://getindata.com/blog/is-my-comp...
If you:
➡ have questions about webinar topic,
➡ want to talk about your data-driven transformation,
➡ want to become more data-driven and you need consultations,
don't hesitate to write to us: hello@getindata.com
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
If you want to stay up to date, subscribe to our newsletter here: https://bit.ly/3tiw1I8
Presentation from the performance given by Piotr Chaberski and Adrian Dembek the Data Science Summit ML Edition.
Authors: Piotr Chaberski, Adrian Dembek
Linkedin: https://www.linkedin.com/in/piotrchaberski/
https://www.linkedin.com/in/adriandembek/
___
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
How to become good Developer in Scrum Team? GetInData
Speaker:
Rafał Zalewski, GetInData: https://www.linkedin.com/in/rafalzalewski/
Abstract:
To become good Developer in Scrum Team you need to understand not only Scrum Events but also Scrum fundaments like Scrum Pillars and Scrum Values. In this presentation you will learn and understand the mindset expectation from you as Developer in Scrum Team. You will also learn how Scrum mindset helps to achieve better development results.
____
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
OpenLineage & Airflow - data lineage has never been easierGetInData
If you want to stay up to date, subscribe to our newsletter here: https://bit.ly/3tiw1I8
Presentation from the performance given by Paweł during the Airflow Summit 2022.
Author: Paweł Leszczyński
Linkedin: https://www.linkedin.com/in/pawel-leszczynski/
___
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
Building your own platform is often ostracized these days. Everyone is encouraged to reuse existing solutions for known reasons. But using a ready-made platform / tool should not be a mindless process. Reusability is an art. During this presentation, you will learn why we decided to build our own MLOps platform while not re-inventing the wheel by using ready-made components with a touch of custom components. What are the benefits of this, but also what limitations and hurdles we have encountered. We hope that our experience will help you make the right decisions in your projects. Sometimes, maybe more risky ones.
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataGetInData
If you want to stay up to date, subscribe to our newsletter here: https://bit.ly/3tiw1I8
Presentation from the performance given by Mariusz during the Data Science Summit ML Edition.
Author: Mariusz Strzelecki
Linkedin: https://www.linkedin.com/in/mariusz-strzelecki/
___
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
This workshop focuses on creating a data streaming platform from scratch using an empty Kubernetes (or even Minikube) cluster. During the workshop, we go through the installation process, deploy the basic components for the platform, start Apache Flink, and monitor the process, using SQL to query available data.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
MLOps implemented - how we combine the cloud & open-source to boost data scie...GetInData
Check out more about this presentation here: https://www.youtube.com/watch?v=nSsssYHiylQ&t=17s
Presentation from the performance given by our team during the NSML Summit.
Authors: Krzysztof Zarzycki, Marek Wiewiórka
Linkedin: https://www.linkedin.com/in/kzarzycki/
https://www.linkedin.com/in/marekwiewiorka/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
Did you like it? Check out our E-book: Apache NiFi - A Complete Guide
https://ebook.getindata.com/apache-nifi-complete-guide
Apache NiFi is one of the most popular services for running ETL pipelines otherwise it’s not the youngest technology. During the talk, there are described all details about migrating pipelines from the old Hadoop platform to the Kubernetes, managing everything as the code, monitoring all corner cases of NiFi and making it a robust solution that is user-friendly even for non-programmers.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Read more here: https://getindata.com/blog/machine-learning-features-discovery-feast-amundsen
Author: Mariusz Strzelecki
Linkedin: https://www.linkedin.com/in/mariusz-strzelecki/
___
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Kubernetes and real-time analytics - how to connect these two worlds with Apa...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
More and more services are running in Kubernetes so it means that we can migrate our current data pipelines to the new environment. In case of Flink we have multiple ways to do real-time data streaming: use Lyft or GCP operator, go with official deployment and customize it or choose the Ververica Platform or create something on your own. The presentation shows how to choose the right solution for technical requirements and business needs to run Flink in Kubernetes at great scale with no issues.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Big data trends - Krzysztof Zarzycki, GetInDataGetInData
If you want to stay up to date, subscribe to our newsletter here: https://bit.ly/3tiw1I8
Get more info here: https://getindata.com/blog/6-big-data-trends-2021-bigdata-blog/
Author: Krzysztof Zarzycki
Linkedin: https://www.linkedin.com/in/kzarzycki/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...GetInData
Check out more about this presentation here: https://www.youtube.com/watch?v=eqNToHn4yB0
The webinar was organized by GetinData on 2020. During the webinar we explaned what does it mean to build a data-driven company.
Watch more here: https://www.youtube.com/watch?v=eqNToHn4yB0
Speaker: Rafał Małanij
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Complex event processing platform handling millions of users - Krzysztof Zarz...GetInData
If you want to learn more about it, check out our webinar here: https://www.youtube.com/watch?v=EfGPY_NyYQ8&t=77s
The webinar was organized by GetinData on 2020. During the webinar, we shared our lessons learnt from building and running stream processing platform in production for over 2 years.
Watch more here: https://www.youtube.com/watch?v=EfGPY_NyYQ8
Author: Krzysztof Zarzycki
Linkedin: https://www.linkedin.com/in/kzarzycki/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Predicting Startup Market Trends based on the news and social media - Albert ...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
Nowadays, one tweet can have impact on the value of the company or cryptocurrency. It becomes important for companies to be able to know everything what's happening in the market, especially for startups or when entering the new market. The presentation is about presenting the complex platform used for creating and verifying the strategy for a startup from the Wellbeing market. We go through web scraping-based data ingestion to ElasticSearch, NLP pipelines to understand what people write and what is the possible future of each market predicted by PySpark job.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Managing Big Data projects in a constantly changing environment - Rafał Zalew...GetInData
Watch our full performance given by our team during the Big Data Technology Warsaw Summit: https://www.youtube.com/watch?v=CBrq7z8ikaM
The nature of Big Data projects are nowadays one of its kind - they are not like the data warehousing initiatives in the old days, nor like cloud native applications projects, at least not yet. Variety of technologies, complicated architectures and rapidly changing landscape are just a few challenges that the IT Department is facing in such projects. When you add the number of stakeholders from different departments involved and that Big Data project is sometimes more like an R&D with unpredictable outcome, this makes a mix where the objectives can be easily lost. It is not a surprise that up to 85% of Big Data projects were pure failures (Gartner 2016).
In this talk we will share our experience in planning and executing Big Data initiatives in the organisations, with some use cases and good practices in mind
Watch our webinar here: https://www.youtube.com/watch?v=CBrq7z8ikaM
Speakers:
Rafał Małanij
Rafał Zalewski
Linkedin: https://www.linkedin.com/in/rafalzalewski/
___
Company:
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...GetInData
Did you like it? Check out our blog to stay up to date: https://getindata.com/blog
Currently there are more and more created videos distributed via multiple social media channels. It becomes more and more important to monitor all of them by companies to verify their customers' feedback, reviews, opinions. During the talk, we talk about extracting text from videos, analyzing language and prepare robust, scalable infrastructure for it. The idea behind platform is about having the mix between managed and self-managed service for Big Data processing. The keynote shows the case study of the MVP of the platform for marketing companies.
Author: Albert Lewandowski
Linkedin: https://www.linkedin.com/in/albert-lewandowski/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataGetInData
If you want to learn more about it watch our video: https://www.youtube.com/watch?v=WjAY9mnqD4s
The webinar was organized by GetinData on 2020, June 9th. We discussed various approaches one can take to migrate the infrastructure, pointing out their pros and cons. The webinar was based on our real-life, production-grade experiences, however, the strategies and insights can be easily applied to all kinds of IT business.
Watch our webinar here: https://www.youtube.com/watch?v=WjAY9mnqD4s&t=1s
Speaker: Mateusz Pytel
Linkedin: https://www.linkedin.com/in/em-pe/
___
Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets.
Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries.
https://getindata.com
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
2. •Big Data DevOps Engineer in Getindata
•Focused on infrastructure, cloud, Internet of Things and
Big Data
Who am I?
3. Agenda
1. Monitoring overview.
2. Metrics.
a. Infrastructure
b. Big Data Components
c. Applications.
3. Logs analysis.
a. Overview.
b. Logging solutions.
c. Why and how to implement?
4. Q&A
5. Push vs. Pull model
Push Pull
Agents push metrics Collector takes metrics
Polling task fully distributed among agents,
resulting in linear scalability.
Workload on central poller increases with the
number of devices polled.
Push agents are inherently secure against remote
attacks since they do not listen for network
connections.
Polling protocol can potentially open up system to
remote access and denial of service attacks.
Relatively inflexible: pre-determined, fixed set of
measurements are periodically exported.
Flexible: poller can ask for any metric at any time.
8. Prometheus Architecture
Prometheus is an open-source systems monitoring and alerting toolkit
originally built at SoundCloud and joined the Cloud Native
Computing Foundation in 2016 as the second hosted project,
after Kubernetes.
Components:
• server
• client libraries for instrumenting application code
• a push gateway for supporting short-lived jobs
• special-purpose exporters for services
• an AlertManager to handle alerts
9. Prometheus’ stories
•Use service discovery, it’s great
• Discover where Flink JM and TM expose their
metrics.
•How to provide HA?
• Think of using long-term storage like Thanos
or M3 or Cortex
•Do you need archived data?
•Monitor Prometheus even if it’s a
monitoring tool.
10. Service Discovery
It is used for discovering scrape targets.
We can use Kubernetes, Consul or file-based
service discovery and many others.
11. Prometheus Security
Remember:
■ It should not by default be possible for a target to expose data
that impersonates a different target. The honor_labels option
removes this protection, as can certain relabelling setups.
■ --web.enable-admin-api flag controls access to the
administrative HTTP API which includes functionality such as
deleting time series. This is disabled by default. If enabled,
administrative and mutating functionality will be accessible
under the /api/*/admin/ paths.
■ Any secrets stored in template files could be exfiltrated by
anyone able to configure receivers in the Alertmanager
configuration file.
12. Prometheus Security
Remember:
■ Prometheus and its components do not provide any
server-side authentication, authorization or encryption. If
you require this, it is recommended to use a reverse proxy.
■ As administrative and mutating endpoints are intended to
be accessed via simple tools such as cURL, there is no built
in CSRF protection.
In the future, server-side TLS support will be rolled out to the different
Prometheus projects. Those projects include Prometheus,
Alertmanager, Pushgateway and the official exporters. Authentication
of clients by TLS client certs will also be supported.
16. Servers Monitoring
■ Using Node Exporter or any custom exporter
■ Predict disk usage
■ Send alerts in case of any issue or any dangerous
situation
17. Kubernetes Monitoring
■ Nodes monitoring
■ Pods monitoring
■ Namespace monitoring
Check:
- resources limits and resources requests
- resource utilization
- Node Exporter
- Kube State metrics -it is a simple service that listens to the Kubernetes
API server and generates metrics about the state of the objects.
20. Kafka exporter
We can scrape and expose
mBeans of a JMX target.
Add to KAFKA_OPTS
export
KAFKA_OPTS='-javaagent:/opt/
jmx-exporter/jmx-exporter.ja
r=9101:/etc/jmx-exporter/kaf
ka.yml'
22. What about Kafka?
• Critical component in many systems
• A lot of data so any issue can cause
business loss
• What about adding more brokers?
23. What about Kafka?
• Keep it simple with JMX exporter.
• Monitor partitions state, number of
replicas
• You must have alerts when the number
of under replicated partitions grows fast
25. When should I use PushGateway?
• When monitoring multiple instances through a single
Pushgateway, the Pushgateway becomes both a
single point of failure and a potential bottleneck.
• You lose Prometheus's automatic instance health
monitoring via the up metric (generated on every
scrape).
• The Pushgateway never forgets series pushed to it
and will expose them to Prometheus forever unless
those series are manually deleted via the
Pushgateway's API.
26. Blackbox Exporter
● Official Prometheus Exporter, written in Go
● Single binary installed on monitoring server
● Allows probing API endpoints
https://github.com/prometheus/blackbox_exporter
27. Airflow Exporter
● Provides metrics for Airflow:
○ dag_status
○ task_status
○ run_duration
● Simple install via pip
https://github.com/epoch8/airflow-exporter
28. Knox Exporter
● Provides metrics about access to WebHDFS and Hive
● Written in Java, manual installation as a service, needs
configuration file
https://github.com/marcelmay/apache-knox-exporter
29. NiFi
We use PrometheusReportingTask to report metrics in
Prometheus format.
It created metrics HTTP endpoint with information about
the JVM and the NiFi instance.
For flows, we need to use custom reporters like own NAR
processor that can count the number of flow files or any
different metric.
30. Bash Exporter
● Written in Go, manual installation as a service, needs
additional bash scripts to run
● Can make metric from output of bash command
● Status for Hadoop component:
○ Zeppelin
○ Ranger
○ HiveServer
https://github.com/gree-gorey/bash-exporter
31. Hive Query
● Running Hive query via crontab and writing output to
file
● Running bash exporter reading file and displaying
metrics for Prometheus
● Reading metrics in Grafana
33. Flink monitoring
Processing metrics
• Number of processed events
• Lag of processing that may trigger
alerts
• JVM parameters
Flink stability
• How many restarts does Flink have?
• Can it finish making checkpoints
and savepoints?
36. What about Spark jobs?
• Let’s connect Prometheus with
statsd_exporter
• Use built in sink for statsd in Spark
37. Spark & Prometheus
The best way to achieve stable metrics system
with right timestamp is about using statsd_sink.
These metrics can be sent to statsd_exporter
(https://github.com/prometheus/statsd_exporter)
and then can be exposed to the Prometheus.
38. Spark & Prometheus
The different approach based on JMX Exporter
and using Spark’s JMXSink.
The next is about using Spark’s GraphiteSink and
using Prometheus Graphite Exporter.
The last is about building custom exporter of
metrics.
40. Spark & Prometheus
Spark 3.0 introduces the following resources:
- PrometheusServlet - which makes the
Master/Worker/Driver nodes expose metrics in
a Prometheus format.
- Prometheus Resource - which export metrics of
all executors at the driver.
49. Elasticsearch
Elasticsearch is a distributed, open source
search and analytics engine for all types of
data, including textual, numerical,
geospatial, structured, and unstructured.
Elasticsearch is built on Apache Lucene
and was first released in 2010
50. Elasticsearch
Elasticsearch is a distributed, open source
search and analytics engine for all types of
data, including textual, numerical,
geospatial, structured, and unstructured.
Elasticsearch is built on Apache Lucene
and was first released in 2010
51. Elasticsearch Use cases
• Logging and log analytics
• Infrastructure metrics and container
monitoring
• Application performance monitoring
• Geospatial data analysis and
visualization
• Security analytics
• Business analytics
55. Loki
Loki is a horizontally-scalable,
highly-available, multi-tenant log
aggregation system inspired by
Prometheus. It is designed to be very cost
effective and easy to operate. It does not
index the contents of the logs, but rather a
set of labels for each log stream.
56. Loki Overview
Loki receives logs in separate streams, where each
stream is uniquely identified by its tenant ID
and its set of labels.
As log entries from a stream arrive, they are
GZipped as "chunks" and saved in the chunks
store.
The index stores each stream's label set and links
them to the individual chunks.
58. LogQL
This example counts all the log lines within the last
five minutes for the MySQL job.
rate( ( {job="mysql"} |= "error" !=
"timeout)[10s] ) )
It counts the entries for each log stream
count_over_time({job="mysql"}[5m])
59. Promtail
Promtail is an agent which ships the
contents of local logs to a private Loki
instance. It is deployed to every machine
that has applications needed to be
monitored.
Currently, Promtail can tail logs from two
sources: local log files and the systemd
journal (on AMD64 machines only).
60. Pipelines in Promtail
A pipeline is used to transform a single
log line, its labels, and its timestamp.
A pipeline is comprised of a set of
stages.
62. ELK vs. Loki+Promtail
ELK Loki + Promtail
Data visualisation Kibana Grafana
Query performance Faster due to indexed
all the data
Slower due to indexing
only labels
Resource
consumption
Higher due to the need
of indexing
Lower due to index
only labels
63. ELK vs. Loki+Promtail
ELK Loki + Promtail
Indexing Keys and content of
each key
Only labels
Query language Query DSL or Lucene
QL
LogQL
Log pipeline More components
1. Fluentd/Fluentbit
-> Logstash ->
Elasticsearch
2. Filebeat ->
Logstash ->
Elasticsearch
Fewer components:
Promtail -> Loki