SlideShare a Scribd company logo
1 of 32
How to build
continuously
processing for 24/7
real-time data
streaming platform?
Author: Albert Lewandowski
• Big Data DevOps Engineer in Getindata
• Smart City Consultant in Almine
• Editor in Antyweb
• Focused on infrastructure, cloud, Internet of
Things and Big Data
Who am I?
Bullet points
• Infrastructure as Code (IaC)
• Documentation, documentation, documentation
• Monitor wisely
• Log analytics
• SRE principals with DevOps mindset
• CICD
• Automate tasks
• Ask why? not how?
• Stories, issues, curiosities
Infrastructure as Code
• Code everything because it means readable
infrastructure for everyone
• Do not rely on local copies of script, someone
has invented code repositories and it was a
great day for IT stuff
• Cloud-ready scripts – all public cloud vendors
provide their tools for deploying
infrastructure
Infrastructure as Code in
practice
• Amazon Web Services
• CloudFormation
• Google Cloud Platform
• Deployment Manager
• Microsoft Azure
• Azure Templates
• Azure DevOps
Or use one tool for all environment, like
Terraform
Exercise: Infrastructure as
Code
Google Cloud Platform & Deployment Manager
Source:
https://github.com/GoogleCloudPlatform/deploymen
tmanager-samples/tree/master/examples/v2
git clone https://github.com/GoogleCloudPlatform/deploymentmanager-
samples.git
cd deploymentmanager-samples/examples/v2
NAME="gid_gke_example_20200108"
ZONE="europe-west3b"
gcloud deployment-manager deployments create ${NAME} --template
cluster.py --properties zone:${ZONE}
Documentation
• Add README if you won’t be
disliked by your team.
• You will forget about
important things faster than
you think.
• Add description and comments
to your tasks and merge
requests. Make work easier
not harder.
Monitoring
• Scraping metrics is done everywhere.
• Getting information about CPU utilisation, disk space usage,
amount of free RAM, etc.
• Multiple available tools so which one should you choose?
• Do not forget about learning about new tools.
• Understand which metrics are valuable and provide information
about status of data pipelines or data ingestion
Prometheus’ stories
• Use service discovery, it’s great
• Discover where Flink JM and TM expose their metrics.
• How to provide HA?
• Think of using long-term storage like Thanos or M3
or Cortex
• Do you need archived data?
• Monitor Prometheus even if it’s a monitoring
tool.
How to monitor and visualize
metrics?
How to monitor and visualize
metrics?
All available services for monitoring visualisation
are quite similar.
• Think about it like a part of the complex solution
that has to be compatible with metrics exporters
and log exporters.
• Think of security, ease-of-use for operations team
and adding any own modules that may be required.
• Simpler visualisation = more readable (often, not
always).
• Understand value of metrics.
Don’t forget about alerts
Alerts signify that a human needs to take action
immediately in response to something that is
either happening or about to happen, in order to
improve the situation.
Do not overuse alerts, some issues should be
fixed by automation scripts.
Solutions designed for alerts
• AlertManager
• Builtin alerts in Grafana
• Alerta
Exercise: One to rull them all
• Demo from Alerta’s team: https://try.alerta.io/
git clone https://github.com/alerta/docker-alerta
docker-compose up
Log analytics
• Discover what is inside your log files.
• Useful for operations team and for developers to understand what
happened with their applications.
• Read log files in the dedicated tool, not use less or tail when you have
several machines to check.
• You can take wise actions based on the log content when you see
what is going on with your services.
Log analytics tools
• Elastic stack:
• ELK
• EFK
• Loki + Promtail + Grafana
Make it simple with Loki
Make it simple with Loki
Like Prometheus, but for logs!
• Write simple queries with LogQL that is similar
to PromQL
Examples:
• {instance=~"kafka-[23]",name="kafka"} !=
kafka.server:type=ReplicaManager
• topk(10,sum(rate({region="us-east1"}[5m])) by (name))
• Ingest log files with Promtail or Fluentd or
Fluentbit
• Relabeling log files if needed
• Designed for clusters in Kubernetes
Our experience with Loki
• Two environments: development and production stages.
• Migration from ELK stack that didn’t provide enough good
performance and had issues with scraping log files.
• Loki in Grafana: metrics and logs can be verified in one tool.
• Stable solution that provides all metrics and enables counting
interesting values from jobs’ logs
Exercise: Glance at Loki
Exercise: Glance at Loki
Install Helm
curl -fsSL -o get_helm.sh
https://raw.githubusercontent.com/helm/helm/master/scripts/get-
helm-3
chmod 700 get_helm.sh
./get_helm.sh
Exercise: Glance at Loki
helm repo add loki https://grafana.github.io/loki/charts
helm repo update
helm upgrade --install loki --namespace=loki-stack loki/loki-stack
helm install stable/grafana -n loki-grafana
kubectl get secret --namespace <YOUR-NAMESPACE> loki-grafana -o
jsonpath="{.data.admin-password}" | base64 --decode ; echo
kubectl port-forward --namespace <YOUR-NAMESPACE> service/loki-
grafana 3000:80
Go to: http://localhost:3000
Add Loki as data source.
SRE & DevOps
If a human operator needs to touch your system during normal
operations, you have a bug. The definition of normal changes as your
systems grow.
Carla Geisser, Google SRE
DevOps vs. SRE
DevOps SRE
Reduce organization silos Share ownership with developers by
using the same tools and
techniques across the stack
Accept failure as normal Have a formula for balancing
accidents and failures against new
releases
Implement gradual change Encourage moving quickly by
reducing costs of failure
Leverage tooling & automation Encourages "automating this year's
job away" and minimizing manual
systems work to focus on efforts
that bring long-term value to the
system
Measure everything Believes that operations is a
software problem, and defines
CICD pipelines
Besides black art, there is only automation and mechanization.
Federico García Lorca (1898–1936), Spanish poet and playwright
Source: AWS
Improve, commit, test, deploy
• Define which applications or jobs can be
deployed automatically to the production
environment.
• Test everything.
• Remember about CI tools that are really useful.
• Teach others how they should use automation
tools.
• Discuss, improve, make
Automate and drink coffee
• Automate boring stuff.
• Make Everything as Code.
• Run tested Ansible playbooks
and forget about manual
changes.
• More well thought automation,
less problems.
Useful tools
• Ansible
• Jenkins
• Rundeck
• Automate all ops
tasks
• Run Ansible from
one place where
you can set up
variables easily.
• It supports LDAP.
Test your Ansible
• Use Molecule.
• Molecule provides support for testing with multiple instances,
operating systems and distributions, virtualization providers, test
frameworks and testing scenarios.
• Use Tox.
• Tox is a generic virtualenv management, and test command line
tool. Tox can be used in conjunction with Factors and Molecule, to
perform scenario tests.
• Use them in your CI tool.
Molecule tests
Have installed working Docker, installed molecule
and tox (by pip).
Scenarios – they are a test suite for your newly created role.
What is inside scenario directory?
• Dockerfile.j2
• INSTALL.rst – it contains instructions on what
additional software or setup steps.
• molecule.yml – here you add dependencies, etc.
• playbook.yml – it’ll be invoked by Molecule.
• Tests – here you can add specific tests.
Q&A
Feel free to ask me about anything 
Thank you for your attention!

More Related Content

What's hot

OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade Platform
OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade PlatformOSMC 2021 | Use OpenSource monitoring for an Enterprise Grade Platform
OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade PlatformNETWAYS
 
Docker at and with SignalFx
Docker at and with SignalFxDocker at and with SignalFx
Docker at and with SignalFxSignalFx
 
Ansiblefest 2018 Network automation journey at roblox
Ansiblefest 2018 Network automation journey at robloxAnsiblefest 2018 Network automation journey at roblox
Ansiblefest 2018 Network automation journey at robloxDamien Garros
 
Bringing DevOps to Routing with evolved XR: an overview
Bringing DevOps to Routing with evolved XR: an overviewBringing DevOps to Routing with evolved XR: an overview
Bringing DevOps to Routing with evolved XR: an overviewCisco DevNet
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Programaspyker
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...Chris Fregly
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Microcontainers and Tools for Hardcore Container Debugging
Microcontainers and Tools for Hardcore Container DebuggingMicrocontainers and Tools for Hardcore Container Debugging
Microcontainers and Tools for Hardcore Container DebuggingOracle Developers
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...NETWAYS
 
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlStorage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlITCamp
 
Ports, pods and proxies
Ports, pods and proxiesPorts, pods and proxies
Ports, pods and proxiesLibbySchulze
 
Supporting Digital Media Workflows in the Cloud with Perforce Helix
Supporting Digital Media Workflows in the Cloud with Perforce HelixSupporting Digital Media Workflows in the Cloud with Perforce Helix
Supporting Digital Media Workflows in the Cloud with Perforce HelixPerforce
 
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDevops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDrew Malone
 
Dev309 from asgard to zuul - netflix oss-final
Dev309  from asgard to zuul - netflix oss-finalDev309  from asgard to zuul - netflix oss-final
Dev309 from asgard to zuul - netflix oss-finalRuslan Meshenberg
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetMarco Parenzan
 
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)Tim Bozarth
 
Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17aspyker
 
Container Runtime Security with Falco, by Néstor Salceda
Container Runtime Security with Falco, by Néstor SalcedaContainer Runtime Security with Falco, by Néstor Salceda
Container Runtime Security with Falco, by Néstor SalcedaCloud Native Day Tel Aviv
 

What's hot (20)

OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade Platform
OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade PlatformOSMC 2021 | Use OpenSource monitoring for an Enterprise Grade Platform
OSMC 2021 | Use OpenSource monitoring for an Enterprise Grade Platform
 
Docker at and with SignalFx
Docker at and with SignalFxDocker at and with SignalFx
Docker at and with SignalFx
 
Ansiblefest 2018 Network automation journey at roblox
Ansiblefest 2018 Network automation journey at robloxAnsiblefest 2018 Network automation journey at roblox
Ansiblefest 2018 Network automation journey at roblox
 
Bringing DevOps to Routing with evolved XR: an overview
Bringing DevOps to Routing with evolved XR: an overviewBringing DevOps to Routing with evolved XR: an overview
Bringing DevOps to Routing with evolved XR: an overview
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Program
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Microcontainers and Tools for Hardcore Container Debugging
Microcontainers and Tools for Hardcore Container DebuggingMicrocontainers and Tools for Hardcore Container Debugging
Microcontainers and Tools for Hardcore Container Debugging
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
 
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten RachfahlStorage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
Storage Spaces Direct - the new Microsoft SDS star - Carsten Rachfahl
 
Ports, pods and proxies
Ports, pods and proxiesPorts, pods and proxies
Ports, pods and proxies
 
Supporting Digital Media Workflows in the Cloud with Perforce Helix
Supporting Digital Media Workflows in the Cloud with Perforce HelixSupporting Digital Media Workflows in the Cloud with Perforce Helix
Supporting Digital Media Workflows in the Cloud with Perforce Helix
 
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDevops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
 
Netflix conductor
Netflix conductorNetflix conductor
Netflix conductor
 
Dev309 from asgard to zuul - netflix oss-final
Dev309  from asgard to zuul - netflix oss-finalDev309  from asgard to zuul - netflix oss-final
Dev309 from asgard to zuul - netflix oss-final
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnet
 
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
Netflix: From Zero to Production-Ready in Minutes (QCon 2017)
 
Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17
 
Java Performance Tuning
Java Performance TuningJava Performance Tuning
Java Performance Tuning
 
Container Runtime Security with Falco, by Néstor Salceda
Container Runtime Security with Falco, by Néstor SalcedaContainer Runtime Security with Falco, by Néstor Salceda
Container Runtime Security with Falco, by Néstor Salceda
 

Similar to Hot to build continuously processing for 24/7 real-time data streaming platform?

Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to TelegrafInfluxData
 
Top 10 dev ops tools (1)
Top 10 dev ops tools (1)Top 10 dev ops tools (1)
Top 10 dev ops tools (1)yalini97
 
The challenge of application distribution - Introduction to Docker (2014 dec ...
The challenge of application distribution - Introduction to Docker (2014 dec ...The challenge of application distribution - Introduction to Docker (2014 dec ...
The challenge of application distribution - Introduction to Docker (2014 dec ...Sébastien Portebois
 
DevOpsCon 2015 - DevOps in Mobile Games
DevOpsCon 2015 - DevOps in Mobile GamesDevOpsCon 2015 - DevOps in Mobile Games
DevOpsCon 2015 - DevOps in Mobile GamesAndreas Katzig
 
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013dotCloud
 
Kubernetes Infra 2.0
Kubernetes Infra 2.0Kubernetes Infra 2.0
Kubernetes Infra 2.0Deepak Sood
 
Containers, Serverless and Functions in a nutshell
Containers, Serverless and Functions in a nutshellContainers, Serverless and Functions in a nutshell
Containers, Serverless and Functions in a nutshellEugene Fedorenko
 
Devops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftDevops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftYaniv cohen
 
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptxBSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptxJasonOstrom1
 
What we talk about when we talk about DevOps
What we talk about when we talk about DevOpsWhat we talk about when we talk about DevOps
What we talk about when we talk about DevOpsRicard Clau
 
Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015aspyker
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Codemotion
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceBert Jan Schrijver
 
Design Like a Pro: Scripting Best Practices
Design Like a Pro: Scripting Best PracticesDesign Like a Pro: Scripting Best Practices
Design Like a Pro: Scripting Best PracticesInductive Automation
 
Time Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTTime Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTMarco Parenzan
 

Similar to Hot to build continuously processing for 24/7 real-time data streaming platform? (20)

Intro to Telegraf
Intro to TelegrafIntro to Telegraf
Intro to Telegraf
 
Top 10 dev ops tools (1)
Top 10 dev ops tools (1)Top 10 dev ops tools (1)
Top 10 dev ops tools (1)
 
The ABC's of IaC
The ABC's of IaCThe ABC's of IaC
The ABC's of IaC
 
Infrastructure as Code
Infrastructure as CodeInfrastructure as Code
Infrastructure as Code
 
The challenge of application distribution - Introduction to Docker (2014 dec ...
The challenge of application distribution - Introduction to Docker (2014 dec ...The challenge of application distribution - Introduction to Docker (2014 dec ...
The challenge of application distribution - Introduction to Docker (2014 dec ...
 
DevOpsCon 2015 - DevOps in Mobile Games
DevOpsCon 2015 - DevOps in Mobile GamesDevOpsCon 2015 - DevOps in Mobile Games
DevOpsCon 2015 - DevOps in Mobile Games
 
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013
Write Once and REALLY Run Anywhere | OpenStack Summit HK 2013
 
Kubernetes Infra 2.0
Kubernetes Infra 2.0Kubernetes Infra 2.0
Kubernetes Infra 2.0
 
Containers, Serverless and Functions in a nutshell
Containers, Serverless and Functions in a nutshellContainers, Serverless and Functions in a nutshell
Containers, Serverless and Functions in a nutshell
 
Devops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftDevops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShift
 
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptxBSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
BSidesDFW2022-PurpleTeam_Cloud_Identity.pptx
 
What we talk about when we talk about DevOps
What we talk about when we talk about DevOpsWhat we talk about when we talk about DevOps
What we talk about when we talk about DevOps
 
JOSA TechTalks - Docker in Production
JOSA TechTalks - Docker in ProductionJOSA TechTalks - Docker in Production
JOSA TechTalks - Docker in Production
 
Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015Triangle Devops Meetup 10/2015
Triangle Devops Meetup 10/2015
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
 
What is Docker?
What is Docker?What is Docker?
What is Docker?
 
Design Like a Pro: Scripting Best Practices
Design Like a Pro: Scripting Best PracticesDesign Like a Pro: Scripting Best Practices
Design Like a Pro: Scripting Best Practices
 
Time Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTTime Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETT
 
Devops
DevopsDevops
Devops
 

More from GetInData

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...GetInData
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczGetInData
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competitionGetInData
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? GetInData
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierGetInData
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformGetInData
 
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataModel serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataGetInData
 
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...GetInData
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...GetInData
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataGetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...GetInData
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataGetInData
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...GetInData
 
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...GetInData
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataGetInData
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...GetInData
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...GetInData
 
Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...GetInData
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...GetInData
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataGetInData
 

More from GetInData (20)

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
 
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr MenclewiczData-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
Data-Driven Fast Track: Introduction to data-drivenness with Piotr Menclewicz
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competition
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team?
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easier
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataModel serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
 
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInData
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
 
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, Get...
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
 
Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
 

Recently uploaded

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Hot to build continuously processing for 24/7 real-time data streaming platform?

  • 1. How to build continuously processing for 24/7 real-time data streaming platform? Author: Albert Lewandowski
  • 2. • Big Data DevOps Engineer in Getindata • Smart City Consultant in Almine • Editor in Antyweb • Focused on infrastructure, cloud, Internet of Things and Big Data Who am I?
  • 3. Bullet points • Infrastructure as Code (IaC) • Documentation, documentation, documentation • Monitor wisely • Log analytics • SRE principals with DevOps mindset • CICD • Automate tasks • Ask why? not how? • Stories, issues, curiosities
  • 4. Infrastructure as Code • Code everything because it means readable infrastructure for everyone • Do not rely on local copies of script, someone has invented code repositories and it was a great day for IT stuff • Cloud-ready scripts – all public cloud vendors provide their tools for deploying infrastructure
  • 5. Infrastructure as Code in practice • Amazon Web Services • CloudFormation • Google Cloud Platform • Deployment Manager • Microsoft Azure • Azure Templates • Azure DevOps Or use one tool for all environment, like Terraform
  • 6. Exercise: Infrastructure as Code Google Cloud Platform & Deployment Manager Source: https://github.com/GoogleCloudPlatform/deploymen tmanager-samples/tree/master/examples/v2 git clone https://github.com/GoogleCloudPlatform/deploymentmanager- samples.git cd deploymentmanager-samples/examples/v2 NAME="gid_gke_example_20200108" ZONE="europe-west3b" gcloud deployment-manager deployments create ${NAME} --template cluster.py --properties zone:${ZONE}
  • 7. Documentation • Add README if you won’t be disliked by your team. • You will forget about important things faster than you think. • Add description and comments to your tasks and merge requests. Make work easier not harder.
  • 8. Monitoring • Scraping metrics is done everywhere. • Getting information about CPU utilisation, disk space usage, amount of free RAM, etc. • Multiple available tools so which one should you choose? • Do not forget about learning about new tools. • Understand which metrics are valuable and provide information about status of data pipelines or data ingestion
  • 9. Prometheus’ stories • Use service discovery, it’s great • Discover where Flink JM and TM expose their metrics. • How to provide HA? • Think of using long-term storage like Thanos or M3 or Cortex • Do you need archived data? • Monitor Prometheus even if it’s a monitoring tool.
  • 10. How to monitor and visualize metrics?
  • 11. How to monitor and visualize metrics? All available services for monitoring visualisation are quite similar. • Think about it like a part of the complex solution that has to be compatible with metrics exporters and log exporters. • Think of security, ease-of-use for operations team and adding any own modules that may be required. • Simpler visualisation = more readable (often, not always). • Understand value of metrics.
  • 12. Don’t forget about alerts Alerts signify that a human needs to take action immediately in response to something that is either happening or about to happen, in order to improve the situation. Do not overuse alerts, some issues should be fixed by automation scripts.
  • 13. Solutions designed for alerts • AlertManager • Builtin alerts in Grafana • Alerta
  • 14. Exercise: One to rull them all • Demo from Alerta’s team: https://try.alerta.io/ git clone https://github.com/alerta/docker-alerta docker-compose up
  • 15. Log analytics • Discover what is inside your log files. • Useful for operations team and for developers to understand what happened with their applications. • Read log files in the dedicated tool, not use less or tail when you have several machines to check. • You can take wise actions based on the log content when you see what is going on with your services.
  • 16. Log analytics tools • Elastic stack: • ELK • EFK • Loki + Promtail + Grafana
  • 17. Make it simple with Loki
  • 18. Make it simple with Loki Like Prometheus, but for logs! • Write simple queries with LogQL that is similar to PromQL Examples: • {instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager • topk(10,sum(rate({region="us-east1"}[5m])) by (name)) • Ingest log files with Promtail or Fluentd or Fluentbit • Relabeling log files if needed • Designed for clusters in Kubernetes
  • 19. Our experience with Loki • Two environments: development and production stages. • Migration from ELK stack that didn’t provide enough good performance and had issues with scraping log files. • Loki in Grafana: metrics and logs can be verified in one tool. • Stable solution that provides all metrics and enables counting interesting values from jobs’ logs
  • 21. Exercise: Glance at Loki Install Helm curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get- helm-3 chmod 700 get_helm.sh ./get_helm.sh
  • 22. Exercise: Glance at Loki helm repo add loki https://grafana.github.io/loki/charts helm repo update helm upgrade --install loki --namespace=loki-stack loki/loki-stack helm install stable/grafana -n loki-grafana kubectl get secret --namespace <YOUR-NAMESPACE> loki-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo kubectl port-forward --namespace <YOUR-NAMESPACE> service/loki- grafana 3000:80 Go to: http://localhost:3000 Add Loki as data source.
  • 23. SRE & DevOps If a human operator needs to touch your system during normal operations, you have a bug. The definition of normal changes as your systems grow. Carla Geisser, Google SRE
  • 24. DevOps vs. SRE DevOps SRE Reduce organization silos Share ownership with developers by using the same tools and techniques across the stack Accept failure as normal Have a formula for balancing accidents and failures against new releases Implement gradual change Encourage moving quickly by reducing costs of failure Leverage tooling & automation Encourages "automating this year's job away" and minimizing manual systems work to focus on efforts that bring long-term value to the system Measure everything Believes that operations is a software problem, and defines
  • 25. CICD pipelines Besides black art, there is only automation and mechanization. Federico García Lorca (1898–1936), Spanish poet and playwright Source: AWS
  • 26. Improve, commit, test, deploy • Define which applications or jobs can be deployed automatically to the production environment. • Test everything. • Remember about CI tools that are really useful. • Teach others how they should use automation tools. • Discuss, improve, make
  • 27. Automate and drink coffee • Automate boring stuff. • Make Everything as Code. • Run tested Ansible playbooks and forget about manual changes. • More well thought automation, less problems.
  • 28. Useful tools • Ansible • Jenkins • Rundeck • Automate all ops tasks • Run Ansible from one place where you can set up variables easily. • It supports LDAP.
  • 29. Test your Ansible • Use Molecule. • Molecule provides support for testing with multiple instances, operating systems and distributions, virtualization providers, test frameworks and testing scenarios. • Use Tox. • Tox is a generic virtualenv management, and test command line tool. Tox can be used in conjunction with Factors and Molecule, to perform scenario tests. • Use them in your CI tool.
  • 30. Molecule tests Have installed working Docker, installed molecule and tox (by pip). Scenarios – they are a test suite for your newly created role. What is inside scenario directory? • Dockerfile.j2 • INSTALL.rst – it contains instructions on what additional software or setup steps. • molecule.yml – here you add dependencies, etc. • playbook.yml – it’ll be invoked by Molecule. • Tests – here you can add specific tests.
  • 31. Q&A Feel free to ask me about anything 
  • 32. Thank you for your attention!