SlideShare a Scribd company logo
The ELK Stack @ Inbot
Jilles van Gurp - Inbot Inc.
Who is Jilles?
www.jillesvangurp.com, and @jillesvangurpon everything I've signed up for
Java (J)Ruby Python Javascript/node.js
Servers reluctant Devops guy Software Architecture
Universities of Utrecht (NL), Blekinge (SE), and Groningen (NL)
GX (NL),Nokia Research (FI), Nokia/Here (DE),Localstream (DE),
Inbot(DE).
Inbot app - available for Android & IOS
ELK Stack?
Elasticsearch
Logstash
Kibana
Recent trends
Clustered/scalable time series DBs
Other people than sysadmins looking at graphs
Databases do some funky stuff these days: aggregations, search
Serverless, Docker, Amazon Lambda, Microservices etc. - where do the logs go?
More moving parts = more logs than ever
Logging
Kind of a boring topic ...
Stuff runs on servers, cloud, whatever
Produces errors, warnings, debug, telemetry, analytics, kpis, ux events, ...
Where does all this go and how do you make sense of it?
WHAT IS HAPPENING??!?!
Old school: Cat, grep, awk, cut, ….
Good luck with that on 200GB of unstructured logs from a gazillion microservices
on 40 virtual machines, docker images, etc.
That doesn't really work anymore ...
If you are doing this: you are doing it wrong!
Hadoop ecosystem?
Works great for structured data, if you know what you are looking for.
Requires a lot of infrastructure and hassle.
Not really real-time, tedious to explore data
Some hipster with a Ph.D. will fix it or ...
I’m not a data scientist, are you?
Monitoring/graphing ecosystem
Mostly geared at measuring stuff like cpu load, IO, memory, etc.
Intended for system administrators
What about the higher level stuff?
You probably should do monitoring but it’s not really what we need either ...
So, ELK ….
Logging
Most languages/servers ship with awful logging defaults, you can fix this
Log enough but not too much or too little.
Log at the right log level ⇒ Turn off DEBUG log. Use ERROR sparingly.
Log metadata so you can pick your logs apart ⇒ Metadata == json fields.
Log opportunistically, it's cheap
Too much logging
Your Elasticsearch cluster dies/you pay a fortune to keep data around that you
don’t need.
Not enough logging
Something happened, you don’t know what because there’s nothing in the logs;
you can't find back relevant events because metadata is missing.
You are going to waste what you saved in cost on finding out WTF is going on,
probably more.
Log entries in ELK
{
"message": "[3017772.750979] device-mapper: thin: 252:0: unable to service pool target messages
in READ_ONLY or FAIL mode",
"@timestamp": "2016-08-16T09:50:01.000Z",
"type": "syslog",
"host": "10.1.6.7",
"priority": 3,
"timestamp": "Aug 16 09:50:01",
"logsource": "ip-10-1-6-7",
"program": "kernel",
"severity": 3,
"facility": 0,
"facility_label": "kernel",
"severity_label": "Error"
}
Plumbing your logs
Simple problem: given some logs, convert it into json and shove it into
Elasticsearch.
Lots of components to help you do that: Logstash, Docker Gelf driver, Beats, etc.
If you can, log json natively: e.g. Logback logstash driver, http://jsonlines.org/
Ca. 40 Amazon EC2 instances, most of which have docker containers
VPC with several subnets and dmz.
Testing, production, and dev environments + dev infrastructure.
AWS comes with monitoring & alerts for basic stuff.
Everything logs to http://logs-internal.inbot.io/
Elasticsearch 2.2.0, logstash 2.2.1, kibana 4.4.1
1 week data retention, 14M events/day
Inbot technical setup
Demo time
Things to watch out for
Avoid split brains and other nasty ES failure modes -> RTFM & configure ...
Data retention policies are not optional
Use curator https://github.com/elastic/curator
Customise your mappings, changing them sucks on a live logstash cluster.
Dynamic mappings on fields that sometimes look like a number will break shit.
Running out of CPU credits in Amazon can kill your ES cluster
ES Rolling restarts take time when you have 6 months of logs
Mapped Diagnostic Context (MDC)
Common in java logging fws - log4j, slf4j, logback, etc.
Great for adding context to your logs
E.g. user_id, request url, host name, environment, headers, user agent, etc.
Makes it easy to slice and dice your logs
{
MDC.put("user_id","123");
LOG.info("some message");
MDC.remove("user_id");
}
MDC for node.js: our log4js fork
https://github.com/joona/log4js-node
Allows for MDC style attributes
Sorry: works for us but not in shape for pull request; maybe later.
But: this was an easy hack.
MdcContext
https://github.com/Inbot/inbot-utils/blob/master/src/main/java/io/inbot/utils/MdcCont
ext.java
try(MdcContext ctx=MdcContext.create()){
ctx.put("user_id","123");
LOG.info("some message");
}
Application Metrics
http://metrics.dropwizard.io/
Add counters, timers, gauges, etc. to your business logic.
metrics.register("httpclient_leased", new Gauge<Integer>() {
@Override
public Integer getValue() {
return connectionManager.getTotalStats().getLeased();
}
});
Reporter uses MDC to log once per minute: giant json blob but it works.
Docker Gelf driver
Configure your docker hosts to log the output of any docker containers using the
log driver.
command, container id, etc. become fields in log entry
nice as a fallback when you don't control the logging
/usr/bin/docker daemon --log-driver=gelf --log-opt gelf-address=udp://logs-internal.inbot.io:12201
Thanks
@jillesvangurp, @inbotapp

More Related Content

What's hot

Using Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibanaUsing Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibana
Alejandro E Brito Monedero
 
Debugging and Testing ES Systems
Debugging and Testing ES SystemsDebugging and Testing ES Systems
Debugging and Testing ES SystemsChris Birchall
 
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et KibanaJournée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Publicis Sapient Engineering
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
ForgeRock
 
Elk with Openstack
Elk with OpenstackElk with Openstack
Elk with Openstack
Arun prasath
 
Logstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
Logstash + Elasticsearch + Kibana Presentation on Startit Tech MeetupLogstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
Logstash + Elasticsearch + Kibana Presentation on Startit Tech MeetupStartit
 
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NYPuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
Puppet
 
Advanced troubleshooting linux performance
Advanced troubleshooting linux performanceAdvanced troubleshooting linux performance
Advanced troubleshooting linux performance
Forthscale
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Oleksiy Panchenko
 
Experiences in ELK with D3.js for Large Log Analysis and Visualization
Experiences in ELK with D3.js  for Large Log Analysis  and VisualizationExperiences in ELK with D3.js  for Large Log Analysis  and Visualization
Experiences in ELK with D3.js for Large Log Analysis and Visualization
Surasak Sanguanpong
 
LogStash in action
LogStash in actionLogStash in action
LogStash in action
Manuj Aggarwal
 
OpenStack Log Mining
OpenStack Log MiningOpenStack Log Mining
OpenStack Log Mining
John Stanford
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and Solr
Sematext Group, Inc.
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
Peter Friese
 
ELK Stack
ELK StackELK Stack
ELK Stack
Phuc Nguyen
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
Sematext Group, Inc.
 
Introducing CouchDB
Introducing CouchDBIntroducing CouchDB
Introducing CouchDB
Hatem Ben Yacoub
 
Monitoring with Graylog - a modern approach to monitoring?
Monitoring with Graylog - a modern approach to monitoring?Monitoring with Graylog - a modern approach to monitoring?
Monitoring with Graylog - a modern approach to monitoring?
inovex GmbH
 

What's hot (20)

Using Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibanaUsing Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibana
 
Debugging and Testing ES Systems
Debugging and Testing ES SystemsDebugging and Testing ES Systems
Debugging and Testing ES Systems
 
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et KibanaJournée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
 
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
 
Elk with Openstack
Elk with OpenstackElk with Openstack
Elk with Openstack
 
Logstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
Logstash + Elasticsearch + Kibana Presentation on Startit Tech MeetupLogstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
Logstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
 
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NYPuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
 
Advanced troubleshooting linux performance
Advanced troubleshooting linux performanceAdvanced troubleshooting linux performance
Advanced troubleshooting linux performance
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
 
Experiences in ELK with D3.js for Large Log Analysis and Visualization
Experiences in ELK with D3.js  for Large Log Analysis  and VisualizationExperiences in ELK with D3.js  for Large Log Analysis  and Visualization
Experiences in ELK with D3.js for Large Log Analysis and Visualization
 
LogStash in action
LogStash in actionLogStash in action
LogStash in action
 
OpenStack Log Mining
OpenStack Log MiningOpenStack Log Mining
OpenStack Log Mining
 
elk_stack_alexander_szalonnas
elk_stack_alexander_szalonnaselk_stack_alexander_szalonnas
elk_stack_alexander_szalonnas
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and Solr
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
 
Introduction to ELK
Introduction to ELKIntroduction to ELK
Introduction to ELK
 
Introducing CouchDB
Introducing CouchDBIntroducing CouchDB
Introducing CouchDB
 
Monitoring with Graylog - a modern approach to monitoring?
Monitoring with Graylog - a modern approach to monitoring?Monitoring with Graylog - a modern approach to monitoring?
Monitoring with Graylog - a modern approach to monitoring?
 

Similar to Elk stack @inbot

Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stack
Vikrant Chauhan
 
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
Maarten Balliauw
 
Why databases cry at night
Why databases cry at nightWhy databases cry at night
Why databases cry at night
Michael Yarichuk
 
Search and analyze data in real time
Search and analyze data in real timeSearch and analyze data in real time
Search and analyze data in real time
Rohit Kalsarpe
 
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management....NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
NETFest
 
Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travix
Ruslan Lutsenko
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2
ice799
 
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
Tim Bunce
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2
Sujee Maniyam
 
Smarter internet of things with stream and event processing virtual io_t_meet...
Smarter internet of things with stream and event processing virtual io_t_meet...Smarter internet of things with stream and event processing virtual io_t_meet...
Smarter internet of things with stream and event processing virtual io_t_meet...
Istvan Rath
 
Low level java programming
Low level java programmingLow level java programming
Low level java programming
Peter Lawrey
 
DotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETDotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NET
Maarten Balliauw
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Hernan Costante
 
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
Maarten Balliauw
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
jhugg
 
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod serversKibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
HYS Enterprise
 
ConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory laneConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory lane
Maarten Balliauw
 
Redis for duplicate detection on real time stream
Redis for duplicate detection on real time streamRedis for duplicate detection on real time stream
Redis for duplicate detection on real time stream
Roberto Franchini
 
Redis - for duplicate detection on real time stream
Redis - for duplicate detection on real time streamRedis - for duplicate detection on real time stream
Redis - for duplicate detection on real time stream
Codemotion
 

Similar to Elk stack @inbot (20)

Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stack
 
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory...
 
Why databases cry at night
Why databases cry at nightWhy databases cry at night
Why databases cry at night
 
Search and analyze data in real time
Search and analyze data in real timeSearch and analyze data in real time
Search and analyze data in real time
 
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management....NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
.NET Fest 2018. Maarten Balliauw. Let’s refresh our memory! Memory management...
 
Migrating the elastic stack to the cloud, or application logging @ travix
 Migrating the elastic stack to the cloud, or application logging @ travix Migrating the elastic stack to the cloud, or application logging @ travix
Migrating the elastic stack to the cloud, or application logging @ travix
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2
 
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
BigDataCloud meetup - July 8th - Cost effective big-data processing using Ama...
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
 
Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2Cost effective BigData Processing on Amazon EC2
Cost effective BigData Processing on Amazon EC2
 
Smarter internet of things with stream and event processing virtual io_t_meet...
Smarter internet of things with stream and event processing virtual io_t_meet...Smarter internet of things with stream and event processing virtual io_t_meet...
Smarter internet of things with stream and event processing virtual io_t_meet...
 
Low level java programming
Low level java programmingLow level java programming
Low level java programming
 
DotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NETDotNetFest - Let’s refresh our memory! Memory management in .NET
DotNetFest - Let’s refresh our memory! Memory management in .NET
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
Exploring .NET memory management - A trip down memory lane - Copenhagen .NET ...
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
 
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod serversKibana+ElasticSearch+LogStash to handle Log messages on Prod servers
Kibana+ElasticSearch+LogStash to handle Log messages on Prod servers
 
ConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory laneConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory lane
 
Redis for duplicate detection on real time stream
Redis for duplicate detection on real time streamRedis for duplicate detection on real time stream
Redis for duplicate detection on real time stream
 
Redis - for duplicate detection on real time stream
Redis - for duplicate detection on real time streamRedis - for duplicate detection on real time stream
Redis - for duplicate detection on real time stream
 

Recently uploaded

Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
Google
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
Roshan Dwivedi
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 

Recently uploaded (20)

Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 

Elk stack @inbot

  • 1. The ELK Stack @ Inbot Jilles van Gurp - Inbot Inc.
  • 2. Who is Jilles? www.jillesvangurp.com, and @jillesvangurpon everything I've signed up for Java (J)Ruby Python Javascript/node.js Servers reluctant Devops guy Software Architecture Universities of Utrecht (NL), Blekinge (SE), and Groningen (NL) GX (NL),Nokia Research (FI), Nokia/Here (DE),Localstream (DE), Inbot(DE).
  • 3. Inbot app - available for Android & IOS
  • 5. Recent trends Clustered/scalable time series DBs Other people than sysadmins looking at graphs Databases do some funky stuff these days: aggregations, search Serverless, Docker, Amazon Lambda, Microservices etc. - where do the logs go? More moving parts = more logs than ever
  • 6. Logging Kind of a boring topic ... Stuff runs on servers, cloud, whatever Produces errors, warnings, debug, telemetry, analytics, kpis, ux events, ... Where does all this go and how do you make sense of it? WHAT IS HAPPENING??!?!
  • 7. Old school: Cat, grep, awk, cut, …. Good luck with that on 200GB of unstructured logs from a gazillion microservices on 40 virtual machines, docker images, etc. That doesn't really work anymore ... If you are doing this: you are doing it wrong!
  • 8. Hadoop ecosystem? Works great for structured data, if you know what you are looking for. Requires a lot of infrastructure and hassle. Not really real-time, tedious to explore data Some hipster with a Ph.D. will fix it or ... I’m not a data scientist, are you?
  • 9. Monitoring/graphing ecosystem Mostly geared at measuring stuff like cpu load, IO, memory, etc. Intended for system administrators What about the higher level stuff? You probably should do monitoring but it’s not really what we need either ...
  • 11. Logging Most languages/servers ship with awful logging defaults, you can fix this Log enough but not too much or too little. Log at the right log level ⇒ Turn off DEBUG log. Use ERROR sparingly. Log metadata so you can pick your logs apart ⇒ Metadata == json fields. Log opportunistically, it's cheap
  • 12. Too much logging Your Elasticsearch cluster dies/you pay a fortune to keep data around that you don’t need. Not enough logging Something happened, you don’t know what because there’s nothing in the logs; you can't find back relevant events because metadata is missing. You are going to waste what you saved in cost on finding out WTF is going on, probably more.
  • 13. Log entries in ELK { "message": "[3017772.750979] device-mapper: thin: 252:0: unable to service pool target messages in READ_ONLY or FAIL mode", "@timestamp": "2016-08-16T09:50:01.000Z", "type": "syslog", "host": "10.1.6.7", "priority": 3, "timestamp": "Aug 16 09:50:01", "logsource": "ip-10-1-6-7", "program": "kernel", "severity": 3, "facility": 0, "facility_label": "kernel", "severity_label": "Error" }
  • 14. Plumbing your logs Simple problem: given some logs, convert it into json and shove it into Elasticsearch. Lots of components to help you do that: Logstash, Docker Gelf driver, Beats, etc. If you can, log json natively: e.g. Logback logstash driver, http://jsonlines.org/
  • 15. Ca. 40 Amazon EC2 instances, most of which have docker containers VPC with several subnets and dmz. Testing, production, and dev environments + dev infrastructure. AWS comes with monitoring & alerts for basic stuff. Everything logs to http://logs-internal.inbot.io/ Elasticsearch 2.2.0, logstash 2.2.1, kibana 4.4.1 1 week data retention, 14M events/day Inbot technical setup
  • 17. Things to watch out for Avoid split brains and other nasty ES failure modes -> RTFM & configure ... Data retention policies are not optional Use curator https://github.com/elastic/curator Customise your mappings, changing them sucks on a live logstash cluster. Dynamic mappings on fields that sometimes look like a number will break shit. Running out of CPU credits in Amazon can kill your ES cluster ES Rolling restarts take time when you have 6 months of logs
  • 18. Mapped Diagnostic Context (MDC) Common in java logging fws - log4j, slf4j, logback, etc. Great for adding context to your logs E.g. user_id, request url, host name, environment, headers, user agent, etc. Makes it easy to slice and dice your logs { MDC.put("user_id","123"); LOG.info("some message"); MDC.remove("user_id"); }
  • 19. MDC for node.js: our log4js fork https://github.com/joona/log4js-node Allows for MDC style attributes Sorry: works for us but not in shape for pull request; maybe later. But: this was an easy hack.
  • 21. Application Metrics http://metrics.dropwizard.io/ Add counters, timers, gauges, etc. to your business logic. metrics.register("httpclient_leased", new Gauge<Integer>() { @Override public Integer getValue() { return connectionManager.getTotalStats().getLeased(); } }); Reporter uses MDC to log once per minute: giant json blob but it works.
  • 22. Docker Gelf driver Configure your docker hosts to log the output of any docker containers using the log driver. command, container id, etc. become fields in log entry nice as a fallback when you don't control the logging /usr/bin/docker daemon --log-driver=gelf --log-opt gelf-address=udp://logs-internal.inbot.io:12201