Why Loggly Loves Apache Kafka, and How We Use Its Unbreakable Messaging for Better Log Management
Infrastructure Engineering Team
June 2014
| Log management as a service Simplify Log Management
What Loggly Does 
World’s most popular cloud-based log management service
§ More than 5,000 customers 
§ Near real-time indexing of events 
Distributed architecture, built on AWS 
Initial production services in 2011 
§ Loggly Generation 2 released in Sept 2013 
Loggly: Addressing the first big data problem every company faces
§ Centralized logging and archival
§ Real-time processing, analysis and visualization
§ Monitoring, alerting and troubleshooting
Agenda for this Presentation
§ The challenges of log management at scale
§ Overview of Loggly’s processing pipeline
§ Alternative technologies considered
§ Why we love Apache Kafka
§ How Kafka has added flexibility to our pipeline
The Challenges of Log Management at Scale
§ Big data
  – >750 billion events logged to date
  – Sustained bursts of 100,000+ events per second
  – Data space measured in petabytes
§ Need for high fault tolerance
§ Near real-time indexing requirements
§ Time-series index management
Log Management Processing Pipeline: Overview
[Pipeline diagram; recoverable labels: Load Balancing, Kafka Stage 2, Loggly Custom Module]
Collectors Can Easily Outpace Downstream Processes
§ Written in C++
§ Designed to ingest massive data volumes
§ Need to collect regardless of what’s happening downstream
Solution: Queue That’s External to Collector
§ Based on Apache Kafka
§ Highly performant and reliable
Alternate/Supplementary Approaches Considered
§ Internal buffering in collectors
  – Added complexity
§ Cassandra
  – Not as good a queue as Kafka
§ Apache Storm
  – In initial Gen2 architecture, removed after launch
The Secret to Log Management at Scale: Keep It Simple, Stupid
Results:
§ Can process sustained rates of 100,000+ events per second per cluster
§ Average message size: 300 bytes
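As a rough sanity check, the two figures above can be combined into a data rate. The sketch below is only a back-of-the-envelope estimate: it assumes a constant 100,000 events per second (the slide gives this as a lower bound), counts payload bytes only, and uses decimal units. The function names are illustrative and not part of any Loggly tooling.

```python
# Back-of-the-envelope throughput estimate from the two slide figures.
EVENTS_PER_SECOND = 100_000   # sustained rate per cluster (slide figure, a lower bound)
AVG_MESSAGE_BYTES = 300       # average message size (slide figure)
SECONDS_PER_DAY = 86_400


def bytes_per_second(events_per_second: int, avg_bytes: int) -> int:
    """Payload bytes per second, ignoring protocol and storage overhead."""
    return events_per_second * avg_bytes


def terabytes_per_day(bytes_per_sec: int) -> float:
    """Convert a constant byte rate to decimal terabytes per day."""
    return bytes_per_sec * SECONDS_PER_DAY / 1e12


rate = bytes_per_second(EVENTS_PER_SECOND, AVG_MESSAGE_BYTES)
print(f"{rate / 1e6:.0f} MB/s")                  # 30 MB/s
print(f"{terabytes_per_day(rate):.2f} TB/day")   # 2.59 TB/day
```

Under these assumptions a single cluster at the quoted sustained rate moves a few terabytes of payload per day.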
Why We Love Kafka
What Attracted Us in the First Place
No single point of failure
• Terabytes of data move through our Kafka cluster every day without losing a single event
• We use age-based retention to purge old data on disks
Low latency
• 99.99999% of the time our data is coming from disk cache and RAM; only very rarely do we hit disk
Performance
• Crazy good!
• We currently have a bunch of Kafka brokers running on m2.xlarge instances backed by provisioned IOPS
• One of our consumer groups (eight threads), which maps a log to a customer, can process about 200,000 events per second draining from 192 partitions spread across three brokers
Scalability
• Ability to increase partition count per topic and downstream consumer threads provides flexibility to increase throughput when desired
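To make the performance figures concrete, here is a small sketch of the per-thread and per-broker shares they imply. It assumes partitions and load are spread evenly, which the slide does not state. It also relies on the general Kafka rule that, within one consumer group, a partition is read by at most one consumer at a time, so threads beyond the partition count sit idle. The helper names are illustrative.

```python
# Shares implied by the slide figures, assuming an even spread (an assumption).
PARTITIONS = 192
CONSUMER_THREADS = 8
BROKERS = 3
EVENTS_PER_SECOND = 200_000


def even_share(total: float, parts: int) -> float:
    """Share of `total` each of `parts` receives under an even split."""
    return total / parts


def useful_threads(partitions: int, threads: int) -> int:
    """Threads that can actually do work: one consumer per partition at most."""
    return min(partitions, threads)


print(even_share(PARTITIONS, CONSUMER_THREADS))         # 24.0 partitions per thread
print(even_share(PARTITIONS, BROKERS))                  # 64.0 partitions per broker
print(even_share(EVENTS_PER_SECOND, CONSUMER_THREADS))  # 25000.0 events/s per thread
print(useful_threads(PARTITIONS, 256))                  # 192
```

With 192 partitions there is headroom to add consumer threads well beyond the current eight before more partitions are needed.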
How Our Kafka Crush Has Deepened
Distributed log collection
• Local pods and collectors spread all over the Internet, with local Kafka deployments, collect data from customers located all over the world
• Can collect logs even when we lose connectivity
• When the network comes back, Kafka sends the logs downstream to the rest of the pipeline
More efficient, effective DevOps
• Deploying Kafka throughout the pipeline makes it easy to disable certain parts of the system (for troubleshooting or upgrades)
• No worrying that we will lose customer data
• Example: Add support for a new log type in our automatic parsing capabilities by turning off the existing parser, deploying the new one, and processing the logs that Kafka has queued up
Controlling resource utilization
• Keep collectors as simple as possible for resilience and reliability reasons
• Add intelligence into our pipelines using Kafka
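The parser-swap example can be illustrated with a toy model. This is not Kafka client code: `ToyLog` and `ToyConsumer` are hypothetical in-memory stand-ins that only mimic the two properties the slide relies on, an append-only log that keeps accepting writes and a consumer that remembers its offset so it can resume where it stopped.

```python
from typing import Callable, List


class ToyLog:
    """In-memory stand-in for one topic partition: an append-only list."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, record: str) -> None:
        self._records.append(record)

    def read_from(self, offset: int) -> List[str]:
        return self._records[offset:]


class ToyConsumer:
    """Tracks its own offset, so it can stop and later resume without loss."""

    def __init__(self, log: ToyLog, parse: Callable[[str], str]) -> None:
        self.log = log
        self.parse = parse
        self.offset = 0

    def drain(self) -> List[str]:
        pending = self.log.read_from(self.offset)
        self.offset += len(pending)
        return [self.parse(record) for record in pending]


log = ToyLog()
consumer = ToyConsumer(log, parse=lambda r: f"v1:{r}")

log.append("a")
first = consumer.drain()  # the old parser handles "a"

# The parser is "turned off": collectors keep appending, nothing is consumed.
log.append("b")
log.append("c")

# The new parser is deployed and picks up from the saved offset.
consumer.parse = lambda r: f"v2:{r}"
second = consumer.drain()

print(first)   # ['v1:a']
print(second)  # ['v2:b', 'v2:c']
```

The collector side never needs to know the parser was down, which is the property that lets collectors stay simple.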
Resource Utilization Example: “Noisy Neighbors”
“Noisy Neighbors” are Inherent to SaaS
§ Sending many times their “normal” level of logging volume, inadvertently or because their application is in big trouble
§ Routing logs to a separate queue minimizes impact on other customers
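One way to picture the routing rule is a function that picks a topic name per customer. Everything specific here is an assumption made for illustration: the topic names, the factor of 10, and the idea of comparing a current rate against a per-customer baseline. The slides do not describe how Loggly detects or routes noisy customers.

```python
# Hypothetical topic names and threshold, chosen only for this sketch.
SHARED_TOPIC = "events.shared"
NOISY_PREFIX = "events.noisy"


def choose_topic(
    customer_id: str,
    current_rate: float,
    baseline_rate: float,
    noisy_factor: float = 10.0,
) -> str:
    """Return a dedicated topic when a customer far exceeds its normal rate."""
    if baseline_rate > 0 and current_rate > noisy_factor * baseline_rate:
        return f"{NOISY_PREFIX}.{customer_id}"
    return SHARED_TOPIC


print(choose_topic("acme", current_rate=120.0, baseline_rate=100.0))     # events.shared
print(choose_topic("acme", current_rate=50_000.0, baseline_rate=100.0))  # events.noisy.acme
```

Because the noisy customer's events land in their own queue, a backlog there does not delay consumers draining the shared queue.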
Kafka Queues Add Flexibility to Loggly Pipeline
§ Because Kafka topics are very cheap from a performance and overhead standpoint, we can create as many queues as we want
§ Scaled to the performance we want
§ Optimizing resource utilization across the system
§ Because they can be created dynamically, we can make business rules very flexible
§ Makes us confident that pipeline will scale as customer data volumes do
Conclusion: Kafka Frees Our Development Team to Build Differentiating Features
§ Kafka deployment working without us thinking about it
§ Plenty of other things to do to keep our position as the world’s most popular cloud-based log management service!
Does Log Management Sound Hard? It Should!
Let us do the heavy lifting for you!
Try Loggly FREE for 30 days
About Us:
Loggly is the world’s most popular cloud-based log management solution, used by more than 5,000 happy customers to effortlessly spot problems in real-time, easily pinpoint root causes and resolve issues faster to ensure application success.
Visit us at loggly.com or follow @loggly on Twitter.
Did you like this presentation? Head over to our blog for more great content!