Log Files
by Heinrich Hartmann
twitter: @HeinrichHartman
web: heinrich-hartmann.net
Where do log files come from?
Use Case: Distributed Web Applications
● Web Servers
● Application Servers
● Databases
● Network infrastructure:
Routers / Load balancers / Switches
What do Log Files look like?
Example Web Server Log:
● Timestamp
● Source hostname / IP
● Session ID (if available)
● Request URL
● Return code
● Reply size
Example Log File
NASA Dataset (http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html)
Two months' worth of HTTP requests to Kennedy Space Center (22 MB zipped).
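A minimal parsing sketch for the fields listed above, assuming the log is in Common Log Format (host, identity, user, timestamp, request, status, size). The pattern, the function name `parse_clf_line`, and the sample line are illustrative and not taken from the slides or the dataset. Common Log Format carries no session ID, so that field is not extracted here.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Parse one Common Log Format line into a dict, or return None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The request field is usually "METHOD URL PROTOCOL"; keep only the URL.
    parts = fields["request"].split()
    url = parts[1] if len(parts) >= 2 else fields["request"]
    return {
        "host": fields["host"],
        "timestamp": datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "url": url,
        "status": int(fields["status"]),
        # A reply size of "-" means that no body was sent.
        "size": 0 if fields["size"] == "-" else int(fields["size"]),
    }


# Illustrative line with made-up values (not copied from the dataset).
sample = '192.0.2.1 - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 6245'
record = parse_clf_line(sample)
print(record["host"], record["url"], record["status"], record["size"])
```

Lines that do not match the pattern return `None`, so a caller can count or skip malformed entries instead of failing on them.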
What insights do Log Files provide?
● Monitoring
Are my servers up and running as expected?
How many resources are being used?
Are my business metrics ok? (earnings/h)
● Troubleshooting / Debugging
Why is my system slow?
Where are messages dropped?
● Reporting / Mining
How do my users behave?
Click stream analysis. KPI calculations (click through, bounce rate)
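As one example of a KPI calculation, the sketch below computes a bounce rate from a click stream. It assumes that each log record has already been reduced to a (session ID, URL) pair and that a "bounce" means a session with exactly one request; both the definition and the name `bounce_rate` are assumptions for illustration.

```python
from collections import Counter


def bounce_rate(requests):
    """Fraction of sessions that consist of exactly one request.

    `requests` is an iterable of (session_id, url) pairs.
    """
    hits_per_session = Counter(session_id for session_id, _url in requests)
    if not hits_per_session:
        return 0.0
    bounces = sum(1 for hits in hits_per_session.values() if hits == 1)
    return bounces / len(hits_per_session)


# Made-up click stream: sessions "a" and "c" bounce, session "b" does not.
clicks = [("a", "/"), ("b", "/"), ("b", "/products"), ("c", "/about")]
rate = bounce_rate(clicks)
print(f"bounce rate: {rate:.2f}")
```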
Log Volume
Example Calculation for Wikipedia
● 200 application servers
● 20 database servers
● 70 cache servers
● 10k HTTP requests per second (rps) load (peak: 50k rps)
● 80k SQL queries per second peak load
Apache Access Log rate:
10k req/sec * 100 B per log message = 1 MB/s (peak: 5 MB/s)
= 86 GB/day (i.e. BIG)
Source:
http://www.datacenterknowledge.com/archives/2008/06/24/a-look-inside-wikipedias-infrastructure/
http://reportcard.wmflabs.org/
https://ganglia.wikimedia.org/
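The access log estimate can be reproduced step by step. The sketch below uses the request rates and the 100-byte message size from the slide, and assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes); the variable names are illustrative.

```python
requests_per_second = 10_000        # load figure from the slide
peak_requests_per_second = 50_000   # peak figure from the slide
bytes_per_message = 100             # rough size of one access log line
seconds_per_day = 24 * 60 * 60      # 86,400 seconds

bytes_per_second = requests_per_second * bytes_per_message
peak_bytes_per_second = peak_requests_per_second * bytes_per_message
bytes_per_day = bytes_per_second * seconds_per_day

MB = 10**6  # decimal megabyte (assumption)
GB = 10**9  # decimal gigabyte (assumption)

print(f"rate:   {bytes_per_second / MB:.0f} MB/s")
print(f"peak:   {peak_bytes_per_second / MB:.0f} MB/s")
print(f"volume: {bytes_per_day / GB:.1f} GB/day")
```

The daily figure assumes the 10k rps rate holds for the full 24 hours, so it is an order-of-magnitude estimate rather than a measurement.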
Log File Processing
● Web Servers: business logic; generate log files.
● Log Aggregator: gathers log files from individual servers and stores them in a central location.
● Log Monitor: real-time reports, dashboards, plots, alerts.
● Log Analytics: batch processing, data mining.
Source: Theo Schlossnagle - Scalable Internet Architectures
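One task of a log aggregator is to combine the per-server logs into a single time-ordered log. The sketch below shows this with the standard library's `heapq.merge`, assuming each server's log is already sorted by timestamp and that entries are (timestamp, message) pairs; `merge_logs` and the sample entries are made up for illustration.

```python
import heapq


def merge_logs(*server_logs):
    """Merge per-server (timestamp, message) streams into one ordered stream.

    Each input must already be sorted by timestamp. The merge is lazy, so
    the inputs can be file iterators rather than in-memory lists.
    """
    return heapq.merge(*server_logs, key=lambda entry: entry[0])


# Made-up entries from two web servers, with integer timestamps.
web1 = [(1, "web1 GET /"), (4, "web1 GET /about")]
web2 = [(2, "web2 GET /"), (3, "web2 POST /login")]

merged = list(merge_logs(web1, web2))
for timestamp, message in merged:
    print(timestamp, message)
```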
Classical Solution
● Local storage of log files on web servers
● Periodic “pull” aggregation of log files via FTP or SCP
Drawbacks:
● No real-time access to logs.
● No log files from crashed servers.
[Diagram: Web Servers, Log Aggregator, Log Monitor, Log Analytics]
Real-Time Unicast
● Real-time aggregation of log files (“push”)
● Need to use reliable transfer (plain syslog only provides UDP)
● Configuration management is complicated:
- every web server needs to know about the log aggregator
- problematic if redundancy should be added
[Diagram: Web Servers, Log Aggregator, Log Monitor, Log Analytics]
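A minimal sketch of the push model over a reliable transport, assuming newline-delimited log lines sent over TCP. Everything here (`AggregatorHandler`, `push_logs`, the loopback address, the sample lines) is hypothetical and stands in for a real aggregator; it only illustrates that each web server must be given the aggregator's address.

```python
import socket
import socketserver
import threading

received = []                  # lines collected by the aggregator
received_lock = threading.Lock()
done = threading.Event()       # set once a sender has closed its connection


class AggregatorHandler(socketserver.StreamRequestHandler):
    """Reads newline-delimited log lines from one web server connection."""

    def handle(self):
        for raw in self.rfile:
            with received_lock:
                received.append(raw.decode("utf-8").rstrip("\n"))
        done.set()


def push_logs(address, lines):
    """Send log lines to the aggregator over TCP (reliable and ordered)."""
    with socket.create_connection(address) as conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))


# Port 0 lets the OS pick a free port; a real setup would use a fixed one.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), AggregatorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

push_logs(server.server_address, ["web1 GET / 200", "web1 GET /about 404"])

done.wait(timeout=5)   # wait until the aggregator has drained the connection
server.shutdown()
server.server_close()
print(received)
```

The sender needs the aggregator's address as an argument, which is the configuration burden the slide mentions: adding a second aggregator for redundancy means touching every web server.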
Passive “Sniffing” Log Aggregation
● Log files are produced by sniffing network packets
● Very accurate log files
● No interaction with or configuration of the servers required
● Needs a single egress point
● Security flaw (man-in-the-middle attack). Not compatible with SSL.
[Diagram: Internet, Web Servers, Log Aggregator]
Best practice: Group communication
● Real-time log distribution using group messaging (ZeroMQ/Spread/Thrift)
● Flexible communication patterns (allow multiple subscribers)
● Use reliable IP multicast to reduce network load
● Less configuration overhead (group subscriptions)
[Diagram: Web Servers, Log Aggregator, Log Analytics, Log Monitor(s)]
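The publish/subscribe pattern behind group communication can be sketched in plain Python. This is an in-process stand-in, not ZeroMQ or Spread, and it involves no network or multicast; the class `LogBus`, the group name "access", and the sample lines are assumptions made for illustration.

```python
from collections import defaultdict


class LogBus:
    """In-process stand-in for a group messaging layer."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, group, callback):
        """Register a callback for every message published to `group`."""
        self._subscribers[group].append(callback)

    def publish(self, group, message):
        """Deliver a message to all current subscribers of `group`."""
        for callback in self._subscribers[group]:
            callback(message)


archive = []  # plays the role of the log aggregator
errors = []   # plays the role of a log monitor watching for 5xx replies


def record_server_errors(line):
    # The status code is assumed to be the last whitespace-separated field.
    if line.split()[-1].startswith("5"):
        errors.append(line)


bus = LogBus()
bus.subscribe("access", archive.append)
bus.subscribe("access", record_server_errors)

# Web servers publish to the group without knowing who is listening.
bus.publish("access", "web1 GET / 200")
bus.publish("access", "web2 GET /cart 503")
print(len(archive), errors)
```

Adding another monitor is one more `subscribe` call; the publishing side does not change, which is the reduced configuration overhead the slide refers to.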
Further Topics
Real-Time Monitoring
● Splunk
● Circonus
● Kibana
● Storm
Log batch analysis
● Map-Reduce/Hadoop
● Hive
● Drill/Dremel
Further Reading
● http://hortonworks.com/use-cases/server-logs-hadoop-example/
● http://www.slideshare.net/mapredit/apache-flume-ng
● Logstash
● http://www.elasticsearch.org/overview/kibana/
Further Steps
● Chefkoch dataset (log files from several months)
1. Inspect data
2. Gather interesting questions about the data
3. Try to answer questions using big data processing
● Stream processing vs. batch processing (Thomas)
- Which queries/operators can be answered on the stream?
- knowledge discovery / feature selection -> Indexing
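One query that can be answered directly on the stream is a per-window request count. The sketch below is a hypothetical tumbling-window operator, assuming timestamps arrive in order as seconds; `requests_per_window` and the sample timestamps are made up for illustration.

```python
from itertools import groupby


def requests_per_window(timestamps, window_seconds=60):
    """Yield (window_start, count) for a time-ordered stream of timestamps.

    Only the current window is held in memory, so the input may be an
    unbounded iterator. Windows without any request are not emitted.
    """
    windows = groupby(timestamps, key=lambda ts: ts - ts % window_seconds)
    for window_start, group in windows:
        yield window_start, sum(1 for _ in group)


# Made-up request timestamps in seconds.
timestamps = [0, 10, 59, 60, 61, 130]
counts = list(requests_per_window(timestamps))
print(counts)
```

A batch job would compute the same counts by scanning the stored log after the fact; the streaming version emits each result as soon as its window closes.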
Other log file sources
● Sensor data analysis
● Profiling of software projects (SOAMIG)

Log Files

  • 1. Log Files by Heinrich Hartmann twitter: @HeinrichHartman web: heinrich-hartmann.net
  • 2. Where do log files come from? Use Case: Distributed Web Applications ● Web Servers ● Application Servers ● Databases ● Network infrastructure: Routers / Load balancers / Switches
  • 3. What do Log Files look like? Example Web Server Log: ● Timestamp ● Source hostname / IP ● Session ID (if available) ● Request URL ● Return code ● Reply size
  • 4. Example Log File: NASA Dataset (http://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html). Two months' worth of HTTP requests to Kennedy Space Center (22 MB zipped).
  • 5. What insights do Log Files provide? ● Monitoring: Are my servers up and running as expected? How many resources are being used? Are my business metrics ok? (earnings/h) ● Troubleshooting / Debugging: Why is my system slow? Where are messages dropped? ● Reporting / Mining: How do my users behave? Click stream analysis. KPI calculations (click-through, bounce rate)
  • 6. Log Volume: Example Calculation for Wikipedia ● 200 application servers ● 20 database servers ● 70 cache servers ● 10k HTTP requests per second (rps) peak load (peak: 50k rps) ● 80k SQL queries per second peak load Apache Access Log rate: 10k req/s * 100 B per log message = 1 MB/s (peak: 5 MB/s) = 86 GB/day (i.e. BIG) Source: http://www.datacenterknowledge.com/archives/2008/06/24/a-look-inside-wikipedias-infrastructure/ http://reportcard.wmflabs.org/ https://ganglia.wikimedia.org/
  • 7. Log File Processing ● Web Servers: business logic; generate log files ● Log Aggregator: gathers log files from individual servers and stores them in a central location ● Log Monitor: real-time reports, dashboards, plots, alerts ● Log Analytics: batch processing, data mining Source: Theo Schlossnagle - Scalable Internet Architectures
  • 8. Classical Solution ● Local storage of log files on web servers ● Periodic “pull” aggregation of log files, via ftp or scp Drawbacks: ● No real-time access to logs ● No log files from crashed servers (Diagram: Web Servers, Log Aggregator, Log Monitor, Log Analytics)
  • 9. Real-Time Unicast ● Real-time aggregation of log files (“push”) ● Need to use reliable transfer (syslog only provides UDP) ● Configuration management is complicated: every web server needs to know about the log aggregator, which is problematic if redundancy is to be added (Diagram: Web Servers, Log Aggregator, Log Monitor, Log Analytics)
  • 10. Passive “Sniffing” Log Aggregation ● Log files are produced by sniffing network packets ● Very accurate log files ● No interaction with or configuration of the servers required ● Needs a single egress point ● Security flaw (man-in-the-middle attack); not compatible with SSL (Diagram: Internet, Web Servers, Log Aggregator)
  • 11. Best practice: Group communication ● Real-time log distribution using group messaging (ZeroMQ/Spread/Thrift) ● Flexible communication patterns (allow multiple subscribers) ● Use reliable IP multicast to reduce network load ● Less configuration overhead (group subscriptions) (Diagram: Web Servers, Log Aggregator, Log Monitor(s), Log Analytics)
  • 12. Further Topics. Real-Time Monitoring: ● Splunk ● Circonus ● Kibana ● Storm Log batch analysis: ● Map-Reduce/Hadoop ● Hive ● Drill/Dremel
  • 13. Further Reading * http://hortonworks.com/use-cases/server-logs-hadoop-example/ * http://www.slideshare.net/mapredit/apache-flume-ng * Logstash * http://www.elasticsearch.org/overview/kibana/
  • 14. Further Steps ● Chefkoch dataset (log files from several months): 1. Inspect data 2. Gather interesting questions about the data 3. Try to answer the questions using big data processing ● Stream processing vs. batch processing (Thomas): which queries/operators can be answered on the stream? Knowledge discovery / feature selection -> indexing. Other log file sources: ● Sensor data analysis ● Profiling of software projects (SOAMIG)
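
The fields listed on slide 3 (timestamp, source host, request URL, return code, reply size) can be pulled out of a web server log line with a regular expression. The following is a minimal sketch, assuming the log is in Common Log Format, which is the format Apache writes by default. The names `CLF_PATTERN` and `parse_clf_line` and the sample line are made up for illustration and are not taken from the slides or from any dataset.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [timestamp] "request" status size
# (assumed format; adjust the pattern if your server logs differently)
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    # The request field normally looks like: METHOD URL PROTOCOL
    parts = match.group("request").split()
    return {
        "host": match.group("host"),
        "timestamp": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "url": parts[1] if len(parts) >= 2 else None,
        "status": int(match.group("status")),
        # A reply size of "-" means no body was sent
        "size": 0 if match.group("size") == "-" else int(match.group("size")),
    }


# Hypothetical sample line, invented for this example
sample = 'host.example.com - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 6245'
record = parse_clf_line(sample)
print(record["host"], record["url"], record["status"], record["size"])
```

Lines that do not match return `None` instead of raising, so a batch job can count and skip malformed entries rather than abort.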
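
The log volume estimate on slide 6 can be reproduced in a few lines. This is a sketch using only the numbers given on the slide (10k requests per second, 50k at peak, roughly 100 bytes per log message); the constant and function names are illustrative.

```python
# Figures taken from slide 6; 100 bytes per message is the slide's own estimate
REQUESTS_PER_SECOND = 10_000
PEAK_REQUESTS_PER_SECOND = 50_000
BYTES_PER_MESSAGE = 100
SECONDS_PER_DAY = 24 * 60 * 60


def log_rate_bytes_per_second(requests_per_second, bytes_per_message=BYTES_PER_MESSAGE):
    """Access log volume in bytes per second, one log message per request."""
    return requests_per_second * bytes_per_message


rate = log_rate_bytes_per_second(REQUESTS_PER_SECOND)
peak_rate = log_rate_bytes_per_second(PEAK_REQUESTS_PER_SECOND)
per_day = rate * SECONDS_PER_DAY

print(f"rate: {rate / 1e6:.1f} MB/s")
print(f"peak rate: {peak_rate / 1e6:.1f} MB/s")
print(f"daily volume: {per_day / 1e9:.1f} GB/day")
```

The daily figure comes out at 86.4 GB, which the slide rounds to 86 GB/day.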