This presentation gives a general overview of logging and why it is becoming more important for cloud-based systems. In particular, it focuses on the limitations of PAAS logging infrastructures and outlines how more powerful logging can be achieved on Python platforms such as App Engine, with details of the Python/App Engine plug-in libraries that have been developed. The talk also outlines how JLizard has made wide use of Python as part of its log-management-as-a-service technology (Logentries.com).
2. Outline
• A Logging Story
• About Us
• Data Data Everywhere (i.e. Logs Logs Everywhere)
• Logentries.com
• Logging in the Cloud & PAAS
• Python vs JLizard
7. JLizard & Logentries
• UCD spin out company
• Founded by Viliam Holub & Trevor Parsons
• Based on UCD/IBM Research (Enterprise Ireland)
• Participated in National Digital Research Centre’s Launchpad program (styled on Y Combinator)
• Nova UCD spin out award
• EI Funded Company
• Dogpatch Europe Company
• Logentries Launched in Q3 2011
8. About Me
• Trevor Parsons
– Tech Background
• Enterprise Software Design, Monitoring Tools, Data Mining
• Java Java Java
• Performance & Testing Tools
– Startups
• Crovan: Training & Consultancy
• JLizard
9. Data Data Everywhere
(i.e. Logs Logs Everywhere)
• An estimated 150 exabytes (150 billion gigabytes) of data were created in 2005; this year about 8 times that amount (1,200 exabytes) will be created
- The Economist Feb 2010
• “The amount of enterprise data will grow about 650% over the next five years, the vast majority of it unstructured, or not included in any database.”
- Gartner 2009
• Log data is the fastest-growing data source at large organizations
- CNET magazine 2010
• Many organizations are currently producing terabytes of log data per month
- CIO.Net 2009
10. Log Maths
100,000 log messages / second x 300 bytes / log message ~ 28.6 MB / second
x 3600 seconds ~ 100.6 GB / hour
x 24 hours ~ 2.35 TB / day
x 365 days ~ 860.5 TB / year
x 3 years ~ 2.52 PB
From Anton Chuvakin’s Blog Aug 2010
http://chuvakin.blogspot.com/
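The arithmetic on this slide can be checked with a short sketch. The function and constant names are illustrative, and the use of binary units (1 MB = 1024 x 1024 bytes) is an assumption: it is the choice that makes the slide's figures come out. The daily figure works out to about 2.357 TB, which the slide shows as 2.35.

```python
# Reproduces the Log Maths arithmetic.
# Assumption: binary units (1 MB = 1024**2 bytes), which match the slide's figures.
MESSAGES_PER_SECOND = 100_000
BYTES_PER_MESSAGE = 300

HOUR = 3600          # seconds
DAY = 24 * HOUR
YEAR = 365 * DAY


def log_volume_bytes(seconds):
    """Total bytes logged over `seconds` at the slide's rate and message size."""
    return MESSAGES_PER_SECOND * BYTES_PER_MESSAGE * seconds


def in_unit(num_bytes, power):
    """Convert bytes to a binary unit: power 2 is MB, 3 is GB, 4 is TB, 5 is PB."""
    return num_bytes / 1024 ** power


print(f"{in_unit(log_volume_bytes(1), 2):.1f} MB / second")
print(f"{in_unit(log_volume_bytes(HOUR), 3):.1f} GB / hour")
print(f"{in_unit(log_volume_bytes(DAY), 4):.3f} TB / day")
print(f"{in_unit(log_volume_bytes(YEAR), 4):.1f} TB / year")
print(f"{in_unit(log_volume_bytes(3 * YEAR), 5):.2f} PB / 3 years")
```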
11. Typical Log Volumes
Customer Type                         Log Volumes           Events per Second  Events per Day
Large Cloud Provider                  50 Terabytes per Day  2,000,000          172,000,000,000
Large Social Media Organisation       25 Terabytes per Day  1,000,000
Telecom Middleware/Applications       1 Terabyte per Day    50,000
Large Organisation (>1000 employees)  300 GB per Day        15,000
Online Marketing Org                  100 GB per Day        5,000              432,000,000
Small Data Centre                     10 GB per Day         500
SAAS Educational Tools                5 GB per Day          250
Single IBM Test Team                  2 GB per Day          100
Online Multimedia                     700 MB per Day        35
Early Stage Start up                  50 MB per Day         25                 2,000,000
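The Events per Day column follows from Events per Second multiplied by the 86,400 seconds in a day. A minimal sketch, with illustrative function names and decimal gigabytes assumed for the average-size estimate:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_day(events_per_second):
    """Daily event count for a steady per-second event rate."""
    return events_per_second * SECONDS_PER_DAY


def average_event_bytes(bytes_per_day, events_per_second):
    """Rough average event size implied by a daily volume and an event rate."""
    return bytes_per_day / events_per_day(events_per_second)


# Online Marketing Org row: 5,000 events per second.
print(events_per_day(5_000))
# Assumption: 100 GB taken as 100 * 10**9 bytes.
print(round(average_event_bytes(100 * 10**9, 5_000)))
```

For the Large Cloud Provider row the same calculation gives 172,800,000,000 events per day, which the table shows rounded down.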
12. Consequences
…Hacked System …Slow Web Site …System Crash
“Every minute that Facebook is down costs Zynga $10,000”
SocialMediaInfluence.com, September 2010
13. Logging - A Cloud Requirement
Server to Admin Ratio Increasing Significantly, from 10:1 to 500:1
Computerworld.com, July 2010
Compliance and Security Legislation Driving Requirement for
Log Management
Where Is My Elastic Log Management Service?
securecloudreview.com & networkworld, September 2010
14. PAAS Logging
• Limited Logging Infrastructure
– App Engine – limited log buffers
– Heroku – 500 events for free
• No/Limited access to File System
• No Log Storage
• How to debug without any logs????
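One common way around a missing file system is to buffer log records in memory and forward them to an external service in batches. The sketch below uses only the standard `logging` module. The class name, the `send` callable and the batch size are hypothetical, and this is not the Logentries plug-in itself.

```python
import logging


class BufferedForwardingHandler(logging.Handler):
    """Buffers formatted records in memory and hands them to `send` in batches.

    `send` is a hypothetical transport: any callable that accepts a list of
    strings (for example, a function that posts them to a log service).
    """

    def __init__(self, send, batch_size=100):
        super().__init__()
        self.send = send
        self.batch_size = batch_size
        self.buffer = []

    def emit(self, record):
        try:
            self.buffer.append(self.format(record))
            if len(self.buffer) >= self.batch_size:
                self.flush()
        except Exception:
            self.handleError(record)

    def flush(self):
        # The handler lock is re-entrant, so this is safe when called from emit().
        self.acquire()
        try:
            if self.buffer:
                batch, self.buffer = self.buffer, []
                self.send(batch)
        finally:
            self.release()


# Usage: collect batches in a list instead of sending them over the network.
shipped = []
demo_logger = logging.getLogger("paas_demo")
demo_logger.propagate = False
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(BufferedForwardingHandler(send=shipped.extend, batch_size=2))
demo_logger.info("request started")
demo_logger.info("request finished")
```

Nothing is written to disk here, so the approach does not depend on file system access. Records still in the buffer are lost if the process dies before a flush.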
15. Python vs. JLizard
Python @logentries
• Logentries Agent
• Web interface – Django
• App Engine Plug-ins
16. Logentries Python Agent
Why Python?
• Lightweight agent
• Runs on Linux Out Of The Box
• Windows support
– created an exe
18. Logentries Python Agent
Available on GitHub: www.github.com/logentries
Features:
- Authenticate with Logentries Account
- Register your machine
- Configure your logs
- Monitor Logs
- Export Logs
- Push Logs
- Filter Logs
- Configure your logs (clusters / apps)
- Navigate and manage your account
- Windows Event Logs
What it does:
– Communicates with Logentries API
– Uses SSL
– Data compressed on the wire
Other Stuff:
– Run as daemon
– Run as Windows service
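The "Uses SSL" and "Data compressed on the wire" points can be illustrated with the standard library alone. This is a sketch, not the agent's actual protocol: the function names, the host and port passed to `push_batch`, and the length-prefixed framing are all assumptions made for illustration.

```python
import socket
import ssl
import zlib


def pack_lines(lines):
    """Join log lines with newlines and compress them with zlib."""
    return zlib.compress("\n".join(lines).encode("utf-8"))


def unpack_lines(blob):
    """Inverse of pack_lines, as a receiving side would apply it."""
    return zlib.decompress(blob).decode("utf-8").split("\n")


def push_batch(host, port, lines, timeout=10.0):
    """Send one compressed batch over a certificate-verified TLS connection.

    Assumed framing (hypothetical): 4-byte big-endian length, then zlib data.
    `host` and `port` stand for whatever endpoint the log service exposes.
    """
    blob = pack_lines(lines)
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(len(blob).to_bytes(4, "big") + blob)


# Repetitive log text compresses well; compare sizes locally without a network.
sample = ["GET /index.html 200 12ms"] * 1000
print(len("\n".join(sample)), "bytes raw,", len(pack_lines(sample)), "bytes compressed")
```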
19. Django
• Web Site
• Notifications
• User Authentication
• Landing Pages/Sign up
• Billing (Recurly Python Interface)
• Application UI
20. App Engine Logging
Logging on App Engine:
• Nice interface
• Limited Buffers
• Can download logs periodically
• No Trend Analysis
• No Business Intelligence
• App Engine Roadmap - Improvements
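As an illustration of what basic trend analysis over periodically downloaded logs could look like, the sketch below counts ERROR lines per hour. The line layout (`YYYY-MM-DD HH:MM:SS LEVEL message`) and the function name are assumptions for illustration, not App Engine's actual log format.

```python
from collections import Counter
from datetime import datetime


def errors_per_hour(lines):
    """Count ERROR lines per hour.

    Assumes each line looks like '2011-10-01 13:05:22 ERROR message'
    (a hypothetical layout); lines that do not match are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[2] != "ERROR":
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # timestamp not in the assumed format
        counts[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return counts


downloaded = [
    "2011-10-01 13:05:22 ERROR db timeout",
    "2011-10-01 13:40:10 INFO request served",
    "2011-10-01 13:59:59 ERROR db timeout",
    "2011-10-01 14:00:01 ERROR out of memory",
]
for hour, count in sorted(errors_per_hour(downloaded).items()):
    print(hour, count)
```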