Heka - Rob Miller

  1. 1. Heka Unified Data Processing
  2. 2. So. Much. Data.
  3. 3. So. Much. Data. •Server level ops data •Process level data •Ops data / metrics •Business data •Logging output •Error reports / tracebacks
  4. 4. So. Many. Tools. •collectd •statsd / tcollector / graphite / etc. •[r]syslog[-ng] •Logstash •Riemann •Nagios / Esper / other CEP / Zenoss
  5. 5. One Basic Pattern •Acquire data •Transform data •Output and/or transport data
  6. 6. One Multi-Tool? What would it be like to build a tool to tackle this in the general case? Wins: •Fewer processes to manage •Increased client / configuration consistency •Processing shared across domains
  7. 7. One Multi-Tool? Requirements: •Lightweight •Flexible •Easily configurable and extensible
  8. 8. I know, I know...
  9. 9. BUT! Replacing even two services on each box is a net ops win. SCIENCE!
  10. 10. How Heka Is Put Together
  11. 11. Inputs •Listen or fetch •Just about the low level transport
  12. 12. Splitters •Slice Inputs' raw data streams into discrete events •Text or binary protocols •Decouple protocols from their transports
  13. 13. Decoders •Parse event data to populate a metadata envelope for all event types •Extract structure from unstructured data... •... or just wrap a blob •Sandbox-able (Lua)
  14. 14. Router Simple, efficient grammar for matching messages:
        Type == "counter" && Payload == "1"
        Type == "applog" && Logger == "marketplace"
        Type == "alert" && (Severity==7 || Payload=="emergency")
        Type == "myapp.metric" && Fields[name] =~ /.*.stat/
  15. 15. Filters •Watch flowing data •Generate output messages •Sandbox-able (Lua)
  16. 16. Outputs •Deliver to external service... •... and/or to upstream Heka... •... and/or directly to Heka Dashboard UI •Configurable reconnect
  17. 17. Sandboxes Are Fun! •Dynamically added to running Heka w/ no config changes, no restart •CPU cycles and RAM usage monitored •Misbehaving plugins are shut off
  18. 18. Sandboxes Are Fun! •LPeg (parsing expression grammar) & JSON libraries for data parsing •Circular buffer library for time series data
  19. 19. Sandboxes Are Fun! Circular buffers auto-generate dashboard graphs
  20. 20. Try It Out https://github.com/mozilla-services/heka http://hekad.readthedocs.org https://mail.mozilla.org/listinfo/heka irc.mozilla.org, #heka rmiller@mozilla.com
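
The flow described on slides 11 through 16 (input, splitter, decoder, router, filter/output) can be sketched roughly as follows. This is a minimal illustration in Python, not Heka's actual Go or Lua plugin API: the `Message` class, the `type|logger|payload` wire format, and the plugin names in `MATCHERS` are all assumptions made up for the example. The predicates mirror three of the matcher expressions from slide 14.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical metadata envelope; field names follow the slide-14 matchers."""
    type: str
    logger: str = ""
    severity: int = 7
    payload: str = ""
    fields: dict = field(default_factory=dict)


def split_lines(raw: bytes):
    """Splitter: slice a raw byte stream into discrete newline-delimited records."""
    for record in raw.split(b"\n"):
        if record:
            yield record


def decode(record: bytes) -> Message:
    """Decoder: parse the assumed 'type|logger|payload' format, or just wrap the blob."""
    text = record.decode("utf-8", errors="replace")
    parts = text.split("|", 2)
    if len(parts) == 3:
        return Message(type=parts[0], logger=parts[1], payload=parts[2])
    return Message(type="raw", payload=text)


# Router: each consumer (hypothetical names) registers a predicate.
# These are Python stand-ins for the slide-14 matcher expressions.
MATCHERS = {
    "counter_filter": lambda m: m.type == "counter" and m.payload == "1",
    "marketplace_output": lambda m: m.type == "applog" and m.logger == "marketplace",
    "stat_filter": lambda m: m.type == "myapp.metric"
    and re.search(r".*.stat", str(m.fields.get("name", ""))) is not None,
}


def route(msg: Message, matchers: dict) -> list:
    """Return the names of every consumer whose matcher accepts the message."""
    return [name for name, matches in matchers.items() if matches(msg)]


# Input: a raw stream as it might arrive from a socket or file (made-up data).
raw = b"counter|web|1\napplog|marketplace|user login\nnot structured\n"

routed = {}
for record in split_lines(raw):
    msg = decode(record)
    for name in route(msg, MATCHERS):
        routed.setdefault(name, []).append(msg.payload)
```

Keeping the splitter separate from the decoder is what lets the same parsing logic run over any transport, which is the decoupling slide 12 refers to. The unstructured third record is still wrapped in a message, but no matcher claims it.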

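The circular buffer idea from slides 18 and 19 can be illustrated with a small ring of fixed-width time buckets. The sketch below is a simplified Python analogue, not the sandbox's Lua circular buffer library: the class name, method names, and single-column layout are assumptions for illustration only.

```python
class CircularBuffer:
    """Fixed-size ring of time buckets; the oldest bucket is reused as time advances."""

    def __init__(self, rows: int, seconds_per_row: int):
        self.rows = rows
        self.seconds_per_row = seconds_per_row
        self.values = [0.0] * rows
        self.current = 0  # absolute index of the newest bucket seen so far

    def add(self, timestamp: float, value: float) -> bool:
        """Accumulate value into the bucket covering timestamp.

        Returns False if the timestamp is older than the window the buffer holds.
        """
        bucket = int(timestamp // self.seconds_per_row)
        if bucket <= self.current - self.rows:
            return False  # too old: that slot now belongs to a newer bucket
        if bucket > self.current:
            # Time moved forward: zero every slot that is about to be reused.
            start = max(self.current + 1, bucket - self.rows + 1)
            for b in range(start, bucket + 1):
                self.values[b % self.rows] = 0.0
            self.current = bucket
        self.values[bucket % self.rows] += value
        return True

    def series(self) -> list:
        """Return (bucket_start_time, value) pairs, oldest to newest."""
        first = self.current - self.rows + 1
        return [
            (b * self.seconds_per_row, self.values[b % self.rows])
            for b in range(first, self.current + 1)
        ]


# Three 10-second buckets, fed with made-up counter values.
buf = CircularBuffer(rows=3, seconds_per_row=10)
buf.add(100, 1)
buf.add(105, 2)   # same bucket as t=100
buf.add(110, 5)
snapshot_1 = buf.series()   # [(90, 0.0), (100, 3.0), (110, 5.0)]

buf.add(130, 4)             # skips the t=120 bucket, which stays at zero
snapshot_2 = buf.series()   # [(110, 5.0), (120, 0.0), (130, 4.0)]
stale = buf.add(100, 1)     # False: t=100 has already rolled out of the window
```

Because memory use is fixed by `rows` and each bucket maps to a known time span, the output of `series()` is already in the shape a graph needs, which is presumably why a dashboard can plot such buffers automatically.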