Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012
Presentation Transcript

  • Metrics with Riak: A retrospective (Martin Törnwall)
  • Metrics? Many definitions, but here's ours...
  • Recording things that change over time, so we can visualize them and search for patterns.
  • OS: CPU, network, memory, and disk usage, ...
  • Application: Number of requests, errors, events, ...
  • External events: Text messages or emails sent, customer service calls, ...
  • What is a Metric?
    ● A named variable: "sys.mem.free"
    ● With tags: "host=sl075", "code=403", ...
    ● Example query: avg("sys.mem.free") from 1 hour ago where host="sl075"
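The query on this slide can be sketched in a few lines. This is a hypothetical in-memory model for illustration only (the sample data, the `samples` dict, and the `avg()` helper are invented here, not Metyr's API): a metric is a named series of (epoch time, value, tags) samples, and a query averages the values that match a time bound and tag filter.

```python
from statistics import mean

# Illustrative in-memory metric store: name -> list of (epoch, value, tags).
samples = {
    "sys.mem.free": [
        (1351760400, 512, {"host": "sl075"}),
        (1351762200, 498, {"host": "sl075"}),
        (1351762200, 1024, {"host": "sl076"}),
    ],
}

def avg(metric, since, **where):
    """Roughly: avg("sys.mem.free") from `since` where host="sl075"."""
    vals = [v for (t, v, tags) in samples[metric]
            if t >= since and all(tags.get(k) == w for k, w in where.items())]
    return mean(vals)

result = avg("sys.mem.free", since=1351758800, host="sl075")  # mean of 512 and 498
```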
  • Going Technical
  • We have distributed services. Why not have distributed metrics?
  • Reinventing the wheel? Solutions exist, but they rely on technology stacks we had no experience with (e.g., HBase).
  • I mean, really... Just how hard can it be?
  • Introducing Metyr: Our weekend-hack-turned-glorious metrics storage and processing software.
  • Design Decisions
    ● Use familiar tools: Erlang, Riak, HTTP
    ● Not a critical service, but...
    ● ...avoid SPOF
    ● Write performance >> read performance
    ● Centralized reference clock
    ● Integer values only
    ● Avoid 2i if possible
    ● When in doubt, leave it to Riak
  • In Theory... [architecture diagram: Clients → Metyr nodes → Riak cluster]
  • Storing metrics in Riak: No SQL, no schemas, no indices (?), no aggregate operations.
  • Attempt 1: The naïve way just never works...
  • Make each sample an object: A bucket per metric; index by epoch time.
  • The Good™: Atomicity, write-once, fast range queries.
  • The Bad: Slow, large overhead, requires 2i.
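The attempt-1 layout can be summarized with a small sketch. This is hypothetical pseudocode standing in for Riak, not real client calls: a plain dict plays the role of the store, with bucket = metric name, key = epoch time, and value = one sample object. The range scan at the bottom is the operation that, in real Riak, forces the use of secondary indexes (2i).

```python
# Illustrative stand-in for Riak: {(bucket, key): value}.
store = {}

def put_sample(metric, epoch, value, tags):
    # Write-once: every sample becomes its own immutable object.
    store[(metric, epoch)] = {"value": value, "tags": tags}

def range_query(metric, lo, hi):
    # In Riak, this epoch-range scan is what requires 2i.
    return sorted((k, v) for (b, k), v in store.items()
                  if b == metric and lo <= k <= hi)

put_sample("sys.mem.free", 1351760400, 512, {"host": "sl075"})
put_sample("sys.mem.free", 1351764000, 498, {"host": "sl075"})
hits = range_query("sys.mem.free", 1351760000, 1351763000)  # one sample matches
```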
  • Attempt 2: Combine samples into chunks by time.
  • Key Points
    ● One bucket per metric, as before
    ● Split into hour-sized chunks (configurable)
    ● Chunk key: epoch time
    ● Chunk value: list of samples
    ● To read: fetch chunks within the interval
    ● To write: fetch chunk, add sample, write back
  • Chunk Anatomy: [diagram] a chunk is a list of samples; each sample is Time (64 bits), Value (64 bits), and Tags.
  • Writing just got harder: Slower, since we must fetch a chunk first; potential race conditions, ...
  • (Arbitrary) Goal: Write 1K samples/sec. Tests showed that the solution described so far was inadequate.
  • Buffer them writes: Keep per-metric write buffers, flushed every 10 seconds or so.
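A minimal sketch of such a write buffer, assuming the flush callback performs the chunk fetch/append/write described earlier (the `WriteBuffer` class and its names are hypothetical; Metyr's real buffering lives in its Erlang service). Batching means each flush pays the chunk read-modify-write cost once per batch instead of once per sample.

```python
import time

FLUSH_INTERVAL = 10.0  # seconds, "or so"

class WriteBuffer:
    """Per-metric write buffer: accumulate samples, flush periodically."""

    def __init__(self, flush_fn, interval=FLUSH_INTERVAL):
        self.flush_fn = flush_fn   # e.g. appends a batch to the current chunk
        self.interval = interval
        self.pending = []
        self.last_flush = time.monotonic()

    def add(self, sample):
        self.pending.append(sample)
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        # One chunk read-modify-write for the whole batch.
        if self.pending:
            self.flush_fn(self.pending)
            self.pending = []
        self.last_flush = time.monotonic()
```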
  • Some Remaining Issues● Race condition on write● Storage requirements● Downsampling of old data
  • Thank you!