fifth elephant - 2014: Live analytical dashboards at scale
https://funnel.hasgeek.com/fifthel2014/1152-live-analytical-dashboards-at-scale-sql-style

Presentation Transcript

  • Live analytical dashboards at scale - SQL style (Shashwat Agarwal)
  • Live Analytical
  • What we have
      • Services (a lot of them)
      • Events (millions of updates)
      • Information
  • Challenges
      • Metric Definition
      • Scale
      • Reliability
  • Metric Definition
      • Not just a count of events, but a function of:
          • fields from one or more related events/entities
          • on each event or on a batch of events (for statistical analysis)
          • for a set of dimensions
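The metric definition above can be sketched in a few lines of Python: a metric is a function applied to a batch of events, evaluated once per combination of dimension values. This is only an illustration; `MetricDef`, `evaluate`, and the event fields `city` and `amount` are hypothetical names, not taken from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Event = Dict[str, object]


@dataclass(frozen=True)
class MetricDef:
    """Hypothetical metric definition: a function over events, per dimension set."""
    name: str
    dimensions: Tuple[str, ...]                 # event fields to group by
    func: Callable[[Sequence[Event]], float]    # applied to each group of events


def evaluate(metric: MetricDef, batch: List[Event]) -> Dict[Tuple, float]:
    """Group a batch by the metric's dimensions and apply its function per group."""
    groups = defaultdict(list)
    for event in batch:
        key = tuple(event[d] for d in metric.dimensions)
        groups[key].append(event)
    return {key: metric.func(events) for key, events in groups.items()}


# Example metric (assumed for illustration): average order amount per city.
avg_order_value = MetricDef(
    name="avg_order_value",
    dimensions=("city",),
    func=lambda events: sum(e["amount"] for e in events) / len(events),
)
```

Calling `evaluate(avg_order_value, batch)` returns one value per distinct `city` in the batch, which is the "function of fields, on a batch, for a set of dimensions" shape described on the slide.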
  • Scale Challenges
      • Dimensional lookup
      • High throughput (write)
      • Low latency (query)
      • Multidimensional store
  • Reliability Challenges
      • Accuracy
      • Consistency
      • Fault tolerance
  • Solution? Real time + Scale == Stream Processing (Kafka, Storm)
  • Storage
      • Multidimensional support
      • Optimized for time series queries
      • Low query response times
      • High write throughput
      • Scalable
      • TSD* (* OpenTSDB does not support Kerberos)
  • Metric Definition
      • Not scalable to write Storm topologies for each metric
      • Requires a DSL for non-tech folks
      • Introducing... Esper
  • Storm Topology - 1
      • Kafka Spouts → Enricher Bolts → Kafka Bolts
      • Enricher Bolts: Dim Lookup against the Dim Store
      • Event: { id: a123-234, time: 1234, entityId: OD12 … }
      • Enriched Event: { id: a123-234, time: 1234, entityId: OD12 … }
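The enrichment step in this topology can be illustrated with a minimal Python sketch: look up the event's entity in a dimension store and attach the dimensions to a copy of the event. The in-memory `DIM_STORE`, the `enrich` function, the `dims` field, and the dimension values are all assumptions for illustration; the talk does not describe the store's API or the enriched event's layout.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the dimension store, keyed by entityId.
DIM_STORE: Dict[str, Dict[str, str]] = {
    "OD12": {"city": "BLR", "category": "books"},
}


def enrich(event: dict, dim_store: Dict[str, Dict[str, str]]) -> Optional[dict]:
    """Return a copy of the event with its dimensions attached.

    Returns None when the entity is unknown, so the caller can decide
    whether to drop, retry, or sideline the event.
    """
    dims = dim_store.get(event["entityId"])
    if dims is None:
        return None
    enriched = dict(event)          # leave the input event untouched
    enriched["dims"] = dict(dims)
    return enriched
```

With the event from the slide, `enrich({"id": "a123-234", "time": 1234, "entityId": "OD12"}, DIM_STORE)` yields the same event plus a `dims` mapping.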
  • Storm Topology - 3'
      • Kafka Spouts → Esper Bolts → TSD Bolts → TSD
      • Enriched Event: { id: a123-234, time: 1234, entityId: OD12 … }
      • ( metric name, [dim name-value-pairs]*, value, ts )
  • Time Batching
      • Event time
      • Enables
          • calculating statistics
          • windowed joins
          • handling out-of-order events
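A minimal sketch of batching by event time, assuming fixed-width batches whose id is the batch start time. The 60-second width and the names `batch_id` and `assign_batches` are assumptions; the talk does not state the batch size or how the id is computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

BATCH_SECONDS = 60  # assumed batch width, not from the talk


def batch_id(event_time: int, batch_seconds: int = BATCH_SECONDS) -> int:
    """Map an event timestamp to the start of its fixed-width batch."""
    return event_time - (event_time % batch_seconds)


def assign_batches(events: Iterable[dict]) -> Dict[int, List[dict]]:
    """Group events by event time, regardless of the order they arrive in."""
    batches = defaultdict(list)
    for event in events:
        batches[batch_id(event["time"])].append(event)
    return dict(batches)
```

Because the batch is derived from the event's own timestamp rather than its arrival time, a late event still lands in the batch it belongs to, which is what makes per-batch statistics and windowed joins tolerant of out-of-order delivery.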
  • Reliability
      • Faults
      • Upgrades
      • Metric definition changes
      • Last good checkpoint
      • Reset checkpoint
      • Replay
      • Transactional Storm
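The checkpoint, reset, and replay ideas on this slide can be sketched in plain Python. This is not the Transactional Storm API; `CheckpointedConsumer` and its methods are hypothetical, and the list stands in for a replayable source such as a Kafka topic.

```python
from typing import Callable, List


class CheckpointedConsumer:
    """Hypothetical consumer that advances a checkpoint only after a batch succeeds."""

    def __init__(self, log: List[dict]) -> None:
        self.log = log          # stand-in for a replayable event log
        self.checkpoint = 0     # offset of the next unprocessed event

    def run(self, process_batch: Callable[[List[dict]], None], batch_size: int = 2) -> None:
        """Process from the last good checkpoint until the end of the log."""
        while self.checkpoint < len(self.log):
            batch = self.log[self.checkpoint:self.checkpoint + batch_size]
            process_batch(batch)            # if this raises, the checkpoint is unchanged
            self.checkpoint += len(batch)   # commit only after success

    def reset(self, offset: int = 0) -> None:
        """Move the checkpoint back, e.g. to recompute after a metric definition change."""
        self.checkpoint = offset
```

After a fault, calling `run` again resumes from the last good checkpoint. A batch that failed partway may be processed again, so in this sketch the downstream writes would need to be idempotent.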
  • Storm Topology - 2
      • Kafka Spouts → Time Batch Bolt → HBase Bolt
      • Enriched Event: { id: a123-234, time: 1234, entityId: OD12 … }
  • HBase Time Batch Schema: Table 1 - Event Queue
      • Key: <event_ns>_slot_<batchId>
          • batchId is constructed from the event timestamp
      • Value: each column holds one event as JSON
  • HBase Time Batch Schema: Table 2 - Event Queue Update Log
      • Key: <event_ns>_log_<batchId>_<version>
          • batchId is constructed from the event timestamp
          • version is the timestamp at which the batch was updated
      • Value: Version
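The two row-key layouts can be written as small helper functions. This sketch only formats the keys; `slot_key` and `log_key` are hypothetical names, and how `batchId` is derived from the event timestamp is left to the caller because the slides do not specify it.

```python
def slot_key(event_ns: str, batch_id: int) -> bytes:
    """Row key for Table 1 (Event Queue): <event_ns>_slot_<batchId>."""
    return f"{event_ns}_slot_{batch_id}".encode("utf-8")


def log_key(event_ns: str, batch_id: int, version: int) -> bytes:
    """Row key for Table 2 (Update Log): <event_ns>_log_<batchId>_<version>.

    `version` is the timestamp at which the batch was last updated.
    """
    return f"{event_ns}_log_{batch_id}_{version}".encode("utf-8")
```

For example, `slot_key("orders", 1200)` gives `b"orders_slot_1200"`, where `orders` is a made-up namespace. HBase orders row keys lexicographically by bytes, so whether the numeric parts need fixed-width padding depends on how the tables are scanned; the slides do not say which encoding was used.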
  • Storm Topology - 3
      • Time Batch Spout → Esper Bolts → TSD Bolts → TSD
      • ( metric name, [dim name-value-pairs]*, value, ts )
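The tuple handed to the TSD bolts maps naturally onto a time series data point. The sketch below assumes the TSD is OpenTSDB, as the footnote on the Storage slide suggests, and formats the tuple as a line for OpenTSDB's telnet-style `put` command. `DataPoint` and `to_put_line` are hypothetical names, and the talk does not say which write interface was actually used.

```python
from typing import Dict, NamedTuple


class DataPoint(NamedTuple):
    """( metric name, [dim name-value-pairs]*, value, ts ) from the slide."""
    metric: str
    dims: Dict[str, str]
    value: float
    ts: int


def to_put_line(point: DataPoint) -> str:
    """Format a data point as: put <metric> <timestamp> <value> <tagk=tagv> ..."""
    # Sort tags so the same point always produces the same line.
    tags = " ".join(f"{k}={v}" for k, v in sorted(point.dims.items()))
    return f"put {point.metric} {point.ts} {point.value} {tags}"
```

Here each dimension becomes a tag, so a query can filter or group by any dimension.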
  • Learnings
      • Replayability
      • Event and entity schema
      • Checkpointing
      • Bootstrapping
      • Sidelining
      • Fault tolerance
  • Questions? sb.lk/hasgeek