fifth elephant - 2014: Live analytical dashboards at scale

https://funnel.hasgeek.com/fifthel2014/1152-live-analytical-dashboards-at-scale-sql-style

Published in: Data & Analytics

Transcript

  • 1. Live analytical dashboards at scale - SQL style Shashwat Agarwal
  • 2. Live Analytical
  • 3. Live Analytical
  • 4. What we have Services (a lot of them) Events (millions of updates) Information
  • 5. Challenges • Metric Definition • Scale • Reliability
  • 6. Metric Definition • Not just count of events; but • func of • fields from one or more related events/entities • on each event or a batch of events (for statistical analysis) • for a set of dimensions
  • 7. Scale Challenges • Dimensional Lookup • High throughput (write), • Low Latency (query) • MultiDimensional Store
  • 8. Reliability Challenges • Accuracy • Consistency • Fault tolerance
  • 9. Solution? Real time + Scale == Stream Processing Kafka Storm
  • 10. Storage • MultiDimensional support • Optimized for time-series queries • Low query response times • High write throughput • Scalable TSD* * OpenTSDB does not support Kerberos
  • 11. Metric Definition • Not scalable to write Storm topologies for each metric • Requires a DSL for non-tech folks Introducing... Esper
  • 12. Storm Topology - 1 Dim Lookup Dim Lookup Kafka Spouts Enricher Bolts Kafka Bolts { id: a123-234, time: 1234, entityId: OD12 … } Event { id: a123-234, time: 1234, entityId: OD12 … } Enriched Event Dim Store
  • 13. Storm Topology - 3’ TSD Kafka Spouts Esper Bolts TSD Bolts { id: a123-234, time: 1234, entityId: OD12 … } Enriched Event ( metric name, [dim name-value-pairs]*, value, ts )
  • 14. Time Batching • Event time • Enables • calculate statistics • windowed join • out of order events
  • 15. Reliability Faults Upgrades Metrics Def changes Last good Checkpoint Reset Checkpoint Replay Transactional Storm
  • 16. Storm Topology - 2 Kafka Spouts Time Batch Bolt HBase Bolt { id: a123-234, time: 1234, entityId: OD12 … } Enriched Event
  • 17. HBase Time Batch Schema Table 1 - Event Queue • Key <event_ns>_slot_<batchId> batchId is constructed from event timestamp • Value (each column - Event JSON)
  • 18. HBase Time Batch Schema Table 2 - Event Queue Update Log • Key <event_ns>_log_<batchId>_<version> batchId is constructed from event timestamp version is timestamp at which batch was updated • Value Version
  • 19. Storm Topology - 3 TSD Time Batch Spout Esper Bolts TSD Bolts ( metric name, [dim name-value-pairs]*, value, ts )
  • 20. Learnings • Replayability • Event and Entity Schema • Checkpointing • Bootstrapping • Sidelining • Fault Tolerance
  • 21. Questions ?? sb.lk/hasgeek
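
The row-key layout on slides 17 and 18 can be sketched in a few lines. This is only an illustration under stated assumptions: the slides say that batchId is constructed from the event timestamp but not how, so the fixed 60-second bucket (`BATCH_SECONDS`) and the integer division are guesses, and the helper names `batch_id`, `slot_key`, and `log_key` are hypothetical.

```python
# Sketch of the two HBase row keys from slides 17-18.
# ASSUMPTION: batchId = event timestamp // fixed bucket width. The slides only
# say "batchId is constructed from event timestamp"; the width is hypothetical.
BATCH_SECONDS = 60


def batch_id(event_ts: int, batch_seconds: int = BATCH_SECONDS) -> int:
    """Map an event timestamp (seconds) to the batch it belongs to."""
    return event_ts // batch_seconds


def slot_key(event_ns: str, event_ts: int) -> str:
    """Table 1 (Event Queue) key: <event_ns>_slot_<batchId>."""
    return f"{event_ns}_slot_{batch_id(event_ts)}"


def log_key(event_ns: str, event_ts: int, version: int) -> str:
    """Table 2 (Update Log) key: <event_ns>_log_<batchId>_<version>.

    `version` is the timestamp at which the batch was updated.
    """
    return f"{event_ns}_log_{batch_id(event_ts)}_{version}"


if __name__ == "__main__":
    # An event with time=1234 in a hypothetical "orders" namespace.
    print(slot_key("orders", 1234))
    print(log_key("orders", 1234, 1700))
```

Because the key depends only on the event's own timestamp, a late event is written to the same slot row as its on-time neighbours, and the update log row records that the batch changed.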

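The time batching on slide 14 groups events by event time rather than arrival time, which is what lets statistics be computed per batch even when events arrive out of order. The sketch below is a plain in-memory illustration, not the Storm or Esper implementation from the talk; the 60-second batch width, the `value` field, and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# ASSUMPTION: fixed 60-second batches keyed by event time; the slides do not
# state a batch width.
BATCH_SECONDS = 60


def batch_events(events, batch_seconds=BATCH_SECONDS):
    """Group events by the batch their own `time` field falls into.

    Arrival order does not matter: a late event joins the batch of its
    event time, not the batch that is current when it arrives.
    """
    batches = defaultdict(list)
    for event in events:
        batches[event["time"] // batch_seconds].append(event)
    return dict(batches)


def batch_mean(batches, field):
    """Compute the mean of `field` for every batch, keyed by batch id."""
    return {
        bid: sum(e[field] for e in evs) / len(evs)
        for bid, evs in sorted(batches.items())
    }


if __name__ == "__main__":
    # `value` is a hypothetical numeric field; the last event arrives late.
    arrivals = [
        {"id": "a1", "time": 1234, "value": 10},
        {"id": "a2", "time": 1250, "value": 20},
        {"id": "a3", "time": 1300, "value": 5},
        {"id": "a4", "time": 1201, "value": 30},
    ]
    print(batch_mean(batch_events(arrivals), "value"))
```

A real pipeline would also need to decide when a batch is closed and how long to wait for stragglers, which this sketch leaves out.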