Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Interactive Visualization of Streaming Data Powered by Spark by Ruhollah Farchtchi

1,881 views

Published on

Spark Summit East Talk

Published in: Data & Analytics
  • Be the first to comment

Interactive Visualization of Streaming Data Powered by Spark by Ruhollah Farchtchi

  1. 1. INTERACTIVE VISUALIZATION OF STREAMING DATA POWERED BY SPARK
  2. 2. Streaming @Zoomdata Visualizations react to new data delivered Users start, stop, pause the stream Users select a rolling window or pin a start time to capture cumulative metrics
  3. 3. Drivers for Streaming Data Data Freshness Time to Analytic Business Context
  4. 4. Challenges • Time • Frequency • Retention • Synchronization • Order • Updates
  5. 5. Addressing streaming @Zoomdata Historical Revised Receive Data JMS Kafka Manipulate Stream Single JVM in Memory Spark Streaming Hold Data in Buffer MongoDB Pluggable Interact with Data Custom Code Pluggable
  6. 6. Technology Cast • The Stream - Kafka, Kinesis, JMS • Processing Fabric - Spark Streaming • Landing Area - MemSQL, Solr, Kudu, Others
  7. 7. How it looks
  8. 8. With the rest of the app
  9. 9. Scale Out
  10. 10. Demo • Twitter Producer • Spark Streaming • MemSQL & Solr Sinks
  11. 11. Benefits • Contextual Expressiveness with Streaming Data • Independent scalability (scale-up, scale- around) • Expressiveness powered by Spark -- using Windowing (dataframe API with stream) • DR COOP, other Data management concerns
  12. 12. Future Work • Cross stream synchronization & fusion • On-demand scale out and resource management via Mesos • Schema evolution • More extensible landing strategies
  13. 13. Thanks

×