Spark meetup - Zoomdata Streaming


Zoomdata explains how it uses Spark for streaming interactive visualizations on big data.

Published in: Technology

  1. Interactive Visualization of Data powered by Spark
  2. Streaming Data @ Zoomdata
     ● Visualizations react to new data as it is delivered
     ● Users start, stop, and pause the stream
     ● Users select a rolling window or pin a start time to capture cumulative metrics
  3. Drivers for Streaming Data
     ● Data Freshness
     ● Time to Analytic
     ● Business Context
  4. Challenges
     ● Time
     ● Frequency
     ● Retention
     ● Synchronization
     ● Order
     ● Updates
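  5. Addressing the Problem @ Zoomdata (Historical vs. Revised)
     ● Receive Data: JMS (historical), Kafka (revised)
     ● Manipulate Stream: Single JVM in Memory (historical), Spark Streaming (revised)
     ● Hold Data in Buffer: MongoDB (historical), Pluggable (revised)
     ● Interact with Data: Custom Code (historical), Pluggable (revised)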
  6. Technology Cast
     ● The Stream: Kafka, Kinesis, JMS
     ● Processing Fabric: Spark Streaming
     ● Landing Area: MemSQL, Solr, Kudu, Others
  7. How it looks
  8. With the rest of the app
  9. Scale Out
  10. Benefits
     ● Contextual Expressiveness with Streaming Data
     ● Independent scalability (scale-up, scale-around)
     ● Expressiveness powered by Spark, using Windowing (DataFrame API with stream)
  11. Side Benefits
     ● Separation of concerns
     ● Disaster Recovery, COOP, other Data management concerns
     ● Restatements
     ● Options!
  12. Demo
     ● Twitter Producer
     ● Spark Streaming
     ● MemSQL & Solr Sinks
  13. Future Work
     ● Cross Stream Synchronization & Fusion
     ● On-demand scale out and resource management via Mesos
     ● Schema Evolution
     ● Storage Tiering
  14. Thanks. For more information contact:
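
Slide 2 mentions two ways a user can scope a streaming metric: a rolling window, or a pinned start time that accumulates from that point on. The sketch below illustrates the difference in plain Python. It is not Zoomdata's implementation and does not use Spark. The class names `RollingWindowSum` and `PinnedStartSum` are hypothetical, timestamps are plain numbers in seconds, and events are assumed to arrive in timestamp order (slide 4 lists order as one of the real challenges).

```python
from collections import deque


class RollingWindowSum:
    """Hypothetical helper: sum of values in the window (now - width, now]."""

    def __init__(self, width):
        self.width = width
        self.events = deque()  # (timestamp, value) pairs still inside the window
        self.total = 0

    def add(self, ts, value):
        # Assumes events arrive in non-decreasing timestamp order.
        self.events.append((ts, value))
        self.total += value
        # Evict everything that has fallen out of the window.
        while self.events and self.events[0][0] <= ts - self.width:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total


class PinnedStartSum:
    """Hypothetical helper: cumulative sum of values at or after a pinned start."""

    def __init__(self, start):
        self.start = start
        self.total = 0

    def add(self, ts, value):
        # Events before the pinned start time are ignored; nothing is evicted.
        if ts >= self.start:
            self.total += value
        return self.total


# Illustrative events as (timestamp in seconds, value).
events = [(0, 1), (5, 2), (10, 3), (15, 4)]

rolling = RollingWindowSum(width=10)
pinned = PinnedStartSum(start=5)

rolling_totals = [rolling.add(ts, v) for ts, v in events]
pinned_totals = [pinned.add(ts, v) for ts, v in events]
```

The rolling total rises and falls as old events leave the window, while the pinned total only grows once the start time has passed. In a Spark-based pipeline the rolling case would more likely be expressed with the framework's own windowing operators than with hand-written eviction.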