Dynamic Cloud scaling for real-time Big Data applications
 

SQLstream has been deployed to detect significant events in data collected from a large grid of seismic sensors. A large-scale data infrastructure provides raw signal data over an AMQP message bus. SQLstream monitors live seismic data feeds in real-time, applying heuristic algorithms that look for patterns indicating earthquakes. The live system scales dynamically across multiple servers in a cloud environment based on the current demand.


    Presentation Transcript

    • A Dynamically Scalable Cloud Platform for the Real-Time Detection of Seismic Events. Marc Berkowitz, SQLstream Inc. May 2012. Copyright © 2012 – Proprietary and Confidential Information of SQLstream Inc.
    • Contents
      » The Context
      » The Problem
      » The Platform: SQLstream
      » The Solution
      » Conclusions
    • The Context
      » OOI (Ocean Observatories Initiative)
      » Large, long-term program to support collaborative research in earth & ocean science
      » Ocean Research Interactive Observatory Networks
      » OOI/CI (Cyber-Infrastructure)
      » Private cloud infrastructure for scientific computing
      » Sensors, data streams, networks
    • OOI/CI as a platform for scientific computing
      » Basic tools for application programming
      » Usual open-source components: git, doxygen
      » AMQP message bus
      » Python scripts
      » Next level tools?
      » Common data model
      » Higher level language
      » Build an application component with SQLstream, a platform for processing live streaming data feeds.
    • Array Network Facility Deployment
    • Input Signal Data (blue) and Detected Events (red)
    • ANF Event Detection [Figure: filtered data along a time axis with STA and LTA windows. Labels: ratio1, thresh, threshoff, ratio2, onset (pick), detection on, detection off. Annotations: "look for onset time here", "compute ratio2 noise floor here".]
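
The STA, LTA, and ratio labels in the figure suggest a short-term-average over long-term-average trigger. The following is a minimal Python sketch of that ratio, not the ANF or SQLstream implementation. It assumes that both averages are trailing means of the absolute amplitude over a fixed number of samples; the function name `sta_lta_ratio` and the window lengths are illustrative choices that do not appear in the slides.

```python
from collections import deque


def sta_lta_ratio(samples, sta_len=3, lta_len=10):
    """Yield (index, ratio) once the long window is full.

    Assumption: STA and LTA are trailing means of |x| over the last
    sta_len and lta_len samples. The slides do not give the exact formula.
    """
    sta_win = deque(maxlen=sta_len)
    lta_win = deque(maxlen=lta_len)
    sta_sum = 0.0
    lta_sum = 0.0
    for i, x in enumerate(samples):
        a = abs(x)
        # Subtract the sample that the bounded deque is about to evict.
        if len(sta_win) == sta_len:
            sta_sum -= sta_win[0]
        if len(lta_win) == lta_len:
            lta_sum -= lta_win[0]
        sta_win.append(a)
        lta_win.append(a)
        sta_sum += a
        lta_sum += a
        # Skip rows where the long-term average would be zero.
        if len(lta_win) == lta_len and lta_sum > 0:
            yield i, (sta_sum / sta_len) / (lta_sum / lta_len)


# Synthetic signal: quiet background with a short burst in the middle.
signal = [1.0] * 20 + [8.0] * 5 + [1.0] * 20
ratios = dict(sta_lta_ratio(signal))
```

On the quiet parts of this synthetic signal the ratio stays at 1, and it peaks shortly after the burst begins, which is where a threshold test would switch a detection on.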
    • SQLstream
      » Real-time monitoring & analysis of live data streams
      » Continuous queries, written in streaming SQL
      » Scalable, distributed platform
    • Architecture of SQLstream
      » Query engine
      » Metadata catalog
      » Data kernel
      » Data transport layer, JDBC clients
      » Extensible
      » UDRs, UDXs, SQL/MED foreign servers
    • Relational Data Streams
      » A relational stream =
      » a sequence of time-stamped, typed rows
      » extends into the future
      » Evaluate SQL queries with stream results
      » apply relational algebra, stream or table context
      » continuous calculation of incremental results
      » Operators include
      » scalar functions; project, filter
      » aggregates on sliding windows
    • Data Streams
      » A relational stream =
      » a sequence of time-stamped, typed rows
      » whereas
      » a table = a set of typed rows
      » a cursor = a sequence of typed rows
      » but a table is static and known
      » while a stream is dynamic, and has an unknown future part.
    • Streams as Dynamic Relations
      » A relational algebra treats streams & relations uniformly
      » a stream S ≈ its cumulant, a relation R(t) = ∫ S(t) dt
      » usual transformations
      » In practice, compute stream expressions incrementally
      » row-by-row
      » sliding finite time windows (or row-count windows)
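
To illustrate the incremental, row-by-row evaluation described on this slide, here is a small Python sketch. It assumes a stream is an iterable of `(t, x)` rows in time order; the name `running_sum` and the sample rows are made up for the example. The aggregate is updated once per arriving row, and the result is compared with re-evaluating the same aggregate over the cumulant (all rows up to time t).

```python
def running_sum(stream):
    """Emit (t, sum of x over all rows seen so far), one output row per input row."""
    total = 0
    for t, x in stream:
        total += x
        yield t, total


# Hypothetical rows (t, x); a real stream would have no known end.
rows = [(0, 2), (1, 5), (2, -1), (3, 4)]

# Incremental evaluation: constant work per arriving row.
incremental = list(running_sum(iter(rows)))

# Reference evaluation: recompute the aggregate over the cumulant at each t.
recomputed = [(t, sum(x for (u, x) in rows if u <= t)) for t, _ in rows]
```

Both evaluations give the same rows; the incremental form simply never needs to revisit earlier input.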
    • Declarative Programming
      » Query language is standard SQL
      » Stream and table context
      » affects meaning of some SQL constructs
      » Add keyword STREAM
      » For example, standard OVER(...) defines a sliding window
      » windowed aggregate functions: sum(x) over (range 1 second preceding)
      » windowed join
      » Partitioned operations (natural parallelism)
      » sum(x) over (range 1 second preceding partition by chan)
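
As a rough illustration of what the partitioned window expression above computes, the Python sketch below emulates a one-second sliding sum per channel. It is not SQLstream code. It assumes rows arrive as `(t, chan, x)` tuples in time order and that the window is inclusive, covering timestamps from `t - width` to `t`; the function name `windowed_sum` and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def windowed_sum(rows, width=1.0):
    """For each row (t, chan, x), yield (t, chan, s).

    s is the sum of x over rows of the same chan whose timestamp lies in
    [t - width, t]. Assumption: rows arrive in non-decreasing time order.
    """
    windows = defaultdict(deque)  # chan -> rows currently inside the window
    sums = defaultdict(float)     # chan -> running sum over that window
    for t, chan, x in rows:
        win = windows[chan]
        win.append((t, x))
        sums[chan] += x
        # Evict rows that have slid out of this channel's window.
        while win[0][0] < t - width:
            _, old_x = win.popleft()
            sums[chan] -= old_x
        yield t, chan, sums[chan]


rows = [
    (0.0, "A", 1.0),
    (0.25, "B", 10.0),
    (0.5, "A", 2.0),
    (1.0, "A", 4.0),
    (1.25, "B", 20.0),
    (1.75, "A", 8.0),
]
result = list(windowed_sum(rows))
```

Because each channel keeps its own window state, channels could be handled by separate workers, which is the "natural parallelism" the slide refers to.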
    • Continuous Execution
      » Each SQL query → a query plan → a dataflow graph
      » Queries connect at “nexuses” which multiplex data
      [Figure: dataflow graph with nodes labeled net, calc, and agg, and streams S1 and S2]
    • Continuous Execution
      » Dynamically add and remove query graphs
      » Data flows continuously, queries can last forever
      » continuous results
      » multithreaded execution
      » avoid disk writes
    • A Scientific Application of SQLstream
      » Goal: detect significant events in many channels of seismographic signals
      » Requires:
      » SQLstream
      » coding an existing heuristic in SQL
      » integrating into an existing infrastructure (OOI/CI)
      » get & decode signal data from AMQP bus
      » signal processing / event detection
      » publish results on bus
    • ANF Event Detection [Figure: filtered data along a time axis with STA and LTA windows. Labels: ratio1, thresh, threshoff, ratio2, onset (pick), detection on, detection off. Annotations: "look for onset time here", "compute ratio2 noise floor here".]
    • Application Design
      » Signals as Streams:
      » time series x(t) → stream of rows (t, x(t))
      » tag each row with channel ID
      » merge channels into one stream
      » use partitioned operations to process channels in parallel
      » wide stream: easy to distribute
      » Calculated quantities → additional columns downstream
      » Define locally as nested views
      » Expose globally as named streams
      » Project and aggregate final results
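
The "signals as streams" steps above can be sketched in Python as follows. This is an illustration only: it assumes each channel delivers `(t, x)` samples already sorted by time, and the helper names `tag` and `merge_channels` and the channel IDs are hypothetical.

```python
import heapq


def tag(channel_id, series):
    """Turn a time series of (t, x) samples into rows (t, channel_id, x)."""
    for t, x in series:
        yield (t, channel_id, x)


def merge_channels(channels):
    """Merge per-channel series (each sorted by t) into one time-ordered stream."""
    return heapq.merge(*(tag(cid, s) for cid, s in channels.items()))


# Hypothetical channel IDs and samples.
channels = {
    "BHZ": [(0, 0.1), (2, 0.3)],
    "BHN": [(1, -0.2), (2, 0.0)],
}
merged = list(merge_channels(channels))
```

The merged rows stay in time order and carry their channel ID, so a downstream step can partition by channel without knowing how many channels exist.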
    • Elements of Computation for Detecting Events
      » Scalar calculations (ratios)
      » Fixed-size time windows for sliding average
      » Find threshold-crossing events
      » Matching begin & end events
      » Collecting and reporting
      » Publishing to the message bus
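
Two of these elements, finding threshold crossings and matching begin and end events, can be sketched in Python as below. This is a simplification under stated assumptions: a single ratio is tested against both thresholds, whereas the slides label two ratios (ratio1 and ratio2); a detection turns on when the ratio rises above `thresh` and off when it later falls below `threshoff`. The function name `detect_events` and the numeric values are illustrative.

```python
def detect_events(ratios, thresh=3.0, threshoff=1.5):
    """Yield (t_on, t_off) pairs from an iterable of (t, ratio) rows.

    Assumption: one ratio drives both the "on" and the "off" test.
    An event still open when the input ends is not reported.
    """
    t_on = None
    for t, r in ratios:
        if t_on is None and r > thresh:
            t_on = t              # begin event: upward threshold crossing
        elif t_on is not None and r < threshoff:
            yield t_on, t         # end event matched to its begin event
            t_on = None


rows = [(0, 1.0), (1, 3.5), (2, 2.0), (3, 1.2), (4, 1.0), (5, 4.0), (6, 1.0)]
events = list(detect_events(rows))
```

Using a lower "off" threshold than the "on" threshold keeps a detection open while the ratio hovers between the two values, so a single event is not split into several.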
    • Data Flow Overview [Figure]
    • Data Flow (1)
    • Data Flow (2)
    • Conclusion
      » SQLstream is scalable and extensible
      » SQLstream encourages downloads and experiments
      » Research licenses available