spChains: A Declarative Framework for Data Stream Processing in Pervasive Applications

Presentation given at the 3rd International Conference on Ambient Systems, Networks and Technologies
August 27-29, 2012, Niagara Falls, Ontario, Canada.
The paper is available on the PORTO open access repository: http://porto.polito.it/2496720/

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

    Presentation Transcript

    • spChains: A Declarative Framework for Data Stream Processing in Pervasive Applications
      Dario Bonino, Fulvio Corno
      Politecnico di Torino, Dip. Automatica e Informatica, Torino, Italy
      http://elite.polito.it
      The 3rd International Conference on Ambient Systems, Networks and Technologies, August 27-29, 2012, Niagara Falls, Ontario, Canada
    • Goals Enable real-time ambient & sensor data processing Allow AmI designers to easily specify required computations Provide an extensible open source processing library 2 ANT’2012, Niagara Falls, Canada spChains
    • Outline Motivation and Background Stream processing spChains Framework Use cases Conclusions 3 ANT’2012, Niagara Falls, Canada spChains
    • Motivation Ambient Intelligence Systems  100’s or 1,000’s of sensors  Different physical quantities (ºC, %H2O, kW, kWh, …)  Sampling frequencies from seconds to minutes Huge stream of data being generated  Storage and retrieval  On-line processing  Off-line processing  Analytics 4 ANT’2012, Niagara Falls, Canada spChains
    • On-line processing: Applications
      • Data Decimation (from kHz to mHz)
        - Aggregation (over time, over space, over sensor types)
        - Averaging
      • Feeding User Displays and Dashboards
        - Computing up-to-date and user-meaningful information
      • Monitoring and Alerting
        - Checking Thresholds
        - Generating Alert messages
      • Virtual Sensors
        - Computing derivative quantities
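The decimation-by-averaging idea on this slide can be sketched in a few lines of Python. This is only an illustration of fixed, non-overlapping time windows; the function name `batch_average`, the `(timestamp, value)` tuple layout, and the sample data are assumptions made for the example and are not part of spChains.

```python
from collections import defaultdict

def batch_average(samples, window_s):
    """Average (timestamp_s, value) samples over consecutive fixed windows.

    Returns a list of (window_start_s, mean) pairs, one per non-empty window.
    Hypothetical helper, not an spChains function.
    """
    sums = defaultdict(lambda: [0.0, 0])  # window start -> [running sum, count]
    for t, v in samples:
        start = (t // window_s) * window_s  # align the sample to its window
        acc = sums[start]
        acc[0] += v
        acc[1] += 1
    return [(start, total / n) for start, (total, n) in sorted(sums.items())]

# Five raw samples collapse into two one-minute averages.
samples = [(0, 1.0), (10, 3.0), (59, 2.0), (60, 4.0), (119, 6.0)]
decimated = batch_average(samples, 60)
print(decimated)
```

A real stream processor would emit each average when its window closes instead of collecting all samples first; the batch form is used here only to keep the sketch short.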
    • Requirements Input: up to 10,000-100,000 events/second Data: real-valued quantities, explicit units of measure Output: real-valued or Boolean, often at much lower frequency Computation: custom-defined depending on the application requirements Operators: reusable standard temporal operations applicable to data streams Usability: should not require database expert to define computations, domain experts must be autonomous 6 ANT’2012, Niagara Falls, Canada spChains
    • Technology scouting
      • Standard Relational DBMS
        - Good for storage
        - Not efficient for computations
        - Rely on central servers
      • Custom programming
        - Perfect fit with application requirements
        - Very expensive to customize
      • NoSQL approaches
        - Great for storage
        - May do computations, require custom programming and expertise
        - Rely on central (or cloud) servers
      • Stream Processing
        - No storage
        - Excellent for computations
        - Requires custom expertise
    • Stream Processing (or Complex Event Processing, CEP)
      • Event processing: tracking and analyzing streams of data «events», and deriving a conclusion from them
      • Defines a set of (fixed) queries
      • Event streams are analyzed in real time (often with in-memory processing) according to the programmed queries
      • Guarantees fast and scalable processing
      • Increasingly adopted in different domains: Business Process Management, Recommender Systems, Financial Services, Time Series, …
      • Several tools available (commercial and open source)
      • Specific skills needed to write efficient queries, in tool-dependent languages
    • Stream Processing (or Complex Event Processing, CEP)
      Same slide as the previous one, with two example queries overlaid:

        insert into RealEvent(src, streamName, value, unitOfMeasure)
        select "Average", "Average-out", avg(value) as value, unitOfMeasure
        from realEvent(streamName="M1").win:time_batch("1h")
        group by src, streamName, unitOfMeasure;

        insert into BooleanEvent(src, streamName, booleanValue)
        select "Threshold", "Threshold-out" as streamName, true as value
        from pattern [every (
            oldSample=RealEvent(streamName="Average-out",
                MeasureEventComparator.compareToMeasure(oldSample, "1kW",
                    EventComparisonEnum.LESS_THAN_OR_EQUAL))
            -> newSample=RealEvent(streamName=oldSample.streamName,
                MeasureEventComparator.compareToMeasure(newSample, "1kW",
                    EventComparisonEnum.GREATER_THAN)))].win:length(2);
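The second query on this slide matches an old sample at or below 1 kW followed by a new sample above 1 kW, that is, a rising threshold crossing. A minimal Python sketch of that logic follows, assuming plain numeric values already expressed in kW; `rising_crossings` and the sample list are hypothetical and only mirror the pattern, not the Esper implementation.

```python
def rising_crossings(values, threshold):
    """Yield the index of each sample that exceeds `threshold`
    when the sample before it was less than or equal to it."""
    previous = None
    for i, current in enumerate(values):
        if previous is not None and previous <= threshold < current:
            yield i
        previous = current

# Hourly averages in kW (made-up data); an alert fires at each upward crossing of 1 kW.
hourly_kw = [0.4, 0.9, 1.2, 1.5, 0.8, 1.0, 1.1]
alerts = list(rising_crossings(hourly_kw, 1.0))
print(alerts)
```

Samples that stay above the threshold do not fire again, which is why the pattern compares two consecutive events rather than testing each event on its own.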
    • Proposed approach (1)
      • Stream Processing for event data processing in real time
      • (Extensible) Library of predefined operators (spBlocks)
      • Declarative framework (spChains) to express the required computations
        - Each Computation = Stream Processing Chain
        - Chain = Sequence of Stream Processing Blocks
        - Block = predefined operator, configured with parameters
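The chain/block vocabulary above maps naturally onto function composition. The sketch below is one way to picture it in Python with generators; `block`, `chain`, `average`, and `threshold` are names invented for this example, and the two operators are simplified stand-ins (a count-based average and a per-sample Boolean), not the spBlock implementations.

```python
from functools import reduce
from itertools import islice

def average(stream, size):
    """Emit the mean of each consecutive full batch of `size` samples."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if len(batch) < size:
            return  # drop a trailing incomplete batch
        yield sum(batch) / size

def threshold(stream, limit):
    """Emit True for each sample above `limit`, False otherwise."""
    for v in stream:
        yield v > limit

def block(operator, **params):
    """Block = predefined operator, configured with parameters."""
    return lambda stream: operator(stream, **params)

def chain(*blocks):
    """Chain = sequence of blocks; each block consumes the previous output."""
    return lambda stream: reduce(lambda s, b: b(s), blocks, stream)

over_one_kw = chain(block(average, size=2), block(threshold, limit=1.0))
result = list(over_one_kw([0.5, 0.7, 1.2, 1.4, 0.9, 1.3]))
print(result)
```

Because each block only sees an iterator, new operators can be added without touching `chain`, which is the extensibility property the slide asks for.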
    • Proposed approach (2)
      • The set of spChains is described as a simple XML file
      • All chains are automatically mapped to Stream Processing queries

      Chain definition (XML):

        <spXML:blocks>
          <spXML:block id="Avg1" function="AVERAGE">
            <spXML:param name="window" value="1" unitOfMeasure="h"/>
            <spXML:param name="mode" value="batch"/>
          </spXML:block>
          <spXML:block id="Th1" function="THRESHOLD">
            <spXML:param name="threshold" value="1" unitOfMeasure="kW"/>
          </spXML:block>
        </spXML:blocks>

      Generated queries:

        insert into RealEvent(src, streamName, value, unitOfMeasure)
        select "Average", "Average-out", avg(value) as value, unitOfMeasure
        from realEvent(streamName="M1").win:time_batch("1h")
        group by src, streamName, unitOfMeasure;

        insert into BooleanEvent(src, streamName, booleanValue)
        select "Threshold", "Threshold-out" as streamName, true as value
        from pattern [every (
            oldSample=RealEvent(streamName="Average-out",
                MeasureEventComparator.compareToMeasure(oldSample, "1kW",
                    EventComparisonEnum.LESS_THAN_OR_EQUAL))
            -> newSample=RealEvent(streamName=oldSample.streamName,
                MeasureEventComparator.compareToMeasure(newSample, "1kW",
                    EventComparisonEnum.GREATER_THAN)))].win:length(2);
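To make the XML-to-query mapping concrete, here is a rough Python sketch that reads the block definitions and renders a simplified query string for the AVERAGE block. It is not the spChains mapping: the namespace URI is a placeholder (the slide does not show one), `read_blocks` and `average_query` are hypothetical helpers, and the generated text only imitates the shape of the query on the slide.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: the slide does not show the real one.
NS = {"spXML": "http://example.org/spXML"}

CHAIN_XML = """
<spXML:blocks xmlns:spXML="http://example.org/spXML">
  <spXML:block id="Avg1" function="AVERAGE">
    <spXML:param name="window" value="1" unitOfMeasure="h"/>
    <spXML:param name="mode" value="batch"/>
  </spXML:block>
  <spXML:block id="Th1" function="THRESHOLD">
    <spXML:param name="threshold" value="1" unitOfMeasure="kW"/>
  </spXML:block>
</spXML:blocks>
"""

def read_blocks(xml_text):
    """Parse block definitions into dicts; value and unit are joined, e.g. '1h'."""
    root = ET.fromstring(xml_text)
    blocks = []
    for el in root.findall("spXML:block", NS):
        params = {
            p.get("name"): p.get("value") + (p.get("unitOfMeasure") or "")
            for p in el.findall("spXML:param", NS)
        }
        blocks.append({"id": el.get("id"),
                       "function": el.get("function"),
                       "params": params})
    return blocks

def average_query(block, source):
    """Render an illustrative averaging query for one AVERAGE block."""
    window = block["params"]["window"]
    view = "time_batch" if block["params"].get("mode") == "batch" else "time"
    return (f"select avg(value) as value from RealEvent(streamName='{source}')"
            f".win:{view}('{window}')")

blocks = read_blocks(CHAIN_XML)
query = average_query(blocks[0], "M1")
print(query)
```

The point of the sketch is the division of labour: the domain expert edits only the XML, and the query text is derived from it mechanically.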
    • spChains Framework
      [Architecture diagram. Labels: Event Sources (Environmental Data); Pervasive/Ubiquitous Communication Infrastructure; Stream Processing Chains, built from spBlocks (Stream Processing Blocks); Chain Definition; Event Drains (Pattern Match / Alerts, Aggregate / Computed Measures); Pervasive application(s); Final Users]
    • Basic spBlock Library
    • Examples of spChains
    • Examples of spChains
        <spXML:block id="Avg1" function="AVERAGE">
          <spXML:param name="window" value="1" unitOfMeasure="h"/>
          <spXML:param name="mode" value="batch"/>
        </spXML:block>
    • Implementation Java spChains library (Apache v2.0 license)  Core library http://elite.polito.it/spchains  Esper bindings  Basic spBlock library  Scales up to 200 k events/sec Already in use  3 different data centers, running on embedded PCs  Monitoring environment, electrical power consumption, thermal flows (heating and cooling), polled by means of the Dog2.x multiprotocol gateway  Computed quantity are “pushed” to Web Service collectors  Over 3 months of uptime, no issues found 16 ANT’2012, Niagara Falls, Canada spChains
    • Conclusions Complex computations in the field and in real time Efficient and easy to integrate Lowered the barrier to adoption of Stream Processing Future work http://elite.polito.it  User interface http://elite.polito.it/spchains  Large-scale installations fulvio.corno@polito.it dario.bonino@polito.it 17 ANT’2012, Niagara Falls, Canada spChains