Phd

Published in: Education

Transcript

  • 1. Ontology-based Access to Sensor Data Streams. Jean-Paul Calbimonte. Supervisor: Oscar Corcho. Ontology Engineering Group, Facultad de Informática, Universidad Politécnica de Madrid. jp.calbimonte@upm.es. PhD Thesis Defense, 18.4.2013.
  • 2. Outline: Motivation; Background; Conclusions; Semantic stream query processing; Sensor metadata characterization; Ontology-based Access to Sensor Data Streams; Hypotheses & contributions; Challenges
  • 3. Motivation: from Sensor Networks to the Sensor Web and the Semantic Sensor Web
  • 4. Sensors: data capture; transmission; different sensor providers; data streams. Image: http://www.flickr.com/photos/wouterh/2409251427/
  • 5. Sensor Networks and the Web: sensor networks; data streams (volume, velocity, variety); users; applications; Web. Universal Web-based access to sensor data.
  • 6. Querying the semantic sensor Web: e.g. publish sensor data as RDF/Linked Data? (URIs as names of things; HTTP URIs; useful information when a URI is dereferenced; links to other URIs.) Users; applications; Web. (1) Use ontology models to continuously query real-time data streams originating from sensors? Static vs. streams; one-off vs. continuous.
  • 7. Research questions & hypotheses (1): Ontology models to query real-time sensor data streams? Access heterogeneous SPEs (stream processing engines) using ontologies as an overarching data model? SPARQL streaming extensions for querying data from SPEs? H1: Sensor streaming data → instances of an ontology model. H2: SPARQL extensions → streaming operators & continuous processing. H3: Ontology-based streaming queries → rewritten to relational-based queries using mappings. H4: Ontology-based streaming queries → abstract expressions → concrete executable SPE queries. H5: Query rewriting → pull & push delivery → acceptable overhead.
  • 8. Sensor Data: Observations. Citizen science; multiple publishers; heterogeneity; metadata quality.
  • 9. Sensor data: observations
  • 10. Characterizing semantic sensor metadata: different publishers publish streams with different metadata (GSN); users; applications; Web. Search/query relevant data sources? (2) Characterizing sensor data, deriving semantic metadata from the sensor observations.
  • 11. Research questions & hypotheses (2): Data representation suitable for extracting data features that characterize a set of sensor streams? Classification and mining techniques to characterize sensor data streams? H6: Sensor data series → find characteristic patterns → make it recognizable among other types. H7: Slope representations → semantic properties such as the type of data → learned with classification techniques → acceptable precision.
  • 12. Contributions. (1) Querying, SPARQLStream & R2RML: SPARQL extensions & formalization; rewriting to algebra expressions using declarative mappings; results data translation; query evaluation pluggable to different SPEs; query rewriting using R2RML mappings. (2) Metadata, sensor metadata characterization: data representation as slope distributions; characterize types of sensor data; classifying sensor time series; extract metadata features; derive semantic properties.
  • 13. Limitations. L1: Rewriting → medium sampling throughput, e.g. environmental monitoring. L2: Query expressivity is limited to that of the underlying SPEs. L3: Adapters → implemented for custom sources. L4: Querying → only simple entailment. L5: Arbitrarily noisy sensor series → no accurate characterization. L6: Classification → depends on the number of sensor time series in the training set. L7: Data characterization is not computed in real time, but offline.
  • 14. Outline: Motivation; Background (Data Streams; Continuous queries; Window; SPEs; Ontology-based data access); Conclusions; Semantic stream query processing; Sensor metadata characterization; Ontology-based Access to Sensor Data Streams; Hypotheses & contributions; Challenges
  • 15. Sensor data streams & events: stream tuples of the form (temp, hum, pres) τi from sensors such as milford1 and watford7, e.g. (36.2, 89, 4) τi; (35.6, 87, 4) τi-1; (37.2, 88, 4) τi+1; ...; (37.6, 88, 7) τi; (36.3, 89, 2) τi+1; ... Event processing.
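  • 16. Querying streams & events. One-off queries: a query is sent to a query processor over a database, which returns the query results. Continuous processing: streaming tuples arrive at a continuous query processor (SPE) that evaluates the query over windows (w1, w2, ...); results are pushed to the client or returned on a pull request. Example: SELECT attribute FROM stream [NOW -10 MIN]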
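The contrast between one-off and continuous evaluation on slide 16 can be illustrated with a small sketch. This is not code from the thesis or from any SPE: the class name `TimeWindowQuery`, its parameters, and the sample readings are hypothetical, and timestamps are plain numbers rather than real clock times. It keeps a sliding time window, pushes a fresh result on every new tuple, and also lets a client pull the current result.

```python
from collections import deque


class TimeWindowQuery:
    """Continuous query over the tuples of the last `width` time units."""

    def __init__(self, width, predicate, on_result):
        self.width = width
        self.predicate = predicate
        self.on_result = on_result  # push delivery: called on every new tuple
        self.window = deque()       # (timestamp, value) pairs, oldest first

    def insert(self, timestamp, value):
        self.window.append((timestamp, value))
        # Evict tuples that fell out of [timestamp - width, timestamp].
        while self.window and self.window[0][0] < timestamp - self.width:
            self.window.popleft()
        self.on_result(self.current())

    def current(self):
        """Pull delivery: the client asks for the current window result."""
        return [v for (_, v) in self.window if self.predicate(v)]


pushed = []
query = TimeWindowQuery(width=10, predicate=lambda v: v > 10,
                        on_result=pushed.append)
for ts, speed in [(0, 3.4), (4, 11.2), (9, 12.5), (15, 5.6), (21, 14.0)]:
    query.insert(ts, speed)
```

After the loop, `pushed` holds one result list per arriving tuple, while `query.current()` returns only the latest one, which is the difference between push and pull delivery.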
  • 17. Stream Processing Engines (SPE): Data Stream Management Systems (DSMS); Complex Event Processors (CEP); Sensor Data Middleware. Systems: CQL/Stream, Borealis, TelegraphCQ, StreamMill, Cayuga, GEM, CEDR, NiagaraCQ, Rapide, Cosm, Hourglass, SStreamWare, GSN, IBM InfoSphere, Sybase CEP, Microsoft StreamInsight, Oracle CEP, Esper, StreamBase. Diverse query languages; different query capabilities; different query models.
  • 18. Extracting data from relational databases: ontology-based data access; one-off SPARQL queries; data as RDF; static data in a relational database; RDB to RDF mappings (R2RML); Web. Systems: D2R, Morph, ODEMapster, Triplify, UltraWrap, Mastro. W3C SSN Ontology.
  • 19. Summary: Existing SPEs are available and producing data streams. Ontology-based access exists only for stored data. The SPARQL query language is not suitable for streams. SPEs are highly heterogeneous in models and queries.
  • 20. Outline: Motivation; Background; Conclusions; Semantic stream query processing (SPARQLStream; Query rewriting; RDF Stream; Mappings using R2RML; Execution over SPEs); Sensor metadata characterization; Ontology-based Access to Sensor Data Streams; Hypotheses & contributions; Challenges
  • 21. RDF Streams. RDF triples (s, p, o): <aemet:observation1, qudt:hasNumericValue, "15.5">; <aemet:observation1, ssn:observedBy, aemet:Sensor3>. For streams? Timestamped triples ((s, p, o), τ): (<aemet:observation1, qudt:hasNumericValue, "15.5">, 34532). References: Gutierrez et al. (2007) Introducing time into RDF. IEEE TKDE. Rodríguez et al. (2009) Semantic management of streaming data. SSN.
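As a rough illustration of the timestamped-triple model on slide 21, the sketch below represents an RDF stream as a list of (triple, timestamp) pairs and selects the triples inside a time interval. The type names, the `window` helper, the third triple, and the timestamps other than 34532 are assumptions made for the example, not part of the slides.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str
    p: str
    o: str


class TimestampedTriple(NamedTuple):
    triple: Triple
    timestamp: int


def window(stream, start, end):
    """Return the plain triples whose timestamp lies in [start, end]."""
    return [t.triple for t in stream if start <= t.timestamp <= end]


stream = [
    TimestampedTriple(
        Triple("aemet:observation1", "qudt:hasNumericValue", "15.5"), 34532),
    TimestampedTriple(
        Triple("aemet:observation1", "ssn:observedBy", "aemet:Sensor3"), 34532),
    # Hypothetical later observation, added only to make the window selective.
    TimestampedTriple(
        Triple("aemet:observation2", "qudt:hasNumericValue", "16.1"), 34600),
]
recent = window(stream, 34590, 34600)
```

Applying a window turns a slice of the stream back into an ordinary set of triples, over which a regular graph pattern can then be evaluated.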
  • 22. SPARQLStream extensions. SPARQL query: SELECT (MAX(?temperature) AS ?maxtemp) ?sensor WHERE { ?obs ssn:observedBy ?sensor. ?obs ssn:observationResult ?res. ?res aemet:hasAirTemperatureValue ?val. ?val qu:numericValue ?temperature. } GROUP BY ?sensor. SPARQLStream query with named streams and time windows: SELECT (MAX(?temp) AS ?maxtemp) ?sensor FROM NAMED STREAM <http://aemet.linkeddata.es/observations.srdf> [NOW-1 HOURS] WHERE { ?obs ssn:observedBy ?sensor. ?obs ssn:observationResult ?res. ?res aemet:hasAirTemperatureValue ?val. ?val qu:numericValue ?temp. } GROUP BY ?sensor. Other approaches: Streaming SPARQL (2008), C-SPARQL (2009), CQELS (2011), EP-SPARQL (2011), INSTANS (2012).
  • 23. Streaming SPARQL execution approaches: extend RDF for streaming data; extend SPARQL for streaming RDF; use an SPE internally for evaluation; query rewriting to SPEs; RDF streaming engine from scratch; logic-programming based query evaluation. Similarities; divergences. Sources: streams, DSMSs, CEPs, middleware. SPARQLStream.
  • 24. Mapping SPE schemas and ontologies. SPE data schema: wan7 (timed: datetime PK; sp_wind: float), with tuples (timed, sp_wind): (1, 3.4), (2, 5.6), (3, 11.2), (4, 1.2), (5, 3.1), ... SPE query: SELECT sp_wind FROM wan7 [NOW -5 HOUR] WHERE sp_wind >10. Ontology model: ssn:Observation. Stream-to-ontology mappings connect the two. SPARQLStream query: SELECT ?wspeed FROM STREAM <SensorReadings.srdf> [NOW-5 HOUR] WHERE { ?obs a ssn:ObservationValue; qudt:numericalValue ?wspeed; FILTER (?wspeed>10) }
  • 25. Creating Mappings (W3C R2RML Mapping Language). Source stream: wan7 (timed: datetime PK; sp_wind: float). Ontology terms: ssn:Observation, ssn:observedProperty, ssn:Property, sweetSpeed:WindSpeed, ssn:observationResult, ssn:SensorOutput, ssn:hasValue, ssn:ObservationValue, qudt:numericValue, xsd:decimal. Generated URIs: http://swissex.ch/data#Wan7/WindSpeed/Observation{timed}; http://swissex.ch/data#Wan7/WindSpeed/ObsOutput{timed}; http://swissex.ch/data#Wan7/WindSpeed/ObsValue{timed}; literal value from column sp_wind. Mapping: :Wan4WindSpeed a rr:TriplesMapClass; rr:tableName "wan7"; rr:subjectMap [ rr:template "http://swissex.ch/data#Wan7/WindSpeed/ObsValue/{timed}"; rr:class ssn:ObservationValue; rr:graph ssg:swissexsnow.srdf ]; rr:predicateObjectMap [ rr:predicateMap [ rr:predicate qudt:numericValue ]; rr:objectMap [ rr:column "sp_wind"; rr:datatype xsd:decimal ] ]; .
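To make the role of the mapping on slide 25 more concrete, here is a minimal sketch of how a subject template and a column reference could turn stream tuples into timestamped triples. It is not an R2RML processor and not the thesis implementation: `apply_mapping`, the dictionary layout, and the use of `timed` as the triple timestamp are assumptions for illustration, while the template, class, predicate, column name, and sample values are taken from slides 24 and 25.

```python
def apply_mapping(row, subject_template, rdf_class, predicate, column):
    """Turn one stream tuple (a dict) into (triple, timestamp) pairs."""
    # Fill the {timed} placeholder of the template from the tuple.
    subject = subject_template.format(**row)
    triples = [
        (subject, "rdf:type", rdf_class),
        (subject, predicate, row[column]),
    ]
    # Assumption: the tuple's own time attribute is used as the timestamp.
    return [(t, row["timed"]) for t in triples]


mapping = {
    "subject_template": "http://swissex.ch/data#Wan7/WindSpeed/ObsValue/{timed}",
    "rdf_class": "ssn:ObservationValue",
    "predicate": "qudt:numericValue",
    "column": "sp_wind",
}
rows = [{"timed": 1, "sp_wind": 3.4}, {"timed": 3, "sp_wind": 11.2}]
stream = [pair for row in rows for pair in apply_mapping(row, **mapping)]
```

In the approach described on the slides the mapping is used to rewrite queries rather than to materialize triples up front; the sketch only shows what the mapping asserts about each tuple.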
  • 26. Query rewriting. H3: Ontology-based streaming queries → rewritten to relational-based queries using mappings. H4: Ontology-based streaming queries → abstract expressions → concrete executable SPE queries. SPARQLStream query: SELECT ?windspeed FROM STREAM <http://ssg4env.eu/SensorReadings.srdf> [NOW-5 HOUR TO NOW] WHERE { ?obs a ssn:ObservationValue; qudt:numericalValue ?windspeed; FILTER (?windspeed>10) }. Query rewriting with R2RML mappings produces an algebra expression with the operators π (projection: timed, sp_wind), ω (window: 5 Hour) and σ (selection: sp_wind>10) over the stream wan7. Concrete queries: SNEE (DSMS): SELECT sp_wind FROM wan7 [FROM NOW-5 HOURS TO NOW] WHERE sp_wind >10; Esper (DSMS): SELECT sp_wind FROM wan7.win:time(5 hour) WHERE sp_wind >10; GSN (middleware): http://montblanc.slf.ch:22001/multidata?vs[0]=wan7&field[0]=wind_speed_scalar_av&c_min[0]=10&from=15/05/2012+05:00:00&to=15/05/2012+10:00:00 ; Cosm (middleware): http://api.cosm.com/v2/feeds/14321/datastreams/4?start=2012-05-15T05:00:00Z&end=2012-05-15T10:00:00Z
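The step from an abstract algebra expression to concrete SPE queries (H4) might be sketched as follows. The `WindowedSelect` class and the two serializer functions are hypothetical names, and the flat representation only covers the single projection/selection/window case of the example; the two output syntaxes follow the SNEE and Esper queries shown on slide 26.

```python
from dataclasses import dataclass


@dataclass
class WindowedSelect:
    """Flat form of a projection, selection and time window over one stream."""
    columns: list
    stream: str
    window_hours: int
    condition: str


def to_snee(q):
    # Window written as an explicit time interval, as in the SNEE example.
    return (f"SELECT {', '.join(q.columns)} FROM {q.stream} "
            f"[FROM NOW-{q.window_hours} HOURS TO NOW] WHERE {q.condition}")


def to_esper(q):
    # Window attached to the stream name, as in the Esper example.
    return (f"SELECT {', '.join(q.columns)} FROM {q.stream}"
            f".win:time({q.window_hours} hour) WHERE {q.condition}")


expr = WindowedSelect(["sp_wind"], "wan7", 5, "sp_wind > 10")
snee_query = to_snee(expr)
esper_query = to_esper(expr)
```

Keeping the algebra expression independent of any target syntax is what lets one rewritten query be executed on several engines by swapping only the serializer.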
  • 27. Ontology-based query rewriting (SPARQLStream query processing). H1: Sensor streaming data → instances of an ontology model. H2: SPARQL extensions → streaming operators & continuous processing. Components: client; SPARQLStream query; query rewriting with R2RML mappings; algebra expression; query processing over SNEE, Esper, GSN, Cosm or other SPEs (pull/push); [tuples]; data translation; [triples/bindings]. Example query: SELECT ?windspeed FROM STREAM <http://ssg4env.eu/SensorReadings.srdf> [NOW-5 HOUR] WHERE { ?obs a ssn:ObservationValue; qudt:numericalValue ?windspeed; FILTER (?windspeed>10) }. Algebra operators: π (timed, sp_wind), ω (5 Hour), σ (sp_wind>10) over wan7. Rewritten query: SELECT sp_wind FROM wan7.win:time(5 hour) WHERE sp_wind >10. Implementation: https://github.com/jpcik/morph-streams
  • 28. Evaluation of query rewriting overhead. H5: Query rewriting → pull & push delivery → acceptable overhead. Native execution without rewriting vs. execution with rewriting; pull & push delivery; end-to-end latency; adapted Esper benchmark.
  • 29. Outline: Motivation; Background; Conclusions; Semantic stream query processing; Sensor metadata characterization (Representation; Classification; Metadata); Ontology-based Access to Sensor Data Streams; Hypotheses & contributions; Challenges
  • 30. Characterizing semantic sensor metadata: compare unclassified input series (Web, GSN) with already classified time series. Air pressure? Air temperature?
  • 31. Deriving Semantic Metadata: Representation; Classification; Metadata
  • 32. Piecewise Linear Approximation (plots: time series and its linear segments over time). Reduce numerosity; reflect data trends; apply with different resolutions; applicable for different rates; online computation cheap.
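A very small sketch of a piecewise linear approximation is given below. The slides do not say which segmentation algorithm the thesis uses, so this version simply assumes fixed-length segments joined at their endpoints; `piecewise_linear`, `segment_len`, and the sample series are hypothetical.

```python
def piecewise_linear(values, segment_len):
    """Approximate `values` by straight segments of `segment_len` steps each.

    Returns a list of (start_index, end_index, slope) triples, assuming
    one sample per time unit.
    """
    segments = []
    start = 0
    while start < len(values) - 1:
        end = min(start + segment_len, len(values) - 1)
        slope = (values[end] - values[start]) / (end - start)
        segments.append((start, end, slope))
        start = end
    return segments


series = [3.7, 3.8, 3.9, 4.0, 4.0, 4.0, 3.9, 3.8, 3.7]
segments = piecewise_linear(series, 4)
```

Nine samples are reduced to two segments here, one rising and one falling, which is the numerosity reduction and trend preservation the slide refers to; a shorter `segment_len` gives a finer resolution.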
  • 33. Linear Approximations. Key: segment slopes (angles). Divide the angle space into sectors (marks at -π/4, 0, π/4, π/2; segments a, b, c, d). Steps: (1) compute linear approximation; (2) compute slope distribution (distribution of angles in training set); (3) K-nearest neighbor classification.
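The three steps on slide 33 could be sketched roughly as below. This is an illustration under stated assumptions, not the thesis method: it uses the slope between consecutive samples instead of a proper linear approximation, four equal angle sectors, Euclidean distance between distributions, and made-up training series and labels.

```python
import math
from collections import Counter


def slope_distribution(values, sectors=4):
    """Histogram of segment angles over (-pi/2, pi/2), normalised to sum to 1."""
    counts = [0] * sectors
    for a, b in zip(values, values[1:]):
        angle = math.atan(b - a)  # slope angle, one sample per time unit
        idx = int((angle + math.pi / 2) / math.pi * sectors)
        counts[min(idx, sectors - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def knn_classify(query, training, k=3):
    """training: list of (distribution, label); majority vote of k nearest."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical training set: spiky series vs. smoothly varying series.
training = [
    (slope_distribution([0, 5, 0, 5, 0, 5]), "wind_speed"),
    (slope_distribution([0, 4, 1, 6, 0, 7]), "wind_speed"),
    (slope_distribution([10.0, 10.1, 10.2, 10.3, 10.2, 10.1]), "air_temperature"),
    (slope_distribution([20.0, 20.2, 20.1, 20.3, 20.4, 20.3]), "air_temperature"),
]
label = knn_classify(slope_distribution([0, 6, 1, 5, 0, 6]), training, k=3)
```

Because only the distribution of slopes is compared, series with different lengths or offsets but similar dynamics end up close to each other.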
  • 34. Experiments SwissEx: training and test datasets (SwissExperiment, AEMET); confusion matrix for SwissEx.
  • 35. Experiments AEMET: confusion matrix for AEMET; classification according to type; false positives on subclasses of the same property. H6: Sensor data series → find characteristic patterns → make it recognizable among other types.
  • 36. Evaluation vs SAX. H7: Slope representations → type of data: semantic property learned through classification.
  • 37. Semantic Sensor Metadata (W3C SSN Ontology): from raw sensor data to semantic metadata. Sensors on station1: swissex:Sensor1 rdf:type ssn:Sensor; ssn:onPlatform swissex:Station1; ssn:observes cf-property:wind_speed. swissex:Sensor2 rdf:type ssn:Sensor; ssn:onPlatform swissex:Station1; ssn:observes cf-property:air_temperature. Derive semantic metadata properties: cf-property:wind_speed rdf:type dim:VelocityOrSpeed; rdfs:label "wind speed"; ssn:isPropertyOf cf-feature:wind; qu:propertyType qu:scalar; qu:generalQuantityKind qu:speed.
  • 38. Outline: Motivation; Background; Conclusions; Semantic stream query processing; Sensor metadata characterization; Ontology-based Access to Sensor Data Streams; Hypotheses & contributions; Challenges
  • 39. Conclusions. H1: Sensor streaming data → instances of an ontology model. H2: SPARQL extensions → streaming operators & continuous processing. H3: Ontology-based streaming queries → rewritten to relational-based queries using mappings. Mapping sensor data to ontology instances, e.g. SSN Ontology. SPARQLStream → data model, extensions syntax, semantics. SPARQLStream → semantics of query rewriting to relational streaming algebra; usage of declarative mappings (W3C R2RML). Publications: Calbimonte, Corcho & Gray. Enabling ontology-based access to streaming data sources. ISWC 2010. Gray, García-Castro, Kyzirakos, Karpathiotakis, Calbimonte, Page et al. A semantically enabled service architecture for mashups over streaming and stored data. ESWC 2011. Gray, Sadler, Kit, Kyzirakos, Karpathiotakis, Calbimonte, Page, García-Castro, et al. A semantic sensor web for environmental decision support applications. Sensors, MDPI, 2011. Calbimonte, Corcho & Gray. Ontology-based Access to Streaming Data. In Posters ESWC 2010.
  • 40. Conclusions. H4: Ontology-based streaming queries → abstract expressions → concrete executable SPE queries. Instantiate, execute → different SPEs: SNEE (DSMS), Esper (CEP), GSN & Cosm (middleware). Available implementation → application in different domains. H5: Query rewriting → pull & push delivery → evaluation overhead. SPARQLStream → evaluation overhead w.r.t. native execution; push & pull delivery evaluation. Publications: Calbimonte, Jeung, Corcho & Aberer. Enabling Query Technologies for the Semantic Sensor Web. IJSWIS 2012. Calbimonte & Corcho. Evaluating SPARQL Queries over RDF Streams. Linked Data Management: Principles and Techniques, CRC Press, 2013 (under review). Zhang, Duc, Corcho & Calbimonte. SRBench: A Streaming RDF/SPARQL Benchmark. ISWC 2012. Ruckhaus, Calbimonte, García-Castro & Corcho. Short Paper: From Streaming Data to Linked Data: A Case Study with Bike Sharing Systems. ISWC SSN 2012.
  • 41. Conclusions. H6: Sensor data series → analyze in order to find characteristic patterns → make it recognizable among other types. H7: Slope representations → semantic properties such as the type of data → learned with classification techniques → acceptable precision. Raw observations analysis → slope distribution representation → compared with state-of-the-art representations, i.e. SAX. Evaluation of classification task → real-world datasets AEMET, SwissEx → in presence of noisy data → deriving semantic metadata. Publications: Calbimonte, Yan, Jeung, Corcho & Aberer. Deriving Semantic Sensor Metadata from Raw Measurements. ISWC SSN 2012. Calbimonte, Jeung, Corcho & Aberer. Semantic Sensor Data Search in a Large-Scale Federated Sensor Network. ISWC SSN 2011.
  • 42. Future directions (SPARQLStream queries; Web): publishing Linked Stream Data; currently static; SPARQL streaming standards; dereferencing streaming data; query federation; distributed sensor data; static and streaming sources; stream reasoning; query rewriting, expanding queries; expressiveness; integrate with the Web of Data; inferencing.
  • 43. Future directions (Web). Sensor pattern classification: combine with query processing; live data classification. Statistical & quality analysis: integrate statistical analysis; mappings to statistical models; data quality filtering. Parallel massive stream processing: online stream analysis; scalable stream processing; S4, Storm, Streamcloud; heterogeneity.
  • 44. Ontology-based Access to Sensor Data Streams. Jean-Paul Calbimonte. Supervisor: Oscar Corcho. Ontology Engineering Group, Facultad de Informática, Universidad Politécnica de Madrid. jp.calbimonte@upm.es. PhD Thesis Defense, 18.4.2013.
  • 46. SSN Ontology with other ontologies: W3C SSN Ontology; tool for modeling our sensor data; combine with domain ontologies.
  • 47. Algebra construction: operators π (timed, sp_wind), ω (5 Hour) and σ (sp_wind>10) over wan7; source streams windsensor1 and windsensor2.
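  • 48. Static optimization: the expression with π (timed, sp_wind), ω (5 Hour) and σ (sp_wind>10) over wan7, shown together with one expression per source stream: π (timed, windvalue), ω (5 Hour), σ (windvalue>10) over windsensor1, and π (timed, windvalue), ω (5 Hour), σ (windvalue>10) over windsensor2.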
  • 49. SPARQL Streaming extensions
  • 50. SPARQL Stream features
  • 51. SRBench
  • 52. RDF Streams and SPARQLStream: RDF Stream; Time window; Window-Stream
  • 53. Mappings (subject, predicate, object; evaluate query). Given a triple pattern $tp = (sp, pp, op)$, the semantics of its evaluation over a set of relational streams referenced by a set of mappings $M$ is given by $eval(tp, M)$, which is an algebra expression defined as: $eval(tp, M) = \rho_{f_s \to sp,\ f_p \to pp,\ f_o \to op}\ \pi_{f_s, f_p, f_o}(s)$, where $\rho$ is the relational rename operation and $\pi$ is the relational projection operation. $s$ is the stream referenced by the mapping $\mu = findMapping(tp, M)$, and $f_s$, $f_p$, $f_o$ are the functions of $\mu$ that generate the projection expressions for producing respectively the subject, predicate and object, for every tuple of $s$. For the previous example, the evaluation of $tp_1$ is given by: $eval(tp_1, M) = \rho_{f_s \to sp,\ f_p \to pp,\ f_o \to op}\ \pi_{f_s^{\mu_1}(s_1.ts),\ f_p^{\mu_1}(),\ f_o^{\mu_1}()}(s_1)$. The resulting algebra expression projects the $s_1.ts$ attribute, applying the $f_s$ function to create the subject. The functions $f_p^{\mu_1}$ and $f_o^{\mu_1}$ in this case are constants, as the predicate and object are the same for all tuples of $s_1$. For the evaluation of more co...
  • 54. Rewrite to algebra. Then, the evaluation of gp can be represented as the following algebra expression: $eval(tp, M) = \omega_{t_s, t_e, \delta}\ \pi_{f_s^{\mu_1}}(s_1) \bowtie \pi_{f_s^{\mu_2}, f_o^{\mu_2}}(s_1) \bowtie \pi_{f_s^{\mu_4}, f_o^{\mu_4}}(s_1) \bowtie \pi_{f_s^{\mu_5}, f_o^{\mu_5}}(s_1)$. This expression can be represented as a tree (Figure 4.1), where the leaf nodes are the streams and the other nodes are the relational streaming operators. Figure 4.1: Tree representation of the evaluation of a SPARQLStream query rewritten as an algebra expression.
  • 55. Rewriting and Execution Process
  • 56. Execution process
  • 57. SRBench Datasets. LinkedSensorData (LinkedSensorMetadata, LinkedObservationData): real-world U.S. weather data (source: http://mesowest.utah.edu); first & largest sensor dataset in LOD; ~20k US weather stations, ~100k sensors; links to nearby locations in GeoNames; hurricane & blizzard observations in the US; ~1.73 billion RDF triples; ~159 million observations.
    Name | Storm Type | Date | #Triples | #Observations | Data size
    Bill | Hurricane | Aug. 17 – 22, 2009 | 231,021,108 | 21,272,790 | ~15 GB
    Ike | Hurricane | Sep. 01 – 13, 2008 | 374,094,660 | 34,430,964 | ~34 GB
    Gustav | Hurricane | Aug. 25 – 31, 2008 | 258,378,511 | 23,792,818 | ~17 GB
    Bertha | Hurricane | Jul. 06 – 17, 2008 | 278,235,734 | 25,762,568 | ~13 GB
    Wilma | Hurricane | Oct. 17 – 23, 2005 | 171,854,686 | 15,797,852 | ~10 GB
    Katrina | Hurricane | Aug. 23 – 30, 2005 | 203,386,049 | 18,832,041 | ~12 GB
    Charley | Hurricane | Aug. 09 – 15, 2004 | 101,956,760 | 9,333,676 | ~7 GB
     | Blizzard | Apr. 01 – 06, 2003 | 111,357,227 | 10,237,791 | ~2 GB
  • 58. SRBench Queries (17 queries). Graph pattern matching: and, filter, union, optional. Solution modifier: projection, distinct. Query form: select, construct, ask. SPARQL 1.1: aggregate, subquery, select expr, property path. Reasoning: subclass, subproperty, sameAs. Streaming: time window, istream, dstream, rstream. Data access: observations, sensor metadata, geonames, dbpedia.
  • 59. Query Features (Q1 to Q17).
    Legend: 1. And, Filter, Union, Optional. 2. Projection, Distinct. 3. Select, Construct, Ask. 4. Aggregate, Subquery, Negation, Expr in SELECT, assignMent, Functions & operators, Property path. 5. subClassOf, subpRopertyOf, owl:sameAs. 6. Time-based window, Istream, Dstream, Rstream. 7. LinkedObservationData, LinkedSensorMetadata, GeoNames, Dbpedia.
    1. Graph pattern matching (Q1 to Q17): A | A,F,O | A | A,F | A | A,F,U | A | A | A | A | A,F | A,F,U | A,F | A,F,U | A,F | A,F | A,F
    2. Solution modifier (Q1 to Q17): P,D | P,D | P | P | P | P | P,D | P | P | P,D | P,D | P | P | P,D | P | P | P
    3. Query form (Q1 to Q17): S | S | A | S | C | S | S | S | S | S | S | S | S | S | S | S | S
    4. SPARQL 1.1 (entries in slide order; not every query has one): F,P | A | A,E,M,F | A,S | N | A,E,M | A,E,M | A,S,M,F | A,S,E,M,F,P | A,E,M,F,P | F,P | A,E,M,P | P | P
    5. Reasoning (entries in slide order; not every query has one): C | R | C | A | C
    6. Streaming (entries in slide order; not every query has one): T | T | T | T | T | T | T,D | T | T | T | T | T | T | T | T
    7. Dataset (Q1 to Q17): O | O | O | O | O | O | O | O,S | O,S | O,S | O,S | O,S,G | O,S,G | O,S,G | O,S,D | O,S,G,D | S
