Enabling semantic integration

1,092 views
1,004 views

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,092
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
29
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Enabling semantic integration

  1. 1. Date: 23/09/2010<br />FIS 2010 DoctoralConsortium<br />Enabling Semantic Integration of Streaming Data Sources<br />Jean-Paul Calbimonte<br />Ontology Engineering Group. Departamento de Inteligencia Artificial.<br />Facultad de Informática, Universidad Politécnica de Madrid. <br />Campus de Montegancedo s/n. <br />28660 Boadilla del Monte. Madrid. Spain<br />{jpcalbimonte}@fi.upm.es<br />Supervisor: Oscar Corcho<br />DC Scientific advisor: AchimRettinger<br />
  2. 2. Index<br />Introduction<br />Problem statement<br />Main research questions<br />Approach<br />Proposed solution<br />Work done so far<br />Evaluation <br />Future work<br />2<br />EnablingSemanticIntegration of Streaming Data Sources<br />
  3. 3. Introduction & Scope<br />3<br /><ul><li>Streaming Data
  4. 4. Continuously appended data
  5. 5. Potentially infinite
  6. 6. Time-stamped tuples
  7. 7. Continuous queries
  8. 8. Latest used in queries
  9. 9. Windows: time information and tuple based</li></ul>(t9, a1, a2, ... , an)<br />(t8, a1, a2, ... , an)<br />(t7, a1, a2, ... , an)<br />...<br />...<br />(t1, a1, a2, ... , an)<br />...<br />...<br />Window [t7 - t9]<br />Streaming Data<br /><ul><li> Sensor Networks characteristics
  10. 10. Cheap, Noisy, Unreliable (depends)
  11. 11. Low computational, power resources, storage
  12. 12. Distributed query execution
  13. 13. Routing, Optimization</li></ul>Query<br />EnablingSemanticIntegration of Streaming Data Sources<br />
  14. 14. Problem Statement<br /><ul><li>Heterogeneous sources: schemas, stream rates, QoS, delivery mechanisms
  15. 15. Distributed sources
  16. 16. Semantic heterogeneity
  17. 17. Semantic data provision only for stored data
  18. 18. Need for live streaming continuous queries</li></ul>Sensor Network<br />Database Data<br />Integrate<br />Decl. Query<br />Integrated view<br />Stream Data<br />4<br />SemanticIntegrationStreaming Data Sources<br />
  19. 19. Main Research Questions<br /><ul><li>Provide semantic query interfaces for streaming data
  20. 20. Expose streaming data for the semantic web
  21. 21. Integrate streaming sources through ontology mappings
  22. 22. Optimize distributed query execution for streaming + stored data</li></ul>Ontology-based Data Access<br />Heterogeneous data Integration<br />Streaming Data Access<br />Distributed Query Processing<br />RDF Streams Querying<br />Semantic Integrator<br />q<br />5<br />EnablingSemanticIntegration of Streaming Data Sources<br />
  23. 23. 6<br />General Approach<br /><ul><li>Related work: literature and existing approaches
  24. 24. Identify limitations
  25. 25. Potential gaps
  26. 26. Incremental solution proposals
  27. 27. Ontology-based data access to streams
  28. 28. Semantic streaming query language
  29. 29. Semantic integration for distributed streams
  30. 30. Stream query optimization
  31. 31. Evaluation
  32. 32. SemSorGrid4Env project
  33. 33. Benchmarks, LinearRoad</li></ul>EnablingSemanticIntegration of Streaming Data Sources<br />
  34. 34. 7<br />Proposed Solution<br />Client<br />SPARQLSTR (Og) <br />Query reconciliation<br />Ontology-to-Ontologymappings<br />SPARQLSTR (O1 O2On) <br />Query translation<br />Ontology-to-Sourcemappings<br />Sensor Network (S1)<br />SPARQLSTR algebra(S1 S2 Sm) <br />Relational DB (S2)<br />Rewriter<br />Query Evaluator<br />Optimizer<br />Rewriter<br />Stream Engine (S3)<br />Distributed Query Processing<br />RDF Store (Sm)<br />Ontology-based Streaming Data Access Service<br />7<br />SemanticIntegrationStreaming Data Sources<br />
  35. 35. So Far…<br />Ontology-base data access<br /><ul><li>Definestream extensions for R2O
  36. 36. Define SPARQLSTRlanguagesyntax and semantics
  37. 37. Enableengine support for « S2O » documents, SPARQLSTR queries
  38. 38. Enabledengine support for SNEEql translation and connection
  39. 39. Limited to non-distributed scenario initially</li></ul>8<br />SemanticIntegrationStreaming Data Sources<br />
  40. 40. v<br />v<br />So Far...<br />9<br />Enabling Semantic Integration of Streaming Data Sources<br />envdata_westbay<br />Feature<br />envdata_chesil<br />locatedInRegion<br />v<br />envdata_milford<br />Observation<br />hasObservationResult<br />S2O <br />Mapping<br />v<br />envdata_hornsea<br />Streams<br />Ontologies<br />observedProperty<br />envdata_rhylflats<br />Region<br />xsd:float<br />Timestamp: long<br />Hs : float<br />Lon: float<br />Lat: float<br />WaveHeightProperty<br />PREFIX cd: <http://www.semsorgrid4env.eu/ontologies/CoastalDefences.owl#><br />PREFIX sb: <http://www.w3.org/2009/SSN-XG/Ontologies/SensorBasis.owl#> <br />PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> <br />SELECT ?waveheight ?wavets ?lat ?lon<br />FROM STREAM <http://www.semsorgrid4env/ccometeo.srdf> <br />WHERE <br />{ <br /> ?WaveObs a cd:Observation; <br />cd:observationResult ?waveheight; <br />cd:observationResultTime ?wavets;<br />cd:observationResultLatitude ?lat;<br />cd:observationResultLongitude ?lon;<br />cd:observedProperty ?waveProperty;<br />cd:featureOfInterest ?waveFeature. <br /> ?waveFeature a cd:Feature;<br />cd:locatedInRegioncd:SouthEastEnglandCCO.<br /> ?waveProperty a cd:WaveHeight. <br />}<br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_rhylflats) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_hornsea) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_milford) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_chesil) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_perranporth) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_westbay) UNION <br />(SELECT Lon,timestamp,Hs,Lat FROM envdata_pevenseybay)<br />SNEEql<br />SPARQLSTR<br />
  41. 41. 10<br />Future Works<br /><ul><li>Ontology-based data access
  42. 42. SPARQL construct expressions, aggregates, projected operators
  43. 43. Implement adapters for other streaming sources
  44. 44. Add query rewriting algorithms
  45. 45. Ontology-based streaming data integration
  46. 46. Horizontal & vertical integration
  47. 47. Integrate streaming + stored data
  48. 48. RDF data sources integration
  49. 49. Streaming query optimization
  50. 50. Analyze cost models
  51. 51. Streaming sources statistics and metadata
  52. 52. Quantitative evaluation</li></ul>10<br />SemanticIntegrationStreaming Data Sources<br />
  53. 53. Thanks!<br />11<br />Enabling Semantic Integration of Streaming Data Sources<br />Enabling Semantic Integration of Streaming Data Sources<br />
  54. 54. 12<br />Red de Ontologías para el Camino de Santiago<br />References<br /><ul><li>Arasu, A., Babcock, B., Babu, S., Cieslewicz, J., Datar, M., Ito, K., Motwani, R., Srivastava, U., Widom, J.: Stream: The stanford data stream management system. In Garofalakis, M., Gehrke, J., Rastogi, R., eds.: Data Stream Management. (2006)
  55. 55. Sahoo, S.S., Halb, W., Hellmann, S., Idehen, K., Jr, T.T., Auer, S., Sequeda, J., Ezzat, A.: A survey of current approaches for mapping of relational databases to RDF. W3C (January 2009)
  56. 56. Arasu, A., Babu, S., Widom, J.: The cql continuous query language: semantic foundations and query execution. The VLDB Journal 15(2) (June 2006) 121-142
  57. 57. Brenninkmeijer, C.Y., Galpin, I., Fernandes, A.A., Paton, N.W.: A semantics for a query language over sensors, streams and relations. In: BNCOD '08. (2008) 87-99
  58. 58. Barrasa, J., Oscar Corcho, Gomez-Perez, A.: R2O, an extensible and semantically based database-to-ontology mapping language. In: SWDB2004. (2004) 1069-1070
  59. 59. Lenzerini, M.: Data integration: a theoretical perspective. In: PODS '02. (2002) 233-246
  60. 60. Barrasa Rodriguez, J., Gomez-Perez, A.: Upgrading relational legacy data to the semantic web. In: WWW '06. (2006) 1069-1070
  61. 61. Barbieri, D.F., Braga, D., Ceri, S., Della Valle, E., Grossniklaus, M.: C-sparql: A continuous query language for rdf data streams (to appear). In: (IJSC). (2010)
  62. 62. Bolles, A., Grawunder, M., Jacobi, J.: Streaming SPARQL - extending SPARQL to process data streams. In: ESWC 08. (2008) 448-462
  63. 63. Kossmann, D.: The state of the art in distributed query processing. ACM Comput. Surv. 32(4) (2000) 422-469
  64. 64. Perez, J., Arenas, M., Gutierrez, C.: Semantics and complexity of sparql. ACM Trans. Database Syst. 34(3) (2009) 1-45
  65. 65. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: DL-Lite: Tractable description logics for ontologies. In: AAAI 2005. (2005) 602-607
  66. 66. Poggi, A., Lembo, D., Calvanese, D., Giacomo, G.D., Lenzerini, M., Rosati, R.: Linking data to ontologies. J. Data Semantics 10 (2008) 133-173
  67. 67. Perez-Urbina, H., Horrocks, I., Motik, B.: Ecient query answering for owl 2. In: ISWC 2009. (2009) 489-504</li></li></ul><li>Introduction & Scope<br />13<br />SemanticIntegrationStreaming Data Sources<br />SemSorGrid4Env<br /> Development of an integrated informationspace where new sensor networks can be easily discovered and integrated with existing ones and possibly other data sources (e.g., historical databases)<br /> Rapid development of flexible and user-centric decision support systems that use data from multiple autonomous independently deployed sensor networks and other applications.<br />
  68. 68. Ontology-based data access & integration<br />SquirrelRDF<br />RDBToOnto<br />Relational.OWL<br />SPASQL<br />Virtuoso<br />D2RQ<br />MASTRO<br />R2O + ODEMapster<br />ODEMapster<br />MySQL<br />Oracle<br />...others<br />ODEMQL<br />Transform<br />relational query<br />Ontological query<br />Ontological model<br />Database<br />OWL<br />Mapping<br />R2O<br />OBSERVER<br />SIMS<br />Carnot<br />DWQ<br />PICSEL<br />MOMIS<br />Ontological query<br />Ontological model<br />Transform<br />relational query<br />Databases<br />Mappings<br />14<br />SemanticIntegrationStreaming Data Sources<br />
  69. 69. 15<br />Background: Approaches & Technologies<br />DSMS<br />DQP<br />QP<br />S-RDF<br />Ontology-based Data Access<br />Heterogeneous data Integration<br />Streaming Data Access<br />Distributed Query Processing<br />RDF Streams Querying<br />R2O + ODEMapster<br />C-SPARQL extensions<br />SNEE/SNEEql<br />Semantic Integrator<br />q<br />15<br />SemanticIntegrationStreaming Data Sources<br />
  70. 70. 16<br />S2O: MappingStreamsto Ontologies<br />conceptmap-def WindSpeedMeasurement<br />uri-as <br />concat('ssg4env:WindSpeedMeasurement_',<br />windsamples.sensorid,windsamples.ts) <br /> described-by<br />attributemap-defhasSpeed<br /> operation "constant" <br /> has-column windsamples.speed<br />dbrelationmap-def isProducedBytoConceptSensor <br /> joins-via <br /> condition "equals" <br /> has-column sensors.sensorid<br /> has-column windsamples.sensorid<br />conceptmap-def Sensor<br />uri-as<br />concat('ssg4env:Sensor_',sensors.sensorid)<br /> described-by<br />attributemap-def hasName<br /> operation "constant" <br /> has-column sensors.sensorname<br />S:WindSamples<br /><ul><li>ts
  71. 71. speed
  72. 72. direction
  73. 73. sensorid</li></ul>WindSpeedMeasurement<br />hasSpeedxsd:float<br />Measurement<br />isProducedBy<br />T:Sensors<br />- sensorid<br /><ul><li>sensorname</li></ul>Sensor<br />hasNamexsd:string<br />16<br />SemanticIntegrationStreaming Data Sources<br />
  74. 74. 17<br />Red de Ontologías para el Camino de Santiago<br />QueryTransformationSemantics<br /><ul><li>ConjunctiveQueries
  75. 75. Mapping</li></ul>expression<br />overstreamingsources<br />conjunctive<br />query<br />
  76. 76. 18<br />From SPARQLSTRtoSNEEql<br />Semantic Integrator<br />SELECTconcat( ‘ssg4env.eu#Sensor' , sensors.sensorid ) as a1 , <br /> ( sensors.sensorname ) as name <br />FROM sensors<br />SELECTconcat(‘ssg4env.eu#WindSpeedMeasurement' , <br /> windsensor.id , windsensor.ts ) as a1 , <br /> ( windsensor.speed ) as speed <br />FROMwindsensor[ FROM NOW - 10 TO NOW MIN]<br />SELECTconcat(‘ssg4env.eu#WindSpeedMeasurement' , <br /> windsensor.id, windsensor.ts ) as a1 , <br />concat( ‘ssg4env.eu#Sensor' , sensors.sensorid ) as a2 <br />FROM sensors, windsensor[ FROM NOW - 10 TO NOW MIN] <br />WHERE ( sensors.sensorid = windsensor.id )<br /><ul><li>PREFIX fire: http://www.semsorgrid4env.eu#
  77. 77. PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
  78. 78. SELECT ?speed ?name
  79. 79. FROM STREAM <http://www.ssg4env.eu/Readings.srdf>
  80. 80. [RANGE 10 MINUTE STEP 1 MINUTE]
  81. 81. WHERE {
  82. 82. ?WindSpeed a fire:WindSpeedMeasurement;
  83. 83. fire:hasSpeed ?speed;
  84. 84. fire:isProducedBy ?sensor;
  85. 85. fire:hasTimestamp ?time.
  86. 86. ?sensor a fire:Sensor;
  87. 87. fire:hasName ?name.
  88. 88. }</li></ul>Work in progress: removing redundant queries, basic optimisations, more complex scenarios<br />18<br />SemanticIntegrationStreaming Data Sources<br />
  89. 89. 19<br />SemanticIntegrationInteractions<br />Semantic Integrator<br />Streaming Data Resource<br />Stored Data Resource<br />Consumer<br />IntegrateAs (StrRes,StoRes,map)<br />IntegratedRes<br />SPARQLExecuteFactory(IntegratedRes, query)<br />SNEEqlExecuteFactory(StreamingRes, querySNEEql)<br />accessStreamingRes<br />GetResponseItem(accessStreamingRes)<br />repeat<br />results<br />SQLExecuteFactory(StoredRes, querySQL)<br />results<br />accessIntegratedRes<br />GetResponseItem(accessIntegratedRes)<br />19<br />SemanticIntegrationStreaming Data Sources<br />

×