Efficient RDF Interchange (ERI) Format for RDF Data Streams

Presentation given* at the 13th International Semantic Web Conference (ISWC), in which we present a compressed format for representing RDF data streams. See the original article at: http://dataweb.infor.uva.es/wp-content/uploads/2014/07/iswc14.pdf

* Presented by Alejandro Llaves (http://www.slideshare.net/allaves)

Published in: Data & Analytics

  1. 1. Efficient RDF Interchange (ERI) Format for RDF Data Streams. Javier D. Fernández, Alejandro Llaves, Oscar Corcho. Ontology Engineering Group (OEG), Universidad Politécnica de Madrid, Spain
  2. 2. Outline
      1. Introduction & Motivation
      2. Background
      3. Efficient RDF Interchange (ERI) Format
         i. Basic Concepts
         ii. ERI Streams
         iii. Practical Deployment
      4. Evaluation
      5. Conclusions and Next steps
  6. 6. INTRODUCTION - Static data versus RDF data streams
      [Figure: static data sources (files, DBMS, spatial information, Web APIs, Linked Data discovery) feeding an Extract, Transform, Load pipeline.]
      “Most semantic tools are focused on this static view”
  7. 7. INTRODUCTION - Static data versus RDF data streams
      RDF data streams are gaining momentum: they can be generated from any type of data stream, and they combine real-time and historical data.
      Image credits: Wilgengebroed on Flickr; Mr3641, ProtoplasmaKid and ISA Internationales Stadtbauatelier on Wikimedia Commons.
  14. 14. INTRODUCTION - Static data versus RDF data streams
      RDF streams: potentially unbounded sequences of timestamped RDF statements or graphs.
      Example: user1_observation [t1], weather1_observation [t1], user2_observation [t3], …
      [Figure: a stream over a time axis t carrying weather observations w1, w2, w3 and user observations u1, u2, u3, u4.]
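
One way to picture the definition above is as a generator that never terminates and from which a consumer takes only as many elements as it needs. The sketch below is illustrative only: the `StreamElement` type, the `ex:` names, and the one-graph-per-tick rate are assumptions made for the example, not part of ERI.

```python
from dataclasses import dataclass
from itertools import count, islice
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class StreamElement:
    """One timestamped RDF graph in the stream (hypothetical type)."""
    timestamp: int
    graph: Tuple[Triple, ...]


def observation_stream() -> Iterator[StreamElement]:
    """Unbounded stream: yields one small graph per tick, forever."""
    for t in count(1):
        subject = f"ex:observation_{t}"  # made-up identifier
        yield StreamElement(t, ((subject, "rdf:type", "ex:Observation"),))


# A consumer never materializes the whole stream; it slices what it needs.
first_three = list(islice(observation_stream(), 3))
```

Because the source is unbounded, anything that encodes or compresses it has to work incrementally, which is the requirement the following slides build on.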
  15. 15. INTRODUCTION - Motivation
      Goal: achieve efficient transmission of RDF streams, a necessary step to ensure higher throughput for RDF stream processors.
      [Figure: several stream sources feed a stream processor engine (C-SPARQL, SPARQLStream, morph-streams, CQELS Cloud, Ztreamy, …), which also draws on historic information, receives queries, and returns continuous results.]
  16. 16. INTRODUCTION – Motivation - Requirements
      Efficient transmission of RDF streams must be:
      • Streamable
      • Scalable
      • Easy (fast) to process (create and parse)
      • Compact
      • Parametrizable (several compression/time trade-offs)
  17. 17. BACKGROUND
      Requirement                      | Plain: Turtle/TriG/JSON-LD | Plain + compression (e.g. gzip) | HDT     | Streaming HDT | RDSZ    | RDF/XML + EXI | ERI
      Streamable                       | Yes                        | Yes                             | No      | Yes           | Yes     | Yes           | Yes
      Scalable                         | Limited                    | Yes                             | Yes     | No            | Yes     | Yes           | Yes
      Easy (fast) to create and parse  | Yes                        | Limited                         | Limited | Yes           | Limited | Limited       | Yes
      Compact                          | No                         | Yes                             | Yes     | Limited       | Yes     | Yes           | Yes
      Parametrizable: compression/time | No                         | Limited                         | Yes     | No            | Limited | Limited       | Yes
  24. 24. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – Basic Concepts
      • (Assumption) Most RDF streams are well structured:
        • the structure is well known by the data provider
        • the number of variations in the structure is limited
      • The Efficient RDF Interchange (ERI) format encodes the information at two levels:
        • a sliding dictionary of structures: the Structural Dictionary
        • the concrete value for each predicate
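
The slides do not say how the sliding dictionary decides which entries to drop, so the following is only a minimal sketch of the general idea: a bounded table that hands out an ID the first time it sees a key and forgets old keys once it is full. The class name, the least-recently-used eviction policy, and the monotonically increasing IDs are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Hashable, Tuple


class SlidingDictionary:
    """Bounded key -> ID table (hypothetical; LRU eviction is an assumption)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._ids: "OrderedDict[Hashable, int]" = OrderedDict()
        self._next_id = 0

    def lookup(self, key: Hashable) -> Tuple[int, bool]:
        """Return (id, is_new). A known key is refreshed, a new key gets a fresh ID."""
        if key in self._ids:
            self._ids.move_to_end(key)  # mark as most recently used
            return self._ids[key], False
        if len(self._ids) >= self.capacity:
            self._ids.popitem(last=False)  # evict the least recently used key
        new_id = self._next_id
        self._next_id += 1
        self._ids[key] = new_id
        return new_id, True


structures = SlidingDictionary(capacity=2)
temperature_id, temperature_is_new = structures.lookup("temperature")
```

The `is_new` flag matters because encoder and decoder must stay in sync: whenever it is true, the entry itself has to be sent along with its ID.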
  32. 32. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – Basic Concepts
      [Figure: the elements of the stream (w1, w2, w3, u1, u2, u3, u4 over the time axis t), each marked as a molecule, are mapped to Structural Dictionary entries ID-30 to ID-33 (labels: temperature, wind, casual user, annual pass). One entry is expanded: rdf:type weather:TemperatureObservation, ssn:observedProperty weather:AirTemperature, ex:CelsiusValue ???, with the concrete values “7.7”^^xsd:float and “9.4”^^xsd:float supplied per molecule.]
  33. 33. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – Basic Concepts
      • ERI processing model:
        • The minimal information unit is a molecule.
        • We initially restrict ourselves to subject molecules.
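
Reading a subject molecule as the group of triples that share one subject (which is how the example slides use the term), the grouping step can be sketched as below. The function name and the shortened `ex:obs1` / `ex:obs2` identifiers are made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def subject_molecules(triples: Iterable[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triples by subject, keeping first-seen order of subjects."""
    molecules: Dict[str, List[Tuple[str, str]]] = {}
    for subject, predicate, obj in triples:
        molecules.setdefault(subject, []).append((predicate, obj))
    return molecules


# Triples may arrive interleaved; grouping restores one molecule per subject.
triples = [
    ("ex:obs1", "rdf:type", "weather:TemperatureObservation"),
    ("ex:obs2", "rdf:type", "weather:TemperatureObservation"),
    ("ex:obs1", "ex:CelsiusValue", "7.7"),
    ("ex:obs2", "ex:CelsiusValue", "9.4"),
]
molecules = subject_molecules(triples)
```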
  37. 37. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – Basic Concepts
      Example: air temperature observations of the sensor “System_4UT01”. Each observation is one subject molecule:

      sens-obs:Observation_AirTemperature_4UT01_2003_3_31_6_55_00
          a weather:TemperatureObservation ;
          rdfs:label "Air temperature at 6:55:00", "Verified" ;
          om-owl:observedProperty weather:_AirTemperature ;
          om-owl:procedure sens-obs:System_4UT01 ;
          om-owl:result sens-obs:MeasureData_AirTemperature_4UT01_2003_3_31_6_55_00 ;
          om-owl:samplingTime sens-obs:Instant_2003_3_31_6_55_00 ;
          ex:CelsiusValue "7.7"^^xsd:float .

      sens-obs:Observation_AirTemperature_4UT01_2003_3_31_7_45_00
          a weather:TemperatureObservation ;
          rdfs:label "Air temperature at 7:45:00", "Not Verified" ;
          om-owl:observedProperty weather:_AirTemperature ;
          om-owl:procedure sens-obs:System_4UT01 ;
          om-owl:result sens-obs:MeasureData_AirTemperature_4UT01_2003_3_31_7_45_00 ;
          om-owl:samplingTime sens-obs:Instant_2003_3_31_7_45_00 ;
          ex:CelsiusValue "9.4"^^xsd:float .

      Both molecules share one entry in the Structural Dictionary. For each predicate, the entry gives the number of objects and, where the object is the same in every molecule, the object itself:

      Structure ID30 =
          a (1, weather:TemperatureObservation)
          rdfs:label (2)
          om-owl:observedProperty (1, weather:_AirTemperature)
          om-owl:procedure (1, sens-obs:System_4UT01)
          om-owl:result (1)
          om-owl:samplingTime (1)
          ex:CelsiusValue (1)
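
The way the two observations collapse into a single structure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the `structure_of` helper, the choice of which predicates carry a fixed object (`FIXED`), the sorted canonical form, and the shortened `sens-obs:` identifiers are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

PredicateObject = Tuple[str, str]
# One entry per predicate: (predicate, number of objects, fixed objects if any)
Structure = Tuple[Tuple[str, int, Tuple[str, ...]], ...]

# Assumption: the encoder is told which predicates keep their object in the structure.
FIXED: FrozenSet[str] = frozenset(
    {"rdf:type", "om-owl:observedProperty", "om-owl:procedure"}
)


def structure_of(molecule: Iterable[PredicateObject],
                 fixed_predicates: FrozenSet[str] = FIXED) -> Structure:
    """Reduce a molecule to its shape: predicates, object counts, fixed objects."""
    counts: Dict[str, int] = {}
    fixed_objects: Dict[str, List[str]] = {}
    for predicate, obj in molecule:
        counts[predicate] = counts.get(predicate, 0) + 1
        if predicate in fixed_predicates:
            fixed_objects.setdefault(predicate, []).append(obj)
    # Sorting gives a canonical form that does not depend on triple order.
    return tuple(
        (p, n, tuple(sorted(fixed_objects.get(p, []))))
        for p, n in sorted(counts.items())
    )


def observation(time_tag: str, clock: str, status: str, celsius: str) -> List[PredicateObject]:
    """Build one example molecule (identifiers shortened for readability)."""
    return [
        ("rdf:type", "weather:TemperatureObservation"),
        ("rdfs:label", f"Air temperature at {clock}"),
        ("rdfs:label", status),
        ("om-owl:observedProperty", "weather:_AirTemperature"),
        ("om-owl:procedure", "sens-obs:System_4UT01"),
        ("om-owl:result", f"sens-obs:MeasureData_{time_tag}"),
        ("om-owl:samplingTime", f"sens-obs:Instant_{time_tag}"),
        ("ex:CelsiusValue", celsius),
    ]


obs_0655 = observation("6_55_00", "6:55:00", "Verified", "7.7")
obs_0745 = observation("7_45_00", "7:45:00", "Not Verified", "9.4")
same_structure = structure_of(obs_0655) == structure_of(obs_0745)
```

Only the values that differ between the two molecules (labels, result, sampling time, Celsius value) remain to be transmitted once the shared structure is known.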
  38. 38. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – ERI Streams
      Based on the Efficient XML Interchange (EXI) format.
      [Figure: an ERI stream is a stream header followed by a stream body. Molecules are grouped into blocks; each block is multiplexed into channels (structural channels and value channels), and each channel is compressed separately. Each block is then sent as metadata followed by its compressed channels; the receiver decompresses and demultiplexes.]
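
Compressing each channel on its own keeps similar data together, which general-purpose compressors tend to exploit well. The sketch below shows only that step, using Python's standard `zlib` module; the channel names and the byte-level serialization of each channel are assumptions made for the example, not the ERI wire format.

```python
import zlib
from typing import Dict


def compress_channels(channels: Dict[str, bytes]) -> Dict[str, bytes]:
    """Compress every channel of a block independently."""
    return {name: zlib.compress(payload) for name, payload in channels.items()}


def decompress_channels(compressed: Dict[str, bytes]) -> Dict[str, bytes]:
    """Inverse of compress_channels."""
    return {name: zlib.decompress(payload) for name, payload in compressed.items()}


# A toy block with two channels (hypothetical names and serialization).
block = {
    "id_structures": bytes([30] * 200),  # the same structure ID, 200 times
    "values:ex:CelsiusValue": "\n".join(["7.7", "9.4"] * 100).encode("utf-8"),
}
packed = compress_channels(block)
restored = decompress_channels(packed)
```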
  39. 39. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – ERI Streams
      ERI follows an encoding procedure similar to that of the Efficient XML Interchange (EXI) format.
      – Structural channels: they encode the subjects in each block and, for each one, the structural properties of the related triples, using the dynamic dictionary of structures.
        • Main Terms of molecules: the subject of the grouping.
        • ID-Structures: the ID of the structure of each molecule in the block. The ID points to the entry in the Structural Dictionary.
        • New Structures: new entries in the Structural Dictionary.
      – Value channels: they encode the concrete data values held by each predicate in the block in a compact fashion.
        • One channel per different predicate in the block.
        • Each channel either lists explicit values or uses IDs pointing to a sliding object dictionary.
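
The split into structural and value channels described above can be sketched as a simple multiplexing step. This is a simplified illustration: the `Molecule` tuple layout and the `multiplex` function are hypothetical, structure IDs are assumed to have been assigned already, and the New Structures channel is left out.

```python
from typing import Dict, List, Tuple

# (subject, structure ID, [(predicate, object value), ...])
Molecule = Tuple[str, int, List[Tuple[str, str]]]


def multiplex(block: List[Molecule]) -> Tuple[List[str], List[int], Dict[str, List[str]]]:
    """Split a block into main terms, structure IDs, and one value channel per predicate."""
    main_terms: List[str] = []
    id_structures: List[int] = []
    value_channels: Dict[str, List[str]] = {}
    for subject, structure_id, values in block:
        main_terms.append(subject)
        id_structures.append(structure_id)
        for predicate, value in values:
            value_channels.setdefault(predicate, []).append(value)
    return main_terms, id_structures, value_channels


block = [
    ("sens-obs:Observation_6_55_00", 30,
     [("rdfs:label", "Air temperature at 6:55:00"), ("rdfs:label", "Verified"),
      ("ex:CelsiusValue", "7.7")]),
    ("sens-obs:Observation_7_45_00", 30,
     [("rdfs:label", "Air temperature at 7:45:00"), ("rdfs:label", "Not Verified"),
      ("ex:CelsiusValue", "9.4")]),
]
main_terms, id_structures, value_channels = multiplex(block)
```

Since the structure says how many objects each predicate has, a decoder can walk the channels in the same order and reassemble the molecules without any per-value delimiters.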
  49. 49. EFFICIENT RDF INTERCHANGE (ERI) FORMAT – Practical Deployment
      [Figure: channel layout for the two example molecules, annotated with the compression that could be applied to each channel (prefix compression, differential encoding, Zlib, Snappy, …).]
      Structural channels:
        • Main Terms of Molecules [strings]: … sens-obs:Observation_AirTemperature...55_00, sens-obs:Observation_AirTemperature...45_00 …
        • ID-Structures [IDs of structures]: … 30 30 …
        • New Structure Marker [bits]: … 1 0 …
        • New Structures [encoded structures]: ID-pred1 (1, weather:TemperatureObservation), ID-pred2 (2), ID-pred3 (1, weather:_AirTemperature), ID-pred4 (1, sens-obs:System_4UT01), ID-pred5 (1), ID-pred6 (1), ID-pred7 (1)
        • New Predicates [strings]: … om-owl:samplingTime, ex:CelsiusValue …
      Value channels:
        • ID-pred2 [object values; meta: strings], an explicit list of values: … Air temperature at 6:55:00, Verified, Air temperature at 7:45:00, Not Verified …
        • ID-pred5 [term IDs; meta: IDs], IDs pointing to a sliding object dictionary: … 101 245 …, with New Object Marker bits … 0 1 …
        • ID-pred6 [term IDs; meta: IDs]: 1 2 …, with New Object Marker bits 1 1 …
        • New Terms [strings]: … sens-obs:MeasureData_Air…55_00, sens-obs:Instant_2003…55_00, sens-obs:MeasureData_Air…45_00, sens-obs:Instant_2003…55_00 …
        • ID-pred7 [object values; meta: xsd:float], with the datatype extracted into the channel metadata: … 7.7 9.4 …
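
For value channels that carry term IDs, the figure pairs each ID with a "new object" marker bit and ships unseen terms in a separate strings channel. A minimal sketch of that idea follows. It is not the ERI encoder: the function names are hypothetical, IDs are simply assigned in order of first appearance, and the dictionary is unbounded here (no sliding or eviction).

```python
from typing import Dict, List, Tuple


def encode_with_dictionary(values: List[str],
                           dictionary: Dict[str, int]) -> Tuple[List[int], List[int], List[str]]:
    """Return (term IDs, new-object marker bits, new terms) for one value channel."""
    ids: List[int] = []
    markers: List[int] = []
    new_terms: List[str] = []
    for value in values:
        if value in dictionary:
            markers.append(0)  # already known to the decoder
        else:
            dictionary[value] = len(dictionary) + 1  # next free ID
            markers.append(1)  # decoder must read it from new_terms
            new_terms.append(value)
        ids.append(dictionary[value])
    return ids, markers, new_terms


def decode_with_dictionary(ids: List[int], markers: List[int], new_terms: List[str],
                           table: Dict[int, str]) -> List[str]:
    """Rebuild the values, learning new terms as their markers announce them."""
    fresh = iter(new_terms)
    values: List[str] = []
    for term_id, marker in zip(ids, markers):
        if marker == 1:
            table[term_id] = next(fresh)
        values.append(table[term_id])
    return values


encoder_state: Dict[str, int] = {}
decoder_state: Dict[int, str] = {}
sampling_times = ["sens-obs:Instant_6_55_00", "sens-obs:Instant_7_45_00"]
ids, markers, new_terms = encode_with_dictionary(sampling_times, encoder_state)
decoded = decode_with_dictionary(ids, markers, new_terms, decoder_state)
```

With an empty dictionary both example terms are new, so the channel carries IDs 1 and 2 with marker bits 1 and 1; a later block that repeats a term would send only its ID and a 0 bit.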
  50. 50. Outline 1. Introduction & Motivation 2. Background 3. Efficient RDF Interchange (ERI) Format i. Basic Concepts ii. ERI Streams iii. Practical Deployment 4. Evaluation 5. Conclusions and Next Steps
  51. 51. EVALUATION - COMPRESSION
  52. 52. EVALUATION - COMPRESSION ERI excels in space for streaming and statistical datasets.
  53. 53. EVALUATION - COMPRESSION ERI excels in space for streaming and statistical datasets. RDSZ remains comparable to our approach.
  54. 54. EVALUATION - COMPRESSION ERI excels in space for streaming and statistical datasets. RDSZ remains comparable to our approach. The object dictionary can add overhead to the representation, although it always obtains comparable compression ratios.
  55. 55. EVALUATION - COMPRESSION
  56. 56. EVALUATION - COMPRESSION The smaller buffer in ERI-1k slightly affects compression efficiency.
  57. 57. EVALUATION - PARSING
  58. 58. EVALUATION - PARSING ERI always compresses faster than RDSZ (3 and 3.8 times faster on average for ERI-4k and ERI-4k-Nodict, respectively).
  59. 59. EVALUATION - PARSING ERI always compresses faster than RDSZ (3 and 3.8 times faster on average for ERI-4k and ERI-4k-Nodict, respectively). ERI decompression is usually slower (1.4 times on average in both ERI configurations), typically because several channels have to be decompressed.
  60. 60. EVALUATION - PARSING ERI always compresses faster than RDSZ (3 and 3.8 times faster on average for ERI-4k and ERI-4k-Nodict, respectively). ERI decompression is usually slower (1.4 times on average in both ERI configurations), typically because several channels have to be decompressed. Channels could be grouped (as in EXI).
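The slide does not say how channels would be grouped. As a rough illustration of the idea only, the sketch below contrasts compressing each channel on its own with framing all channels into a single zlib stream, so that the receiver runs one decompression instead of one per channel. The 4-byte length prefix and the function names are assumptions of this sketch, not part of ERI or EXI.

```python
import struct
import zlib


def compress_separately(channels):
    """One zlib stream per channel: the receiver decompresses each one."""
    return [zlib.compress(channel) for channel in channels]


def compress_grouped(channels):
    """Frame every channel with a 4-byte big-endian length prefix
    (an assumption of this sketch), then compress the whole group once."""
    framed = b"".join(struct.pack(">I", len(c)) + c for c in channels)
    return zlib.compress(framed)


def decompress_grouped(blob):
    """Single decompression, then split the channels back out by length."""
    framed = zlib.decompress(blob)
    channels, offset = [], 0
    while offset < len(framed):
        (length,) = struct.unpack_from(">I", framed, offset)
        offset += 4
        channels.append(framed[offset:offset + length])
        offset += length
    return channels
```

Whether grouping actually pays off depends on channel sizes and contents, so it would have to be measured on real streams.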
  61. 61. EVALUATION – CONSUMING SCENARIO In parsing: transmission + decompression.
  62. 62. EVALUATION – CONSUMING SCENARIO In parsing: transmission + decompression. ERI-4k and ERI-4k-Nodict outperform the baseline in transmission + decompression, except for datasets with fewer regularities in the structure or the data values.
  63. 63. EVALUATION – CONSUMING SCENARIO In a scenario in which we include the compression time:
  64. 64. EVALUATION – CONSUMING SCENARIO In a scenario in which we include the compression time: ERI-4k incurs an expected overhead, since the time to process the information is always included.
  65. 65. EVALUATION – CONSUMING SCENARIO In a scenario in which we include the compression time: ERI-4k incurs an expected overhead, since the time to process the information is always included. The time at which the client has received all data in ERI is comparable to the baseline.
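The two consuming scenarios differ only in whether compression time is counted. A minimal sketch of that accounting follows; the function names and any numbers passed to them are hypothetical and are not measurements from the evaluation.

```python
def transmission_seconds(size_bytes, bandwidth_bits_per_s):
    """Time to send size_bytes over a link of the given bandwidth."""
    return size_bytes * 8 / bandwidth_bits_per_s


def consuming_seconds(size_bytes, bandwidth_bits_per_s,
                      decompress_s, compress_s=0.0):
    """Consuming time: transmission + decompression, plus compression
    time when the scenario includes it (compress_s defaults to 0)."""
    transfer = transmission_seconds(size_bytes, bandwidth_bits_per_s)
    return compress_s + transfer + decompress_s
```

With made-up inputs of 1,000,000 bytes over an 8 Mbit/s link and 0.5 s of decompression, `consuming_seconds(1_000_000, 8_000_000, 0.5)` gives 1.5 s; passing `compress_s=0.25` raises it to 1.75 s. A compressed format wins the first scenario whenever the transmission time saved by the smaller payload exceeds its decompression time.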
  66. 66. Results • Compressed, efficient RDF interchange (ERI) format • Exploits the regularity of RDF data streams in their structure and data values • Flexible and extensible ERI configurations • Minimizes transmission costs in RDF stream processing • State-of-the-art compression • Remains efficient in performance • Time overheads are relatively low and acceptable in many scenarios
  67. 67. Next steps • Integration within RDF streaming engines • e.g. morph-streams, CQELS Cloud • Three purposes: • scaling to higher input data rates • minimizing the data exchange among processing nodes • serving a small set of operators on the compressed data • Parallel compression/decompression • preliminary proposal on Storm • Align the proposal with the results of the W3C RSP group regarding stream modeling and serialization
  68. 68. Efficient RDF Interchange (ERI) Format for RDF Data Streams. Javier D. Fernández, Alejandro Llaves, Oscar Corcho. Ontology Engineering Group (OEG), Universidad Politécnica de Madrid, Spain. Research object: purl.org/net/ro-eri-ISWC14
