Linked Open Data - State of the Art, Challenges and Applications
Published in: Technology, Education

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
142
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
4
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. 15/03/2013 Research Skills, MSc ITEC Rui M. Vieira
       Linked Open Data: State of the art, challenges and applications
       [Image: part of the Linking Open Data (LOD) Project Cloud Diagram]
  • 2. Open Data
       ● What is Open Data?
         – Non-proprietary and standard format
         – Machine computable
         – Non-discriminatory
  • 3. What is Linked Open Data
       ● Open Data distributed over the network
       ● Standardised format
         – Resource Description Framework (RDF) as data model
         – SPARQL for queries
       ● RDF “triples” serialised as
         – XML, N3 or many other standards
       ● Semantics and ontologies
       ● Architecture similar to WWW
         – Hypertext Transfer Protocol (HTTP)
         – Domain Name System (DNS)
         – Uniform Resource Identifiers (URIs)
       N3 RDF example:
         @prefix a: <http://example.com/2013/02/A#> .
         @prefix b: <http://example.net/2013/02/B#> .
         @prefix eg: <http://example.com/> .
         eg:sharedContent a:externalContent <http://somewhere/RDF/ont.owl> .
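The triple model on this slide can be illustrated with a minimal Python sketch. It is not a SPARQL engine: it only mimics a single triple pattern, with `None` playing the role of a query variable. The one triple is the statement from the N3 example with its prefixes expanded by hand; the names `TRIPLES` and `match` are hypothetical helpers, not part of any RDF library.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# The statement from the N3 example, with the eg: and a: prefixes expanded.
TRIPLES = [
    (
        "http://example.com/sharedContent",
        "http://example.com/2013/02/A#externalContent",
        "http://somewhere/RDF/ont.owl",
    ),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Roughly: SELECT ?s ?o WHERE { ?s a:externalContent ?o }
found = list(match(TRIPLES, p="http://example.com/2013/02/A#externalContent"))
```

A real consumer would use an RDF library and a SPARQL endpoint; the sketch only shows that a query is, at its core, pattern matching over triples.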
  • 4. Linked Open Data
       [Diagram: a data consumer reaches three linked data sources over HTTP]
       ● Newcastle City Council, pollution data: City: Newcastle; Country: UK; Values: …; Units; Measurements: …; Compound: …
       ● UK government, government data: City: Newcastle; Population: …; Lat: …; Long: …
       ● World Health Organisation, exposure levels: Compound; Max Levels: …; (more data)
  • 7. Issues
       ● Data quality
         – Syntax errors, invalid semantics
       ● Scalability
         – Unavailability, unresponsiveness
       ● Security
         – Trust, encryption
       ● Currency
         – Temporal context, versioning
       ● Aggregation
         – Co-referencing, source
  • 9. Data Quality
       ● Tim Berners-Lee LOD principles
       ● Syntactically valid
         – Several validators
           ● RDFAlerts, Vapour
         – Frameworks to build your own tools
           ● Redland RDF (C library for building RDF parsers)
         – Allow conformance metrics to be gathered
           ● Dataset scoring
       [Diagram labels: validator, data, metrics]
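As an illustration of gathering a conformance metric, here is a minimal Python sketch. It is not how RDFAlerts, Vapour or Redland work: it checks lines against a deliberately simplified N-Triples-like pattern (IRI, IRI, then IRI or plain quoted literal, then a full stop) and reports the fraction that pass. The predicate IRIs in the sample are hypothetical.

```python
import re

# Simplified grammar: no blank nodes, language tags or datatypes.
IRI = r'<[^<>"\s]+>'
LITERAL = r'"(?:[^"\\]|\\.)*"'
LINE = re.compile(rf'^\s*{IRI}\s+{IRI}\s+(?:{IRI}|{LITERAL})\s*\.\s*$')


def conformance(lines):
    """Return the fraction of statement lines that match the simplified pattern."""
    statements = [
        ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")
    ]
    if not statements:
        return 1.0  # nothing to check, so nothing is invalid
    valid = sum(1 for ln in statements if LINE.match(ln))
    return valid / len(statements)


sample = [
    '<http://example.com/People/A> <http://example.com/name> "A" .',
    '<http://example.com/People/A> <http://example.com/knows> <http://example.net/2013/Staff/1203> .',
    '<http://example.com/People/A> <http://example.com/name> "unterminated .',
    'http://example.com/People/A <http://example.com/name> "A" .',
]
score = conformance(sample)  # two of the four lines are well formed
```

A score like this could feed a dataset ranking, although a production validator would parse the full grammar instead of relying on a regular expression.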
  • 10. Data Quality
        ● Semantically valid
          – Use standard vocabularies and ontologies when possible
          – Use domain-specific vocabularies:
            ● e.g. SensorML for data-in-motion
            ● A vast number exist for medicine, bioinformatics, etc.
            ● Linked Open Vocabularies database
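One simple way to act on the "use standard vocabularies" advice is to flag predicates that fall outside a list of recognised namespaces. The Python sketch below assumes such a list; the five namespaces chosen (RDF, RDFS, OWL, FOAF, Dublin Core terms) are an arbitrary illustration and do not come from the slides, and `unknown_predicates` is a hypothetical helper.

```python
# Namespaces of a few widely used vocabularies (illustrative selection only).
KNOWN_NAMESPACES = (
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://www.w3.org/2000/01/rdf-schema#",
    "http://www.w3.org/2002/07/owl#",
    "http://xmlns.com/foaf/0.1/",
    "http://purl.org/dc/terms/",
)


def unknown_predicates(predicates):
    """Return the sorted, de-duplicated predicates outside the known namespaces."""
    return sorted({p for p in predicates if not p.startswith(KNOWN_NAMESPACES)})


predicates = [
    "http://xmlns.com/foaf/0.1/name",
    "http://purl.org/dc/terms/created",
    "http://example.com/2013/02/A#externalContent",
]
flagged = unknown_predicates(predicates)
```

A flagged predicate is not necessarily wrong; it is a prompt to check whether a standard or domain-specific vocabulary already covers the same concept.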
  • 11. Scalability
        ● Unavailable or unresponsive providers
          – A non-dereferenceable resource is missing data
          – LOD uses HTTP and WS
            ● HTTP status codes can convey more information
          – Redirect to other resources
            ● Caching
            ● Dataset dumps
          – Use analytics to provide useful information
            ● Profile servers
            ● Tune servers
            ● Plan scalability
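How a consumer might use HTTP status codes when dereferencing a resource can be sketched in Python as a pure decision function. The action names ("use", "follow", "retry", "missing", "fallback") and the function `plan` are hypothetical; no network request is made, so the caller is assumed to supply the status code and response headers.

```python
from http import HTTPStatus
from typing import Mapping, Optional, Tuple

REDIRECTS = {
    HTTPStatus.MOVED_PERMANENTLY,   # 301
    HTTPStatus.FOUND,               # 302
    HTTPStatus.SEE_OTHER,           # 303
    HTTPStatus.TEMPORARY_REDIRECT,  # 307
    HTTPStatus.PERMANENT_REDIRECT,  # 308
}


def plan(status: int, headers: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Map a response to (action, detail) for a Linked Data consumer."""
    if status == HTTPStatus.OK:
        return ("use", None)
    if status in REDIRECTS:
        # Follow the provider's pointer, e.g. to a cache or a dataset dump.
        return ("follow", headers.get("Location"))
    if status in (HTTPStatus.NOT_FOUND, HTTPStatus.GONE):
        # The resource cannot be dereferenced: treat its data as missing.
        return ("missing", None)
    if status in (HTTPStatus.TOO_MANY_REQUESTS, HTTPStatus.SERVICE_UNAVAILABLE):
        # The provider is overloaded; it may say when to try again.
        return ("retry", headers.get("Retry-After"))
    # Anything else: fall back to a local cache or a dataset dump.
    return ("fallback", None)


decision = plan(303, {"Location": "http://example.com/doc/A"})
```

Keeping the decision separate from the request makes the policy easy to test and to tune as analytics reveal how providers actually behave.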
  • 12. Security
        ● Trust
          – Community scoring of datasets and providers
          – Provenance
            ● Watermarking
          – All attributes can be in meta-data
        ● Encrypted channels
          – SSL, certificates to ensure provenance
        [Diagram labels: consumer, trusted provider, "Trusted?", "Trusted?"]
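The idea that provenance attributes can travel in meta-data can be sketched with a content digest. This is a swapped-in technique, not the watermarking or certificate schemes named on the slide: it assumes a trusted provider publishes a SHA-256 digest alongside the dataset, and the `publisher` and `sha256` field names are hypothetical.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the hex SHA-256 digest of a serialised dataset."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, metadata: dict) -> bool:
    """Check the data against the digest recorded in its meta-data."""
    expected = metadata.get("sha256", "")
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(fingerprint(data), expected)


payload = b"<http://example.com/People/A> <http://example.com/name> \"A\" .\n"
metadata = {"publisher": "http://example.com/", "sha256": fingerprint(payload)}

intact = verify(payload, metadata)
tampered = verify(payload + b"# extra", metadata)
```

A digest only shows that the data match what the meta-data describe; trusting the meta-data themselves still requires an authenticated channel or a signature.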
  • 13. Currency
        ● Temporal context for data
          – Facebook hits/day: average? Specific day?
        ● Is data versioned?
          – Are we mixing old and new data?
        ● Solutions:
          – Currency meta-data
            ● At the statement level, not just dataset level
          – Specific ontologies for currency (OntoCurrency)
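Statement-level currency meta-data can be sketched in Python by attaching an observation date to each statement and selecting the newest value known at a given time. This does not use OntoCurrency; the `Statement` class, the `latest` helper, the IRIs and the values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    observed: date  # currency meta-data carried by the statement itself


def latest(statements, as_of):
    """Return the newest statement per (subject, predicate) observed by as_of."""
    best = {}
    for st in statements:
        if st.observed > as_of:
            continue  # not yet known at the requested time
        key = (st.subject, st.predicate)
        if key not in best or st.observed > best[key].observed:
            best[key] = st
    return best


SITE = "http://example.com/site"
HITS = "http://example.com/hitsPerDay"
statements = [
    Statement(SITE, HITS, "100", date(2013, 1, 1)),  # hypothetical values
    Statement(SITE, HITS, "250", date(2013, 3, 1)),
]

february_view = latest(statements, date(2013, 2, 1))[(SITE, HITS)].obj
march_view = latest(statements, date(2013, 3, 15))[(SITE, HITS)].obj
```

With a date on every statement, a consumer can avoid mixing old and new data even when a dataset as a whole carries a single publication date.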
  • 14. Aggregation
        ● Co-reference
          – Discovering co-referents
          – Resolution of co-referents
        ● Multiple sources
          – Discovery
          – SPARQL over remote datasets
        ● Software solutions
          – Semantic Web Client Library
            ● Handles dereferencing and aggregation
          – DARQ
            ● Multi-source SPARQL queries
        [Diagram labels: example.net, example.com, 1203, A, P, consumer, "Person A?", http://example.com/People/A, http://example.net/2013/Staff/1203]
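Co-reference resolution can be sketched in Python with a union-find structure over `owl:sameAs` links. This is one possible approach, not the method used by the Semantic Web Client Library or DARQ. The two URIs are the ones in the slide's diagram; the `name` and `dept` predicates and their values are hypothetical.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def find(parent, x):
    """Return the canonical representative of x, shortening paths on the way."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups of a and b; the lexicographically smaller URI wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def canonicalise(triples):
    """Rewrite co-referent URIs to one canonical URI and drop sameAs links."""
    parent = {}
    for s, p, o in triples:
        if p == SAME_AS:
            union(parent, s, o)
    merged = set()
    for s, p, o in triples:
        if p != SAME_AS:
            # Only rewrite objects already known as URIs, so literals stay as-is.
            merged.add((find(parent, s), p, find(parent, o) if o in parent else o))
    return merged


A = "http://example.com/People/A"
B = "http://example.net/2013/Staff/1203"
triples = [
    (A, SAME_AS, B),
    (A, "http://example.com/name", "A"),      # hypothetical statement
    (B, "http://example.net/dept", "ITEC"),   # hypothetical statement
]
merged = canonicalise(triples)
```

After merging, statements from both providers hang off a single URI, so one query answers "Person A?" across the two sources.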
  • 15. Applications
        ● Open Government
        ● The Web of Things
          – Massive real-time structured sensor data
        ● The DataHub
        ● Data.gov
        ● PublicData.eu
        ● Association mining
        ● Intelligent recommendation systems
  • 16. Conclusion
        ● LOD is becoming a preferred solution for data providers
        ● Obstacles remain to the global, machine-computable, linked Semantic Web
        ● No integrated solution deals with all the issues
        ● But solutions exist
          – To deal with specific areas
          – To build new tools
  • 17. Questions?
        [Diagram: RDF-style graph with labels slide:question, en:noun, ens:part, slide:enquiry, owl:sameAs, speech:property, rdfs:type]