EDF2013: Data Science Curriculum: Barry Norton: Big Linked Data

Data Science Curriculum: Barry Norton, Ontotext AD, at the European Data Forum 2013, 10 April 2013 in Dublin, Ireland: Big Linked Data

Published in: Education, Technology

  1. Big Linked Data. Presented by: Barry Norton, Ontotext AD
  2. Aims for curriculum: 1. Show realistic solutions 2. Use real data 3. Use real tools 4. Show scalable solutions 5. Eat the dog's food. EUCLID – a Curriculum for Big Linked Data
  3. Realistic Solution [architecture diagram; labels as extracted]: Application, Analysis & Visualization, RDFa Module, Mining Module, Access, SPARQL Endpoint, Publishing, LD Dataset, Vocabulary, Integrated, Interlinking, Cleansing, Mapping, Dataset, Physical Wrapper, LD Wrapper, R2R Transf., LD Wrapper, Data acquisition, RDF/XML, Streaming providers, Downloads, Musical Content, Metadata, Other content
  4. Real Data • No pizza. EUCLID - Providing Linked Data
  5. Real Data • No pizza • No wine
  6. Real Data • No pizza • No wine • No Protégé
  7. Real Data • MusicBrainz dataset: • Music Ontology:
  8. Real Tools • Admittedly simple start • Industry-strength by Module 2
  9. Real Tools • All tools explained by screencast
  10. Real Tools • All tools explained by screencast (explains how Exercise 1 was created)
  11. Real Tools • The data of interest may be stored in a wide range of formats: spreadsheets, databases, text or tabular data • Several tools support the process of mining data from different repositories, for example: R2RML
  12. Scalable Solutions • MusicBrainz RDF derived via R2RML: 300M triples
      # One triple per (artist, band) membership row.
      lb:artist_member a rr:TriplesMap ;
        rr:logicalTable [rr:sqlQuery
          """SELECT a1.gid, a2.gid AS band
             FROM artist a1
               INNER JOIN l_artist_artist ON a1.id = l_artist_artist.entity0
               INNER JOIN link ON l_artist_artist.link = link.id
               INNER JOIN link_type ON link.link_type = link_type.id
               INNER JOIN artist a2 ON l_artist_artist.entity1 = a2.id
             WHERE link_type.gid = '5be4c609-9afa-4ea0-910b-12ffb71e3821'"""] ;
        # Subject: the member artist's IRI, built from its gid.
        rr:subjectMap [rr:template "http://musicbrainz.org/artist/{gid}#_"] ;
        # Object: the band's IRI, linked with mo:member_of.
        rr:predicateObjectMap
          [rr:predicate mo:member_of ;
           rr:objectMap [rr:template "http://musicbrainz.org/artist/{band}#_" ;
                         rr:termType rr:IRI]] .
  13. Dog Food • EUCLID output, topics and engagement monitored (public-lod, public-vocabs, semantic-web)
  14. Dog Food • EUCLID output, topics and engagement monitored (public-lod, public-vocabs, semantic-web) • Offered as public SPARQL endpoint
  15. Dog Food • EUCLID output, topics and engagement monitored (public-lod, public-vocabs, semantic-web) • Offered as public SPARQL endpoint • Will be used as basis of analysis examples
  16. Results • Achieving ~100 live viewers: • Set to exceed 1000 post hoc views per channel (webinar platform, Vimeo, SlideShare):
  17. For exercises, quiz and further material visit our website: http://www.euclid-project.eu (eBook, Course). Other channels: @euclid_project, EUCLID project, EUCLIDproject
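
To make the R2RML mapping on slide 12 more concrete, here is a minimal Python sketch of what its subject and object templates do to each row returned by the SQL query. It is not the EUCLID tooling or a real R2RML processor: the function names and the sample row are hypothetical, the gids are made up, and percent-encoding of template values (which R2RML requires for IRI-unsafe characters) is skipped because UUIDs do not need it. The mo: prefix is assumed to be the Music Ontology namespace http://purl.org/ontology/mo/.

```python
import re

# Assumed expansion of the mo: prefix (Music Ontology namespace).
MO = "http://purl.org/ontology/mo/"

# The two rr:template strings from the mapping on slide 12.
SUBJECT_TEMPLATE = "http://musicbrainz.org/artist/{gid}#_"
OBJECT_TEMPLATE = "http://musicbrainz.org/artist/{band}#_"


def expand(template, row):
    """Replace each {column} placeholder with that column's value in the row."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def member_of_triples(rows):
    """Yield one N-Triples line per row of the logical table."""
    for row in rows:
        subject = expand(SUBJECT_TEMPLATE, row)
        obj = expand(OBJECT_TEMPLATE, row)
        yield f"<{subject}> <{MO}member_of> <{obj}> ."


# Hypothetical row standing in for the SQL result; both gids are invented.
rows = [
    {
        "gid": "00000000-0000-0000-0000-000000000001",
        "band": "00000000-0000-0000-0000-0000000000aa",
    },
]

triples = list(member_of_triples(rows))
for line in triples:
    print(line)
```

Each result row yields exactly one mo:member_of triple, so the size of the output grows linearly with the number of rows the query returns.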
