A Slightly Different Web of Data
Rinke Hoekstra

The Data2Semantics project (COMMIT P23) is all about enriching research data and making it more reusable for future research. Using Linked Data for this task is a fairly obvious step to take (surprise!). However, current practices in publishing Linked Data have several shortcomings that call for a slightly different approach, one that (hopefully) bridges a gap between Web 2.0 and Web 3.0. I will present a proof-of-concept service (Linkitup) that works on top of existing scientific data repositories and allows individual researchers to enrich their data with additional (linked) metadata.

  • May 2007 (12 sets) to September 2011 (295 sets). Today: 326 sets.
  • Analysis of all Linked Data listed in CKAN: Is the endpoint "up"? Does a data dump exist?
  • 1.3 billion triples, 54% of datasets "down". Advertised: 40 billion triples. (CKAN API, Datahub.org)
  • STALE/INCONSISTENT
      * Huge dataset: 27 datasets, 4.65B triples, 1B entities
      * UMLS, LinkedCT, Sider, DailyMed, DrugBank etc.
      * (Size due to UniProt & PubMed)
      * Latest version is from July 2011; that's already old!
      * DrugBank (FU Berlin): 4772 drugs
      * DrugBank (Live): 6708 drugs
  • Transcript of "A Slightly Different Web of Data"

    1. A Slightly Different Web of Data. Rinke Hoekstra
    2-5. Data2Semantics: From Data to Semantics for Scientific Data Publishers. EASY Data Repository; enrich datasets: census data; large volumes of publications; improve services to clients; automated services; build systems for hospitals
    6. The Research Cycle [Diagram. Recoverable labels: Cloud; acquiring data from text?; semi-automatic annotation; semi-automatic conversion ("tablinker"); RDF conversion; RDF cleaning; internal linking; link to other data; querying; graph rewriting and ranking; analysis/metrics; visualization; user interfaces; provenance; enrichment; feedback. Tools named: GATE, Amalgame, SILK, OpenCalais, xml2rdf, d2rq, rdb2rdf, sgvizler, AIDA Browser, Poseidon (Pirates/Maps).]
    7-8. Challenges
        • Build useful services and tools for data publishers ...
        • ... that maintain provenance information ...
        • ... and cater for the entire research cycle ...
        • ... including a feedback loop to new research
    9. “Can we pull this off with Linked Data?”
    10. Publishing Linked Data (image: Linked Data Rubik’s Cube by Duncan Hull)
    11. Linked Data Publishing
        • RDF Annotations / Microdata
        • RDF dump file (gzipped) at some URL
        • Self-maintained triple store
        • Database with mappings to RDF (D2RQ/RDB2RDF)
        • Professionally maintained triple store with multiple datasets
    12. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
    13. [Diagram: Linking Open Data cloud as of September 2011, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/ Legend: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences.]
    14. [Chart: y-axis from 0 to 400; x-axis dates 1 May 2007, 8 Oct 2007, 7 Nov 2007, 10 Nov 2007, 28 Feb 2008, 31 Mar 2008, 18 Sep 2008, 5 Mar 2009, 27 Mar 2009, 14 Jul 2009, 22 Sep 2010, 19 Sep 2011, 23 Feb 2012.]
    15. LODStats: http://stats.lod2.eu
    16. 40.745.554.078 Triples!
    17. 40.745.554.078 Triples! (1.75 Billion)
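
To put the two figures on the last slide side by side, the ratio can be worked out in a few lines of Python. This is only a sketch: it assumes that 40.745.554.078 (written with dots as thousands separators) is the advertised triple count and that "(1.75 Billion)" is the number of triples that could actually be retrieved. The slide itself does not spell this out, and the variable names are illustrative.

```python
def parse_grouped_int(text: str) -> int:
    """Parse an integer written with '.' as the thousands separator."""
    return int(text.replace(".", ""))


# Figure as it appears on the slide.
advertised = parse_grouped_int("40.745.554.078")

# Assumption: "(1.75 Billion)" is the retrievable triple count.
retrieved = 1_750_000_000

# Fraction of the advertised triples that were retrievable.
share = retrieved / advertised

print(f"advertised triples: {advertised:,}")
print(f"retrievable share:  {share:.1%}")
```

Under that reading, only a few percent of the advertised triples were reachable.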
