"Biblioteca in cerca di alleati. Oltre la condivisione, verso nuove strategie" (A library in search of allies: beyond sharing, toward new strategies).
Stelline, Milan, 15 March 2013.
Session: "Biblioteche, Archivi, Musei: la convergenza possibile" (Libraries, Archives, Museums: the possible convergence).
"L'integrazione tra archivi e biblioteche alla prova del web semantico" (The integration of archives and libraries put to the test of the semantic web).
Talk by Salvatore Vassallo, PhD in Bibliographic Sciences.
Programme: convegnostelline.it/sessione.php?IdUnivoco=7
Photographs: picasaweb.google.com/117290793877692021380/ConvegnoStelline2013?authuser=0&feat=directlink
Live streaming recorded with Livestream Procaster by Sergio Primo Del Bello, ANAI Lombardia.
Georgi Kobilarov presented on the status and future of DBpedia. DBpedia extracts structured data from Wikipedia and makes it available as linked open data. Current challenges include improving data quality, handling live Wikipedia updates, adding other data sources, and developing a new approach for infobox extraction using a domain-specific ontology. The vision is for DBpedia to become the Wikipedia of structured data and enable users and applications to access and query this data without having to understand its technical implementation.
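As a minimal illustration of that last point, the sketch below queries the public DBpedia SPARQL endpoint from Python with the SPARQLWrapper library; the endpoint URL is the standard public one, and the example resource and property (dbr:Alan_Turing, dbo:birthPlace) are chosen purely for illustration.

```python
# Minimal sketch: querying DBpedia's structured data without touching
# its internals, assuming the public endpoint at https://dbpedia.org/sparql
# and the SPARQLWrapper library (pip install SPARQLWrapper).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?place WHERE {
        dbr:Alan_Turing dbo:birthPlace ?place .
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    # Each binding maps a SELECT variable to a value dictionary.
    print(binding["place"]["value"])
```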
The document discusses DBpedia, an effort to extract structured information from Wikipedia and make it available as linked open data. It provides details on DBpedia's internationalization, with datasets now available in over 190 languages. Statistics are presented on the mapping efforts for different language versions. The document also mentions current work related to quality analysis of DBpedia data and integrating Wikidata.
The slide set used for an introduction/tutorial on DBpedia use cases, concepts, and implementation aspects, held during the DBpedia community meeting in Dublin on 9 February 2015.
(Slide creators: M. Ackermann, M. Freudenberg; additional presenter: Ali Ismayilov.)
Kick-off seminar of the largest Wikimedia IEG of the 2015 round 2 call, held in conjunction with Wikipedia's 15th birthday.
Project page: https://meta.wikimedia.org/wiki/Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
Unsupervised Learning of an Extensive and Usable Taxonomy for DBpedia, by Marco Fossati
Talk given by fellow Claus Stadler at the 11th International Conference on Semantic Systems - SEMANTiCS 2015
Paper available here: http://jens-lehmann.org/files/2015/semantics_dbtax.pdf
This document outlines a Google Summer of Code project to teach machines to extract facts from Wikipedia articles by using machine learning and lexical semantics. It discusses extracting lexical units through part-of-speech tagging and statistical ranking, classifying frames and frame elements in an unsupervised or supervised manner, constructing a crowdsourced training set, and serializing the extracted facts into RDF triples for inclusion in DBpedia to discover new relations and populate the knowledge base automatically. The approach is demonstrated on soccer domain articles from the Italian Wikipedia.
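As a rough, hedged sketch of the two ends of such a pipeline (not the project's actual code), the snippet below POS-tags an example sentence with NLTK to surface candidate lexical units, then serializes one hypothetical extracted fact as an RDF triple with rdflib; the sentence, namespaces, and fact are illustrative assumptions.

```python
# Illustrative sketch only: POS tagging plus RDF serialization.
# Assumes NLTK (with the 'punkt' and 'averaged_perceptron_tagger'
# data packages downloaded) and rdflib are installed.
import nltk
from rdflib import Graph, Namespace

sentence = "Baggio scored a penalty for Italy."
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)  # e.g. [('Baggio', 'NNP'), ('scored', 'VBD'), ...]

# Verbs are candidate lexical units that may evoke frames; the
# statistical ranking step is out of scope for this sketch.
lexical_units = [tok for tok, tag in tagged if tag.startswith("VB")]
print("Candidate lexical units:", lexical_units)

# Serialize one hypothetical extracted fact as an RDF triple.
DBR = Namespace("http://dbpedia.org/resource/")
DBO = Namespace("http://dbpedia.org/ontology/")
g = Graph()
g.bind("dbr", DBR)
g.bind("dbo", DBO)
g.add((DBR["Roberto_Baggio"], DBO["team"], DBR["Italy_national_football_team"]))
print(g.serialize(format="turtle"))
```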
The document discusses using Linked Open Data from DBpedia to help with Unicode localization interoperability (ULI). DBpedia extracts structured data from Wikipedia and makes it available as Linked Data. It describes how ULI aims to standardize localization data exchange between tools. DBpedia data on abbreviations in over 100 languages was extracted and evaluated, finding it could help improve text segmentation precision and recall. The extracted data is being considered for inclusion in the Common Locale Data Repository (CLDR) to further standardization efforts.
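To make the text-segmentation use case concrete, here is a small sketch (not the evaluated system) of feeding an abbreviation list, such as one extracted from DBpedia, to NLTK's Punkt sentence tokenizer so that abbreviations stop triggering false sentence breaks; the abbreviation set below is a toy assumption.

```python
# Minimal sketch: abbreviation-aware sentence segmentation with NLTK's
# Punkt tokenizer. The abbreviation set stands in for a list extracted
# from DBpedia; it is a toy assumption, not the evaluated dataset.
from nltk.tokenize.punkt import PunktParameters, PunktSentenceTokenizer

params = PunktParameters()
params.abbrev_types = {"dr", "prof", "etc"}  # lowercase, without the trailing dot

tokenizer = PunktSentenceTokenizer(params)
text = "Dr. Rossi met Prof. Bianchi. They discussed CLDR."
for sentence in tokenizer.tokenize(text):
    print(sentence)
# Without the abbreviation list, "Dr." and "Prof." would be misread
# as sentence boundaries, hurting precision and recall.
```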
DBpedia: Glue for all Wikipedias and a Use Case for Multilingualism, by Marco Fossati
DBpedia extracts structured data from Wikipedia to create a multilingual linked open data cloud. Its language-specific chapters map data in different languages to a common structure, which enables multilingual queries over the data and use cases such as abbreviation-aware text segmentation. Mapping sprints help create high-quality data in new languages; the first Italian DBpedia mapping, for example, was done during a high-school hackathon.
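As a hedged sketch of one such multilingual query, the snippet below asks the public DBpedia endpoint for the Italian-chapter counterpart of a resource via its owl:sameAs interlanguage links; the endpoint, the resource, and the assumption that chapter links are exposed as owl:sameAs are illustrative.

```python
# Sketch: following owl:sameAs links from the main DBpedia endpoint
# to the Italian language chapter. Assumes SPARQLWrapper is installed
# and that interlanguage links are published as owl:sameAs.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?same WHERE {
        dbr:Milan owl:sameAs ?same .
        FILTER (STRSTARTS(STR(?same), "http://it.dbpedia.org/"))
    }
""")
sparql.setReturnFormat(JSON)
for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["same"]["value"])
```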
This document discusses challenges and solutions related to data quality. It addresses issues with template-dependent and fully manual mapping approaches and proposes machine learning-based methods and mapping assistants as solutions. It also discusses problems with community-based ontologies like lack of coverage and proposes consistency checks and data-driven schemas using sources like Wikipedia categories to address them. Finally, it lists various multimedia data sources for photos, audio and video that could be linked.
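As a toy illustration of what a consistency check against the ontology could look like (an assumption-laden sketch, not the proposed system), the snippet below flags subjects that use a property without carrying the class declared as its rdfs:domain; the miniature graph and ontology fragment are invented for the example.

```python
# Toy sketch of a domain consistency check with rdflib; the data and
# the ontology fragment are illustrative assumptions, not DBpedia dumps.
from rdflib import Graph, Namespace, RDF, RDFS

DBO = Namespace("http://dbpedia.org/ontology/")
DBR = Namespace("http://dbpedia.org/resource/")

g = Graph()
# Ontology fragment: dbo:birthPlace is declared with domain dbo:Person.
g.add((DBO.birthPlace, RDFS.domain, DBO.Person))
# Data: one consistent statement and one that violates the domain.
g.add((DBR.Alan_Turing, RDF.type, DBO.Person))
g.add((DBR.Alan_Turing, DBO.birthPlace, DBR.London))
g.add((DBR.London, DBO.birthPlace, DBR.England))  # London is not a Person

# Flag subjects that use a property but lack its declared domain class.
for prop, domain in g.subject_objects(RDFS.domain):
    for subj in g.subjects(prop, None):
        if (subj, RDF.type, domain) not in g:
            print(f"Domain violation: {subj} uses {prop} but is not a {domain}")
```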
This document discusses outsourcing FrameNet annotation to crowdsourcing. It presents a two-step and simplified one-step methodology for crowdsourcing frame and semantic role annotation. Experiments using these methods on the CrowdFlower platform showed that the simplified one-step approach had higher accuracy and was faster than the two-step approach. Lessons learned include that definitions need to be simplified for non-experts and negation and modality are difficult concepts. Further research directions include larger-scale experiments and linking entities to structured knowledge bases.