BibBase Triplified http://data.bibbase.org/
Presented by:
Reynold S. Xin UC Berkeley
Joint work with:
Oktie Hassanzadeh, Yang Yang, Jiang Du, Minghua Zhao,
Renee J. Miller University of Toronto
Christian Fritz University of Southern California
Outline
 Goals and Status
 Duplicate detection
 Interlinking of data sources
 Additional features
 Conclusions and future work
Goals http://www.bibbase.org
 Makes it easy for scientists to maintain publications pages
 Scientists maintain a bibtex file; BibBase does the rest
 Publishes them in HTML
Goals http://data.bibbase.org
 Publishes them in RDF
 Links entries to the open linked data cloud
 With this incentive, scientists are helping us build a bibliographic database (think DBLP, but automated)
 Invaluable data set for benchmarking duplicate detection and semantic link discovery systems
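A minimal sketch of what publishing a bibtex entry as RDF could look like, emitting N-Triples from a parsed entry. The URI pattern (`/publication/<key>`), the function names, and the choice of Dublin Core and rdfs:seeAlso predicates are assumptions for illustration; the slides do not state which vocabulary or URI scheme BibBase actually uses.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def literal(value):
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def entry_to_ntriples(base, key, entry, see_also=()):
    """Turn one parsed bibtex entry (a dict) into a list of N-Triples lines.

    `base` and the /publication/<key> path are hypothetical, not BibBase's
    documented URI scheme.
    """
    subject = f"<{base}/publication/{key}>"
    lines = [f"{subject} <{DC}title> {literal(entry['title'])} ."]
    for author in entry.get("author", []):
        lines.append(f"{subject} <{DC}creator> {literal(author)} .")
    # Links into other data sets are expressed as rdfs:seeAlso.
    for url in see_also:
        lines.append(f"{subject} <{RDFS}seeAlso> <{url}> .")
    return lines


if __name__ == "__main__":
    entry = {"title": "An example title", "author": ["Renee J. Miller"]}
    for line in entry_to_ntriples("http://data.bibbase.org", "example2010",
                                  entry, see_also=["http://example.org/pub/1"]):
        print(line)
```

Each line is one subject-predicate-object triple, so the output can be loaded by any N-Triples parser.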
Some statistics
 “Beta” went online in June 2010
 As of yesterday (September 1, 2010)
 ~ 100 active users
 4520 publications, 4883 authors, 502 journals, 1881 proceedings, 88 keywords
 39201 author links, 2768 publication links, 30 keyword links
 Note that this is before we do any form of “marketing”
Duplicate Detection
 Examples
 Authors: “Renee J. Miller” or “R. J. Miller” or “RJ Miller”
 Publication entries
 Journals & conferences: “VLDB” or “Very Large Data Base”
 Solutions
 Local detection (within a single bibtex file)
 Global detection (across multiple files)
Local Detection
 A set of predefined rules to identify duplicates.
 E.g., within a single file, “Renee J Miller” is very likely the same person as “RJ Miller”.
 Users can append a suffix to a name to distinguish different people who share it (the DBLP approach).
 E.g., “Min Wang” vs. “Min Wang2”
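One way a local rule of this kind could be sketched: reduce each author name to a (last name, initials) key and treat names with equal keys as the same person within one file. The function names and the exact rule are assumptions for illustration, not BibBase's actual rule set.

```python
def name_key(name):
    """Reduce an author name to (last name, initials of the given names)."""
    tokens = name.replace(".", " ").split()
    last = tokens[-1].lower()
    initials = ""
    for tok in tokens[:-1]:
        # An all-caps token such as "RJ" is treated as a run of initials;
        # any other token contributes only its first letter.
        initials += tok.lower() if tok.isupper() else tok[0].lower()
    return last, initials


def same_author(a, b):
    """Rule: within a single file, equal keys mean the same author."""
    return name_key(a) == name_key(b)


if __name__ == "__main__":
    print(same_author("Renee J. Miller", "RJ Miller"))
    print(same_author("Min Wang", "Min Wang2"))
```

Because a suffix such as the "2" in "Min Wang2" becomes part of the last-name key, the two Min Wangs stay distinct under this rule.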
Global Detection
 Duplicate detection, also known as entity resolution, record linkage, or reference reconciliation, is a well-studied problem and an active research area. [Tutorial-VLDB’05, Tutorial-SIGMOD’06]
 We use existing declarative techniques [D.App.σ-SIGMOD’07] to detect duplicates across multiple files.
 Displays a disambiguation page in the HTML interface and an rdfs:seeAlso property in the RDF interface.
 Also enables users to provide feedback through a bibtex @string definition, e.g.
@string{vldb = "Very Large Data Base"}
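As a rough illustration of approximate matching across files, the sketch below scores pairs of strings by Jaccard similarity over character q-grams and reports pairs above a threshold. The choice of q-gram Jaccard, the threshold of 0.6, and all function names are assumptions; the slides only say that existing declarative techniques are used, not which similarity predicate.

```python
def qgrams(s, q=2):
    """Return the set of character q-grams of a normalized, padded string."""
    s = " ".join(s.lower().split())
    padded = "#" * (q - 1) + s + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b, q=2):
    """Jaccard similarity of the q-gram sets of two strings."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def candidate_duplicates(names, threshold=0.6):
    """Return (a, b, score) for every pair scoring at least `threshold`."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = jaccard(a, b)
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


if __name__ == "__main__":
    venues = ["Very Large Data Base", "Very Large Data Bases", "SIGMOD"]
    for a, b, score in candidate_duplicates(venues):
        print(f"{a!r} ~ {b!r}: {score:.2f}")
```

String similarity of this kind catches spelling variants, but an abbreviation such as "VLDB" shares almost no q-grams with its expansion, which is where a user-supplied @string mapping can help.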
Interlinking of Data Sources
 Leverages both offline dictionaries and online real-time URL verifications.
 Some external data sources
 DBLP
 DBpedia
 RKBExplorer
 Semantic Web Dogfood
 LOD foaf
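The combination of an offline dictionary and an online URL check might be sketched as follows. The dictionary contents, the `candidate_url` callback, and the function names are hypothetical; how BibBase actually builds candidate URLs for each external source is not described in the slides.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def url_resolves(url, timeout=5.0):
    """Return True if an HTTP HEAD request to `url` succeeds."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, ValueError, OSError):
        return False


def find_links(label, dictionary, candidate_url=None, verify=url_resolves):
    """Collect external links for `label`.

    dictionary    -- offline mapping from lower-cased labels to known URLs
    candidate_url -- optional function building a guessed URL from the label
    verify        -- online check applied to guessed URLs only
    """
    links = []
    known = dictionary.get(label.lower())
    if known:
        links.append(known)
    if candidate_url is not None:
        url = candidate_url(label)
        # A guessed URL is kept only if it actually resolves.
        if url not in links and verify(url):
            links.append(url)
    return links
```

Passing `verify` as a parameter keeps the online step replaceable, so the lookup logic can be exercised without network access.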
Additional Features
 Storage and publication of provenance information
 Dynamic grouping of entities (by year, keyword, etc.)
 RSS feed for notification
 DBLP scraper to generate bibtex files from DBLP records
 Statistics on usage
 Enhancement to existing MIT bibtex ontology file
Conclusion and Future Work
 BibBase
 Light-weight publication of bibliographic data
 Semantic web technologies as a result of complex triplification performed inside the system
 Invaluable data set
 Future Work
 More comprehensive duplicate detection
 Links to more external data sources
 Better engineering and service level agreement (99.99%?)
 Broader user base
Questions?

special B.ed 2nd year old paper_20240531.pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 

BibBase Linked Data Triplification Challenge 2010 Presentation

  • 1. BibBase Triplified (http://data.bibbase.org/)
      • Presented by: Reynold S. Xin (UC Berkeley)
      • Joint work with: Oktie Hassanzadeh, Yang Yang, Jiang Du, Minghua Zhao, Renee J. Miller (University of Toronto); Christian Fritz (University of Southern California)
  • 2. Outline
      • Goals and Status
      • Duplicate detection
      • Interlinking of data sources
      • Additional features
      • Conclusions and future work
  • 5. Goals (http://www.bibbase.org)
      • Makes it easy for scientists to maintain publications pages
      • Scientists maintain a BibTeX file; BibBase does the rest
      • Publishes them in HTML
  • 6. Goals (http://data.bibbase.org)
      • Makes it easy for scientists to maintain publications pages
      • Scientists maintain a BibTeX file; BibBase does the rest
      • Publishes them in HTML
      • Publishes them in RDF
      • Links entries to the open linked data cloud
      • With incentive, scientists are helping us build a bibliographic database (think DBLP but automated)
      • Invaluable data set for benchmarking duplicate detection and semantic link discovery systems
  • 8. Some statistics
      • “Beta” went online in June 2010
      • As of yesterday (September 1, 2010):
          • ~100 active users
          • 4520 publications, 4883 authors, 502 journals, 1881 proceedings, 88 keywords
          • 39201 author links, 2768 publication links, 30 keyword links
      • Note that this is before we have done any form of “marketing”
  • 9. Duplicate Detection
      • Examples
          • Authors: “Renee J. Miller” or “R. J. Miller” or “RJ Miller”
          • Publication entries
          • Journals and conferences: “VLDB” or “Very Large Data Base”
      • Solutions
          • Local detection (within a single BibTeX file)
          • Global detection (across multiple files)
  • 10. Local Detection
      • A set of predefined rules to identify duplicates.
          • E.g. within a single file, it is highly likely that “Renee J Miller” is the same as “RJ Miller”.
      • Users can specify a suffix to the name to differentiate them (DBLP approach).
          • E.g. “Min Wang” vs “Min Wang2”
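
The slides do not list the actual rule set, so the following is only a minimal sketch of what one such local rule might look like: two author strings in the same file are treated as the same person when the last names match and the given names reduce to the same initials. The function names `initials` and `same_author_local` are hypothetical, and the "all-uppercase token means packed initials" heuristic is an assumption, not something taken from BibBase.

```python
def initials(given_tokens):
    """Reduce given-name tokens to a string of initials.

    Assumption: an all-uppercase token such as "RJ" is a run of packed
    initials; any other token contributes its first letter.
    """
    letters = []
    for token in given_tokens:
        if token.isupper():
            letters.extend(token)          # "RJ" -> "R", "J"
        else:
            letters.append(token[0].upper())
    return "".join(letters)


def same_author_local(name_a, name_b):
    """Hypothetical local rule: same last name and same initials."""
    tokens_a = name_a.replace(".", " ").split()
    tokens_b = name_b.replace(".", " ").split()
    if not tokens_a or not tokens_b:
        return False
    # The last token is taken as the surname, so a DBLP-style suffix
    # ("Wang2") produces a different surname and keeps the authors apart.
    if tokens_a[-1].lower() != tokens_b[-1].lower():
        return False
    return initials(tokens_a[:-1]) == initials(tokens_b[:-1])


print(same_author_local("Renee J. Miller", "RJ Miller"))
print(same_author_local("Min Wang", "Min Wang2"))
```

A rule this loose is only defensible inside a single file, where two different people sharing a surname and initials are unlikely; across files it would merge too aggressively, which is why the deck treats global detection separately.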
  • 11. Global Detection
      • Duplicate detection, also known as entity resolution, record linkage, or reference reconciliation, is a well-studied problem and an active research area. [Tutorial-VLDB’05, Tutorial-SIGMOD’06]
      • We use existing declarative techniques [D.App.σ-SIGMOD’07] to detect duplicates across multiple files.
      • Displays a disambiguation page on the HTML interface and an rdfs:seeAlso property on the RDF interface.
      • Also lets users provide feedback through a BibTeX string definition, e.g. @string{vldb = "Very Large Data Base"}
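
To make the idea of approximate matching across files concrete, here is a small sketch that compares strings by the Jaccard similarity of their character trigrams. This is a swapped-in illustration, not the declarative technique cited on the slide; the function names and the 0.8 threshold are assumptions chosen for the example.

```python
from itertools import combinations


def qgrams(text, q=3):
    """Set of lowercase character q-grams of a string."""
    text = text.lower()
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def jaccard(a, b, q=3):
    """Jaccard similarity of the q-gram sets of two strings."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def candidate_duplicates(values, threshold=0.8):
    """Pairs of values whose similarity reaches the (assumed) threshold."""
    return [
        (a, b)
        for a, b in combinations(values, 2)
        if jaccard(a, b) >= threshold
    ]


venues = ["VLDB", "Very Large Data Base", "Very Large Data Bases"]
print(candidate_duplicates(venues))
```

String similarity catches spelling variants but cannot connect an abbreviation such as “VLDB” to its expansion, since the two share no trigrams. That gap is what the user-supplied @string feedback on the slide would fill.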
  • 12. Interlinking of Data Sources
      • Leverages both offline dictionaries and online real-time URL verifications.
      • Some external data sources
          • DBLP
          • DBpedia
          • RKBExplorer
          • Semantic Web Dogfood
          • LOD foaf
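
The combination of an offline dictionary with online URL verification could be sketched roughly as follows. This is an assumption about how the two steps might fit together, not the BibBase implementation: `find_link`, `url_exists`, the dictionary contents, and the choice of a DBpedia-style candidate URL are all hypothetical.

```python
import urllib.request
from urllib.parse import quote


def url_exists(url, timeout=5.0):
    """Online check: does a HEAD request to the URL succeed?"""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # Covers HTTP errors, unreachable hosts, timeouts, malformed URLs.
        return False


def find_link(name, offline_dictionary, verify=url_exists):
    """Return an external URL for `name`, or None if none is found.

    Step 1: look the name up in the offline dictionary.
    Step 2: otherwise build a candidate URL (here a DBpedia-style
            resource URL, an assumption) and keep it only if the
            online verification succeeds.
    """
    if name in offline_dictionary:
        return offline_dictionary[name]
    candidate = "http://dbpedia.org/resource/" + quote(name.replace(" ", "_"))
    return candidate if verify(candidate) else None


# Usage with a stub verifier, so the example runs without network access.
known = {"VLDB": "http://example.org/venue/vldb"}  # hypothetical entry
print(find_link("VLDB", known, verify=lambda url: False))
print(find_link("Data integration", known, verify=lambda url: True))
```

Passing the verifier in as a parameter keeps the lookup logic testable offline and makes it easy to swap in caching or rate limiting for the real-time checks.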
  • 13. Additional Features
      • Storage and publication of provenance information
      • Dynamic grouping of entities (by year, keyword, etc.)
      • RSS feed for notification
      • DBLP scraper to generate BibTeX files from DBLP records
      • Statistics on usage
      • Enhancement to existing MIT BibTeX ontology file
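
As a rough illustration of dynamic grouping, the sketch below groups publication entries by an arbitrary field chosen at request time. The entry layout (dictionaries with `title`, `year`, and `keywords` keys) and the function name `group_by` are assumptions made for the example; the sample entries are placeholders, not BibBase data.

```python
from collections import defaultdict


def group_by(entries, field):
    """Group entry titles by the value(s) of `field`.

    A list-valued field (e.g. keywords) places the entry in one group
    per value; entries lacking the field end up under the key None.
    """
    groups = defaultdict(list)
    for entry in entries:
        value = entry.get(field)
        values = value if isinstance(value, list) else [value]
        for key in values:
            groups[key].append(entry["title"])
    return dict(groups)


# Placeholder entries for illustration only.
entries = [
    {"title": "Paper A", "year": 2009, "keywords": ["linked data"]},
    {"title": "Paper B", "year": 2010, "keywords": ["linked data", "dedup"]},
    {"title": "Paper C", "year": 2010},
]
print(group_by(entries, "year"))
print(group_by(entries, "keywords"))
```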
  • 14. Conclusion and Future Work
      • BibBase
          • Lightweight publication of bibliographic data
          • Semantic Web output as a result of the complex triplification performed inside the system
          • Invaluable data set
      • Future Work
          • More comprehensive duplicate detection
          • Links to more external data sources
          • Better engineering and a service-level agreement (99.99%?)
          • Broader user base