Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Wikidata
for
Libraries and Archives
Emw
Exploring Wikidata and the Semantic Web for Libraries
Metropolitan New York Librar...
Wikidata is a free linked database that can be
read and edited by
humans and machines.
Wikidata's goals
● Centralize interwiki links
● Centralize Wikipedia infoboxes
● Provide an interface for rich queries
● S...
What you'll learn from this talk
● How to edit Wikidata
● Authority control properties
● Ontology
● Wikidata vocabulary
● ...
Elements of a Wikidata statement
Example: New York City (Q60)
Items and properties
●
Each item and property has its own page
● Items
– Represent subjects: Douglas Adams, Challenger dis...
Statements and claims
● Claims
– Claims are “triplets”
● Formally: subject, predicate, object
● In Wikidata: item, propert...
Qualifiers, ranks, references
● Qualifiers
– Qualifiers are properties used on claims rather than items
– “Yonkers populat...
More on Wikidata vocabulary
https://www.wikidata.org/wiki/Wikidata:Glossary
Wikipedia articles have a Wikidata item link in the
left navigation panel.
Wikidata link on Wikipedia
Getting to Wikidata from Wikipedia
Wikidata's instant search suggests items that have
labels or aliases matching your keyword.
Wikidata search
Search by label
Search by alias: “flu” -> influenza
Finding properties
● Is there a property for “number of windows”?
● What was the ID of that property, again?
● Search
– In...
Let's edit Wikidata
Waldseemuller map
https://www.wikidata.org/wiki/Q195801
● instance of (P31): map
● creator (P170): Martin Waldseemüller
● ...
Tools
– Querying: Autolist, by Magnus Manske
● http://tools.wmflabs.org/autolist/autolist1.html
– Batch editing: Widar, by...
Querying in Wikidata
Use case: Get a list of history books
Pseudo-query:
instance of: book AND genre: history
instance of:...
http://tools.wmflabs.org/autolist/autolist1.html?q=claim[31:571]%20AND%20claim[136:309]
Authority control on Wikidata
● “Identifier” properties link Wikidata items to entities in
external databases
● Also known...
Identifier properties on Wikidata
Top 10 ID properties on Wikidata as of 2015-03-15
1) Freebase identifier (P646) 1,153,78...
Ontology
● “A specification of a conceptualization”
● Rooted in long-standing philosophical traditions
● Useful for:
– bui...
Tree of Porphyry
User:VoiceOfTheCommons, CC-BY-SA 3.0
Classes and instances
● Plato is a human is a animal
● Plato instance of human subclass of animal
● Instance: concrete obj...
Concept hierarchy: painting
http://tools.wmflabs.org/wikidata-todo/tree.html?q=3305213&rp=279&lang=en
Concept tree diagram: painting
http://tools.wmflabs.org/wikidata-todo/tree.html?q=3305213&rp=279&lang=en&method=d3
Ontology on Wikidata
● instance of (P31)
– rdf:type in RDF and OWL
– 12,692,181 usages
– Most popular Wikidata property
● ...
Ontology on Wikidata
● Last but not least: part of (P361)
– Third basic membership property
– Top-level “part-whole” relat...
subproperty of (P1647)
● How to link author (P50) and creator (P170)
– subproperty of (P1647)
– author subproperty of crea...
Structured data for images
● Use case:
– A library, archive or other cultural institution wants to
make its images discove...
Structured Data
for Wikimedia Commons
● Purpose
– use structured data for all media files on Wikimedia sites
– make it eas...
Structured Data for Commons: FAQ
● How long will it take?
–Completion will take years, but project will be released in sta...
Wikidata for Commons
Partial support already exists!
Wikidata links to Commons
Commons links to Wikidata
Library ontology on Wikidata
● What is a book?
● Different levels of existence (FRBR)
– Work
– Expression - e.g., translat...
FRBR example
● Work
– Bible (Q1845)
● Expression
– Latin translation of the Bible
● Manifestation
– Vulgate (Q131175): a p...
Linking FRBR levels in Wikidata
Done by the property edition or translation of (P629)
● Gutenberg Bible (item)
– edition o...
Trouble: what is an instance?
● Thing with a unique location in space and time
● Bible instance of religious text: incorre...
Information Artifact Ontology
● IAO sets forth a basic ontological distinction:
– Entities in the world
● Actual people: G...
Information Artifact Ontology
● Addresses problems in Dublin Core Metadata Initiative
(DCMI)
– Smith (2014)
http://ncor.bu...
Summary
● Wikidata is a free knowledge base that anyone can edit
● Search properties by “p:search term”
● Wikidata items l...
Thank you!
https://www.wikidata.org/wiki/User:Emw
Upcoming SlideShare
Loading in …5
×

Wikidata for libraries and archives

1,847 views

Published on

A presentation on Wikidata for libraries and archives delivered on March 16, 2015 to the Metropolitan New York Library Council.

Contains minor edits and corrections from version presented.

Released under CC0.

Published in: Data & Analytics
  • Be the first to comment

Wikidata for libraries and archives

  1. 1. Wikidata for Libraries and Archives Emw Exploring Wikidata and the Semantic Web for Libraries Metropolitan New York Library Council 2015-03-16
  2. 2. Wikidata is a free linked database that can be read and edited by humans and machines.
  3. 3. Wikidata's goals ● Centralize interwiki links ● Centralize Wikipedia infoboxes ● Provide an interface for rich queries ● Structure the sum of all human knowledge
  4. 4. What you'll learn from this talk ● How to edit Wikidata ● Authority control properties ● Ontology ● Wikidata vocabulary ● Querying and other tools ● Wikidata & Commons plans
  5. 5. Elements of a Wikidata statement
  6. 6. Example: New York City (Q60)
  7. 7. Items and properties ● Each item and property has its own page ● Items – Represent subjects: Douglas Adams, Challenger disaster – Have identifiers like Q42, Q921090 –13,745,153 items ● Properties – Represent attribute names: occupation, cause of – Have identifiers like P106, P828 – 1,429 properties
  8. 8. Statements and claims ● Claims – Claims are “triplets” ● Formally: subject, predicate, object ● In Wikidata: item, property, value ● Example: Douglas Adams, occupation, author ● Statements – A claim is only part of a statement – Statements also include: ● References ● Ranks
  9. 9. Qualifiers, ranks, references ● Qualifiers – Qualifiers are properties used on claims rather than items – “Yonkers population 12,733 at time (P585) 1860” ● Ranks – Preferred, normal, deprecated – Useful to mark outdated claims ● References – Source of claim; provenance – “... stated in (P248) 1860 United States Census”
  10. 10. More on Wikidata vocabulary https://www.wikidata.org/wiki/Wikidata:Glossary
  11. 11. Wikipedia articles have a Wikidata item link in the left navigation panel. Wikidata link on Wikipedia
  12. 12. Getting to Wikidata from Wikipedia
  13. 13. Wikidata's instant search suggests items that have labels or aliases matching your keyword. Wikidata search
  14. 14. Search by label
  15. 15. Search by alias: “flu” -> influenza
  16. 16. Finding properties ● Is there a property for “number of windows”? ● What was the ID of that property, again? ● Search – In main site search box, prefix search term with “P:” – “P:number of”, “P:occupation” – Instant search doesn't work for properties, only items ● Browse – https://www.wikidata.org/wiki/Wikidata:List_of_properties ^ bookmark this!
  17. 17. Let's edit Wikidata
  18. 18. Waldseemuller map https://www.wikidata.org/wiki/Q195801 ● instance of (P31): map ● creator (P170): Martin Waldseemüller ● publication (P577): April 1507 ● language of work (P407): Latin ● depicts (P180): Americas, Africa, Europe, Asia, Pacific Ocean ● has part (P527): woodcut ● location (P276): Library of Congress
  19. 19. Tools – Querying: Autolist, by Magnus Manske ● http://tools.wmflabs.org/autolist/autolist1.html – Batch editing: Widar, by Magnus Manske ● https://tools.wmflabs.org/autolist/ – Software framework: Wikidata Toolkit, by Markus Kroetzsch et al. ● https://www.mediawiki.org/wiki/Wikidata_Toolkit ● https://github.com/Wikidata/Wikidata-Toolkit
  20. 20. Querying in Wikidata Use case: Get a list of history books Pseudo-query: instance of: book AND genre: history instance of: P31 book: Q571 genre: P136 history: Q309 Wikidata query in Autolist: claim[31:571] AND claim[136:309]
  21. 21. http://tools.wmflabs.org/autolist/autolist1.html?q=claim[31:571]%20AND%20claim[136:309]
  22. 22. Authority control on Wikidata ● “Identifier” properties link Wikidata items to entities in external databases ● Also known as authority control numbers ● Examples: – VIAF identifier (P214) – Freebase identifier (P646) – MusicBrainz artist ID (P434)
  23. 23. Identifier properties on Wikidata Top 10 ID properties on Wikidata as of 2015-03-15 1) Freebase identifier (P646) 1,153,786 2) China administrative division code (P442) 742,240 3) VIAF identifier (P214) 527,786 4) GeoNames ID (P1566) 490,444 5) GND identifier (P227) 345,247 6) ITIS TSN (P815) 334,053 7) PlantList-ID (P1070) 331,179 8) Tropicos taxon name identifier (P960) 318,719 9) IMDb identifier (P345) 286,637 10) IPNI taxon name identifier (P961) 276,515 https://www.wikidata.org/wiki/Wikidata:Database_reports/List_of_properties/Top100 Identifier properties on WikidataIdentifier properties on Wikidata
  24. 24. Ontology ● “A specification of a conceptualization” ● Rooted in long-standing philosophical traditions ● Useful for: – building concept hierarchies – making common knowledge computable
  25. 25. Tree of Porphyry User:VoiceOfTheCommons, CC-BY-SA 3.0
  26. 26. Classes and instances ● Plato is a human is a animal ● Plato instance of human subclass of animal ● Instance: concrete object, individual ● Class: abstract object
  27. 27. Concept hierarchy: painting http://tools.wmflabs.org/wikidata-todo/tree.html?q=3305213&rp=279&lang=en
  28. 28. Concept tree diagram: painting http://tools.wmflabs.org/wikidata-todo/tree.html?q=3305213&rp=279&lang=en&method=d3
  29. 29. Ontology on Wikidata ● instance of (P31) – rdf:type in RDF and OWL – 12,692,181 usages – Most popular Wikidata property ● subclass of (P279) – “all instances of A are also instances of B” – rdfs:subClassOf in RDF and OWL – 190,095 usages
  30. 30. Ontology on Wikidata ● Last but not least: part of (P361) – Third basic membership property – Top-level “part-whole” relation ● Instance of, subclass of and part of are all transitive ● Transitive relation : A subclass of B B subclass of C :. A subclass of C https://www.wikidata.org/wiki/Help:Basic_membership_properties
  31. 31. subproperty of (P1647) ● How to link author (P50) and creator (P170) – subproperty of (P1647) – author subproperty of creator ● Wikidata supports claims about properties ● subproperty of (P1647) has semantics of rdfs:subPropertyOf
  32. 32. Structured data for images ● Use case: – A library, archive or other cultural institution wants to make its images discoverable on Wikidata ● Structured Data on Wikimedia Commons – Project of the Wikimedia Foundation (WMF)
  33. 33. Structured Data for Wikimedia Commons ● Purpose – use structured data for all media files on Wikimedia sites – make it easier for users to read, translate and edit file information – enable developers to build better tools, both to contribute and reuse Commons content. https://commons.wikimedia.org/wiki/Commons:Structured_data/Overview
  34. 34. Structured Data for Commons: FAQ ● How long will it take? –Completion will take years, but project will be released in stages. –Forward-facing part of project is on hold. ● Is file meta data staying on Commons? –Yes ● How will file information be structured? –To be determined ● Will Commons still have templates? –Yes ● Will Commons still have categories? –Yes
  35. 35. Wikidata for Commons Partial support already exists!
  36. 36. Wikidata links to Commons
  37. 37. Commons links to Wikidata
  38. 38. Library ontology on Wikidata ● What is a book? ● Different levels of existence (FRBR) – Work – Expression - e.g., translation for a language – Manifestation - edition, thing with an ISBN – Item - physical copy
  39. 39. FRBR example ● Work – Bible (Q1845) ● Expression – Latin translation of the Bible ● Manifestation – Vulgate (Q131175): a particular Latin translation ● Item – Gutenberg Bible (Q158075): a physical book in New York Public Library(!)
  40. 40. Linking FRBR levels in Wikidata Done by the property edition or translation of (P629) ● Gutenberg Bible (item) – edition or translation of: Vulgate ● Vulgate (manifestation) – edition or translation of: Bible ● Bible (work)
  41. 41. Trouble: what is an instance? ● Thing with a unique location in space and time ● Bible instance of religious text: incorrect? – Gutenberg Bible instance of religious text: all agree ● Solution: Bible as information artifact
  42. 42. Information Artifact Ontology ● IAO sets forth a basic ontological distinction: – Entities in the world ● Actual people: Grace Hopper, Barack Obama ● That very old tree ● Gutenberg Bible – Information artifacts ● Entities about something in reality ● Concretized by some information bearer
  43. 43. Information Artifact Ontology ● Addresses problems in Dublin Core Metadata Initiative (DCMI) – Smith (2014) http://ncor.buffalo.edu/2014/IAOW/IAO-Tutorial-Smith-Rio-Sep-2014.pptx ● Rooted in Basic Formal Ontology (BFO), which is used in many scienitific ontologies – Smith et al. (2007). Nature Biotechnology. http://www.nature.com/nbt/journal/v25/n11/full/nbt1346.html
  44. 44. Summary ● Wikidata is a free knowledge base that anyone can edit ● Search properties by “p:search term” ● Wikidata items link to external databases through identifier properties, e.g. VIAF identifier (P214) ● Wikidata supports FRBR and other ontologies
  45. 45. Thank you! https://www.wikidata.org/wiki/User:Emw

×