Introduction to Wikidata

"Introduction to Wikidata" presentation given 26th April 2013, at the British Library



  • 1. Introduction to Wikidata
      British Library, 26/4/13
      Andrew | @generalising
  • 2. Wikidata summary
      ● Central data repository for Wikimedia projects
      ● Human- and machine-readable
      ● Human- and machine-editable
      ● Fully multilingual
      ● Supports semantic
  • 3. Overall plan
      ● Phase I
          – Centralise cross-language relationships
      ● Phase II
          – Centralise core structured data
      ● Phase III
          – Dynamic generation of list content
  • 4. Phase I
      ● Centralising all “interwiki” cross-language links
          – Historically, a major maintenance headache!
      ● Single conceptual entity => many articles
          – ...some unexpected oddities arise; not all 1:1
      ● Almost all entities now listed
      ● Inclusion standards currently restricted
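
The maintenance saving behind Phase I can be illustrated with a small sketch. It assumes a hypothetical in-memory `item` dictionary mapping language codes to article titles; the function names are invented for illustration and are not part of the Wikidata software.

```python
# Illustrative sketch only: `item` and both function names are hypothetical.

def total_interwiki_links(n_languages: int) -> int:
    """Old model: each of n articles stores a link to the other n - 1 editions."""
    return n_languages * (n_languages - 1)


def language_links(item: dict, current_language: str) -> dict:
    """Central model: derive one article's language links from a single shared item."""
    return {lang: title for lang, title in item.items() if lang != current_language}


# One conceptual entity with articles in three language editions.
item = {"en": "London", "fr": "Londres", "de": "London"}

print(total_interwiki_links(len(item)))  # links stored across all articles in the old model
print(len(item))                         # entries stored once on the central item
print(language_links(item, "fr"))        # links shown on the French article
```

With the central item, adding a new language edition means adding one entry rather than editing every existing article.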
  • 5. Phase I
  • 6. Phase I – oddities
  • 7. Phase II
      ● Building structured data on these entities
      ● “Phase 2.1” - harvesting data from Wikipedia
          – and supplemented from other sources
      ● “Phase 2.2” - displaying data on Wikipedia
          – autogenerated information templates
  • 8. Phase II
  • 9. Phase III
      ● Automatic creation of lists and charts
      ● Expected for late 2013...
  • 10. Wikidata entities
      ● Single entity corresponding to one or more Wikipedia articles
          – Name (in various languages) + WP links
          – Contains various Phase II properties
          – Properties can include sources/qualifiers
      ● No support (yet!) for entities not existing in WP
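
The entity structure described on this slide can be sketched as a pair of data classes. This is a simplified illustration, not the actual Wikidata schema: the class names, field names, the `Q0` identifier and all sample values are placeholders chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Statement:
    """One Phase II property value, with optional qualifiers and sources."""
    property_name: str
    value: str
    qualifiers: dict = field(default_factory=dict)
    sources: list = field(default_factory=list)


@dataclass
class Entity:
    """A single entity: names per language, Wikipedia links, and statements."""
    entity_id: str
    labels: dict = field(default_factory=dict)      # language code -> name
    sitelinks: dict = field(default_factory=dict)   # language code -> article title
    statements: list = field(default_factory=list)

    def label(self, language: str, fallback: str = "en") -> Optional[str]:
        """Return the name in `language`, falling back to another language."""
        return self.labels.get(language, self.labels.get(fallback))


# Placeholder data, invented for illustration.
entity = Entity(
    entity_id="Q0",
    labels={"en": "Example city", "fr": "Ville d'exemple"},
    sitelinks={"en": "Example city", "fr": "Ville d'exemple"},
    statements=[Statement("country", "Example country", sources=["example source"])],
)

print(entity.label("fr"))
print(entity.label("de"))  # no German name stored, so the English one is used
```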
  • 11. Phase II – planned model
  • 12. Phase II – initial properties
      ● Limited properties – gradual roll-out Standard
      ● Single “main type”, but no restrictions on use
          – “the capital of Julius Caesar”
      ● Relational properties implemented
          – but no automatic reciprocity yet
      ● String datatypes created for identifiers
      ● 130 properties currently in use
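
The lack of automatic reciprocity means that a relation recorded on one item is not mirrored on the other. A minimal sketch of how missing reciprocal statements could be detected follows; the triple representation, the `INVERSE` pairing and the item names are all assumptions made for this example, not Wikidata's own mechanism.

```python
# Illustrative sketch: statements are hypothetical (subject, property, object) triples.
INVERSE = {"child": "parent", "parent": "child"}  # assumed property pairing


def missing_reciprocals(statements):
    """Return the inverse statements that would be needed but are not present."""
    existing = set(statements)
    missing = []
    for subject, prop, obj in statements:
        inverse = INVERSE.get(prop)
        if inverse and (obj, inverse, subject) not in existing:
            missing.append((obj, inverse, subject))
    return missing


statements = [
    ("item A", "child", "item B"),
    ("item B", "parent", "item A"),
    ("item A", "child", "item C"),  # item C has no matching "parent" statement
]

print(missing_reciprocals(statements))
```

A check like this only reports gaps; whether to fill them automatically is a separate editorial decision.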
  • 13. Phase II – future properties
      ● Properties created by community discussion
      ● Several awaiting datatypes:
          – time
          – geocoordinate
          – number (and dimension)
      ● Qualifiers yet to be added
  • 14. Data reuse
      ● Permanent numeric identifier for all items
      ● API available (JSON)
          – but still being developed!
      ● Regular XML dumps
          – all item/property data licensed as CC-0
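
As a rough sketch of reusing item data through the JSON API: the example below builds a request URL from a numeric item identifier and reads a label out of a response. Since the slide notes that the API was still being developed, the `wbgetentities` action and the response layout shown here are assumptions based on the MediaWiki API as it exists today, and the sample response is hand-written placeholder data rather than real output.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_request_url(item_number: int) -> str:
    """Build a request URL for one item, identified by its permanent number."""
    params = {"action": "wbgetentities", "ids": f"Q{item_number}", "format": "json"}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def english_label(response_text: str, item_id: str) -> str:
    """Extract the English label from a response (assumed layout)."""
    data = json.loads(response_text)
    return data["entities"][item_id]["labels"]["en"]["value"]


# Hand-written placeholder response; not fetched from the live API.
SAMPLE_RESPONSE = json.dumps(
    {"entities": {"Q1": {"id": "Q1", "labels": {"en": {"language": "en", "value": "placeholder label"}}}}}
)

print(entity_request_url(1))
print(english_label(SAMPLE_RESPONSE, "Q1"))
```

The sketch performs no network request; fetching the URL is left to whichever HTTP client the reader prefers.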
  • 15. Identifiers & authorities
      ● GND, ISNI, LCCN, ULAN, VIAF, BNF, SUDOC, CALIS, CiNii, NDL, ICCU, NLA, MusicBrainz, IMDB
      ● ISBN, ISSN, OCLC, DOI, NOR
      ● OpenStreetMap IDs
      ● Corporate, administrative, monument, chemical, gene identifiers, language codes
      ● ...and pigeon breed registries
  • 16. Tools
      ● Examples of toolsets:
          – GeneaWiki (visualise relations)
          – Reasonator (display interface)
          – Query API (experimental, alternative)
          – Tree of Life (static dump)