Introduction to Wikidata
British Library, 26/4/13
Andrew Gray
andrew.gray@bl.uk | @generalising
Wikidata summary
● Central data repository for Wikimedia projects
● Human- and machine-readable
● Human- and machine-editable
● Fully multilingual
● Supports semantic relationships
www.wikidata.org
Overall plan
● Phase I
  – Centralise cross-language relationships
● Phase II
  – Centralise core structured data
● Phase III
  – Dynamic generation of list content
Phase I
● Centralising all “interwiki” cross-language links
  – Historically, a major maintenance headache!
● Single conceptual entity => many articles
  – ...some unexpected oddities arise; not all 1:1
● Almost all entities now listed
● Inclusion standards currently restricted
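
The centralisation above can be pictured as one shared table per concept, replacing link lists copied into every language's article. The following is a minimal sketch in Python, not Wikidata's actual storage: the `SITELINKS` table, both function names and the sample entry are assumptions made for illustration.

```python
# Hypothetical central table: one entry per concept, one article title per language.
# The item ID and titles are illustrative sample data only.
SITELINKS = {
    "Q84": {"en": "London", "fr": "Londres", "it": "Londra"},
}


def interlanguage_links(item_id, current_lang):
    """Return the links an article in `current_lang` would show to other languages."""
    titles = SITELINKS.get(item_id, {})
    return {
        lang: title
        for lang, title in sorted(titles.items())
        if lang != current_lang
    }


def add_sitelink(item_id, lang, title):
    """Record a new language version once, in the central table."""
    SITELINKS.setdefault(item_id, {})[lang] = title


# Adding one entry updates the link list seen from every other language.
add_sitelink("Q84", "de", "London")
print(interlanguage_links("Q84", "en"))
```

The point of the design is that a new language version is recorded once, rather than edited into each existing article separately.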
Phase I
Phase I – oddities
Phase II
● Building structured data on these entities
● “Phase 2.1” – harvesting data from Wikipedia
  – and supplemented from other sources
● “Phase 2.2” – displaying data on Wikipedia
  – autogenerated information templates
Phase II
Phase III
● Automatic creation of lists and charts
● Expected for late 2013...
Wikidata entities
● Single entity corresponding to one or more Wikipedia articles
  – Name (in various languages) + WP links
  – Contains various Phase II properties
  – Properties can include sources/qualifiers
● No support (yet!) for entities not existing in WP
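
The entity structure listed above (names per language, Wikipedia links, properties with sources and qualifiers) can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names and placeholder identifiers such as `Q_EXAMPLE` and `P_COUNTRY` are hypothetical, not Wikidata's real schema or real IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One property value on an item, with optional qualifiers and sources."""
    property_id: str
    value: str
    qualifiers: dict = field(default_factory=dict)
    sources: list = field(default_factory=list)


@dataclass
class Item:
    """A single entity corresponding to one or more Wikipedia articles."""
    item_id: str
    labels: dict = field(default_factory=dict)      # language code -> name
    sitelinks: dict = field(default_factory=dict)   # language code -> article title
    statements: list = field(default_factory=list)

    def label(self, lang, fallback="en"):
        """Name in `lang`, falling back to another language if it is missing."""
        return self.labels.get(lang, self.labels.get(fallback))

    def values_of(self, property_id):
        """All values recorded for one property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# Placeholder data; none of these identifiers are real.
city = Item(
    item_id="Q_EXAMPLE",
    labels={"en": "Example City", "fr": "Ville d'exemple"},
    sitelinks={"en": "Example City"},
    statements=[
        Statement("P_COUNTRY", "Q_EXAMPLE_COUNTRY", sources=["hypothetical reference"]),
    ],
)
print(city.label("de"), city.values_of("P_COUNTRY"))
```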
Phase II – planned model
Phase II – initial properties
● Limited properties – gradual roll-out
● Single “main type”, but no restrictions on use
  – “the capital of Julius Caesar”
● Relational properties implemented
  – but no automatic reciprocity yet
● String datatypes created for identifiers
● 130 properties currently in use
Phase II – future properties
● Properties created by community discussion
● Several awaiting datatypes:
  – time
  – geocoordinate
  – number (and dimension)
● Qualifiers yet to be added
Data reuse
● Permanent numeric identifier for all items
● API available (JSON)
  – but still being developed!
● Regular XML dumps – dumps.wikimedia.org
  – all item/property data licensed as CC-0
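
As a rough illustration of reusing item data through the JSON API, the sketch below builds a request URL for one item and reads a label out of a response. It assumes the `wbgetentities` action and its parameters as they exist in the present-day Wikidata API, which may differ from the API at the time of this talk. The helper names are hypothetical, and `SAMPLE_RESPONSE` is a hand-written stand-in for a real reply, so no network call is made.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_url(item_id, languages=("en",)):
    """Build a wbgetentities request URL for one item's labels."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "languages": "|".join(languages),
        "props": "labels",
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def label_from_response(body, item_id, lang):
    """Extract one label from a JSON response body, or None if it is absent."""
    data = json.loads(body)
    labels = data.get("entities", {}).get(item_id, {}).get("labels", {})
    entry = labels.get(lang)
    return entry["value"] if entry else None


# Hand-written sample in the shape of a reply; not fetched from the live API.
SAMPLE_RESPONSE = (
    '{"entities": {"Q42": {"labels": '
    '{"en": {"language": "en", "value": "Douglas Adams"}}}}}'
)

print(entity_url("Q42"))
print(label_from_response(SAMPLE_RESPONSE, "Q42", "en"))
```

The permanent identifier (here `Q42`) is what makes such lookups stable: the key stays the same even if the article titles or labels change.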
Identifiers & authorities
● GND, ISNI, LCCN, ULAN, VIAF, BNF, SUDOC, CALIS, CiNii, NDL, ICCU, NLA, MusicBrainz, IMDB
● ISBN, ISSN, OCLC, DOI, NOR
● OpenStreetMap IDs
● Corporate, administrative, monument, chemical, gene identifiers, language codes
● ...and pigeon breed registries
Tools
● Examples of toolsets:
  – GeneaWiki (visualise relations)
  – Reasonator (display interface)
  – Query API (experimental, alternative)
  – Tree of Life (static dump)
