SlideShare a Scribd company logo
1 of 33
Download to read offline
Up and running with Wikidata 
Emw 
New York City Wikidata workshop 
2014-12-14
Wikidata is a free linked database that can be 
read and edited by 
humans and machines.
Wikidata's goals 
● Centralize interwiki links 
● Centralize infoboxes 
● Provide an interface for rich queries 
● Structure the sum of all human knowledge
What you'll learn from this talk 
● How to edit Wikidata 
● How to classify 
● Ideas for small projects 
● Wikidata vocabulary 
● Where to find things 
● Awesome tools
Elements of a Wikidata statement
Example: New York City (Q60)
Items and properties 
● Each item and property has its own page 
● Items 
– Represent subjects: Douglas Adams, Challenger disaster 
– Have identifiers like Q42, Q921090 
– 12,906,291 items 
● Properties 
– Represent attribute names: occupation, has cause 
– Have identifiers like P106, P828 
– 1,329 properties
Statements and claims 
● Claims 
– Claims are “triplets” 
● Formally: subject, predicate, object 
● In Wikidata: item, property, value 
● Example: Douglas Adams, occupation, author 
● Statements 
– A claim is only part of a statement 
– Statements also include: 
● References 
● Ranks
Qualifiers, ranks, references 
●Qualifiers 
– Qualifiers are properties used on claims rather than items 
– “Yonkers population 12,733 at time (P585) 1860” 
●Ranks 
– Preferred, normal, deprecated 
– Useful to mark outdated claims 
●References 
– Source of claim; provenance 
– “... stated in (P248) 1860 United States Census”
More on Wikidata vocabulary 
https://www.wikidata.org/wiki/Wikidata:Glossary
Finding Wikidata items 
Wikipedia articles have a Wikidata item link in the 
left navigation panel.
Finding Wikidata items 
Wikidata search is quick and effective. 
Instant search suggests items that have labels or 
aliases matching your keyword.
Search by label
Search by alias: “flu” -> influenza
Finding properties 
● Is there a property for “number of windows”? 
● What was the ID of that property, again? 
● Search 
– In main site search box, prefix search term with “P:” 
– “P:number of”, “P:occupation” 
– Instant search doesn't work for properties, only items 
● Browse 
– https://www.wikidata.org/wiki/Wikidata:List_of_properties 
^ bookmark this!
Let's edit Wikidata.
Walking through edits for: 
Yonkers, New York 
https://www.wikidata.org/wiki/Q128114
Yonkers TODO 
https://www.wikidata.org/wiki/Q128114 
● population (P1082) claims for historical table in 
https://en.wikipedia.org/wiki/Yonkers,_New_York#Demographics 
● Include references! “1860 United States Census”, etc. 
● To add to item from inbofox: 
– head of government (P6), office held by head of government (P1313) 
– date of foundation or creation (P571) 
– ZIP code (P281)
Area? Population density? 
● Properties with units (km^2, people/km^2, $) 
are not yet possible 
● “Units” datatype in development 
● https://phabricator.wikimedia.org/T65722
Tools 
– Querying: Autolist, by Magnus Manske 
● http://tools.wmflabs.org/autolist/autolist1.html 
– Batch editing: Widar, by Magnus Manske 
● https://tools.wmflabs.org/autolist/ 
– Software framework: Wikidata Toolkit, by Markus Kroetzsch et al. 
● https://www.mediawiki.org/wiki/Wikidata_Toolkit 
● https://github.com/Wikidata/Wikidata-Toolkit
Querying in Wikidata 
List of politicians who died of a heart attack 
Pseudo-query: 
occupation: politician AND cause of death: heart attack 
occupation: P106 
politician: Q82955 
cause of death: P509 
heart attack: Q12152 
Wikidata query in Autolist: 
claim[106:82955] AND claim[509:12152]
http://tools.wmflabs.org/autolist/autolist1.html?q=claim[106:82955]%20AND%20claim[509:12152]
Classification on Wikidata 
● Taxonomy of knowledge 
● Enables powerful inference, novel applications 
● Interesting philosophical, design, and engineering issues
Tree of Porphyry 
User:VoiceOfTheCommons, CC-BY-SA 3.0
Classes and instances 
● Plato is a human is a animal 
● Plato instance of human subclass of animal 
● Instance: concrete object, individual 
● Class: abstract object
Classification on Wikidata 
● instance of (P31) 
– rdf:type in RDF and OWL 
– 11,930,243 usages 
– Most popular Wikidata property 
● subclass of (P279) 
– “all instances of A are also instances of B” 
– rdfs:subClassOf in RDF and OWL 
– 170,571 usages
Examples 
● USS Nimitz instance of Nimitz-class aircraft carrier 
Nimitz-class aircraft carrier subclass of aircraft carrier 
● 2012 Cannes Film Festival instance of Cannes Film Festival 
Cannes Film Festival subclass of film festival 
● an individual charm quark instance of charm quark 
charm quark subclass of quark 
^ Many “leaf nodes” in Wikidata's taxonomic hierarchy are not instances. 
(There are no items about individual quarks on Wikidata!) 
https://www.wikidata.org/wiki/Help:Basic_membership_properties
Bad smells 
Item has many instance of or subclass of claims 
Items typically satisfy a huge number of instance of claims: 
● Fido instance of dog 
● Fido instance of English Pointer 
● Fido instance of faithful animal 
● … 
Solution: use one class for instance of, put other class 
knowledge into normal properties 
● Fido instance of dog 
● Fido breed: English Pointer 
● Fido known for: faithfulness 
● ...
Bad smells 
subclass of claim that is nonsensical when interpreted as “All 
instances of A are also instances of B” 
Example: 
dog subclass of pet 
But not all dogs are pets! 
feral dog subclass of dog true 
feral dog subclass of pet false 
:. dog subclass of pet false 
Solution: put “pet” knowledge about dogs into claim that does not 
apply to all instances of dog. E.g. “dog has role pet”. (Has role 
would not be transitive. Also needed: some/all quantifier.)
Classification on Wikidata 
● Last but not least: part of (P361) 
– Third basic membership property 
– Top-level “part-whole” relation 
● Instance of, subclass of and part of are all transitive 
● Transitive relation: 
A subclass of B 
B subclass of C 
:. A subclass of C 
https://www.wikidata.org/wiki/Help:Basic_membership_properties
Ideas for small projects 
● Add data about towns and cities 
– population (P1082) 
– head of government (P6) 
● Add medical knowledge about historical figures 
– medical condition (P1050) 
– cause of death (P509) 
– manner of death (P1196) 
● Add cultural knowledge about works of art 
– instance of (P31) 
– creator (P170) 
– material used (P186) 
– collection (P195)

More Related Content

What's hot

Wikipedia infobox type_prediction_slides_dl4_k_gs
Wikipedia infobox type_prediction_slides_dl4_k_gsWikipedia infobox type_prediction_slides_dl4_k_gs
Wikipedia infobox type_prediction_slides_dl4_k_gs
Russa Biswas
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
Herbert Van de Sompel
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
Josef Petrák
 

What's hot (18)

FedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked DataFedX - Optimization Techniques for Federated Query Processing on Linked Data
FedX - Optimization Techniques for Federated Query Processing on Linked Data
 
Semantic Web And Coldfusion
Semantic Web And ColdfusionSemantic Web And Coldfusion
Semantic Web And Coldfusion
 
Wikipedia infobox type_prediction_slides_dl4_k_gs
Wikipedia infobox type_prediction_slides_dl4_k_gsWikipedia infobox type_prediction_slides_dl4_k_gs
Wikipedia infobox type_prediction_slides_dl4_k_gs
 
Web Data Management with RDF
Web Data Management with RDFWeb Data Management with RDF
Web Data Management with RDF
 
An Introduction to RDF and the Web of Data
An Introduction to RDF and the Web of DataAn Introduction to RDF and the Web of Data
An Introduction to RDF and the Web of Data
 
Rich Data? Poor Data? Depends on...
Rich Data? Poor Data? Depends on...Rich Data? Poor Data? Depends on...
Rich Data? Poor Data? Depends on...
 
Keynote session - LOD2014 W3C event
Keynote session - LOD2014 W3C eventKeynote session - LOD2014 W3C event
Keynote session - LOD2014 W3C event
 
18 ° Nexa Lunch Seminar - Lo stato dell'arte dei Linked Open Data italiani
18 ° Nexa Lunch Seminar - Lo stato dell'arte dei Linked Open Data italiani18 ° Nexa Lunch Seminar - Lo stato dell'arte dei Linked Open Data italiani
18 ° Nexa Lunch Seminar - Lo stato dell'arte dei Linked Open Data italiani
 
PyConUK 2016 - Writing English Right
PyConUK 2016  - Writing English RightPyConUK 2016  - Writing English Right
PyConUK 2016 - Writing English Right
 
ALEC (A List of Everything Cool)
ALEC (A List of Everything Cool)ALEC (A List of Everything Cool)
ALEC (A List of Everything Cool)
 
Linked Open Data stuff
Linked Open Data stuffLinked Open Data stuff
Linked Open Data stuff
 
On the Way to a Holding Ontology
On the Way to a Holding OntologyOn the Way to a Holding Ontology
On the Way to a Holding Ontology
 
Clark - Metadata is the Message
Clark - Metadata is the MessageClark - Metadata is the Message
Clark - Metadata is the Message
 
Connections that work: Linked Open Data demystified
Connections that work: Linked Open Data demystifiedConnections that work: Linked Open Data demystified
Connections that work: Linked Open Data demystified
 
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 
MLB Teams Logic Puzzle Solution
MLB Teams Logic Puzzle SolutionMLB Teams Logic Puzzle Solution
MLB Teams Logic Puzzle Solution
 

Viewers also liked (6)

An Ambitious Wikidata Tutorial
An Ambitious Wikidata TutorialAn Ambitious Wikidata Tutorial
An Ambitious Wikidata Tutorial
 
Wikidata for libraries and archives
Wikidata for libraries and archivesWikidata for libraries and archives
Wikidata for libraries and archives
 
Wikidata presentation at SemTechBiz Berlin 2012
Wikidata presentation at SemTechBiz Berlin 2012Wikidata presentation at SemTechBiz Berlin 2012
Wikidata presentation at SemTechBiz Berlin 2012
 
Lecture slides6; Construction contract financial planning
Lecture slides6; Construction contract financial planningLecture slides6; Construction contract financial planning
Lecture slides6; Construction contract financial planning
 
Argument pp 1
Argument pp 1Argument pp 1
Argument pp 1
 
Wikidata and the Semantic Web of Food
Wikidata and the  Semantic Web of FoodWikidata and the  Semantic Web of Food
Wikidata and the Semantic Web of Food
 

Similar to Up and running with Wikidata

Csvconf data hacking-with_wikimedia_projects
Csvconf data hacking-with_wikimedia_projectsCsvconf data hacking-with_wikimedia_projects
Csvconf data hacking-with_wikimedia_projects
mattsenate
 
Curation and Digital Storytelling
Curation and Digital StorytellingCuration and Digital Storytelling
Curation and Digital Storytelling
Shawn Day
 
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Olivier Grisel
 
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
Onando Mulang'
 

Similar to Up and running with Wikidata (19)

20141114 wikidata glam_workshop2
20141114 wikidata glam_workshop220141114 wikidata glam_workshop2
20141114 wikidata glam_workshop2
 
Entity Linking, Link Prediction, and Knowledge Graph Completion
Entity Linking, Link Prediction, and Knowledge Graph CompletionEntity Linking, Link Prediction, and Knowledge Graph Completion
Entity Linking, Link Prediction, and Knowledge Graph Completion
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your Analytics
 
Understanding the Standards Gap
Understanding the Standards GapUnderstanding the Standards Gap
Understanding the Standards Gap
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Search
 
Häskell und Grepl: Data Hacking Wikimedia Projects Exampled With Open Access ...
Häskell und Grepl: Data Hacking Wikimedia Projects Exampled With Open Access ...Häskell und Grepl: Data Hacking Wikimedia Projects Exampled With Open Access ...
Häskell und Grepl: Data Hacking Wikimedia Projects Exampled With Open Access ...
 
Csvconf data hacking-with_wikimedia_projects
Csvconf data hacking-with_wikimedia_projectsCsvconf data hacking-with_wikimedia_projects
Csvconf data hacking-with_wikimedia_projects
 
Quipu: Quechua Knowledge Graph [Pilot: Building virtual assistants based on Q...
Quipu: Quechua Knowledge Graph [Pilot: Building virtual assistants based on Q...Quipu: Quechua Knowledge Graph [Pilot: Building virtual assistants based on Q...
Quipu: Quechua Knowledge Graph [Pilot: Building virtual assistants based on Q...
 
Curation and Digital Storytelling
Curation and Digital StorytellingCuration and Digital Storytelling
Curation and Digital Storytelling
 
2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK2014 10-11 Wikidata talk London WMF UK
2014 10-11 Wikidata talk London WMF UK
 
Web Driven Revolution For Library Data
Web Driven Revolution For Library DataWeb Driven Revolution For Library Data
Web Driven Revolution For Library Data
 
VALA Tech Camp 2017: Intro to Wikidata & SPARQL
VALA Tech Camp 2017: Intro to Wikidata & SPARQLVALA Tech Camp 2017: Intro to Wikidata & SPARQL
VALA Tech Camp 2017: Intro to Wikidata & SPARQL
 
2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk Cambridge2014-02-27 Wikidata talk Cambridge
2014-02-27 Wikidata talk Cambridge
 
Digital Narratives for Transylvania DH
Digital Narratives for Transylvania DHDigital Narratives for Transylvania DH
Digital Narratives for Transylvania DH
 
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
 
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
Arjun@WISE-2020 : Encoding Knowledge Graph Context in an Attentive Neural Net...
 
Wikipedia and Civic Engagement
Wikipedia and Civic EngagementWikipedia and Civic Engagement
Wikipedia and Civic Engagement
 
Creating Narrative with Digital Objects
Creating Narrative with Digital ObjectsCreating Narrative with Digital Objects
Creating Narrative with Digital Objects
 
Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)Entity Retrieval (SIGIR 2013 tutorial)
Entity Retrieval (SIGIR 2013 tutorial)
 

Recently uploaded

Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
mikehavy0
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
q6pzkpark
 
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
aqpto5bt
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
yulianti213969
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
23050636
 
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontangobat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
siskavia95
 

Recently uploaded (20)

Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
一比一原版(ucla文凭证书)加州大学洛杉矶分校毕业证学历认证官方成绩单
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
 
社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshare
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
What is Insertion Sort. Its basic information
What is Insertion Sort. Its basic informationWhat is Insertion Sort. Its basic information
What is Insertion Sort. Its basic information
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
 
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontangobat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
 

Up and running with Wikidata

  • 1. Up and running with Wikidata Emw New York City Wikidata workshop 2014-12-14
  • 2. Wikidata is a free linked database that can be read and edited by humans and machines.
  • 3. Wikidata's goals ● Centralize interwiki links ● Centralize infoboxes ● Provide an interface for rich queries ● Structure the sum of all human knowledge
  • 4. What you'll learn from this talk ● How to edit Wikidata ● How to classify ● Ideas for small projects ● Wikidata vocabulary ● Where to find things ● Awesome tools
  • 5. Elements of a Wikidata statement
  • 6. Example: New York City (Q60)
  • 7. Items and properties ● Each item and property has its own page ● Items – Represent subjects: Douglas Adams, Challenger disaster – Have identifiers like Q42, Q921090 – 12,906,291 items ● Properties – Represent attribute names: occupation, has cause – Have identifiers like P106, P828 – 1,329 properties
  • 8. Statements and claims ● Claims – Claims are “triplets” ● Formally: subject, predicate, object ● In Wikidata: item, property, value ● Example: Douglas Adams, occupation, author ● Statements – A claim is only part of a statement – Statements also include: ● References ● Ranks
  • 9. Qualifiers, ranks, references ●Qualifiers – Qualifiers are properties used on claims rather than items – “Yonkers population 12,733 at time (P585) 1860” ●Ranks – Preferred, normal, deprecated – Useful to mark outdated claims ●References – Source of claim; provenance – “... stated in (P248) 1860 United States Census”
  • 10. More on Wikidata vocabulary https://www.wikidata.org/wiki/Wikidata:Glossary
  • 11. Finding Wikidata items Wikipedia articles have a Wikidata item link in the left navigation panel.
  • 12.
  • 13. Finding Wikidata items Wikidata search is quick and effective. Instant search suggests items that have labels or aliases matching your keyword.
  • 15. Search by alias: “flu” -> influenza
  • 16. Finding properties ● Is there a property for “number of windows”? ● What was the ID of that property, again? ● Search – In main site search box, prefix search term with “P:” – “P:number of”, “P:occupation” – Instant search doesn't work for properties, only items ● Browse – https://www.wikidata.org/wiki/Wikidata:List_of_properties ^ bookmark this!
  • 18. Walking through edits for: Yonkers, New York https://www.wikidata.org/wiki/Q128114
  • 19. Yonkers TODO https://www.wikidata.org/wiki/Q128114 ● population (P1082) claims for historical table in https://en.wikipedia.org/wiki/Yonkers,_New_York#Demographics ● Include references! “1860 United States Census”, etc. ● To add to item from inbofox: – head of government (P6), office held by head of government (P1313) – date of foundation or creation (P571) – ZIP code (P281)
  • 20. Area? Population density? ● Properties with units (km^2, people/km^2, $) are not yet possible ● “Units” datatype in development ● https://phabricator.wikimedia.org/T65722
  • 21. Tools – Querying: Autolist, by Magnus Manske ● http://tools.wmflabs.org/autolist/autolist1.html – Batch editing: Widar, by Magnus Manske ● https://tools.wmflabs.org/autolist/ – Software framework: Wikidata Toolkit, by Markus Kroetzsch et al. ● https://www.mediawiki.org/wiki/Wikidata_Toolkit ● https://github.com/Wikidata/Wikidata-Toolkit
  • 22. Querying in Wikidata List of politicians who died of a heart attack Pseudo-query: occupation: politician AND cause of death: heart attack occupation: P106 politician: Q82955 cause of death: P509 heart attack: Q12152 Wikidata query in Autolist: claim[106:82955] AND claim[509:12152]
  • 24. Classification on Wikidata ● Taxonomy of knowledge ● Enables powerful inference, novel applications ● Interesting philosophical, design, and engineering issues
  • 25.
  • 26. Tree of Porphyry User:VoiceOfTheCommons, CC-BY-SA 3.0
  • 27. Classes and instances ● Plato is a human is a animal ● Plato instance of human subclass of animal ● Instance: concrete object, individual ● Class: abstract object
  • 28. Classification on Wikidata ● instance of (P31) – rdf:type in RDF and OWL – 11,930,243 usages – Most popular Wikidata property ● subclass of (P279) – “all instances of A are also instances of B” – rdfs:subClassOf in RDF and OWL – 170,571 usages
  • 29. Examples ● USS Nimitz instance of Nimitz-class aircraft carrier Nimitz-class aircraft carrier subclass of aircraft carrier ● 2012 Cannes Film Festival instance of Cannes Film Festival Cannes Film Festival subclass of film festival ● an individual charm quark instance of charm quark charm quark subclass of quark ^ Many “leaf nodes” in Wikidata's taxonomic hierarchy are not instances. (There are no items about individual quarks on Wikidata!) https://www.wikidata.org/wiki/Help:Basic_membership_properties
  • 30. Bad smells Item has many instance of or subclass of claims Items typically satisfy a huge number of instance of claims: ● Fido instance of dog ● Fido instance of English Pointer ● Fido instance of faithful animal ● … Solution: use one class for instance of, put other class knowledge into normal properties ● Fido instance of dog ● Fido breed: English Pointer ● Fido known for: faithfulness ● ...
  • 31. Bad smells subclass of claim that is nonsensical when interpreted as “All instances of A are also instances of B” Example: dog subclass of pet But not all dogs are pets! feral dog subclass of dog true feral dog subclass of pet false :. dog subclass of pet false Solution: put “pet” knowledge about dogs into claim that does not apply to all instances of dog. E.g. “dog has role pet”. (Has role would not be transitive. Also needed: some/all quantifier.)
  • 32. Classification on Wikidata ● Last but not least: part of (P361) – Third basic membership property – Top-level “part-whole” relation ● Instance of, subclass of and part of are all transitive ● Transitive relation: A subclass of B B subclass of C :. A subclass of C https://www.wikidata.org/wiki/Help:Basic_membership_properties
  • 33. Ideas for small projects ● Add data about towns and cities – population (P1082) – head of government (P6) ● Add medical knowledge about historical figures – medical condition (P1050) – cause of death (P509) – manner of death (P1196) ● Add cultural knowledge about works of art – instance of (P31) – creator (P170) – material used (P186) – collection (P195)