Diversity and Wikidata

Leveraging data as information
for all languages, cultures and interests
Diversity Conference 2013, Berlin


How to use Wikidata in diverse contexts. How it can help, what it does and what it does not.

Published in: Education, Business, Technology

  1. Diversity and Wikidata: Leveraging data as information for all languages, cultures and interests
  2. Wikidata
     ● Stores semantic data with multi-language support
     ● Had a flying start by servicing the interwiki process
       – The link to WP allows bots to enrich WD
       – Most statements are derived from WP, any WP
       – There are pros and cons for this approach
     ● Terminology:
       – Item
       – Label
       – Link
  3. Issues
     ● Links without labels
     ● Items not referring to “your” language (no labels)
     ● WP articles are not what they seem; lists, for instance
     ● No statements for an item
     ● Statements with no label in “your” language
     ● However, these issues are better than not having data to work with
  4. Using WD for search
     ● WD needs to support the “fat head” of search in each language
     ● Not only for WD but also for WP and Commons
     ● We do not know what people look for
     ● We do not know which labels WD does not serve
  5. User participation
     ● Build upon the data that we have
     ● Indicate what is missing in “their” language
     ● Allow people to translate for the languages they know
     ● Compile a “concept cloud” for each subject
       – Ask for help with missing labels
       – Suggest their use in WP articles
       – Suggest their use in WD
       – Present a basic info-box
  6. Specific subjects
     ● Consider the use of ontologies
       – WP categories ARE an ontology in their own right
       – Link red links to the “concept cloud”
       – Make use of global (i.e. WD) and external ontologies
       – Make the “concept cloud” part of the watchlist
     ● Consider that another WP may be more advanced
     ● Labels may be lacking in “your” language
  7. Some statistics
     ● Most items do not have statements
       – This affects precision, but not the trends
     ● Many items have links and no labels
       – They cannot be found
     ● The basic WD gender ratio is for all and no Wikipedias
       – Please apply your research data!
       – Learn how it affects ratios in other languages
     ● PS: What sex is a eunuch?
  8. More statistics
     ● WMF statistics: not really relevant
     ● Magnus and his existing query functionality
       – Been adding sex info and it affects the numbers
     ● We could do with statistics on failed searches
       – So far it has no priority
  9. Visualisation
     ● It helps people understand what the data is about
     ● It motivates people to add labels and statements
     ● When served to “WD/WP readers”, they get structured data, which may motivate them to write WP articles
  10. Your agenda
     ● Gender ratios are not that relevant; quality information is.
       – Make sure your category of subjects is full of statements
       – Do the subjects you like best first
     ● Make sure that you enjoy yourself
       – Soccer, hockey, swimming, gymnastics
     ● IGNORANCE is imho what we should concentrate on
       – Gapminder type data and visualisation is what we need
  11. Questions?
     User:GerardM
     UltimateGerardM.blogspot.com
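
The statistics slide notes that many items have links but no labels, and the user participation slide suggests indicating what is missing in a given language. A minimal Python sketch of that kind of check follows. It is not taken from the talk and does not use the Wikidata API: the `Item` class, the helper functions, the item IDs and the sample data are all hypothetical, and the data model is simplified for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Simplified, hypothetical stand-in for a Wikidata item."""
    item_id: str
    labels: dict = field(default_factory=dict)  # language code -> label
    links: dict = field(default_factory=dict)   # wiki code -> article title
    statements: int = 0                         # number of statements on the item


def missing_labels(items, language):
    """Return items that have at least one link but no label in `language`."""
    return [it for it in items if it.links and language not in it.labels]


def label_coverage(items, language):
    """Return the share of items (0.0 to 1.0) that have a label in `language`."""
    if not items:
        return 0.0
    return sum(1 for it in items if language in it.labels) / len(items)


def without_statements(items):
    """Return items that carry no statements at all."""
    return [it for it in items if it.statements == 0]


# Invented sample data; the IDs and titles are illustrative only.
items = [
    Item("Q900001", {"en": "Example town", "nl": "Voorbeeldstad"},
         {"enwiki": "Example town"}, 3),
    Item("Q900002", {"en": "Example river"},
         {"enwiki": "Example river", "dewiki": "Beispielfluss"}, 0),
    Item("Q900003", {}, {"frwiki": "Liste d'exemples"}, 0),
    Item("Q900004", {"nl": "Voorbeeldpersoon"}, {}, 1),
]

# Items that a Dutch-speaking user cannot find by label, although links exist.
print([it.item_id for it in missing_labels(items, "nl")])  # ['Q900002', 'Q900003']
print(label_coverage(items, "nl"))                         # 0.5
print(len(without_statements(items)))                      # 2
```

A list like the one returned by `missing_labels` is the sort of input that could be shown to users as "what is missing in their language"; how such a list would actually be produced from Wikidata is left open here.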
