Metadata in Wikipedia

Presentation by Daniel Kinzler about Metadata and Wikipedia at the DC-2008 Wikimedia Workshop on User Generated Metadata

Published in: Technology, Education

Transcript of "Metadata in Wikipedia"

  1. Metadata in Wikipedia
     Daniel Kinzler, Wikimedia Deutschland e.V.
     September 26, 2008
     Outline:
       Wikipedia
       Traditional Metadata: Document and Revision, Media Metadata, Accessing Metadata
       Link Structure: Hyperlinks, Categories, Inter-Language Links, WikiWord
       Structured Data: Records, Infoboxes, DBPedia, Semantic MediaWiki, WikiData
       Conclusion: We Have, We Need, Thank You
  2. Wikipedia
     Wikipedia is the free encyclopedia anyone can edit
     Founded in 2001
     Has become the standard online reference
     Number 8 website (Alexa), 50K requests per second
     Exists in 250 languages, has 10 million articles
     Run by Wikimedia, runs on MediaWiki
     Free content, free software
  3. Document Metadata
     Traditional (document) metadata is available throughout Wikipedia
     Document information: Title, URL
     Revision information: Author, Timestamp
  4. Document Metadata
     Metadata for media files is maintained on-page, as content:
     Source, License, Contributors, ...
  5. Images Metadata
     Metadata for image formats
       Resolution
       EXIF
         Author, Copyright
         Timestamp
         Exposure, Aperture, Flash
         Camera model
         ...
     Metadata for audio and video formats is not yet supported.
  6. Online Export Interface
     MediaWiki's page export facility provides limited metadata
       Special:Export
       Pages and revisions
       XML wrapper around wikitext
       Some basic metadata
  7. Online Export Interface XML
     http://en.wikipedia.org/wiki/Special:Export/Berlin

     <page>
       <title>Berlin</title>
       <id>3354</id>
       <revision>
         <id>240627831</id>
         <timestamp>2008-09-24T06:44:58Z</timestamp>
         <contributor>
           <username>Ling.Nut</username>
           <id>1929579</id>
         </contributor>
         <minor/>
         <comment>clean up, typos fixed</comment>
         <text xml:space="preserve">
     {{pp-semi-protected|small=yes}}
     {{otheruses1|the capital of Germany}}
     {{Infobox German Bundesland
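The basic metadata in a Special:Export document can be read with a standard XML parser. The following is a minimal sketch, not MediaWiki's own tooling: the sample string is abbreviated from the slide, the `<mediawiki>` root is written without the XML namespace that real exports declare (an assumption made to keep the example short), and `export_metadata` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample modelled on the Special:Export/Berlin output on the slide.
# Assumption: the XML namespace of real exports is omitted, and the page text
# is shortened to a single template call.
SAMPLE_EXPORT = """
<mediawiki>
  <page>
    <title>Berlin</title>
    <id>3354</id>
    <revision>
      <id>240627831</id>
      <timestamp>2008-09-24T06:44:58Z</timestamp>
      <contributor>
        <username>Ling.Nut</username>
        <id>1929579</id>
      </contributor>
      <minor/>
      <comment>clean up, typos fixed</comment>
      <text xml:space="preserve">{{Infobox German Bundesland}}</text>
    </revision>
  </page>
</mediawiki>
"""


def export_metadata(xml_text):
    """Return one metadata dict per <page> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    records = []
    for page in root.iter("page"):
        rev = page.find("revision")
        records.append({
            "title": page.findtext("title"),
            "page_id": int(page.findtext("id")),
            "revision_id": int(rev.findtext("id")),
            "timestamp": rev.findtext("timestamp"),
            "author": rev.findtext("contributor/username"),
            # <minor/> is an empty flag element: present means "minor edit".
            "minor": rev.find("minor") is not None,
            "comment": rev.findtext("comment"),
        })
    return records


if __name__ == "__main__":
    for record in export_metadata(SAMPLE_EXPORT):
        print(record)
```

Everything beyond these fields (links, categories, infobox values) sits inside the wikitext of `<text>` and needs separate parsing.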
  8. MediaWiki Web API
     MediaWiki's web API for bots/scripts
       api.php
       supports complex queries
       lots of properties
       several output formats (JSON, YAML, WDDX, ...)
       but no RDF
  9. MediaWiki Web API XML
     http://en.wikipedia.org/w/api.php?action=query&titles=Berlin&prop=info|revisions&rvlimit=5&format=xml

     <page pageid="3354"
           ns="0"
           title="Berlin"
           touched="2008-09-24T06:44:58Z"
           lastrevid="240627831"
           counter="2317"
           length="91446">
       <revisions>
         <rev revid="240627831"
              minor=""
              user="Ling.Nut"
              timestamp="2008-09-24T06:44:58Z"
              comment="clean up, typos fixed" />
         <rev revid="239984512"
              user="Lear 21"
              timestamp="2008-09-21T12:03:45Z"
              comment="/* Transportation */ ref" />
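The query on this slide can be assembled and its response read in a few lines. This is a sketch under stated assumptions: it only builds the URL and parses a stored sample, with no network access; the `<api><query><pages>` wrapper around the slide's fragment and the helper names `build_query_url` and `revisions` are added for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_ENDPOINT = "http://en.wikipedia.org/w/api.php"  # endpoint shown on the slide


def build_query_url(title, limit=5, fmt="xml"):
    """Build the page-info and revision-history query from the slide."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "info|revisions",  # '|' separates multiple properties
        "rvlimit": limit,
        "format": fmt,
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Sample response based on the slide; the wrapper elements are an assumption.
SAMPLE_RESPONSE = """
<api><query><pages>
  <page pageid="3354" ns="0" title="Berlin" touched="2008-09-24T06:44:58Z"
        lastrevid="240627831" counter="2317" length="91446">
    <revisions>
      <rev revid="240627831" minor="" user="Ling.Nut"
           timestamp="2008-09-24T06:44:58Z" comment="clean up, typos fixed" />
      <rev revid="239984512" user="Lear 21"
           timestamp="2008-09-21T12:03:45Z" comment="/* Transportation */ ref" />
    </revisions>
  </page>
</pages></query></api>
"""


def revisions(xml_text):
    """Return the attributes of every <rev> element as plain dicts."""
    root = ET.fromstring(xml_text)
    return [dict(rev.attrib) for rev in root.iter("rev")]


if __name__ == "__main__":
    print(build_query_url("Berlin"))
    for rev in revisions(SAMPLE_RESPONSE):
        print(rev["timestamp"], rev["user"], rev["comment"])
```

Note that `urlencode` percent-encodes the `|` separator as `%7C`, which is equivalent to the literal form in the slide's URL.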
  10. MediaWiki RDF Extension
      The RDF Extension provides access to metadata
        Per-page RDF output
        Document info mainly in DC and CC vocab
        Also links, categories, images, etc
        Output in XML, Turtle or NTriples
        Supports custom RDF embedded in wiki pages
        Compare http://www.communitywiki.org/en/DublinCoreForWiki
        Not on Wikipedia, used by WikiTravel
  11. MediaWiki RDF Extension XML
      http://wikitravel.org/en/Special:Rdf/Berlin

      <rdf:Description
          rdf:about="http://wikitravel.org/en/Berlin">
        <dc:date
            rdf:datatype="http://purl.org/dc/elements/1.1/W3CDTF">
          2008-09-23T18:04:01Z
        </dc:date>
        <dc:rights>
          Creative Commons Attribution-ShareAlike 1.0
        </dc:rights>
        <dc:title xml:lang="en">
          Berlin
        </dc:title>
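Dublin Core properties in such RDF/XML output can be collected with a namespace-aware parse. The sketch below uses only the standard library and assumes the flat `rdf:Description` layout shown on the slide; the enclosing `rdf:RDF` element with its namespace declarations and the helper name `dublin_core` are added for illustration. A full RDF library would be the more robust choice for arbitrary RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Sample based on the slide; the rdf:RDF wrapper is an assumption.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wikitravel.org/en/Berlin">
    <dc:date rdf:datatype="http://purl.org/dc/elements/1.1/W3CDTF">
      2008-09-23T18:04:01Z
    </dc:date>
    <dc:rights>
      Creative Commons Attribution-ShareAlike 1.0
    </dc:rights>
    <dc:title xml:lang="en">
      Berlin
    </dc:title>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core(xml_text):
    """Map each described resource URI to its Dublin Core properties."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{" + DC_NS + "}"
    result = {}
    for desc in root.iter("{" + RDF_NS + "}Description"):
        subject = desc.get("{" + RDF_NS + "}about")
        props = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(dc_prefix):
                local_name = child.tag[len(dc_prefix):]
                props[local_name] = (child.text or "").strip()
        result[subject] = props
    return result


if __name__ == "__main__":
    print(dublin_core(SAMPLE_RDF))
```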
  12. Structural Information
      Wiki pages contain several types of links
      The structure of hyperlinks encodes relations
      Links connect on textual and conceptual level
      Links maintained by users, relations are implicit
  13. Page Links
      Hyperlinks cross-reference pages
      Navigational, but also conceptual
      Mutually linked pages → related concepts
      Link label and link target → word and meaning
      Beware identity crisis when choosing URIs

      [[Berlin Wall|The Wall]]
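The "link label and link target → word and meaning" idea can be illustrated by pulling (label, target) pairs out of wikitext. This is a rough sketch with a regular expression, not MediaWiki's parser: it ignores templates, nested markup and namespace prefixes, and the names `LINK_RE` and `page_links` are hypothetical.

```python
import re

# Matches [[Target]] and [[Target|Label]]; a simplification of real wikitext.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def page_links(wikitext):
    """Return (label, target) pairs for every internal link in the text."""
    pairs = []
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        # Without an explicit label, the target title doubles as the label.
        label = (match.group(2) or target).strip()
        pairs.append((label, target))
    return pairs


if __name__ == "__main__":
    text = "[[Berlin Wall|The Wall]] divided [[Berlin]]."
    for label, target in page_links(text):
        print(repr(label), "->", repr(target))
```

Category and inter-language links use the same bracket syntax, so a real extractor would have to separate them by their prefix.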
  14. Category Links
      Pages are assigned to one or more categories.
      Categories form a poly-hierarchy (by convention)
      Categories of pages → Subsumption of concepts
      Structure often unclear or broken
      No intersection, no transitive inclusion

      [[Category:Capitals in Europe]]
      [[Category:States of Germany]]
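Since the category system offers no transitive inclusion itself, a client has to walk the poly-hierarchy on its own. The following sketch assumes the parent links have already been extracted into a dictionary; the `PARENTS` data is invented for illustration (including a deliberate cycle, since the structure is often broken), and `ancestors` is a hypothetical helper.

```python
def ancestors(category, parents):
    """Return every category reachable by following parent links upward.

    The `seen` set guards against cycles in the category graph.
    """
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical category graph, not taken from Wikipedia.
PARENTS = {
    "Berlin": ["Capitals in Europe", "States of Germany"],
    "Capitals in Europe": ["Capitals", "Europe"],
    "States of Germany": ["Germany"],
    "Germany": ["Europe"],
    "Europe": ["Germany"],  # deliberate cycle to exercise the guard
}

if __name__ == "__main__":
    print(sorted(ancestors("Berlin", PARENTS)))
```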
  15. Inter-Language Links
      Inter-language links refer to the same page in a different language (on another wiki)
      Granularity and coverage differ greatly
      Mutually linked pages probably describe the same concept
      Maintained manually, and by bots
      Would a centralized system be better?

      [[de:Berliner Mauer]]
      [[fr:Mur de Berlin]]
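The heuristic that mutually linked pages probably describe the same concept can be expressed as a check for links in both directions. This sketch assumes the inter-language links are already available as a mapping from (language, title) to the set of pages linked; the `LANGLINKS` data is hypothetical, with one back-link left out on purpose.

```python
def mutual_pairs(langlinks):
    """Return the page pairs that link to each other in both directions."""
    pairs = set()
    for page, targets in langlinks.items():
        for target in targets:
            if page in langlinks.get(target, ()):
                # frozenset makes the pair independent of direction.
                pairs.add(frozenset((page, target)))
    return pairs


# Hypothetical link data keyed by (language code, page title).
LANGLINKS = {
    ("en", "Berlin Wall"): {("de", "Berliner Mauer"), ("fr", "Mur de Berlin")},
    ("de", "Berliner Mauer"): {("en", "Berlin Wall")},
    ("fr", "Mur de Berlin"): set(),  # back-link missing in this made-up data
}

if __name__ == "__main__":
    for pair in mutual_pairs(LANGLINKS):
        print(sorted(pair))
```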
  16. WikiWord
      WikiWord builds a thesaurus by mining the link structure
      Every page describes a concept
      Link labels are terms referring to those concepts
      Links and categories define relations
      Multilingual thesaurus by merging languages
      Export to SKOS
      No web interface yet
      http://brightbyte.de/page/WikiWord
  17. Data Records
      Wikipedia uses templates to present structured data records
      Maintained directly by users
      Template parameters can be extracted
      MediaWiki stores them as plain text
      External mining tools needed

      {{Infobox German Bundesland
      |Name = Berlin
      |image_photo = Cityscapeberlin2006.JPG
      |area = 891.82
      |population = 3416300
      |elevation = 34 - 115
      |GDP = 81.7
      ...
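Extracting template parameters from the plain-text storage can be sketched as follows. This is an illustrative mining routine, not the method of any particular tool: it handles a single, complete template call, tracks `[[...]]` and `{{...}}` nesting so that pipes inside links are not treated as separators, and closes the slide's truncated infobox with `}}` for the sake of the example. The function names are hypothetical.

```python
def split_top_level(body):
    """Split on '|' characters that are not inside [[...]] or {{...}}."""
    parts, current, depth = [], [], 0
    i = 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("[[", "{{"):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("]]", "}}"):
            depth -= 1
            current.append(pair)
            i += 2
        elif body[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(body[i])
            i += 1
    parts.append("".join(current))
    return parts


def template_params(template):
    """Parse one '{{Name|key=value|...}}' call into (name, params)."""
    inner = template.strip()[2:-2]  # drop the outer {{ and }}
    name, *fields = split_top_level(inner)
    params = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if sep:  # positional (unnamed) parameters are skipped in this sketch
            params[key.strip()] = value.strip()
    return name.strip(), params


# Fields from the slide; the closing braces are added for the example.
INFOBOX = """{{Infobox German Bundesland
|Name = Berlin
|image_photo = Cityscapeberlin2006.JPG
|area = 891.82
|population = 3416300
|elevation = 34 - 115
|GDP = 81.7
}}"""

if __name__ == "__main__":
    name, params = template_params(INFOBOX)
    print(name)
    for key, value in params.items():
        print(f"  {key}: {value}")
```

All values come back as strings; deciding that `area` is a number and `elevation` a range is exactly the interpretation work the slide leaves to external tools.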
  18. Infoboxes
      Infoboxes present a terse overview of properties
      Used for cities, animals, bands, books, chemicals, ...
      Qualifiers are problematic: date of measurement, error margin, unit, source, etc
  19. Personendaten
      "Personendaten" are biographic records on the German Wikipedia
      Works like a hidden infobox
      Contains date/place of birth/death, aliases, etc.
      Maintained by a WikiProject
      Automated extraction (every now and then)

      {{Personendaten
      |NAME=Einstein, Albert
      |ALTERNATIVNAMEN=
      |KURZBESCHREIBUNG=Physiker
      |GEBURTSDATUM=14. März 1879
      |GEBURTSORT=[[Ulm]]
      |STERBEDATUM=18. April 1955
      |STERBEORT=[[Princeton (New Jersey)|Princeton]], [[USA]]
      }}
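Because each Personendaten field sits on its own line, an automated extraction can work line by line and then normalise the values. The sketch below is one possible approach, not the extraction actually run for the German Wikipedia: it reads the fields with a regular expression and converts dates of the form "14. März 1879" to `datetime.date`. The helper names are hypothetical, and dates in any other form simply yield `None`.

```python
import re
from datetime import date

GERMAN_MONTHS = {
    "Januar": 1, "Februar": 2, "März": 3, "April": 4, "Mai": 5, "Juni": 6,
    "Juli": 7, "August": 8, "September": 9, "Oktober": 10, "November": 11,
    "Dezember": 12,
}

# One "|KEY=value" field per line, as in the record on the slide.
FIELD_RE = re.compile(r"^\|(\w+)=(.*)$", re.MULTILINE)
DATE_RE = re.compile(r"^(\d{1,2})\.\s*(\w+)\s+(\d{4})$")


def personendaten(wikitext):
    """Return the fields of a Personendaten record as a dict of strings."""
    return {key: value.strip() for key, value in FIELD_RE.findall(wikitext)}


def parse_german_date(text):
    """Convert e.g. '14. März 1879' to a date; return None if it does not fit."""
    match = DATE_RE.match(text.strip())
    if not match or match.group(2) not in GERMAN_MONTHS:
        return None
    day, month_name, year = match.groups()
    return date(int(year), GERMAN_MONTHS[month_name], int(day))


SAMPLE_RECORD = """{{Personendaten
|NAME=Einstein, Albert
|ALTERNATIVNAMEN=
|KURZBESCHREIBUNG=Physiker
|GEBURTSDATUM=14. März 1879
|GEBURTSORT=[[Ulm]]
|STERBEDATUM=18. April 1955
|STERBEORT=[[Princeton (New Jersey)|Princeton]], [[USA]]
}}"""

if __name__ == "__main__":
    record = personendaten(SAMPLE_RECORD)
    print(record["NAME"])
    print(parse_german_date(record["GEBURTSDATUM"]))
    print(parse_german_date(record["STERBEDATUM"]))
```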
  20. DBPedia
      DBPedia is a project that mines RDF triples from Infoboxes
      Allows SPARQL queries
      Multiple languages
      100 million triples
      Web interface
      http://dbpedia.org
  21. DBPedia XML
      http://dbpedia.org/data/Berlin

      <rdf:Description
          rdf:about="http://dbpedia.org/resource/Lothar_Bolz">
        <n0pred:deathPlace xmlns:n0pred="http://dbpedia.org/property/"
            rdf:resource="http://dbpedia.org/resource/Berlin"/>
      </rdf:Description>
      <rdf:Description
          rdf:about="http://dbpedia.org/resource/Alfred_Wegener">
        <n0pred:birthPlace xmlns:n0pred="http://dbpedia.org/property/"
            rdf:resource="http://dbpedia.org/resource/Berlin"/>
      </rdf:Description>
      <rdf:Description
          rdf:about="http://dbpedia.org/resource/Untoten">
        <n0pred:origin xmlns:n0pred="http://dbpedia.org/property/"
            rdf:resource="http://dbpedia.org/resource/Berlin"/>
      </rdf:Description>
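The RDF/XML above is a list of statements whose object is the Berlin resource. As a sketch, the sample can be flattened into (subject, predicate, object) triples with the standard library; the enclosing `rdf:RDF` element and the helper name `triples` are assumptions for illustration. `INCOMING_QUERY` shows what an equivalent SPARQL query might look like; it is only defined here, not sent to any endpoint.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Two of the statements from the slide; the rdf:RDF wrapper is an assumption.
SAMPLE_DBPEDIA = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Lothar_Bolz">
    <n0pred:deathPlace xmlns:n0pred="http://dbpedia.org/property/"
        rdf:resource="http://dbpedia.org/resource/Berlin"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://dbpedia.org/resource/Alfred_Wegener">
    <n0pred:birthPlace xmlns:n0pred="http://dbpedia.org/property/"
        rdf:resource="http://dbpedia.org/resource/Berlin"/>
  </rdf:Description>
</rdf:RDF>
"""

# Illustrative SPARQL for "what points at Berlin?"; not executed in this sketch.
INCOMING_QUERY = """
SELECT ?subject ?predicate WHERE {
  ?subject ?predicate <http://dbpedia.org/resource/Berlin> .
} LIMIT 10
"""


def triples(xml_text):
    """Flatten rdf:Description blocks into (subject, predicate, object) URIs."""
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.iter(RDF + "Description"):
        subject = desc.get(RDF + "about")
        for prop in desc:
            obj = prop.get(RDF + "resource")
            if obj is not None:
                # Turn ElementTree's "{namespace}local" tag into a full URI.
                predicate = prop.tag[1:].replace("}", "", 1)
                found.append((subject, predicate, obj))
    return found


if __name__ == "__main__":
    for subject, predicate, obj in triples(SAMPLE_DBPEDIA):
        print(subject, predicate, obj)
```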
  22. Semantic MediaWiki
      Semantic MediaWiki is a MediaWiki extension:
        Builds an RDF structure
        Allows SPARQL queries
        Users enter semantic relations in wiki syntax
        More complex syntax
        semantic-mediawiki.org
        Not supported by Wikipedia
  23. Semantic MediaWiki XML
      http://semantic-mediawiki.org/wiki/Special:ExportRDF/Berlin

      <swivt:Subject rdf:about="&wiki;Berlin">
        <rdfs:label>Berlin</rdfs:label>
        <swivt:page rdf:resource="&wikiurl;Berlin"/>
        <rdfs:isDefinedBy rdf:resource="&wikiurl;Special:ExportRDF/Berlin"/>
        <rdf:type rdf:resource="&wiki;Category-3ACity"/>
        <property:Capital_of rdf:resource="&wiki;Germany"/>
        <property:Coordinates
            rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
          52° 31′ 0″ N, 13° 24′ 0″ E
        </property:Coordinates>
        <property:Located_in rdf:resource="&wiki;Germany"/>
        <property:Population
            rdf:datatype="http://www.w3.org/2001/XMLSchema#double">
          3391407
        </property:Population>
      </swivt:Subject>
  24. WikiData
      WikiData is a MediaWiki extension:
        Stores structured data separate from wikitext
        Reusable across wikis
        Form-based structured data entry
        No export interface
        omegawiki.org
        Not used by Wikipedia, active on OmegaWiki
  25. We Have
      We have...
        Document Metadata
        Structural Data
        Structured data records
      Lots of people maintaining this
  26. We Need
      We need ways to...
        maintain the data easily.
        store structured data sensibly.
        query the data efficiently.
        access the data conveniently.
      We need people to make it happen.
  27. Thank You
      The End
      http://brightbyte.de/repos/papers/2008/