The Metadata Ecosystem
Moving Records between Wikipedia
 and the German National Library


            Mathias Schindler
Presentation at GLAM-WIKI by Mathias Schindler.
http://glam.wikimedia.org.au


  1. The Metadata Ecosystem: Moving Records between Wikipedia and the German National Library
     Mathias Schindler, Wikimedia Germany e.V.
     GLAM-Wiki, August 7, 2009, Australian War Memorial, Canberra
  2. German language Wikipedia
     • 930,000 articles, 277,000 of them biographies (~30%)
     • 450 million words of text
     • 400 to 500 new articles per day
     • 1,000 people with more than 100 edits per month (pretty stable since Oct 2006)
     • 90% of the articles larger than 0.5 Kbytes, 50% of the articles larger than 2 Kbytes
  3. 2004 Wikipedia CD-ROM
     • "Directmedia publishing", a company in Berlin, released a CD-ROM with the text of the German language edition of Wikipedia
     • Introduction of Persondata template
     • Correct lexicographic sorting
  4. {{Personendaten}} template
     {{Personendaten
     |NAME=Blanchett, Cate
     |ALTERNATIVNAMEN=Blanchett, Catherine Elise (Geburtsname)
     |KURZBESCHREIBUNG=australische Filmschauspielerin
     |GEBURTSDATUM=14. Mai 1969
     |GEBURTSORT=[[Melbourne]], Australien
     |STERBEDATUM=
     |STERBEORT=
     }}
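To illustrate how the fields of the {{Personendaten}} template shown on this slide could be read by a program, here is a minimal sketch in Python. The function name `parse_personendaten` and the `SAMPLE` constant are hypothetical and not part of any Wikimedia tool. The sketch assumes one field per line and no nested templates inside the field values, which holds for the slide's example but not necessarily for every article.

```python
import re

# The example from the slide, one field per line.
SAMPLE = """{{Personendaten
|NAME=Blanchett, Cate
|ALTERNATIVNAMEN=Blanchett, Catherine Elise (Geburtsname)
|KURZBESCHREIBUNG=australische Filmschauspielerin
|GEBURTSDATUM=14. Mai 1969
|GEBURTSORT=[[Melbourne]], Australien
|STERBEDATUM=
|STERBEORT=
}}"""


def parse_personendaten(wikitext):
    """Return the fields of the first {{Personendaten}} block as a dict.

    Returns None when the text contains no such block. Assumes that no
    field value contains a nested template (no "}}" inside a value).
    """
    match = re.search(r"\{\{Personendaten(.*?)\}\}", wikitext, re.DOTALL)
    if match is None:
        return None
    fields = {}
    # Split only on a pipe that starts a line, so a pipe inside a
    # [[link|label]] value does not end the field early.
    for part in re.split(r"\n\s*\|", match.group(1)):
        if "=" not in part:
            continue
        key, value = part.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for key, value in parse_personendaten(SAMPLE).items():
        print(f"{key}: {value!r}")
```

Empty fields such as STERBEDATUM come back as empty strings, so a caller can distinguish "field present but blank" from "field missing".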
  5. {{PND}}
     • {{PND|129968471}}
     • {{Normdaten|PND=118529579|LCCN=n/79/22889|VIAF=75121530|SELIBR=184709}}
       – http://d-nb.info/gnd/118529579
       – http://errol.oclc.org/laf/n79-22889.html
       – http://viaf.org/75121530
       – http://libris.kb.se/auth/184709
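The mapping from the {{Normdaten}} parameters to the four links on this slide can be sketched as follows. The URL patterns are copied from the slide and may no longer resolve today. The function names and the `EXAMPLE` constant are hypothetical, and the LCCN rewriting rule (`n/79/22889` becomes `n79-22889`) is an assumption inferred from the single example shown.

```python
# URL patterns as they appear on the slide; "{}" is replaced by the identifier.
URL_PATTERNS = {
    "PND": "http://d-nb.info/gnd/{}",
    "LCCN": "http://errol.oclc.org/laf/{}.html",
    "VIAF": "http://viaf.org/{}",
    "SELIBR": "http://libris.kb.se/auth/{}",
}

EXAMPLE = (
    "{{Normdaten|PND=118529579|LCCN=n/79/22889|"
    "VIAF=75121530|SELIBR=184709}}"
)


def lccn_for_url(value):
    """Rewrite 'prefix/year/serial' as 'prefixyear-serial'.

    Assumption: inferred from the slide's one example, not from a spec.
    """
    prefix, year, serial = value.split("/")
    return f"{prefix}{year}-{serial}"


def normdaten_links(template):
    """Map each known identifier in a {{Normdaten}} call to its URL."""
    text = template.strip()
    if not (text.startswith("{{Normdaten") and text.endswith("}}")):
        raise ValueError("not a {{Normdaten}} template call")
    body = text[len("{{Normdaten"):-2]
    links = {}
    for part in body.split("|"):
        if "=" not in part:
            continue
        key, value = (piece.strip() for piece in part.split("=", 1))
        if key not in URL_PATTERNS or not value:
            continue  # unknown or empty parameter: skip it
        if key == "LCCN":
            value = lccn_for_url(value)
        links[key] = URL_PATTERNS[key].format(value)
    return links


if __name__ == "__main__":
    for name, url in normdaten_links(EXAMPLE).items():
        print(name, url)
```

Keeping the patterns in one dictionary means a further authority file could be supported by adding a single entry, without touching the parsing code.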
  6. Special:Booksources (current)
  7. German National Library
     • Integration of PND data since 2005
     • Feedback page at Wikipedia
     • Staff at the national library is parsing this list
     • Workshop with Wikipedians in 2008
  8. Upcoming work
     • Let Wikipedians help maintain the authority file a bit more directly
     • Let Wikipedians create authority file entries on people who should have one
     • Integrate bibliographic data into Wikipedia
     • Create services to help users find sources faster
     • Template:ISBN
  9. Geolocation
     http://www.mozilla.com/en-US/firefox/geolocation/
  10. Geo-tagging
  11. Database links
  12. How to access the data
     • Download full dataset at download.wikipedia.org
     • Parser tools like extraktor.pl
     • Wikimedia Toolserver
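As a rough illustration of working with a downloaded dataset, the sketch below streams through a MediaWiki XML export and lists the pages whose text uses a given template. It is not extraktor.pl and does not reproduce how that tool works. The function names, the tiny in-memory `SAMPLE_DUMP`, and the namespace URI in it are assumptions made for the example; a real dump is far larger and would be opened as a (possibly compressed) file instead.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a real dump file (hypothetical content and namespace).
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Cate Blanchett</title>
    <revision>
      <text>{{Personendaten|NAME=Blanchett, Cate}}</text>
    </revision>
  </page>
  <page>
    <title>Melbourne</title>
    <revision>
      <text>Stadt in Australien</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) for each <page>, without loading the whole file."""
    title, text = None, ""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children to save memory


def titles_using_template(stream, template_name):
    """Yield titles of pages whose wikitext contains '{{template_name'."""
    marker = "{{" + template_name
    for title, text in iter_pages(stream):
        if marker in text:
            yield title


if __name__ == "__main__":
    for title in titles_using_template(io.BytesIO(SAMPLE_DUMP), "Personendaten"):
        print(title)
```

Streaming with `iterparse` is chosen here because a full-text dump is too large to parse into memory in one piece; the substring test for the template is deliberately crude and would also match longer template names that start with the same characters.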
  13. Thank you
     Mathias Schindler, Wikimedia Deutschland
     mathias.schindler@wikimedia.de
