Wikipedia and Libraries: Island Hopping the Data Archipelago

This talk, delivered at the Berkeley iSchool Friday Seminars, describes the current state and future of connecting data islands such as VIAF and WorldCat with Wikipedia. Although there is a lot of talk about how the web ought to be linked, VIAFbot serves as a prototype for how bidirectional linking can be imitated by the "link reciprocation method," a creation of the author, Max Klein.

Published in: Technology, Education

  1. 1. Merrilee Proffitt and Max Klein, OCLC Research, August 24, 2012
  2. 2.  45 years old Almost 30K libraries contributing from 170 countries More than 271 M items 1200 employees 21 offices worldwide
  3. 3.  Since 1978 46 people 3 locations (Dublin, San Mateo, Leiden) Pure research  not product R&D  not market research
  4. 4.  Wikipedians still complain about the vector skin
  5. 5.  Although content creation is fast, internal policy progress is glacial, conservative. Consensus model over asynchronous and near-anonymous discussion
  6. 6.  “The free bureaucracy, that anyone can legislate.” ~ San Francisco Wiknic 2012
  7. 7.  Community originated.  27,456 instances 2009 “Linkspam” accusations against OCLC.  Cause: links to Amazon and B&N on the WorldCat page.  Original accuser was banned for being argumentative.
  8. 8.  Crux: Should Wikipedia promote any organization?  Open question in the community
  9. 9.  Disambiguation Collation
  10. 10.  Authority file matching. During creation, used Wikipedia data. 2013: Wikipedia will be promoted to “source” rather than reference.
  11. 11.  English Wikipedia  4,000 instances German Wikipedia  220,000 instances Wikimedia Commons  45,000 instances … Added by hand Rules vary by language
  12. 12. …Load VIAF Data Check Deutsche Wikipedia Edit English Wikipedia
  13. 13.  English Only, for now Targets 260,000 pages  1/16th of English Wikipedia Still won’t be fully synched with Deutsche Wikipedia
  14. 14.  https://github.com/notconfusing/VIAFbot Uses Pywikipediabot In community code review: running within the next month
  15. 15.  Transclusion & Sugarcoated HTML
  16. 16.  Transclusion  You can draw in text from other pages (typically templates)  Can send parameters Templates can perform  Simple logic operations  Simple text manipulation Still Wikitext, not fully query-able
  17. 17. “The way you always thought Wikipedia worked.”~Merrilee Proffitt
  18. 18.  Phase 1  Revamping interlanguage links Phase 2  Data, Templates and Infoboxes Phase 3  Semantic querying
  19. 19.  Now: Added by hand or bot  Soon: Wikidata concept page
  20. 20.  Soon: Properties for a concept
  21. 21.  Soon: This won’t be a monumental effort.
  22. 22.  The end of the assumption that Wikipages store Wikitext. On Wikidata they store JSON.
  23. 23.  All the work VIAFbot is doing will be accessible across 270 Wikis. Plus language specific lookup…
  24. 24.  RDF Data
  25. 25.  Backers: Google, Paul Allen Institute for Artificial Intelligence, Gordon and Betty Moore Foundation. Release Date: January 2013 Caveat: Requires adoption by each individual language wiki – by consensus. Wikipedias having found consensus so far: …
  26. 26.  Hungarian Wikipedia
  27. 27.  Bibliographic data is both:  An element of citation  An article in its own right
  28. 28. • 411,274 citations of books • 244,236 citations of journals • 57,868 citations of encyclopedias • 342,470 of newspapers • 1,055,845 total print citations • 1,169,495 citations of web http://en.wikipedia.org/wiki/User:Maximilianklein/Citations
  29. 29. • 154,978 citations of Google Books • 38,328 citations of Amazon • 7,695 citations of WorldCat http://webempires.org/wikirank-wikipedias-top-sources/wiki_top/ • Must make it easier to link to libraries.
  30. 30.  Wikipedia features bidirectional linking.  Take links forward all the time, why not backwards?
  31. 31.  Could add “what cites this” What cites this
  32. 32.  A Wikipedia article could be a good way of declaring the aboutness of a record.~Asaf Bartov (User:Ijon)
  33. 33. links to
  34. 34.  Could add “what’s about this” What’s about this
  35. 35. What’s about this
  36. 36.  Dream  Take your browser history
  37. 37.  Would still have to create bidirectional links between WorldCat and Wikipedia
  38. 38.  There is the practical solution. VIAFbot is the prototype of the link reciprocation solution
  39. 39.  Have to gain Wikipedia approval to reciprocate links with a bot  Subject to community approval Requires maintenance  Can become unsynchronized
  40. 40.  Seaplanes: Imitated bidirectional  Islands: Wikipedia, VIAF, WorldCat  Data Archipelago
  41. 41. Max Klein and Merrilee Proffitt, @notconfusing and @merrileeiam
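
The load, check, edit loop on slide 12 and the link reciprocation idea on slides 38 and 39 can be sketched roughly as follows. This is not VIAFbot's actual code (that lives in the GitHub repository on slide 14 and uses Pywikipediabot); it is a minimal, offline illustration. The function names, the `{{Authority control|VIAF=...}}` template syntax, the skip rules, and the placeholder identifier are all assumptions made for the example.

```python
import re
from typing import Optional

# Assumed template syntax for illustration: {{Authority control|VIAF=12345}}
AUTHORITY_RE = re.compile(
    r"\{\{\s*Authority control\s*(\|[^}]*)?\}\}", re.IGNORECASE
)
VIAF_PARAM_RE = re.compile(r"\|\s*VIAF\s*=\s*(\d+)")


def existing_viaf(wikitext: str) -> Optional[str]:
    """Return the VIAF id already recorded in the page, if any."""
    template = AUTHORITY_RE.search(wikitext)
    if template is None:
        return None
    param = VIAF_PARAM_RE.search(template.group(0))
    return param.group(1) if param else None


def reciprocate(
    wikitext: str, viaf_id: str, german_viaf: Optional[str] = None
) -> Optional[str]:
    """Return new wikitext with a link back to VIAF, or None to skip.

    viaf_id     -- id of the VIAF record that already links to this page
    german_viaf -- id found on the German-language page, if one was checked
    """
    # Hypothetical rule: if the two sources disagree, leave it to a human.
    if german_viaf is not None and german_viaf != viaf_id:
        return None
    # Hypothetical rule: never overwrite a template someone added by hand.
    if AUTHORITY_RE.search(wikitext):
        return None
    return wikitext.rstrip("\n") + "\n{{Authority control|VIAF=" + viaf_id + "}}\n"


if __name__ == "__main__":
    # Placeholder page text and id, not real data.
    page = "'''Example Author''' was a writer.\n"
    print(reciprocate(page, "12345"))
```

Keeping the edit decision in a pure function that returns either new text or `None` makes it possible to dry-run the bot over a dump before any page is saved, which may help with the community approval step mentioned on slide 39.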
