Digital Odyssey 2012: Open Data


Published in: Education, Technology

  1. 1. Open Data is Dead! Long Live Open Data! MJ Suhonos, June 8, 2012
  2. 2. ① The web and openness
  3. 3. 2009: The Next Web
      • TED talk on the 20th anniversary of the WWW
      • Idea of WWW born of frustration
      • Unrealized potential due to incompatibility
      • Virtual documentation system on the Internet
  4. 4. “vague, but exciting”
  5. 5. A new way of thinking
      • CD-ROMs already had isolated hyperlinking
      • Later done “on the side, as a play project”
      • Made everything openly and freely available
  6. 6. A grassroots movement
      • People started doing things that weren’t imagined originally
      • Network effect: more involvement = more new, interesting, useful things
      • Most valuable thing was the community
  7. 7. Openness Movements
      • About community and culture building
      • Based around a new way of thinking
      • Facilitated by a new technology
  8. 8. Openness Movements
      • Open Access: 1997 (SPARC)
      • Open Source: 1998 (Open Source Summit)
  9. 9. Old ideas rebooted
      • Both actually go back to about 1910
      • New movements based on the idea of non-rivalry (digital reproduction)
      • Facilitated by the Internet and WWW
  10. 10. The value of data
      • Data is only useful when someone does something with it
      • No data = zero possibilities
      • More unrealized potential
  11. 11. Raw Data Now!
  12. 12. Gold stars of Open Data
      1. Make your stuff openly available on the web ★
      2. Make it available as structured data ★★ (e.g. Excel instead of PDF)
      3. Use a non-proprietary format ★★★ (e.g. CSV instead of Excel)
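The move from the second star to the third (structured data in a non-proprietary format) can be sketched with Python's standard `csv` module. This is only an illustration: the rows, the field names, and the `to_csv` helper are hypothetical and do not come from the talk.

```python
import csv
import io

# Hypothetical catalogue rows; the fields are illustrative only.
rows = [
    {"id": "1", "title": "Weaving the Web", "year": "1999"},
    {"id": "2", "title": "Free Culture", "year": "2004"},
]


def to_csv(records, fieldnames):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows, ["id", "title", "year"])
print(csv_text)
```

Plain CSV text like this can be opened by any tool, which is the point of the third star.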
  13. 13. 2010: TPL Open Data
      • First project was to submit the entire catalogue to the Internet Archive
      • 2.5 million MARC records, about 2 GB
  14. 14. Open catalogue data
      • 2/3 stars for binary MARC format ★★
      • Downloaded 89 times since 2010
      • U of T: 5400 times; UPEI: 2900 times
      • TPL is hands-off: no updates, no license
  15. 15. 2009-2010
  16. 16. OCLC record use policy
      • Trying to protect their business model by preventing sharing
      • Deliberately exploited uncertainty of legality
      • Librarians argued vocally for public domain
      • Policy retracted and changed (not defensible)
  17. 17. Circling the wagons
      • Libraries have the power to fight back
      • Best counter-strategy is to release the data
      • Need the ability to work together somehow
  18. 18. ② Linked Data
  19. 19. Linked Data
      • Technical framework for data interoperability
      • A common language for sharing data and relations online
      • More unrealized potential due to massive incompatibility & “siloing”
  20. 20. A new way of thinking
      • Fundamentally differs from conceptualization underlying data formats of the 20th century
      • From concept of “records” as bounded sets, to an unbounded set of “statements”
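The shift from bounded records to an open-ended set of statements can be sketched as follows. This is a minimal illustration, not the speaker's method: the record, the `book:1` identifier, and the `record_to_statements` helper are all hypothetical.

```python
def record_to_statements(subject, record):
    """Flatten one fixed-layout record into (subject, property, value) statements."""
    return [(subject, field, value) for field, value in record.items()]


# A hypothetical "record": a bounded set of fields decided in advance.
record = {"title": "Weaving the Web", "creator": "Tim Berners-Lee"}

# The same facts as a set of independent statements.
statements = set(record_to_statements("book:1", record))

# Anyone can add a statement about the same thing later,
# without changing any record layout.
statements.add(("book:1", "subject", "World Wide Web"))

about_book = sorted(s for s in statements if s[0] == "book:1")
print(about_book)
```

The record fixes its fields up front, while the set of statements simply grows as new facts arrive.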
  21. 21. Based on a new technology
      • Same principles and mechanisms as WWW
        – URIs for names, HTTP for retrieval, plus RDF
      • Still organized facts about things, but infinitely more flexible structure
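As a rough sketch of "URIs for names, plus RDF", the snippet below writes one fact as a line of N-Triples, a simple plain-text RDF serialization. The `example.org` URI is a placeholder, the choice of the Dublin Core `title` property is an assumption, and the `ntriple` helper is hypothetical. It handles only basic literal escaping, so it is not a complete N-Triples writer.

```python
def escape_literal(text):
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def ntriple(subject, predicate, obj, obj_is_uri=False):
    """Format one statement as an N-Triples line (plain literals and URIs only)."""
    object_term = f"<{obj}>" if obj_is_uri else f'"{escape_literal(obj)}"'
    return f"<{subject}> <{predicate}> {object_term} ."


line = ntriple(
    "http://example.org/book/1",        # placeholder URI naming the thing
    "http://purl.org/dc/terms/title",   # Dublin Core "title" property
    "Weaving the Web",
)
print(line)
```

Because the subject is an HTTP URI, the same name that identifies the thing can also be used to retrieve more statements about it.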
  22. 22. “vague, but exciting”
  23. 23. Why Linked Data?
      • Breaking data out of silos by pointing to and linking between other databases
      • Formulate questions for which no answer exists on the current WWW
      • Anyone can contribute unique expertise in a form that can be reused and recombined
  24. 24. “The coolest thing to do to your data will be thought of by someone else.”
  25. 25. ③ Open Data
  26. 26. Open Data
      • Legal and policy framework for data interoperability
      • Clarifies the terms and purposes of data use
      • Allows for a spectrum of licensing options
        – see Creative Commons
  27. 27. Open Data definition
      “freely usable, reusable and redistributable, subject, at most, to the requirements to attribute and share-alike”
  28. 28. Database hugging
      • People don’t want to let go of their data:
        – until it’s perfect or complete or “finished”
        – because data is raw and unpolished and ugly
        – because “we know better than everyone else”
        – something unforeseeably terrible might happen
  29. 29. Misconception #1
      • Open Data will destroy/compromise quality
        – Already a lot of high-quality data being created outside of libraries
        – Our MARC records aren’t actually that great
  30. 30. Misconception #2
      • Open Data will reveal our mistakes/problems
        – everyone’s data is messy; that’s its nature
        – what if someone were able to clean it up for you?
  31. 31. Misconception #3
      • Open Data will facilitate competition
        – new and useful tools are good, even ones that involve money
        – what if someone does a better job with our data than we do?
  32. 32. Misconception #4
      • Open Data is a loss of control
        – if you deliberately make it available, you can set the (legal) terms of its use
        – requires thinking about / dealing with legal stuff
  33. 33. An increasing trend
      • 2012: Canada Post Files Copyright Lawsuit Over Crowd-sourced Postal Code Database
        1. take down the openly-licensed database
        2. pay damages on lost business ($5500/year)
  34. 34. New library business model
      1. Sell access to library catalogue data
      2. Sue every organization that makes bibliographic data available for free (e.g. Internet Archive, Amazon, Library of Congress)
      3. Profit!
  35. 35. Open Data vs. Linked Data
      • Open Data does not have to be Linked Data
      • Linked Data does not have to be Open
      • But the potential of both is best realized when data is published as Open Linked Data
  36. 36. Open Linked Data [diagram labels: Linked Data, Open Data]
  37. 37. Gold stars of Open Linked Data
      1. Make your stuff openly available on the web ★
      2. Make it available as structured data ★★
      3. Use a non-proprietary format ★★★
      4. Use URIs to identify your things ★★★★
      5. Link to other people’s things using URIs ★★★★★
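The five-star checklist can be read as a simple scoring rule. The sketch below assumes the stars are cumulative, so each star requires all the earlier ones; the slide implies this but does not state it. The `star_rating` function and its parameter names are hypothetical.

```python
def star_rating(on_web, structured, open_format, uses_uris, links_out):
    """Count stars in order, stopping at the first criterion that is not met."""
    stars = 0
    for criterion in (on_web, structured, open_format, uses_uris, links_out):
        if not criterion:
            break
        stars += 1
    return stars


# Binary MARC on the web: structured, but not in a non-proprietary text format.
binary_marc = star_rating(True, True, False, False, False)

# Data that meets every criterion on the list.
open_linked = star_rating(True, True, True, True, True)

print("★" * binary_marc)
print("★" * open_linked)
```

Under this reading, a binary MARC dump earns two stars, which matches the rating given for the TPL catalogue data earlier in the deck.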
  38. 38. ④ Libraries & The Semantic Web
  39. 39. 2011: Library Linked Data
      • W3C Library Linked Data incubator group
      • Panel of invited librarians, academics, experts
      • “to help increase global interoperability of library data on the Semantic Web”
      • Final report produced October 2011
  40. 40. A struggle for relevancy
      • "library" = all cultural heritage & memory institutions (archives, museums)
      • Natural extension to the collaborative sharing models historically employed by libraries
      • In a position to provide trusted metadata for resources of long-term cultural importance
  41. 41. Major goals for libraries
      1. Foster discussion about Open Data and rights management issues
      2. Develop library standards that are compatible with Linked Data
      3. Apply library experience in curation and long-term preservation to Open Linked Data
  42. 42. A discussion about Open Data
      • Data can have unclear and untested rights issues that hinder their release as Open Data
      • Seek agreement with owners about licensing; consider the impact of usage restrictions
      • Establish institutional policies for data sharing and licensing
  43. 43. Issues with library standards
      • Data is expressed primarily in natural-language text
      • Technology changes depend on vendor systems development
      • Data is not integrated with web resources
      • Designed only for the library community
  44. 44. Benefits of Open Linked Data
      • Will be able to use mainstream solutions
      • Can give libraries a wider choice of vendors and developers to recruit from and interact with
      • Much larger community to provide IT support
      • Smaller institutions can make themselves more visible and connected
  45. 45. Already going mainstream
      • National libraries of Sweden, Hungary, Germany, France, the British Library, L of C
      • BNB: 2.6 million records as 85 million RDF statements, public domain license
      • Cities of Vancouver, Edmonton, Ottawa, and Toronto have created grassroots @g4open
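The BNB figures give a rough sense of how many statements one record becomes. The calculation below uses only the two numbers on the slide and treats them as exact, which they are not; the variable names are illustrative.

```python
# Figures from the slide, both rounded in the original.
bnb_records = 2_600_000
bnb_statements = 85_000_000

statements_per_record = bnb_statements / bnb_records
print(f"about {statements_per_record:.1f} statements per record")
```

That works out to roughly 33 statements per bibliographic record.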
  46. 46. ⑤ In Summary
  47. 47. Now is the time
      • Missed opportunities before
      • Don’t often get a second chance
      • Major opportunity here for libraries to catch up and become leaders online
  48. 48. Open Data Now!
      • Remember the 5 stars of Open Linked Data
      1. Choose a license, keep control of the rights
      2. Release the data – just get it out there
  49. 49. Thanks! @mjsuhonos mj@suhonos.ca