
Where are Repositories Going?

Keynote talk by Sally Rumsey and Ben O'Steen, given at the Repository Fringe 2009, Edinburgh.

Published in: Education, Business, Technology

  1. 1. Repository Fringe, Edinburgh 2009 Where are repositories going? Ben O’Steen (ORA Software Developer) Sally Rumsey (ORA Service & Development Manager)
  2. 2. Growth of repositories & historical parallel
  3. 3. Sir Thomas Bodley
  4. 4. Storage Content
  5. 5. “…and you having built an Ark to save learning from deluge, deserve propriety in any new instrument or engine, whereby learning should be improved or advanced.” Francis Bacon to Thomas Bodley Nov 1605 http://novels.mobi/create/out_mobi/pg/1/2/5/1/12515/12515/4.php
  6. 6. Search function Library catalogue 1620 Reproduced for this presentation with kind permission of King's College London, Foyle Special Collections Library www.kcl.ac.uk/.../exhibitions/marsex/mcoll.html
  7. 7. Jeffrey Keefer http://www.flickr.com/photos/jeffreykeefer/773540725/ CC licence: Attribution-Non-Commercial-Share Alike 2.0 Generic
  8. 8. Radcliffe Science Library 1861 Radcliffe Camera 1749 New Bodleian 1940
  9. 9. Artist’s impression of new Bodleian book depository at Swindon Details may change
  10. 10. Bodleian Stats 2009 8.5M volumes 1.6M visitors each year 65,000 registered readers* 5.4M requests for full-text journal articles 1.8M requests for e-books * 37,000 University card holders plus 28,000 external readers
  11. 11. Library terms and conditions Bodleian Library declaration: I hereby undertake not to remove from the Library, or to mark, deface, or injure in any way, any volume, document or other object belonging to it or in its custody; not to bring into the Library, or kindle therein, any fire or flame, and not to smoke in the Library; and I promise to obey all rules of the Library.
  12. 12. QUOD FELICITER VORTAT ACADEMICI OXONIENS BIBLIOTHECAM HANC VOBIS REIPUBLICAEQUE LITERORUM T.B.P. That it might turn out happily, Oxonian academics, for you and for the republic of lettered men Thomas Bodley placed this library
  13. 13. Growth in numbers of digital repositories Source: Tim Brody. ROAR Registry of Open Access Repositories. http://roar.eprints.org/
  14. 14. Some overarching themes
  15. 15. Theme Realisation as a catalyst for change
  16. 16. Theme Repositories as a concept
  17. 17. Paper in Institutional Paper out repository Repository as a box
  18. 18. Integration with other hard and soft systems enderisnotmyrealname http://www.flickr.com/photos/enderisnotmyrealname/3586300347/ Attribution-Non-Commercial-Share Alike 2.0 Generic
  19. 19. “An effective institutional repository of necessity represents a collaboration among librarians, information technologists, archives and records managers, faculty, and university administrators and policymakers.” Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml
  20. 20. “… a university–based institutional repository is a set of services……an institutional repository is not simply a fixed set of software and hardware.” Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml
  21. 21. Sally Ben
  22. 22. The most successful repository is the internet. Embrace it.
  23. 23. Some pointers then: • Distributed across a number of nodes.
  24. 24. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate.
  25. 25. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content.
  26. 26. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content. • Any service or storage can disappear, be added or upgraded without affecting the other systems unduly.
  27. 27. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content. • Any service or storage can disappear, be added or upgraded without affecting the other systems unduly. Just think how you might make your IR work more like the web does.
  28. 28. "The future is here. It's just not evenly distributed yet." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
  29. 29. "The future is here. It's just not evenly distributed yet." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 rg/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
  30. 30. For those who want to follow along: http://bit.ly/89AtD
  31. 31. Using Google to assay the forms of usage
  32. 32. Using Amazon's in-book search and browse to find the phrase.
  33. 33. 'Tim O'Reilly checked with Cory Doctorow who checked with Lorna Toolis who checked with Barry Wellman who checked with Ren Reynolds and Ellen Pozzi who point out that there's an NPR Talk of the Nation broadcast from 1999 where Gibson says, "As I've said many times, the future is already here. It's just not very evenly distributed." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
  34. 34. NPR has changed their site since then, breaking the link to the metadata about that recording... whoops...
  35. 35. But the link to the actual broadcast works: http://discover.npr.org/features/feature.jhtml?wfId=1067220 Notice anything interesting in that url?
  36. 36. Hint: ...jhtml?wfId=1067220
  37. 37. Realisation: People search for Things – the fact that they can only retrieve documents concerning those Things is incidental to them.
  38. 38. Things: • People • Places • Dates • Books • CDs • Performances/Events • Topics/subjects • … etc, etc
  39. 39. Things: • What did they all have in common?
  40. 40. Things: • What did they all have in common? • They all have 'names' of one sort in real-life. • But there are plenty of those Things that don't have names on the web... • How about we give them names?
  41. 41. Provide documents that directly relate to, rather than simply mention, the Thing the person is searching for.
  42. 42. Provide documents that directly relate to, rather than simply mention, the Thing the person is searching for.
  43. 43. “Relate” is a fluffy word. The key is knowing how a Document relates to a Thing.
  44. 44. “Relate” is a fluffy word. The key is knowing how a Document relates to a Thing. Does it describe it, comment on it, refer to it, locate it, disagree with it?
  45. 45. The types of relationships between Named Things are very important
  46. 46. Realisation – we have been avidly giving ourselves HTTP names for some time now xkcd.com/262/
  47. 47. Realisation – we have been avidly giving ourselves HTTP names for some time now • http://facebook.com/benosteen • http://twitter.com/benosteen • http://oxfordrepo.blogspot.com • Etc • Etc
  48. 48. Realisation – we have been avidly giving ourselves HTTP names for some time now • Things can have multiple, simultaneous names in real-life and online. • The real power comes from relating names: • “This named thing {is the same as} that other named thing”
  49. 49. And when it comes to people on the web, there has been a social sea-change
  50. 50. “Have you got a profile page on friendsreunited?”
  51. 51. “Are you on facebook?” “Are you on twitter?”
  52. 52. Linked Data and HTTP names Linked Data
  53. 53. Real Data • http://data.gov • http://recovery.gov – Repositories of US Federal Data and Federal Funding information, being published in a reusable manner using Atom and RDF.
  54. 54. Real Data • http://id.loc.gov – Library of Congress publishing their authority lists as Linked Data in RDF.
  55. 55. Yahoo and Google indexing RDF embedded in HTML pages (as RDFa) O'Reilly post on Google's “adoption” http://radar.oreilly.com/2009/05/google-announces-support-for-m.html A piece from the RDFa.info site about Yahoo and SearchMonkey's use of RDFa http://rdfa.info/2008/03/14/yahoo-into-semantic-web/
  56. 56. Ben Sally
  57. 57. Theme Policies
  58. 58. 1. Accession
  59. 59. “Dedicated to the freeing of the refereed research literature online through author/institution self-archiving” November 2000
  60. 60. “Recognised as the easiest and fastest way to set up repositories of: • research literature • scientific data • student theses • project reports • multimedia artefacts • teaching materials • scholarly collections • digitised records • exhibitions and performances” July 2009
  61. 61. “Resources range from simple materials such as Word documents or Powerpoint presentations, to complex learning packages, IMS, SCORM and VLE course modules that combine various multimedia formats such as video, audio and animation.” http://www.jorum.ac.uk/
  62. 62. Success story Community specific user interfaces for deposit, discovery and access
  63. 63. Deposit in multiple repositories: a single deposit feeding the institutional repository, a subject repository and other repositories. More needs to be done!
  64. 64. Sally Ben
  65. 65. Realisation: We've reinvented too many “wheels”
  66. 66. The Web exists and it works.
  67. 67. Don't fight it. Work with it.
  68. 68. Using the de facto standards of the web gives you a massive advantage.
  69. 69. De facto standards • Transfer: – Files (HTTP) – Lists of things (Atom, RSS) • Create, Read, Update, Delete: – HTTP PUT, GET, POST, DELETE • Names: – URIs • Lookups: – DNS resolvers
  70. 70. The big advantages • Instant community • Lots of tried and tested software that you don't have to write from scratch! • No wheels need be re-invented • May not be perfect, but it works
  71. 71. The big disadvantage • If you really are doing something unique or new (which is really unlikely) then try to get a community to help you. • If no one else wants to do it like you, then think about what you are truly accomplishing.
  72. 72. What is new? • Doing something new does not mean using a more refined vocabulary to describe things.
  73. 73. What is new? • I mean something that we don't have de facto standards for
  74. 74. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser
  75. 75. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser – Simultaneous collaborative document editing
  76. 76. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser – Simultaneous collaborative document editing – Data qualified and ranked by evidence
  77. 77. Participation! Name some repositories.
  78. 78. Some things I consider to be Repositories • Flickr • Facebook • Google Docs • A filesystem of BagIts • Scribd • Slideshare • Blogs (WP/etc) • Wikis • Twitter/Identi.ca • IRs • Domain Rs (Jorum, Pubmed, etc) • Publisher sites • Forums • CVS/SVN/Git/Hg • WebDAV • FTP
  79. 79. So, what do these repositories have in common? Standards? APIs?
  80. 80. So, what do these repositories have in common? Standards? APIs? Erm.... not much, but they all contain sets of things.
  81. 81. “Trying to get stuff into your repository? No one gives a SIP...”
  82. 82. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know.
  83. 83. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know. • There is no negotiation for the format of a SIP – you deal with what you are given.
  84. 84. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know. • There is no negotiation for the format of a SIP – you deal with what you are given. • And sometimes, you just have to go and harvest what you can.
  85. 85. Don't Panic
  86. 86. Normal Archival Process • (paraphrased by an observer...)
  87. 87. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • THINGS GET PERMANENT IDS NOW! • Even if it is just on a per-box basis. The item carries that provenance throughout.
  88. 88. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment.
  89. 89. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate.
  90. 90. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate. – Try to sort out issues that arise, with the next of kin/donor.
  91. 91. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate. – Try to sort out issues that arise, with the next of kin/donor. – Some things may stay in the box for a long time...
  92. 92. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. – Deal with the fragile things first – Try to sort out issues that arise – Some things may stay in the box – Identify actions that need to be taken to ensure future access.
  93. 93. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. - Characterise and catalogue the contents, using relevant tools.
  94. 94. Normal Archival Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. - Characterise and catalogue the contents, using relevant tools. • - Update archival records so that people can find the content (if they are allowed to.)
  95. 95. Digital Process • - Accept delivery of boxes of stuff, and record roughly what was received. • - Triage the contents within a stable environment. - Characterise and catalogue the contents, using relevant tools. • - Update archival records so that people can find the content (if they are allowed to.)
  96. 96. Digital Process The media may be different. And the tools may be too. You are still likely to be doing something like this for a while.
  97. 97. Not all storage is the same The absolute biggest benefit to any repository is to separate out the concerns of storage and services. • It will make your life so, so much easier. • Trust me.
  98. 98. Hardware, software, people and storage will come and go. Your content is constant.
  99. 99. Ben Sally
  100. 100. “When it's one click deposit, I'll do it" Medical scientists at Oxford
  101. 101. Bill Hubbard: Institutional Policies and Processes for Mandate Compliance. May 2009. http://www.sherpa.ac.uk/documents/OA%20Choices%20-%20researcher%27s%20view.ppt
  102. 102. What we need is: • Deposit by stealth and other easy solutions • Multiple Repository Deposit Regime (MuRDeR) • Answers to related problems that worry people such as multiple versions • Automation, automation, automation
  103. 103. Copyright ©
  104. 104. “There is a supreme irony that just as technology is allowing greater access to books and other creative works than ever before for education and research, new restrictions threaten to lock away digital content in a way we would never countenance for printed material.” Dame Lynne Brindley, CEO The British Library Copyright for Education and Research Golden Opportunity or Digital Black Hole? http://www.bl.uk/ip/pdf/copyrightresearchreport.pdf
  105. 105. Legal Deposit as a parallel to repository mandates
  106. 106. Bodley’s agreement 1610. Order of the Star Chamber 1637: “That one Book of every sort that is new Printed, or Re-printed with Additions, be sent to the University of Oxford for the Use of the publick Library there, … to be sent to the Library at Oxford accordingly, upon pain of Imprisonment” http://www.british-history.ac.uk/report.aspx?compid=74953
  107. 107. “…there tower the few, the very few, Libraries of Deposit. These are the super-Dreadnoughts of the literary world, and the Bodleian claims to be among them … a really great library should have Universal scope, Independences, Size, Permanence, Wealth, and multiform Utility.” The Bodleian Library at Oxford. Falconer Madan. 1919 http://www.archive.org/stream/bodleianlibrarya00mada/bodleianlibrarya00mada_djvu.txt
  108. 108. 2. Management and preservation
  109. 109. “Preservation aims towards preserving access” alancleaver_2000 Attribution 2.0 Generic http://www.flickr.com/photos/alancleaver/2638883650/
  110. 110. Assured secure storage and permanent access needs to be well-managed Aided by intra- and inter- institutional advisory and support services
  111. 111. Shared and distributed expertise
  112. 112. Sally Ben
  113. 113. Why do people choose to interact with systems?
  114. 114. Why do people choose to interact with systems? Disproportionate Feedback loops
  115. 115. Disproportionate Feedback Loop => The perception that a small effort leads to a very great benefit.
  116. 116. Disproportionate Feedback Loop => The perception that a small effort leads to a very great benefit. Which leads to more “little efforts” which add up!
  117. 117. High Scores Technically trivial, but... Psychologically addictive and drives a lot of replay
  118. 118. High Scores IR High scores? ● Usage stats ● Re-usage stats (trackbacks, tweets, references)
  119. 119. Ben Sally
  120. 120. Parliamentary Office for Science and Technology Peer review is used in the UK for 3 main purposes: 1. Allocation of research funding 2. Publication of research in scientific journals. To assess the quality of research submitted for publication and to assess its importance. 3. Assess the research rating of university departments http://www.parliament.uk/post/pn182.pdf
  121. 121. 3. Dissemination
  122. 122. Links to: Actionable raw data; Data fusion Links to: Interactive version; Google maps; additional visualisation
  123. 123. Open Access
  124. 124. April 1932 Ramblers set out from Bowden Bridge for the Kinder Scout mass trespass http://kindertrespass.com/index.asp?ID=37
  125. 125. “What many people fail to realise is that the uplands of this country once belonged to us, open common land, free for all to walk at will. They were only enclosed and parcelled off to the rich by acts of parliament pushed through by the rich. An old rhyme goes: They hang the man and flog the woman That steals the goose from off the common Yet leave the greater villain loose That steals the common from the goose.” Mike Harding The Guardian, Wednesday 18 April 2007 http://www.guardian.co.uk/environment/2007/apr/18/society.guardiansocietysupplement1
  126. 126. Free! Free! Free! FREE!
  127. 127. Open options • Green Open Access • Gold Open Access • Open access journals • Mandated open access • Open Archives (OAI) Service • Operates two different open access models (as reported by SHERPA/Romeo) • Some journals in UKPMC allow harvesting of the full text of all items, others allow it for only some items, and many do not allow it at all. See the PMC Open Access List for specifics. Too complicated!
  128. 128. Preview Time
  129. 129. You are here
  130. 130. Four charts of evolution over time: seismic change, rapid change, step change, incremental change
  131. 131. Trends 1. Entering a period of steady growth and change 2. Repositories as a set of services embedded within institutional systems 3. Names
  132. 132. Trends: Still waiting… • Easy multiple deposit • Collaboration between publishers, repositories, HEIs, government and research funders as a group • Common policies • Less complexity
  133. 133. Crystal ball by Hamachi! CC license. Attribution-Non-Commercial-No Derivative Works 2.0 Generic Available at http://www.flickr.com/photos/mawari/2091456761/
  134. 134. Print-on-demand is going to be big And I don't mean printed facsimiles.
  135. 135. What does having a book mean if you can print one in 5 minutes for £2? http://www.ondemandbooks.com/home.htm
  136. 136. How about this scenario then? It's all technically possible now
  137. 137. How about this scenario then? It's all technically possible now You print off a set of articles into a book on the library's book-printer. Your colleagues' comments, tweets and reviews are interleaved with the text. Your colleagues were found from your professional social network.
  138. 138. How about this scenario then? It's all technically possible now You print off a set of articles into a book on the library's book-printer. Your colleagues' comments, tweets and reviews are interleaved with the text. Your colleagues were found from your professional social network.
  139. 139. How about this scenario then? It's all technically possible now You print off a set of articles into a book on the library's book-printer. Your colleagues' comments, tweets and reviews are interleaved with the text. Your colleagues were found from your professional social network.
  140. 140. How about this scenario? You create a bookmark list of plates from 18th century books online which you believe to be the work of one anonymous artist. This list with your comments is a new resource in and of itself, and can be commented on or printed as a book.
  141. 141. Permanent books, temporary magazines? Is this true? How about facsimiles that are printed, so that a student can study with their coffee and donuts? And why print to paper? Why not print to digital paper, once it arrives...
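
A note on slides 43 to 48: the argument there is that a Document should be linked to a named Thing by a typed relationship (describe, comment on, refer to, locate, disagree with), and that names become powerful once one name can be declared the same as another. The following is only a minimal sketch of that idea, not anything shown in the talk. The class name NameGraph, the relation labels and the example.org document URIs are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical relation labels, loosely based on the verbs listed on slide 44.
RELATIONS = {"describes", "comments_on", "refers_to", "locates", "disagrees_with"}


class NameGraph:
    """Tiny in-memory store of typed links between HTTP names (URIs)."""

    def __init__(self):
        self._parent = {}               # union-find pointers for "is the same as"
        self._links = defaultdict(set)  # representative name -> {(relation, document)}

    def _find(self, name):
        # Walk up to the representative name of the group, shortening the path as we go.
        self._parent.setdefault(name, name)
        while self._parent[name] != name:
            self._parent[name] = self._parent[self._parent[name]]
            name = self._parent[name]
        return name

    def same_as(self, a, b):
        # Declare that two names identify the same Thing and merge their links.
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self._parent[rb] = ra
            self._links[ra] |= self._links.pop(rb, set())

    def relate(self, document, relation, thing):
        # Record how a document relates to a named Thing.
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._links[self._find(thing)].add((relation, document))

    def documents_about(self, thing, relation=None):
        # Return documents linked to the Thing under any of its names,
        # optionally filtered by the type of relationship.
        links = self._links.get(self._find(thing), set())
        return sorted(doc for rel, doc in links if relation is None or rel == relation)


if __name__ == "__main__":
    graph = NameGraph()
    graph.same_as("http://twitter.com/benosteen", "http://facebook.com/benosteen")
    graph.relate("http://example.org/doc/1", "describes", "http://twitter.com/benosteen")
    print(graph.documents_about("http://facebook.com/benosteen"))
```

A search for either name finds the same documents, and the relation filter separates documents that describe a Thing from those that merely refer to it.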

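
A note on slides 69 and 97: one way to picture "separate out the concerns of storage and services" is a small storage contract whose operations echo the HTTP verbs, with each service written only against that contract. This is a rough sketch under stated assumptions, not the ORA architecture. The names Storage, MemoryStorage and KeywordSearch, and the example.org URIs, are hypothetical.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Storage(ABC):
    """Minimal storage contract; method names echo the HTTP verbs PUT, GET, DELETE."""

    @abstractmethod
    def put(self, uri: str, content: bytes) -> None: ...

    @abstractmethod
    def get(self, uri: str) -> bytes: ...

    @abstractmethod
    def delete(self, uri: str) -> None: ...

    @abstractmethod
    def uris(self) -> list[str]: ...


class MemoryStorage(Storage):
    """One interchangeable backend; a disk or remote store could replace it."""

    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def put(self, uri: str, content: bytes) -> None:
        self._items[uri] = content

    def get(self, uri: str) -> bytes:
        return self._items[uri]

    def delete(self, uri: str) -> None:
        del self._items[uri]

    def uris(self) -> list[str]:
        return sorted(self._items)


class KeywordSearch:
    """A service that knows only the Storage contract, never the backend."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def find(self, word: str) -> list[str]:
        # Case-insensitive substring match over every stored item.
        needle = word.lower().encode("utf-8")
        return [u for u in self._storage.uris() if needle in self._storage.get(u).lower()]


if __name__ == "__main__":
    store = MemoryStorage()
    store.put("http://example.org/item/1", b"Where are repositories going?")
    print(KeywordSearch(store).find("Repositories"))
```

Because KeywordSearch depends only on the contract, the backend can be swapped or upgraded, and further search services added, without touching the stored content.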