Where are Repositories Going?

Keynote talk by Sally Rumsey and Ben O'Steen, given at the Repository Fringe 2009, Edinburgh.

Published in: Education, Business, Technology
  • Start using a historical parallel by looking at the growth and development of the Bodleian Library
  • Sir Thomas Bodley was a scholar and diplomat. In 1598 he donated funds to save the library, which at that time was in a state of disrepair. Around 2,500 books were donated to form the new collection.
  • When the library opened in 1602 there was both storage and content. There was a collection development policy of sorts: Bodley chose to exclude what he called the ‘baggage books’ and ‘riff-raff’.
  • In November 1605 Francis Bacon sent Bodley a copy of his newly published book ‘The Advancement of Learning’. In a letter that accompanied the book he called the Bodleian ‘an ark to save learning from deluge’. A nice description of repositories.
  • The first catalogue was printed in 1605 and the new edition that you see here in 1620. The original catalogue comprised an alphabetical list of authors and a series of subject catalogues: a ‘search’ function.
  • The library continued to expand and as is the way with libraries, over time, lack of space became a problem
  • The Radcliffe Camera, Radcliffe Science Library and New Bodleian libraries were built to house the growing stock, and extra storage was excavated under Radcliffe Square. By 1849 there were 220,000 books and 21,000 manuscripts and the catalogue had grown to 3 volumes. By the end of the 19th century the book collection was growing by about 30,000 volumes per year. It reached 1 million volumes in 1914. By the beginning of the 20th century there were around 100 users per day.
  • Still growing rapidly with plans for a new book depository at Swindon due to become operational in 2010 http://www.ouls.ox.ac.uk/news/2009_mar_17
  • In 2009 the numbers speak for themselves: a big and busy library.
  • Like any digital repository, the library has its usage agreement. Users today have to sign this.
  • The library has always been dedicated to serving the widest scholarly community, the republic of lettered men, as described on the plaque over the main entrance. Tyack (2000) The Bodleian Library, Oxford
  • A more recent rapid growth in a different republic of letters. This graph, taken from the Registry of Open Access Repositories, shows the growth in numbers of digital repositories over the last 15 years or so. In terms of content, are we building the digital equivalent of the Bodleian?
  • Before looking at details of repositories, introduce some overarching themes
  • First, realisation. Realisation is basically the penny dropping that the status quo is not satisfactory and the climate is ripe for moving on. Such realisation can be as important as change itself.
  • This next theme is of repositories as a concept, not as a stand-alone ‘thing’. ‘Repositories’ is in the plural on purpose, as the single repository is on its way out.
  • Gone is the idea of a repository as a simple digital box. We are moving away from the Hokey Cokey repository - you put your paper in (deposit), you take your paper out (dissemination). What is beginning to happen is that repositories are being integrated into the general services of the institution. As such they might ultimately become somewhat invisible as services are shared.
  • One factor drawing repositories away from the ‘box’ design is that they are becoming more integrated with other institutional management systems. Data sharing within and between institutions is becoming more common but it will take time to reach maturity. Current Research Information Systems or similar will become more common, hopefully with common standards for data sharing such as use of CERIF. And it’s not only technical systems: human systems are becoming more integrated too.
  • As part of that systems integration, repository staff act as catalysts for community building, both culturally and technically, within and without single institutions. We’re seeing this now with the preparatory work for REF, data management and the bringing together of libraries, computing services, legal services, research services and of course the data producers.
  • Moving away from the box mentality, repositories are beginning to fulfil Clifford Lynch's description of a repository as a set of services. Repositories are part of an integrated technical and managerial infrastructure, not a 'thing' in their own right.
  • Always loved this quote – ever since I heard it. Change happens slowly all around us, but realisation of the change is necessary for us to start to try to spread it around more evenly.
  • Note the citation – it actually took me a little while to track this down, and what I found was someone else's research on this phrase
  • Brian Dear's weblog lays out the entire process he used to track this down. Why is this important? It illustrates a fundamental use case for searching: “I think author X said Y, but I don't know when or how, and it might not be author X”.
  • It has an ID! There are still some slightly odd ideas going around that trusted repositories are the only ones who have considered the 'identifier' issue. I'd agree – an awful lot of talk has gone on about this in the academic world while the rest of the world has just gotten on with it.
  • Segue to policy concerns
  • Policies should drive everything about repositories. This includes the policies of external bodies that affect repositories
  • In some areas policies for repositories are common and accepted. The Directory of Open Access Repositories (OpenDOAR) even provides a policy tool.
  • However, the PRESERV project worryingly found a lack of preservation policies in existing repositories. The DISC-UK DataShare project is running a workshop this afternoon at this conference on policies for data management.
  • Going to take a look at some trends in three main areas that make up the repository services. Firstly, accession
  • A huge expansion of item types deposited in repositories. Back in Nov 2000 the ePrints software stated that it was ‘dedicated to the freeing of the refereed research literature’.
  • Look at the range of item types recognised as repository fodder listed here for ePrints in July 2009.
  • And teaching and learning materials are equally broad in their variety of content types. This wide variety will continue to grow.
  • Arguably subject-focused repositories have so far been the most successful. Think of arXiv, SSRN and PubMed Central, to name a few.
  • The realisation that this community buy-in works prompts the pressing need to resolve the multiple deposit problem. Some work has taken place in this area but more needs to be done. Technically it’s relatively straightforward, with some provisos. It’s more a question of policy. Until we get this cracked, there’s a real problem gaining buy-in from content creators.
  • Segue to Sally talking about magic deposits. Asking an academic to self-deposit is like negotiating for a SIP...
  • We really DO need to keep it simple. We need to cut the complexity. This not only includes the practical workflows, but the instructions and guidelines we provide. At the moment we’re stymied because we’re having to try to explain inordinately complex processes and rules.
  • This diagram by Bill Hubbard illustrates the complexity that faces a researcher when faced with open access options and attempting to comply with a funder open access policy. Compare this with how it used to be…
  • No wonder researchers are confused.
  • What we need – stealth, murder, answers and automation. Authors are not going to change their workflows willingly. We can be certain that they’re not in a hurry to create metadata. Nature Publishing Group has adopted a policy to archive in repositories on behalf of authors. Maybe one day it will do this for institutional repositories. Will we see more of this? From some, maybe.
  • A great example of complexity. Our aim should be to keep the barriers to deposit as low as possible – and that includes copyright issues. Some well-respected publishers have adopted much simpler and author-friendly copyright policies. But wouldn’t it be nice if things were more uniform?
  • This interesting point about the digital domain having more barriers to access than print was made in a recent British Library report. The report goes on to say that digital materials can be bound up in rights, making them inaccessible, making it impossible to use modern computer-based research techniques, and preventing data from being shared. The copy norm that academic authors have traditionally worked to is like the Creative Commons attribution, non-commercial, share alike licence. Shouldn’t this be the norm for digital items as well?
  • Continuing the copyright theme, draw a comparison between legal deposit in copyright libraries and deposit in repositories. According to the agency for legal deposit it has “great advantages for authors and publishers”. Although not identical, its sentiments have resonances with mandates for deposit in repositories, which have been on the increase, and with open access: they are to make items available, to preserve them and to retain the corpus of content for the benefit of all.
  • In 1610 Thomas Bodley struck an agreement with the Stationers’ Company that his library could claim one copy of every book registered at Stationers’ Hall. This helped support his vision of a library that provided for the wider scholarly world. There were many ‘rubs and delays’ when setting up the agreement but “the advantage to the publishers was that whenever an edition ran out of print, there would always be a ‘perfect copy’ available for reprint or amendments.” It paved the way for the future legal deposit that we still have today and that could be a model for mandated deposit in repositories.
  • In 1919 this was written about the Bodleian. The author describes what he thinks distinguishes legal deposit libraries from others. Again, this has strong parallels with what we are striving for with repositories. To become a super-dreadnought repository, one might strive for universal scope, independence, size, permanence and multiform utility. And wealth as well.
  • With all this content being acquired, the processes running repositories need to be well managed and the content preserved.
  • Firstly, preservation is no use without continued access. You can’t have one without the other. Preservation should be renamed ‘Assured secure storage and permanent access’.
  • Such continued long-term access requires managing, and that management should be underpinned by robust policies. One trend is the development of intra- and inter-institutional advisory and support services covering major areas of concern. The JISC takes its responsibilities in this area very seriously with services such as JISC Legal. And we are fortunate in the UK to have a number of other national support and advice services.
  • Another trend is that of shared and distributed expertise, for example collaborative metadata creation. This was demonstrated nicely when the Library of Congress released photographs on Flickr which they invite users to tag. However, collaborative metadata creation is by no means a simple matter and the quality of metadata you get can be variable. Often, you get what you're given, say when importing metadata from third parties. But using modern methods, can we achieve universal bibliographic control in the digital environment where we failed in print? There are signs of progress. A recent Research Information Network report recommended that publishers should(!) make article-level metadata more widely available to third parties in a standard format so that it can be harvested and utilised by aggregators, libraries, repositories and others. But will they? One other idea is that maybe institutions could automatically share items and metadata with institutions of co-authors. All this could lead to new types of aggregations. The technology is mainly there. It’s really a matter of policy.
  • Reports – eg for REF and other statutory reporting, funders, feedback to admin, authors, stakeholders. Making best use of usage stats. Feedback loops – effort required vs benefits
  • One important feedback system is that of Peer review. The potential for the current system to change is enormous. It is partly dependent on the role of the traditional publication, attitudes of the community and different disciplines. If journal articles are, as Alma Swan has suggested, purely a “summary of research” and if in some disciplines the article becomes a mere formality necessary only for statutory assessment, then other forms of peer review may become prevalent. More raw research than ever before is being put out there via lab books, blogs, raw data and so on. These outputs could be reviewed using the power of the web. And repositories could find themselves playing a major role in this important area of scholarly communication and quality assessment.
  • New forms of peer review will be possible because of new forms of dissemination. At the moment many materials that are disseminated via repositories are digital equivalents of print items – in terms of both content and display. However, new forms of publishing are emerging that take advantage of the web environment.
  • For example, a semantically marked up article originally published in PLoS, created by Dr David Shotton and his colleagues at Oxford. Nine classes of entities are highlighted; for example, different habitats are highlighted in green. Organisms (blue) are directly linked to their Linnaean classification. The semantic enhancements to the original PLoS NTD paper by Reis et al (2008) were created by David Shotton, Alistair Miles, Graham Klyne and Katie Portwin, Image Bioinformatics Research Group, Department of Zoology, University of Oxford. This enhanced version of the paper was published on 3 September 2008 at doi:10.1371/journal.pntd.0000228.x001, and revised 18 April 2009. Reis RB, Ribeiro GS, Felzemburgh RDM, Santana FS, Mohr S, et al. (2008) Impact of Environment and Social Gradient on Leptospira Infection in Urban Slums. PLoS Negl Trop Dis 2(4): e228. doi:10.1371/journal.pntd.0000228
  • The cited data underpinning the research can be accessed and manipulated. One figure has been superimposed over Google Maps. See the raw data and the data fusion (mashup) functionality, where two lots of data are combined, and the extra visualisation features. There are more enhanced features: for instance, some citations are linked to the exact spot in the cited work that has been quoted rather than the entire text. Aren’t more authors going to want more of this sort of thing?
  • A brief look at a couple of aspects of open access
  • My Great Aunt Edith who in 1932 took part in the mass trespass on Kinder Scout.
  • The trespass was prompted by dissatisfaction that less than 1% of the Peak District moorland could be accessed by the public. The land was of poor quality, and was used intermittently by landowners for activities such as grouse shooting. Walkers could apply for a permit for access, but it had to be in writing and only 2 permits per week were awarded. Between 400 and 500 ramblers set off (including Great Aunt Edith) on the walk across this restricted land. There was a brief fight with a handful of keepers, but all returned triumphant to Hayfield. It had been expected that some members of the party might be fined, so there was a whip-round with a hat to support those who might need it. In fact, one or two ended up in prison. On the 75th anniversary of the trespass, Mike Harding wrote in the Guardian… [NEXT SLIDE] Mass trespass Monday 25 April 1932 http://www.guardian.co.uk/uk/1932/apr/25/1 Mike Harding The Guardian, Wednesday 18 April 2007 http://www.guardian.co.uk/environment/2007/apr/18/society.guardiansocietysupplement1
  • “What many people fail to realise is that the uplands of this country once belonged to us, open common land, free for all to walk at will.” Although far from a direct allegory for publishing today, the parallels with the current situation of open access are interesting. Money plays a key role, as does the concept of the land (or research in our case) being for the benefit of all. The mass trespass eventually led to legislation to establish the National Parks in 1949, and to walkers' rights over open country and common land in 2000. The rights problems associated with digital content could similarly be resolved in time – but it will take some trials and tribulations before we reach that point. http://www.kindertrespass.com/
  • One concern for many authors and users is that stuff given away at no charge is perceived as being of low quality. Also that the confusingly named ‘author pays’ model of open access publishing is little more than vanity publishing. This means that open access items, even if they are peer reviewed, might be judged of little or no use and not to be trusted. There is much disagreement on this topic and disciplinary differences play a part. The argument will probably continue for some time. But meanwhile, the technology and the communications landscape will continue to develop and we may reach a significant point of ‘realisation’ at some time in the near future.
  • When considering open access publishing models there is a mixed economy with many different flavours of open access. What should an author do when faced with this bamboozling and over-complex situation? I feel that unfortunately we will retain this complexity for some time. But simplification for repository depositors cannot come fast enough.
  • Take a look at where we’re going
  • This well-known diagram indicates the development of technologies over time. We reckon repositories have just turned the corner from the lowest point of disillusionment, both in terms of technologies and culture.
  • However, our slope of enlightenment is not going to be steep and fast. Evolution of repositories is likely to come neither as seismic shifts nor as step changes. It’s going to be more on the incremental change model. If we were designing repositories now, what would we come up with? We’d certainly go beyond the Hokey Cokey repository of you put your paper in, you take your paper out. You could shake it all about - edit, annotate, update, link, text mine etc etc.
  • We’ve spotted some trends that we think will influence bigger change in the not-too-distant future. We draw your attention to three of them: 1. We are entering a period of steady growth and change in all aspects and we must be patient. 2. Repositories as a set of services embedded within institutional systems. 3. The current trend towards naming entities will continue and bear fruit.
  • What we are not yet seeing are things like these. We would like to see more collaboration, common policies and less complexity in all areas of repositories. Hand over to Ben for a bit of crystal ball gazing.

    1. 1. Repository Fringe, Edinburgh 2009 Where are repositories going? Ben O’Steen (ORA Software Developer) Sally Rumsey (ORA Service & Development Manager)
    2. 2. Growth of repositories & historical parallel
    3. 3. Sir Thomas Bodley
    4. 4. Storage Content
    5. 5. “…and you having built an Ark to save learning from deluge, deserve propriety in any new instrument or engine, whereby learning should be improved or advanced.” Francis Bacon to Thomas Bodley Nov 1605 http://novels.mobi/create/out_mobi/pg/1/2/5/1/12515/12515/4.php
    6. 6. Search function Library catalogue 1620 Reproduced for this presentation with kind permission of King's College London, Foyle Special Collections Library www.kcl.ac.uk/.../exhibitions/marsex/mcoll.html
    7. 7. Jeffrey Keefer http://www.flickr.com/photos/jeffreykeefer/773540725/ CC licence: Attribution-Non-Commercial-Share Alike 2.0 Generic
    8. 8. Radcliffe Science Library 1861 Radcliffe Camera 1749 New Bodleian 1940
    9. 9. Artist’s impression of new Bodleian book depository at Swindon Details may change
    10. 10. Bodleian Stats 2009 8.5M volumes 1.6M visitors each year 65,000 registered readers* 5.4M requests for full-text journal articles 1.8M requests for e-books * 37,000 University card holders plus 28,000 external readers
    11. 11. Library terms and conditions Bodleian Library declaration: I hereby undertake not to remove from the Library, or to mark, deface, or injure in any way, any volume, document or other object belonging to it or in its custody; not to bring into the Library, or kindle therein, any fire or flame, and not to smoke in the Library; and I promise to obey all rules of the Library.
    12. 12. QUOD FELICITER VORTAT ACADEMICI OXONIENS BIBLIOTHECAM HANC VOBIS REIPUBLICAEQUE LITERORUM T.B.P. That it might turn out happily, Oxonian academics, for you and for the republic of lettered men Thomas Bodley placed this library
    13. 13. Growth in numbers of digital repositories Source: Tim Brody. ROAR Registry of Open Access Repositories. http://roar.eprints.org/
    14. 14. Some overarching themes
    15. 15. Theme Realisation as a catalyst for change
    16. 16. Theme Repositories as a concept
    17. 17. Repository as a box: Paper in → Institutional repository → Paper out
    18. 18. Integration with other hard and soft systems enderisnotmyrealname http://www.flickr.com/photos/enderisnotmyrealname/3586300347/ Attribution-Non-Commercial-Share Alike 2.0 Generic
    19. 19. “An effective institutional repository of necessity represents a collaboration among librarians, information technologists, archives and records managers, faculty, and university administrators and policymakers.” Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml
    20. 20. “… a university–based institutional repository is a set of services……an institutional repository is not simply a fixed set of software and hardware.” Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml
    21. 21. Sally Ben
    22. 22. The most successful repository is the internet. Embrace it.
    23. 23. Some pointers then: • Distributed across a number of nodes. •
    24. 24. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. •
    25. 25. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content. •
    26. 26. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content. • Any service or storage can disappear, be added or upgraded without affecting the other systems unduly.
    27. 27. Some pointers then: • Distributed across a number of nodes. • The services and storage should be separate. • There should be multiple ways to search the content. • Any service or storage can disappear, be added or upgraded without affecting the other systems unduly. Just think how you might make your IR work more like the web does.
    28. 28. "The future is here. It's just not evenly distributed yet." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
    29. 29. "The future is here. It's just not evenly distributed yet." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 rg/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
    30. 30. For those who want to follow along: http://bit.ly/89AtD
    31. 31. Using google to assay the forms of usage
    32. 32. Using amazon's in-book search and browse to find the phrase.
    33. 33. 'Tim O'Reilly checked with Cory Doctorow who checked with Lorna Toolis who checked with Barry Wellman who checked with Ren Reynolds and Ellen Pozzi who point out that there's an NPR Talk of the Nation broadcast from 1999 where Gibson says, "As I've said many times, the future is already here. It's just not very evenly distributed." William Gibson NPR Talk of the Nation 30 November 1999 Timecode: 11min 55sec Link: discover.npr.org/features/feature.jhtml?wfId=1067220 Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999
    34. 34. NPR has changed their site since then, breaking the link to the metadata about that recording... whoops...
    35. 35. But the link to the actual broadcast works: http://discover.npr.org/features/feature.jhtml?wfId=1067220 Notice anything interesting in that url?
    36. 36. Hint: tml?wfId=1067220
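The point of the hint can be sketched in a few lines of Python: the broadcast URL quoted on the previous slide carries an identifier as an ordinary query parameter, and standard library tools can pull it out. This is only an illustration; the helper name `extract_identifier` is an assumption of this sketch, and nothing here describes how NPR's site actually works.

```python
from urllib.parse import urlparse, parse_qs


def extract_identifier(url, param="wfId"):
    """Return the value of one query parameter from a URL, or None if absent."""
    query = urlparse(url).query
    values = parse_qs(query).get(param)
    return values[0] if values else None


# URL as quoted on the slide; the parameter name "wfId" is read off that URL.
broadcast_url = "http://discover.npr.org/features/feature.jhtml?wfId=1067220"
broadcast_id = extract_identifier(broadcast_url)
print(broadcast_id)
```

The recording was given a stable ID by the site itself, and that ID is what lets the link keep working.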
    37. 37. Realisation: People search for Things – the fact that they can only retrieve documents concerning those Things is incidental to them.
    38. 38. Things: • People • Places • Dates • Books • CDs • Performances/Events • • Topics/subjects • … etc, etc
    39. 39. Things: • What did they all have in common? • •
    40. 40. Things: • What did they all have in common? • • They all have 'names' of one sort in real-life. • • But there are plenty of those Things that don't have names on the web... • • How about we give them names?
    41. 41. Provide documents that directly relate to rather than simply mention a Thing the person is searching for.
    42. 42. Provide documents that directly relate to rather than simply mention a Thing the person is searching for.
    43. 43. “Relate” is a fluffy word. The key is knowing how a Document relates to a Thing.
    44. 44. “Relate” is a fluffy word. The key is knowing how a Document relates to a Thing. Does it describe it, comment on it, refer to it, locate it, disagree with it?
    45. 45. The types of relationships between Named Things are very important
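One way to picture typed relationships between Documents and Things is as a small list of (document, relation, thing) statements that can be filtered by relation type. The sketch below is a minimal illustration in Python, not a Linked Data implementation; every URI uses the placeholder domain example.org, and the relation names are assumptions taken loosely from the question on the previous slide.

```python
from collections import namedtuple

# One statement: a Document relates to a Thing in a stated way.
Relation = namedtuple("Relation", ["document", "relation", "thing"])

# Hypothetical names; example.org is a placeholder domain.
BODLEIAN = "http://example.org/thing/bodleian-library"

relations = [
    Relation("http://example.org/doc/1", "describes", BODLEIAN),
    Relation("http://example.org/doc/2", "refers_to", BODLEIAN),
    Relation("http://example.org/doc/3", "disagrees_with", BODLEIAN),
]


def documents_for(thing, relation_types, statements):
    """List documents linked to a Thing by any of the given relation types."""
    return [
        s.document
        for s in statements
        if s.thing == thing and s.relation in relation_types
    ]


# Documents that describe the Thing, rather than merely refer to it.
direct = documents_for(BODLEIAN, {"describes"}, relations)
print(direct)
```

Because the relation type is recorded, a search can separate documents that describe a Thing from those that only mention it.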
    46. 46. Realisation – we have been avidly giving ourselves HTTP names for some time now xkcd.com/262/
    47. 47. Realisation – we have been avidly giving ourselves HTTP names for some time now • http://facebook.com/benosteen • http://twitter.com/benosteen • http://oxfordrepo.blogspot.com • Etc • Etc
    48. 48. Realisation – we have been avidly giving ourselves HTTP names for some time now • Things can have multiple, simultaneous names in real-life and online. • • The real power comes from relating names: • • “This named thing {is the same as} that other named thing”
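The "{is the same as}" idea can be sketched as grouping names into sets whenever a statement links two of them. The Python below uses a simple union-find structure for that grouping; it is an illustrative sketch only. The two profile URLs come from the previous slide, while the example.org names and the specific "same as" statements are assumptions made up for the example.

```python
def group_same_as(pairs):
    """Group names linked by 'is the same as' statements into sets."""
    parent = {}

    def find(name):
        # Follow parent links to the representative name for this group.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # shorten the chain as we go
            name = parent[name]
        return name

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Illustrative statements only; the example.org names are hypothetical.
statements = [
    ("http://facebook.com/benosteen", "http://twitter.com/benosteen"),
    ("http://twitter.com/benosteen", "http://example.org/people/ben"),
    ("http://example.org/place/oxford", "http://example.org/city/oxford"),
]

groups = group_same_as(statements)
print(len(groups))
```

Two names never linked directly can still end up in the same group through a third, which is where relating names starts to pay off.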
    49. 49. And when it comes to people on the web, there has been a social sea-change
    50. 50. “Have you got a profile page on friendsreunited?”
    51. 51. “Are you on facebook?” “Are you on twitter?”
    52. 52. Linked Data and HTTP names Linked Data
    53. 53. Real Data • http://data.gov • http://recovery.gov – Repositories of US Federal Data and Federal Funding information, being published in a re-usable manner using Atom and RDF.
    54. 54. Real Data • http://id.loc.gov – Library of Congress publishing their authority lists as Linked Data in RDF.
    55. 55. Yahoo and Google indexing RDF embedded in HTML pages (as RDFa) O'Reilly post on Google's “adoption” http://radar.oreilly.com/2009/05/google-announces-support-for-m.html A piece from the RDFa.info site about Yahoo and SearchMonkey's use of RDFa http://rdfa.info/2008/03/14/yahoo-into-semantic-web/
    56. 56. Ben Sally
    57. 57. Theme Policies
    58. 58. 1. Accession
    59. 59. “Dedicated to the freeing of the refereed research literature online through author/institution self- archiving” November 2000
    60. 60. “Recognised as the easiest and fastest way to set up repositories of: • research literature • scientific data • student theses • project reports • multimedia artefacts • teaching materials • scholarly collections • digitised records • exhibitions and performances” July 2009
    61. 61. “Resources range from simple materials such as Word documents or Powerpoint presentations, to complex learning packages, IMS, SCORM and VLE course modules that combine various multimedia formats such as video, audio and animation.” http://www.jorum.ac.uk/
    62. 62. Success story Community specific user interfaces for deposit, discovery and access
    63. 63. Deposit in multiple repositories: Single deposit → Institutional repository, Subject repository, Other repository. More needs to be done!
    64. 64. Sally Ben
    65. 65. Realisation: We've reinvented too many “wheels”
    66. 66. The Web exists and it works.
    67. 67. Don't fight it. Work with it.
    68. 68. Using the de facto standards of the web gives you a massive advantage.
    69. 69. De facto standards • Transfer: – Files (HTTP) – Lists of things (Atom, RSS) • Create, Read, Update, Delete: – HTTP PUT, GET, POST, DELETE • Names: – URIs • Lookups: – DNS resolvers
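As a rough illustration of the Create, Read, Update, Delete line on this slide, the Python sketch below builds (but does not send) HTTP requests for each action using only the standard library. The action-to-verb mapping follows the order listed on the slide and is an assumption of this sketch; other conventions, such as POST to create and PUT to update, are also common. The repository URI is hypothetical.

```python
from urllib.request import Request

# Assumed mapping, following the order on the slide; other mappings are in use.
VERBS = {"create": "PUT", "read": "GET", "update": "POST", "delete": "DELETE"}


def build_request(action, uri, body=None):
    """Build an HTTP request object for an action on a named item.

    Nothing is sent over the network; the request is only constructed.
    """
    data = body.encode("utf-8") if body is not None else None
    return Request(uri, data=data, method=VERBS[action])


# Hypothetical item URI on a placeholder domain.
item_uri = "http://repository.example.org/items/1"

req = build_request("create", item_uri, "<entry/>")
print(req.get_method(), req.full_url)
```

Passing such a request to `urllib.request.urlopen` would send it. The item is addressed by its URI throughout, so any client that speaks HTTP can work with it.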
    70. 70. The big advantages • Instant community • Lots of tried and tested software that you don't have to write from scratch ! • No wheels need be re-invented • May not be perfect, but it works
    71. 71. The big disadvantage • If you really are doing something unique or new (which is really unlikely) then try to get a community to help you. • If no one else wants to do it like you, then think about what you are truly accomplishing.
    72. 72. What is new? • Doing something new does not mean using a more refined vocabulary to describe things. •
    73. 73. What is new? • I mean something that we don't have de facto standards for
    74. 74. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser –
    75. 75. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser – Simultaneous collaborative document editing
    76. 76. What is new? • I mean something that we don't have de facto standards for: – real-time event notifications through the browser – Simultaneous collaborative document editing – Data qualified and ranked by evidence
    77. 77. Participation! Name some repositories.
    78. 78. Some things I consider to be Repositories • Flickr • Facebook • Google Docs • A filesystem of BagITs • Scribd • Slideshare • Blogs (WP/etc) • Wikis • Twitter/Identi.ca • IRs • Domain Rs (Jorum, Pubmed, etc) • Publisher sites • Forums • CVS/SVN/Git/Hg • WebDAV • FTP
    79. 79. So, what do these repositories have in common? Standards? APIs?
    80. 80. So, what do these repositories have in common? Standards? APIs? Erm.... not much, but they all contain sets of things.
    81. 81. “Trying to get stuff into your repository? No one gives a SIP...”
    82. 82. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know..
    83. 83. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know.. • • There is no negotiation for the format of a SIP – you deal with what you are given.
    84. 84. Realisation: Object transfer is still in a divergent state • Lots of containers, lots of formats, too many conventions that you just have to know.. • • There is no negotiation for the format of a SIP – you deal with what you are given. • • And sometimes, you just have to go and harvest what you can.
    85. 85. Don't Panic
    86. 86. Normal Archival Process • (paraphrased by an observer...)
    87. 87. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • THINGS GET PERMANENT IDS NOW! • Even if it is just on a per-box basis. The item carries that provenance throughout.
    88. 88. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment.
    89. 89. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate.
    90. 90. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate. – Try to sort out issues that arise, with the next of kin/donor.
    91. 91. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. – Deal with the fragile things first, things that will deteriorate. – Try to sort out issues that arise, with the next of kin/donor. – Some things may stay in the box for a long time...
    92. 92. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. – Deal with the fragile things first – Try to sort out issues that arise – Some things may stay in the box – Identify actions that need to be taken to ensure future access.
    93. 93. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. • Characterise and catalogue the contents, using relevant tools.
    94. 94. Normal Archival Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. • Characterise and catalogue the contents, using relevant tools. • Update archival records so that people can find the content (if they are allowed to).
    95. 95. Digital Process • Accept delivery of boxes of stuff, and record roughly what was received. • Triage the contents within a stable environment. • Characterise and catalogue the contents, using relevant tools. • Update archival records so that people can find the content (if they are allowed to).
    96. 96. Digital Process The media may be different. And the tools may be too. You are still likely to be doing something like this for a while.
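The first step of the process above (accept delivery, record roughly what was received, and give things permanent IDs straight away, even if only per box) might look something like this for a digital "box". This is a sketch under stated assumptions rather than the speakers' implementation: the function name `accession_box`, the record layout and the choice of a random UUID as the permanent ID are all hypothetical.

```python
import hashlib
import uuid
from pathlib import Path


def accession_box(box_dir):
    """Record roughly what was received and mint a permanent ID for the box."""
    box = Path(box_dir)
    record = {
        # Assumed ID scheme: a random UUID minted once, at accession time.
        "box_id": f"urn:uuid:{uuid.uuid4()}",
        "items": [],
    }
    for path in sorted(p for p in box.rglob("*") if p.is_file()):
        record["items"].append({
            # Path relative to the box, so the record survives a move.
            "path": path.relative_to(box).as_posix(),
            "bytes": path.stat().st_size,
            # A checksum lets later triage detect deterioration or change.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return record
```

Nothing here characterises or catalogues the contents; that can wait, and some things may stay in the box for a long time. The ID and the rough manifest are what every item carries forward as provenance.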
    97. 97. Not all storage is the same The absolute biggest benefit to any repository is to separate out the concerns of storage and services. • It will make your life so, so much easier. • Trust me.
    98. 98. Hardware, software, people and storage will come and go. Your content is constant.
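One way to picture the separation of storage from services is a service that only ever talks to a small storage interface, so the storage behind it can be replaced while the content stays constant. The sketch below is illustrative only: `Store`, `MemoryStore`, `WordCountService` and `migrate` are hypothetical names, and an in-memory dictionary stands in for real storage.

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """The only thing services are allowed to know about storage."""

    @abstractmethod
    def put(self, object_id, data):
        ...

    @abstractmethod
    def get(self, object_id):
        ...


class MemoryStore(Store):
    """Stand-in backend; a disk or cloud store would implement the same two calls."""

    def __init__(self):
        self._objects = {}

    def put(self, object_id, data):
        self._objects[object_id] = data

    def get(self, object_id):
        return self._objects[object_id]


class WordCountService:
    """An example service that depends on the interface, not on a backend."""

    def __init__(self, store):
        self.store = store

    def count(self, object_id):
        return len(self.store.get(object_id).decode("utf-8").split())


def migrate(object_ids, old, new):
    """Copy content unchanged from one backend to another."""
    for object_id in object_ids:
        new.put(object_id, old.get(object_id))
```

After a `migrate`, a `WordCountService` pointed at the new store gives the same answers as before, which is the practical meaning of hardware and software coming and going around constant content.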
    99. 99. Ben Sally
    100. 100. “When it's one click deposit, I'll do it" Medical scientists at Oxford
    101. 101. Bill Hubbard: Institutional Policies and Processes for Mandate Compliance. May 2009. http://www.sherpa.ac.uk/documents/OA%20Choices%20-%20researcher%27s%20view.ppt
    102. 102. What we need is: • Deposit by stealth and other easy solutions • Multiple Repository Deposit Regime (MuRDeR) • Answers to related problems that worry people such as multiple versions • Automation, automation, automation
    103. 103. Copyright ©
    104. 104. “There is a supreme irony that just as technology is allowing greater access to books and other creative works than ever before for education and research, new restrictions threaten to lock away digital content in a way we would never countenance for printed material.” Dame Lynne Brindley, CEO The British Library Copyright for Education and Research Golden Opportunity or Digital Black Hole? http://www.bl.uk/ip/pdf/copyrightresearchreport.pdf
    105. 105. Legal Deposit as a parallel to repository mandates
    106. 106. Bodley’s agreement 1610 • Order of the Star Chamber 1637 “That one Book of every sort that is new Printed, or Re-printed with Additions, be sent to the University of Oxford for the Use of the publick Library there, … to be sent to the Library at Oxford accordingly, upon pain of Imprisonment” http://www.british-history.ac.uk/report.aspx?compid=74953
    107. 107. “…there tower the few, the very few, Libraries of Deposit. These are the super-Dreadnoughts of the literary world, and the Bodleian claims to be among them … a really great library should have Universal scope, Independences, Size, Permanence, Wealth, and multiform Utility.” The Bodleian Library at Oxford. Falconer Madan. 1919 http://www.archive.org/stream/bodleianlibrarya00mada/bodleianlibrarya00mada_djvu.txt
    108. 108. 2. Management and preservation
    109. 109. alancleaver_2000 Attribution 2.0 Generic http://www.flickr.com/photos/alancleaver/2638883650/ “Preservation aims towards preserving access”
    110. 110. Assured secure storage and permanent access needs to be well-managed Aided by intra- and inter- institutional advisory and support services
    111. 111. Shared and distributed expertise
    112. 112. Sally Ben
    113. 113. Why do people choose to interact with systems?
    114. 114. Why do people choose to interact with systems? Disproportionate Feedback loops
    115. 115. Disproportionate Feedback Loop => The perception that a small effort leads to a very great benefit.
    116. 116. Disproportionate Feedback Loop => The perception that a small effort leads to a very great benefit. Which leads to more “little efforts” which add up!
    117. 117. High Scores Technically trivial, but... Psychologically addictive and drives a lot of replay
    118. 118. High Scores IR High scores? ● Usage stats ● Re-usage stats (trackbacks, tweets, references)
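As the slide says, a high-score table is technically trivial. A minimal sketch of one for repository items, assuming usage and re-usage counts have already been collected somewhere: the function name `high_scores`, the `views` and `reuse` field names and the equal weighting of the two counts are all arbitrary choices for illustration.

```python
def high_scores(stats, top=3):
    """Rank items by usage plus re-usage, highest first.

    stats maps an item ID to {"views": int, "reuse": int}.
    """
    totals = {item: s["views"] + s["reuse"] for item, s in stats.items()}
    # Sort by score descending, then by item ID so ties are stable.
    ranked = sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top]
```

The interesting part is not the code but showing the result back to depositors, which is what closes the feedback loop.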
    119. 119. Ben Sally
    120. 120. Parliamentary Office for Science and Technology Peer review is used in the UK for 3 main purposes: 1. Allocation of research funding 2. Publication of research in scientific journals. To assess the quality of research submitted for publication and to assess its importance. 3. Assess the research rating of university departments http://www.parliament.uk/post/pn182.pdf
    121. 121. 3. Dissemination
    122. 122. Links to: Actionable raw data; Data fusion Links to: Interactive version; Google maps; additional visualisation
    123. 123. Open Access
    124. 124. April 1932 Ramblers set out from Bowden Bridge for the Kinder Scout mass trespass http://kindertrespass.com/index.asp?ID=37
    125. 125. “What many people fail to realise is that the uplands of this country once belonged to us, open common land, free for all to walk at will. They were only enclosed and parcelled off to the rich by acts of parliament pushed through by the rich. An old rhyme goes: They hang the man and flog the woman That steals the goose from off the common Yet leave the greater villain loose That steals the common from the goose.” Mike Harding The Guardian, Wednesday 18 April 2007 http://www.guardian.co.uk/environment/2007/apr/18/society.guardiansocietysupplement1
    126. 126. Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! Free! FREE!
    127. 127. Green Open Access • Gold Open Access • Open options • Mandated open access • Open access journals • Open Archives (OAI) Service • “Operates two different open access models” (as reported by SHERPA/RoMEO) • “Some journals in UKPMC allow harvesting of the full text of all items, others allow it for only some items, and many do not allow it at all. See the PMC Open Access List for specifics.” Too complicated!
    128. 128. Preview Time
    129. 129. You are here
    130. 130. [Figure: four charts of evolution against time, labelled Seismic change, Rapid change, Step change and Incremental change]
    131. 131. Trends 1. Entering a period of steady growth and change 2. Repositories as a set of services embedded within institutional systems 3. Names
    132. 132. Trends: Still waiting… • Easy multiple deposit • Collaboration between publishers, repositories, HEIs, government and research funders as a group • Common policies • Less complexity
    133. 133. Crystal ball by Hamachi! CC license. Attribution-Non-Commercial-No Derivative Works 2.0 Generic Available at http://www.flickr.com/photos/mawari/2091456761/
    134. 134. Print-on-demand is going to be big And I don't mean printed facsimiles.
    135. 135. What does having a book mean if you can print one in 5 minutes for £2? http://www.ondemandbooks.com/home.htm
    136. 136. How about this scenario then? It's all technically possible now
    137. 137. How about this scenario then? It's all technically possible now You print off a set of articles into a book on the library's book-printer. Your colleagues' comments, tweets and reviews are interleaved with the text. Your colleagues were found from your professional social network.
    140. 140. How about this scenario? You create a bookmark list of plates from 18th century books online which you believe to be the work of one anonymous artist. This list with your comments is a new resource in and of itself, and can be commented on or printed as a book.
    141. 141. Permanent books, temporary magazines? Is this true? How about facsimiles that are printed, so that a student can study with their coffee and donuts? And why print to paper? Why not print to digital paper, once it arrives...
