Web 2.0 and repositories - have we got our repository architecture right?

A presentation given at the Talis Xiphos Research Day, 10 June 2008.

  1. Web 2.0 and repositories… have we got our repository architecture right?
  2. Outline
     - where are we now?
     - what’s wrong with where we are now?
     - what can we do about it?
     - do we need a new vision?
  3. Where are we now?
  4. What is a repository?
     “a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution. … An institutional repository is not simply a fixed set of software and hardware” (Cliff Lynch, 2003)
  5. Repository “doing” words
     - manage
     - deposit
     - disclose
     - make openly available
     - curate
     - preserve
  6. Repository content
     - all sorts… but most “academic” focus currently on
       - scholarly publications
       - learning objects
       - research data
  7. Repository content
     - all sorts… but most “academic” focus currently on
       - scholarly publications
       - learning objects
       - research data
     - this talk focuses on the first of these, but with the intention that most of what I say will be generic
  8. Repository architecture
     - largely institutional focus, though with some exceptions – arXiv, RePEc, JORUM, etc.
     - interoperability through centralised aggregators (national and global)
       - search services (OAIster, Intute, …)
       - registries (DOAR, ROAR, …)
     - harvesting metadata about content using OAI-PMH (metadata = simple Dublin Core)
     - content = PDF
     - SWORD as deposit API
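The harvesting step on this slide can be sketched roughly as follows. This is a minimal illustration, not any particular aggregator's implementation: the endpoint `https://repository.example.org/oai`, the record identifier and the sample record are all made up for the example, and only the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_simple_dc(response_xml):
    """Pull the identifier, titles and creators out of each harvested record."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records


# Hypothetical response fragment; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:123</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
          <dc:creator>Smith, J.</dc:creator>
          <dc:creator>Jane Smith</dc:creator>
          <dc:format>application/pdf</dc:format>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

harvested = parse_simple_dc(SAMPLE_RESPONSE)
```

Note that the harvester only ever sees the description: the two `dc:creator` values are free text, and nothing in the record says whether they name the same person.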
  9. What’s “wrong” with where we are now?
  10. #1 We talk about “repositories”…
  11. …rather than “the Web”
      - a focus on ‘making content available on the Web’ would be more intuitive to researchers
  12. Whatever happened to the CMS?
      - a focus on ‘content management’ would change our emphasis
      - OAI-PMH out…
      - search engine optimisation, usability, accessibility, Web design, tagging, information architecture, cool URIs in…
  13. #2 We don’t emphasise…
      - Google indexing
      - RSS feeds
      - widget technology – embedding functionality into other sites
  14. #3 Our focus is on sharing metadata…
      - … even though we have full-text to share
      - worse… the full-text we share tends to be PDF rather than native Web format
        - the Web equivalent of a cul-de-sac
      - and the metadata we share tends to be “simple Dublin Core”
        - little consistency in approaches to describing ‘files’ vs. ‘documents’
        - little consistency in naming authors and subjects
        - ultimately, it is both too simple and too complex!
  15. #4 We ignore the Web Architecture
      - we have tended to adopt service oriented approaches
      - in line with long tradition from Z39.50 to SOAP/WSDL
        - e.g. JISC eFramework
      - focus is on building “services on content” rather than on the “content”
      - (photo credit: pbo31 @ flickr)
  16. REST is good
      - we don’t tend to adopt a resource oriented approach
      - we don’t adopt REST – an architectural style with a focus on resources, their identifiers (e.g. URIs), and a simple uniform set of operations that each resource supports (e.g. GET, PUT, POST, DELETE)
      - we don’t encourage a Web style “follow your nose” approach
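To make the resource oriented idea concrete, here is a small sketch using an in-memory stand-in for a Web server. Everything in it (the `ResourceStore` class, the `/eprints` and `/files` collections, the `links` field) is hypothetical and chosen for illustration. The point is only that every item has a URI, that the same four operations apply to every resource, and that a client can "follow its nose" from one representation to the next.

```python
class ResourceStore:
    """In-memory stand-in for a Web server: every item is a resource with a URI."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def post(self, collection_uri, representation):
        """Create a new resource under a collection and return its URI."""
        uri = f"{collection_uri}/{self._next_id}"
        self._next_id += 1
        self._resources[uri] = representation
        return uri

    def get(self, uri):
        """Return the current representation, or None if the URI is unknown."""
        return self._resources.get(uri)

    def put(self, uri, representation):
        """Replace the representation held at a URI."""
        self._resources[uri] = representation

    def delete(self, uri):
        """Remove the resource; deleting an unknown URI is a no-op."""
        self._resources.pop(uri, None)


store = ResourceStore()

# Deposit a full-text file and a description, each with its own URI.
file_uri = store.post("/files", {"media_type": "text/html", "body": "<p>...</p>"})
paper_uri = store.post("/eprints", {"title": "An example eprint", "links": {}})

# Link the description to the full text using nothing but its URI.
store.put(paper_uri, {"title": "An example eprint", "links": {"fulltext": file_uri}})

# "Follow your nose": start at the paper and dereference the link it carries.
fulltext = store.get(store.get(paper_uri)["links"]["fulltext"])
```

A service oriented design would instead expose operations such as "search" or "getFullText" with their own bespoke interfaces, which is the contrast the slide is drawing.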
  17. #5 We are antisocial…
      - … at least, we tend to treat “content” in isolation from the “social networks” that need to grow around that content
      - successful “repositories” (Flickr, YouTube, Slideshare, etc.) promote the social activity that takes place around content as well as the content management and disclosure activity
        - friends, groups, social tagging, comments, embedding, re-purposing, etc.
  18. But not just about functionality…
      - the institutional approach has a fundamental mismatch with the real-life social networks adopted by researchers
        - subject-based
        - cross-institutional
        - global
      - while the institutional approach is good from the perspective of institutional management, preservation, etc.
      - globally “concentrated” repositories might better reflect the social networks that need to arise
  19. The net effect…
      - … is that there is no net effect
      - repositories remain uncompelling places to disclose scholarly publications from the point of view of the researcher
      - perceived cost of deposit remains higher than perceived benefits
      - we resort to institutional or funder mandates, “thou shalt deposit”, to fill what would otherwise remain empty
  20. Wait just a minute…
      - didn’t we use to have globally “concentrated” repository services?
      - arXiv – the first Web 2.0 service?
      - invented before the Web
      - unfortunately, also invented before Amazon S3
      - i.e. before we knew how to scale things
  21. Wait just another minute…
      - … doesn’t the blogosphere successfully layer a set of globally concentrated services over a distributed network of content?
        - e.g. Technorati
      - yes… but…
      - the content is under the control of ‘individuals’ rather than ‘institutions’, and…
      - the interoperability “glue” (RSS and tagging) is very lightweight and RESTful
  22. Having the conversation is hard
      - highly political space
      - strong “open access” voices who, understandably, don’t want their agenda de-railed by discussion about
        - preservation
        - search engine optimisation
        - Web 2.0
        - social networks
        - semantic Web
        - the future of peer review
      - it can be hard to get the conversation started
  23. What can we do about it?
  24. Things can go two ways…
      - I think that things can go two ways…
      - The Web 2.0 Way or The Semantic Web Way
      - … possibly both
  25. Things can go two ways…
      - what would a Web 2.0 repository look like?
  26. Like this?
  27. A Web 2.0 repository?
      - high-quality browser-based document viewer (not Acrobat!)
      - tagging, commentary, more-like-this, favorites, …
      - persistent (cool) URIs to content
      - ability to form simple social groups
      - ability to embed documents in other Web sites
      - high visibility to Google
      - offer RSS as primary API
      - use of Amazon S3 to cope with scalability
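As a rough sketch of what "RSS as primary API" could mean in practice, the following builds an RSS 2.0 feed of recent deposits, with each item's persistent URI as its link and its social tags carried as `category` elements. The function name `build_rss`, the item fields and the example URIs are assumptions made for this illustration; only the RSS 2.0 element names are standard.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, items):
    """Serialise recent deposits as an RSS 2.0 feed (returned as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Recent deposits in {channel_title}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        # The persistent ("cool") URI doubles as the link and the unique id.
        ET.SubElement(node, "link").text = item["uri"]
        ET.SubElement(node, "guid").text = item["uri"]
        # Social tags travel with the item as RSS categories.
        for tag in item.get("tags", []):
            ET.SubElement(node, "category").text = tag
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_rss(
    "Example repository",
    "https://repository.example.org/",
    [
        {
            "title": "An example eprint",
            "uri": "https://repository.example.org/eprints/123",
            "tags": ["repositories", "web2.0"],
        }
    ],
)
```

Any feed reader or aggregator can consume the result without knowing anything about repositories, which is the attraction of this kind of lightweight glue.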
  28. In short… we go “simple”
      - we develop simple(ish) repositories
      - and complex aggregators and search engines
      - RSS/Atom as primary “glue”
      - social tagging as “description”
      - full-text indexing
      - microformats
      - Google Sitemaps to guide harvesters to content
      - complex functional requirements (e.g. author disambiguation) either ignored or met thru complexity in aggregators
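The "simple" repository's side of the bargain can be very small. As an illustration, this sketch emits a sitemap that points harvesters at each item's URI. The `build_sitemap` helper and the example URIs and dates are hypothetical; the `urlset`, `url`, `loc` and `lastmod` elements and the namespace are those of the Sitemaps 0.9 protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Serialise (uri, last_modified) pairs as a Sitemaps 0.9 document."""
    # Emit the sitemap namespace as the default namespace, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for uri, last_modified in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    ("https://repository.example.org/eprints/123", "2008-06-10"),
    ("https://repository.example.org/eprints/124", "2008-06-11"),
])
```

Everything harder, such as working out that two author strings refer to the same person, is left to whoever harvests the content.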
  29. Alternatively… we go “complex”
      - … we look to the Semantic Web
      - we create and share much richer metadata about scholarly publications than we do currently
      - we explicitly model complexity (a la FRBR)
      - and aggregations
      - we expose resulting metadata thru the SW “graph”
  30. We go “complex”…
      - SWAP and ORE
  31. We go “complex”…
      - SWAP – Scholarly Works Application Profile
      - an application of the Dublin Core Abstract Model and Application Profiles
      - capturing relationships between works, expressions, manifestations, items and agents
      - ORE – OAI Object Re-use and Exchange
      - capturing relationships between aggregations and aggregated resources
      - note that ORE not tied to specific entity in FRBR
      - note that ORE implemented as profile of Atom
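  32. SWAP application profile model (diagram)
      - entities: ScholarlyWork, Expression, Manifestation, Copy, Agent
      - ScholarlyWork isExpressedAs Expression (0..∞)
      - Expression isManifestedAs Manifestation (0..∞)
      - Manifestation isAvailableAs Copy (0..∞)
      - relationships to Agent (multiplicities shown as 0..∞): isCreatedBy, isPublishedBy, isEditedBy, isFundedBy, isSupervisedBy, AffiliatedInstitution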
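One way to get a feel for the SWAP chain of entities is to model it as plain data classes. This is only a sketch of the work, expression, manifestation and copy chain named on the slides, not the application profile itself. The attribute names (`title`, `version`, `media_type`, `uri`, `name`) are invented for the example, and attaching `isCreatedBy` to the work is an assumption, since the transcript does not say which entity each agent relationship belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str  # hypothetical attribute


@dataclass
class Copy:
    uri: str  # where this particular copy can be retrieved


@dataclass
class Manifestation:
    media_type: str  # hypothetical attribute, e.g. "application/pdf"
    copies: list = field(default_factory=list)  # isAvailableAs, 0..n


@dataclass
class Expression:
    version: str  # hypothetical attribute, e.g. "accepted manuscript"
    manifestations: list = field(default_factory=list)  # isManifestedAs, 0..n


@dataclass
class ScholarlyWork:
    title: str
    creators: list = field(default_factory=list)  # isCreatedBy (assumed to sit here)
    expressions: list = field(default_factory=list)  # isExpressedAs, 0..n


def all_copy_uris(work):
    """Walk work -> expression -> manifestation -> copy and collect every URI."""
    return [
        copy.uri
        for expression in work.expressions
        for manifestation in expression.manifestations
        for copy in manifestation.copies
    ]


work = ScholarlyWork(
    title="An example eprint",
    creators=[Agent("Jane Smith")],
    expressions=[
        Expression(
            version="accepted manuscript",
            manifestations=[
                Manifestation("application/pdf", [Copy("https://repository.example.org/files/1.pdf")]),
                Manifestation("text/html", [Copy("https://repository.example.org/files/1.html")]),
            ],
        )
    ],
)
```

Compared with a flat simple Dublin Core record, the model makes it explicit that the PDF and the HTML page are two formats of one version of one work.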
  33. OAI ORE
  34. Summary
      - what can we learn from Web 2.0?
        - user interface design matters
        - global ‘concentration’ is an enabler of social interaction
      - simple DC is both too simple and too complex
      - richer DC application profiles such as SWAP and/or RDF applications like ORE may be a way forward
      - but need to ensure that their use does not over-complicate user interfaces and workflows
  35. A new vision?
  36. Flickr and digital cameras…
      - didn’t just take the practice of photography and put it on the Web
      - they fundamentally changed what photography was about
  37. What’s our vision?
      - the standards we adopt in the scholarly communication space…
      - OAI-PMH, OpenURL, DOI, PDF, …
      - are primarily about replicating in a Web world what we have always done on paper
      - this is not surprising given the necessary inertia of the scholarly communication life-cycle
      - but… do we need to re-envision scholarly communication as a true Web process?
      - if so, what would a repository look like?
  38. thank you