Libraries, digital libraries and
digital library research


                   Lorcan Dempsey
                   OCLC

                   Keynote presentation at

                   European Conference on Digital
                   Libraries 2004

                   University of Bath
                   September 12 – 17 2004
Overview
Holes
‘There was once a man who aspired to be the author of the
general theory of holes.

When asked “What kind of hole – holes dug by children in the
sand for amusement, holes dug by gardeners to plant lettuce
seedlings, tank traps, holes made by roadmakers?” he would
reply indignantly that he wished for a general theory that
would explain all of these.



This man’s achievement has
passed totally unnoticed except by me.’
Digital libraries and holes …

 ▪ ‘Digital library’ has no precise or agreed referent
 ▪ Different communities of practice
 ▪ Different incentives
   • Serve
   • Build
   • Research

Compare ‘archive’
   • Archival institution
   • Archival materials
   • OAI
   • A promise of preservation?
[Diagram labels: Digital library research, Digital library, Library, Digital libraries]
[Diagram: interests grouped around three labels, with ‘Digital library’ at the centre]
Digital library research: anthropology/ethnography/social science, Grid, W3C, computer science, library and information science, economics, industrial R&D, HCI, Semantic Web
Library: entertainment, e-research, ‘business’, e-learning, banks, cultural heritage
Digital libraries: Artstor, Jorum, libraries …, Amazon, institutional repositories, arXiv, Internet Archive, BBC archive
Emphasis: Library
[Diagram labels: Digital library research, Digital library, Library, Digital libraries]
A library as institution
Libraries
            ‘So why have I written
            this? I can’t show it if it’s
            going to contradict or
            undermine my case.

            There are a number of
            reasons. First and
            foremost, I am a
            librarian. I live for
            records and documents.’
A library as institution


Because the purpose and result of absorbing information
is always finally to produce further information, i.e., to
continue the conversation,

the function of the library must be understood as one
that assists members of the community both in taking
particular positions and in recognizing and assessing the
positions taken by others.

    Ross Atkinson. Contingency and contradiction: The place(s) of the
    library at the dawn of the new millennium.
    Journal of the American Society for Information Science and
    Technology, Volume 52, Issue 1, Pages 3-11. Published Online: 2001.
A library as institution


We often hear it said that libraries (and librarians) select,
organize, retrieve, and transmit information or knowledge. That
is true.

But those are the activities, not the mission, of the library.

… the important question is: "To what purpose?" We do not do
those things by and for themselves.

We do them in order to address an important and continuing
need of the society we seek to serve. In short, we do it to
support learning.

         Robert Martin. Libraries and Learners in the Twenty-first Century.
         http://www.imls.gov/scripts/text.cgi?/whatsnew/current/sp040503.htm
Libraries and digital libraries
 ▪ Support research and learning.
 ▪ Discover position of others and form one’s own position.

 ▪ In order to uphold their mission and values…
 ▪ … they must renovate their practices.
“Search engine mindshare”
    John Regazzi

“In a survey for this lecture, librarians and scientists were
asked to name the top scientific and medical search resources
that they use or are aware of. The difference is startling.”

 ▪ Scientists:
   • Google
   • Yahoo
   • PubMed
 ▪ Librarians:
   • ScienceDirect
   • ISI Web of Science
   • MEDLINE

Source: John Regazzi,
The Battle for Mindshare: A battle beyond access and retrieval
http://www.nfais.org/publications/mc_lecture_2004.htm
Pattern recognition – libraries now

 ▪ The ‘Amazoogle’ effect
 ▪ Value
 ▪ User behavior opaque
 ▪ Uncertainty about digital directions

‘The future is here. It's just not evenly distributed yet’
William Gibson
The difficulty in creating a digital management strategy stems
in part from the bewildering convergence of technological
developments.

Developing a digital management strategy is further
complicated by the fact that there are no recognized patterns or
models for managing digital assets.

Some managers seek to develop fully distributed institutional
repositories but still must choose between open-source
solutions or commercial providers. Others prefer to place their
material in one of a limited number of dedicated storage
institutions. While best practices may exist for given technical
processes, library managers do not have a single paradigm to
use as the basis for developing operational plans and policies to
capture, store, index, preserve, and redistribute the intellectual
output in digital formats.
                           Managing Digital Assets, CLIR primer
                           program, 2005
Impact of digital library research?
 ▪ User studies
   • How much do we know about changing patterns of research,
     learning and engagement?
 ▪ Federation and metasearch
   • FDI, IndexData, Cheshire, iPort, …
   • OAI/OpenURL
   • NISO metasearch – issues still to be addressed
 ▪ Repositories/digital library systems
   • Multiple communities
   • DSpace, Fedora, CONTENTdm, DLXS, ..
 ▪ Metadata
   • Growing acronymic density
   • Collections, rights, policies, services, …
   • Complex objects, relations
 ▪ Identifiers/citation
 ▪ Preservation

Local successes … but we have many open questions.
Collections grid

Axes: stewardship (high / low) and uniqueness (low / high)

Low uniqueness, high stewardship:
   • Books
   • Journals
   • Newspapers
   • Gov. docs
   • CD, DVD
   • Maps
   • Scores

Low uniqueness, low stewardship:
   • Freely-accessible web resources

High uniqueness, high stewardship: Special collections (Archives)
   • Rare books
   • Local history materials
   • Archives & Manuscripts
   • Theses & dissertations

High uniqueness, low stewardship: Research and learning materials (Untransferred records)
   • ePrints/tech reports
   • Learning objects
   • Courseware
   • E-portfolios
   • Research data
Collections grid
[Diagram: a grid with high/low axes, labelled: Publishing, D2D, Amazoogle, disclosure, Reformatting, Cultural heritage, Digital asset management, E-learning, E-research]
[Diagram: the library between user environments and the resource environment]
User environments: lab books, PDAs, campus portal, learning management systems, course material, text book, exhibitions, personal collections, reading lists
Library and resource environment: virtual reference, institutional repository, digital collections, aggregations, licensed collections, catalog, e-reserve, cataloging, ILL
The world is changing …

 ▪ Why is it difficult?
Scope, scale, diversity

 ▪ Systemic issues
   • No single system is the sole focus of a user’s attention
   • How do systems and services work across the four
     quadrants of the collections grid?
   • How do they fit into wider enterprise systems?
 ▪ Structure of costs does not reflect users’ value perception
   • Reallocation of resources difficult
   • Little substitution – ‘and’ not ‘or’
A new world

 ▪ Co-evolution with research and learning behaviors which
   are themselves changing

 ▪ Unsure about appropriate “economy of presence”
   • Place, network hub, channel, …
   • Web services, portlets, channels, …
   • Ambience, diffusion, ubiquity, recombinance, …

 ▪ E.g. Trajectory of search
   • Search system
   • Search system, machine interface, metasearch
   • Provide data, externalize search
       • Google, OAI
Webulation …

 ▪ Monolithic applications resistant to
   • Webulation
   • Service oriented architectures

 ▪ Massive legacy investment in knowledge structure
   unconnected to the web
   • How to release its value in a network environment

 ▪ Content does not easily flow into user space for
   manipulation, packaging, aggregation
Vendor environment

 ▪ Many libraries have outsourced development effort
 ▪ Library vendors do not have large R&D budgets
 ▪ Poor out-of-the-box support for ‘below-the-line’ materials in
   digital form
 ▪ Interesting tension between commodity (standards) and
   added value
 ▪ OSS environment very unsophisticated
 ▪ Limited support for logistics/supply chain/integration
   services
Limited application platforms

 ▪ Consider
   • Google
   • Amazon
   • eBay
   • MapQuest
 ▪ Massively central applications platforms working in loosely
   coupled webby world
 ▪ Software as a service
   • APIs
   • Gmail
   • PayPal
   • search

 ▪ Library world
   • Fragmented systems and development effort
   • Does not benefit from scale
   • Unsustainable local development agendas
 ▪ Organizational rearticulation difficult.
 ▪ Application platforms?
   • CDL
   • JISC
   • DEF
   • OCLC/RLG
Architecture? Theory?

 ▪ Do we need a big picture?
 ▪ Allows the articulation of technical and business discussion?
 ▪ An unnecessary constraint?
Without it we are susceptible to …

 ▪ Marchitecture
 ▪ Techeology
 ▪ Portal envy
 ▪ Gratuitous acronym requests in RFPs
 ▪ Beauty contests
   • DSpace, Fedora, …
A history of consumption means that we are
unprepared for contribution
 ▪ Standards
 ▪ Open source software
 ▪ Common services

 ▪ Limited structures to capture contribution and support.
And finally …

 ▪ Libraries need to think about libraries, not digital libraries
 ▪ And they need help from wherever they can get it!


Editor's Notes

  • #7 Digital libraries – a wide range of services are digital-library-like. They involve selection, curation and disclosure of digital materials for particular audiences. Depending on your definition some of these are in, some out. Wherever you draw the line there is significant activity. ‘Business’ – many organizations do digital-library-like activities to support their business needs. For example, think what will happen with historical collections of media materials in the ‘media’ business; collections of business documents (insurance, cheques) and so on in financial services companies; e-learning repositories; developing research collections. Digital library research reaches into many disciplines. Although there is a somewhat diffuse community of ‘digital library researchers’ in computer science, library and information science, and related fields, those who are building digital libraries are potentially interested in a wider research hinterland. This means that ‘digital library’ relates to a very diffuse set of interests.
  • #8 I will focus on libraries!
  • #14 Not clear how extensive the survey was or what the population was.
  • #15 Amazoogle – from a policy and funding point of view libraries are increasingly working in an environment shaped by expectations created by Google and Amazon. The library has to create the value case in such an environment. User behaviors are changing in a network environment. Research and learning behaviors are co-evolving with general network activity. People create and consume information in new ways. There are no patterns for digital directions.
  • #16 This may slightly overstate the case, but it is clear that we are some way from being able to routinely create viable digital information environments.
  • #17 It is difficult to measure the impact of digital library research. It is clear that there have been local successes and one can point to certain outcomes which benefited from programmatic research funding. The ROI on user studies seems low. Many are tied to particular systems or services. Some commercial metasearch products have been assisted by being part of the EU technical research and development investment. OCLC Pica’s iPort grew out of the EU project Decomate. Fretwell Downing participated in EU and JISC funded activities which contributed to their current suite. IndexData did nice work in several EU projects also. Cheshire was assisted by NSF funding. OpenURL and OAI – Herbert Van de Sompel.
  • #20 Increasingly the library needs to provide services into the user environment – it needs to be visible in course management systems, in university portals, and so on. Not everybody will come to the library or to the library portal.
  • #23 “economy of presence” – a phrase of Bill Mitchell’s. Users have heterogeneous requirements. What is an effective network presence?
  • #25 ‘below-the-line’ – i.e. below the line in the collections grid. These tend to be unique materials – special collections and research and learning materials (e-prints, data sets, courseware, …)
  • #27 JISC has its Information Architecture and now the E-Learning Framework. These help us have conversations, create shared understanding, help us partition problems, and so on. The library community seems resistant to such shared architectures, which may be a good or a bad thing depending on your point of view.
  • #28 Marchitecture – an architecture produced by a vendor for marketing purposes. May not be the best guide to the applications space (do a search on Google for more). Techeology – a mixture of technology and ideology. Discussion where ideological beliefs cloud technical discussion. I find this a useful word to describe quite a bit of the conversation one comes across. Portal envy – we must have a portal, everybody else does. Beauty contests – discussion starts with which of the commonly known repository frameworks one wants rather than with requirements etc.
  • #29 Continued health of standards and OSS depends on intellectual and other contributions and sustaining frameworks. Mackenzie Smith spoke about difficulties with OSS at this conference. Neil McLean spoke about common services and the need for such infrastructural services. Again we are not sure how to secure and sustain these.