The JISC Information Environment and collection description Andy Powell [email_address] UKOLN, University of Bath UN/WHO HIV/AIDS meeting, Geneva 29-30 May 2003
Contents JISC Information Environment technical architecture http://www.ukoln.ac.uk/distributed-systems/jisc-ie/ collection description JISC IE service registry http://www.mimas.ac.uk/iesr/
Simple scenario consider a researcher searching for material to inform a research paper on HIV and/or AIDS he or she searches for  ‘hiv aids’  using: the RDN, to discover Internet resources  ZETOC, to discover recent journal articles (and, of course, he or she may use a whole range of other search strategies using other services as well)
 
 
 
 
Issues different user interfaces look-and-feel subject classification, metadata usage everything is HTML – human-oriented difficult to merge results, e.g. combine into a list of references difficult to build a reading list to pass on to students need to manually copy-and-paste search results into HTML page or MS-Word document or desktop reference manager or …
Issues (2) difficult to move from discovering journal article to having copy in hand (or on desktop) users need to manually join services together problem with hardwired links to books and journal articles, e.g. lecturer links to university library OPAC but student is distance learner and prefers to buy online at Amazon lecturer links to IngentaJournals but student prefers paper copy in library
The problem space… from perspective of ‘data consumer’ need to interact with multiple collections of stuff - bibliographic, full-text, data, image, video, etc. delivered thru multiple Web sites few cross-collection discovery services  (with exception of big search engines like Google, but lots of stuff is not available to Google, i.e. it is part of the ‘invisible Web’) from perspective of ‘data provider’ few agreed mechanisms for disclosing availability of content
UK JISC IE context… 206 collections and counting… (Hazel Woodward, e-ICOLC, Helsinki, Nov 2001): Books: 10,000+; Journals: 5,000+; Images: 250,000+; Discovery tools: 50+; A & I databases, COPAC, RDN, …; national mapping data & satellite imagery; plus institutional content (e-prints, research data, library content, learning resources, etc.); plus content made available thru projects – 5/99, FAIR, X4L, …; plus …
The problem(s)… portal problem how to provide seamless discovery across multiple content providers appropriate-copy problem how to provide access to the most appropriate copy of a resource (given access rights, preferences, cost, speed of delivery, etc.)
A solution… an information environment: a framework of machine-oriented services allowing the end-user to discover, access, use and publish resources across a range of content providers; move away from lots of stand-alone Web sites... ...towards a more coherent whole; remove need for user to interact with multiple content providers; note: ‘remove need’ rather than ‘prevent’
JISC Information Env. discover finding stuff across multiple content providers access streamlining access to appropriate copy content providers expose metadata about their content for searching harvesting alerting develop services that bring stuff together portals (subject portals, media-specific portals, geospatial portals, institutional portals, VLEs, …)
A note about ‘portals’: the word ‘portal’ is possibly slightly misleading; the JISC IE architecture supports many different kinds of user-focused services… subject portal; reading list and other tools in VLE; commercial ‘portals’ (ISI Web of Knowledge, ingenta, Bb Resource Center, etc.); library ‘portal’ (e.g. Zportal or MetaLib); SFX service component; personal desktop reference manager (e.g. EndNote)
Discovery: technologies that allow providers to disclose metadata to portals: searching - Z39.50 (Bath Profile) and SRW; harvesting - OAI-PMH; alerting - RDF Site Summary (RSS). Fusion services may sit between provider and portal: broker (searching), aggregator (harvesting and alerting)
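As an illustration of the harvesting route, the sketch below builds an OAI-PMH `ListRecords` request of the kind an aggregator might send to a content provider. The `verb`, `metadataPrefix` and `set` arguments are standard OAI-PMH request parameters; the repository address and the helper name `oai_list_records_url` are assumptions made for this example, not part of the JISC IE documentation.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, used only for illustration
example_url = oai_list_records_url("http://repository.example.org/oai")
print(example_url)
```

An aggregator would fetch this URL, store the returned records, and repeat the request at intervals; that fetching step is left out here.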
Access in the case of books, journals, journal articles, end-user wants access to the most appropriate copy need to join up discovery services with access/delivery services (local library OPAC, ingentaJournals, Amazon, etc.) need localised view of available services discovery service uses the  OpenURL  to pass metadata about the resource to an ‘OpenURL resolver’ the ‘OpenURL resolver’ provides pointers to the most appropriate copy of the resource, given: user and institutional preferences, cost, access rights, location, etc.
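The OpenURL step can be sketched roughly as follows: the discovery service attaches citation metadata to the address of the user's resolver, and the resolver orders the delivery services that can supply the item according to local preferences. The key names (`sid`, `genre`, `isbn`, `title`) follow the early key/value style of OpenURL; the resolver address, the placeholder ISBN, the service table and the `resolve` ranking rule are all assumptions for illustration, and a real resolver would also weigh cost, access rights and location.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_openurl(resolver_base, citation):
    """Attach citation metadata to a resolver address as a query string."""
    return resolver_base + "?" + urlencode(citation)


def resolve(openurl, services, preferences):
    """Return names of services able to deliver the cited genre, preferred first."""
    query = parse_qs(urlparse(openurl).query)
    genre = query["genre"][0]
    candidates = [s for s in services if genre in s["genres"]]

    def rank(service):
        # Services not named in the preferences sort after those that are
        if service["name"] in preferences:
            return preferences.index(service["name"])
        return len(preferences)

    return [s["name"] for s in sorted(candidates, key=rank)]


# Hypothetical delivery services and a placeholder citation
SERVICES = [
    {"name": "Library OPAC", "genres": {"book", "journal"}},
    {"name": "ingentaJournals", "genres": {"article"}},
    {"name": "Amazon", "genres": {"book"}},
]
citation = {"sid": "example:portal", "genre": "book",
            "isbn": "0000000000", "title": "An example title"}
url = build_openurl("http://resolver.example.ac.uk/openurl", citation)

distance_learner = resolve(url, SERVICES, ["Amazon"])
campus_student = resolve(url, SERVICES, ["Library OPAC"])
```

The same link therefore leads the distance learner to Amazon first and the campus student to the library OPAC first, which is the point of replacing hardwired links with resolver links.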
Shared services service registry information about collections (content) and services (protocol) that make that content available authentication and authorisation OpenURL and other resolver services user preferences and institutional profiles terminology services metadata registries ...
JISC IE architecture JISC-funded content providers institutional content providers external content providers brokers aggregators catalogues indexes institutional portals subject portals learning management systems media-specific portals end-user desktop/browser presentation fusion provision OpenURL resolvers shared infrastructure authentication/authorisation (Athens) JISC IE service registry institutional preferences services terminology services user preferences services resolvers metadata schema registries
Summary Z39.50 (Bath Profile), OAI, RSS are key ‘discovery’ technologies... …  and by implication, XML and simple/unqualified Dublin Core IEEE LOM doesn’t feature – but anticipate delivery of rich metadata as part of content packages access to resources via OpenURL and resolvers where appropriate Z39.50 and OAI not mutually exclusive general need for all services to know what other services are available to them
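To make the "XML and simple/unqualified Dublin Core" point concrete, the sketch below reads a simple Dublin Core record of the kind carried in OAI-PMH responses (the `oai_dc` format) into a dictionary. The two namespace URIs are the standard ones; the record content and the helper name `dc_to_dict` are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Invented example record in the oai_dc container format
RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example resource about HIV and AIDS</dc:title>
  <dc:creator>Example, A.</dc:creator>
  <dc:subject>hiv</dc:subject>
  <dc:subject>aids</dc:subject>
  <dc:identifier>http://www.example.org/resource/1</dc:identifier>
</oai_dc:dc>"""


def dc_to_dict(xml_text):
    """Collect Dublin Core elements into {element name: [values]}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        if child.tag.startswith("{" + DC + "}"):
            name = child.tag.split("}", 1)[1]
            # Every DC element is repeatable, so values are kept in lists
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


record = dc_to_dict(RECORD)
```

Because every element is optional and repeatable, a portal merging records from several providers can treat them uniformly, at the cost of losing any richer structure the provider holds.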
Collections: content is often managed and made available in the form of ‘collections’; collection - “an aggregation of one or more items”; aggregation by location, type/form of item, provenance of item, source/ownership of item, nature of item content, etc.
Physical vs. digital: physical collections of physical items (e.g. books, journals); digital collections of digital items (texts, images, multimedia objects, software, datasets, “learning objects”, etc.), or of digital metadata records describing physical items (e.g. MARC records in OPAC) or describing digital items (e.g. Dublin Core records in subject gateway database)
Services: service - “the provision of, or system of supplying, one or more functions of interest to an end user or software application”; informational services - provide access to items and/or collections, e.g. a library, a Web site, a catalogue; transactional services - not primarily concerned with supply of information, e.g. photocopy service, authentication service. Users access collections of content and metadata through services
Services (2): physical service - provided physically (e.g. a library); network service - provided digitally (e.g. an image archive); structured network service - network service that provides structured access to structured resources, user is software application; unstructured network service - presenting resources to human user, i.e. a Web site!
Physical collections Physical services make physical collections available at physical locations   Collection of physical items Physical location Physical service
Digital collections Network services make digital collections available at digital locations   Collection of digital items Digital location Web site Network service (unstructured)
Collections and catalogues OPAC Web interface Digital location Network service (unstructured) Collection of digital metadata records
Digital collections / metadata Collection of digital items Web site Network service (unstructured) Digital location Collection of digital metadata records
Collections and services OAI repository Harvest via OAI-PMH Z39.50 target Search/retrieve via Z39.50 Web site Collection of digital metadata records Collection of digital or physical items SOAP receiver operations via SOAP Collection available via multiple network services unstructured network service structured network service RSS channel Alert via RSS/HTTP
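The relationship on this slide, one collection exposed through several network services, can be modelled roughly as below. The class and field names are hypothetical and are not taken from the IESR schema; the list of services mirrors the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    name: str
    protocol: str      # e.g. "HTTP", "OAI-PMH", "Z39.50", "RSS", "SOAP"
    structured: bool   # True if the intended user is a software application


@dataclass
class Collection:
    title: str
    services: List[Service] = field(default_factory=list)

    def structured_services(self):
        """Services a portal or broker could use without screen-scraping."""
        return [s for s in self.services if s.structured]


# One collection of metadata records, available through five services
catalogue = Collection("Example metadata records", [
    Service("Web site", "HTTP", structured=False),
    Service("OAI repository", "OAI-PMH", structured=True),
    Service("Z39.50 target", "Z39.50", structured=True),
    Service("RSS channel", "RSS", structured=True),
    Service("SOAP receiver", "SOAP", structured=True),
])
machine_protocols = [s.protocol for s in catalogue.structured_services()]
```

Keeping the collection separate from its services means a description of the content is written once, however many access routes are added later.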
JISC IE architecture JISC-funded content providers institutional content providers external content providers brokers aggregators catalogues indexes institutional portals subject portals learning management systems media-specific portals end-user desktop/browser presentation fusion provision OpenURL resolvers shared infrastructure authentication/authorisation (Athens) JISC IE service registry institutional preferences services terminology services user preferences services resolvers metadata schema registries
JISC IE Service Registry JISC IE Service Registry (IESR) holds descriptions about physical and digital collections of content digital collections of metadata (about the above) the structured and unstructured network services that make those collections available (Web sites, OAI repositories, Z39.50 and SRW targets, RSS channels) the owners and administrators of those collections and services schema still under development
IESR usage intended to be used by any service that needs machine-readable collection/service descriptions any service that simply wants to display collection descriptions to end-users portals, brokers, the RDN, VLEs, the JISC Web site, desktop tools like EndNote, etc. pilot service
IESR interfaces: need to consider both real-time and batch-mode access; descriptions made available for searching and harvesting using Z39.50, SRW, OAI-PMH, UDDI; not yet clear how or when UDDI will be supported, but probably by registering at uddi.org in the first instance
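The difference between real-time and batch-mode access can be shown with a small in-memory stand-in for a registry. This is only a sketch: the `Registry` class, its method names and the sample descriptions are invented here, and the real IESR would expose these two access styles through protocols such as Z39.50 or SRW (searching) and OAI-PMH (harvesting) rather than Python calls.

```python
from datetime import date


class Registry:
    """Toy registry holding (modified date, description) pairs."""

    def __init__(self):
        self._entries = []

    def register(self, description, modified):
        self._entries.append((modified, description))

    def search(self, term):
        """Real-time access: descriptions whose title contains the term."""
        term = term.lower()
        return [d for _, d in self._entries if term in d["title"].lower()]

    def harvest(self, since=None):
        """Batch access: all descriptions, or those modified on or after `since`."""
        return [d for m, d in self._entries if since is None or m >= since]


# Invented sample descriptions
registry = Registry()
registry.register({"title": "Example HIV/AIDS gateway", "protocol": "Z39.50"},
                  date(2003, 1, 10))
registry.register({"title": "Example journal articles index", "protocol": "OAI-PMH"},
                  date(2003, 5, 1))

hits = registry.search("hiv")
full_copy = registry.harvest()
recent = registry.harvest(since=date(2003, 3, 1))
```

A portal that needs an answer while the user waits would use the search style; one that keeps its own local copy of the descriptions would harvest everything once and then ask only for changes.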
Related activities DCMI Collection Description working group http://dublincore.org/groups/collections/ NISO Metasearch Initiative (including collection description issues) plus various application or protocol specific initiatives ZING ZeeRex, ISO ILL Directory,  Digital Reference Standard (Collections and Service), EAD community, …
Questions…
 
