The Social OPAC: Past, Present and Future
Carolyn Brown, Rebecca Chovnick and Sara Mangel
Knowledge Organization – Spring 2009
The History of the Catalog
A library catalog is a record of all the bibliographic items found in a library or a group of libraries. An item can be many different things: books, movies, CDs, computer files, maps, etc. Libraries started cataloging their holdings as far back as ancient times. Catalogs started as lists of manuscripts, typically in loose-leaf or book form. The card catalog most people are familiar with emerged in the 19th century alongside the Dewey Decimal and Cutter Classification systems.
“Next Generation” OPACs
“Next Generation” OPACs offer a wide variety of new features geared toward the needs of the next generation user: “immediacy, interactivity, personalization, and mobility” (Rettig 2003). Based on analysis of seven current OPAC interfaces, the most popular features are: Relevancy Ranking, Enhancements (Visual Appeal and Content Enrichment), Faceted Results, Breadcrumb Trails, Persistent URLs (Permalinks), Syndication Feeds (RSS), Suggestions for Search Modifications, Recommendations, Tagging, Annotations, Rating, Reviewing, and Social Networking/Web 2.0 Tools.
Figure 1. Traditional card catalog entry.
Figure 2. Comparison of Current OPACs
Figure 3. “Next Generation” Features in SCRIBLIO
Figure 5. University of Virginia’s Blacklight Interface
OPACs of the Future
New models of Social OPACs are constantly being created. New features are being incorporated into these models in the hope of making searching easier and more effective for the catalog user. Two new models currently being tested are Blacklight and Extensible. It is important for Social OPACs to be intuitive and simple to use. It is also imperative that these OPACs provide instruction on how to use the navigation/interface.
Two radical ideas about the future of library catalogs are that a) there is no library catalog, and b) there is a central catalog, either world-wide or regional.
References
Rettig, J. (2003). Technology, Cluelessness, Anthropology and the Memex: the future of academic reference service. Reference Services Review, 31 (1), 17-21.
What Is an OPAC?
The Online Public Access Catalog is an online database of materials held by a library or group of libraries. Early OPACs tended to closely reflect the card catalogs they were intended to replace. The interface was often confusing, search options were limited and results lacked relevancy ranking.
Next Generation OPAC Options
Libraries can take different approaches to make their OPACs more user-friendly: Enhancements (LibraryThing for Libraries), Wrappers (Scriblio) and Replacements (AquaBrowser).
Pratt Institute School of Information & Library Science
Sarah Ball & Sara Grozanick
Abstract
We examined the use of semantic web technologies in accessing and sharing cultural heritage information. After researching theories and concepts of semantic technology, we looked at specific applications designed to make cultural heritage accessible through the semantic web. We also explored some of the current cultural heritage projects utilizing semantic technologies, evaluating the degree to which we felt they were effective. Through discussion and supplemental interdisciplinary reading, we were able to understand the challenges specific to the idea of cultural heritage and the task of presenting it within the semantic web model.
Ontologies & Cultural Heritage
ISO 21127:2006: CIDOC-CRM
What? CIDOC Conceptual Reference Model, a formal domain ontology for cultural heritage; object-oriented
Objective? To enable integration and exchange of information between different cultural heritage sources
Figure 1. Layered approach to Semantic Web. (G. Antoniou and F. van Harmelen, 2004, p. 18) 653-02 Knowledge Organization
Ontologies & Semantic Web: Layers
URI (Uniform Resource Identifier)
XML (eXtensible Markup Language) allows the creation of tags to annotate the Web pages or portions of text on the page; programs called scripts can make use of these tags, but the script has to know what the tag means.
Spring 2009
Figure 2. The central classes and properties for data interchange in CIDOC-CRM (Ø. Eide, A. Felicetti, C.E. Ore, A. D’Andrea and J. Holmen, 2008)
RDF (Resource Description Framework): encodes meaning in triples similar to a sentence’s subject-predicate-object structure. The triples form “webs of information about related things.”
Ontologies: provide a solution to semantic confusion by formally defining relations among terms.
Logic: enhances ontology language for the writing of declarative knowledge in specific applications.
Proof: actual deductive process and representation of proofs in Web languages and proof validation.
Trust: digital signatures, etc.
(G. Antoniou & F. van Harmelen, p. 15-18).
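The RDF layer described above can be illustrated with a minimal sketch. It models triples as plain Python tuples rather than using an RDF library, and every identifier in it (`ex:painting42`, `ex:createdBy`, and so on) is a hypothetical placeholder, not a term from CIDOC-CRM or any real vocabulary. It only shows how subject-predicate-object statements link into a "web" that can be followed from one resource to the next.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject - predicate - object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, invented for illustration only.
TRIPLES = [
    Triple("ex:painting42", "ex:createdBy", "ex:artistA"),
    Triple("ex:painting42", "ex:heldBy", "ex:museumB"),
    Triple("ex:artistA", "ex:bornIn", "ex:cityC"),
]


def about(subject, store):
    """Return every statement made about one subject."""
    return [t for t in store if t.subject == subject]


def reachable(start, store):
    """Follow triples outward from a resource to find everything linked to it."""
    seen = set()
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for t in store:
            if t.subject == node and t.obj not in seen:
                seen.add(t.obj)
                frontier.append(t.obj)
    return seen


if __name__ == "__main__":
    for t in about("ex:painting42", TRIPLES):
        print(t.subject, t.predicate, t.obj)
    print(sorted(reachable("ex:painting42", TRIPLES)))
```

Because the object of one triple can be the subject of another, the painting is connected to the artist's birthplace even though no single statement says so directly.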
Scope? Heterogeneous “scientific documentation” relating to museum collections, a.k.a. “the curated knowledge of museums” (CIDOC-CRM, 2006)
Convertible to machine-readable formats, e.g. RDF Schema, KIF, DAML-OIL, OWL, STEP, etc.
Able to be encoded in RDF, XML, DAML-OIL, OWL and more
Cultural Heritage on the Semantic Web: CultureSampo
CultureSampo is a project that aims at creating a “collective semantic memory of the cultural heritage” of the nation of Finland.
almost 30 different types of content
18 different metadata schemas
aggregated knowledge base of 52,000 cultural objects
draws information from 22 different Finnish museums; also Wikipedia and Panoramio
also Web 2.0 features like user comments to contribute new knowledge (e.g. identifying a photograph) (E. Hyvönen et al., 2009, p. 1-2)
Figure 3. Heterogeneous information sources integrated through CultureSampo. References (see page 2)
References
Agar, M. (1993). Language shock: Understanding the culture of conversation. New York: Wm. Morrow.
Antoniou, G., Franconi, E., & van Harmelen, F. (2005). Introduction to Semantic Web Ontology Languages. Lecture Notes in Computer Science, (3564), 1-21.
Antoniou, G., & van Harmelen, F. (2008). A semantic Web primer. Cooperative information systems. Cambridge, Mass: MIT Press.
Asian Semantic Web Conferences, Mizoguchi, R., Shi, Z., & Giunchiglia, F. (2006). The semantic web - ASWC 2006: First Asian Semantic Web Conference, Beijing, China, September 3-7, 2006; proceedings. Lecture notes in computer science, 4185. Berlin: Springer-Verlag. http://springerlink.metapress.com/openurl.asp?genre=issue&issn=0302-9743&volume=4185
Berners-Lee, T. The Semantic Web. The Scientific American Magazine, May 2001. Retrieved April 6, 2009 from Scientific American Web site: http://www.sciam.com/article.cfm?id=the-semantic-web
Davies, J., Studer, R., & Warren, P. (2006). Semantic web technologies: trends and research in ontology-based systems.
Eide, Ø., Felicetti, A., Ore, C.E., D’Andrea, A., & Holmen, J. (2008). Encoding Cultural Heritage Information for the Semantic Web: Procedures for Data Integration through CIDOC-CRM Mapping, in press.
Floridi, L. (2004). Open Problems in the Philosophy of Information. Metaphilosophy, 35 (4), 554-582.
Herman, I. (2009). W3C Semantic Web Activity. Retrieved April 27, 2009 from W3C Semantic Web Web site: http://www.w3.org/2001/sw/
Hyvönen, E. et al. (2009). CultureSampo – Finnish Cultural Heritage Collections on the Semantic Web 2.0. The 1st International Symposium on Digital Humanities for Japanese Arts and Cultures (DH-JAC2009). Retrieved April 28, 2009 from PDF: http://www.seco.tkk.fi/publications/2009/hyvonen-et-al-culturesampo-dh-jac-2009.pdf
ICOM-CIDOC. (2004). Definition of the CIDOC Conceptual Reference Model (version 4.2.1 October 2006). Retrieved April 20, 2009 from International Council of Museums Web site: http://cidoc.ics.forth.gr/official_release_cidoc.html/
Lakoff, G. (1984). Classifiers as a reflection of mind: A cognitive model approach to prototype theory. Berkeley cognitive science report, no. 19. Berkeley: Cognitive Science Program, Institute of Cognitive Studies, University of California at Berkeley.
The J. Paul Getty Trust. (2009). Art & Architecture Thesaurus Online. Retrieved April 21, 2009, from http://www.getty.edu/research/conducting_research/vocabularies/aat
UNESCO. (2009). Culture. Retrieved April 21, 2009, from http://portal.unesco.org/culture/en
Web Ontology Language. Retrieved April 20, 2009, from Wikipedia, the free encyclopedia Web site: http://en.wikipedia.org/wiki/Web_Ontology_Language
Wilson, M. (2008). Retrieved April 27, 2009 from W3C: http://www.w3c.rl.ac.uk/pasttalks/slidemaker/Pandora/talk/slide11-0.html
Woods, D. (2006). Providing Access to Maori and Pacific Photographs. The Journal of Pacific History, 41(2), 219-25. Retrieved March 12, 2009, from Humanities Full Text database.
W3C (2004). W3C Recommendation. Retrieved April 25, 2009 from W3C Web site: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#rdfschema
W3C (2005). Retrieved April 25, 2009 from W3C: http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/scenario-gallery.htm
The “unseen” web: over 500 times larger than the surface web
7,500 terabytes of information, compared to the surface web’s 19 terabytes
Topics of Deep Web Data: Agriculture · Arts · Business · Computing/Web · Education · Employment · Engineering · Government · Health · Humanities · Law/Politics · Lifestyles · News/Media · People, Companies · Recreation, Sports · References · Science/Math · Travel · Shopping
A Presentation by Davida Marion, Katie Giari, and Nicole Gitau, 4/30/09, Pratt SILS LIS 653-03
How does one access it?
Specialized search engines using a combination of sorting/ranking algorithms
Special divisions of familiar search engines (Google Scholar)
Build the Open Shelves Classification
Description: I hereby invite you to join the Open Shelves Classification (OSC), a free, "humble," modern, open-source, crowd-sourced replacement for the Dewey Decimal System.
The Vision
Free
Modern
Humble
Collaboratively written
Collaboratively assigned
Why it's necessary
The Dewey Decimal System® was great for its time, but it's outlived that. Libraries today should not be constrained by the mental models of the 1870s, doomed to tinker with an increasingly irrelevant system. Nor should they be forced into a proprietary system, copyrighted, trademarked and licensed by a single entity, expensive to adopt and encumbered by restrictions on publishing detailed schedules or coordinating necessary changes.
This mural is said to depict Dewey and the railroad service he gave to Lake Placid, FL. It's time to throw Dewey under the train.
Emma Carbone (Fiction), Suki Park (Religion), Janice Dekoff (Performing Arts), Jessica Peterson (History)
Pratt SILS LIS 653-03 – April 23, 2009
eXtensible Catalog (XC) A next generation library catalog interface and metadata management tool Yasmin Mathew and Chris Bentley www.extensiblecatalog.org
How is XC funded?
1st phase (2007): $283,000 from Andrew W. Mellon Foundation
2nd phase: $749,000 from Andrew W. Mellon Foundation + $2 million from University of Rochester
Future: members of the XC community (participating institutions) will fund further research and development, as they will benefit and save money by using open-source software.
Release: NCIP and OAI-PMH toolkits were released in March 2009; MST and Drupal are in development, and LMS is in the design phase. Complete launch is planned for July 2009 under the Apache software license, with code available on Google Code.
Acknowledgement: Special thanks to Jennifer Bowen at the University of Rochester for providing us with insight into the XC development process.
Pratt Institute School of Information and Library Science
Potential Impact on Libraries
Transitioning libraries away from the “silo-model”
Helping libraries transition from AACR2 to RDA
Facilitating adoption of open access
Why do libraries need new tools like XC?
Viable open source alternatives improve the quality of commercial products
Platform for experimentation and testing
Lowered tech bar allows almost any library to implement XC
Libraries need these tools if they expect to maintain relevance in a rapidly evolving information ecosystem
The Way Forward
Metadata reuse is the future: Libraries and other institutions need to pool and share their existing metadata in new contexts
XC is forming partnerships with libraries, vendors, and community-sourced tech support
XC is nearing its launch date and development partners are talking it up
The library community awaits with anticipation…
Underlying Technology and Standards
NSDL Metadata Management System
Programming languages: Java, PHP
Communication protocols: OAI-PMH, NCIP, LDAP
Metadata standards: MARC 21, Dublin Core, RDA
Open source applications: jOAI, MARC4j, Lucene, SOLR, Drupal
Figure: XC System Architecture. XC features a modular framework made up of toolkits: applications that are useful both individually and collectively. The yellow box in this diagram shows the starting point for metadata as it is harvested from the ILS via OAI-PMH. The metadata is then sent to the Metadata Services Toolkit for enhancement and transformation, before being delivered to one of the user interfaces. (Diagram source: Bowen, J. (2008). Envisioning an “eXtensible” future [PowerPoint slides].)
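The harvesting step in the architecture above relies on OAI-PMH, which is a set of plain HTTP requests. The sketch below is not XC's own code; it only shows how a harvester might build `ListRecords` request URLs using the standard OAI-PMH arguments (`verb`, `metadataPrefix`, `set`, `from`, `resumptionToken`). The repository address `https://ils.example.edu/oai` is a hypothetical placeholder, and no network request is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build the first ListRecords request for a harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict the harvest to one set
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build a follow-up request; resumptionToken is sent with the verb only."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


if __name__ == "__main__":
    # Hypothetical repository address, used for illustration only.
    base = "https://ils.example.edu/oai"
    first = list_records_url(base, from_date="2009-03-01")
    print(first)
    print(parse_qs(urlparse(first).query))
    print(resume_url(base, "token-from-previous-response"))
```

A repository answers each request with a batch of records and, if more remain, a resumption token; the harvester repeats the second kind of request until no token is returned.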
What is the eXtensible Catalog (XC)?
A set of new open source software toolkits for libraries
Developed by University of Rochester Libraries
Not directly comparable to either a traditional integrated library system (ILS) or a “next-generation” discovery interface
Works alongside a library’s existing ILS to provide new functionality
A discovery layer + metadata infrastructure
Goals of the XC Development Team
Provide libraries with an alternative way to reveal their collections to library users
Integrate library content more effectively with the open Web
Unify access to information resource silos in a single discovery environment
Implement a next generation user interface featuring faceted searching, user tagging and FRBR-informed search results grouping
Ruby Gaba, Jaclyn Costa, Rachel Correll
Dr. Pattuelli, LIS-653-02, Pratt Institute, 7 May 2009
References
Faceted classification. (2009, April 23). In Wikipedia, The Free Encyclopedia. Retrieved 22:20, May 5, 2009, from http://en.wikipedia.org/w/index.php?title=Faceted_classification&oldid=285686129
Five laws of library science. (2009, April 28). In Wikipedia, The Free Encyclopedia. Retrieved 22:17, May 5, 2009, from http://en.wikipedia.org/w/index.php?title=Five_laws_of_library_science&oldid=286547974
Mi, J., & Weng, C. (2007, April 11). Revitalizing the Library OPAC: Challenges faced by academic librarians. Paper presented at The Academic Librarian: Dinosaur or Phoenix? Conference. Retrieved May 4, 2009, from http://www.lib.cuhk.edu.hk/conference/aldp2007/programme/aldp2007_full_paper/CathyWeng.pdf
Taylor, A. G. (1992). Introduction to cataloging and classification. 8th ed. Englewood, Colorado: Libraries Unlimited.
"The ultimate goal is that users will be comfortable and confident using library OPACs for their information needs wherever a computer is available and without special training." -Jia Mi and Cathy Weng
Faceted Classification
A system that allows the assignment of multiple classifications to an object, enabling the classifications to be ordered in multiple ways, rather than in a single, pre-determined, taxonomic order. Faceted classification is used in faceted search systems that enable a user to navigate information along multiple paths corresponding to different orderings of the facets. The Colon classification developed by Ranganathan is an example of faceted classification applied to the physical world, specifically for the purpose of organizing library materials.
Ranganathan and his Five Laws
Ranganathan formulated objectives and principles for the organization of, access to, and use of library materials, which he called his “Five Laws of Library Science.” These laws are:
First Law: Books are for use – books and other library materials are important not as objects but for the knowledge and information they contain.
Second Law: Every reader his or her book – this law teaches us two lessons. The first is that we do not acquire library materials in the abstract. The second is that even the most apt selection choices can be vitiated if they are not backed up by an efficient and user-friendly bibliographic control system.
Third Law: Every book its reader – Ranganathan is telling us that when a library user comes to a library or gains access to library services, certain materials (textual, graphic, and/or numeric) will meet his or her needs.
Fourth Law: Save the time of the reader – when properly understood and employed, this law is a management tool of great utility.
Fifth Law: The library is a growing organism – libraries do grow and change and will always do so.
Consists of “clearly defined, mutually exclusive, and collectively exhaustive aspects, properties or characteristics of a class or specific subject” (Taylor 1992)
Is divided into values that represent various possible situations for the facet (e.g. “Books” and “DVDs” as values for the facet Format)
Faceted browsing allows the searcher to enter a simple search term without much forethought
The searcher can then focus on a certain facet of the topic and even further refine by other facets such as material format, library location, and language
This is exploratory searching, where the first search is not necessarily the most important part of the search
The results list continually gets smaller as the searcher adds more facets
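The narrowing behaviour described above can be sketched in a few lines. This is an illustrative toy, not how Endeca, Blacklight, or WorldCat Local are implemented: the records, facet names, and values are all invented for the example. It shows the two operations a faceted interface needs, namely counting the values available under a facet and filtering the result list as facets are selected.

```python
from collections import Counter

# Invented sample records; each key other than "title" acts as a facet.
RECORDS = [
    {"title": "Record A", "format": "Book", "language": "English", "location": "Main"},
    {"title": "Record B", "format": "DVD", "language": "English", "location": "Main"},
    {"title": "Record C", "format": "Book", "language": "Spanish", "location": "Branch"},
    {"title": "Record D", "format": "Book", "language": "English", "location": "Branch"},
]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records)


def refine(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(f) == v for f, v in selected.items())]


if __name__ == "__main__":
    print(facet_counts(RECORDS, "format"))
    books = refine(RECORDS, format="Book")
    print(len(books), "records after choosing Format = Book")
    english_books = refine(RECORDS, format="Book", language="English")
    print(len(english_books), "records after adding Language = English")
```

Each added facet can only keep or remove records that survived the previous step, which is why the results list shrinks as the searcher refines.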
OPACs with Faceted Browsing
Endeca at North Carolina State University Libraries
Problems with Traditional OPACs
Subject searching was difficult
Users wanted other items in a collection added to the OPAC
Search results not organized in an effective way
Does not allow users to correct mistakes easily
No way to refine searches once begun
Only citation available
Users were not able to connect with the information that they were seeking in a time-efficient and simple way.
Example of a facet with the specific values below it
Facet-nation with OPACs
Visual representation of exploratory searching with facets used to narrow results
Facet Values
Retrieved May 2, 2009, from http://www2.lib.ncsu.edu/catalog/?Nty=1&N=0&Ntt=zombies&Ntk=Keyword
Blacklight at University of Virginia. Retrieved May 2, 2009, from http://virgobeta.lib.virginia.edu/catalog?q=zombies&focus=&per_page=10
WorldCat Local at University of Washington. Retrieved May 2, 2009, from http://uwashington.worldcat.org/search?qt=worldcat_org_all&q=zombies
Danielle Panek and Lindsay Friedman
LIS 653, Section 02, Spring Semester, Pratt Institute
“A good catalog will allow access to the image through multiple access points regardless of the principle decisions. IFLA came out with Functional Requirements for Bibliographic Records (FRBR) to elucidate whether something is a work, a manifestation, an expression, or an item. However, there is still ambiguity in many cases. CCO is spoken of as a data content standard and its closest parallel among library cataloging tools is AACR. Yet it goes beyond the basic description and access rules of AACR by providing a section on authorities and an XML schema and data structures while AACR does not... No matter how old or new your rules are, it is really a matter of applying them. Even a literal application of rules by two catalogers will not yield the same results. The fundamental thing to remember, however, is that you have to use some sort of rules and guidelines, and use them consistently in conjunction with controlled vocabularies, if you want to effectively share your records with others.” - Sherman Clarke, New York University Libraries
The Visual Resources Association is a multi-disciplinary organization dedicated to furthering research and education in the field of image management within the educational, cultural heritage, and commercial environments. The Association offers a forum for issues of vital concern to the field, including: preservation of and access to digital and analog images of visual culture; cataloging and classification standards and practices; integration of technology-based instruction and research; intellectual property policy; and other topics of interest to the field.
Ten Key Principles of CCO
Establish the logical focus of each Work Record.
Include all the required CCO elements.
Follow the CCO rules.
Use controlled vocabularies.
Create local authorities that are populated with terminology from standard published controlled vocabularies.
Use established metadata standards.
Understand that cataloging, classification, indexing, and display are different but related functions.
Be consistent in establishing relationships between works and images, between a group or collection and works, among works, and among images.
Be consistent regarding capitalization, punctuation, and syntax.
Use English-language data values whenever possible.
VRA Core + CCO + XML = Sharable Metadata
Imaging Cataloging: Cataloging Cultural Objects and VRA Core 4.0
Personal Information Management Marvin Rusinek, Joseph Ketner, Sabrina Hirsch Pratt Institute, LIS653 Spring 2009
We regularly locate, encounter or acquire information that we know we will want to use again.
We need to organize and manage the information that we need to use for work, fun, and everyday tasks.
The Need for PIM
Information gets stuck in inaccessible silos
Finding information created earlier is difficult
Secure sharing of personal information with others is difficult
Inability to re-use information
E-mailing documents to self creates inbox clutter
Keeping Found Things Found
How people keep information
How people re-find their information
How people organize their information
Dealing with Information Fragmentation
Easier to organize, rearrange, incorporate, order, and keep everything together. You can drag and drop web-pages and documents; create emails within the plan.
Facebook’s Sentiment Engine
Goal: “real-time awareness” of user opinions and feedback, via:
active engagement in polls
passive, automated collection of data
Gauging public reactions “just by listening to what people are talking about in status updates and comments.”
Track reactions to current events and changes in reactions over time
Discern the major preoccupations and interests of Facebook users
Marketing and product feedback
Study public awareness of specific issues
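Tracking reactions to a topic over time, as listed above, can be illustrated with a deliberately naive sketch. This is not Facebook's method, which the source does not describe. The word lists, the sample status updates, and the scoring rule (positive words minus negative words) are all assumptions made up for the example.

```python
from collections import defaultdict

# Tiny, invented word lists; a real system would need far more than this.
POSITIVE = {"love", "great", "like"}
NEGATIVE = {"hate", "awful", "dislike"}


def score(text):
    """Naive score: number of positive words minus number of negative words."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_sentiment(updates, topic):
    """Sum the scores of updates mentioning a topic, grouped by day."""
    totals = defaultdict(int)
    for day, text in updates:
        if topic in text.lower():
            totals[day] += score(text)
    return dict(totals)


if __name__ == "__main__":
    # Invented (day, status text) pairs, for illustration only.
    updates = [
        ("2009-04-01", "I love the new layout"),
        ("2009-04-01", "the new layout is awful"),
        ("2009-04-02", "great layout"),
        ("2009-04-02", "lunch was great"),
    ]
    print(daily_sentiment(updates, "layout"))
```

Grouping by day is what allows changes in reaction to be compared over time; the last update is ignored because it never mentions the topic.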
THE/OPEN/SHELVES/CLASSIFICATION PROJECT: An Adventure in No-Holds-Barred Classification
LT is a site for users who want to catalog their books online. (http://www.librarything.com/about)
LT is a social site.
You can search, sort, and tag books or use LCC or DDC to organize your site library.
The site is also self-described as “a full-powered cataloging application, searching the Library of Congress, all five national Amazon sites, and more than 80 world libraries” (http://www.librarything.com/about)
About religion: “How about ‘something that people would think would be in the religion section of the library.’” (Tim Spalding, LT Creator) (http://www.librarything.com/talktopic.php?topic=58485)
Must have dedicated, officially recognized moderators.
Moderators should be able to communicate with leadership.
“Sounds like what OCLC staff are doing with DDC, which is in fact an international collaborative effort. (I guess the major failing here is that this is the collaboration of experts, who actually know what they're doing, rather than well-intentioned "users" who don't?)”