Introduction to indexing


Published in: Education, Technology, Business

  1. 1. Prepared by: Daryl L. Superio, Central Philippine University, Iloilo City, Philippines
  2. 2.  • Sumer, 3000 B.C.: where the first systematic organization of written records was found
         • China and India, 2000 B.C.: when record keeping became part of society
           ◦ an orderly society parallels an orderly record of what has occurred
           ◦ laws were passed requiring that all business transactions be recorded and authorized
         • 900 A.D.: when an encyclopedia was arranged in alphabetical order
  3. 3.  • early indexes were concordance indexes, were limited to personal names, or were indexes to the occurrence of words in a text
         • marginal summaries were in use as early as the 9th century
         • indexes took a major step forward with the development of the codex
         • blank pages bound at the back of the book were used for written references
           ◦ known as do-it-yourself indexes
           ◦ indexes were usually at the front of the book, lifted verbatim from the text, simple but not easy to use
  4. 4.  • 1850s: W.F. Poole published an index that cut across many journals
           ◦ the beginning of a single publication indexing numerous issues of many journals
         • 1900: H.W. Wilson first published the Readers' Guide to Periodical Literature
           ◦ notable for the emphasis it placed on subject access and cross-referencing
           ◦ each article was indexed under its author and its specific subject
  5. 5.  • 19th century: when Paul Otlet and Henri La Fontaine founded the International Institute of Bibliography
           ◦ one of its purposes was to improve indexing approaches to scholarly literature
           ◦ title-word indexing was proposed, which led to keyword and free indexing
         • book indexing continued to improve; indexes began to have subdivisions of terms, and cross-references slowly began to appear
  6. 6.  • 1950s: when computers were used in indexing and abstracting
           ◦ Hans Peter Luhn of IBM introduced a mechanized form of derived title-word indexing
         • 1960s: brought third-generation computers; indexes and abstracts began to be published with computers using batch processing methods
         • 1990s: when keyword searching of computer-stored indexes had been perfected
         • 20th century: greater progress in the development of indexing methods, from indexes to individual works, through indexes to several volumes, to cooperative and massive indexes, and currently web indexes
  7. 7.  • Index
           ◦ a systematic arrangement of entries designed to enable users to locate information in a document
           ◦ an alphabetically arranged list of headings consisting of the personal names, places, and subjects treated in a written work, with page numbers to refer the reader to the point in the text at which information pertaining to the heading is found
             ▪ in single-volume works of reference and nonfiction, any indexes appear at the end of the back matter
             ▪ in a multivolume work, they are found at the end of the last volume
             ▪ in very large multivolume reference works, the last volume may be devoted entirely to indexes
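The second definition above (headings arranged alphabetically, each with page numbers as locators) can be pictured as a small data structure. The following is only a minimal sketch: the function names and the sample headings and page numbers are hypothetical, and real back-of-the-book indexes involve subheadings, cross-references, and editorial judgment that this ignores.

```python
from collections import defaultdict

def build_index(page_headings):
    """Map each heading to the sorted list of pages on which it is treated."""
    index = defaultdict(set)
    for page, headings in page_headings.items():
        for heading in headings:
            index[heading].add(page)
    # Arrange headings alphabetically, ignoring case (a simplifying assumption).
    return {h: sorted(index[h]) for h in sorted(index, key=str.casefold)}

def format_index(index):
    """Render each entry as 'heading, page, page, ...'."""
    return [f"{h}, {', '.join(map(str, pages))}" for h, pages in index.items()]

# Hypothetical sample: page number -> headings treated on that page.
sample = {
    12: ["Poole, W.F.", "periodical indexes"],
    13: ["Wilson, H.W.", "periodical indexes"],
    40: ["Luhn, Hans Peter", "title-word indexing"],
}
index = build_index(sample)
lines = format_index(index)
```

Here `lines` holds one printable entry per heading, with the shared heading "periodical indexes" collecting both of its page locators.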
  8. 8.  • index also refers to an open-end finding guide to:
             ▪ the literature of an academic field or discipline (ex. Philosopher's Index)
             ▪ works of a specific literary form (ex. Biography Index)
             ▪ works published in a specific format (ex. Readers' Guide to Periodical Literature)
             ▪ the analyzed contents of a serial publication (ex. New York Times Index)
  9. 9.  • Indexing
           ◦ the operation of creating an index for information retrieval
           ◦ the process of:
             ▪ compiling one or more indexes for a single publication, such as a monograph or multivolume reference work
             ▪ adding entries for new documents to an open-end index covering a particular publication format (example: newspapers), works of a specific literary form (biography, book reviews, etc.), or the literature of an academic field, discipline, or group of disciplines
  10. 10.  • Indexer
             ◦ a person who does indexing
           • Indexable matter
             ◦ the portions of documents that are actually analyzed and indexed
           • Indexing language
             ◦ in a broad sense, any vocabulary, including uncontrolled vocabulary, used for indexing and the rules of syntax for its application
             ◦ in a narrower sense, a controlled vocabulary or classification system and the rules of syntax for its application
  11. 11.  • minimize the time and effort spent finding information and maximize the searching success of users
           • identify potentially relevant information in the document or collection being indexed
           • analyze concepts treated in a document to produce appropriate index headings based on the indexing language assigned
           • indicate relationships among terms
           • group together related topics scattered by the arrangement used in a document or collection
           • direct users seeking information under terms not chosen as index headings to the headings that have been chosen, by means of See references
           • suggest related topics by means of See also references
           • serve as tools for current awareness services
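The See and See also functions listed above can be illustrated with a small lookup. This is a hedged sketch only: the headings, locators, and the `lookup` function are hypothetical, and it assumes each unchosen term points to exactly one chosen heading.

```python
# Hypothetical index data (not from any real index).
entries = {"automobiles": [14, 52], "serials": [7], "trucks": [60], "newspapers": [9, 33]}
see = {"cars": "automobiles", "periodicals": "serials"}        # unchosen term -> chosen heading
see_also = {"automobiles": ["trucks"], "serials": ["newspapers"]}  # heading -> related headings

def lookup(term):
    """Resolve a user's term to a chosen heading, following a See reference if one exists."""
    heading = see.get(term, term)
    if heading not in entries:
        return None  # neither a heading nor a term with a See reference
    return {
        "heading": heading,
        "locators": entries[heading],
        "see_also": see_also.get(heading, []),
    }

result = lookup("cars")
```

A user who searches under "cars" is directed to "automobiles" (the See reference) and is also offered "trucks" as a related topic (the See also reference).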
  12. 12.  • Anderson, James D. 1997. NISO-TR02, Guidelines for indexes and related information retrieval devices.
             ◦ provides guidelines for the content, organization, and presentation of indexes used for the retrieval of documents and parts of documents
             ◦ deals with the principles of indexing, regardless of the type of material indexed, the indexing method used (intellectual analysis, machine algorithm, or both), the medium of the index, or the method of presentation for searching
             ◦ emphasizes three processes essential for all indexes: comprehensive design, vocabulary management, and the provision of syntax
  13. 13.  • Wellisch, Hans. 1999. NISO-TR03, Guidelines for alphabetical arrangement of letters and sorting of numerals and other symbols.
             ◦ provides rules for the alphabetical arrangement of headings in lists of all kinds, such as bibliographies, indexes, dictionaries, directories, and inventories
             ◦ also covers the sorting of Arabic and Roman numerals and other symbols
             ◦ consists of seven rules that cover problems which may arise in the alphanumeric arrangement of headings
             ◦ is based on the traditional order of letters in the English alphabet and that of numerals in ascending arithmetical order
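The last point above (letters in alphabetical order, numerals in ascending arithmetical order) can be sketched as a sort key. This is not an implementation of the seven rules in NISO-TR03; it is an illustration under labeled assumptions: numerals file before letters, case and punctuation are ignored, and headings are compared word by word. The function name `filing_key` is hypothetical.

```python
import re

def filing_key(heading):
    """Split a heading into numeric and alphabetic chunks for sorting."""
    key = []
    for chunk in re.findall(r"\d+|[^\W\d_]+", heading):
        if chunk.isdigit():
            # Numerals sort by arithmetic value, ahead of letters (assumption).
            key.append((0, int(chunk), ""))
        else:
            # Letters sort alphabetically, ignoring case (assumption).
            key.append((1, 0, chunk.casefold()))
    return key

headings = ["Section 10", "section 9", "1984", "Apple", "3 Musketeers"]
arranged = sorted(headings, key=filing_key)
# arranged == ["3 Musketeers", "1984", "Apple", "section 9", "Section 10"]
```

A plain character-by-character sort would place "1984" before "3 Musketeers" and "Section 10" before "section 9"; comparing numerals by value avoids both.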
  14. 14.  • ISO 999:1996, Information and documentation – Guidelines for the content, organization and presentation of indexes
             ◦ gives guidelines for the content, arrangement and presentation of indexes to books, periodicals, reports, patent documents and other written documents, and also to non-print materials such as electronic documents, films, and sound and video recordings
             ◦ concerned with basic indexing principles and practice rather than with the detailed procedures of indexing, which vary according to the type of matter indexed and the users for whom the index is intended
             ◦ covers the choice, form and arrangement of headings and subheadings used in index entries once the subjects to be indexed have been determined
  15. 15.  • ISO 25964-1:2011, Information and documentation – Thesauri and interoperability with other vocabularies – Part 1: Thesauri for information retrieval
             ◦ gives recommendations for the development and maintenance of thesauri intended for information retrieval applications
             ◦ applicable to vocabularies used for retrieving information about all types of information resources, irrespective of the media used (text, sound, still or moving image, physical object or multimedia), including knowledge bases and portals, bibliographic databases, text, museum or multimedia collections, and the items within them
             ◦ provides a data model and recommended format for the import and export of thesaurus data
             ◦ applicable to monolingual and multilingual thesauri
             ◦ not applicable to the preparation of back-of-the-book indexes, although many of its recommendations could be useful for that purpose
             ◦ not applicable to the databases or software used directly in search or indexing applications, but does anticipate the needs of such applications among its recommendations for thesaurus management
  16. 16.  • ASI/H.W. Wilson Award
             ◦ established in 1978 to honor excellence in indexing of an English-language monograph or other non-serial work published in the United States during the previous calendar year
             ◦ its purpose is two-fold:
               ▪ for indexers, to provide and publicize models of excellence in indexing
               ▪ for publishers, to encourage greater recognition of the importance of quality in book indexing
  17. 17.  • The Theodore C. Hines Award, or Hines Award
             ◦ established in 1993 to honor members who have provided exceptional service to the American Society for Indexing (ASI)
             ◦ ASI's highest honor to its own members, named for Ted Hines, who played a large part in the establishment of the Society
  18. 18.  • Web Indexing Awards
             ◦ to encourage high-quality web site indexes and to promote the web indexing work of professional indexers, the Web & Electronic Indexing Special Interest Group of the American Society for Indexing gives a deserving indexer the annual Web & Electronic Indexing SIG Award for excellence in web site indexing
  19. 19.  • Indexes by type of object referred to
             a. authors: all types of document creators, such as writers, composers, illustrators, translators, editors, choreographers, artists, sculptors, painters, inventors
             b. subjects (topics or features): topics treated in documents and/or features of documentary units (for example, genre, format, methodological approach). Separate indexes are often devoted to special types of topics, such as persons, places, or corporate bodies; features, such as genres (for example, poetry, drama); or notations, such as International Standard Book Numbers (ISBN).
  20. 20.  • Indexes by type of term used for headings
             a. names: proper nouns, such as names of persons, places, corporate bodies
             b. numbers or notations: numerical or coded designations, such as classification notation, patent number, ISBN, date
             c. words and phrases: common words and phrases (as opposed to names or proper nouns)
  21. 21.  • Indexes by type or extent of indexable matter on which an index is based
             a. full text of document
             b. abstracts
             c. titles only
             d. first lines only (for example, first lines of poems)
             e. citations (reference citations to other documents)
  22. 22.  • Indexes by arrangement of entries
             a. alphabetical or alphanumeric
             b. classified: headings arranged on the basis of relations among concepts represented by headings, for example, hierarchy, inclusion, chronology, or other association. Classified indexes are often based on existing classification schemes, such as the Dewey Decimal Classification.
             c. alphabetico-classed: broad headings arranged alphabetically. Narrower headings are grouped under broad headings and arranged alphanumerically or relationally on the basis of hierarchy, inclusion, chronology, or other association.
  23. 23.  • Indexes by method of document analysis
             a. human intellectual analysis and identification of topics and concepts expressed and/or features manifested
             b. computer algorithms designed to identify useful terms, phrases, or features
             c. combination of computer-based and human analysis
  24. 24.  • Indexes by method of term selection
             a. assignment of terms to represent topics and features (whether or not the term is in the documentary unit being indexed)
             b. extraction of terms from the documentary unit
             c. a combination of assignment and extraction methods
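The difference between assignment and extraction can be sketched in a few lines. The stoplist, the tiny controlled vocabulary, and the function names below are all hypothetical and chosen only for illustration; a human indexer or a production system would use far richer rules.

```python
import re

# Hypothetical stoplist of words too common to serve as index terms.
STOPWORDS = {"a", "an", "and", "of", "the", "to", "in", "for"}

# Hypothetical controlled vocabulary: word found in text -> preferred heading.
VOCABULARY = {
    "cars": "Automobiles",
    "automobiles": "Automobiles",
    "magazines": "Periodicals",
    "journals": "Periodicals",
}

def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def extract_terms(text):
    """Extraction: terms are taken from the words of the documentary unit itself."""
    terms = []
    for word in tokens(text):
        if word not in STOPWORDS and word not in terms:
            terms.append(word)
    return terms

def assign_terms(text):
    """Assignment: terms come from the vocabulary and need not occur in the text."""
    return sorted({VOCABULARY[w] for w in tokens(text) if w in VOCABULARY})

title = "A Guide to Magazines and Journals for Cars"
extracted = extract_terms(title)  # ["guide", "magazines", "journals", "cars"]
assigned = assign_terms(title)    # ["Automobiles", "Periodicals"]
```

Note that neither "Automobiles" nor "Periodicals" appears in the title: assignment gathers synonyms under one heading, while extraction keeps the author's own words.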
  25. 25.  • Indexes by method of term coordination
             a. pre-coordinate combination: such as subject heading indexes, string indexes, chain indexes, keyword indexes (including KWIC, KWOC, and KWAC indexes), rotated indexes, and permuted indexes
             b. post-coordinate combination: includes the use of Boolean operators, proximity measures, and the combination of weighted terms
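The contrast between the two kinds of coordination can be sketched as follows. The first function builds a simplified rotated-title listing in the spirit of a KWIC index, where terms are combined when the index is made; the second combines single terms with a Boolean AND only at search time. The stoplist, titles, document numbers, and function names are hypothetical, and the layout is far simpler than a printed KWIC index.

```python
STOPWORDS = {"a", "an", "and", "of", "the", "to", "in", "for"}  # hypothetical stoplist

def kwic(titles):
    """Pre-coordinate: one entry per significant title word, with the title rotated
    so that the keyword comes first; '/' marks the end of the original title."""
    rows = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOPWORDS:
                continue
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            rows.append((word.lower(), rotated))
    return sorted(rows)

def build_postings(docs):
    """Map each term to the set of documents indexed under it."""
    postings = {}
    for doc_id, terms in docs.items():
        for term in terms:
            postings.setdefault(term, set()).add(doc_id)
    return postings

def search_and(postings, *terms):
    """Post-coordinate: intersect the document sets of the terms at search time."""
    sets = [postings.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

rows = kwic(["Indexing of Journals", "Journals in Libraries"])
postings = build_postings({
    1: ["indexing", "journals"],
    2: ["journals", "libraries"],
    3: ["indexing", "libraries"],
})
hits = search_and(postings, "indexing", "journals")  # {1}
```

In the rotated listing the searcher scans ready-made entries under "journals"; in the post-coordinate case the searcher decides which terms to combine, and the index only stores single terms.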
  26. 26.  • Indexes by type, periodicity, format, genre, or medium of document(s) being indexed
             ◦ examples are: books, monographs, periodicals, serials, poetry, fiction, short stories, films, videos, illustrations, pictures, paintings, artifacts, software, computer-readable texts, maps, and sound recordings
           • Indexes by medium of index
             a. printed or written
             b. microform
             c. electronic media, including online, CD-ROM
             d. braille
  27. 27.  • Indexes by periodicity of the index
             a. one-time, closed-end indexes
             b. continuing, open-end indexes
           • Indexes by authorship
             a. authored: a separately authored document distinct from the document(s) being indexed. It is created independently by one or more persons through intellectual analysis of text, as distinguished from indexes created solely through algorithmic analysis of text carried out electronically.
             b. automatically generated