Tales From the Field: Implementing Information Technology

Presented by Marjorie Hlava, president of Access Innovations, Inc., at the American Society for Information Science and Technology's 23rd Annual SIG/CR Classification Research Workshop on October 26, 2012.


Transcript

  • 1. Tales from the Field: Implementing Information Theory SIG CR - 2012 Marjorie Hlava, President Access Innovations, Inc. www.accessinn.com
  • 2. Implementing Information Theory The case of the missing abstracts Russian information US PTO Getty adventures Vatican bibles Past basics Thoughts on directions
  • 3. The Bleeding Edge Figure out the client needs Figure out the specifications Get approval on the specifications Figure out how to deliver the data following the specs Quality control the data delivery ... But then life happens
  • 4. The Case of Missing Abstracts Tests showed that just searching the indexing did not provide the full answers users wanted. Searching the titles and abstracts as well would improve search Enough space could be found on servers if the data was moved to in-house from Dialog and Orbit. New platform going into production New format – Messenger Specifications written, test file approved
  • 5. Specifications Need 99.998% accuracy for user acceptance Left tagged ASCII Office in Mexico City – Access de Mexico Triple key - double proof Two sets of volumes 792,000 abstract tapes destroyed 1970 – 1982 data
  • 6. Access de Mexico 7:17 AM Shift change September 19, 1985 8.7 earthquake
  • 7. CAS to Philippines Limo from the airport with the remaining volumes Typhoon Dot October 12, 1985 Clark Air Force Base evacuated Power out for weeks
  • 8. Jamaica Hurricane Kate November 1985 4 inches of water in the computer room No power on the island
  • 9. Beijing China November 1985 NOTHING HAPPENED Finished On time Under budget At promised accuracy level Client said “When I read your contract I thought you had an unusual level of detail on the Acts of God clauses... But I didn’t expect you to use every one of them!”
  • 10. Russian Information
  • 11. Implementing Information Theory VINITI Maxwell Information map PDP-8s Microfilm machines – no batteries Glasnost – open but no trust
  • 12. Payments in cash in our shoes
  • 13. Puzzles, Keys, and Digitization Photocomposition keys Science typographers Puzzles – SGML Encyclopaedia Britannica Marquis Who’s Who Designing the Chicago Research and trading “desks”
  • 14. US PTO Conversions Scan at 300 dpi OCR to 97% 5,400,000 patents Create the machines Testy QC algorithms Display image Search dirty OCR Spell right once in 30 pages = findable
  • 15. Perugia Bible 12” VideoDisc
  • 16. British Library Map Collection 225,000 maps pre-1850 From printed catalog to digital catalog
  • 17. Getty AAT to AATA
  • 18. Success - Failure - Future Successes • Chemical Abstracts • USPTO • Getty AATA • British Map Collection Failures • Access Russia • Ipsoa Video Disk • MAI Mail
  • 19. All projects use classification To organize the job To organize the information To allow the finding of the items once digital Apply term tags • thesaurus and controlled Apply notation • Not necessarily classification • Just reflects the content The classification is NEVER done • Needs to reflect the ever-changing data
  • 20. Theoretical Underpinnings Outlines of Knowledge • Thomas Aquinas • John Knox (Bacon) • Mortimer Taube - Encyclopaedia Britannica Organization of Knowledge • Cutter – 1896 • COSATI – 1964 • Alvin Weinberg • Cranfield Institute papers • Cleverdon, Aitchison, Vickery
  • 21. Theory of knowledge... began early Plato et al. - BC Knowledge of reality is philosophy Realism St. Augustine 354 - 430 AD St. Thomas Aquinas 1225 - 1274 AD Characteristics common in particulars Not the same object without them © 2010. Access Innovations, Inc. All Rights Reserved.
  • 22. Theory of knowledge William of Occam (or Ockham), c. 1288 - c. 1348 Nominalism - Universals are represented by words Conceptualism - Universals are general concepts, mind dependent, formed by extraction from particular experiences
  • 23. Theory of knowledge The Knower (Subject) The Known (Object) Knowing (a subjective process) An act, a process, or a concept Facts or perception? Yes or no answers
  • 24. The basis of knowledge René Descartes 1596 - 1650 Separate what is known - philosophy From new knowledge - science Conditions of reason, suspension of belief Je pense donc je suis Cogito, ergo sum (from Socrates) I think, therefore I am Cartesian
  • 25. Conditions for knowledge John Locke - 1632 - 1704 “A sailor needs to know the length of a line he has available before he goes out to sound the ocean with it.” - J. Locke Acquire knowledge of reality Establish the conditions needed to acquire knowledge Establish possible extent and limitations of knowledge
  • 26. John Locke 1632 - 1704 Classification of kinds of knowledge Some Thoughts Concerning Education
  • 27. Outlines of knowledge Carl Linnaeus 1707 - 1778 Placed plants in categories Systematized the three kingdoms of nature Replaced “natural systems” classification Immanuel Kant 1724 - 1804 A posteriori and a priori judgments A posteriori and a priori concepts Outline of knowledge The nature of this distinction has been disputed by various philosophers; however, the terms may be roughly defined as follows: A priori knowledge is knowledge that is known independently of experience (that is, it is non-empirical, or arrived at beforehand, usually by reason). A posteriori knowledge is knowledge that is known by experience (that is, it is empirical, or arrived at afterward).
  • 28. Epistemology James Frederick Ferrier 1808 - 1864 Analyzing the nature of knowledge How it relates to connected notions truth, belief, justification The means of production of knowledge Skepticism about different knowledge claims http://en.wikipedia.org/wiki/Epistemology
  • 29. Personification of knowledge (Greek Επιστημη, Episteme) in Celsus Library in Ephesus, Turkey. Epistemology (from Greek ἐπιστήμη, epistēmē, "knowledge, science" + λόγος, "logos") or theory of knowledge is the branch of philosophy concerned with the nature and scope (limitations) of knowledge. It addresses the questions: What is knowledge? How is knowledge acquired? How do we know what we know?
  • 30. Philosophy of knowledge divides 20th century thought Memory Perception and memory Religion Linguistic analysis Classification of knowledge Vocabulary control Linguistic analysis
  • 31. Rise of Classification Charles Ammi Cutter 1837 - 1903 Cutter Classification System Melville Dewey 1851 - 1931 Dewey Decimal Classification Vladimir Lenin 1870 - 1924 Rubricon - Russia Rubricator S. R. Ranganathan - India, 1892 - 1972 Faceted Classification System Colonicity
  • 32. Charles Ammi Cutter Harvard College, index catalog, using cards instead of published volumes, an author index and a “classed catalog” or subject index. Expansive Classification System (Cutter) seven levels of classification, each with increasing specificity use lower levels and still be specific
  • 33. Thesauri Philo of Byblos Herennius Philon; c. 64 - 141 AD Sanskrit, the Amarakosha 4th century verse Roget's Thesaurus, 1805 by Peter Mark Roget, and published in 1852 COSATI - 1964 TEST - 1967
  • 34. Points of knowledge Single point of knowledge Eve and the apple First organism All science Examples Linnaean system Rubricator Locke system Dewey
  • 35. Points of knowledge Multiple points of origin Several fields come together Top terms Should they be captured separately or together? Facets or different views? Anarchy in the universe Examples Physical biochemistry NICEM Engineering Cutter, COSATI, Ranganathan
  • 36. Information access is changing Teletype Fax Online CD-ROM Downloading Internet
  • 37. The players are changing Standalone publishers Aggregators Serials and book vendors Hosting services Cloud Disaggregation Everyone is an author Loss of quality, accuracy, review
  • 38. The formats are changing  Handwritten  Gutenberg  Linotype  Web Presses • Photocomposition Digital layout Desktop publishing Web publishing
  • 39. Search is (finally) changing STAIRS Online search ELHILL Boolean search Orbit Cached search String search Bayesian Verity Co-occurrence Neural nets Fast Machine learning Lucene Faceted (fielded) Muse Global Rules systems Perfect Search
  • 40. Tagging is still debated Permuted Indexes • Chem abs • Bio abs • Portals Permaterm indexes • IFI Predicasts • Classification systems LC • Thesauri Inverted files Triples
  • 41. Horizons are more complicated Field formatted data Relational and SQL databases Object oriented systems Semantic web Linked data
  • 42. Formats just keep being added Photocomposition markup SGML XML JSON Calls Storage keeps changing Big iron Server farms Cloud farms
  • 43. Telecommunications tries to keep up Party lines Direct connect lines Trunk lines Fiber optic Cell towers Wireless
  • 44. Media Punch cards 9 track tapes Mountain tapes Removable drives Diskettes • 8” • 5.25” • 3.5” • Flash drives • Chips
  • 45. Indexes Pre-coordinate • Back of the book • Subject headings Post-coordinate Bayesian Co-occurrence Neural nets Machine learning Rules systems
  • 46. Now Changing the way we learn Changing the way we find things Easier to manipulate what we know • http://www.youtube.com/watch?v=B8ofWFx5 25s Comprehensive information / invasive • http://www.youtube.com/watch?v=RNJl9EEc soE People now know what search is.
  • 47. Future Information any place, any time A great big mess - Unless we corral it. • Tag it, • Clean it, • Weed it • Curate it Everyone is creating content
  • 48. The information explosion has just begun
  • 49. We should all be part of it Questions? Marjorie M.K. Hlava President Access Innovations, Inc. Mhlava@accessinn.com 505-998-0800
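
Note on slide 14 (searching dirty OCR): the slide's remark that a term spelled right once in 30 pages is findable can be illustrated with a little arithmetic. The sketch below is not from the talk. It assumes that the 97% figure is a per-character accuracy and that character errors are independent, which the slides do not state. The function names are hypothetical.

```python
def word_correct_probability(char_accuracy: float, word_length: int) -> float:
    """Chance that every character of one word occurrence is recognized correctly.

    Assumption (not from the slides): errors are independent per character.
    """
    return char_accuracy ** word_length


def findable_probability(char_accuracy: float, word_length: int, occurrences: int) -> float:
    """Chance that at least one of several occurrences of a word is OCR'd correctly.

    A document is "findable" by an exact-match search on the word if any
    single occurrence survives OCR intact.
    """
    p_word = word_correct_probability(char_accuracy, word_length)
    return 1.0 - (1.0 - p_word) ** occurrences


if __name__ == "__main__":
    # A 10-letter term at an assumed 97% character accuracy.
    for n in (1, 2, 5, 10):
        print(n, round(findable_probability(0.97, 10, n), 4))
```

Under these assumptions a single occurrence of a ten-letter term survives roughly three times in four, and the chance that at least one occurrence survives climbs quickly as the term repeats across a long patent, which is consistent with the slide's point that noisy OCR text can still support search.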
