1. USERS LOST: Reflections on the past, future, and limits of information science A presentation by Meg Eastwood on the 1997 paper by Dr. Tefko Saracevic INF384H September 12, 2011
2. Part One: What is Information Science, and why does it matter for Information Retrieval?
3. What is IR? “the undisputed objective of IR is to provide potentially relevant answers to users’ questions” (pg. 17)
4. Is IR a branch of Computer Science or Information Science? Computer science: “systematic study of algorithmic processes that describe and transfer information” (Denning et al., 1989) Information science: “trying to organize and make accessible the universe of knowledge records, literature, in a way that ‘texts’ most likely to be relevant or of value to users are made most accessible intellectually and physically” (pg. 23)
5. Three “Senses” of Information 1. “signals or messages for decisions involving little or no cognitive processing” (pg. 17) 0 1
6. Three “Senses” of Information 2. “Information involving cognitive processing and understanding” 3. Information that involves cognitively-processed messages and a context (pg. 17-18) Photo courtesy of Lowell Observatory Archives
8. The Beginnings of Information Science Vannevar Bush’s 1945 paper: Defined “the massive problem of making more accessible a bewildering store of knowledge” (Bush 1945) Proposed a technological solution: the “Memex” Photo from http://en.wikipedia.org/wiki/File:Vannevar_Bush_portrait.jpg
9. Focus of Information Science “The proper study for information science is the problem of effective and efficient interface between people and literatures” pg. 20
10. Specialties within Information Science Domain Cluster versus Retrieval Cluster FIG. 3. Top 100 authors in information science, 1980–1987. From White and McCain 1998, pg. 345
11. Traditional Systems-Centered Approach to IR Calvin Mooers, 1951: Defined IR as “embrac[ing] the intellectual aspects of the description of information and its specification for search, and also whatever systems, techniques or machines that are employed to carry out the operation.” Focuses on algorithms and “computational advantages” (pg. 22) “People and users are absent” (pg. 21)
12. Human-Centered Approach to IR “cognitive, situational, and interactive studies and models involving the use of retrieval systems” Mantra: “results have implications for systems design and practice” (pg. 21) From http://www.bleedingcool.com/wp-content/uploads//2011/08/tron-in-tron.jpg
13. Two Distinct Education Systems in IR Shera model Attempted to integrate IR courses into traditional library school curriculum and connect it to professional practice Strengths: “Service framework” “User-oriented” Salton model Education is integrated with experimental research as part of a computer science curriculum Strengths: Firm grounding in math and algorithms Students prepared to contribute to research in field
15. “Natural Limits” of Information Science Human knowledge records are too diverse for a general IR solution Every person searches for, assesses, and copes with information differently
16. Discussion Did Saracevic describe the history of IR in an unbiased manner? What did you think of Saracevic’s definition of Information Science? Have the relationships between the two camps of IR (systems-centered versus human-centered approach) changed since 1997? Research Education Natural limits of IR?
17. References Saracevic, T. (1997). Users lost: reflections on the past, future, and limits of information science. SIGIR Forum 31 (2):16–27. White, H. D. & McCain, K. W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972–1995. Journal of the American Society for Information Science 49 (4):327–355.
Editor's Notes
Acceptance address for the 1997 ACM SIGIR Gerard Salton Award for Excellence in Research (ACM: Association for Computing Machinery; SIGIR: Special Interest Group on Information Retrieval)
Why is that the objective? Because the early pioneers of IR chose that to be the objective—they could have chosen uncertainty or aboutness, but they chose to focus on relevance, and so for now, that’s what the field focuses on
“The fundamental question in computing is: What can be (efficiently) automated?” Saracevic thinks that computer science is not enough, since it “assumes users” and relevance…. Pg 17: “Computer science provides the infrastructure. Information science [provides] the context.” Now that Saracevic has hopefully convinced you that info science matters for IR, he wants to educate you more about what info science is. First he asks, “What kind of information does information science deal with?” In Saracevic’s opinion, there are three senses of information.
Like bits: any two-valued attribute (true/false, yes/no, on/off)
2. So, for example, a picture is tangible, it’s something you can hold in your hand, but the informational content of the picture is a “transaction” between the record and the user.
3. Third sense: the photo is a message, cognitively processed by the user, in a context. Let’s say you’re a science historian, and you want to know more about what it would have been like to be an astronomer in the early 1900s; you would look at photos like these.
Both info science and IR use this third, broadest sense of information: we’re looking at users who are looking at info in a particular context. That’s why we need to consider info science when doing IR, because we always need to consider the user.
He coined the term “information explosion” to describe this problem.
Memex: would be capable of “association of ideas” and artificial intelligence, yet to be realized.
BUT lots of people read Bush’s paper, including governments, who started providing funding for related studies. This led to the National Science Foundation’s Information, Robotics, and Intelligent Systems Division, the Digital Libraries Initiative, etc.
Saracevic glossed over the rest of the history of the field, because in his view, the critical question is “what in particular is being accomplished now?”
“Effective communication relates to relevance.”
Efficient relates to “costs and time.”
Literature: “aggregate of human knowledge records in a domain.” For info science, the most important attribute of these records is their content, so IR is supposed to efficiently connect people with the record content that they need.
You can see the clusters are fairly distinct; there aren’t many authors with a foot in both worlds.
Domain cluster: includes such studies as citation analysis and bibliometrics.
Retrieval cluster: deals with information retrieval, a term coined by Calvin Mooers in 1951.
The term IR was coined by Calvin Mooers in 1951, and his definition of IR was the basis of IR research for the next several decades. But another IR research movement started in the late 70s and gathered steam in the 80s: the human-centered approach to IR. These researchers fight for the user…
(slide) Saracevic defines himself as belonging very firmly in the human-centered camp. BUT he also admits that most studies in this camp just end up making suggestions about design; they don’t really deliver concrete results. Some authors, like Dervin and Nilan (1986), describe the systems approach as “dreadful,” but Saracevic thinks that’s not helpful. IR would best be served by attempting to meld the two groups.
Jesse Shera: dean of the library school at Western Reserve University. Most of the people on the human-centered side of IR were trained under the Shera model.
Gerard Salton: computer scientist.
Weaknesses: the Shera model loses algorithms, the Salton model loses users…
At the time that Saracevic wrote this, he said that the educational approaches were “completely independent” of each other. That was, of course, 1997, and the very existence of this class proves that this is no longer the case. We’ll discuss that more in a bit.