INFORMATION SEEKING BEHAVIOR
IM-50
PRESENTED BY
ANDALEEB ASIM
AYESHA KHAN
NAILA ISHTIAQ
SYED AAMIR ALI NAQVI
OUTLINE
 History of Belkin’s Theory: Anomalous State of Knowledge
 Introduction
 What Is an Anomaly?
 Background
 Comparison of Traditional and Belkin’s Models
 Applications
 Implications
 Conclusions
 References
ANOMALOUS STATE OF KNOWLEDGE
INTRODUCTION
 We are drowning in the overflow of data that are
being collected world-wide, while starving for
knowledge at the same time.
 Anomalous events occur relatively infrequently
 However, when they do occur, their consequences
can be quite dramatic and quite often in a negative
sense
WHAT ARE ANOMALIES?
 An anomaly is a pattern in the data that does
not conform to the expected behaviour
 Also referred to as outliers, exceptions,
peculiarities, surprises, etc.
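To make the definition concrete, here is a minimal sketch that flags values far from the rest of the data. The z-score rule, the `threshold` of 2.0, the function name `find_anomalies`, and the sample readings are all illustrative assumptions, not something taken from the slides.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    The threshold is an assumed, illustrative cut-off.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the expected behaviour.
        return []
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one value that breaks the pattern.
readings = [10, 11, 9, 10, 12, 10, 11, 95, 10, 9]
print(find_anomalies(readings))  # → [(7, 95)]
```

A z-score is only one of many possible definitions of "expected behaviour"; real applications usually need a domain-specific notion of normality.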
HISTORY: NICHOLAS BELKIN
• Nicholas J. Belkin is a Professor at the
School of Communication and Information at
Rutgers University.
• Belkin is best known for his work on human-centered
Information Retrieval and the
hypothesis of the Anomalous State of
Knowledge (ASK).
• Belkin realized that in many cases, users of
search systems are unable to precisely
formulate what they need. They lack some
vital knowledge needed to formulate their queries.
HISTORY OF BELKIN’S THEORY
• In such cases it is more suitable to attempt to
describe a user’s anomalous state of
knowledge than to ask the user to specify her/his
need as a request to the system.
• Among the main themes of his research are
digital libraries, information-seeking behaviors,
and information retrieval systems.
• Dr. Belkin was the chair of SIGIR in 1995-99
and the president of the American Society for
Information Science and Technology (ASIS&T)
in 2005.
BACKGROUND
 Information retrieval (IR) systems, as presently
designed, are assessed in terms of complete recall and
precision or complete user satisfaction.
 Traditional view of IR
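Recall and precision, the two measures named above, can be computed from a set of retrieved documents and a set of relevant documents. The sketch below uses the standard definitions; the function name `precision_recall` and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)  # precision is 2/4, recall is 2/3
```

Both measures presuppose that relevance can be judged against a well-specified need, which is the assumption the ASK view questions.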
THE INFORMATION RETRIEVAL CYCLE
[Figure: the information retrieval cycle. Stages: Source Selection, Query Formulation, Search, Selection, Examination, Delivery. Intermediate products: Resource, Query, Ranked List, Documents. Feedback loops: source reselection; system discovery, vocabulary discovery, concept discovery, document discovery.]
BELKIN MODEL
ANOMALOUS STATE OF KNOWLEDGE
 Basic paradox:
 Information needs arise because the user doesn’t know
something: “an anomaly in his state of knowledge with
respect to the problem faced”
 Search systems are designed to satisfy these needs,
but the user needs to know what he is looking for
 However, if the user knows what he’s looking for, there
may not be a need to search in the first place
 Implication: computing “similarity” between queries
and documents is fundamentally wrong
 How do we resolve this paradox?
APPLICATIONS OF ANOMALY DETECTION
 Network intrusion detection
 Insurance / Credit card fraud detection
 Healthcare Informatics / Medical diagnostics
 Industrial Damage Detection
 Image Processing / Video surveillance
 Novel Topic Detection in Text Mining
ANOMALOUS STATE OF KNOWLEDGE
 This new approach recognizes that a fundamental
element in the IR situation is the development of an
information need out of an inadequate state of
knowledge.
 The appropriate representation treats the
information need as an 'anomalous state of knowledge'
(ASK) [6, 9].
 “The ASK hypothesis is that an information need
arises from a recognized anomaly in the user's state
of knowledge concerning some topic or situation and
that, in general, the user is unable to specify precisely
what is needed to resolve that anomaly.”
IMPLICATIONS
 The typical IR system now available, either
operational or experimental, depends on what we
call the 'best-match' principle.
 ASK = non-specifiability of need (cognitive or
linguistic)
 Cognitive non-specifiability
 Linguistic non-specifiability
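The 'best-match' principle can be sketched as ranking documents by their similarity to the query text. The example below assumes a bag-of-words cosine similarity; the functions `cosine` and `best_match` and the two sample documents are hypothetical and do not represent any particular IR system.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two texts using raw term counts."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = math.sqrt(sum(c * c for c in ta.values())) * math.sqrt(
        sum(c * c for c in tb.values())
    )
    return dot / norm if norm else 0.0


def best_match(query, docs):
    """Return document ids ordered from most to least similar to the query."""
    return sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)


# Hypothetical collection: both documents concern the same condition,
# but only d1 shares vocabulary with the user's query.
docs = {
    "d1": "heart attack symptoms and treatment",
    "d2": "myocardial infarction diagnosis and therapy",
}
query = "heart attack treatment"
print(best_match(query, docs))
print(cosine(query, docs["d2"]))
```

Here d2 scores zero only because the user did not know the expert vocabulary, a small illustration of linguistic non-specifiability: the match reflects the wording of the request rather than the underlying need.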
OTHER RELATED THEORIES
 Unconscious Need by Robert S. Taylor (1968)
 Problematic Situation by Wersig (1971)
 Gaps by Dervin (1983)
RESEARCH METHODOLOGY
 Tape-recording a number of interviews with
users of actual information systems.
 Adaptation and implementation of the text
analysis program developed by Belkin so as to
produce structural representations of these data.
 Obtaining the authors'/users' evaluations of
these representations, through the use of
questionnaires or interviews where appropriate.
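The slides do not describe how the text analysis program works. As a purely illustrative assumption, one simple kind of structural representation is a word association network built from co-occurrence within a short window of a transcribed problem statement. The function `association_network`, the stopword list, and the `window` parameter below are all hypothetical and are not Belkin's program.

```python
from collections import Counter

# Small illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are", "i"}


def association_network(text, window=4):
    """Count how often pairs of content words occur within `window` positions.

    Returns a Counter mapping alphabetically sorted word pairs to counts,
    which can be read as weighted edges of an association network.
    """
    words = [w.strip(".,;:!?'\"").lower() for w in text.split()]
    words = [w for w in words if w and w not in STOPWORDS]
    edges = Counter()
    for i in range(len(words)):
        for j in range(i + 1, min(i + window, len(words))):
            if words[i] != words[j]:
                edges[tuple(sorted((words[i], words[j])))] += 1
    return edges


# Hypothetical fragment of a user's problem statement.
network = association_network("soil erosion and crop yield", window=2)
print(network)
```

With `window=2` only adjacent content words are linked, so the fragment above yields edges for erosion/soil, crop/erosion, and crop/yield.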
CONCLUSIONS
 Anomaly detection can detect critical information in
data
 Highly applicable in various application domains
 Nature of anomaly detection problem is dependent
on the application domain
 Different approaches are needed to solve each
particular problem formulation
