BY
NANTHINI R O
II – MLIS
PONDICHERRY UNIVERSITY








A theory-based approach to designing the various aspects of information retrieval systems
Based on a set of principles and assumptions

Theory drives experiment by suggesting new ways and means of testing
Experiment drives theory by justifying the model or helping to improve it


Cognitive or user-centered
◦ Human information behaviour models
◦ E.g.: Wilson’s model, Dervin’s model, Ellis’s model, Bates’s model, Kuhlthau’s model, etc.

Structural or system-centered
◦ Classical models based on logical and mathematical principles
◦ E.g.: Boolean search model, vector space model, probabilistic model, etc.








Also called the ‘term vector model’ or ‘vector processing model’
Represents both documents and queries by sets of terms and compares global similarities between queries and documents
Used in information filtering, information retrieval, indexing and relevancy ranking

First used in the SMART Information Retrieval System


Each document is assigned a term vector over its keywords, with each term weighted according to its relevance

The vectors are used to compare different texts and to retrieve relevant records similar to the queries

Terms are single words, keywords, or longer phrases

If words are chosen as the terms, the dimensionality of the vector is the number of words in the vocabulary (the number of distinct words occurring in the corpus)
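
The points above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own method: the three-document corpus, the whitespace tokenizer, and the use of raw term counts as weights are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical three-document corpus, used only for illustration.
corpus = {
    "d1": "information retrieval systems retrieve information",
    "d2": "vector space model",
    "d3": "retrieval model for information systems",
}

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

# The vocabulary is the set of distinct words occurring in the corpus,
# sorted so that every vector uses the same term order.
vocabulary = sorted({word for text in corpus.values() for word in tokenize(text)})

def term_vector(text, vocabulary):
    """Return one weight per vocabulary word.

    Assumption: the weight is the raw count of the word in the text.
    """
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

vectors = {doc_id: term_vector(text, vocabulary) for doc_id, text in corpus.items()}

print(vocabulary)
for doc_id, vec in vectors.items():
    print(doc_id, vec)
```

Every vector has one component per vocabulary word, so its dimensionality equals the vocabulary size, whatever the length of the individual document.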


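BASICS: (i and j are two documents, k is the term index, t is the last term, and $w_{ik}$ is the weight of term k in document i)

◦ $\sum_{k=1}^{t} w_{ik}$ denotes the sum of the weights of all properties (terms) of vector i

◦ $\sum_{k=1}^{t} w_{ik}\, w_{jk}$ denotes the sum of products of corresponding term weights for the two vectors
◦ $\sum_{k=1}^{t} \min(w_{ik}, w_{jk})$ denotes the sum of minimum component weights of the two corresponding vectors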
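
The three quantities just described can be written as small Python helpers. This is a sketch under assumptions: the function names are hypothetical, and the two weight lists are the ones the deck uses for its two-document example.

```python
def weight_sum(vec):
    """Sum of the weights of all terms of one vector."""
    return sum(vec)

def product_sum(vec_i, vec_j):
    """Sum of products of corresponding term weights of two vectors."""
    return sum(a * b for a, b in zip(vec_i, vec_j))

def min_sum(vec_i, vec_j):
    """Sum of the minimum of each pair of corresponding term weights."""
    return sum(min(a, b) for a, b in zip(vec_i, vec_j))

# Term weights for documents i and j, taken from the deck's example.
doc_i = [3, 2, 1, 0, 0, 0, 1, 1]
doc_j = [1, 1, 1, 0, 0, 1, 0, 0]

print(weight_sum(doc_i))          # 8
print(weight_sum(doc_j))          # 4
print(product_sum(doc_i, doc_j))  # 6
print(min_sum(doc_i, doc_j))      # 3
```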


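Similarity coefficients (according to Salton and McGill), for term weights $w_{ik}$ and $w_{jk}$ of documents i and j
◦ The Dice Coefficient: $\dfrac{2 \sum_{k=1}^{t} w_{ik}\, w_{jk}}{\sum_{k=1}^{t} w_{ik} + \sum_{k=1}^{t} w_{jk}}$

◦ The Jaccard Coefficient: $\dfrac{\sum_{k=1}^{t} w_{ik}\, w_{jk}}{\sum_{k=1}^{t} w_{ik} + \sum_{k=1}^{t} w_{jk} - \sum_{k=1}^{t} w_{ik}\, w_{jk}}$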
Let the weights for the index terms assigned to two documents i and j be as follows:

Doc_i = 3, 2, 1, 0, 0, 0, 1, 1
Doc_j = 1, 1, 1, 0, 0, 1, 0, 0

Dice coefficient
= 2 [(3*1)+(2*1)+(1*1)+(0*0)+(0*0)+(0*1)+(1*0)+(1*0)] / [(3+2+1+0+0+0+1+1)+(1+1+1+0+0+1+0+0)]
= 12/12 = 1

Jaccard coefficient
= 6/(12-6)
= 1
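
The arithmetic above can be checked with a short Python sketch. It assumes the Dice coefficient is twice the sum of products divided by the sum of both vectors' weights, and the Jaccard coefficient is the sum of products divided by the sum of both vectors' weights minus the sum of products, which is what the worked figures compute. The function names and the zero-denominator guard are additions for illustration.

```python
def dice(vec_i, vec_j):
    """Dice coefficient on weighted term vectors."""
    shared = sum(a * b for a, b in zip(vec_i, vec_j))
    total = sum(vec_i) + sum(vec_j)
    # Guard (an assumption): treat two all-zero vectors as having similarity 0.
    return 2 * shared / total if total else 0.0

def jaccard(vec_i, vec_j):
    """Jaccard coefficient on weighted term vectors."""
    shared = sum(a * b for a, b in zip(vec_i, vec_j))
    denom = sum(vec_i) + sum(vec_j) - shared
    return shared / denom if denom else 0.0

doc_i = [3, 2, 1, 0, 0, 0, 1, 1]
doc_j = [1, 1, 1, 0, 0, 1, 0, 0]

print(dice(doc_i, doc_j))     # 2 * 6 / (8 + 4) = 1.0
print(jaccard(doc_i, doc_j))  # 6 / (12 - 6)    = 1.0
```

Note that with non-binary weights these forms are not bounded by 1: for example, two identical one-term vectors with weight 3 give a Dice value of 18/6 = 3.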
Vector space model of information retrieval
