Research Paper for CSCI 6370, Topics in Computer Science
Name: Sai Nithish Kumar Posani
SID : 20356909
Professor: Zhixiang Chen
Abstract:
The main goal of information retrieval is to return the exact response a user needs for a specific
query.
Existing model:
In the existing model, information retrieval works by analyzing the entire document to answer a
given query, and the terms related to the query are extracted. The indexing weight plays a major
role here: it is applied to all the terms, and the response is then returned to the user. The
existing model does not take context into account, so information cannot be retrieved
efficiently.
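The existing model described above can be illustrated with a minimal sketch. It assumes that the indexing weight is a plain relative term frequency, which the source does not specify; the function names and sample documents are hypothetical. The point it shows is that every term, including background words such as "the", receives a weight, and that the two senses of "apple" are not distinguished.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_weights(document):
    """Weight every term by relative frequency (assumed weighting, no context)."""
    tokens = tokenize(document)
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {term: count / total for term, count in counts.items()}


def score(document, query):
    """Sum the weights of the query terms that occur in the document."""
    weights = term_weights(document)
    return sum(weights.get(term, 0.0) for term in tokenize(query))


# Hypothetical documents: one about the company, one about the fruit.
docs = [
    "Apple released a new phone. The phone is made by Apple.",
    "An apple a day keeps the doctor away.",
]
ranked = sorted(docs, key=lambda d: score(d, "apple"), reverse=True)
```

Both documents receive a non-zero score for the query "apple", because the weighting looks only at the surface term and not at the context in which it appears.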
Proposed Model:
To make information retrieval more efficient, the authors of the paper propose a context-sensitive
document indexing approach. A lexical resource is used to separate content-carrying terms from
background terms. The indexing weight is calculated only for the content-carrying terms. The
sentences with the highest indexing weight are taken as the most salient sentences; these
sentences are retrieved and used to summarize the document.
When the user enters a query, it is treated as a keyword, and the search is carried out on that
keyword. The keyword is then compared with the summarized document. If it matches, the
related sentences are extracted using the indexing algorithm. Finally, these sentences are
returned as the response to the query. This flow completes the information retrieval
process.
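The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions: a small hard-coded set of background terms stands in for the lexical resource, the indexing weight is approximated by how often a content-carrying term occurs in the document, and sentence salience is the sum of those weights. The paper's actual weighting is not given in this report, and all names here are hypothetical.

```python
from collections import Counter
import re

# Assumption: a tiny hand-made list stands in for the lexical resource.
BACKGROUND_TERMS = {"the", "a", "an", "is", "are", "of", "to", "in",
                    "and", "it", "by", "for", "on", "with"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_terms(sentence):
    """Return the terms of a sentence that are not background terms."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in BACKGROUND_TERMS]


def summarize(text, top_n=2):
    """Keep the top_n most salient sentences, in their original order."""
    sentences = split_sentences(text)
    # Assumed indexing weight: document-wide count of each content term.
    weights = Counter(t for s in sentences for t in content_terms(s))

    def salience(sentence):
        return sum(weights[t] for t in content_terms(sentence))

    top = sorted(sentences, key=salience, reverse=True)[:top_n]
    return [s for s in sentences if s in top]


def answer(query, summary):
    """Return summary sentences sharing a content term with the query."""
    keywords = set(content_terms(query))
    return [s for s in summary if keywords & set(content_terms(s))]


text = ("Indexing speeds up retrieval. The weather is nice. "
        "Retrieval uses indexing weights for ranking. It is late.")
summary = summarize(text)
```

Summing weights favors longer sentences; a fuller implementation would probably normalize by sentence length, but that choice is outside what the report describes.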
Overview:
Internet and computer usage has increased rapidly. Even in villages, people now use computers
or personal systems. Compared with two decades ago, the use of computer technology has
multiplied several times over. The main focus here is connecting to the internet, searching
different web pages, and gathering information from them. The websites may cover
entertainment, education, business, personal, social, historical, political, mechanical, or
industrial topics, or anything else. Whatever we enter in the browser is treated as a query.
Different client programs, that is, browsers, are available, for example Google Chrome, Mozilla
Firefox, Safari, and Torch.
Web search matters more than ever. If we type something, press Enter, and the response takes
2 minutes, the result is useless, so everyone needs a fast response from the server. A second
issue is irrelevant output. For example, if you type the word "apple" to find information about
the Apple company, you may see results about apple fruit, pineapple, or the advantages of eating
apples. Whenever and wherever you search, related content should be displayed. Achieving such
results requires a very good program behind the search engine. Here I explain the document
indexing approach, which can produce good results of the kind discussed above. This requires a
good understanding of search engines: how the user is going to search, what the user expects
from the server, and what the user shows more interest in. There are different techniques for
search engine development, such as document indexing, web crawling, keyword search,
document clustering, and link-based ranking. The term "search" has particular importance in web
terminology. Text mining is the process of extracting useful, high-quality information from a
document. The original document on a website contains a huge amount of information, and the
user finds it hard to grasp the main theme of the document. To overcome these difficulties,
information is retrieved from a summarized document. Document summarization comes in
different types: single-document summarization is the process of summarizing a single document,
and multi-document summarization is used to summarize the content of one or more documents.
The main aim of information retrieval is to satisfy the information need of a user. Its general
task is to retrieve the relevant terms for the user's query within an acceptable response time,
and to provide the information sets that match the keywords of the query. Information retrieval
mainly deals with the representation, storage, and organization of information items, and with
access to them. It is used to reduce the problem called information overload, which refers to
the difficulty a person has in understanding an issue when too much information is present.
When the user enters a query, the query is executed and the information is retrieved from the
stored records using SQL. The query is matched against the documents, the information related
to the query is extracted, and the best match is displayed as the response to the user.
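As a rough illustration of executing a query with SQL, the sketch below keeps documents in an in-memory SQLite table and matches a keyword with a LIKE pattern. The table layout, function names, and sample rows are assumptions made for this example; the report does not say which database or schema is used.

```python
import sqlite3


def build_store(documents):
    """Create an in-memory table (hypothetical schema) holding the documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO documents (body) VALUES (?)",
                     [(d,) for d in documents])
    conn.commit()
    return conn


def search(conn, keyword):
    """Return (id, body) rows whose body contains the keyword."""
    # The keyword is passed as a bound parameter, never pasted into the SQL.
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT id, body FROM documents WHERE body LIKE ? ORDER BY id",
        (pattern,),
    )
    return rows.fetchall()


conn = build_store([
    "Apple announces new laptop",
    "Banana bread recipe",
    "How to grow an apple tree",
])
matches = search(conn, "apple")
```

A LIKE scan reads every row, so it only shows the matching step; it does not rank the matches or pick the best one.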
Advantages:
• Fewer commands need to be known by the user for a given level of output.
• Fewer clicks or keystrokes are required to carry out a given operation.
• Consistent behavior can be pre-programmed or altered by the user.
• Fewer choices appear on screen at one time (i.e. less "clutter").
• Sentence splitting: the content-carrying terms are used to convey the main idea of the
content.
• Lexical association
• Context-based indexing approach
Disadvantages:
• Context-sensitive actions may be perceived as dumbing down the user interface, leaving
the operator at a loss as to what to do when the computer decides to perform an unwanted action.
• Moreover, non-automatic actions may be hidden or obscured by the context-sensitive interface,
causing a rise in user workload for operations the developers did not foresee.
Improvisation:
The paper concentrates on the document indexing concept, which is really useful and raises the
efficiency of web search. In my view, however, concentrating only on document indexing is not
enough for the web. Two or three different techniques should be implemented together in a single
engine, such as document indexing, page ranking, crawling, and page clustering. All of these
affect how information is searched and retrieved. Information retrieval plays a major role in
every one of them, which is why we need to consider every angle to get output from the server as
early as possible.
Information search and retrieval is a very large process. To achieve it we need to develop an
effective application and use techniques such as document indexing, page ranking, and
clustering. Among these, the document index plays a vital role in searching: instead of
scanning hundreds of thousands of documents, the engine goes directly to the relevant index
entry and returns the output. The main achievement here is indexing. The purpose of storing an
index is to optimize speed and performance in finding the documents that correspond to the
user's query.
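The idea of going directly to an index entry instead of scanning every document can be sketched with a small inverted index. This is an illustrative structure, not the method of the paper under review; the function names and sample documents are hypothetical.

```python
from collections import defaultdict
import re


def build_index(documents):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in re.findall(r"[a-z]+", text.lower()):
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return []
    # Each term is a direct dictionary lookup; no document text is scanned.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


docs = [
    "document indexing speeds up search",
    "page ranking orders results",
    "indexing and ranking work together",
]
index = build_index(docs)
```

The index is built once, so the cost of reading the documents is paid up front; each later query touches only the entries for its own terms.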
My conclusion is that the context-based indexing approach is used for query retrieval, working
mainly from the source document. Consulting an index is better than searching every page on the
server. This saves time and reduces the burden on the server.
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 

Research Report on Document Indexing-Nithish Kumar

  • 1. Research Paper for CSCI 6370, Topics in Computer Science Name: Sai Nithish Kumar Posani SID : 20356909 Professor: Zhixiang Chen
  • 2. Abstract: The main goal of information retrieval is to return an accurate response to a user's specific query. Existing model: In the existing model, information retrieval works by analyzing the entire document to answer a given query, and the terms related to the query are extracted. Indexing weight plays a major role here: it is applied to all terms, and the response is then returned to the user. The existing model does not take context into consideration, so information cannot be retrieved efficiently. Proposed model: To make information retrieval more efficient, the paper proposes a context-sensitive document indexing approach. Using a lexical resource, this approach separates content-carrying terms from background terms. Indexing weights are calculated for the content-carrying terms. The sentences with the highest indexing weights are taken as the most salient; these sentences are retrieved and used to summarize the document. When the user enters a query, it is treated as a keyword, and the search is performed on that keyword. The keyword is then matched against the summarized document. If it matches, the related sentences are extracted using the indexing algorithm. Finally, these sentences are returned as the response to the query. Following this flow, the information retrieval process completes successfully. Overview: Today, internet and computer usage is increasing rapidly. Even in villages, people use computers or personal systems. Compared with two decades ago, the use of computer technology has doubled, tripled, or quadrupled. The main focus here is connecting to the internet and searching different web pages to gather information. 
The websites may concern entertainment, education, business, personal, social, historical, political, mechanical, or industrial topics; they may be about anything. Whatever we enter in the browser is treated as a query. Many kinds of client software, namely browsers, are available, for example Google Chrome, Mozilla Firefox, Safari, and Torch. Web search is more important than ever: if we type something, press Enter, and the response takes two minutes, it is useless, so everyone needs a fast response from the server. A second problem arises when the output is unrelated to what was typed. For example, if you type the word "apple" looking for information about the company Apple, the results may show apple fruit, pineapple, the benefits of eating apples, and so on. Whenever and wherever you search, you should get good results with related content displayed. To achieve such results you need to write a very good program
  • 3. for the search engines. Here I explain the document indexing approach, which can deliver the kind of good results discussed earlier. This requires a better understanding of search engines: how the user searches, what the user expects from the server, and what the user is most interested in. Search engine development draws on different techniques such as document indexing, web crawling, keyword search, document clustering, and link-based ranking. The term "search" carries great importance in web terminology. Text mining is the process of extracting useful, high-quality information from a document. An original document on a website contains a huge amount of information, and the user finds it hard to grasp the main theme of the document. To overcome these difficulties, information is retrieved from a summarized document. Document summarization comes in different types: single-document summarization summarizes a single document, while multi-document summarization summarizes the content of more than one document. The main aim of information retrieval is to satisfy the information need of a user. The general task of information retrieval is to retrieve the relevant terms for the user's queries within an acceptable response time. Its primary aim is to provide information sets that match the keywords of a query. Information retrieval mainly deals with the representation, storage, and organization of, and access to, information items. It is used to reduce the problem called information overload, which refers to the difficulty a person has in understanding issues because of the presence of too much information. When the user enters a query, the information is retrieved. The queries are executed, and the results are retrieved from the document, using SQL. 
When the user enters a query, it is matched against the document, and the information related to the query is extracted. The best match is presented to the user as the response. Advantages: • Fewer commands need to be known to the user for a given level of output. • Fewer clicks or keystrokes are required to carry out a given operation. • Consistent behavior can be pre-programmed or altered by the user. • Fewer choices appear on screen at one time (less "clutter"). • Sentence splitting: the content-carrying terms are used to convey the main idea of the content. • Lexical association. • Context-based indexing approach. Disadvantages: • Context-sensitive actions may be perceived as dumbing down the user interface, leaving the operator at a loss as to what to do when the computer decides to perform an unwanted action. • Non-automatic actions may be hidden by the context-sensitive interface, increasing user workload for operations the developers did not foresee. Improvisation: The paper concentrates on the document indexing concept, which is really useful and will raise the efficiency of searching web pages. However, in my view, concentrating only on document indexing is not enough for web
  • 4. pages; two or three different techniques, such as document indexing, page ranking, crawler implementation, and page clustering, should be implemented together in a single engine. All of these can affect the information search and retrieval pattern. Information retrieval plays a major role in every aspect here, which is why we need to concentrate on every angle to get output from the server as quickly as possible. Information search and retrieval is a very large process; to achieve it we need to develop an effective application using techniques such as document indexing, page ranking, and clustering. Among these, the document index plays a vital role in searching: instead of scanning hundreds of thousands of documents, the engine goes directly to the particular index entry and returns the output. Our main achievement here is indexing: the purpose of storing an index is to optimize speed and performance in finding the documents that correspond to the user's query. My conclusion is that the context-based indexing approach is used in query retrieval, working mainly from the source document. Looking up an index is better than searching every page on the server: it saves time and reduces the burden on the server. References: 1. D.R. Radev, H. Jing, M. Stys, and D. Tam, "Centroid-Based Summarization of Multiple Documents," Information Processing and Management. 2. I. Mani, G. Klein, D. House, L. Hirschman, T. Firmin, and B. Sundheim, "Summac: A Text Summarization Evaluation," Nat'l Language Eng. 3. X. Wan, J. Yang, and J. Xiao, "Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction." 4. K. Morita, E.-S. Atlam, M. Fuketa, K. Tsuda, M. Oono, and J.-i. Aoe, "Word Classification and Hierarchy Using Co-Occurrence Word Information," Information Processing and Management. 5. H. Li, "Word Clustering and Disambiguation Based on Co-Occurrence Data," Nat'l Language Eng. 6. C.-Y. Lin, G. Cao, J. Gao, and J.-Y. Nie, "An Information-Theoretic Approach to Automatic Evaluation of Summaries," Proc. Main Conf. Human Language Technology Conf. North Am. Chapter of the Assoc. of Computational Linguistics.
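
The retrieval flow described in this report (separate content-carrying terms from background terms, weight the content terms, keep the highest-weighted sentences as a summary, then match the query keyword against that summary) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `BACKGROUND_TERMS` stop list is a hypothetical stand-in for the lexical resource, plain term frequency is an assumed substitute for the context-sensitive indexing weight, and the function names `summarize` and `answer` are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for the lexical resource: a tiny list of background terms.
BACKGROUND_TERMS = {"the", "a", "an", "is", "are", "of", "to", "and",
                    "in", "it", "for", "on", "with", "by"}


def split_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_terms(sentence):
    """Return the lowercase words of a sentence that are not background terms."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in words if w not in BACKGROUND_TERMS]


def summarize(text, top_k=2):
    """Keep the top_k sentences with the highest total content-term weight."""
    sentences = split_sentences(text)
    # Assumption: a term's indexing weight is its frequency among content terms.
    weights = Counter(t for s in sentences for t in content_terms(s))

    def score(sentence):
        return sum(weights[t] for t in set(content_terms(sentence)))

    # Rank by descending score; ties keep the earlier sentence first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-score(sentences[i]), i))
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:top_k])]


def answer(query, summary):
    """Return the summary sentences that share a content term with the query."""
    query_terms = set(content_terms(query))
    return [s for s in summary if query_terms & set(content_terms(s))]


if __name__ == "__main__":
    document = ("Apple designs phones and laptops. The weather is nice. "
                "Apple sells phones in many countries. It is late.")
    summary = summarize(document, top_k=2)
    print(summary)
    print(answer("laptops", summary))
```

In this sketch the query is matched only against the short summary rather than the full document, which mirrors the report's point that looking up a compact index is cheaper than scanning every page.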