2. OVERVIEW
• Introduction
• Problem statement
• Literature Review
• Query log
• Querying the search engine
• Search engine optimization
• Proposed algorithm
• Result analysis
• Conclusion
• Future scope
• References
3. INTRODUCTION
• SEO (search engine optimization) is a concept that helps information
to be retrieved effectively and efficiently.
• Search engine optimization refers to the process of increasing
the traffic to a given website.
• Searching academic journals and scholarly articles may require
attention to factors beyond keyword search and context-based
querying strategies.
• ASEO (academic search engine optimization) is used to improve the
rank and exposure of academic articles.
• SemAcSearch scans a database of articles and ranks them by
relevance to the search query.
4. PROBLEM STATEMENT
• How Lingo (a novel algorithm for clustering search results) is
able to reduce the original term-document matrix.
• How incorporating semantics along with other ASEO
techniques could enhance the ranking of the most
relevant articles.
5. Literature Review
• Rekha Singhal et al. (2016) show that a query log typically contains data
about the user, the issued query, the clicked results, and so on. From this data,
knowledge can be extracted to improve the quality (both effectiveness and
efficiency) of the system. The query log format represents a
record using five features: user id, query, timestamp, rank of the clicked result,
and host string of the clicked URL.
• B. Prasanthi et al. (2016) present the implementation and development of a Novel Image
Repositioning System (NIRS) that automatically learns, offline, the different
visual semantic features for various queries through keyword expansions.
• Fayyaz Ali et al. (2016) propose a new kind of PageRank called Ratio-based
Weighted PageRank that performs better than the PageRank and Weighted PageRank
algorithms, particularly in terms of association.
• Lei et al. proposed a semantic model to analyse search queries. The
model breaks the query into the various components of the sentence and
assigns a weight to each component.
6. QUERY LOG
One of the most widely used methods for
improving users’ search experience is the
exploitation of the knowledge contained in past
queries. Query logs record the queries issued to a search
engine along with further information such as the user
who submitted the query, the pages viewed and clicked in the
result set, the position of every result, and the exact
time at which a particular action was performed.
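A query log record of this kind can be sketched as a small data structure. The example below is a minimal illustration in Python, assuming the five features listed in the literature review (user id, query, timestamp, rank of the clicked result, host of the clicked URL). The class name, field names, helper function and the sample entries are all hypothetical and are not taken from any real log.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryLogRecord:
    """One query log entry (field names are illustrative assumptions)."""
    user_id: str
    query: str
    timestamp: str                 # assumed to be an ISO 8601 string
    clicked_rank: Optional[int]    # rank of the clicked result, None if no click
    clicked_host: Optional[str]    # host of the clicked URL, None if no click


def click_counts(records):
    """Count clicks per (query, host) pair, skipping records without a click."""
    counts = Counter()
    for record in records:
        if record.clicked_host is not None:
            counts[(record.query.lower(), record.clicked_host)] += 1
    return counts


# Made-up toy entries, for illustration only.
sample_log = [
    QueryLogRecord("u1", "semantic search", "2016-03-01T10:00:00", 2, "example.org"),
    QueryLogRecord("u2", "Semantic Search", "2016-03-01T10:05:00", 1, "example.org"),
    QueryLogRecord("u3", "semantic search", "2016-03-01T10:07:00", None, None),
]
counts = click_counts(sample_log)
```

Aggregated click counts like these are one simple form of the knowledge that can be extracted from past queries.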
7. Querying the Search Engine
Inverted Index
The primary data structure of most IR systems is the inverted index. An inverted
index is a data structure that lists, for every word, all documents that contain it and the frequency of its
occurrences in each document. It makes it easy to search for ‘hits’ of a query word.
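The structure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization and lowercasing. The function names and the two toy documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to a dict of {document id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index


def lookup(index, word):
    """Return the postings ({doc id: frequency}) for a query word."""
    return dict(index.get(word.lower(), {}))


# Toy documents, for illustration only.
docs = {
    1: "search engine ranks pages",
    2: "academic search engine for academic articles",
}
index = build_inverted_index(docs)
```

Here `lookup(index, "academic")` returns the documents containing the word together with its frequency in each, which is the 'hit' list for that query word.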
Stop Word Elimination
Stop words are high-frequency words that are deemed unlikely to be useful for searching. All such
words are kept in a list called the stop list. For example, the articles “a”, “an”, “the” and prepositions such as “in”, “of”, “for”, “at”
are stop words.
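Stop word elimination can be sketched as a simple filter. The stop list below contains only the example words given above; a real stop list would be much longer. The function name is hypothetical.

```python
# Stop list limited to the examples in the text (an assumption for brevity).
STOP_WORDS = {"a", "an", "the", "in", "of", "for", "at"}


def remove_stop_words(text):
    """Lowercase the text, split on whitespace and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


tokens = remove_stop_words("The ranking of articles in a search engine")
# tokens == ["ranking", "articles", "search", "engine"]
```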
Stemming
Stemming, a simplified form of morphological analysis, is the heuristic process of extracting the base form of
words by chopping off their endings. For example, the words laughing, laughs and laughed would be stemmed
to the root word laugh.
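The idea of chopping off word endings can be illustrated with a deliberately naive suffix stripper. This is only a sketch under simplifying assumptions: the suffix list and the minimum stem length are arbitrary choices, and it is not the Porter algorithm or any other established stemmer, which handle many more cases.

```python
# Suffixes are tried in this order; the list is an illustrative assumption.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(w) for w in ("laughing", "laughs", "laughed")]
# stems == ["laugh", "laugh", "laugh"]
```

A rule this simple over-strips or under-strips many words (for example, it turns "running" into "runn"), which is why practical systems rely on more elaborate rule sets.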
Semantics
The first part of semantic analysis, the study of the meaning of individual words, is called lexical semantics; when
the analysis is applied to larger chunks of text such as paragraphs, it is simply called semantics. Lexical semantics covers words,
sub-words, affixes (sub-units), compound words and phrases, which are collectively called lexical items. In
other words, lexical semantics is the relationship between lexical items, the meaning of sentences
and the syntax of sentences.
8. SEARCH ENGINE OPTIMIZATION
• Increase visibility on search
engines.
• Turn a local business into an
international business.
• Bring more traffic to a website.
• Present a business as a brand
easily.
11. RESULTS:
• The weight of every URL is computed as:
NewRank(X) = OldRank(X) + Weight(X) + RankByUser(X)
• The performance of the search engine is measured by:
(a) Precision (P) = TP/(TP+FP)
(b) Recall (R) = TP/(TP+FN)
(c) F1-score = 2*P*R/(P+R)
where TP = the number of relevant results correctly identified as relevant,
FP = the number of irrelevant results incorrectly identified as relevant,
FN = the number of relevant results incorrectly identified as irrelevant.
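The formulas above translate directly into code. The sketch below is a minimal Python illustration. The function names are hypothetical, returning 0.0 when a denominator is zero is an assumption made here to avoid division errors, and the counts in the usage line are made up and are not results from this work.

```python
def new_rank(old_rank, weight, rank_by_user):
    """NewRank(X) = OldRank(X) + Weight(X) + RankByUser(X)."""
    return old_rank + weight + rank_by_user


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1-score from TP, FP and FN counts.

    Returns 0.0 for a metric whose denominator is zero (an assumption).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

With these toy counts, precision is 8/10 and recall is 8/16, and the F1-score is their harmonic mean.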
12. CONCLUSION
• This paper presented SemAcSearch, a search engine for
journal and scholarly articles that considers the
context and semantics of the query when computing
the overall ranking of publications.
• We used the Lingo clustering algorithm to group
similar entities into the same cluster and thus
abstract the original data to half of the original
document.
13. FUTURE SCOPE
• SemAcSearch focuses only on making sense of the context associated with each
word. A more foolproof and relevant method would also reference other
sections of each paper to develop an impact factor for a paper with respect to
other peer authors. This would lead to a graph model on which a more effective
ranking can be computed, ensuring greater relevance of results.
• Further analysis of the query, such as assigning a weight based on POS tagging and a score
based on the component, will also be explored next.