Personalizing Search
By applying user context and uncovering essential information, search
engines can deliver a more rewarding experience, resulting in more
digital revenue for the organization.
• Cognizant 20-20 Insights
Executive Summary
Millions of users perform billions of searches every day. Although the many search engines available today handle basic keyword searches well, they often fall short on advanced research, especially when it comes to the requirements of information and media services providers. Regardless of individual user needs, a search engine typically returns the same results to everyone.
This white paper highlights improvements that can be made to the relevance of search engine results, as well as how information and media services providers can personalize search results based on each user’s interests and requirements. It also describes the different dimensions of personalization and recommends ways to measure and collect insights against these dimensions.
Search Engine Results Relevancy
When gauging search engine effectiveness, two
key questions must be answered:
•	 What should the result set consist of?
•	 What should the sort order of the returns be?
Search engines generally use a variation of the tf-idf (term frequency–inverse document frequency) algorithm for search and relevance (see Quick Take). This means if a user searches for “the driving license,” the engine takes two steps based on term frequency:
1.	Find all documents with “the driving license” or a variation of that term.
2.	Sort the returned documents, displaying those with the most term mentions at the top and those with the fewest mentions at the bottom.
Because “the” is a common or “noise” term that appears in almost every document, it carries little weight. The other two words, “driving” and “license,” are more specific, so they (along with variants such as driver/drive and licensing/licensed) are given priority for relevance. This down-weighting of terms that occur in many documents is the idf (inverse document frequency) part of the algorithm. The search engine returns all documents with “driving license” and its variations, effectively ignoring “the.” But the question remains: Is that sufficient to produce relevant search results?
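The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine is implemented: it assumes whitespace tokenization, no stemming, raw term counts for tf and log(N / n(t)) for idf. The function names and the three sample documents are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a real engine would also stem)."""
    return text.lower().split()

def tf_idf_scores(query, documents):
    """Score each document against the query: sum of f(t,d) * log(N / n(t))."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # n(t): number of documents that contain term t at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if doc_freq[term] == 0:
                continue  # a term absent from the corpus contributes nothing
            idf = math.log(n_docs / doc_freq[term])
            score += counts[term] * idf
        scores.append(score)
    return scores

# Hypothetical sample corpus.
docs = [
    "the driving license office in the city",
    "the weather in the city",
    "renew the driving license online",
]
scores = tf_idf_scores("the driving license", docs)
# Indices of documents, most relevant first.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because “the” occurs in every sample document, its idf is log(3/3) = 0, so it adds nothing to any score; only “driving” and “license” separate the documents, and the weather document ranks last.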
Beyond Basic Search
Returning a result set based on the search term alone is enough for basic search engines but not for advanced research engines. Along with tf-idf, user context is the most important factor to consider. In the above example, the search engine will return all documents with the term “driving license”; however, the top results might relate to applying for a license in California, which is not useful if the user is located outside that state. As a result, the top results are relevant to only a very small portion of the user base. If the search engine records and analyzes the user’s previous searches, however, it can apply additional context to what the user is seeking and thus provide more relevant results.
Cognizant 20-20 Insights | December 2015
In the above case, if user context is factored in, the search engine would rank California results first for a California resident, New York results first for a New York resident and Alaska results first for an Alaska resident. User context typically consists of the following dimensions:
•	 Geographic location.
•	 Area of interest.
•	 Profession and economic background.
•	 Gender and age group.
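As an illustration of how these dimensions might be applied, the sketch below re-ranks tf-idf results using only the geographic dimension. Everything here is an assumption made for the example: the UserContext and Result types, the region field and the fixed location_boost multiplier are hypothetical and are not taken from any specific search engine.

```python
from dataclasses import dataclass

@dataclass
class UserContext:
    """The four context dimensions listed above (hypothetical representation)."""
    location: str
    interests: frozenset = frozenset()
    profession: str = ""
    age_group: str = ""

@dataclass
class Result:
    title: str
    base_score: float   # relevance from tf-idf alone
    region: str = ""    # region the document applies to; empty if none

def personalize(results, user, location_boost=2.0):
    """Re-rank results, boosting documents that match the user's location."""
    def score(result):
        boost = location_boost if result.region == user.location else 1.0
        return result.base_score * boost
    return sorted(results, key=score, reverse=True)

results = [
    Result("Apply for a driving license in California", 0.9, "California"),
    Result("Apply for a driving license in New York", 0.8, "New York"),
    Result("Apply for a driving license in Alaska", 0.7, "Alaska"),
]
top = personalize(results, UserContext(location="Alaska"))[0]
```

With these sample scores, the Alaska document moves from last to first for the Alaska user (0.7 × 2.0 = 1.4), while a user with no matching region would see the plain tf-idf order.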
User context is also useful when a user located in a particular state searches for an item, say a Samsung phone. Even when the results are returned with Samsung phones on top, not all of them may be relevant, for the following reasons:
•	 Geographic location: The user wants a
localized search. While thousands of stores sell
Samsung phones in the U.S., he is interested in
just the four to five nearby outlets.
•	 Area of interest: Responses should be different
for someone interested in news vs. one seeking
technical details.
•	 Profession and economic background: Responses should also consider the user’s profession and economic background. If a lawyer is conducting a search, for example, he might be seeking updates on an Apple vs. Samsung lawsuit [1] and be uninterested in Samsung stores and reviews.
•	 Gender and age group: Not only does interest
level vary based on gender and age group; so
does the relevance of results. A younger user
might be seeking the best games on a Samsung
phone, which differ based on age and gender.
Quick Take: Understanding the tf-idf Algorithm
The following are variations of tf, idf or tf-idf that are used for relevance. Here f(t,d) is the number of times term t occurs in document d, N is the total number of documents and n(t) is the number of documents that contain t.

Simple tf-idf calculation
tfidf(t,d,D) = tf(t,d) * idf(t,D)

Variants of tf Weight
Weighting Scheme             tf Weight
Binary                       {0,1}
Raw frequency                f(t,d)
Log normalization            log(1 + f(t,d))
Double normalization 0.5     0.5 + 0.5 * f(t,d) / max f(t',d)
Double normalization K       K + (1 - K) * f(t,d) / max f(t',d)
(max f(t',d) is the highest raw frequency of any term t' in document d.)

Variants of idf Weight
Weighting Scheme                   idf Weight
Unary                              1
Inverse frequency                  log(N / n(t))
Inverse frequency smooth           log(1 + N / n(t))
Inverse frequency max              log(1 + max n(t') / n(t))
Probabilistic inverse frequency    log((N - n(t)) / n(t))

Recommended Weighting Schemes
Weighting Scheme    Document Term Weight
1                   f(t,d) * log(N / n(t))
2                   log(1 + f(t,d)) * log(1 + N / n(t))
3                   log(1 + f(t,d)) * log(N / n(t))

Source: tf-idf entry on Wikipedia (http://en.wikipedia.org/wiki/Tf%E2%80%93idf)

While some user dimensions, such as gender, do not change over time, others, such as location, profession or area of interest, can vary and must
be tracked and updated regularly. This is where
some search engines excel over others.
For example, Netflix is preferred over other
movie services because its search engine collects
and acts on user context dimensions obtained
through the user’s profile, which includes data on:
•	 Age group.
•	 Ratings given to movies viewed.
•	 Number of times content is viewed (which
provides area of interest and content
popularity).
•	 Previous selections in a genre and browse
history (which provides area of interest and
ease of finding content).
•	 Viewing location.
•	 Amount of continuous viewing time (which indicates level of interest).
As a result of a deep-dive analysis of these dimensions, Netflix can predict with considerable confidence which movie a user is likely to watch over the weekend. This is the essence of what we call Code Halo™ thinking, in which companies make meaning from the intersection of digital code that surrounds people, processes, organizations and things. (To learn more, read our book, Code Halos: How the Digital Lives of People, Things, and Organizations are Changing the Rules of Business.) [2]
Target and Context Search
Two types of advanced search capabilities have emerged, each of which must be handled differently by the engine.
•	 Target search: Here, the user knows what he is seeking, and the engine must return specific, unambiguous results. Even if the user misspells the search term, the search engine needs to return the right results, with supporting facts and recommendations, particularly if the user is looking for a side suggestion.
•	 Context search: In this case, the user is searching with terms, questions or a fuzzy query, and the engine needs to combine tf-idf with user context to return personalized results. The user should find what he is looking for within the first five results; more than 75% of clicks go to the top five results (see Figure 2).
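One way a target search could tolerate a misspelled term is to map the query to the closest known term before searching. The sketch below uses get_close_matches from Python’s standard difflib module; the KNOWN_TERMS vocabulary, the correct_query helper and the 0.8 similarity cutoff are assumptions chosen for illustration, and production engines typically use more elaborate spelling models.

```python
import difflib

# Hypothetical vocabulary of terms the engine already indexes.
KNOWN_TERMS = ["driving license", "samsung phone", "collaborative filtering"]

def correct_query(query, vocabulary=KNOWN_TERMS, cutoff=0.8):
    """Return the closest known term, or the query unchanged if none is close."""
    matches = difflib.get_close_matches(query.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else query

corrected = correct_query("drivng license")
unchanged = correct_query("quantum physics")
```

A query with no sufficiently similar known term is passed through untouched, so the correction step never blocks a legitimate new search.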
Figure 2: Result Ranking and Click-Through Rate
[Chart: click-through rate and cumulative click-through rate, on a scale of 0 to 100%, for each of the top 10 results.]
A search engine results page is the listing of results returned by a search engine in response to a keyword query. The results normally include a list of items with titles, a reference to the full version, and a short description showing where the keywords have matched content within the page.
Each page of search engine results usually contains 10 organic listings. The listings on the first page are the most important ones, because those get 91% of the click-through rates from a particular search. The chart shows the click-through rates for the first page according to a 2013 study by Chitika.
Source: Search Engine Results Page entry, https://en.wikipedia.org/wiki/Search_engine_results_page
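An organization can compute the same kind of cumulative curve for its own engine from a click log. The sketch below assumes the log is simply a list of the result positions that users clicked; the function name and the sample log are hypothetical and are not data from the Chitika study.

```python
from collections import Counter

def cumulative_click_share(clicked_positions, max_rank=10):
    """Return the cumulative share of clicks for ranks 1..max_rank."""
    counts = Counter(p for p in clicked_positions if 1 <= p <= max_rank)
    total = sum(counts.values())
    shares, running = [], 0
    for rank in range(1, max_rank + 1):
        running += counts[rank]
        shares.append(running / total if total else 0.0)
    return shares

# Hypothetical click log: the rank of each clicked result.
clicks = [1, 1, 1, 2, 1, 3, 2, 1, 5, 8]
shares = cumulative_click_share(clicks)
top_five_share = shares[4]  # share of clicks that went to ranks 1 through 5
```

Tracking the top-five share over time gives a simple check on whether users are finding what they need near the top of the result set.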
User Interaction on the Search Engine
After the user logs into the search engine and enters a search term, the result set is returned. What the user does next reveals much about the engine:
•	 Refines/redefines the search term and
searches again.
•	 Narrows down the result set.
•	 Browses the result set, clicks on it for additional
details and then looks at other results.
•	 Browses the result set and clicks on the results
that satisfy his search.
•	 Browses the top result set, clicks on it and
obtains his required information.
•	 Finds his information just after submitting the
search term.
Although the differences among these behaviors may seem small, each one has a very different meaning. Let’s explore further.
•	 Refines/redefines the search term: The user entered the search term and received irrelevant results. This means the search engine did not help the user and tf-idf has not been implemented properly, forcing the user to refine or redefine the search term to find what he is looking for.
•	 Narrows down the result set: The user received results that match the term entered but also results in which he is not interested. To clean up the result set, the user tries to remove unwanted results. Once he has the refined results, he can examine the returned documents. In this case, tf-idf is working to its potential; however, there is no user context. Narrowing down results means the user is trying to set the right search context.
•	 Clicks for more details: The user received relevant results; however, the information shown with each result is insufficient. To check whether a result includes the right details, the user must click and read it. In this case, tf-idf is working to its potential with some user context.
•	 Browses result set: The user received all relevant results on the first page with the right supporting details; however, the best result is not listed at the top, so he has to browse all the results to get to the right one. In this case, tf-idf is working to its potential with user context, but the user context is not fully optimized.
•	 Browses top result set: The user found what he needed in the top results, with all the right supporting details. In this case, tf-idf is working to its potential with fully optimized user context.
•	 Finds information immediately: The user gets his answer as soon as he submits the search term, with all the required supporting details. This is especially applicable to entity searches. Here too, tf-idf is working to its potential with fully optimized user context.
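The mapping from observed behavior to diagnosis can be captured in a small lookup table, which makes it easy to summarize a session log. The sketch below is one possible encoding: the Interaction names, the wording of the context labels and the most_common_diagnosis helper are hypothetical. For the first behavior the text does not state the level of user context, so it is labeled "not assessed" here.

```python
from collections import Counter
from enum import Enum

class Interaction(Enum):
    REFINES_QUERY = 1
    NARROWS_RESULTS = 2
    CLICKS_FOR_DETAILS = 3
    BROWSES_RESULTS = 4
    USES_TOP_RESULTS = 5
    FINDS_IMMEDIATELY = 6

# Diagnosis per behavior: (is tf-idf working?, state of user context).
DIAGNOSIS = {
    Interaction.REFINES_QUERY: (False, "not assessed"),
    Interaction.NARROWS_RESULTS: (True, "none"),
    Interaction.CLICKS_FOR_DETAILS: (True, "some"),
    Interaction.BROWSES_RESULTS: (True, "present, not fully optimized"),
    Interaction.USES_TOP_RESULTS: (True, "fully optimized"),
    Interaction.FINDS_IMMEDIATELY: (True, "fully optimized"),
}

def most_common_diagnosis(session_log):
    """Return the most frequent behavior in a log together with its diagnosis."""
    behavior, _count = Counter(session_log).most_common(1)[0]
    return behavior, DIAGNOSIS[behavior]

# Hypothetical log of one user's recent interactions.
log = [
    Interaction.NARROWS_RESULTS,
    Interaction.NARROWS_RESULTS,
    Interaction.CLICKS_FOR_DETAILS,
]
behavior, (tfidf_ok, context_state) = most_common_diagnosis(log)
```

In this sample log the dominant behavior is narrowing the result set, which points to working tf-idf but missing user context.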
How to Gather User Context
Every time a user logs into a search engine, he adds more data to his digital imprint, or Code Halo. This data can be useful for his subsequent searches, for other users’ searches and for gauging the popularity of the content. When a user is new to the search engine, there is no user context.
Based on the user’s interaction with the search engine, a machine learning algorithm creates user context that factors into his search pattern. When the user profile is new or inconclusive, the safest bet is to return results based on the tf-idf. Once the search engine has enough details to infer context, the result set can be personalized to address user interest. This not only ensures that more relevant results are returned, but it also saves unnecessary user effort.
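One simple way to express this fallback is to blend the tf-idf score with a context-based score, giving the context score a weight that starts at zero for a new user and grows as interaction history accumulates. The linear ramp, the saturation point of 50 interactions and the maximum weight of 0.5 below are assumptions chosen for illustration only.

```python
def personalization_weight(n_interactions, saturation=50, max_weight=0.5):
    """Grow from 0 (new user) to max_weight as interaction history accumulates."""
    return max_weight * min(1.0, n_interactions / saturation)

def blended_score(tfidf_score, context_score, n_interactions):
    """Combine tf-idf relevance with a context score, weighted by history size."""
    w = personalization_weight(n_interactions)
    return (1.0 - w) * tfidf_score + w * context_score

new_user = blended_score(0.8, 0.2, n_interactions=0)       # pure tf-idf
known_user = blended_score(0.8, 0.2, n_interactions=100)   # half tf-idf, half context
```

For a brand-new user the result is exactly the tf-idf score, which matches the "safest bet" described above; the context score only gains influence once there is history to support it.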
Building user context is a continuous learning process. For some users, there may not initially be a conclusive pattern, but over time, a specific pattern may develop that yields user context. In other cases, the engine needs to unlearn some patterns and create a new pattern with updated user context (see Figure 3).
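Unlearning can be approximated by letting older interests fade unless they are reinforced. The sketch below keeps a weight per topic and applies exponential decay on every new observation; the update_interests function, the topic names and the decay factor of 0.9 are hypothetical choices, not a description of any particular engine’s learning algorithm.

```python
def update_interests(profile, observed_topic, decay=0.9):
    """Decay every stored interest, then reinforce the topic just observed."""
    updated = {topic: weight * decay for topic, weight in profile.items()}
    updated[observed_topic] = updated.get(observed_topic, 0.0) + (1.0 - decay)
    return updated

# A user whose history is entirely legal research starts searching for music.
profile = {"legal": 1.0}
for _ in range(30):
    profile = update_interests(profile, "music")
```

After 30 music searches the weight on "legal" has decayed to 0.9 to the power of 30 (about 0.04) and "music" dominates the profile, so the old pattern is effectively unlearned without being deleted outright.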
Context can be drawn from the user’s previous searches, as well as from users with similar interaction patterns.
•	 User’s previous searches: The safest way of
creating user context is to refer to previous user
behavior with searches and intuit a research
path. Once the user context is clear, the engine provides relevant results and suggestions. For example, iTunes applies Code Halo thinking: it relies on the user’s context to recommend new music based on what she already has or likes.
•	 Users with similar interaction patterns: This model uncovers similar users’ research patterns and applies those learnings to return more relevant results (i.e., collaborative filtering [3]). For example, an “other topics you might be interested in” feature is based on other users’ similar patterns. Not all search engines can use this approach, especially paid engines, whose users may object to others benefiting from their search patterns. It should therefore be applied on a case-by-case basis; for example, while it should be avoided in legal or scientific research, it can be very useful for music streaming websites.
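A minimal sketch of user-based collaborative filtering follows. It assumes each user’s history is a set of topics, measures similarity between users with the Jaccard index and suggests topics that similar users viewed but the target user has not. The function names and the sample histories are hypothetical; real systems work with far larger data and more refined similarity measures.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, histories, top_n=3):
    """Suggest topics seen by users similar to `target`, weighted by similarity."""
    seen = histories[target]
    scores = {}
    for user, topics in histories.items():
        if user == target:
            continue
        similarity = jaccard(seen, topics)
        if similarity == 0.0:
            continue  # users with nothing in common contribute no suggestions
        for topic in topics - seen:
            scores[topic] = scores.get(topic, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Hypothetical listening histories for a music streaming site.
histories = {
    "ana": {"jazz", "blues", "soul"},
    "ben": {"jazz", "blues", "funk"},
    "cy": {"metal", "punk"},
    "dee": {"jazz", "soul", "funk", "swing"},
}
suggestions = recommend("ana", histories)
```

Here "funk" is suggested first because two similar users listened to it, followed by "swing"; the user with no overlapping history has no influence on the suggestions.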
Even if no specific user context is obtained, context can be derived from the popularity of the content.
Looking Ahead
General search engines take the user’s search term, remove the noise, apply tf-idf and return the result set in relevance order. The result set may or may not be useful to the user, because relevance depends on user context. To make search results more relevant, information and media organizations need to:
•	 Measure the effectiveness of the search engine.
•	 Understand the user’s search pattern.
•	 Relate the user pattern to specific dimensions
of the user context.
•	 Apply the user context to personalize the
results for the user.
These steps should form a continuously evolving process. This will help organizations return personalized results and deliver content that improves overall satisfaction with the search engine experience.
Note: Code Halo is a trademark of Cognizant
Technology Solutions.
Figure 3: The User Context Learning Loop
[Diagram: a loop centered on user context learning, with the stages Measure Effectiveness, Improve Personalization, Machine Learning and User Pattern.]
Footnotes
[1] Wikipedia entry on Apple vs. Samsung, https://en.wikipedia.org/wiki/Apple_Inc._v._Samsung_Electronics_Co.
[2] Code Halos: How the Digital Lives of People, Things and Organizations are Changing the Rules of Business, by Malcolm Frank, Paul Roehrig and Ben Pring, published by John Wiley & Sons, April 2014.
[3] Wikipedia entry on collaborative filtering, https://en.wikipedia.org/wiki/Collaborative_filtering.
About Cognizant
Cognizant (NASDAQ: CTSH) is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world’s leading companies build stronger businesses. Headquartered in Teaneck, New Jersey (U.S.), Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise, and a global, collaborative workforce that embodies the future of work. With over 100 development and delivery centers worldwide and approximately 219,300 employees as of September 30, 2015, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000, and the Fortune 500 and is ranked among the top performing and fastest growing companies in the world. Visit us online at www.cognizant.com or follow us on Twitter: Cognizant.
© Copyright 2015, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners. TL Codex 1373
About the Author
Utsav Joshi is an Associate Director within Cognizant’s Information, Media & Entertainment business unit. He has more than 16 years of service delivery experience, with a key focus on global project and service delivery management and an emphasis on architecture and delivery management in North America. He is currently responsible for program and solution delivery of an online legal research application developed by a leading U.S.-based legal information services provider. He has successfully delivered large critical programs for Fortune 500 organizations across multiple segments. He holds a B.E. (electrical) and is a TOGAF and AWS certified professional. He can be reached at Utsav.Joshi@cognizant.com.

 
Top 9 Search-Driven Analytics Evaluation Criteria
Top 9 Search-Driven Analytics Evaluation CriteriaTop 9 Search-Driven Analytics Evaluation Criteria
Top 9 Search-Driven Analytics Evaluation Criteria
 
IRJET- Search Engine Optimization (Seo)
IRJET-  	  Search Engine Optimization (Seo)IRJET-  	  Search Engine Optimization (Seo)
IRJET- Search Engine Optimization (Seo)
 
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
 
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
Apple Search Ads - Keyword expansion with discovery campaigns & Performance b...
 
iCTRE: The Informal community Transformer into Recommendation Engine
iCTRE: The Informal community Transformer into Recommendation EngineiCTRE: The Informal community Transformer into Recommendation Engine
iCTRE: The Informal community Transformer into Recommendation Engine
 
The beginners guide to SEO
The beginners guide to SEOThe beginners guide to SEO
The beginners guide to SEO
 
One Stop Recommendation
One Stop RecommendationOne Stop Recommendation
One Stop Recommendation
 
One Stop Recommendation
One Stop RecommendationOne Stop Recommendation
One Stop Recommendation
 

More from Cognizant

Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Cognizant
 
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingData Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingCognizant
 
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesIt Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesCognizant
 
Intuition Engineered
Intuition EngineeredIntuition Engineered
Intuition EngineeredCognizant
 
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...Cognizant
 
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesEnhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesCognizant
 
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateThe Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateCognizant
 
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...Cognizant
 
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Cognizant
 
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Cognizant
 
Green Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityGreen Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityCognizant
 
Policy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersPolicy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersCognizant
 
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalThe Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalCognizant
 
AI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueAI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueCognizant
 
Operations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachOperations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachCognizant
 
Five Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudFive Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudCognizant
 
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedGetting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedCognizant
 
Crafting the Utility of the Future
Crafting the Utility of the FutureCrafting the Utility of the Future
Crafting the Utility of the FutureCognizant
 
Utilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformUtilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformCognizant
 
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...Cognizant
 

More from Cognizant (20)

Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
 
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingData Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
 
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesIt Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
 
Intuition Engineered
Intuition EngineeredIntuition Engineered
Intuition Engineered
 
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
 
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesEnhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
 
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateThe Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
 
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
 
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
 
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
 
Green Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityGreen Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for Sustainability
 
Policy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersPolicy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for Insurers
 
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalThe Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
 
AI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueAI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to Value
 
Operations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachOperations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First Approach
 
Five Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudFive Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the Cloud
 
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedGetting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
 
Crafting the Utility of the Future
Crafting the Utility of the FutureCrafting the Utility of the Future
Crafting the Utility of the Future
 
Utilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformUtilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data Platform
 
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
 

Personalizing Search

By applying user context and uncovering essential information, search engines can deliver a more rewarding experience, resulting in more digital revenue for the organization.

• Cognizant 20-20 Insights

Executive Summary

Billions of searches are performed by millions of users every day. Although the numerous search engines that are currently available work well on basic keyword searches, they often fall short when it comes to advanced research, especially in meeting the requirements of information and media services providers. Regardless of individual user needs, the search engine typically provides the same results to all.

This white paper highlights improvements that can be made to the relevance of search engine results, as well as how information and media services providers can personalize search returns based on each user’s interests and requirements. It also highlights the different dimensions of personalization, and recommends ways to measure and collect insights against these dimensions.

Search Engine Results Relevancy

When gauging search engine effectiveness, two key questions must be answered:

• What should the result set consist of?
• What should the sort order of the returns be?

Search engines generally use a variation of the tf-idf (term frequency–inverse document frequency) algorithm for search and relevance (see Quick Take). This means that if a user searches for “the driving license,” the engine takes two steps based on term frequency:

1. Find all documents with “the driving license” or a variation of that term.
2. Sort document returns, displaying those with the highest number of term mentions at the top and those with the fewest mentions at the bottom.

Because “the” is a common or “noise” term in the phrase, and the other two words (driving and license) are more specific, driving/driver/drive and license/licensing/licensed will be given priority for relevance.
This is called idf (or inverse document frequency). The search engine returns all documents with “driving license” and variations, ignoring “the.” But the question remains: Is this sufficient for relevant search results?

Beyond Basic Search

Returning the result set based on the search term is enough for basic search engines but not for advanced research engines. Along with tf-idf, user context is the most important factor to consider. In the above example, the search engine will return all documents with the term “driving license”; however, top results might relate to applying for a license in California, which is not useful if the user’s location is outside of that state. As a result, top returned results are relevant only for a very small portion of the user base.

cognizant 20-20 insights | december 2015

If the search engine records and analyzes the user’s previous searches, however, then it would be able to apply additional context to what the user is seeking and thus provide more relevant results. In the above case, if user context is factored in, the search engine would put a California result at the top for a California resident, a New York result at the top for a New York resident and an Alaska result at the top for an Alaska resident.

User context typically consists of the following dimensions:

• Geographic location.
• Area of interest.
• Profession and economic background.
• Gender and age group.

Another example of how user context can be useful is when a user located in a particular state searches for an item, say a Samsung phone. Even when the results are returned with Samsung phones on top, not all of the returns may be relevant, for the following reasons:

• Geographic location: The user wants a localized search. While thousands of stores sell Samsung phones in the U.S., he is interested in just the four to five nearby outlets.
• Area of interest: Responses should be different for someone interested in news vs. one seeking technical details.
• Profession and economic background: Responses should also consider the user’s profession and economic background. If a lawyer is conducting a search, for example, he might be seeking updates on an Apple vs. Samsung lawsuit[1] and be uninterested in Samsung stores and reviews.
• Gender and age group: Not only does interest level vary based on gender and age group; so does the relevance of results. A younger user might be seeking the best games on a Samsung phone, and which games are best differs based on age and gender.
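The geographic dimension described above can be illustrated with a small re-ranking sketch. It is only an illustration under stated assumptions: the `Doc` and `UserContext` types, the `rerank` function and the `boost` value are hypothetical names introduced here, and the relevance scores stand in for whatever the engine's term-based ranking produces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    title: str
    relevance: float             # term-based score, assumed to be given
    state: Optional[str] = None  # state the document applies to, if any


@dataclass
class UserContext:
    state: Optional[str] = None  # geographic dimension of the user context


def rerank(docs: List[Doc], user: UserContext, boost: float = 2.0) -> List[Doc]:
    """Sort by relevance, boosting documents that match the user's state."""
    def score(doc: Doc) -> float:
        if user.state is not None and doc.state == user.state:
            return doc.relevance * boost
        return doc.relevance
    # sorted() is stable, so documents with equal scores keep their order.
    return sorted(docs, key=score, reverse=True)


docs = [
    Doc("Applying for a driving license in California", 0.9, "CA"),
    Doc("Applying for a driving license in New York", 0.8, "NY"),
    Doc("Applying for a driving license in Alaska", 0.7, "AK"),
]
top_for_ny = rerank(docs, UserContext(state="NY"))[0]
top_without_context = rerank(docs, UserContext())[0]
```

With no known location the ordering falls back to the term-based scores alone, which mirrors the behavior of a basic engine that returns the same results to everyone.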
While some user dimensions, such as gender, do not change over time, others, such as location, profession or area of interest, can vary and must be tracked and updated regularly. This is where some search engines excel over others. For example, Netflix is preferred over other movie services because its search engine collects and acts on user context dimensions obtained through the user’s profile, which includes data on:

• Age group.
• Ratings given to movies viewed.
• Number of times content is viewed (which provides area of interest and content popularity).
• Previous selections in a genre and browse history (which provides area of interest and ease of finding content).
• Viewing location.
• Amount of continuous-viewing time spent (which provides level of interest).

As a result of a deep-dive analysis of these dimensions, Netflix can predict which movie the user is likely to watch during the weekend. This is the essence of what we call Code Halo™ thinking, in which companies make meaning from the intersection of digital code that surrounds people, processes, organizations and things. (To learn more, read our book, Code Halos: How the Digital Lives of People, Things, and Organizations are Changing the Rules of Business.)[2]

Quick Take: Understanding the tf-idf Algorithm

The following are variations of tf, idf or tf-idf that are used for relevance. Here f(t,d) is the number of times term t occurs in document d, N is the number of documents in the collection D, n(t) is the number of documents that contain t, and each max is taken over all terms t'.

Simple tf–idf calculation

tfidf(t,d,D) = tf(t,d) * idf(t,D)

Variants of tf weight

Weighting scheme            tf weight
Binary                      0, 1
Raw frequency               f(t,d)
Log normalization           log(1 + f(t,d))
Double normalization 0.5    0.5 + 0.5 * f(t,d) / max f(t',d)
Double normalization K      K + (1 - K) * f(t,d) / max f(t',d)

Variants of idf weight

Weighting scheme                   idf weight
Unary                              1
Inverse frequency                  log(N / n(t))
Inverse frequency smooth           log(1 + N / n(t))
Inverse frequency max              log(1 + max n(t') / n(t))
Probabilistic inverse frequency    log((N - n(t)) / n(t))

Recommended weighting schemes

Weighting scheme    Document term weight
1                   f(t,d) * log(N / n(t))
2                   log(1 + f(t,d)) * log(1 + N / n(t))
3                   log(1 + f(t,d)) * log(N / n(t))

Source: tf-idf entry on Wikipedia (http://en.wikipedia.org/wiki/Tf%E2%80%93idf)

Target and Context Search

Two types of advanced search capabilities have emerged, each of which must be handled differently by the engine.

• Target search: Here, the user knows what he is seeking, and the engine must return specific results without any ambiguity. Even if the user misspells the search term, the search engine needs to return the right results, with supporting facts and recommendations, particularly if the user is looking for a side suggestion.
• Context search: In this case, the user is searching based on terms, questions or a fuzzy query, and the engine needs to combine tf-idf with user context to return personalized results. The user should find what he is looking for within the first five results. More than 75% of click-throughs go to the top-five results (see Figure 2).
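Context search starts from a tf-idf score before any user context is applied. The sketch below implements recommended weighting scheme 2 from the Quick Take, log(1 + f(t,d)) * log(1 + N / n(t)), over a toy corpus. The function names, the whitespace tokenizer and the three sample documents are assumptions made for illustration; real engines add stemming, noise-word handling and length normalization, all of which are omitted here.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def idf_smooth(term: str, docs: List[List[str]]) -> float:
    """Inverse frequency smooth: log(1 + N / n(t)); 0.0 if no document has the term."""
    n_t = sum(1 for doc in docs if term in doc)
    if n_t == 0:
        return 0.0
    return math.log(1 + len(docs) / n_t)


def score(query: str, doc: List[str], docs: List[List[str]]) -> float:
    """Sum of log(1 + f(t,d)) * log(1 + N / n(t)) over the query terms."""
    counts = Counter(doc)  # missing terms count as 0, so log(1 + 0) = 0
    return sum(
        math.log(1 + counts[t]) * idf_smooth(t, docs)
        for t in tokenize(query)
    )


corpus = [
    "the driving license office issues the license",
    "the weather report for the weekend",
    "the history of the bicycle",
]
docs = [tokenize(text) for text in corpus]
scores = [score("the driving license", d, docs) for d in docs]
best = max(range(len(docs)), key=scores.__getitem__)  # index of the top document
```

Because the smoothed idf never reaches zero, “the” still contributes a small amount to every document's score; the specific terms “driving” and “license” are what separate the first document from the other two.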
Result Ranking and Click-Through Rate

A search engine results page is the listing of results returned by a search engine in response to a keyword query. The results normally include a list of items with titles, a reference to the full version, and a short description showing where the keywords have matched content within the page. Each page of search engine results usually contains 10 organic listings. The listings on the first page are the most important ones, because those get 91% of the click-through rates from a particular search. According to a 2013 study by Chitika, the click-through rates for the first page are:

[Chart, top 10 results: click-through rate and cumulative click-through rate (0 to 100%) by result position, 1 to 10.]

Source: Search Engine Results Page entry, https://en.wikipedia.org/wiki/Search_engine_results_page
Figure 2
User Interaction on the Search Engine

After the user logs into the search engine and inputs a search term, the result set is returned. What the user does next reveals much about the engine:

• Refines/redefines the search term and searches again.
• Narrows down the result set.
• Browses the result set, clicks on a result for additional details and then looks at other results.
• Browses the result set and clicks on the results that satisfy his search.
• Browses the top result set, clicks on it and obtains his required information.
• Finds his information just after submitting the search term.

Although the differences in user behavior are exceptionally small, each response has a very different meaning. Let’s explore further.

• Refines/redefines the search term: When the user entered the search term, he received irrelevant results. This means the search engine did not help the user and tf-idf has not been implemented properly, so the user has to refine or redefine the search term to find what he is looking for.
• Narrows down the result set: The user received results based on the term entered but also received results in which he is not interested. To clean up the result set, the user tries to remove unwanted results. Once he has the refined results, he can examine the returned documents. In this case, tf-idf is working to its potential; however, there is no user context. Narrowing down results means the user is trying to set the right search context.
• Clicks for more details: The user received relevant results; however, the associated information is insufficient. To check whether each result includes the right details, the user must click and read it. In this case, tf-idf is working to its potential with some user context.
• Browses result set: The user received all relevant results on the first page and has the right supporting details; however, the best result is not among those listed at the top of the return, so he has to browse all the results to get to the right one. In this case, tf-idf is working to its potential with user context; however, user context is not fully optimized.
• Browses top result set: The user received his results in the top result set with all the right supporting details. In this case, tf-idf is working to its potential with fully optimized user context.
• Finds information immediately: The user gets his result upon submitting the search term. This is especially applicable for entity search results. The search result has all the required supporting details. In this case, tf-idf is working to its potential with fully optimized user context.

How to Gather User Context

Every time a user logs into a search engine, he adds more data to his digital imprint, or Code Halo. This can be useful for subsequent searches, for other users’ searches and for gauging the popularity of the content.

When a user is new to the search engine, there is no user context. Based on the user’s interaction with the search engine, a machine learning algorithm creates user context that factors into his search pattern. When the user profile is new or inconclusive, the safest bet is to return results based on tf-idf. Once the search engine has enough details to infer context, the result set can be personalized to address user interest. This not only ensures that more relevant results are returned, but it also saves unnecessary user effort.

User context is a continuous and ongoing learning process. For some users, there may not initially be a conclusive pattern, but over time, a specific pattern may develop that yields user context.
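The fallback described above (tf-idf alone for a new or inconclusive profile, personalization once enough context exists) can be sketched as a score whose personalization weight grows with the amount of observed behavior. The `min_events` threshold, the linear ramp, the `context_affinity` input in the range 0 to 1 and the function names are all assumptions for illustration, not something the approach prescribes.

```python
def personalization_weight(num_events: int, min_events: int = 20) -> float:
    """Return 0.0 for a brand-new profile, ramping linearly to 1.0 at min_events."""
    if min_events <= 0:
        raise ValueError("min_events must be positive")
    return min(max(num_events, 0) / min_events, 1.0)


def personalized_score(tfidf_score: float, context_affinity: float,
                       num_events: int, min_events: int = 20) -> float:
    """Scale the term-based score by how well the result fits the user context.

    With no recorded interactions the result equals the tf-idf score alone.
    """
    w = personalization_weight(num_events, min_events)
    return tfidf_score * (1 + w * context_affinity)


new_user = personalized_score(0.8, context_affinity=0.5, num_events=0)
learning_user = personalized_score(0.8, context_affinity=0.5, num_events=10)
known_user = personalized_score(0.8, context_affinity=0.5, num_events=40)
```

A gradual ramp avoids a sudden change in ranking on the day a profile crosses the threshold; a hard cutoff would also be consistent with the text.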
In other cases, an established pattern no longer fits, and the engine needs to unlearn it and create a new pattern with updated user context (see Figure 3).

Context can be drawn from the user’s previous searches, as well as from users with similar interaction patterns.

• User’s previous searches: The safest way of creating user context is to refer to the user’s previous search behavior and intuit a research path. Once there is clarity on user context, the engine provides relevant results and suggestions. For example, iTunes applies Code Halo thinking to help the user discover new music based on what she already has or likes. It relies on the user’s context and recommends new music.
• Users with similar interaction patterns: This model uncovers similar users’ research patterns and applies those learnings to return more relevant results (i.e., collaborative filtering[3]). For example, an “other topics you might be interested in” feature is based on other users’ similar patterns. This approach may not be used by all search engines, especially paid engines. Users may raise concerns that others benefit from their search pattern. This option should be used on a case-by-case basis; for example, while it should be avoided in legal or scientific research, it can be very useful for music streaming websites.

Even if no specific user context is obtained, context can be derived from the popularity of the content.

Looking Ahead

General search engines take the user’s entered search term, remove the noise, apply tf-idf and return the result set in order of relevance. The result set may or may not be useful to the user, and relevance is dependent on user context.

To make search results more relevant, information and media organizations need to:

• Measure the effectiveness of the search engine.
• Understand the user’s search pattern.
• Relate the user pattern to specific dimensions of the user context.
• Apply the user context to personalize the results for the user.

The above steps should be a living, breathing, continuously evolving process. This will help organizations return personalized results, and deliver content that improves overall satisfaction with the search engine experience.

Note: Code Halo is a trademark of Cognizant Technology Solutions.
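As one possible starting point for the first step listed under Looking Ahead, measuring effectiveness, the share of clicks that land on the top five results can be computed from a click log and tracked over time. The log format (one 1-based clicked rank per entry) and the function name are assumptions for illustration, and the sample ranks are made up.

```python
from typing import Iterable


def top_k_click_share(clicked_ranks: Iterable[int], k: int = 5) -> float:
    """Fraction of clicks whose 1-based result rank is at most k (0.0 for an empty log)."""
    ranks = list(clicked_ranks)
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if 1 <= r <= k) / len(ranks)


# Hypothetical click log: the rank of each clicked result.
sample_log = [1, 1, 2, 3, 7, 1, 5, 9, 2, 4]
share = top_k_click_share(sample_log)  # 8 of the 10 clicks are in the top five
```

A rising share after user context is applied would suggest that personalization is moving the right results toward the top of the list.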
The User Context Learning Loop

[Diagram: a loop labeled User Context Learning with four stages: Measure Effectiveness, Improve Personalization, Machine Learning, User Pattern.]

Figure 3

Footnotes

[1] Wikipedia entry on Apple vs. Samsung, https://en.wikipedia.org/wiki/Apple_Inc._v._Samsung_Electronics_Co.
[2] Code Halos: How the Digital Lives of People, Things and Organizations are Changing the Rules of Business, by Malcolm Frank, Paul Roehrig and Ben Pring, published by John Wiley & Sons, April 2014.
[3] Wikipedia entry on collaborative filtering, https://en.wikipedia.org/wiki/Collaborative_filtering.
About Cognizant

Cognizant (NASDAQ: CTSH) is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world’s leading companies build stronger businesses. Headquartered in Teaneck, New Jersey (U.S.), Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise, and a global, collaborative workforce that embodies the future of work. With over 100 development and delivery centers worldwide and approximately 219,300 employees as of September 30, 2015, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000, and the Fortune 500 and is ranked among the top performing and fastest growing companies in the world. Visit us online at www.cognizant.com or follow us on Twitter: Cognizant.

World Headquarters
500 Frank W. Burr Blvd.
Teaneck, NJ 07666 USA
Phone: +1 201 801 0233
Fax: +1 201 801 0243
Toll Free: +1 888 937 3277
Email: inquiry@cognizant.com

European Headquarters
1 Kingdom Street
Paddington Central
London W2 6BD
Phone: +44 (0) 20 7297 7600
Fax: +44 (0) 20 7121 0102
Email: infouk@cognizant.com

India Operations Headquarters
#5/535, Old Mahabalipuram Road
Okkiyam Pettai, Thoraipakkam
Chennai, 600 096 India
Phone: +91 (0) 44 4209 6000
Fax: +91 (0) 44 4209 6060
Email: inquiryindia@cognizant.com

© Copyright 2015, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.

TL Codex 1373

About the Author

Utsav Joshi is an Associate Director within Cognizant’s Information, Media & Entertainment business unit. He has more than 16 years of service delivery experience, with a key focus on global project and service delivery management and an emphasis on architecture and delivery management in North America. He is currently responsible for program and solution delivery of an online legal research application developed by a leading U.S.-based legal information services provider. He has successfully delivered large critical programs for Fortune 500 organizations across multiple segments. He holds a B.E. (electrical) and is a TOGAF and AWS certified professional. He can be reached at Utsav.Joshi@cognizant.com.