Tutorial on personalization techniques. Covers user profile creation, document modeling techniques (including LSI and PLSI) and the use of semantics in personalization
4. Why Personalization ?
• Scale of the web is limiting its utility
• There is too much information
• Consumer has to do all the work to use the web
• Search engines and portals provide the same results for
different personalities, intentions and contexts
• Personalization can be the solution
• Customize the web for individuals by
• Filtering out irrelevant information
• Identifying relevant information
4 22 January 2008
5. Some quotes from NY bits
•I am married with a house. Why do I see so many
ads for online dating sites and cheap mortgages?
Should I be happy that I see those ads? It means
Internet advertisers still have no idea who I am.
6. Personalization
• Goal – Provide users what they need without requiring
them to ask for it explicitly
• Steps
• Generate useful, actionable knowledge about users
• Use this knowledge for personalizing an application
• User centric data model – Data must be attributable to
specific user
• Two kinds
• Business Centric : Amazon, Ebay
• Consumer Centric
• Personalization requires User Profiling
7. Applications of Personalization
• Interface Personalization
• E.g. Go directly to the web page of interest instead of
site home page
• Content personalization
• Filtering (News, blog articles, videos etc)
• Ratings based recommendations
• Amazon, Stumbleupon
• Search
• Text, images, stories, research papers
• Ads
• Service Personalization
8. Why is personalization hard ?
• Server side personalization – Sites do not see all
data
• E.g. a user might visit Expedia and Orbitz; Expedia doesn't know what the user did on Orbitz
• Difficult to get user context
• User needs to agree to cookies or login
• Site profiles are not portable
• Some standards are emerging (Attention profile markup
language)
• Privacy
9. Personalization example 1 (Routing
queries)
[Figure: routing queries — Google Alerts and the Google News page]
15. Outline
• User Profile creation
• Profile Privacy
• Evaluating and managing user profiles
• Personalizing search
16. User profile information
• Two kinds of information
• Factual (Explicit)
• Behavioral (Implicit)
• Factual – Geographic, Demographic,
Psychographic information
• E.g. age is 25 years, searched for Lexus, lives in Bangalore
• Behavioral – Describes behavioral activities (visits
finance sites, buys gadgets)
17. Client side versus Server side profiles
Server side
• Has queries and clickstreams from multiple users
• Doesn't see all the user data
• No way for users to aggregate and reuse the profiles that different websites (Google, Yahoo, ..) build using their data
• Privacy is a big problem
• Server cycles have to be shared; however, some computations can be done once and reused
Client side
• No access to clickstreams of multiple users
• Sees all user data
• Possible for user to aggregate and reuse their attentional information
• Strong privacy model
• Can access the full compute power at the client
18. Desired profile characteristics
• Represent multiple interests
• Adapt to changing user interests
• Incorporate contextual information
19. Using User profiles to personalize services
[Diagram: explicit and implicit info → Data Collection → Profile Constructor → User Profile → Profile-to-Content Matching (against content such as search queries, news, video) → personalized services for the user]
Diagram adapted from Gauch et al., Chapter 2, The Adaptive Web, Springer LNCS 4321
20. User Profiling approaches
• Broadly two approaches
• IR approach
• User interests derived from text (documents/search
queries)
• Machine learning approach
• Model user based on positive and negative examples of
his interests
• Problems
• Getting labeled samples
• High dimensional feature space
21. Profile building Steps
• Authenticate the user
• Select information to build profile from and archive the information if necessary (e.g. web pages might get flushed from the IE cache)
• Build/Refresh/Expand/Prune the profile
• Use it in an application
• Evaluate the profile
22. Authenticating the user
• Users need to be authenticated in order to attribute
data to a particular user for profile creation
• Identifying single user
• Login
• Cookies
• IP address (when it is static)
• Identifying different users on same machine
• Login
• Biometrics
23. Explicit user information collection
• Ask the user for
• static information
• Name, age, residence location, hobbies, interests etc
• Google personalization – found explicit information to be noisy
• People specified literature as one of their interests but did not make
a single related search
• Matchmine – presents examples (movies, TV shows, music, blog topics) and asks the users to explicitly rate them
• Ratings
• Netflix, Stumbleupon (thumbs up/down)
• In general, people do not like to give explicit information
frequently
• Recent research (Jian Hu, WWW 2007) showed good results for gender and age prediction based on users' browsing behavior
25. Implicit user information collection
• Data sources
• Web pages, documents, search queries, location
• Information from applications (Media players, Games)
• Data collection techniques
• Desktop based
• Browser cache
• Proxy servers
• Browser plugins
• Server side
• Web logs
• Search logs
26. How much implicit info to use ?
• Teevan (SIGIR 2005) constructed two profiles
• One with only search queries
• Other using all information on desktop
• Findings
• Richer information => better profile
• All docs better than only recent docs better than only
web pages better than only search queries better than
no personalization
• Drawback with implicit info – cannot collect info
about user dislikes
27. Stereotypes
• Generalizations from communities of users
• Characteristics of group of users
• Stereotypes alleviate the bootstrap problem
• Construction of stereotypes
• Manual – e.g. Bangalore user will be interested in IT
• Automatic method
• Clustering – Similar profiles are clustered and common
characteristics extracted
28. How Acxiom delivers personalized ads
(source - WSJ)
• Acxiom has accumulated a database of about 133 million households and
divided it into 70 demographic and lifestyle clusters based on information
available from public sources.
• A person gives one of Acxiom’s Web partners his address by buying
something, filling out a survey or completing a contest form on one of the
sites.
• Acxiom checks the address against its database and places a “cookie,” or
small piece of tracking software, embedded with a code for that person’s
demographic and behavioral cluster on his computer hard drive.
• When the person visits an Acxiom partner site in the future, Acxiom can use
that code to determine which ads to show
• Through another cookie, Acxiom tracks what consumers do on partner Web
sites
29. Profile representation
• Bag of words (BOW)
• Use words in user documents to represent user interests
• Issues
• Words appear independent of page content (“Home”, “page”)
• Polysemy (word has multiple meanings e.g. bank)
• Synonymy (multiple words have same meanings e.g. joy, happiness)
• Large profile sizes
• Concepts (e.g. DMOZ)
• Use existing ontology maintained for free
• Issues
• Too large (about 6 lakh, i.e. 600,000, DMOZ nodes); the ontology has to be drastically pruned for use
• Need to build classifiers for each DMOZ node
30. Word based term vector profiles
• Profile represented as sets of words tf*idf weighted
• Could use one long profile vector or different
vectors for different topics (sports, health, finance)
• Documents converted to same representation,
matched with keyword vectors using cosine
similarity
• Should take structure of the document into account
(ignore html tags, email header vs body)
31. Word based hierarchical profiles
User Profile:10
  Research:5
    IR:3
      Search:2
      ...
    DB:2
  Sports:3.5
    Soccer:2
    Others:1.5
  Sex:1.5
(The number after each node is the support of that interest.)
Support decreases from high to low level, and from left to right
We are thankful to Yabo (Arber) Xu from Simon Fraser University for kindly allowing us to use slides numbered 31, 37, 38, 39 from his WWW 07 presentation.
32. Building word based hierarchical
profiles
• Build a (word, document) map for each word occurring in the corpus
• Order words by amount of support
• Support of a word = number of documents in which
word appears
• For each word
• Decide whether to merge with another word (using some
measure of similarity)
• Decide whether to make one word the child of other
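The support computation above can be sketched as follows; the corpus and words are made-up illustrations, not the slides' data:

```python
from collections import defaultdict

# Toy corpus standing in for the user's document collection (hypothetical data).
docs = {
    "d1": ["search", "ir", "ranking"],
    "d2": ["ir", "search", "index"],
    "d3": ["soccer", "goal"],
}

# Build a (word -> set of documents) map.
word_docs = defaultdict(set)
for doc_id, words in docs.items():
    for w in words:
        word_docs[w].add(doc_id)

# Support of a word = number of documents in which it appears.
support = {w: len(d) for w, d in word_docs.items()}

# Order words by decreasing support before the merge / parent-child decisions.
ordered = sorted(support, key=support.get, reverse=True)

assert support["search"] == 2 and support["soccer"] == 1
assert ordered[0] in ("search", "ir")  # the two support-2 words come first
```

The merge and parent-child decisions themselves are covered on the next slide.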
34. Term similarity and Parent-child terms
• Words that cover the same document sets are similar
• Jaccard measure
Sim(w1, w2) = |D(w1) ∩ D(w2)| / |D(w1) ∪ D(w2)|
• Parent child terms
• A specific term is a child of a more general term if it frequently occurs
with a general term (but the reverse is not true)
• Word w2 is taken as child of term w1 if P(w1|w2) > some_threshold
• e.g. Terms “Soccer” and “Badminton” might co-occur with the term
“Sport” but not the other way around
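Both tests can be sketched in a few lines; the per-word document sets and the threshold value below are hypothetical:

```python
# Hypothetical document sets for each word; D[w] = set of documents containing w.
D = {
    "sport":     {"d1", "d2", "d3", "d4"},
    "soccer":    {"d1", "d2"},
    "badminton": {"d3"},
}

def jaccard(w1, w2):
    """Sim(w1, w2) = |D(w1) ∩ D(w2)| / |D(w1) ∪ D(w2)|"""
    return len(D[w1] & D[w2]) / len(D[w1] | D[w2])

def is_child(w2, w1, threshold=0.8):
    """w2 is a child of w1 if P(w1 | w2) = |D(w1) ∩ D(w2)| / |D(w2)| > threshold."""
    return len(D[w1] & D[w2]) / len(D[w2]) > threshold

assert jaccard("sport", "soccer") == 0.5
assert is_child("soccer", "sport")        # soccer always co-occurs with sport
assert not is_child("sport", "soccer")    # but not the other way around
```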
35. Personalization and Privacy
• Studies have shown that
• People are comfortable sharing preferences (favourite TV
show, snack etc.), demographic and lifestyle information
• People not comfortable sharing financial and purchase
related information
• Facebook fiasco because of reporting “Your friends bought …”
• Financial rewards (even small amounts) encourage
disclosure
• People parted with valuable information for Singapore
$15
37. What and How much to Reveal? - 1
User Profile:10
  Research:5
    IR:3
      Search:2
      ...
    DB:2
  Sports:3.5
    Soccer:2
    Others:1.5
  Sex:1.5
(Interests lower in the hierarchy are more specific; some interests, e.g. Sex, are more sensitive.)
Manual Option – Absolute privacy guarantee, but requires a lot of user
intervention
38. What and How much to Reveal? - 2
User Profile U → indicator of a user's possible interests
Term t → indicator of a possible interest, with P(t) = Sup(t)/|D|
The amount of information for an interest t:
I(t) = log(1/P(t)) = log(|D|/Sup(t))
→ an indication of the specificity and sensitivity of an interest
H(U) – the amount of information carried by U:
H(U) = Σ_t P(t) × I(t)
Two privacy parameters:
• minDetail – protect t with P(t) < minDetail
• expRatio = H(U_exp)/H(U)
The more detail we expose, the higher the expRatio.
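These quantities are straightforward to compute. The sketch below uses the supports from the example profile tree with an assumed |D| = 10; natural log is used, which does not affect the ratio:

```python
import math

# Hypothetical profile: term -> support, with |D| = 10 documents,
# mirroring the profile tree used on these slides.
D_size = 10
support = {"Research": 5, "Sports": 3.5, "Sex": 1.5, "IR": 3, "DB": 2, "Soccer": 2}

def P(t):
    return support[t] / D_size

def I(t):
    return math.log(1 / P(t))   # specificity/sensitivity of interest t

def H(terms):
    return sum(P(t) * I(t) for t in terms)

min_detail = 0.3
# Protect every t with P(t) < minDetail; expose the rest.
exposed = [t for t in support if P(t) >= min_detail]
exp_ratio = H(exposed) / H(support)

assert all(P(t) >= min_detail for t in exposed)
assert 0 < exp_ratio <= 1   # exposing less detail lowers the ratio
```

Raising min_detail shrinks the exposed set and drives exp_ratio down, which is exactly the privacy/personalization trade-off the slide describes.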
39. What and How much to Reveal? - 3
On the example profile above:
• minDetail = 0.5 → expRatio = 44%
• minDetail = 0.3 → expRatio = 69%
The minDetail and expRatio parameters allow a balance between privacy and personalization.
40. Profile portability
• Move the profile to a central server
• Claria PersonalWeb, Google-Yahoo-Microsoft
• Provision to delete search queries, visited pages
• No control over which part of the profile can be used
• Have a client side component that reconstructs the
profile on the client using server side info
(Matchmine)
• Attention Profile markup language
• Allows explicit and implicit information to be stored (as
XML) and provided to web services
42. Application-independent evaluation of
the profile
• Stability
• Number of profile elements that do not change over the evaluation
cycle
• Precision
• How many items in the profile does the user agree with as
representative of his interests ?
• Does the user agree with the strength of the interest ?
• Do interests at deeper levels of the hierarchy have less precision
compared to interests at higher levels ?
• Which data source (bookmarks, search keywords, web pages) is better?
• Bookmarks were not very representative of user interests in our study
43. Profile evaluation
Sample evaluation of one profiling algorithm:
• Profiles are stable (Figure 1: stability, stability_alpha and stability_date vs. number of web pages in cache)
• Profile elements with high support have high precision (Figure 2: precision for Support > 5, 3 < Support < 5, and Support < 3)
• Profile elements at all levels of the hierarchy have similar precision (Figure 3: percentage in profile and precision across hierarchy levels 1–6)
44. Managing the profile
• Profiles may need to be expanded (bootstrapped) or
pruned
• Allowing users to manually edit their profiles to add/delete
topics of interest was found to make performance worse
(Jae-wook Ahn, WWW 2007)
• Adding and deleting topics to profile harmed system performance
• Deleting topics harmed performance four times more compared to
adding topics
• Some agents learn short term and long term profiles
separately using different techniques (K-NN for short term
interests, Naïve Bayes for long term interests)
46. Personalized search
• Search can be personalized based on
• User profile
• Current working context
• Past search queries
• Server side clickstreams
• Personalized Pagerank
• Determining user intent is hard (e.g. the query "visa")
47. A generic personalized search algorithm
using a user profile
• Inputs- User profile, Search query
• Output – A results vector reordered by the user’s
preference
• Steps
• Send the query to a search engine
• Results[] = A vector of the search engine’s results
• For each item i in Results[] calculate the preference
Pref [i] = α *Similarity(Results[i] , User Profile)
+ (1- α)*SearchEngineRank
• Sort Results[] using Pref [i] as the comparator
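The loop above can be sketched as follows. The similarity function and profile representation are placeholders, and turning the engine's rank position into a score in (0, 1] is an assumption of this sketch — the slide leaves SearchEngineRank unspecified:

```python
def rerank(results, profile, similarity, alpha=0.5):
    """Reorder search results by alpha * Similarity + (1 - alpha) * engine score."""
    pref = []
    for rank, item in enumerate(results, start=1):
        # Assumed normalization: rank 1 -> 1.0, rank 2 -> 0.5, ...
        engine_score = 1.0 / rank
        score = alpha * similarity(item, profile) + (1 - alpha) * engine_score
        pref.append((score, item))
    return [item for _, item in sorted(pref, reverse=True)]

# Illustrative profile and similarity: fraction of profile terms in the snippet.
profile = {"aviation", "emissions"}
sim = lambda doc, prof: len(set(doc.split()) & prof) / max(len(prof), 1)

results = ["google doubleclick deal", "eu aviation emissions deal", "ftc clears deal"]
reranked = rerank(results, profile, sim, alpha=0.8)
assert reranked[0] == "eu aviation emissions deal"
```

With alpha near 1 the profile dominates; with alpha near 0 the original engine ordering is kept.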
48. Current working context – JIT retrieval
• Context includes time, location, applications
currently running, documents currently opened, IM
status
• Use profile and current context to provide relevant
(and just-in-time) information
• Blinkx toolbar – provides relevant news, video and Wikipedia articles within different applications (Microsoft Word, IE browser)
• Intersect interests from the overall profile with current context to get the contextual profile
• Context can also be used in query expansion
49. Personalization based on Search history
• Use query-to-query similarity to suggest results that
satisfied past queries
• Create user profiles from past queries/snippets from
search results clicked
• Misearch (Gauch et al. 2004) creates weighted concept hierarchies based on ODP as the reference concept hierarchy
• Compute degree of similarity between search engine result
snippets (title and text summaries) and user profile as
sim(user_i, doc_j) = Σ (k = 1 to n) wp_ik × wd_jk
• wp_ik = weight of concept k in profile i
• wd_jk = weight of concept k in document j
50. Personalization by clickthrough data analysis
– CubeSVD (Jian-Tao Sun, WWW 2005)
• Search engine has tuples of the form (User, Query, Visited
page)
• Multiple tuples constitute a tensor (generalization of matrix
to higher dimensions)
• Higher order SVD (HOSVD) performs SVD on tensor
• The reconstructed tensor is a tuple of the form (User,
Query, web page, p)
• Where p is the probability that the user posing the query will visit
the web page
• Recommend pages with highest value of p
• Computationally intensive but HOSVD can be done offline
• Need to recompute to account for new clickthrough data
51. Topic sensitive pagerank (Haveliwala
2002)
• For top 16 ODP categories, create a pagerank vector
• Each web page/document d has multiple ranks
depending on what the topic of interest j is
• For a query q, compute P(Cj|q) ∝ P(Cj) × P(q|Cj)
• Intuition: If a topic is more probable given a query, the
topic specific rank should have more say in the final
rank
• Compute the query-sensitive rank as
rank(d) = Σ_j P(Cj | q) × rank_jd
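The combination step is a weighted sum; a minimal sketch, in which the topic probabilities and per-topic rank scores are made-up illustrations:

```python
def query_sensitive_rank(p_topic_given_q, topic_ranks):
    """rank(d) = sum over topics j of P(Cj | q) * rank_{j,d}."""
    some_topic = next(iter(topic_ranks.values()))
    return {
        d: sum(p_topic_given_q[j] * ranks[d] for j, ranks in topic_ranks.items())
        for d in some_topic
    }

# For the ambiguous query "visa": P(Cj | q) (hypothetical values).
p = {"Finance": 0.9, "Travel": 0.1}
# Topic-specific PageRank scores per document (hypothetical values).
topic_ranks = {
    "Finance": {"visa.com": 0.8, "wikitravel.org": 0.1},
    "Travel":  {"visa.com": 0.2, "wikitravel.org": 0.9},
}

final = query_sensitive_rank(p, topic_ranks)
assert final["visa.com"] > final["wikitravel.org"]
```

Because the finance topic is far more probable for this query, the finance-specific ranks dominate the final ordering, which is the intuition stated above.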
52. Topics
• Overview of Personalization
• User Profile creation
• Personalizing Search
• Document modeling
• Recommender System
• Semantics in Personalization
54. Under this topic
• Document representation
• Document analysis using
• Latent Semantic Analysis (LSA)
• Probabilistic Latent Semantic Analysis (PLSA)
• Document Classification
• Support Vector Machine (SVM): A machine learning
algorithm
55. Document representation
• Term vector
• Document is represented as vector of terms
• Each dimension corresponds to a separate term
• Several methods of computing the weights of the terms
• Binary weighting: 1 if the word appears in the document
• Most well known is TF*IDF
tf_i,j = n_i,j / Σ_k n_k,j
idf_i = log( |D| / |{d_j : t_i ∈ d_j}| )
tfidf_i,j = tf_i,j × idf_i
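The three formulas translate directly to code; the toy corpus below is an illustration, not the slides' example:

```python
import math

# Toy corpus: document id -> list of tokens (hypothetical data).
docs = {
    "g1": ["google", "ftc", "doubleclick", "acquisition"],
    "g2": ["google", "doubleclick", "acquisition"],
    "e1": ["eu", "aviation", "emissions"],
}

def tf(term, doc_id):
    # tf_i,j = n_i,j / sum_k n_k,j
    words = docs[doc_id]
    return words.count(term) / len(words)

def idf(term):
    # idf_i = log(|D| / |{d_j : t_i in d_j}|)
    df = sum(1 for words in docs.values() if term in words)
    return math.log(len(docs) / df)

def tfidf(term, doc_id):
    return tf(term, doc_id) * idf(term)

# "google" appears in 2 of 3 documents, "eu" in only 1, so "eu" is more discriminative.
assert idf("eu") > idf("google")
assert abs(tfidf("google", "g1") - (1 / 4) * math.log(3 / 2)) < 1e-12
```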
56. Computing similarity
sim(A, B) = cosine(θ) = (A · B) / (‖A‖ × ‖B‖) = Σ A_i × B_i / ( √(Σ A_i²) × √(Σ B_i²) )
Jaccard coefficient = J(A, B) = |A ∩ B| / |A ∪ B|
Dice's coefficient = D(A, B) = 2|A ∩ B| / (|A| + |B|)
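All three measures are a few lines each; the sets and vectors below are made-up illustrations:

```python
import math

def cosine(a, b):
    # (A . B) / (||A|| * ||B||) for equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def jaccard(A, B):
    # |A ∩ B| / |A ∪ B| for sets.
    return len(A & B) / len(A | B)

def dice(A, B):
    # 2|A ∩ B| / (|A| + |B|) for sets.
    return 2 * len(A & B) / (len(A) + len(B))

A = {"google", "doubleclick", "deal"}
B = {"doubleclick", "deal", "privacy"}
assert jaccard(A, B) == 0.5                 # 2 shared / 4 total
assert dice(A, B) == 4 / 6
assert abs(cosine([1, 1, 0], [1, 1, 1]) - 2 / math.sqrt(6)) < 1e-12
```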
57. Example
• g1: Google Gets Green Light from FTC for DoubleClick Acquisition
• g2: Google Closes In on DoubleClick Acquisition
• g3: FTC clears Google DoubleClick deal
• g4: US regulator clears DoubleClick deal
• g5: DoubleClick deal brings greater focus on privacy
• e1: EU Agrees to Reduce Aviation Emissions
• e2: Aviation to be included in EU emissions trading
• e3: EU wants tougher green aviation laws
• Underlined words appear in more than one document
59. Retrieval example
• Query (or Profile) q = “Google Acquisition”
• Query vector q = [1 0 0 0 1 0 0 0 0 0]'
• Cosine similarity of the query to the documents
S = [g1: 0.634, g2: 0.816, g3: 0.447, g4: 0, g5: 0, e1: 0, e2: 0, e3: 0]
• What about the documents g4 and g5?
• Problem of data sparsity
60. Under this topic
• Document representation
• Document analysis using
• Latent Semantic Analysis (LSA)
• Probabilistic Latent Semantic Analysis (PLSA)
• Document Classification
• Support Vector Machine (SVM): A machine learning algorithm
61. Latent Semantic Analysis (LSA)
• If you search for "Tata Nano", aren't documents containing "People's Car" also relevant?
• How can a machine understand that?
• Analyze the collection of documents
• Documents that contain "Tata Nano" generally contain "People's Car" as well
• The covariance of these two dimensions is high
• LSA finds such correlations using a technique from linear algebra
62. LSA
• Transforms the term document matrix into a relation
between the
• terms and some concepts,
• relation between those concepts and the documents
• Concepts are the dimensions of maximum variance
• Removes the dimensions with low variance
• Reduction in feature space
• Term document matrix becomes denser
63. Singular Value Decomposition
X = T S D'
• X is the t × d term-document matrix (t terms, d documents)
• T is t × m, D' is m × d; T and D are orthonormal matrices
• S is an m × m diagonal matrix of singular values σ1 ≥ σ2 ≥ … ≥ σm > 0
• m is the rank of the matrix X
64. Reduced SVD
X_k = T_k S_k D_k'
- Choose the k largest singular values (σ1 … σk)
- Choose the corresponding k columns of T and D
- Then construct X_k (t × d, from T_k: t × k, S_k: k × k, D_k': k × d)
- X_k is the best rank-k approximation of X in terms of the Frobenius norm
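The truncation can be sketched with numpy, whose svd returns the factors with singular values already in decreasing order; the matrix below is toy data, not the slides' example:

```python
import numpy as np

# Small term-document matrix (hypothetical data).
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
])

# X = U diag(s) Vt, corresponding to T, S, D' above.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2
# Keep the k largest singular values and the corresponding columns/rows.
Xk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the Frobenius error of the rank-k truncation equals the
# Euclidean norm of the discarded singular values.
err = np.linalg.norm(X - Xk)
assert abs(err - np.linalg.norm(s[k:])) < 1e-9
assert np.linalg.matrix_rank(Xk) <= k
```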
65. Example
• g1: Google Gets Green Light from FTC for DoubleClick Acquisition
• g2: Google Closes In on DoubleClick Acquisition
• g3: FTC clears Google DoubleClick deal
• g4: US regulator clears DoubleClick deal
• g5: DoubleClick deal brings greater focus on privacy
• e1: EU Agrees to Reduce Aviation Emissions
• e2: Aviation to be included in EU emissions trading
• e3: EU wants tougher green aviation laws
• Query (or Profile) q = “Google Acquisition”
70. LSA Example
Sim = D_q × S_2² × D_2'
(the query vector is first folded into the 2-dimensional LSA space as D_q)
71. Example
• g1: Google Gets Green Light from FTC for DoubleClick Acquisition [1.284]
• g2: Google Closes In on DoubleClick Acquisition [0.936]
• g3: FTC clears Google DoubleClick deal [1.426]
• g4: US regulator clears DoubleClick deal [0.891]
• g5: DoubleClick deal brings greater focus on privacy [0.697]
• e1: EU Agrees to Reduce Aviation Emissions [0.035]
• e2: Aviation to be included in EU emissions trading [0.035]
• e3: EU wants tougher green aviation laws [0.152]
• Underlined words appear in more than one document
72. Under this topic
• Document representation
• Document analysis using
• Latent Semantic Analysis (LSA)
• Probabilistic Latent Semantic Analysis (PLSA)
• Document Classification
• Support Vector Machine (SVM): A machine learning algorithm
73. Probabilistic Latent Semantic Analysis
(PLSA)
• If we know the document collection contains two
topics can we do better?
• Can we estimate
• Probability( topic | document) ?
• Probability( word | topic) ?
• If we can also estimate Probability( topic | query) then we
can compute the document to query similarity
• PLSA is a statistical technique to estimate those probabilities from a collection of documents
74. Probabilistic Latent Semantic Analysis
(PLSA)
• Dyadic data: Two (abstract) sets of objects, X ={x1,
..,xm} and Y ={y1, … ,yn} in which observations are
made of dyads(x,y)
• Simplest case: observation of co-occurrence of x and y
• Other cases may involve scalar weight for each
observation
• Examples:
• X = Documents, Y =Words
• X = Users, Y =Purchased Items
• X = Pixels, Y =Values
75. PLSA
• Document consists of topics and words in the document are generated
based on those topics
• Generative model (asymmetric): (d_i, w_j) is generated as follows
• pick a document d_i with probability P(d_i),
• pick a topic z_k with probability P(z_k | d_i),
• generate a word w_j with probability P(w_j | z_k)
P(d_i, w_j) = P(d_i) P(w_j | d_i)
P(w_j | d_i) = Σ (k = 1 to K) P(w_j | z_k) P(z_k | d_i)
(Graphical model: D → Z → W, with parameters P(d_i), P(z_k | d_i), P(w_j | z_k))
76. PLSA
• Parameters: P(d_i), P(z_k | d_i), P(w_j | z_k)
• P(d_i) is proportional to the number of times the document is observed and can be computed independently
• P(z_k | d_i) and P(w_j | z_k) can be estimated using the Expectation Maximization (EM) algorithm
P(D, W) = Π (i = 1 to N) Π (j = 1 to M) P(d_i, w_j)^n(d_i, w_j)
L = Σ (i = 1 to N) Σ (j = 1 to M) n(d_i, w_j) ln P(d_i, w_j)
N = number of documents; M = number of distinct words
77. PLSA: EM steps
• E-step:
P(z_k | d_i, w_j) = P(w_j | z_k) P(z_k | d_i) / Σ (l = 1 to K) P(w_j | z_l) P(z_l | d_i)
• M-step:
P(w_j | z_k) = Σ (i = 1 to N) n(d_i, w_j) P(z_k | d_i, w_j) / Σ (m = 1 to M) Σ (i = 1 to N) n(d_i, w_m) P(z_k | d_i, w_m)
P(z_k | d_i) = Σ (j = 1 to M) n(d_i, w_j) P(z_k | d_i, w_j) / n(d_i)
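A minimal sketch of these EM steps, run on a subset of the news example's dyadic (document, word) pairs; every count n(d, w) is 1, and the random initialization is seeded for repeatability:

```python
import math
import random

pairs = [("g1", "google"), ("g1", "acquisition"), ("g2", "google"),
         ("g2", "acquisition"), ("e1", "eu"), ("e1", "aviation"),
         ("e2", "eu"), ("e2", "aviation")]
docs = sorted({d for d, _ in pairs})
words = sorted({w for _, w in pairs})
K = 2
rng = random.Random(0)

def rand_dist(n):
    xs = [rng.random() for _ in range(n)]
    s = sum(xs)
    return [x / s for x in xs]

p_w_z = {z: dict(zip(words, rand_dist(len(words)))) for z in range(K)}  # P(w|z)
p_z_d = {d: rand_dist(K) for d in docs}                                 # P(z|d)

def log_likelihood():
    return sum(math.log(sum(p_w_z[z][w] * p_z_d[d][z] for z in range(K)))
               for d, w in pairs)

prev = log_likelihood()
for _ in range(50):
    # E-step: posterior P(z | d, w)
    post = {}
    for d, w in pairs:
        denom = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(K))
        post[(d, w)] = [p_w_z[z][w] * p_z_d[d][z] / denom for z in range(K)]
    # M-step: re-estimate P(w|z) and P(z|d) from the posteriors
    for z in range(K):
        totals = {w: sum(p[z] for (d2, w2), p in post.items() if w2 == w)
                  for w in words}
        norm = sum(totals.values())
        p_w_z[z] = {w: totals[w] / norm for w in words}
    for d in docs:
        n_d = sum(1 for d2, _ in pairs if d2 == d)
        p_z_d[d] = [sum(p[z] for (d2, _), p in post.items() if d2 == d) / n_d
                    for z in range(K)]
    cur = log_likelihood()
    assert cur >= prev - 1e-9   # EM never decreases the data log-likelihood
    prev = cur
```

After convergence p_z_d holds the P(z|d) estimates and p_w_z the P(w|z) estimates; with real data the counts n(d, w) would weight both M-step sums.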
78. PLSA Example
• Dyadic data in our example:
(g1, google), (g1, green), (g1, ftc), (g1, doubleclick), (g1, acquisition),
(g2, google), (g2, doubleclick), (g2, acquisition),
(g3, ftc), (g3, clear), (g3, google), (g3, doubleclick), (g3, deal),
(g4, clear), (g4, doubleclick), (g4, deal),
(g5, doubleclick), (g5, deal),
(e1, eu), (e1, aviation), (e1, emission),
(e2, aviation), (e2, eu), (e2, emission),
(e3, eu), (e3, green), (e3, aviation)
79. PLSA Example
• After 20 iterations of EM algorithm
[Tables of the estimated values of P(z_k | d_i) and P(w_j | z_k)]
80. PLSA Example
• Query q = “Google Acquisition”
• Steps
• Keep P(wj |zk) fixed.
• Estimate P(zk |q) using EM steps
• Then compute cosine similarity of the vector P(Z|q) to the
P(Z|d)
P(z_k | q): Z1 = 1, Z2 = 0
81. Example
• g1: Google Gets Green Light from FTC for DoubleClick Acquisition [1.0]
• g2: Google Closes In on DoubleClick Acquisition [1.0]
• g3: FTC clears Google DoubleClick deal [1.0]
• g4: US regulator clears DoubleClick deal [1.0]
• g5: DoubleClick deal brings greater focus on privacy [1.0]
• e1: EU Agrees to Reduce Aviation Emissions [0.0]
• e2: Aviation to be included in EU emissions trading [0.0]
• e3: EU wants tougher green aviation laws [0.0]
• Underlined words appear in more than one document
82. Under this topic
• Document representation
• Document analysis using
• Latent Semantic Analysis (LSA)
• Probabilistic Latent Semantic Analysis (PLSA)
• Document Classification
• Support Vector Machine (SVM): A machine learning algorithm
84. Document classification with SVM
• We will concentrate on binary classification
• {sports, not sports}, {interesting, not interesting} etc
• In general {+1,-1} also called {positive, negative}
• SVM is a supervised machine learning technique. It learns
the pattern from a training set
• Training set
• A set of documents with labels belonging to {+1, -1}
• SVM tries to draw a hyperplane that best separates the
positive and negative data in the training set
85. Support Vector Machine (SVM)
• A Machine learning algorithm
• SVM was introduced in COLT-92 by Boser, Guyon and
Vapnik.
• Initially popularized in the NIPS community, now an
important and active field of all Machine Learning
Research
• Successful applications in many fields (text, bioinformatics,
handwriting, image recognition etc.)
86. SVM – Maximum margin separation
[SVM illustration (maximum margin separation) by Bülent Üstün, Radboud Universiteit]
87. Mapping to higher dimension for non-
separable data
P1 = (0,0) → +1
P2 = (0,1) → −1
P3 = (1,1) → +1
P4 = (1,0) → −1
Map to a higher dimension: x = (x1, x2) → φ(x) = (x1², x2², x1·x2)
P1 → (0,0,0) → +1
P2 → (0,1,0) → −1
P3 → (1,1,1) → +1
P4 → (1,0,0) → −1
88. The XOR example
SVM uses kernel trick to map data to higher
dimensional feature space without incurring
much computational overhead
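The XOR points above become linearly separable after the quadratic feature map. The sketch below verifies this with a hand-picked hyperplane in the mapped space (it is not learned by an SVM; the weights are an illustration):

```python
# XOR labelling of the four corner points.
points = {(0, 0): +1, (0, 1): -1, (1, 1): +1, (1, 0): -1}

def phi(x1, x2):
    # phi(x1, x2) = (x1^2, x2^2, x1*x2), the map from the previous slide.
    return (x1 * x1, x2 * x2, x1 * x2)

# Hand-picked separating hyperplane w.z + b = 0 in the 3-d feature space.
w, b = (-1.0, -1.0, 2.0), 0.5

for (x1, x2), label in points.items():
    z = phi(x1, x2)
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    assert (score > 0) == (label > 0)   # all four points correctly separated
```

No hyperplane in the original 2-d space separates XOR, but one line of feature mapping makes it trivial; the kernel trick gets the same effect without materializing φ(x).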
92. Classification
• Broadly three approaches
• Content Based Recommendation
• Collaborative Filtering
• Hybrid approach
93. Content based recommendation
• Utility of an item for a user is determined based on
the items preferred by the user in the past
• Applies similar techniques as introduced in the
document modeling part
94. Basic Approach
• Create and represent the user profile from the items rated by the user in
the past
• A popular choice of profile representation is vector of terms weighted
based on TF*IDF
• Represent the item in the same format
• A news item can be represented using (TF*IDF) term vector
• For movies, books one needs to get sufficient metadata to represent the
item in vector format
• Define a similarity measure to compute the similarity between the
profile and the item
• Popular choice is cosine similarity
• Advanced machine learning techniques can also be applied to do the matching
• Recommend most similar items
95. Problems with content based
recommendation
• Knowledge engineering problem
• How do you describe multimedia, graphics, movies,
songs
• Recommendation shows limited diversity
• New user problem
• It requires a large number of ratings from the user to generate quality recommendations
96. Collaborative filtering
• Recommends items that are liked in the past by other users
with similar tastes
• Quite popular in e-commerce sites, like Amazon, eBay
• Can recommend various media types, text, video, audio,
Ads, products
98. Advantages
• Does not have the knowledge engineering
problem
• Both user and items can be represented using just ids
• Often recommendation shows good amount of
diversity
99. Lets learn C.F. with an example
         Ran   Casablanca   Ben Hur   Tomb Raider   MI-II   Air Force One
Jane     5     5            -         -             ?       2
Bill     -     2            3         -             4       -
Tom      -     2            -         2             5       5
Cathy    3     3            -         -             1       1
What rating will Jane possibly give to MI-II?
100. Normalizing the ratings
• All users won’t give equal rating even if they all equally liked/disliked an
item
• Normalize the ratings: r′_u,i = r_u,i − r̄_u (where r̄_u is user u's mean rating)
         Ran   Casablanca   Ben Hur   Tomb Raider   MI-II   Air Force One
Jane     1     1            -         -             -       -2
Bill     -     -1           0         -             1       -
Tom      -     -1.5         -         -1.5          1.5     1.5
Cathy    1     1            -         -             -1      -1
101. Similarity between users
• Who are the other users with tastes similar to Jane's?
• Each row of the matrix is a vector representing the user
• Compute cosine similarity between the users
Jane's similarity to: Bill −0.289, Tom −0.612, Cathy 0.816
102. Compute probable rating
• The predicted rating is the user's mean rating plus the similarity-weighted, mean-centered ratings given by the other users
• Sometimes only the top N most similar users are taken
r_u,i = r̄_u + Σ (v ∈ V) sim(u, v) × (r_v,i − r̄_v) / Σ (v ∈ V) |sim(u, v)|
• Jane will rate MI-II as 4 + [(−0.289 × 1) + (−0.612 × 1.5) + (0.816 × −1)] / (0.289 + 0.612 + 0.816) ≈ 2.82
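The whole worked example fits in a short script. The movie-to-user assignment of some ratings is partly inferred from the flattened slide table, but the means, similarities and the final prediction reproduce the numbers above:

```python
import math

# Raw ratings: user -> {movie: rating}.
ratings = {
    "Jane":  {"Ran": 5, "Casablanca": 5, "Air Force One": 2},
    "Bill":  {"Casablanca": 2, "Ben Hur": 3, "MI-II": 4},
    "Tom":   {"Casablanca": 2, "Tomb Raider": 2, "MI-II": 5, "Air Force One": 5},
    "Cathy": {"Ran": 3, "Casablanca": 3, "MI-II": 1, "Air Force One": 1},
}

# Normalize: subtract each user's mean rating.
mean = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
centered = {u: {m: x - mean[u] for m, x in r.items()} for u, r in ratings.items()}

def cosine(u, v):
    # Missing ratings are treated as 0 in the dot product.
    dot = sum(centered[u][m] * centered[v].get(m, 0.0) for m in centered[u])
    nu = math.sqrt(sum(x * x for x in centered[u].values()))
    nv = math.sqrt(sum(x * x for x in centered[v].values()))
    return dot / (nu * nv)

others = ["Bill", "Tom", "Cathy"]
sims = {v: cosine("Jane", v) for v in others}

# Similarity-weighted prediction of Jane's rating for MI-II.
pred = mean["Jane"] + (
    sum(sims[v] * centered[v]["MI-II"] for v in others)
    / sum(abs(sims[v]) for v in others)
)

assert abs(sims["Cathy"] - 0.816) < 0.001
assert abs(pred - 2.82) < 0.01
```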
103. Remarks
• There is another popular version of the above technique where, instead of user-to-user similarity, item-to-item similarity is computed
• Rating prediction is based on the similarity to the items rated by the
user
• The above mentioned methods are known as memory
based techniques
• It has the disadvantage that it requires more online computations
104. Model based technique
• A model is learnt using the collection of ratings as training
set
• Prediction is done using the model
• More offline computing and less online computing
105. Model based technique
• A simple model
r_u,i = E(r_u,i) = Σ (r ∈ R) r × Pr(r_u,i = r | r_u,s′, s′ ∈ I_u)
106. Model based technique
• Recent research tries to model the recommendation process
with more complex probabilistic models
(Graphical model: u → z; both z and i → r)
P(r | u, i) = Σ_z P(r | z, i) × P(z | u)
• Parameters P(r|z,i) and P(z|u) can be estimated using EM
algorithm
107. Problems of C.F.
• New user problem
• New Item problem
• Sparsity problem
• A user rates only a few items
• Unusual user
• User whose tastes are unusual compared to the rest of
the population
108. Hybrid approaches
- Combining Collaborative and Content based methods
• Combining predictions of Content based method
and C.F.
• Implement separate content based and collaborative
filtering method
• Combine their predictions using
• Linear combination
• Voting schemes
• Alternatively select a prediction method based on some
confidence measure on the recommendation
109. Hybrid Approaches
• Adding content based characteristics into a C.F.
based method
• Maintain a content based profile for each user
• Use these content based profiles (not the commonly
rated items) to compute the similarity between users
• Then do C.F.
• Helps to overcome sparsity-related problems, as generally not many items are commonly rated by two users
110. Hybrid approaches
• Adding C.F. characteristics into a content based
method
• The most popular technique in this category is dimensionality reduction on a group of content based profiles
profiles
• Dimensionality reduction technique like LSA can improve
prediction quality by having compact representation of
profile
111. Future directions of research (Adomavicius et al.)
• Incorporating richer user and item profile in a unified
framework of different methods
• Using contextual information in recommendation
• Example: when recommending a vacation package, the system should consider
• User
• Time of the year
• With whom the user plans to travel
• Traveling conditions and restrictions at the time
• Multi-Criteria ratings
• E.g. three-criteria restaurant ratings: food, décor and service
112. Future directions of research
• Non-intrusiveness
• Flexibility
• Enabling end-users to customize recommendation
• Evaluation
• Empirical evaluation on test data that users choose to
rate
• Items that users choose to rate are likely to be biased
• Economics-oriented measures
113. References (Recommender System)
• Adomavicius, G., and Tuzhilin, A., "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions", IEEE Transactions on Knowledge and Data Engineering, 2005
115. Topic Outline
• Why use semantic information?
• Introduction to Ontology
• Formal Specification of an Ontology
• A Quick Overview of Semantic Web
• Techniques and Approaches
• Word Sense Disambiguation
• Semantic Profiles
• Constrained Spreading Activation
• Semantic Similarity
• Looking Ahead
116. News Example Revisited
• g1: Google Gets Green Light from FTC for DoubleClick Acquisition
• g2: Google Closes In on DoubleClick Acquisition
• g3: FTC clears Google DoubleClick deal
• g4: US regulator clears DoubleClick deal
• g5: DoubleClick deal brings greater focus on privacy
• e1: EU Agrees to Reduce Aviation Emissions
• e2: Aviation to be included in EU emissions trading
• e3: EU wants tougher green aviation laws
117. News Example Modified
• g1: Apple Gets Green Light from FTC for TripleClick Acquisition
• g2: Apple Closes In on TripleClick Acquisition
• g3: FTC clears Apple TripleClick deal
• g4: US regulator clears TripleClick deal
• g5: TripleClick deal brings greater focus on privacy
• e1: EU Agrees to Reduce Aviation Emissions
• e2: Aviation to be included in EU emissions trading
• e3: EU wants tougher green aviation laws
• f1: Apple prices soaring high.
• f2: Increased apple rates causes concern to doctors.
• f3: Cost of 10 kg of apple to become Rs 1000 from 1 Feb.
(Diagram annotations: Apple — an IT company, like Google; TripleClick — an acquisition)
118. Semantics for Personalization
[Architecture diagram] A pipeline for semantics-based personalization:
• Data Collection — gathers explicit and implicit user info; implicit info is interpreted using domain knowledge
• Profile Constructor — represents profiles as meaningful domain concepts
• Profile-to-Content Matching — a semantics-based matching function matches the user profile against content (search queries, news, video, …)
• Personalized services — the generated profile is expanded using domain info, and documents are clustered based on better user groups
119. Techniques and Approaches
1. Implicit Information based on domain knowledge
• Word Sense Disambiguation
2. Represent Profiles as meaningful concepts
• Semantic Profiles
3. Semantics based Matching Function
• Semantic Distance
4. Expand the generated profile using domain info
• Constrained Spreading Activation
5. Cluster documents based on better User groups
• Social Semantic Networks
120. Word Sense disambiguation
[WordNet fragment] Two senses of “Jaguar”:
• Animal sense: Animal → Mammal → Carnivore → Feline → Big cat (hyponyms); Jaguar is a type of Big cat, same as Panther (synonym); meronyms include tail, fur, nail
• Vehicle sense: Transport → Vehicle → Motor Vehicle → Automobile → Car (hyponyms); Jaguar is a type of Car; a car contains accelerator, door, bumper, wheel
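A minimal Lesk-style sketch of the disambiguation idea: pick the sense whose gloss shares the most words with the ambiguous word's context. The tiny sense inventory below is a hypothetical stand-in for WordNet, built from the relations in the diagram above.

```python
# Hypothetical mini sense inventory (not real WordNet data).
SENSES = {
    "jaguar": {
        "animal": "big cat feline carnivore mammal tail fur",
        "car": "automobile motor vehicle wheel door bumper accelerator",
    }
}

def disambiguate(word, context):
    """Return the sense of `word` whose gloss overlaps the context most."""
    ctx = set(context.lower().split())
    best, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(ctx & set(gloss.split()))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

print(disambiguate("jaguar", "the jaguar is a big cat with fur"))  # picks the "animal" sense
```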
121. Word Sense disambiguation
[WordNet fragment, extended with additional domain information] Two senses of “Apple”:
• Company sense: entity → abstract entity → group → organization → institution → company → Apple; additional domain information: employee, advisory board, stocks, revenue, sales tax, business acquisition
• Fruit sense: entity → substance → solid → food → fruit → apple; related: plant, tree, ripe, eat, skin, seed, pulp
122. Three level Conceptual Network
• Domain Ontology
• Co-occurrence
• synonyms
• hyponyms
• ..
• Hyperlinks
• Order of access
• Browsed together
•…
124. Views on Ontologies
[Diagram] A spectrum of ontology views, from lightweight front-end to expressive back-end: Topic Maps, Thesauri, Taxonomies, Ontologies, Semantic Networks, Extended ER-Models, Predicate Logic. Associated uses across the spectrum: navigation, information retrieval, query expansion, queries, sharing of knowledge, consistency checking, EAI, mediation, reasoning.
125. Structure of an Ontology
Ontologies typically have two components:
• Names for important concepts in the domain
• Elephant is a concept whose members are a kind of
animal
• Herbivore is a concept whose members are exactly
those animals who eat only plants or parts of plants
• Background knowledge/constraints on the
domain
• No individual can be both a Herbivore and a
Carnivore
126. A Simple Ontology
[Diagram] Class hierarchy: Person, Topic and Document are subclasses of Object; Student and Researcher are subclasses of Person; PhD Student is a subclass of Student. Example topics: Semantics and Ontology (related by “similar”). Relations: Topic described in Document; Document is about Topic; Person writes Document; Person knows Topic; Topic similar Topic.
127. Defining Ontology
[Gruber, 1993]
An Ontology is a formal (⇒ executable) specification of a shared (⇒ group of persons) conceptualization (⇒ about concepts) of a domain of interest (⇒ application & “unique truth”).
• Formal description of concepts and their relationships
• Strong basis in the family of First Order Logics (DL)
• Deductive inference based on ground truth of the domain
129. The Semantic Web Vision
Semantic web aims to transform WWW into a global database
“The semantic web is a web for computers”
130. Semantic web
Make web resources more accessible to automated processes
• Extend existing rendering markup with semantic markup
• Metadata annotations that describe the content/function of web-accessible resources
• Use Ontologies to provide vocabulary for annotations
• “Formal specification” is accessible to machines
• A prerequisite is a standard web ontology language
• Need to agree common syntax before we can share semantics
• Syntactic web based on standards such as HTTP and HTML
131. Semantic Web Layers
[Layer-cake diagram] XML provides user-definable, domain-specific markup; URIs provide globally unambiguous identifiers; the ontology layer provides context for vocabulary.
132. What is RDF ?
• RDF – resource description framework
• RDF is a data model
• Statement-based approach
• Subject/predicate/object triples – simple powerful unit
• All resources identified by URIs
• Triples create a directed labelled graph of
• object/attribute/value
• (semantic) relationships between objects
• RDF model is an abstract layer independent of XML
• XML serialization is supported
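The data model above can be sketched in a few lines: a graph is just a set of subject/predicate/object triples, and merging two RDF graphs is set union. The URIs below are illustrative.

```python
# Minimal sketch of the RDF data model: a graph as a set of triples.
graph = set()

def add(s, p, o):
    """Add one subject/predicate/object statement to the graph."""
    graph.add((s, p, o))

add("http://ex.org/presentation.ppt", "dc:creator", "http://ex.org/dave")
add("http://ex.org/presentation.ppt", "dc:date", "2005-09-23")
add("http://ex.org/dave", "org:email", "mailto:dave@example.org")

# Merging two RDF graphs is set union -- anyone can say anything about anything.
other = {("http://ex.org/dave", "foaf:name", "Dave")}
merged = graph | other

for s, p, o in sorted(merged):
    print(s, p, o)
```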
133. RDF Example
[Diagram] Resource ../presentation.ppt with properties: dc:creator → people.com/../dave_reynolds, dc:date → 2005-09-23, dc:description → “Some starter slides…”; the creator resource has org:email → mailto:dave.reynolds@hp.com

<rdf:Description rdf:about="allppt.com/presentation.ppt">
  <dc:creator rdf:resource="people.com/person/dave_reynolds"/>
</rdf:Description>

<rdf:Description rdf:ID="people.com/person/dave_reynolds">
  <org:email rdf:resource="mailto:dave.reynolds@hp.com"/>
</rdf:Description>

Enables easy merge of information
• Indirect metadata (anyone can say anything about anything)
• Extensibility (open world assumption, compositional)
134. RDF Schema
• Defines small vocabulary for RDF:
• Class, subClassOf, type
• Property, subPropertyOf
• domain, range
• Vocabulary can be used to define other vocabularies for your application domain
[Diagram] Veh:MotorVehicle rdfs:subClassOf rdfs:Resource; Veh:Van, Veh:Truck and Veh:PassengerVehicle are rdfs:subClassOf Veh:MotorVehicle; Veh:MiniVan is rdfs:subClassOf Veh:PassengerVehicle
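The key RDFS entailment — rdfs:subClassOf is transitive, so an instance belongs to every superclass — can be sketched over the vehicle taxonomy above:

```python
# child -> parent edges, mirroring the Veh: taxonomy in the diagram.
subclass_of = {
    "MiniVan": "PassengerVehicle",
    "PassengerVehicle": "MotorVehicle",
    "Van": "MotorVehicle",
    "Truck": "MotorVehicle",
}

def superclasses(cls):
    """All classes entailed for `cls` via transitive rdfs:subClassOf."""
    out = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        out.append(cls)
    return out

def types_of(instance_class):
    """An instance of a class is also an instance of every superclass."""
    return [instance_class] + superclasses(instance_class)

print(types_of("MiniVan"))
```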
135. OWL – Web Ontology Language
• A language to express an ontology
• An OWL ontology is an RDF graph
• A set of RDF triples
• Vocabulary extension: domain restrictions/truth
• Structure
• Ontology headers
• Class Axioms — important concepts of the domain
• Class Descriptions, Enumeration, Membership Restrictions
• Property Axioms
• Property Descriptions, Property Restrictions, Functional Spec
• Facts about individuals
137. The Syntax
Parent = Person with at least one child
<owl:Class rdf:ID="Parent">
  <owl:intersectionOf>
    <owl:Class rdf:about="#Person"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasChild"/>
      <owl:minCardinality>1</owl:minCardinality>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
139. SPARQL
• RDF Query Language
• Triples with unbound variables
• Protocol
• HTTP binding
• SOAP binding
• XML Results Format
• Easy to transform (XSLT, XQuery)
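A toy illustration of SPARQL's core idea — matching triple patterns with unbound variables (marked with a leading `?` here) against a triple store. The store and patterns are illustrative, not real SPARQL syntax.

```python
# A tiny triple store.
triples = [
    ("g1", "reportsOn", "Acquisition"),
    ("g2", "reportsOn", "Acquisition"),
    ("e1", "reportsOn", "Emissions"),
]

def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    b = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # unbound (or already-bound) variable
            if p in b and b[p] != t:
                return None
            b[p] = t
        elif p != t:                   # constant must match exactly
            return None
    return b

def query(pattern):
    """Return one binding set per matching triple."""
    results = []
    for t in triples:
        b = match(pattern, t, {})
        if b is not None:
            results.append(b)
    return results

print(query(("?doc", "reportsOn", "Acquisition")))
```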
140. Why Ontologies?
• Enable formalisation of user preferences
• Common underlying, interoperable representation
• Public vocabulary agreed & shared between different systems
• Better content matching & sharing across applications
• User interests can be matched to content meaning
• Using conceptual reasoning
• Richer, more precise, less ambiguous than keyword-based
• Provides adequate grounding for hierarchical representation
• coarse to fine-grained user interests
• Formal, computer-processable meaning of the concepts
142. Semantic Profiles
• User Profile as concepts
[Concept tree, weights derived from web pages visited by the user]
Top → Shopping (W=2), Science (W=0), Sports (W=1), …
Shopping → Books (W=1), Clothes (W=1); Sports → Soccer (W=1), Cricket (W=0)
• Books, Clothes and Soccer web pages visited by the user
How do we map documents/users to concepts?
143. Building concept profiles based on ODP
The Machine Learning Approach
• Step 1: Build an ODP classifier for selected ODP categories, training on the ODP categories and their documents
• Step 2: Run the user's web pages through the ODP classifier and add the resulting ODP concepts to the user profile
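One way to sketch the two steps is a simple centroid (Rocchio-style) text classifier; the slides don't fix a particular learner, and the category names and training documents below are toy stand-ins for real ODP data.

```python
from collections import Counter
import math

# Toy stand-in for ODP categories + documents.
training = {
    "Sports/Soccer": ["soccer goal match league", "soccer world cup goal"],
    "Computers/AI":  ["neural network learning model", "machine learning model data"],
}

# Step 1: build one centroid term-frequency vector per category.
centroids = {c: Counter(w for d in docs for w in d.split())
             for c, docs in training.items()}

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(page):
    """Step 2: assign a user's page to the closest category centroid."""
    v = Counter(page.split())
    return max(centroids, key=lambda c: cosine(v, centroids[c]))

print(classify("goal in the soccer match"))
```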
145. Using Wikipedia to map documents to
concepts
Item: “Sony to slash PlayStation3 price”
Term vector representation: <sony:1>, <slash:1>, <playstation3:1>, <price:1>
Item: “Jittery Sony Knocks $100 Off PS3 Price Tag”
Term vector representation: <jittery:1>, <sony:1>, <knocks:1>, <ps3:1>, <price:1>, <tag:1>
A Search Approach: issue the item text as a query against an index of a Wikipedia dump; the titles of the retrieved articles become additional features.
Query “Sony to slash PlayStation3 price” retrieves: 1. PlayStation Network Platform, 2. PlayStation 2, 3. Ducks demo, 4. PlayStation 3, 5. PlayStation, 6. Ken Kutaragi, 7. PlayStation Portable, 8. Console manufacturer, 9. Sony Group, 10. Crystal Dynamics, 11. PlayStation 3 accessories, 12. …, 13. …
146. Profile: Words Vs Concepts
TF*IDF based user profile | Wikipedia based user profile
Search                    | Text Retrieval Conference
Home                      | HTML element
Help                      | Bank of America
News                      | Google search
Privacy                   | ICICI Bank
Google                    | IDBI Bank
Terms                     | Bank fraud
New                       | Artificial neural network
Page                      | Web crawler
Use                       | Web design
Web                       | Debit card
View                      | Extensible Markup Language
Results                   | Hewlett-Packard
Information               | Microsoft
Account                   | XHTML
                          | Demand account
147. Semantic Profiles
• Vector of weights – representing the intensity of user
interest for each concept (-1 to 1)
• Content also described by a set of weighted concepts
(0 to 1)
• Concept Profiles: Can express fine grained interests
• Interest in athletes who have won a gold medal
• Interest in IT companies which have acquired at least 3 companies in the last one year
• Only movies with either Amitabh or Shahrukh
151. Constrained Spreading Activation
• Cannot take ‘all’ related data
• Commonly used SA models
• Distance Constraint
• Fan-out Constraint
• Path Constraints
• App dependent inference rules
• Type of relationship
• Preferential paths
• Activation Constraint
• Threshold function at each single node level
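A sketch combining two of the constraints above — a distance constraint and an activation threshold — with decay along links; the concept graph and all parameter values are illustrative.

```python
# Directed concept graph (illustrative).
graph = {
    "Car": ["MotorVehicle", "Wheel"],
    "MotorVehicle": ["Vehicle"],
    "Vehicle": ["Transport"],
    "Wheel": [],
    "Transport": [],
}

def spread(seed, decay=0.5, max_dist=2, threshold=0.2):
    """Constrained spreading activation from a seed concept."""
    activation = {seed: 1.0}
    frontier = [(seed, 0)]
    while frontier:
        node, dist = frontier.pop(0)
        if dist >= max_dist:              # distance constraint
            continue
        for nbr in graph[node]:
            a = activation[node] * decay  # attenuate along each link
            if a < threshold:             # activation constraint
                continue
            if a > activation.get(nbr, 0.0):
                activation[nbr] = a
                frontier.append((nbr, dist + 1))
    return activation

print(spread("Car"))
```

Activation reaches MotorVehicle, Wheel and Vehicle but never Transport, which lies beyond the distance constraint.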
152. Learning preferences using semantic links
Two main ways of updating Concept History Stack
1. Interest Assumption Completion
• Add more potential user interests
• Based on Hierarchical relationships
• Threshold on value of pseudo-occurrence for insertion
• Nocc(C_supertype) = γ * Nocc(C_subtype), where γ < 1 is determined empirically
• Based on Semantic relationships
• All related concepts such that ∃ prop p, p(C, C_related)
• Pseudo-occurrence: Nocc(C_related) = αi * Nocc(C)
153. Learning preferences using semantic links
(contd)
2. Preference update by expansion
• Re-weighting over time
• Wnew(C_related) = Wold(C_related) + βi * Wnew(C)
• βi — semantic factor that depends on the level of semantic proximity
• Directly part of definition (Tbox)
• Related through inferred transitive relation (# such links matter)
• Notion of Semantic distance
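The two update rules on slides 152–153 can be sketched together; γ, βi, the toy ontology and the threshold are all illustrative choices, not values from the slides.

```python
# Toy ontology fragments (illustrative).
supertype = {"Jaguar": "BigCat", "BigCat": "Feline"}   # hierarchical links
related = {"Acquisition": ["Company"]}                 # semantic links via some property p

def complete_interests(nocc, gamma=0.5, threshold=0.4):
    """Interest assumption completion: propagate pseudo-occurrence counts
    gamma * Nocc(subtype) up the hierarchy, inserting above a threshold."""
    out = dict(nocc)
    for c, n in nocc.items():
        cur, val = c, n
        while cur in supertype:
            cur, val = supertype[cur], gamma * val
            if val >= threshold:
                out[cur] = max(out.get(cur, 0.0), val)
    return out

def expand_weights(w, beta=0.3):
    """Preference update by expansion: W_new(C_related) += beta * W_new(C)."""
    out = dict(w)
    for c, weight in w.items():
        for r in related.get(c, []):
            out[r] = out.get(r, 0.0) + beta * weight
    return out

print(complete_interests({"Jaguar": 2.0}))
print(expand_weights({"Acquisition": 0.8}))
```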
155. Similarity/Matching
• Cosine similarity
• U represents user preference
• D represents content object
• Dimension: #concepts in the ontology

$\mathrm{similarity}(U,D) = \cos(U,D) = \frac{U \cdot D}{\|U\|\,\|D\|} = \frac{\sum_i U_i D_i}{\sqrt{\sum_i U_i^2}\,\sqrt{\sum_i D_i^2}}$
156. Semantic distance d(x,y,c)
• Semantic distance between 2 nodes x and y is defined
with respect to a concept, c
• Example: a black cat and an orange cat
• very similar as instances of the category Animal, since their
common catlike properties would be the most significant for
distinguishing them from other kinds of animals.
• But in the category Cat, they would share their catlike properties
with all the other kinds of cats, and the difference in color would be
more significant.
• In the category BlackEntity, color would be the most relevant
property, and the black cat would be closer to a crow or a lump of
coal than to the orange cat.
162. Contextual Personalisation
• Finer, qualitative, context-sensitive activation of user preferences
• Notion of a Semantic Runtime Context
• Representation: Vector of concept weights
• Fuzzy semantic intersection between user
preferences and runtime context
• Using Constrained spreading activation
163. Semantic Social Networking
• Identify hidden links between users
• Similarity between user preferences
• Collaborative Recommender systems
• Use of global preferences alone is not correct
• Partial & strong similarities are very useful
• E.g. coinciding interest in cinema but drastically different taste in sports
165. Microformats
[Tag cloud] Metadata, social links, geo, outline, hResume, adr, licensing, tags
http://microformats.org
• Microformats are small bits of HTML that represent things like people, events,
tags, etc. in web pages.
• Building blocks that enable users to own, control, move, and share their data on
the Web.
• Microformats enable
• publishing of higher fidelity information on the Web,
• the fastest and simplest way to support feeds and APIs for your website.
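Since microformats live in ordinary HTML class attributes, extraction is plain HTML parsing. A sketch using Python's stdlib parser for two real hCard property names (`fn`, `org`); the markup itself is a toy example.

```python
from html.parser import HTMLParser

class MicroformatParser(HTMLParser):
    """Collect text of elements whose class is a known hCard property."""
    PROPS = ("fn", "org", "locality")

    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        for c in self.PROPS:
            if c in classes:
                self._current = c

    def handle_data(self, data):
        if self._current and data.strip():
            self.props[self._current] = data.strip()
            self._current = None

html = ('<div class="vcard"><span class="fn">Dave Reynolds</span>'
        '<span class="org">HP Labs</span></div>')
p = MicroformatParser()
p.feed(html)
print(p.props)
```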
167. eRDF
• A subset of RDF embedded into XHTML or HTML by using
common idioms and attributes.
• No new elements or attributes have been invented and the
usages of the HTML attributes are within normal bounds.
• This scheme is designed to work with CSS and other HTML
support technologies.
• HTML Embeddable RDF: all HTML Embeddable RDF is valid RDF, but not all RDF is Embeddable RDF
168. GRDDL
• Gleaning Resource Descriptions from Dialects of
Languages
• Obtaining RDF data from XHTML pages
• Explicitly associated transformation algorithms
(XSLT)
169. Acknowledgements
• Self-tuning Personalized Information Retrieval in an
Ontology-Based Framework, Pablo Castells, Miriam
Fernández, David Vallet, et al, OTM Workshop 2005
• An Approach for Semantic Search by Matching RDF Graphs,
Haiping Zhu, Jiwei Zhong, Jianming Li and Yong Yu
• Semantic Web Tutorials
170. Concluding Remarks
• Personalization: An upcoming area of technology
• Personalization aims at faster access to information to improve
user productivity
• Server-side vs client-side personalization
• Technologies
• Machine Learning techniques
• Semantic Web
• New Markup Languages
• Challenges
• Understanding the user behaviour, intentions, likes, …
• Relating human edited content to the profile