The document discusses personalization and user modeling on the social web. It describes how user data is generated from various online activities and interactions that can be used to create user profiles and models. Several approaches for developing user models are presented, including overlay models that describe user characteristics, elicitation models that ask users for information or observe their behaviors, stereotyping models that apply typical attributes to users, and relevance models that learn what items are pertinent. The best approach depends on the specific application conditions.
Lecture 4: Social Web Personalization (2012) - Lora Aroyo
This is the fourth lecture in the Social Web course at the VU University Amsterdam
Visit the website for more information: http://semanticweb.cs.vu.nl/socialweb2012/
Thanks to Fabian Abel for letting me adopt slides from his lectures
CHIP Project: Personalized Museum Tour with Real-Time Adaptation on a Mobile ... - Lora Aroyo
For more information visit our website: http://chip-project.org
This is a presentation of an MSc thesis by Ivo Roes, performed within the CHIP project, entitled:
Personalized Museum Tour with Real-Time Adaptation on a Mobile Device with Multi-Point Touch Interface
The CHIP project is a collaborative effort between the Rijksmuseum Amsterdam, VU University Amsterdam and Eindhoven University of Technology
http://chip-project.org
Lecture 5: Personalization on the Social Web (2014) - Lora Aroyo
This is the fifth lecture in the Social Web course (2014) at the VU University Amsterdam. Visit the website for more information: http://thesocialweb2014.wordpress.com/
Mobile Visual Search (MVS) is a fascinating research field with many open challenges and opportunities, which have the potential to impact the way we organize, annotate, and retrieve visual data (images and videos) using mobile devices.
This talk is structured in four parts:
1. Opportunities: where I present recent and relevant numbers of the mobile computing market, particularly in the field of photography apps, social networks, and mobile search.
2. Basic concepts: where I explain the basic MVS pipeline and discuss the three main MVS scenarios and associated challenges.
3. Technical aspects: where I briefly cover topics such as feature extraction, indexing, descriptor matching, and geometric verification, discuss the state of the art in these fields, and comment on open problems and research opportunities.
4. Examples and applications: where I show representative examples of academic research and commercial apps in this field.
Advances and Challenges in Visual Information Search and Retrieval (WVC 2012 ...) - Oge Marques
Part I – Concepts, challenges, and state of the art
Part II – Medical image retrieval
Part III – Mobile visual search
Part IV – Where is image search headed?
The AlphaSphere is a new musical instrument that opens up music making to a new generation of musicians. Designed for composition, production, performance and learning, each of its tactile pads is pressure sensitive and ergonomically mapped around the surface of the sphere. Underlying the design is a geometric notational logic that allows custom mappings to be created between different pads.
Not an IT specialist but need to decide which platform to build an e-commerce store on? Unfamiliar with the technologies behind these projects? In this webinar we will discover how an e-commerce store works from a technology point of view.
In addition, we will analyze the main platform options on which we can build an e-commerce store, and some basic concepts about security.
You can watch the video: https://www.youtube.com/watch?v=tHh4mO-gGfA
Master's thesis in special needs education: How do adolescents and young adults who received special needs education in lower secondary school experience their relationship with the teacher? A qualitative, retrospective interview study of 8 young people between 17 and 20 years of age who received special needs education while in lower secondary school.
Hacktivism - The Hacker News Magazine - May 2012 Issue - Reputelligence
Article from me, “The Many Faces of Modern Day Hackers”, inside this issue. Welcome cyber space readers and internet junkies from around the world. May brings us an in-depth look at our favorite topic: Hacktivism.
This proposal of work contains details and samples of the user-centric design process I follow. I had been trying to find a good graph that represents the process, but in the end I decided to make my own! ;)
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
How to discover requirements by identifying problems
How to solve problems by discovering requirements
How to identify customer needs
How to capture requirements once they are discovered?
What are requirements?
Different types of requirements
Common types of requirements
Data gathering
Probes: what probes are, and the types of probes
What is contextual inquiry
Brainstorming for innovation
Personas and scenarios
There is a range of different tools and methods for defining target groups, such as interviews, observations, questionnaires, etc. This report describes the Persona method, and is based upon the work of Alan Cooper, the inventor of the Personas approach.
Review and analysis of machine learning and soft computing approaches for use... - IJwest
The adequacy of user models depends mainly on the accuracy and precision of the information retrieved about the user. The real challenge in user modelling studies is due to the inadequacy of data, improper use of techniques, noise within the data and the imprecise nature of human behavior. For the best results in user modelling, one should choose an appropriate way to do it, i.e. by selecting the most suitable approach for the desired domain. Machine learning and soft computing techniques have the ability to handle this uncertainty and are extensively used for user modeling purposes. This paper reviews various approaches to user modeling and critically analyzes the machine learning and soft computing techniques that have successfully captured and formally modelled human behavior.
Similar to Lecture 5: Personalization on the Social Web (2013)
The Rijksmuseum Collection as Linked Data - Lora Aroyo
Presentation at ISWC2018: http://iswc2018.semanticweb.org/sessions/the-rijksmuseum-collection-as-linked-data/ of our paper published originally in the Semantic Web Journal: http://www.semantic-web-journal.net/content/rijksmuseum-collection-linked-data-2
Many museums are currently providing online access to their collections. The state of the art research in the last decade shows that it is beneficial for institutions to provide their datasets as Linked Data in order to achieve easy cross-referencing, interlinking and integration. In this paper, we present the Rijksmuseum linked dataset (accessible at http://datahub.io/dataset/rijksmuseum), along with collection and vocabulary statistics, as well as lessons learned from the process of converting the collection to Linked Data. The version of March 2016 contains over 350,000 objects, including detailed descriptions and high-quality images released under a public domain license.
FAIRview: Responsible Video Summarization @NYCML'18 - Lora Aroyo
Presentation at the NYC Media Lab (NYCML2018). There is a growing demand for news videos online, with more consumers preferring to watch the news than read or listen to it. On the publisher side, there is a growing effort to use video summarization technology in order to create easy-to-consume previews (trailers) for different types of broadcast programs. How can we measure the quality of video summaries and their potential to misinform? This workshop will inform participants about automatic video summarization algorithms and how to produce more “representative” video summaries. The research presented is from the FAIRview project and is supported by the Digital News Innovation Fund (DNI Fund), which is part of the Google News Initiative.
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev... - Lora Aroyo
Lora Aroyo, Chiel van den Akker, Marnix van Berchum, Lodewijk Petram, Gerard Kuys, Tommaso Caselli, Jacco van Ossenbruggen, Victor de Boer, Sabrina Sauer, Berber Hagedoorn
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017 - Lora Aroyo
The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods. Crowdsourcing-based approaches are gaining popularity in the attempt to solve the issues related to the volume of data and lack of annotators. Typically these practices use inter-annotator agreement as a measure of quality. However, this assumption often creates issues in practice. Previous experiments we performed found that inter-annotator disagreement is usually never captured, either because the number of annotators is too small to capture the full diversity of opinion, or because the crowd data is aggregated with metrics that enforce consensus, such as majority vote. These practices create artificial data that is neither general nor reflects the ambiguity inherent in the data.
To address these issues, we proposed a method for crowdsourcing ground truth by harnessing inter-annotator disagreement. We present an alternative approach for crowdsourcing ground truth data that, instead of enforcing agreement between annotators, captures the ambiguity inherent in semantic annotation through the use of disagreement-aware metrics for aggregating crowdsourcing responses. Based on this principle, we have implemented the CrowdTruth framework for machine-human computation, which first introduced the disagreement-aware metrics and built a pipeline to process crowdsourcing data with these metrics.
In this paper, we apply the CrowdTruth methodology to collect data over a set of diverse tasks: medical relation extraction, Twitter event identification, news event extraction and sound interpretation. We prove that capturing disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of the data aggregated with CrowdTruth metrics with a majority vote, a method which enforces consensus among annotators. By applying our analysis over a set of diverse tasks we show that, even though ambiguity manifests differently depending on the task, our theory of inter-annotator disagreement as a property of ambiguity is generalizable.
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone - Lora Aroyo
Ambiguity in interpreting signs is not a new idea, yet the vast majority of research in machine interpretation of signals such as speech, language, images, video, audio, etc., tend to ignore ambiguity. This is evidenced by the fact that metrics for quality of machine understanding rely on a ground truth, in which each instance (a sentence, a photo, a sound clip, etc) is assigned a discrete label, or set of labels, and the machine’s prediction for that instance is compared to the label to determine if it is correct. This determination yields the familiar precision, recall, accuracy, and f-measure metrics, but clearly presupposes that this determination can be made. CrowdTruth is a form of collective intelligence based on a vector representation that accommodates diverse interpretation perspectives and encourages human annotators to disagree with each other, in order to expose latent elements such as ambiguity and worker quality. In other words, CrowdTruth assumes that when annotators disagree on how to label an example, it is because the example is ambiguous, the worker isn’t doing the right thing, or the task itself is not clear. In previous work on CrowdTruth, the focus was on how the disagreement signals from low quality workers and from unclear tasks can be isolated. Recently, we observed that disagreement can also signal ambiguity. The basic hypothesis is that, if workers disagree on the correct label for an example, then it will be more difficult for a machine to classify that example. The elaborate data analysis to determine if the source of the disagreement is ambiguity supports our intuition that low clarity signals ambiguity, while high clarity sentences quite obviously express one or more of the target relations. In this talk I will share the experiences and lessons learned on the path to understanding diversity in human interpretation and the ways to capture it as ground truth to enable machines to deal with such diversity.
Data Science with Human in the Loop @Faculty of Science #Leiden University - Lora Aroyo
Software systems are becoming ever more intelligent and more useful, but the way we interact with these machines too often reveals that they don’t actually understand people. Knowledge Representation and Semantic Web focus on the scientific challenges involved in providing human knowledge in machine-readable form. However, we observe that various types of human knowledge cannot yet be captured by machines, especially when dealing with wide ranges of real-world tasks and contexts. The key scientific challenge is to provide an approach to capturing human knowledge in a way that is scalable and adequate to real-world needs. Human Computation has begun to scientifically study how human intelligence at scale can be used to methodologically improve machine-based knowledge and data management. My research is focusing on understanding human computation for improving how machine-based systems can acquire, capture and harness human knowledge and thus become even more intelligent. In this talk I will show how the CrowdTruth framework (http://crowdtruth.org) facilitates data collection, processing and analytics of human computation knowledge.
Some project links:
- http://controcurator.org/
- http://crowdtruth.org/
- http://diveproject.beeldengeluid.nl/
- http://vu-amsterdam-web-media-group.github.io/linkflows/
"Video Killed the Radio Star": From MTV to Snapchat
Lecture 5: Personalization on the Social Web (2013)
1. Social Web
Lecture V: Personalization on the Social Web
(some slides were adopted from Fabian Abel)
Lora Aroyo
The Network Institute
VU University Amsterdam
Monday, March 4, 13
2. Personalization & Social Web
• To design ‘social’ functionality we need to understand what data the application can provide to filter the relevant information (what users perceive as relevant)
• Therefore we need to understand
  • how good personalization (recommenders) are
  • how good the user models they are based on are
• In this lecture, we consider theory & techniques for how to design and evaluate recommenders and user models (for use in SW applications)
3. User Modeling
How to infer & represent user information that supports a given application or context?
Kevin Kelly
4. User Modeling Challenge
• Application has to obtain, understand & exploit information about the user
• Information (need & context) about the user
• Inferring information about the user & representing it so that it can be consumed by the application
• Data relevant for inferring information about the user
5. User & Usage Data is everywhere
• People leave traces on the Web and on their computers:
  • Usage data, e.g., query logs, click-through data
  • Social data, e.g., tags, (micro-)blog posts, comments, bookmarks, friend connections
  • Documents, e.g., pictures, videos
  • Personal data, e.g., affiliations, locations
  • Products, applications, services - bought, used, installed
• Not only a user’s behavior, but also interactions of other users: “people can make statements about me”, “people who are similar to me can reveal information about me” --> “social learning”, collaborative recommender systems
6. UM: Basic Concepts
• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model: contains the definitions & rules for the interpretation of observations about the user and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles
• User Modeling: the process of representing the user
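The profile vs. model distinction can be made concrete in a few lines. A minimal sketch, assuming nothing beyond the definitions on the slide; the function names and the toy click rule are illustrative, not from the lecture:

```python
# A user *profile* is a snapshot: concept -> weight at a moment in time.
# A user *model* is the recipe: rules that turn observations into a profile.

def build_profile(observations, rule):
    """Apply an interpretation rule (the user model) to raw observations,
    accumulating a weighted-concept dictionary (the user profile)."""
    profile = {}
    for obs in observations:
        for concept, weight in rule(obs):
            profile[concept] = profile.get(concept, 0.0) + weight
    return profile

# Toy rule: a click on an item counts its tags as evidence of interest.
def click_rule(obs):
    return [(tag, 1.0) for tag in obs["tags"]]

observations = [
    {"item": "article-1", "tags": ["tennis", "sport"]},
    {"item": "article-2", "tags": ["tennis"]},
]
profile = build_profile(observations, click_rule)
print(profile)  # {'tennis': 2.0, 'sport': 1.0}
```

Swapping in a different `rule` changes the user model while the profile structure stays the same, which is exactly the "recipe" reading of the slide.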
7. User Modeling Approaches
• Overlay User Modeling: describe user characteristics, e.g. “knowledge of a user”, “interests of a user”, with respect to “ideal” characteristics
• Customizing: user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively (“interactive user modeling”)
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user
Related scientific conference: http://umap2011.org/ Related journal: http://umuai.org/
8. Which approach best suits the conditions of the application?
9. Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses of the user’s knowledge about these concepts in relation to an (ideal) expert’s knowledge
• concept-value pairs
10. User Model Elicitation
• Ask the user explicitly, then learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user, then learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
13. Stereotyping
• a set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype - a user profile can feature characteristics of several different stereotypes
14. Why are stereotypes useful?
16. Can we infer a Twitter-based user profile?
“I want my personalized news recommendations!” - a personalized news recommender needs a user profile, built via user modeling (4 building blocks) with semantic enrichment, linkage and alignment.
Example from Abel et al. (2011)
17. User Modeling Building Blocks
1. Temporal Constraints: which tweets of the user should be analyzed? (a) a time period (e.g. June 27 - July 11), (b) temporal patterns (e.g. weekends; morning, afternoon, night)
18. User Modeling Building Blocks
2. Profile Type: what type of concepts should represent “interests”? hashtag-based (#fo2010), entity-based (Francesca Schiavone, French Open) or topic-based (Sport, Tennis) - e.g. for the tweet “Francesca Schiavone won French Open #fo2010”
19. User Modeling Building Blocks
3. Semantic Enrichment: how to further enrich the semantics of tweets? (a) tweet-based (concepts from the tweet itself: Francesca Schiavone, French Open), (b) further enrichment via linked resources, e.g. the news article behind http://bit.ly/2f4t7a (“Francesca wins French Open. Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …”) adds the concept Tennis
20. User Modeling Building Blocks
4. Weighting Scheme: how to weight the concepts (e.g. Francesca Schiavone: 4, French Open: 3, Tennis: 6)? concept frequency (TF), TFxIDF, or time-sensitive weighting
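The three weighting schemes named here (concept frequency, TFxIDF, time-sensitive weighting) can be sketched as follows. This is a hedged illustration with invented helper names and toy data, not the implementation from Abel et al.:

```python
import math

def tf_weights(concepts):
    """Concept frequency (TF): how often each concept occurs for this user."""
    weights = {}
    for c in concepts:
        weights[c] = weights.get(c, 0) + 1
    return weights

def tfidf_weights(concepts, all_user_sets):
    """TFxIDF: damp concepts that occur for many users. Assumes this user's
    own concept set is included in all_user_sets."""
    n = len(all_user_sets)
    tf = tf_weights(concepts)
    return {c: tf[c] * math.log(n / sum(1 for s in all_user_sets if c in s))
            for c in tf}

def time_decayed(weight, days_old, half_life=7.0):
    """Time-sensitive weighting: halve a concept's weight every half_life days."""
    return weight * 0.5 ** (days_old / half_life)

me = ["Tennis", "Tennis", "French Open", "Francesca Schiavone"]
everyone = [set(me), {"Tennis", "Football"}, {"Football"}]
print(tf_weights(me))               # "Tennis" counted twice, entities once each
print(tfidf_weights(me, everyone))  # "French Open" now outweighs "Tennis"
```

Note how the rarer entity "French Open" overtakes the common concept "Tennis" under TFxIDF, which is the usual motivation for IDF damping in profiles.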
21. Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles reflect current user demands better
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
22. User Modeling
it is not about putting everything in a user profile, it is about making the right choices
23. User Adaptation
Knowing the user means this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
24. Fig. 1 Functional model of tasks and sub-tasks specifically suited for SASs (Ilaria Torre, 2009)
25. User-Adaptive Systems
observations, data and information about the user -> user modeling -> user profile -> profile analysis -> adaptation decisions
A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305-330, 2003.
26. Last.fm: adapts to your music taste
history of songs, like, ban, pause, skip -> user modeling (infer current musical taste) -> user profile (interests in genres, artists, tags) -> compare profile with possible next songs to play -> next song to be played
27. Issues in User-Adaptive Systems
• Overfitting, “bubble effects”, loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g., an adaptive radio station may always play the same or very similar songs
  • we search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem:
  • when adapting the navigation - i.e. the links on which users can click to find/access information
  • e.g., re-ordering/hiding of menu items may lead to confusion
28. personalization - are we in control?
personalization - FoF weaken recommendations?
personalization - self-fulfilling prophecy?
29. What is good user modeling & personalization?
30. Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)
31. Evaluation Strategies
• User studies: ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of the application that exploits the user profile, e.g., apply user modeling strategies in a recommender system
32. Evaluating User Modeling in RecSys
The data is split along a timeline into training data and test data (ground truth). The user modeling strategies to compare (Strategy X, Y, Z) are each applied to the training data; a recommender then produces recommendations per strategy, and their quality is measured against the test data.
33. Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • Precision@k, Recall@k & F-Measure@k
  • If a true ranking is given: rank correlations
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
Is strategy X better than the baseline? (compare the performance of strategy X against the baseline over several runs)
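Several of these metrics are only a few lines each. A sketch with an invented toy ranking and ground truth (not data from the lecture):

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def reciprocal_rank(ranked, relevant):
    """1/rank of the first relevant item; 0.0 if none was retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def success_at_k(ranked, relevant, k):
    """1 if at least one relevant item occurs within the top k, else 0."""
    return int(any(item in relevant for item in ranked[:k]))

ranked = ["H", "R", "F", "M"]   # recommendations from some strategy
relevant = {"F", "G", "H"}      # test-set ground truth
print(precision_at_k(ranked, relevant, 3))  # 2 of the top 3 are relevant
print(reciprocal_rank(ranked, relevant))    # first relevant item at rank 1
```

Averaging `reciprocal_rank` over all test users gives MRR; comparing the averaged scores of a strategy against a baseline over several runs is what the significance test in the Rae et al. example checks.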
34. Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using a T-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al. Improving Tag Recommendations Using Social Networks, RIAO’10]
35. Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• The different strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the strategy of the ‘most popular’ tags: an often used baseline, comparing the globally most popular tags to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al. Social Media Recommendation based on People and Tags, SIGIR’10]
36. Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it’s often a ranking task
41. Collaborative Filtering
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
(illustration: u2 likes the same items as u1 - will u1 like Pulp Fiction too?)
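The memory-based variant (user-user similarity over the User-Item matrix) can be sketched with cosine similarity. A toy illustration of the slide's Pulp Fiction question; all users and ratings are invented:

```python
import math

def cosine(u, v):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def predict(user, others, item):
    """Similarity-weighted average of the neighbours' ratings for item."""
    sims = [(cosine(user, other), other) for other in others if item in other]
    denom = sum(abs(s) for s, _ in sims)
    return sum(s * other[item] for s, other in sims) / denom if denom else None

u1 = {"Kill Bill": 5, "Reservoir Dogs": 4}
u2 = {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5}
u3 = {"Titanic": 5, "Pulp Fiction": 1}
# u2 likes what u1 likes; u3 shares no rated items with u1 (similarity 0),
# so the prediction follows u2: u1 will probably like Pulp Fiction.
print(predict(u1, [u2, u3], "Pulp Fiction"))
```

This also shows the scaling issue from the next slide: the memory-based approach needs every rating of every user at prediction time.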
42. Memory vs. Model-based
• Memory-based: complete input data is required; pre-computation not possible; does not scale well (“tricks” are needed); high quality of recommendations
• Model-based: abstraction (model) of input data; pre-computation (partially) possible (model has to be re-built from time to time); scales better; abstraction may reduce recommendation quality
43. Collaborative Filtering: Pros & Cons
• Pros:
  • No domain knowledge required (recommendations are just based on the social interactions)
  • Quality may improve over time (e.g. the more users are interacting with items)
  • Implicit user feedback is sufficient
  • Word-of-mouth approach may lead to more diversity in the recommendations
• Cons:
  • Cold-start problem / New User problem
  • Sparsity problem / New Item problem
  • SPAM (e.g. SPAM users might promote items by rating those items and other non-similar items positively)
44. Rating Issues
• Reliability, consistency: Would a person give the same grade when re-grading?
• Validity, accuracy: Does the grade reflect the deeper quality? Do opinions shift over time, e.g. after grading many?
• Anonymity, trust: Does anonymity or explicit identification impact the reviewing? Are people led by other motives?
45. Social Networks & Interest Similarity
• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control over one’s neighborhood is also a problem, i.e. one cannot add ‘trusted’ people, nor exclude ‘strange’ ones
• therefore, interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging on CiteULike):
• does a social connection indicate user interest similarity?
• how much does users’ interest similarity depend on the strength of their connection?
• is it feasible to use a social network as a basis for personalized recommendation?
[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT’10]
46. Conclusions
• unilaterally connected pairs have more common information items, metadata, and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• the item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
47. personalization semantics = good or bad?
social personalization - are people aware?
48. Content-based Recommendations
• Input: characteristics of items & interests of a user expressed in terms of those characteristics => recommend items whose characteristics meet the user’s interests
• Techniques:
• Data mining methods: cluster items based on their characteristics => infer users’ interest in the clusters
• IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the item vectors
• Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user
49. Example: Content Features of a News Item
News item: “Government stops renovation of tower bridge” (Oct 13th 2011)
Category: politics, england
Related Twiper news: @bob: “Why do they stop to… [more]”; @mary: “London stops reno… [more]”
Background: Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Content features (item vector a):
db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
50. User Model from the User’s Twitter History
Tweets:
- RT: “Government stops renovation of tower bridge” (Oct 13th 2011)
- “I am in London at the moment” (Oct 13th 2011)
- “I am doing sports” (Oct 12th 2011)
User model (vector u):
db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0
Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
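The weighting strategy above is just counting followed by 1-norm normalization. A minimal sketch, using a hypothetical list of DBpedia concepts detected in the tweets (chosen here so the resulting weights match the u vector on the slide):

```python
# Sketch of the slide's weighting strategy: count concept occurrences in
# the user's tweets, then 1-norm normalise so the vector sums to 1.
# The observation list is illustrative, not real extraction output.
from collections import Counter

observations = ["db:London", "db:London", "db:London", "db:London", "db:London",
                "db:Sports", "db:Tower_Bridge", "db:Tower_Bridge",
                "db:Government", "db:Government"]

counts = Counter(observations)            # occurrence frequency
total = sum(counts.values())
profile = {concept: n / total for concept, n in counts.items()}

print(profile["db:London"])               # 0.5, as in the u vector
```

Time-based smoothing (recent concepts count more) would simply replace the raw count `n` with a recency-weighted sum before normalizing.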
51. Matching Candidate Items with the User Profile
Candidate items a, b, c and user profile u:
                 a     b     c     u
db:Politics      0.2   0     0     0
db:Sports        0     0     0.5   0.1
db:Education     0     0     0.2   0
db:London        0.2   0.8   0     0.5
db:Tower_Bridge  0.4   0.2   0     0.2
db:Government    0.1   0     0     0.2
db:UK            0.1   0     0.3   0
Cosine similarities with u: a = 0.67, b = 0.92, c = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c
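The cosine similarities and the ranking on this slide can be reproduced directly from the vectors (concept order as in the table):

```python
# Reproduce the slide's cosine similarities between user profile u and
# candidate items a, b, c.
from math import sqrt

a = [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1]
b = [0.0, 0.0, 0.0, 0.8, 0.2, 0.0, 0.0]
c = [0.0, 0.5, 0.2, 0.0, 0.0, 0.0, 0.3]
u = [0.0, 0.1, 0.0, 0.5, 0.2, 0.2, 0.0]

def cosine(x, y):
    """cos(x, y) = (x . y) / (|x| * |y|)"""
    num = sum(xi * yi for xi, yi in zip(x, y))
    return num / (sqrt(sum(xi * xi for xi in x)) * sqrt(sum(yi * yi for yi in y)))

for name, item in [("a", a), ("b", b), ("c", c)]:
    print(name, round(cosine(u, item), 2))
# a 0.67, b 0.92, c 0.14 -> ranking: b, a, c
```

Note that b wins despite being sparser than a, because almost all of its weight sits on db:London, the dominant concept in u.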
52. Content-based Filtering: Pros & Cons
Pros:
• New item problem may not occur (depends slightly on how the item representation is created)
• Quality may improve over time (e.g. if more profile information is available about a user)
• Implicit user feedback is sufficient
Cons:
• Cold-start problem / New User problem
• Changing User Preferences
• Lack of Diversity / Overfitting
• SPAM (e.g. if item descriptions can be added/modified by users, then SPAM users might add non-valid descriptions)
53. RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior
54. personalization, privacy & laws - are they here or not?
56. Hands-on Teaser
• Your Facebook Friends’ popularity in a spreadsheet
• Locations of your Facebook Friends
• Tag Cloud of your wall posts
image source: http://www.flickr.com/photos/bionicteaching/1375254387/