Serendipity module in ITem Recommender

Michele Filannino [info@filanninomichele.com]
Piero Molino [piero.molino@gmail.com]
Outline

• What is serendipity


• Computer science applications


• Serendipity in the Information
  Filtering


• Evaluation metrics


• A new approach


• Architecture


• Upcoming developments
What is serendipity
Origins [1]

• Christopher Armeno, 1557
  Persian fairy tale “The Three
  Princes of Serendip”


• Horace Walpole, 1754
  “Making discoveries by
  accident and sagacity, of things
  which one is not on quest of”
  inspired by the fairy tale


• Pek van Andel, 1994
  “The art of making an unsought
  finding”
Discoveries and inventions

• The discovery of America by Christopher Columbus


• Gelignite by Alfred Nobel, when he accidentally mixed collodion (gun cotton)
  with nitroglycerin


• Penicillin by Alexander Fleming. He failed to disinfect cultures of bacteria
  when leaving for his vacations, only to find them contaminated with
  Penicillium molds, which killed the bacteria


• The psychedelic effects of LSD by Albert Hofmann


• Cellophane, developed in 1908 by Swiss chemist Jacques
  Brandenberger as a material for a stain-proof tablecloth covering


• The structure of benzene by Friedrich August Kekulé
Serendipity in scientific research [2]

                         • "It should be recognized that
                           serendipitous discoveries are of
                           significant value in the advancement of
                           science and often present the foundation
                           for important intellectual leaps of
                           understanding" - M.K. Stoskopf


                         • Serendipity as the result of broad culture and
                           a curious, open mind


                         • These characteristics make it possible to
                           recognize serendipity when it occurs
Serendipity, creativity and randomness [1] [3]

• Serendipity doesn’t come from randomness, but from events brought to light by
  an activity on the edge between consciousness and unconsciousness


• Classification of “information seekers”


• Role of personal characteristics in serendipity


• “Lateral thinking” and methods (de Bono)
Serendipity equations 1/2 [4]

• P = Problem
  KP = Knowledge domain
  EP = Incorrect knowledge
  M = Inspiring metaphor
  KM = Knowledge domain of inspiring metaphor
  S = Solution
  KN = Additional knowledge


• Conventional creativity
  P1 ∈ (KP1), M ∈ (KM) ⇒ S1 ∈ (KP1, KM, KN)


• Serendipity
  P1 ∈ (KP1), M ∈ (KM) ⇒ P2 ∈ (KP2), S2 ∈ (KP2, KM, KN)
Serendipity equations 2/2 [4]

• P = Problem
  KP = Knowledge domain
  EP = Incorrect knowledge
  M = Inspiring metaphor
  KM = Knowledge domain of inspiring metaphor
  S = Solution
  KN = Additional knowledge


• Serendipity without inspiring metaphor
  P1 ∈ (KP1) ⇒ P2 ∈ (KP2), S2 ∈ (KP2, KN)


• Serendipity from incorrect knowledge
  P1 ∈ (KP1, EP1) ⇒ P2 ∈ (KP2), S2 ∈ (KP2, KN)
Computer Science applications
Max 1/2 [5]

• Software agent that browses the web, simulating human behaviour, searching
  for interesting pieces of information


• The goal is to encourage user creativity by offering new access points to
  information and to lead to serendipity-based discoveries


• It uses IR methods and ad-hoc heuristics
Max 2/2 [5]

• Search and browse process (best fit - threshold)


• External product based heuristic evaluation


• Interaction with users happens via e-mail


• It uses WordNet
Serendipity in the Information Filtering
Yesterday, today and tomorrow...

• “Information research with computer technologies may tend to reduce the
  opportunity for serendipitous information encounters” - Gup (1997-1998)


• “The image of the academic specialist, searching the shelves for a
  serendipitous connection, may seem quaint, but it remains powerful. The
  challenge for the digital library is to preserve this opportunity in cyberspace” -
  Huwe 1999


• There is a level of emotional reaction associated with serendipity that is
  difficult to capture in any metric
Obviousness [6]

• Examples (Travels - White
  Album - Star Trek)


• Ratability: probability that an
  item will be the next item that
  the user will consume (and then
  rate) given what the system
  knows of the user’s profile


• The implicit assumption is that
  a user is always interested in
  the items with the highest
  ratability. This assumption
  holds in classification problems
  but not in recommenders
Novelty vs Serendipity [7] [8]

• Both are examples of non-obviousness


• Novelty: A novel recommendation helps the user find a surprisingly
  interesting item he might have discovered on his own


• Serendipity: A serendipitous recommendation helps the user find a
  surprisingly interesting item he might not have otherwise discovered


• Example: Movie Recommender
Evaluation metrics
Inadequacy of classic metrics [7]

• Classic metrics don’t take into
  account obviousness, novelty
  and serendipity


• Accurate recommendation ≠
  Useful recommendation for the user


• It’s impossible to evaluate the
  degree of serendipity without
  considering user feedback
User-based evaluation [9]

• Users don’t want the algorithm with the best score, but sensible recommendations


• We need to consider the user’s tasks and goals in relation to different
  algorithms to obtain useful recommendations


• Human-Recommender Interaction: Framework to structure interaction
  aspects between human and recommender, based on user experience and
  needs
Suggestions

• Survey users on:


   • Percentage of unknown items
     among all the articles
     suggested


   • Percentage of interesting
     items among all the articles
     suggested


   • Satisfaction with the
     recommendations
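
The two percentages above can be computed directly from survey answers. The following is a minimal sketch, assuming each answer is stored as one record per suggested item; the `Feedback` record and the `survey_percentages` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """One user's answer about one suggested item (hypothetical survey record)."""
    item_id: str
    already_known: bool
    interesting: bool


def survey_percentages(feedback):
    """Return (unknown %, interesting %) over all suggested items."""
    total = len(feedback)
    if total == 0:
        return 0.0, 0.0
    unknown = sum(1 for f in feedback if not f.already_known)
    interesting = sum(1 for f in feedback if f.interesting)
    return 100.0 * unknown / total, 100.0 * interesting / total


# Made-up answers, for illustration only.
answers = [
    Feedback("a", already_known=True, interesting=True),
    Feedback("b", already_known=False, interesting=True),
    Feedback("c", already_known=False, interesting=False),
    Feedback("d", already_known=False, interesting=True),
]
unknown_pct, interesting_pct = survey_percentages(answers)
print(unknown_pct, interesting_pct)
```

Satisfaction is left out on purpose: it would typically be collected on a rating scale, which the slides do not specify.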
Strategies to improve serendipity [3]

• “Blind Luck”: return a random recommendation


• “Prepared Mind”: deep user profiling


• “Anomalies and Exceptions”: search by poor similarity


• “Reasoning by Analogy”: not implemented yet
A new approach
Base assumption

• The user profile doesn’t represent the user’s tastes, as in a classic recommender
  system, but what the user knows


• The user profile can be updated with information not only about purchased
  items but also about searches: if the user searches for something, it is
  either already known or becomes known once the search results are shown


• The user profile can also be updated with information about item
  views
Probability of serendipitous happenings

• Serendipity can’t happen if the user already knows what is recommended


• The lower the probability that the user knows an item, the higher the
  probability of a serendipitous recommendation


• We can say that the probability that the user knows something semantically
  near to what we are sure he knows is higher than the probability that he
  knows something semantically far


• If we measure semantic distance with a similarity metric, it follows that a
  serendipitous recommendation is more probable when we recommend
  something less similar to the user’s profile
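
As an illustration of the last point, the sketch below ranks items from least to most similar to a profile. It assumes profiles and items are sparse term-weight dictionaries compared with cosine similarity; the slides do not name a specific metric, so cosine similarity and the names `cosine` and `least_similar_first` are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def least_similar_first(profile, items):
    """Order item names from least to most similar to the profile."""
    return sorted(items, key=lambda name: cosine(profile, items[name]))


# Made-up profile and items, for illustration only.
profile = {"jazz": 0.9, "piano": 0.4}
items = {
    "jazz_album": {"jazz": 1.0},
    "piano_book": {"piano": 1.0, "music": 0.2},
    "cookbook": {"recipes": 1.0},
}
print(least_similar_first(profile, items))
```

Under the assumption stated in the slides, the items at the head of this list are the ones the user is least likely to know already.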
To support, not to substitute

• A proposal with the intent to promote serendipity can be based on poor
  similarity


• Obviously, in practical use a recommender can’t rely only on
  serendipitous recommendations


• It’s possible to complement a recommendation based on classic methods with a
  serendipitous recommendation that stimulates the user and gives him new
  entry points to the items in the system
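
One way to read "support, not substitute" is to reserve a single slot of the classic recommendation list for a serendipitous item. This is only a sketch under that assumption: the slides do not say how the two lists are combined, and the `blend` function and the one-slot policy are hypothetical.

```python
def blend(classic, serendipitous, k=5):
    """Return up to k items: the top k-1 classic ones plus one serendipitous item.

    Assumes both inputs are lists of item ids ordered best-first and k >= 1.
    The first serendipitous item not already selected fills the last slot.
    """
    selected = list(classic[: k - 1])
    extra = next((item for item in serendipitous if item not in selected), None)
    if extra is not None:
        selected.append(extra)
    return selected


# Made-up item ids, for illustration only.
print(blend(["a", "b", "c", "d"], ["b", "z"], k=3))
```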
Noble and practical objectives

• The objectives are:


   • Noble: to enable the user to learn something new, to make interesting
     discoveries, to find something different from what he is used to, stimulating
     his curiosity


   • Practical: to increase the chance that the user comes to know an item that he
     couldn’t otherwise have known (or that would have been difficult to find
     otherwise), and to improve the overall serendipity rate of the system’s
     recommendations
Architecture
Knowledge profile

• The user profile usually represents the user’s tastes


• A profile that represents the user’s knowledge, areas of interest, and so on
  would be more useful for the implementation of a serendipity module


• For that reason it would be useful to track the page visits and the searches
  made by the user
Inverted profile

• The aim is to search by poor similarity, so the system will create an “inverted”
  version of the user profile


• Replace the tf-idf weights with new weights given by this formula:


    • ∀ w_i ∈ P: nw_i = maxweight(P) - w_i


• w_i is the weight of the word in the i-th position in the original vector,
  maxweight(P) is the highest weight in the profile P, nw_i is the weight of the
  word in the i-th position in the “inverted” vector
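
The inversion formula can be sketched in a few lines. The example assumes the profile is stored as a dictionary mapping each word to its tf-idf weight; that representation and the name `invert_profile` are assumptions made for illustration.

```python
def invert_profile(profile):
    """Apply nw_i = maxweight(P) - w_i to every weight in the profile.

    `profile` maps each word to its tf-idf weight.
    """
    if not profile:
        return {}
    max_weight = max(profile.values())
    return {word: max_weight - weight for word, weight in profile.items()}


# Made-up weights, for illustration only.
profile = {"jazz": 0.75, "piano": 0.5, "opera": 0.25}
inverted = invert_profile(profile)
print(inverted)
```

Note that the word with the highest original weight ends up with weight 0, and words absent from the original profile stay absent from the inverted one.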
Randomness inside the threshold

• To avoid cold-start problems and repeated recommendations, it’s possible to
  select a random recommendation


• Given a list of results ordered by ranking, it’s possible to set a threshold of
  similarity (poor similarity) and to select a random item to recommend inside
  this threshold


• It’s also possible to select a random item to recommend from the x most
  similar (poorly similar) regardless of the similarity (poor similarity). The x is
  proportional to the total number of items in the system
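
Both selection strategies can be sketched as follows. The example assumes `ranked` is a list of `(item, score)` pairs ordered best-first, where the score is the match against the inverted profile (so a higher score means poorer similarity to the original profile), and that the length of `ranked` stands in for the total number of items. The function names and the `fraction` parameter are hypothetical.

```python
import math
import random


def pick_within_threshold(ranked, threshold, rng=random):
    """Pick a random item among those whose score reaches the threshold."""
    candidates = [item for item, score in ranked if score >= threshold]
    return rng.choice(candidates) if candidates else None


def pick_from_top_x(ranked, fraction, rng=random):
    """Pick a random item among the top x, with x proportional to the list size."""
    if not ranked:
        return None
    x = min(len(ranked), max(1, math.ceil(fraction * len(ranked))))
    return rng.choice([item for item, _ in ranked[:x]])


# Made-up ranking, for illustration only.
ranked = [("a", 0.9), ("b", 0.7), ("c", 0.4), ("d", 0.1)]
rng = random.Random(0)  # seeded only to make runs repeatable
print(pick_within_threshold(ranked, 0.6, rng))
print(pick_from_top_x(ranked, 0.5, rng))
```

The first function returns nothing when no item reaches the threshold, while the second always returns an item if the list is non-empty, which is why it suits the cold-start case.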
Structure

[Diagram: the ITR exchanges data with the Serendipity Module. Components inside the module: Inverse Profile Generator, Inverted Profile, Recommendation Generator. Labeled flows: Profile, Query, Query Results, Serendipitous Recommendation.]
Upcoming developments
Future developments

• Implementation of a Reasoner by Analogy


• Implementation of the other algorithms proposed by de Bono and Toms


• Implementation of a system that selects which algorithm to use depending on
  the kind of the user and his task


• Design of a “virtual shopkeeper” to interact with while browsing that analyzes
  the user and his task, places them inside an HRI profile, and changes the
  system accordingly, changing the retrieval algorithm, the filtering approach,
  etc.
Bibliography 1/2

• [1] Anatomy of the Unsought Finding. Serendipity: Origin, History, Domains,
  Traditions, Appearances, Patterns and Programmability - van Andel (1994)


• [2] Serendipity and Information Seeking - Foster & Ford (2003)


• [3] Serendipitous Information Retrieval - Toms (2000)


• [4] The Serendipity Equations - de Figueiredo, Campos (2001)


• [5] Searching the Unsearchable: Inducing Serendipitous Insights - de
  Figueiredo, Campos (2001)
Bibliography 2/2

• [6] Being Accurate is Not Enough: How Accuracy Metrics have hurt
  Recommender Systems - McNee, Riedl, Konstan (2006)


• [7] Evaluating Collaborative Filtering Recommender Systems - Herlocker,
  Konstan, Terveen, Riedl (2004)


• [8] Modern Information Retrieval - Baeza-Yates, Ribeiro-Neto (1999)


• [9] Making Recommendations Better: An Analytic Model for
  Human-Recommender Interaction - McNee, Riedl, Konstan (2006)
