Random Indexing and Quantum Negation for TV-Shows Retrieval and Classification

My final presentation at the end of the internship in Philips Research (Eindhoven)


    1. High Tech Campus, Philips Research Eindhoven, Netherlands • Random Indexing and Quantum Negation for TV-Shows Retrieval and Classification • Cataldo Musto, Ph.D. Student • cataldomusto@di.uniba.it - cataldo.musto@philips.com • University of Bari “Aldo Moro” (Italy), SWAP Research Group • Philips Research Center - Eindhoven (Netherlands) - HI&E Group • 14.07.11
    2. outline • part 1: introduction • information overload, personalization, information filtering, recommender systems • part 2: approaches • vector space model, random indexing, quantum negation • part 3: scenario • tv-show recommendation, description of the data, description of the tasks • part 4: experimental evaluation • results, discussion, future work [C.Musto: Random Indexing and Quantum Negation for TV shows Retrieval and Classification - Philips Research, Eindhoven (The Netherlands) - 14.07.11]
    3. part 1: introduction • what are we talking about?
    4. TV
    5. text messages
    6. phone calls
    7. internet navigation
    8. scenario • Daily interaction with electronic devices • Email, Web navigation, social media, instant messaging • Continuous flow of information • In 2007, 500,000 terabytes of information were produced on the Web in one year • Including telephone, radio, TV and so on, we reach 18 exabytes of data!
    9. information overload • Consequence: cognitive overload • It is impossible to deal effectively with this surplus of information • It is difficult to quickly find the information we really need
    10. Solution: personalization
    11. information filtering • “An information filtering system is a system that removes redundant or unwanted information from an information stream using automated methods” (Wikipedia)
    12. information filtering systems • How do they work? • Usually, in three steps • Training • User Modeling • Filtering
    13. Step 1: Training
    14. Step 2: User Modeling
    15. Step 3: Filtering
    16. recommender systems • A specific type of information filtering system that attempts to recommend information items (films, television, video on demand, music, books, etc.) that are likely to be of interest to the user • Every day we interact with recommender systems, even if we do not know it!
    17. Amazon
    18. YouTube
    19. recommendation approaches • Content-based filtering • No interactions between users: each user is an atomic entity • Prerequisite: each item to be recommended has to be described through a set of textual features • The user profile stores the features that often occur in the items she likes • Assumption: if a user usually likes items whose descriptions often contain a specific feature, we can assume that she will also like such items in the future • e.g. • If User_A likes a news item containing the features “Football” and “Internazionale FC” • We can recommend her other news about Football or Internazionale FC
    20. part 2: approaches • vector space model, random indexing, quantum negation
    21. vector space model • Introduced by Salton in 1975 • Given a set of M documents (items) d = (d1, ..., dM) • Given N features describing the documents • Each document (item) is represented in an N-dimensional vector space • The whole corpus is represented in an N×M matrix called the term/document matrix
    22. vector space model • VSM in a recommendation scenario • Document: point in the vector space • User profile: point in the vector space • e.g. built as the sum of the vector space representations of the documents the user liked in the past • Goal: to find the documents that are most relevant for that user profile • Assumption • the documents most similar to the profile in the vector space are the most relevant ones • Cosine similarity is used to compute the similarity between query and documents
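The profile-building and ranking idea on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used in the work: the function names, the three toy features and the document ids are assumptions made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def build_profile(liked_vectors):
    """User profile as the element-wise sum of the vectors of the liked documents."""
    return [sum(column) for column in zip(*liked_vectors)]

def rank(profile, documents):
    """Document ids sorted by decreasing cosine similarity to the profile."""
    return sorted(documents,
                  key=lambda doc_id: cosine(profile, documents[doc_id]),
                  reverse=True)

# Hypothetical term weights over the features [football, inter, cooking].
documents = {
    "match_report": [2.0, 1.0, 0.0],
    "transfer_news": [1.0, 2.0, 0.0],
    "recipe": [0.0, 0.0, 3.0],
}
profile = build_profile([documents["match_report"]])
ranking = rank(profile, documents)
```

Because the profile lives in the same space as the documents, the same `cosine` function serves both for matching a profile and for matching a query.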
    23. vsm analysis (2) • Weak Points • Not incremental • The whole vector space has to be generated from scratch whenever a new item is added to the repository • High dimensionality • NLP operations (stopword elimination, stemming and so on) • Does not manage negative evidence • The vector space representation only depends on the features that occur in the document; there are no assumptions about the features that do not occur • Does not manage the latent semantics of documents • Any permutation of the terms in a document has the same VSM representation!
    24. idea • To introduce tools and techniques able to overcome these drawbacks • Random Indexing • Dimensionality reduction technique (Sahlgren, 2005) • Quantum Negation • Based on Quantum Logic (Widdows, 2007)
    25. random indexing • Random Indexing (RI) is an incremental and effective technique for dimensionality reduction • Distributional Models • Assumption: we can infer information about terms by analyzing how they are used in a large corpus of data • Based on the so-called “Distributional Hypothesis” • “Words that occur in the same context tend to have similar meanings” • “Meaning is its use” (Wittgenstein)
    26. how does it work? • Random Indexing reduces the original high-dimensional term/doc matrix to a new lower-dimensional matrix
    27. how does it work? • How? • By multiplying the original matrix by a random one, built in an incremental way • formally: $A_{n,m} \cdot R_{m,k} = B_{n,k}$ • k << m • After projection, the distances between points in the vector space are approximately preserved • Johnson-Lindenstrauss Lemma
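To make the projection step concrete, here is a small pure-Python sketch of a sparse random projection. It is an illustration under assumptions, not the deck's implementation: the entry distribution (+1 and -1 with probability 1/6 each, 0 otherwise), the scaling factor sqrt(3/k) and the sizes m = 1000, k = 200 are choices made for the example.

```python
import math
import random

def random_matrix(m, k, rng):
    """m x k matrix with entries +1 / -1 (probability 1/6 each) and 0 otherwise."""
    choices = [1, -1, 0, 0, 0, 0]
    return [[rng.choice(choices) for _ in range(k)] for _ in range(m)]

def project(vector, R):
    """Multiply a length-m row vector by the m x k matrix R, scaled by sqrt(3/k)."""
    k = len(R[0])
    scale = math.sqrt(3.0 / k)
    out = [0.0] * k
    for value, row in zip(vector, R):
        if value != 0:
            for j, r in enumerate(row):
                if r:
                    out[j] += value * r
    return [scale * x for x in out]

def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
m, k = 1000, 200          # original and reduced dimensionality (k << m)
R = random_matrix(m, k, rng)
a = [rng.random() for _ in range(m)]
b = [rng.random() for _ in range(m)]
# Ratio between the distance after and before projection; expected to be close to 1.
ratio = distance(project(a, R), project(b, R)) / distance(a, b)
```

The scaling keeps the expected squared length of a vector unchanged, which is why the distance ratio stays near 1 even though the vectors shrink from 1000 to 200 components.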
    28. random matrix • How is the random matrix built? • The whole process is based on the concept of “context” • Given a term, its “context” could be the whole document, a paragraph, a sentence, a sliding window of words and so on • The definition of the context influences the structure of the matrix • The matrix is built in an iterative and incremental way • The vector representing each document depends on the terms that occur in it • The vector representing each term depends on its context
    29. item representation • A context vector is assigned to each context (for simplicity, we take the whole document as the context) • This vector has a fixed dimension (k) and can contain only values in {-1, 0, 1}. Values are distributed randomly, but the number of non-zero elements is much smaller than the number of zeros • The vector space representation of a term is obtained by summing the context vectors of all its contexts (the documents it occurs in) • The vector space representation of a document (item) is obtained by summing the context vectors of the terms that occur in it • Output: lower-dimensional vector space representation based on random context vectors
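The construction described on this slide can be sketched as follows. This is a minimal illustration that takes the whole document as the context; the function names, the vector size k = 20 and the number of non-zero entries are assumptions chosen for readability, not values from the experiments.

```python
import random
from collections import defaultdict

def context_vector(k, nonzero, rng):
    """Sparse ternary vector of length k with `nonzero` entries set to +1 or -1."""
    vec = [0] * k
    for pos in rng.sample(range(k), nonzero):
        vec[pos] = rng.choice([1, -1])
    return vec

def add_into(target, source):
    """Add `source` to `target` element-wise, in place."""
    for i, value in enumerate(source):
        target[i] += value

def random_indexing(docs, k=20, nonzero=4, seed=0):
    """docs: dict doc_id -> list of terms. Returns (term_vectors, doc_vectors)."""
    rng = random.Random(seed)
    # One random context vector per document (the whole document is the context).
    contexts = {doc_id: context_vector(k, nonzero, rng) for doc_id in docs}
    # Term vector: sum of the context vectors of the documents the term occurs in.
    term_vectors = defaultdict(lambda: [0] * k)
    for doc_id, terms in docs.items():
        for term in terms:
            add_into(term_vectors[term], contexts[doc_id])
    # Document vector: sum of the vectors of the terms that occur in the document.
    doc_vectors = {}
    for doc_id, terms in docs.items():
        vec = [0] * k
        for term in terms:
            add_into(vec, term_vectors[term])
        doc_vectors[doc_id] = vec
    return dict(term_vectors), doc_vectors

docs = {"d1": ["a", "b"], "d2": ["b", "c"]}
term_vectors, doc_vectors = random_indexing(docs)
```

Adding a new document only requires drawing one more context vector and updating the vectors of its terms, which is what makes the approach incremental.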
    30. quantum negation • Random Indexing alone is still not capable of managing negative evidence • RI can be coupled with the Quantum Negation (QN) operator • Definition inherited from quantum logic • Negation as a form of orthogonality between vectors • Given two vectors A and B, we can define the vector A NOT B • It represents the projection of the vector A onto the subspace orthogonal to the one generated by vector B • In a recommendation scenario, this operator could be used to combine two vectors, the first representing positive evidence and the second representing negative evidence
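For a single negated vector, the orthogonal projection above has a simple closed form: A NOT B = A - (A·B / B·B) B. The sketch below shows it on a toy example; the function names are assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def negate(a, b):
    """Return 'a NOT b': a minus its projection onto b, so the result is orthogonal to b."""
    bb = dot(b, b)
    if bb == 0:
        return list(a)  # nothing to remove when b is the zero vector
    coeff = dot(a, b) / bb
    return [x - coeff * y for x, y in zip(a, b)]

a = [3.0, 1.0, 0.0]
b = [1.0, 0.0, 0.0]
a_not_b = negate(a, b)  # [0.0, 1.0, 0.0]
```

The component of `a` along `b` is removed entirely, so any document that is similar to `a` only because of what it shares with `b` loses that similarity.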
    31. ...summing up • VSM is an effective model for document retrieval • It can be exploited in recommendation scenarios • It suffers from some well-known drawbacks • Solutions • Random Indexing is an incremental and effective approach that can address the high-dimensionality problem • Quantum Negation can effectively model negative evidence • The combined use of RI and QN is a good alternative to VSM, especially for real-life scenarios
    32. part 3: scenario • tv-shows recommendation
    33. Scenario: EPG (Electronic Program Guide) personalization
    34. scenario • Given a set of TV shows, we want to provide the user with a set of suggestions about the shows she should watch, according to her preferences
    35. approach • Currently the recommendation model is implemented through the Vector Space Model (VSM)
    36. data • TV shows gathered from a set of 47 German-language broadcast channels • Each TV show is described through a set of textual features (title, synopsis, description, etc.) gathered from an XML feed • Each TV show is mapped to a fixed program type (Movie, Sport, Documentary, Magazine, etc.)
    37. problems • How to represent the data? • We compared two approaches • Bag of Words (BOW) • Tag.me • What are the typical use cases? • We identified two tasks • Classification Task • Retrieval Task
    38. data representation • Bag of Words • Each item i is described through the words that appear in the text • Weighting of the words • Counting of the occurrences, normalization, TF-IDF weighting, etc.
    39. BOW representation • To improve the BOW representation • Textual descriptions are usually very noisy • Full of uninformative words • Further processing can improve the classical BOW representation • Stopword removal: filtering out all the uninformative words (articles, adverbs, adjectives and so on)
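A bag-of-words representation with stopword removal and TF-IDF weighting could look like the sketch below. It assumes one common TF-IDF variant (relative term frequency times log(N / document frequency)); the slides do not say which variant was used, and the tiny stopword set and the toy German tokens are made up for the example.

```python
import math
from collections import Counter

# Hypothetical stopword list; the experiments used a fixed list of 996 German stopwords.
STOPWORDS = {"der", "die", "das", "und", "ein"}

def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop uninformative words before weighting."""
    return [t for t in tokens if t not in stopwords]

def tf_idf(docs):
    """docs: dict doc_id -> list of tokens. Returns doc_id -> {term: weight}."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    weights = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[doc_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights

raw = {
    "show1": ["der", "krimi", "und", "mord", "krimi"],
    "show2": ["ein", "fussball", "mord"],
}
docs = {doc_id: remove_stopwords(tokens) for doc_id, tokens in raw.items()}
weights = tf_idf(docs)
```

A term that appears in every document (here "mord") gets weight 0, while a term concentrated in one document (here "krimi") gets a positive weight.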
    40. data representation • Tag.me • Online tool developed by the University of Pisa (Italy) • Goal: to identify Wikipedia concepts that occur in the text • Idea: to process the original text through Tag.me in order to avoid noise and provide a novel representation based on high-level Wikipedia concepts
    41. tag.me web interface
    42. final output • BOW • Tag.me
    43. description of the tasks • task 1: classification • Given a flow of TV shows, we want to classify them against the set of program types • task 2: retrieval • Given a set of program types and a repository of TV shows, we want to retrieve the shows that belong to a specific program type
    44. VSM for TV shows classification • Steps • 1) Build a vector space for the TV shows • 2) Build a vector for each program type • 3) Use cosine similarity to compare TV shows and program types • 4) Assign the TV show to the program type that got the highest cosine similarity
    45. VSM for TV shows classification • Step 1: build a vector space representation of the TV shows • For each TV show we collected a set of words from the synopsis and the title of the show • We filtered this set of words through a fixed list of 996 German stopwords • We calculated the TF-IDF scores for each document
    46. VSM for TV shows classification • Step 2: build a vector for each program type • Given the vector space representation of each document • The vector space representation of each program type is the sum of the vector space representations of the TV shows that belong to that program type
    47. VSM for TV shows classification • Given a set of TV shows • T = (s1, ..., sn) • Given a set of program types • P = (t1, ..., tm) • We define a function pt: T → P • It returns the program type of a TV show • We can build the set S(t_i) as the set of the TV shows that belong to t_i
    48. VSM for TV shows classification • Given the set S(t_i) with cardinality k, the vector space representation of the program type t_i is simply given by the sum of the vectors of its shows: $\vec{t_i} = \sum_{j=1}^{k} \vec{s_j}$, with $s_j \in S(t_i)$
    49. VSM for TV shows classification • Step 3 and Step 4 • Given the vector space representations of both program types and TV shows • Cosine similarity is used to compare each TV show against the set of program types • We assign the TV show to the program type that gets the highest cosine similarity
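Steps 2 to 4 amount to a prototype-based classifier: one summed vector per program type, then an argmax over cosine similarities. The following sketch illustrates this on toy data; the function names, show ids and three-dimensional vectors are assumptions for the example, not data from the experiments.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def program_type_vectors(show_vectors, show_types):
    """Step 2: sum the vectors of the shows that belong to each program type."""
    prototypes = {}
    for show_id, vec in show_vectors.items():
        ptype = show_types[show_id]
        current = prototypes.get(ptype, [0.0] * len(vec))
        prototypes[ptype] = [p + x for p, x in zip(current, vec)]
    return prototypes

def classify(vec, prototypes):
    """Steps 3 and 4: pick the program type with the highest cosine similarity."""
    return max(prototypes, key=lambda ptype: cosine(vec, prototypes[ptype]))

# Hypothetical training shows with their program types.
show_vectors = {"s1": [1.0, 0.0, 0.0], "s2": [2.0, 1.0, 0.0], "s3": [0.0, 0.0, 1.0]}
show_types = {"s1": "Sport", "s2": "Sport", "s3": "Movie"}
prototypes = program_type_vectors(show_vectors, show_types)
predicted = classify([1.0, 0.5, 0.1], prototypes)
```

The same two functions work unchanged on vectors reduced by Random Indexing, since only the dimensionality of the input differs.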
    50. RI for TV shows classification • Steps • 1) Build a vector space for the TV shows • 2) Reduce the vector space through the Random Indexing algorithm • 3) Build a vector for each program type in the (reduced) vector space • 4) Use cosine similarity to compare TV shows and program types • 5) Assign the TV show to the program type that got the highest cosine similarity
    51. RI for TV shows retrieval • Steps • 1) Build a vector space for the TV shows • 2) Reduce the vector space through the Random Indexing algorithm • 3) Build a positive vector for each program type in the (reduced) vector space • 4) Use cosine similarity to compare TV shows and program types • 5) Rank the TV shows and assign the first N to the program type
    52. RI+QN for TV shows retrieval • Steps • 1) Build a vector space for the TV shows • 2) Reduce the vector space through the Random Indexing algorithm • 3) Build a positive vector for each program type in the (reduced) vector space • 4) Build a negative vector for each program type in the (reduced) vector space • 5) Use cosine similarity to compare TV shows with both the positive and the negative program type vectors • 6) Rank the TV shows and assign the first N to the program type
    53. RI+QN for TV shows retrieval • Given a set of TV shows • T = (s1, ..., sn) • Given a set of program types • P = (t1, ..., tm) • We define a function pt: T → P • It returns the program type of a TV show • We can build the set S(t_i) as the set of the TV shows that belong to t_i
    54. RI+QN for TV shows retrieval • Given the set S(t_i) and its complement, with cardinalities k and z respectively, a positive and a negative vector are built for the program type t_i • The positive and negative vectors are combined in order to emphasize the features that occur in the positive vector and avoid the ones that occur in the negative one
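One way to read the RI+QN retrieval steps is sketched below. It rests on assumptions: the positive vector is taken as the sum over S(t_i), the negative vector as the sum over its complement, and the two are combined as "positive NOT negative" before ranking by cosine similarity. The slides do not preserve the exact formulas, so this is an interpretation rather than the method actually evaluated; all names and the toy vectors are hypothetical, and for brevity the same shows are used for building the vectors and for ranking.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    norms = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norms if norms else 0.0

def negate(a, b):
    """'a NOT b': remove from a its projection onto b."""
    bb = dot(b, b)
    if bb == 0:
        return list(a)
    coeff = dot(a, b) / bb
    return [x - coeff * y for x, y in zip(a, b)]

def sum_vectors(vectors, size):
    total = [0.0] * size
    for vec in vectors:
        total = [t + x for t, x in zip(total, vec)]
    return total

def build_query(show_vectors, show_types, target_type, use_negation=True):
    """Positive vector from S(t_i), negative vector from its complement, then combine."""
    size = len(next(iter(show_vectors.values())))
    positive = sum_vectors(
        (v for s, v in show_vectors.items() if show_types[s] == target_type), size)
    negative = sum_vectors(
        (v for s, v in show_vectors.items() if show_types[s] != target_type), size)
    query = negate(positive, negative) if use_negation else positive
    return query, negative

def retrieve(show_vectors, show_types, target_type, n, use_negation=True):
    """Rank all shows against the query vector and return the first n."""
    query, _ = build_query(show_vectors, show_types, target_type, use_negation)
    ranked = sorted(show_vectors,
                    key=lambda s: cosine(query, show_vectors[s]), reverse=True)
    return ranked[:n]

# Hypothetical reduced vectors over three dimensions.
show_vectors = {
    "s1": [2.0, 0.0, 1.0], "s2": [1.0, 0.0, 1.0],
    "s3": [0.0, 2.0, 1.0], "s4": [0.0, 1.0, 1.0],
}
show_types = {"s1": "Sport", "s2": "Sport", "s3": "Movie", "s4": "Movie"}
top_sport = retrieve(show_vectors, show_types, "Sport", 2)
```

Setting `use_negation=False` gives the plain RI retrieval of slide 51, which makes it easy to compare the two rankings on the same data.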
    55. ...summing up • Classification task • Comparison of VSM and RI • We built a vector space • Applied RI to reduce the vector space • We tried to classify TV shows in the complete vector space and in the reduced one, comparing the accuracy • Retrieval task • Comparison of RI and RI+QN • We built a vector space • Applied RI to reduce the vector space • Built both positive and negative program type vectors and applied QN • We tried to retrieve TV shows and compared RI without negation against RI with negation
    56. part 4: experimental evaluation • results, discussion, future work
    57. dataset • tv shows: 133,579 • program types: 17 • features (BOW): 306,006 • features (Tag.me): 74,599 • avg features (BOW): 42.11 • avg features (Tag.me): 9.21
    58. experimental design • 10-fold cross validation • Dataset split into 10 partitions • 9 partitions for training the models, the remaining one for testing • Results averaged over all the partitions
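The evaluation protocol can be sketched as a generic k-fold loop. This is a minimal illustration: `k_fold_splits`, `cross_validate` and the `evaluate` callback are hypothetical names, and the shuffling with a fixed seed is an assumption (the slides do not say how the partitions were drawn).

```python
import random

def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) pairs; each of the k partitions is the test set exactly once."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

def cross_validate(items, evaluate, k=10):
    """Average the score returned by evaluate(train, test) over the k partitions."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(items, k)]
    return sum(scores) / len(scores)

splits = list(k_fold_splits(range(100)))
```

In this setting `evaluate` would train the model on `train` and return its precision on `test`.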
    59. metrics • classification task • precision • retrieval task • precision@n • precision@k%
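The formulas on this slide are not preserved in the transcript. The sketch below uses the usual textbook definitions as an assumption: classification precision as the fraction of shows assigned to their correct program type, precision@n as the fraction of relevant shows among the first n results, and precision@k% as precision over the first k percent of the ranking. The function names are hypothetical.

```python
def precision(predicted, actual):
    """Fraction of shows whose predicted program type equals the actual one."""
    correct = sum(1 for show in predicted if predicted[show] == actual[show])
    return correct / len(predicted)

def precision_at_n(ranking, relevant, n):
    """Fraction of the first n ranked shows that are relevant."""
    return sum(1 for show in ranking[:n] if show in relevant) / n

def precision_at_k_percent(ranking, relevant, k_percent):
    """Precision over the first k percent of the ranking (at least one item)."""
    n = max(1, round(len(ranking) * k_percent / 100))
    return precision_at_n(ranking, relevant, n)

ranking = ["s1", "s2", "s3", "s4"]
relevant = {"s1", "s3"}
p_at_2 = precision_at_n(ranking, relevant, 2)             # 0.5
p_at_50 = precision_at_k_percent(ranking, relevant, 50)   # 0.5
```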
    60. tuning of parameters • Random Indexing algorithm • Dimension of the vectors • Classification task: 500, 700 • Retrieval task: 500, 1000, 1500, 2000 • Minimum number of occurrences • Classification task: 2 • Retrieval task: 1, 3 • Training cycles • Classification task: 1, 2 • Retrieval task: 1
    61. classification task - results • columns: size, occur., cycles, tag.me, bow • 500, 2, 1, 37.38, 42.91 • 700, 2, 1, 40.28, 47.76 • 500, 2, 1, 44.61, 54.32 • 700, 2, 1, 45.33, 54.33
    62. classification task: comparison • 68.7 • 54.3 • 54.3 • 47.7 • 42.9
    63. classification - results per program type
    64. classification task - outcomes • BOW better than Tag.me • Tag.me representation too poor • Difficult to learn a solid and effective model for text classification • The dimension of the vector space and a second training cycle affect the predictive accuracy • RI does not outperform the baseline • Vector space reduced by over 99% (from 133,579 to 500 or 700) • Too much loss of information • but • Breaking the results down by program type, Random Indexing got better results in 10 out of 17 program types • The reasons for this need to be investigated
    65. retrieval task - bow - p@n • 82.6% • 66.3%
    66. retrieval task - bow - p@n • 65.9% • 45.2%
    67. retrieval task - bow - p@n • 58.1% • 36.5%
    68. retrieval task - bow - p@k% • 86.0% • 58.1%
    69. retrieval task - bow - p@k% • 55.4% • 35.4%
    70. retrieval task - tagme - p@n • 61.9% • 47.9%
    71. retrieval task - tagme - p@n • 53.7% • 40.9%
    72. retrieval task - tagme - p@n • 51.6% • 39.0%
    73. retrieval task - tagme - p@k% • 76.6% • 57.9%
    74. retrieval task - tagme - p@k% • 49.6% • 35.4%
    75. retrieval task - overview • 82.6% • 61.9%
    76. retrieval task - overview • 65.0% • 53.0%
    77. retrieval task - overview • 58.3% • 53.2%
    78. retrieval task - outcomes • BOW always better than Tag.me • Between 5% and 20% difference • Parameters do not affect the accuracy • The QN operator improves the retrieval accuracy by almost 20%
    79. conclusions & future work • In scenarios where the recommender system has to deal with a continuous flow of information, the VSM is not suitable • RI is able to effectively address typical VSM drawbacks • Classification task • Even if its accuracy is lower, these preliminary results need to be further investigated, for example by testing the algorithm with different parameter values • Is a drop in precision acceptable for an algorithm that provides a big improvement in scalability and efficiency? • Retrieval task • QN improves the predictive accuracy of the model in the retrieval task • As QN is a novel operator, this is an important outcome with a good scientific impact
    80. Thanks for your attention. • Cataldo Musto, Ph.D. Student • cataldomusto@di.uniba.it - cataldo.musto@philips.com • University of Bari “Aldo Moro” (Italy), SWAP Research Group • Philips Research Center - Eindhoven (Netherlands) - HI&E Group • 14.07.11
