I Want to Answer, Who Has a Question? Yahoo! Answers Recommender System
1. Authors: Gideon Dror, Yehuda Koren, Yoelle Maarek, Idan Szpektor. Publication: KDD 2011. Presenter: Po-chih Chen. Title: I Want to Answer, Who Has a Question? Yahoo! Answers Recommender System. Copyright 2011 ACM
2. Outline: Introduction; Related work and background; Problem characterization; A multi-channel recommender system model; Empirical study; Conclusions.
3. Introduction: Despite the continuous progress of Web search engines, many user needs still remain unanswered, either because the intent behind the query is not well expressed or because relevant content is absent. While Community Question Answering (CQA) sites can feature factoid questions, their primary goal is to satisfy needs such as opinion seeking, recommendation, open-ended questions, and problem solving.
4. Introduction (cont.): Searching for questions to answer is a different challenge from regular Web search, because users are driven by more than content similarity or page popularity. The figure on this slide shows a snapshot of a list of recent questions at a given time. Are you waiting for a question that you want to answer?
5. Introduction (cont.): The paper addresses the “answering mood” need by suggesting the “right questions” to potential answerers. The authors propose a multi-channel recommender (MCR) system, which accounts for the multiple dimensions of the data.
6. Related work and background: Yahoo! Answers. At the time of the paper, it is the largest existing CQA site. A question thread proceeds as follows: the question remains “open” for four days, with an option for extension, or for less if the asker chooses a best answer; if no best answer is chosen, the question goes “in-voting”; after these steps, the question is considered “resolved.” The site has high variance in perceived question and answer quality. The authors' research focuses on a complementary task: matching questions to users before answers are written.
7. Related work and background (cont.): Recommender systems are based on two different strategies. Collaborative filtering (CF) relies on analyzing relationships between users and interdependencies among products in order to identify new user-item matches. Content analysis (CA) techniques create a characterizing profile for each user or product; the resulting profiles allow programs to associate users with matching products, which is useful for solving cold-start scenarios.
8. Recommender systems (cont.): The two primary schools of CF are latent factor models, which explain ratings by characterizing both items and users on factors inferred from rating patterns, and neighborhood methods, which compute the relationships among users and estimate unknown ratings based on recorded ratings of like-minded users. The authors' method introduces a novel, symmetric integration of CF with CA approaches that allows exploiting behavioral signals together with user and question attributes.
9. Problem characterization: The task of recommending questions brings challenges that have been less well addressed, and these induce the unique design criteria for the authors' model. The first factor to consider is that different families of item descriptors need to be exploited. A second factor comes from the need to account for multiple kinds of interactions, of different intensities, between users and questions. When data per user and item is scarce, exploiting these diverse types of user-item interactions is vital.
10. A multi-channel recommender system model: This section introduces the Multi-Channel Recommender system model (MCR) for assessing the match between a user and a question. It covers (1) how questions and users are mapped into their attribute representation (question attributes, user attributes); (2) how multiple features are derived from the multi-channel attributes of the users and questions (interaction features, bias features); and (3) how user-specific and question-specific features are incorporated into MCR and how the model is trained.
11. Question attributes: Question attributes are split into three families: textual, categories and user IDs. Textual family: this family encodes textual information and takes text tokens as values. For each text block, the tokenizer annotates each word with its part-of-speech (POS) tag and lemma. The extracted terms are counted separately within each field, producing four sets of (term, count) pairs as the values of four attributes. The authors then filter out non-representative terms.
12. Question attributes (cont.): Filtering out non-representative terms. For every question, the authors retain only terms that are nouns, verbs or adjectives, based on their POS tags. Then each term t is ranked by its “usefulness” L(t). They define “usefulness” as the entropy of the distribution of categories given t, where C is the set of all categories in Yahoo! Answers and #c(t) is the number of times term t appeared in text fields within category c.
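The formula on this slide did not survive extraction, so the following is only a minimal Python sketch of an entropy-based score under stated assumptions: the category distribution given t is taken to be #c(t) divided by the total count of t over all categories, and the natural logarithm is used. The function name `category_entropy` and the toy counts are hypothetical. How the authors turn the score into a ranking (presumably a term concentrated in few categories, with low entropy, is more representative) is not stated on the slide.

```python
import math

def category_entropy(counts):
    """Entropy of the category distribution for a single term.

    counts maps a category name to #c(t), the number of times the term
    appeared in text fields within that category (assumed estimate).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        if n > 0:  # categories with zero count contribute nothing
            p = n / total
            entropy -= p * math.log(p)
    return entropy

# Toy counts (made up for illustration only).
term_counts = {
    "goalkeeper": {"Sports": 40, "Pets": 0, "Health": 0},
    "thing": {"Sports": 10, "Pets": 10, "Health": 10},
}
scores = {term: category_entropy(c) for term, c in term_counts.items()}
```

In this toy data, "goalkeeper" occurs in a single category and gets entropy 0, while "thing" is spread uniformly over three categories and gets the maximum entropy for three categories.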
13. Question attributes (cont.): Category family. The category family reflects the category of the question, which the user has to select from a predefined taxonomy. The authors use the user-selected category as a direct attribute, and also add the parent and grandparent categories, when available, in order to inherit semantic similarities.
14. Question attributes (cont.): User IDs family. Users can interact with a question in various ways, each deserving a different treatment: asker (the user asking the question); best answerer (the user who provided the best answer); answerers (other users who answered the question); question voters (users who starred the question as interesting); answer voters (users who voted on the quality of individual answers by “thumb up/down” votes); best answer selectors (users who participated in the best answer voting process); question tracers (users who requested to receive updates on the question).
15. Question attributes (cont.): Formal question attributes model. A question q is described by an attribute matrix Qq. The d1 columns of the matrix correspond to the individual textual tokens, categories and users; the d2 rows correspond to the attributes. Qq[i][j] holds the count for term j of attribute i. Example: Qq[title][football] = 1; d2 = 14.
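Since most entries of such a matrix are zero, one way to picture it is as a sparse mapping from attribute to value counts. The sketch below is an illustration only: the helper `build_question_attributes` is hypothetical, and apart from "title" the attribute names are assumptions (the slides give d2 = 14 but do not list all rows by name here).

```python
from collections import Counter

def build_question_attributes(fields):
    """Return a sparse attribute matrix: attribute name -> Counter of values.

    Each Counter plays the role of one row of Qq; a missing value
    counts as zero.
    """
    return {name: Counter(values) for name, values in fields.items()}

# Hypothetical question with three of its attributes filled in.
Qq = build_question_attributes({
    "title": ["football", "injury"],
    "category": ["Sports"],
    "asker": ["user_17"],
})

print(Qq["title"]["football"])
```

Looking up a value that never occurred, such as `Qq["title"]["rugby"]`, yields 0 because `Counter` returns zero for missing keys.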
16. User attributes: Users may explicitly pick their preferences over attributes within each of the attribute families. Question-driven attributes: the authors do not want to arbitrarily weigh the relative importance of each type of interaction with questions and answers. They keep the interaction types separate by adding another dimension to the user repository, called channels. The channels qualify the user's interaction with the questions: asked; best answered; answered; voted on question; voted on an associated answer; voted on best answer; traced the question.
17. User attributes (cont.): Question-driven attributes (cont.). Channels serve a different purpose: associating a user with questions. Each channel aggregates properties from the questions corresponding to a certain kind of interaction. The model describes 49 kinds of user-user interaction, the Cartesian product of the two identical 7-tuples. Explicit user attributes: one more channel expresses direct user preferences. A user can explicitly specify which keywords and categories they are interested in or which other users they would like to follow. In this channel, however, the textual and category families remain empty.
18. User attributes (cont.): Formal user attributes model. A user u is represented by a 3-dimensional tensor. The first dimension corresponds to the channels of interaction (d3 = 8). The other two dimensions correspond to attributes and values, in analogy to the question representation. Each channel c is built from the set of questions with which user u interacted through channel c.
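The aggregation formula is missing from the extracted slide, so the following Python sketch only illustrates the idea under an assumption: each channel simply sums the attribute counts of the questions the user touched through that channel. The paper may weight or normalize differently. The function name `build_user_tensor` and the toy questions are hypothetical; the channel names follow the list of channels given earlier in the deck.

```python
from collections import Counter, defaultdict

def build_user_tensor(interactions):
    """Aggregate question attributes per channel.

    interactions: iterable of (channel, question_attributes) pairs, where
    question_attributes maps attribute name -> Counter of values.
    Returns a sparse 3-d structure: channel -> attribute -> Counter.
    """
    tensor = defaultdict(lambda: defaultdict(Counter))
    for channel, q_attrs in interactions:
        for attribute, values in q_attrs.items():
            # Assumed aggregation: plain sum of counts over questions.
            tensor[channel][attribute].update(values)
    return tensor

q1 = {"title": Counter({"football": 1}), "category": Counter({"Sports": 1})}
q2 = {"title": Counter({"football": 2, "boots": 1})}

Uu = build_user_tensor([("answered", q1), ("answered", q2), ("asked", q1)])
```

Here the user answered both questions and asked the first one, so the "answered" channel accumulates the title counts of both questions while the "asked" channel only reflects the first.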
19. Interaction features: These features are used by a classifier to evaluate the match between the user and the question. Pairing each question attribute with each user attribute creates multiple features. For each pair of question and user attributes of the same family, the authors create a distinct interaction feature by measuring the cosine similarity between the corresponding attribute vectors. Let t be one of the question attributes and s be one of the user attributes under channel c; the interaction feature resulting from matching s and t under c is given as an inner product of the corresponding attribute vectors.
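As a rough illustration of one such interaction feature, the sketch below computes the cosine similarity between a question attribute vector and a user attribute vector, both stored as sparse counts. The function name and the example vectors are hypothetical, and returning 0 for an empty vector is an assumption rather than something the slides specify.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention when one side has no data
    return dot / (norm_a * norm_b)

# Question attribute t = title; user attribute s = title under the
# "answered" channel (toy values).
question_title = Counter({"football": 1, "injury": 1})
user_answered_title = Counter({"football": 3, "boots": 1})

feature = cosine_similarity(question_title, user_answered_title)
```

Repeating this for every same-family pair (s, t) and every channel c yields the full set of interaction features that the classifier consumes.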
20. Bias features: For example, questions that have already received several answers are less attractive to users who aim for best-answer votes. The authors address such intuitions by adding 5 user-specific and question-specific biases as features to each question-user pair.
21. Empirical study: Experimental setup. The authors built user profiles based on past user activity and then, at test time, matched these users to new questions. User profiles were constructed from four consecutive months of Yahoo! Answers activity logs. New questions were then taken from the following fifth month.
22. Empirical study (cont.): Model training. The authors trained the MCR model using several linear and non-linear classifiers. The best results were achieved by Gradient Boosted Decision Trees (GBDT). Because the feature space is not very large, they could afford a complex classifier. Four parameters control GBDT: number of trees; size of each tree; shrinkage (or “learning rate”); sampling rate. In their setup the parameter settings are: #trees=100, tree-size=20, shrinkage=1, and sampling-rate=0.5.
23. Baseline models: In the baseline models, the match score is the sum, over all channels and attributes, of the product of the feature weights in the question model and in the user model. Here c ranges over all the possible channels, s and t range over all the possible user and question attributes, and wc are manually set channel weights.
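The scoring formula itself is missing from the extracted slide. The Python sketch below shows one plausible reading, with loud assumptions: user and question attributes are paired only when they share the same name, channels without an explicit weight default to wc = 1, and the names `inner_product` and `baseline_score` are hypothetical.

```python
def inner_product(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    return sum(value * b.get(key, 0.0) for key, value in a.items())

def baseline_score(user, question, channel_weights):
    """Sum of w_c * <user[c][s], question[t]> over channels and attributes.

    user: channel -> attribute -> {value: weight}
    question: attribute -> {value: weight}
    Assumption: s and t are paired only when they have the same name.
    """
    score = 0.0
    for channel, attributes in user.items():
        w_c = channel_weights.get(channel, 1.0)  # default weight of 1
        for name, user_vec in attributes.items():
            score += w_c * inner_product(user_vec, question.get(name, {}))
    return score

user = {
    "answered": {"title": {"football": 3.0}},
    "voted_on_question": {"title": {"football": 2.0}},
}
question = {"title": {"football": 1.0}}

simple = baseline_score(user, question, {})
weighted = baseline_score(user, question, {"voted_on_question": 0.5})
```

With all weights equal to 1 the toy score is 5.0; halving the weight of the voting channel lowers it to 4.0, which is how manually set channel weights let some interaction types count for less.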
24. Baseline models (cont.): The authors constructed two baselines. The simple baseline assumes all channels are equally informative (wc = 1). The weighted baseline uses wc = 1 for asking, answering and best-answering, and wc = 1/2 for the remaining channels. They examined several ways to modify the feature distribution: standardization (each feature is scaled so that its variance is 1); logarithm transformation (xi → log(1 + xi)); and normalization (the features of each feature family are scaled to have a squared norm of 1).
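The three feature transformations can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: standardization here only rescales (no mean-centering) and uses the population variance, and zero-variance or all-zero inputs are returned unchanged. All function names are hypothetical.

```python
import math
import statistics

def standardize(column):
    """Scale one feature's values across examples so its variance is 1."""
    sd = statistics.pstdev(column)
    return list(column) if sd == 0 else [x / sd for x in column]

def log_transform(features):
    """Apply x -> log(1 + x) to every feature value."""
    return [math.log1p(x) for x in features]

def normalize_family(features):
    """Scale the features of one family to have a squared norm of 1."""
    norm = math.sqrt(sum(x * x for x in features))
    return list(features) if norm == 0 else [x / norm for x in features]

standardized = standardize([1.0, 2.0, 3.0, 4.0])
logged = log_transform([0.0, math.e - 1.0])
normalized = normalize_family([3.0, 4.0])
```

Note that `standardize` works on one feature across many examples, whereas `normalize_family` works on the features of one family within a single example.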
25. Results: The authors evaluated the performance of MCR and the baselines by calculating the accuracy and the Area Under the ROC Curve (AUC) on test examples. The AUC metric measures the probability that a positive example is scored higher than a negative example. The results show the advantage of the MCR model.
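The probabilistic reading of AUC can be made concrete with a direct pairwise count. The Python sketch below is quadratic in the number of examples and meant only as an illustration; counting ties as half a win is a common convention assumed here, not something the slides state.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly.

    scores: predicted scores; labels: 1 for positive, 0 for negative.
    Ties count as half a correct pair (assumed convention).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ordering and 1.0 to every positive example being scored above every negative one.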
26. Results (cont.): To gain some insight into the model's performance, the authors inspected the most important features, as ranked by GBDT. The top features are quite evenly distributed across the attribute families, showing the importance of utilizing each of these families. This also shows the importance of splitting the attribute space into multiple channels, as otherwise this signal would have been lost.
27. Results (cont.): Table 5 describes the results of testing the classifier with the possible feature subsets. The results show that direct social features between users play only a marginal role in the discovery of promising user-question pairs.
28. Results (cont.): The authors expect the MCR model to be more precise when recommending questions to users who interact more with the system. They divided the users into 12 disjoint bins on a logarithmic scale, according to the number of answered questions in the user model. Figure 5 depicts the mean accuracy and AUC for each set of users.
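The slide does not give the bin boundaries, so the following Python sketch only illustrates what logarithmic binning could look like. It assumes base-2 boundaries (1, 2-3, 4-7, and so on) with the last bin open-ended; the function name `activity_bin` and these boundaries are hypothetical.

```python
def activity_bin(num_answered, num_bins=12):
    """Assign a user to a logarithmic activity bin.

    Bin k holds users with 2**k <= num_answered < 2**(k + 1); the last
    bin collects everyone above that range (assumed boundaries).
    """
    if num_answered < 1:
        raise ValueError("user must have answered at least one question")
    # bit_length() - 1 equals floor(log2(n)) exactly for positive integers.
    return min(num_answered.bit_length() - 1, num_bins - 1)

bins = [activity_bin(n) for n in (1, 3, 4, 100, 5000)]
```

Per-bin accuracy and AUC would then be computed over the test examples whose user falls into each bin.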
29. Conclusions: This paper introduced a novel multi-channel recommender system approach for suggesting questions to potential answerers in Yahoo! Answers. The MCR model enabled the authors to take advantage of various types of signals, in full symmetry, without worrying about which should be emphasized or which would dilute others. Their experiments showed that learning to combine many signals improves significantly over the baseline. Their analysis found that direct social relations are not as important as content signals.