Temporal models for mining, ranking and recommendation in the Web

Talk on temporal models for information retrieval

  1. 1. TEMPORAL MODELS FOR MINING, RANKING AND RECOMMENDATION IN THE WEB Tu Nguyen L3S Research Center Leibniz Universität Hannover
  2. 2. Outline • Temporal Dynamics • Web • Web Archives • Collaborative Knowledge Bases • Social Networks
  3. 3. Through the Lens of Time.. [Figure: time axis]
  4. 4. Temporal Dynamics: Web
  5. 5. Research Questions • RQ1.1: How do the relevant aspects of an entity-centric query change around the associated event time, specifically just before, during and after the event time? • RQ1.2: Given an entity-centric query with semantic or topical ambiguity at an event time, how should the ranked list of relevant documents be formed so that the coverage at top-k is maximized?
  6. 6. Research Questions • RQ1.1: How do the relevant aspects of an entity-centric query change around the associated event time, specifically just before, during and after the event time?
  7. 7. Motivation • Australian Open
  8. 8. Motivation • [Figure labels: Winners, Nominations, Movies, Actors, Location, Athletes] • Australian Open: Winners, Schedule, Draw Results
  9. 9. Motivation • [Figure: timeline t (Jan, Feb, Mar) showing the querying time]
  10. 10. Motivation • Long-term cumulativeness vs. short-term interest.
  11. 11. Motivation • In addition, different event times and types entail different characteristics toward long-term and short-term interests.
  12. 12. Problem Statement • Problem (Temporal Entity-Aspect Recommendation): Given an event entity e and hitting time t as input, find the ranked list of entity aspects that are most relevant with regard to e and t.
  13. 13. Approach Overview
  14. 14. Approach Overview • Sub-Task
  15. 15. Sub-task • Time and Type cascaded classification • Semantic relation between task labels → joint learning in a cascaded manner • Features • Seasonality • Trending • Auto-correlation • Prediction Errors • SpikeM fitting parameters[1] • [1] Matsubara, Yasuko, et al. "Rise and fall patterns of information diffusion: model and implications." Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012. • [Figure: Decomposition of additive time series; panels: observed, trend, seasonal, random; x-axis: Time, 1990 to 2005]
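
The seasonality, trending and auto-correlation features listed on this slide can be illustrated with a classical additive decomposition (observed = trend + seasonal + random). The following is a minimal sketch, assuming a known seasonal period and at least two full periods of data; the function names are hypothetical and this is not the feature extractor used in the talk.

```python
from statistics import mean

def centered_moving_average(series, period):
    """Centered moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 1:
            trend[i] = sum(window) / period
        else:
            # Even period: half weight on the two end points (2 x period MA).
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend

def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual components."""
    trend = centered_moving_average(series, period)
    detrended = [x - t if t is not None else None for x, t in zip(series, trend)]
    # Average the detrended values at each position of the cycle.
    cycle = []
    for k in range(period):
        cycle.append(mean(d for i, d in enumerate(detrended)
                          if d is not None and i % period == k))
    offset = mean(cycle)  # center the seasonal component around zero
    cycle = [c - offset for c in cycle]
    seasonal = [cycle[i % period] for i in range(len(series))]
    residual = [x - t - s if t is not None else None
                for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    m = mean(series)
    num = sum((series[i] - m) * (series[i + lag] - m) for i in range(len(series) - lag))
    den = sum((x - m) ** 2 for x in series)
    return num / den if den else 0.0

# Synthetic example: linear trend plus a period-4 seasonal pattern.
series = [0.5 * i + [1.0, -1.0, 2.0, -2.0][i % 4] for i in range(24)]
trend, seasonal, residual = decompose_additive(series, period=4)
```

The trend slope, the amplitude of `seasonal` and the size of `residual` could then serve as inputs to the time and type classifiers.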
  16. 16. Approach Overview • Ranking Task
  17. 17. Multi-criteria Learning • Multiple Ranking Models • Idea: divide-and-conquer, each feature set performs better for a certain entity type and at a certain event time. 1. Probability that the event entity e, at time t, is of type C ∈ {Breaking, Anticipated} 2. Probability that e, given type C, is at event time T ∈ {Before, During, After} 3. We use RankSVM to estimate the ranking function f(X, ω) for ŷ_a
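
One way to read the three steps above is as a probability-weighted mixture of per-(type, time) rankers. The sketch below is an assumption about how the pieces could be combined, not the formulation from the talk: each ranker is a linear function (such as one learned by RankSVM), and all names, weights and feature values are hypothetical.

```python
def ensemble_score(features, type_probs, time_probs, models):
    """Weight each (type, time) ranker by P(C | e, t) * P(T | e, t, C).

    features:   feature vector of one candidate aspect
    type_probs: {C: P(C | e, t)}
    time_probs: {C: {T: P(T | e, t, C)}}
    models:     {(C, T): weight vector of a linear ranking function}
    """
    score = 0.0
    for c, p_c in type_probs.items():
        for t, p_t in time_probs[c].items():
            weights = models[(c, t)]
            score += p_c * p_t * sum(w * x for w, x in zip(weights, features))
    return score

def rank_aspects(aspect_features, type_probs, time_probs, models):
    """Return aspect names sorted by descending ensemble score."""
    scores = {a: ensemble_score(x, type_probs, time_probs, models)
              for a, x in aspect_features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example with two features: [salience, timeliness].
types = ("Breaking", "Anticipated")
times = ("Before", "During", "After")
models = {(c, t): [0.5, 0.5] for c in types for t in times}
models[("Breaking", "During")] = [0.0, 1.0]  # assumed: timeliness dominates here
type_probs = {"Breaking": 1.0, "Anticipated": 0.0}
time_probs = {c: {"Before": 0.0, "During": 1.0, "After": 0.0} for c in types}
aspects = {"schedule": [0.9, 0.1], "results": [0.2, 0.8]}
ranking = rank_aspects(aspects, type_probs, time_probs, models)
```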
  18. 18. Ranking Features • Salience features • Mainly extracted from Wikipedia or long-duration query logs • Avg. TF-IDF • Language Model-based features • MLE, Entropy: reward the most (cumulatively) frequent aspects • Short-term interest features • Mainly extracted from recent query logs • Trending velocity • Temporal click entropy • Cross correlation (entity vs. aspect) • Temporal LM
  19. 19. Datasets • AOL query logs • 03-2006 to 05-2006: 3 months • Over 30 million queries • Manual construction: • 837 entity queries • 300 event-related queries • Ground-truth: 70 queries (Breaking: 30, Anticipated: 40)
  20. 20. Methods for Comparison • Random walk with restart (RWR) • SOTA time-aware query auto-completion: • Most popular completion (MPC)[2] • Recent MPC[2] • Last N query distribution[2] • Predicted next N query distribution[2,3] • SVM-salience: with all salience features[4] • SVM-timeliness: with all short-term interest features • SVM-all: with all features • [2] S. Whiting and J. M. Jose. Recent and robust query auto-completion. In WWW '14. • [3] M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR '12. • [4] Reinanda, Ridho, Edgar Meij, and Maarten de Rijke. Mining, ranking and recommending entity aspects. In SIGIR '15.
  21. 21. Experiments • How do long-term salience and short-term interest features perform at different time periods of different event types? • Breaking: the salience model performs well before the event and worsens after it
  22. 22. Experiments (2) • How do long-term salience and short-term interest features perform at different time periods of different event types? • Anticipated: the timeliness model performs well before and after the event and worsens during it
  23. 23. Experiments (3) • How does the ensemble ranking model perform compared to the single-model approaches?
  24. 24. Research Questions • RQ1.2: Given an entity-centric query with semantic or topical ambiguity at an event time, how should the ranked list of relevant documents be formed so that the coverage at top-k is maximized?
  25. 25. Motivation • [Figure labels: music, spy satellite mission, beer, beer] • Search in November 2019
  26. 26. Motivation • [Figure labels: music, spy satellite mission, beer, beer] • Search in November 2019 • Search in March 2020
  27. 27. Temporal Search Results Diversification • Objective function of the greedy optimization: • c: subtopic • S: incremental set of diversified documents • q: query • d: target document • sensitive to time • should take document age into account
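
The objective function itself is not reproduced in this transcript. As an illustration of greedy diversification over the symbols listed above (c, S, q, d), here is a minimal sketch that assumes an xQuAD-style objective with subtopic probabilities taken at query time and an exponential age decay on relevance. The objective, the decay and all values are assumptions, not the formulation from the talk.

```python
def age_decayed(relevance, age_days, half_life_days=30.0):
    """Discount a relevance score by document age (assumed exponential decay)."""
    return relevance * 0.5 ** (age_days / half_life_days)

def greedy_diversify(docs, rel, subtopic_prob, coverage, k, lam=0.5):
    """Greedily build S, trading off relevance against subtopic novelty.

    rel[d]:           relevance of d to q (may already be age-decayed)
    subtopic_prob[c]: importance of subtopic c for q at query time
    coverage[d][c]:   degree to which d covers subtopic c
    """
    selected = []
    remaining = list(docs)
    # novelty[c] = product over d' in S of (1 - coverage[d'][c])
    novelty = {c: 1.0 for c in subtopic_prob}

    def gain(d):
        diversity = sum(subtopic_prob[c] * coverage[d].get(c, 0.0) * novelty[c]
                        for c in subtopic_prob)
        return (1.0 - lam) * rel[d] + lam * diversity

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
        for c in subtopic_prob:
            novelty[c] *= 1.0 - coverage[best].get(c, 0.0)
    return selected

# Hypothetical values for an ambiguous query with three subtopics.
subtopic_prob = {"beer": 0.6, "music": 0.3, "satellite": 0.1}
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
coverage = {"d1": {"beer": 0.9}, "d2": {"beer": 0.8}, "d3": {"music": 0.9}}
top2 = greedy_diversify(["d1", "d2", "d3"], rel, subtopic_prob, coverage, k=2, lam=0.8)
```

With these numbers the second slot goes to `d3` rather than the more relevant `d2`, because the "beer" subtopic is already covered by `d1`. Making `subtopic_prob` depend on the querying time is what would make the result list time-sensitive.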
  28. 28. Motivation • Temporal Dynamics: Collaborative Knowledge Bases
  29. 29. Wikipedia as a Global Memory Place
  30. 30. Collective memory in Wikipedia • What triggers human remembering of past events?
  31. 31. Motivation • Wikipedia as a source for global memory • Largest and most up-to-date online encyclopedia • Its open construction and negotiation is an important new cultural and societal phenomenon • Indicators for identifying real-world events • View logs as the proxy for collective memory • Public page view traffic with a (very) long time span • Does not directly reflect how people forget; significant patterns are a good estimate of public remembering
  32. 32. Research Questions • RQ2.1: How are past events remembered, and what triggers human remembering of these events in Wikipedia? • RQ2.2: How do we quantify the semantic relatedness between two entities / events?
  33. 33. Research Questions • RQ2.1: How are past events remembered, and what triggers human remembering of these events in Wikipedia? • Large-scale analysis over 5500 high-impact events from 11 event categories
  34. 34. Approach • We propose a 3-step approach for a given event: 1. Heuristically quantify "remembering scores" of past events within the same category • Using page views 2. Rank related past events by the computed remembering scores • Refer to the thesis for details 3. Identify features (e.g., time, location, impact) that have a high correlation with remembering
  35. 35. Approach • Remembering score: a linear mixture model of: • Cross-correlation coefficient (CCF) • Or sliding inner product • a measure of similarity of two series as a function of the displacement of one relative to the other • Sum of squared prediction errors (SSE) or surprise score • Holt-Winters as the prediction model • Skewness (Kurtosis) • a measure of the degree of peakedness/flatness of the variable's distribution
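
The components of the remembering score can be sketched as follows. This is an illustration under stated assumptions: the forecast passed to `sse` is supplied by the caller (the talk uses Holt-Winters, which is not implemented here), the mixture weights are hypothetical, and the components are assumed to be normalized to comparable ranges before mixing.

```python
from statistics import mean

def ccf(x, y, lag):
    """Cross-correlation coefficient between x[t] and y[t + lag], lag >= 0."""
    mx, my = mean(x), mean(y)
    num = sum((x[t] - mx) * (y[t + lag] - my) for t in range(len(x) - lag))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def best_lag(x, y, max_lag):
    """Return (lag, ccf) for the displacement with the highest correlation."""
    return max(((lag, ccf(x, y, lag)) for lag in range(max_lag + 1)),
               key=lambda pair: pair[1])

def sse(observed, predicted):
    """Sum of squared prediction errors ("surprise")."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

def kurtosis(x):
    """Fourth standardized moment; larger values indicate a sharper peak."""
    m = mean(x)
    m2 = sum((a - m) ** 2 for a in x) / len(x)
    m4 = sum((a - m) ** 4 for a in x) / len(x)
    return m4 / m2 ** 2 if m2 else 0.0

def remembering_score(ccf_value, surprise, peakedness, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Linear mixture of the three components (weights are hypothetical)."""
    w1, w2, w3 = weights
    return w1 * ccf_value + w2 * surprise + w3 * peakedness

# Page views of a past event (y) echo those of the new event (x) two steps later.
x = [0, 0, 1, 5, 1, 0, 0, 0]
y = [0, 0, 0, 0, 1, 5, 1, 0]
lag, value = best_lag(x, y, max_lag=3)
```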
  36. 36. Studied Features for Triggered Remembering • Temporal similarity • Time distance between two events (in days, months or years) • Time distance based on exponential decay functions • Location similarity • Map a geographic hierarchy of event locations as follows: • City → State → Country → Neighbor countries → Continent • Assign 4 scale values: 4 to same city, 3 to state, 2 to country, 1 to continent • Impact of Events • Damaged area/properties/cost/fatalities • Magnitude (for earthquake events) • Highest winds, lowest pressure (for Atlantic hurricanes)
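
The temporal and location similarity features above can be sketched as follows. The 4/3/2/1 scale comes from the slide; returning 0 for events on different continents, skipping the neighbor-countries level (the slide assigns it no value), the decay rate and the record layout are all assumptions made for this example.

```python
import math

# Levels from coarse to fine; the depth of the finest shared level is the score.
LEVELS = ("continent", "country", "state", "city")

def location_similarity(a, b):
    """Return 4 for same city, 3 for same state, 2 for same country,
    1 for same continent, and 0 otherwise (assumed)."""
    score = 0
    for depth, level in enumerate(LEVELS, start=1):
        if a.get(level) is None or a.get(level) != b.get(level):
            break
        score = depth
    return score

def temporal_similarity(days_apart, decay_rate=0.001):
    """Exponentially decayed similarity of two events by time distance."""
    return math.exp(-decay_rate * abs(days_apart))

# Hypothetical event locations.
miami = {"continent": "NA", "country": "US", "state": "FL", "city": "Miami"}
orlando = {"continent": "NA", "country": "US", "state": "FL", "city": "Orlando"}
houston = {"continent": "NA", "country": "US", "state": "TX", "city": "Houston"}
```

Walking the hierarchy from coarse to fine avoids scoring two different cities that share a name as the same city.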
  37. 37. Study on Atlantic Hurricanes • Location and time have a low effect on the category
  38. 38. Study on Aviation Accidents • Location and time have a stronger effect on the category
  39. 39. Lessons Learned • We identified some first patterns for event memory triggering for diverse event types including natural and manmade disasters as well as accidents and terrorism. • Our analysis confirmed the influence of high-level features, i.e., time and location, but other (latent) semantic features of events also influence which event memories are triggered by an event. • Systematically interpreting the factors that contribute to event remembering is hard, even for humans.
  40. 40. Research Questions • RQ2.2: How do we quantify the semantic relatedness between two entities / events?
  41. 41. Dynamic Entity Relatedness Ranking • Taylor Lautner in "Twilight" [2008-2012] • Taylor Lautner in "Cuckoo" [2012-] • Taylor Lautner in "Run the Tide" [2016]
  42. 42. Dynamic Entity Relatedness Ranking • Dynamic Entity Relatedness: between two entities e_s, e_d, where e_s is the source entity and e_d is the target entity, at a given time t, is a function (denoted by f_t(e_s, e_d)) with the following properties. • Asymmetric: f_t(e_i, e_j) ≠ f_t(e_j, e_i) • Non-negativity: f_t(e_i, e_j) ≥ 0 • Indiscernibility of identicals: e_i = e_j → f_t(e_i, e_j) = 1 • [Figure labels: Elon, Tesla] • Dynamic Entity Relatedness Ranking: Given a source entity e_s and time point t, rank the candidate entities e_d^t by their semantic relatedness at time t+1. • Prediction task • Use normalized page views as supervision
  43. 43. Dynamic Entity Relatedness Ranking • A joint "neural" learning model • Graph-based representation • Content-based representation • Time-series representation • Neural ranking: • Early interaction (late interaction for time series) • Pair-wise ranking • Cross-entropy loss
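
Pair-wise ranking with a cross-entropy loss is commonly written in the RankNet style, where the probability that candidate i ranks above candidate j is the sigmoid of their score difference. The sketch below assumes that form and derives the pair label from normalized page views; the function names and example values are hypothetical, and the neural scoring model itself is not shown.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def pairwise_cross_entropy(score_i, score_j, target):
    """Cross-entropy between target and sigmoid(score_i - score_j).

    target is 1.0 if candidate i should rank above candidate j, else 0.0.
    """
    diff = score_i - score_j
    # -log(sigmoid(diff)) = softplus(-diff); -log(1 - sigmoid(diff)) = softplus(diff)
    return target * softplus(-diff) + (1.0 - target) * softplus(diff)

def pair_target(pageviews_i, pageviews_j):
    """Supervision from normalized page views (ties get 0.5)."""
    if pageviews_i == pageviews_j:
        return 0.5
    return 1.0 if pageviews_i > pageviews_j else 0.0

# Candidate i has more normalized page views at t+1, so it should rank higher.
target = pair_target(0.8, 0.3)
good = pairwise_cross_entropy(2.0, -1.0, target)   # model agrees with the label
bad = pairwise_cross_entropy(-1.0, 2.0, target)    # model disagrees
```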
  44. 44. Temporal time-series based similarity • 1-D convolution layer • Decay-guided self-attention mechanism • Dot product between feature states. • The context vector is decay-guided based on time. • Decay function: polynomial curve with a single decay (hyper)parameter.
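
A rough sketch of what decay-guided dot-product attention could look like is given below. The slide does not specify the exact decay curve or which state acts as the query, so this example assumes the newest state is the query and uses the polynomial decay (1 + Δ)^(-alpha), where Δ is the number of steps back in time and alpha is the single decay hyperparameter. Both choices are assumptions, and the 1-D convolution that would produce the feature states is omitted.

```python
import math

def decay_guided_attention(states, alpha=1.0):
    """Attend over feature states (oldest to newest) with time decay.

    Returns the attention weights and the resulting context vector.
    """
    query = states[-1]                       # assumed: newest state is the query
    n = len(states)
    scores = [sum(q * k for q, k in zip(query, s)) for s in states]
    top = max(scores)                        # subtract max for a stable softmax
    exp_scores = [math.exp(s - top) for s in scores]
    # Polynomial decay on the number of steps back from the newest state.
    decay = [(1.0 + (n - 1 - i)) ** (-alpha) for i in range(n)]
    weighted = [e * d for e, d in zip(exp_scores, decay)]
    total = sum(weighted)
    weights = [w / total for w in weighted]
    context = [sum(w * s[j] for w, s in zip(weights, states))
               for j in range(len(query))]
    return weights, context

# Three identical states: only the decay differentiates the weights.
weights, context = decay_guided_attention([[1.0, 0.0]] * 3, alpha=1.0)
```

Setting `alpha=0` removes the decay and recovers plain softmax attention.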
  45. 45. Experiment settings • Datasets
  46. 46. Experiment settings • Baselines • Wikipedia Link-based Measure (WLM) • DeepWalk (DW) • Entity2Vec Model (E2V) • ParaVecs (PV) • RankSVM + handcrafted features • Metrics • Pearson correlation • Spearman correlation • Normalized Discounted Cumulative Gain (NDCG) • [Figure labels: Page views, Human judgment]
  47. 47. Experiment Results
  48. 48. Temporal Dynamics: Web Archives
  49. 49. Motivation • Correlation between time series mined from anchor text (left, ccf = 0.69, τdelay = 2) and Google Trends (right, ccf = 0.68, τdelay = 9) for the query "electoral college"
  50. 50. Motivation • Time series of "popular vote" (ccf = 0.94, τdelay = 2), "border fence" (ccf = 0.40, τdelay = 1) and "health care reform" (ccf = 0.44, τdelay = 2) from anchor text and Google Trends, from left to right
  51. 51. Motivation • Cumulative signals from anchor text tend to reflect real-world event trend patterns well, with a slight delay.
  52. 52. Motivation • In this work, we rely solely on the Web Archive link-graph to mine important documents.
  53. 53. Research Questions • RQ3: Given a query and the Web Archive, how do we come up with a top-k ranked list of documents in which the coverage of the most important documents, topic-wise and time-wise, is maximized?
  54. 54. Anchor-text based Retrieval Pipeline
  55. 55. Motivation • DivRank[*] • Rich-get-richer phenomenon • Has a clear optimization explanation • [*] Mei, Qiaozhu, Jian Guo, and Dragomir Radev. "DivRank: the interplay of prestige and diversity in information networks." Proceedings of KDD 2010 • [Figure panels: Illustrated graph, PageRank, DivRank]
  56. 56. Temporal Random Surfer Model • Time-aware teleportation • jump to any snapshot with a time preference • Time-aware transition probability • a snapshot at time t_i with a high time preference will have a higher transition probability • a node propagates most of its authority to the nearest peak time • propagation scope is restricted to a time window
  57. 57. Absorbing Random Walk on Temporal Graph • Vertex-reinforced random walk • within-snapshot: the transition probability in the Markov random walk (to a state from others) is reinforced by the number of previous visits to that state • cross-snapshot: voting mechanism, only one node gets propagated at a time
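
The within-snapshot reinforcement can be illustrated with a DivRank-style vertex-reinforced random walk on a single graph. The sketch below follows the commonly used pointwise approximation, in which the current score of a node stands in for its visit count. It covers one snapshot only: the time-aware teleportation and the cross-snapshot voting are not modeled, and the parameter values are assumptions.

```python
def divrank(adj, lam=0.9, alpha=0.25, iters=100):
    """Vertex-reinforced random walk on one graph snapshot.

    adj:   adjacency matrix (list of lists of non-negative weights)
    lam:   probability of following a (reinforced) link instead of teleporting
    alpha: probability of leaving the current node in the organic walk
    """
    n = len(adj)
    # Organic transition p0: stay with probability 1 - alpha, else follow an edge.
    p0 = []
    for u in range(n):
        degree = sum(adj[u])
        row = [alpha * adj[u][v] / degree if degree else 0.0 for v in range(n)]
        row[u] += (1.0 - alpha) if degree else 1.0
        p0.append(row)
    prior = [1.0 / n] * n            # uniform teleportation preference
    p = prior[:]
    for _ in range(iters):
        new_p = [0.0] * n
        for u in range(n):
            # Normalizer of the reinforced transitions out of u.
            d_u = sum(p0[u][v] * p[v] for v in range(n))
            for v in range(n):
                # Links to already well-visited nodes are reinforced by p[v].
                transition = (1.0 - lam) * prior[v] + lam * p0[u][v] * p[v] / d_u
                new_p[v] += transition * p[u]
        p = new_p
    return p

# Star graph: node 0 is linked to nodes 1, 2 and 3.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
scores = divrank(star)
```

Because each node absorbs score from its neighbors as it is visited more often, nodes adjacent to a strong node are pushed down, which is the rich-get-richer effect that produces diversity in the top ranks.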
  58. 58. Experiment results • Diversity by time • Diversity by topics
  59. 59. Temporal Dynamics: Social Network
  60. 60. Research Questions • RQ4: How do temporal models develop and how do we control and improve the stability of such models at an early stage?
  61. 61. Research Questions • RQ4: How do temporal models develop and how do we control and improve the stability of such models at an early stage? • Task 1: Rumor detection in Twitter
  62. 62. Motivation
  63. 63. Motivation • The Amuay explosion news and Castro's death rumor spread over Twitter[*] • [*] Jin, Fang, et al. "Epidemiological modeling of news and rumors on Twitter." Workshop on Social Network Mining and Analysis 2013.
  64. 64. Motivation • The Amuay explosion news and Castro's death rumor spread over Twitter[*] • How do we handle the case when it is too early for any propagation patterns to form? • [*] Jin, Fang, et al. "Epidemiological modeling of news and rumors on Twitter." Workshop on Social Network Mining and Analysis 2013.
  65. 65. System pipeline • Sometimes Average is the best..
  66. 66. System pipeline • Sometimes Average is the best.. • Dynamic Series-Time Structure: feature vector representation: • incorporate the slopes of features between two consecutive intervals[*] • [*] Ma, Jing, et al. "Detect rumors using time series of social context information on microblogging websites." CIKM 2015
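
The slope-based representation can be sketched as follows: the per-interval feature vectors are concatenated and followed by the change of every feature between consecutive intervals. This minimal example assumes intervals of equal unit length, so that a slope is a plain difference; the function name and the feature values are hypothetical.

```python
def series_time_features(interval_features):
    """Concatenate per-interval features with their slopes.

    interval_features: list of feature vectors, one per time interval,
    ordered from the earliest interval to the latest.
    """
    flat = [value for vector in interval_features for value in vector]
    slopes = [cur - prev
              for prev_vec, cur_vec in zip(interval_features, interval_features[1:])
              for prev, cur in zip(prev_vec, cur_vec)]
    return flat + slopes

# Three intervals, two features each (e.g. tweet count, share of questions).
vector = series_time_features([[1, 10], [3, 10], [6, 4]])
```

With only one interval available, the slope part is empty, which reflects the early-stage setting where no propagation pattern has formed yet.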
  67. 67. Tweet-level credibility model (slide footer date: 19.01.20)
  68. 68. Experiment Results
  69. 69. Research Questions • RQ4: How do temporal models develop and how do we control and improve the stability of such models at an early stage? • Task 2: Personalized blood glucose prediction in the clinical domain
  70. 70. Temporal Dynamics: Social Network • Clinical domain
  71. 71. Research Questions • RQ4: How do temporal models develop and how do we control and improve the stability of such models at an early stage? • Task 2: Personalized blood glucose prediction in the clinical domain • Strategy: allowing the model to refuse to predict
  72. 72. Motivation • Task: predict the blood glucose (BG) level in 1 hour
  73. 73. Motivation • Sparsity: measurements are taken periodically and (somewhat) spontaneously.
  74. 74. Motivation – preliminary results
  75. 75. Uncertainty in Machine Learning • Ensemble learning: bagging or boosting • Prediction variance • [*] Diagrams adapted from https://www.groundai.com/project/aleatoric-and-epistemic-uncertainty-in-machine-learning-a-tutorial-introduction/1
  76. 76. Uncertainty in Machine Learning • Go Bayesian.. • Posterior distribution • Weighted average • However, high computational cost • [*] Diagrams adapted from https://www.groundai.com/project/aleatoric-and-epistemic-uncertainty-in-machine-learning-a-tutorial-introduction/1
  77. 77. Uncertainty in Random Forest • [Figure: RF as ensemble learning over trees] • Finite number of bootstrap replicates B • variance estimates
  78. 78. Uncertainty in Random Forest • [Figure: RF over trees] • Finite number of bootstrap replicates B • variance estimates • MC noise • sampling noise
  79. 79. Uncertainty in Random Forest • [Figure: RF over trees] • Finite number of bootstrap replicates (B) • variance estimates • MC noise • Bias-corrected* • B = Θ(n) • * Wager, Stefan, Trevor Hastie, and Bradley Efron. "Confidence intervals for random forests: The jackknife and the infinitesimal jackknife." JMLR (2014).
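
The bias-corrected variance estimate can be illustrated with the infinitesimal jackknife for bagging described by Wager et al.: the variance at a query point is the sum over training points of the squared covariance between the bootstrap inclusion count of that point and the per-replicate prediction, minus a Monte Carlo correction of n/B² times the sum of squared deviations of the replicate predictions. In the sketch below the base learner is a stand-in that simply predicts the mean of its bootstrap sample instead of a tree; that simplification, the function names and the synthetic data are assumptions.

```python
import random

def bagged_ij_variance(y, predict, n_boot=2000, seed=0):
    """Infinitesimal-jackknife variance of a bagged prediction.

    y:       training targets
    predict: function mapping a bootstrap sample to one prediction
    Returns (bagged prediction, raw IJ variance, bias-corrected IJ variance).
    """
    rng = random.Random(seed)
    n = len(y)
    counts, preds = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = [0] * n
        for i in idx:
            c[i] += 1                      # how often point i is in this replicate
        counts.append(c)
        preds.append(predict([y[i] for i in idx]))
    pred_mean = sum(preds) / n_boot
    v_ij = 0.0
    for i in range(n):
        count_mean = sum(c[i] for c in counts) / n_boot
        cov = sum((c[i] - count_mean) * (p - pred_mean)
                  for c, p in zip(counts, preds)) / n_boot
        v_ij += cov ** 2
    # Monte Carlo bias from using a finite number of bootstrap replicates.
    mc_bias = n / n_boot ** 2 * sum((p - pred_mean) ** 2 for p in preds)
    return pred_mean, v_ij, v_ij - mc_bias

# Synthetic targets; the base learner predicts the mean of its sample.
data_rng = random.Random(1)
y = [data_rng.gauss(100.0, 15.0) for _ in range(40)]
pred, v_ij, v_ij_u = bagged_ij_variance(y, predict=lambda s: sum(s) / len(s))
```

For this toy learner the corrected estimate should land close to the variance of a sample mean, which gives a quick sanity check on the implementation.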
  80. 80. Experiment results • Sanity filter: carefully designed heuristic methods (e.g., no long-gap prediction, no malformed input). • Stability filter: confidence-interval based.
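
A confidence-interval based stability filter, combined with the strategy of letting the model refuse to predict, could look like the sketch below. The normal-approximation interval, the threshold and the example values are assumptions chosen for illustration and are not taken from the talk.

```python
import math

def predict_or_abstain(prediction, variance, max_half_width, z=1.96):
    """Return the prediction only if its confidence interval is narrow enough.

    variance:       estimated variance of the prediction
    max_half_width: widest acceptable half-width of the interval (same unit
                    as the prediction); a hypothetical tuning parameter
    Returns None when the model refuses to predict.
    """
    half_width = z * math.sqrt(max(variance, 0.0))
    if half_width > max_half_width:
        return None
    return prediction

stable = predict_or_abstain(120.0, variance=25.0, max_half_width=15.0)
unstable = predict_or_abstain(120.0, variance=400.0, max_half_width=15.0)
```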
  81. 81. Conclusions • Temporal Dynamics • Web • Web Archives • Collaborative Knowledge Bases • Social Networks • Search • Recommendations • Anchor-text and Link-based Analysis & Temporal Ranking • Entity and Event Relatedness Mining and Recommendation • Enrichment methods for cold-start predictions • ESWC’18 - oral • ECIR’14 - oral • SIGIR’15 (short) • WWW’15 Companion • CoNLL’18 - full • JCDL’14 - oral • SocInfo’17 - full • CIKM’17&18 Workshops

