Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Time-evolving Relational Classificationand Ensemble MethodsRyan A. RossiJennifer Neville
 Given a graph G and a set of attributes X for each           node, the task is to infer Y the class labels of the nodes ...
ProblemAttribute prediction in time-varying networks                     Previous workContributions        - only models t...
What types of information can change?
Edges Most real networks are dynamic    • social, information, biological,...⋯                                           ...
Edges Most real networks are dynamic                                    Node Attributes             .3 … C               ...
Edges Most real networks are dynamic       Node Attributes                                       Edge Attributes       - ...
 Attributes of linked nodes are correlated Temporal locality: Recent events are more influential than  distant ones     ...
How to represent the data with these in mind?
X1 X2 X3 X4 ... Xm                                   Input                                           ...      Temporal-Rel...
TIME     Window   Window   WindowG1    G2                 G3
X1 X2 X3 X4 ... Xm                                   Input                                           ...      Temporal-Rel...
TIME                              Window   G1                 G2                  G3              Summary Graph           ...
X1 X2 X3 X4 ... Xm                                   Input                                           ...      Temporal-Rel...
The attribute values are weighted by the product of their attribute weight and thecorresponding link weight  TVRC (weights...
 Learn model on Dt (or set of timesteps)  apply it to Dt+1 Use k-fold cross-validation to learn “best” model (k=10) Av...
 PyComm (Python Developer Communication Network)  • Extracted emails and bug discussions from the open-source python    d...
The simplest temporal                                      relational model still                                      out...
Complexity in representation/accuracy     Temporal influence of edges                                                     ...
Can we increase accuracy by creating ensembles    using temporal-relational information?
Key Idea: Introduce variance in both the temporal andrelational information1.   Transform the temporal-relational represen...
1. Transforming the Temporal Nodes and Links   • sampling nodes and edges from discrete timesteps2. Transforming the Tempo...
TIME               .3 … C                           .2 … C                               .3 … C    .2 … C                 ...
TIME               .3 … C                           .2 … C                               .3 … C    .2 … C                 ...
TVRC improvement significant at p < 0.05,  16% reduction in error                              Temporal-relational ensembl...
1.0                                                       TVRC                                                       RPT  ...
Main Contributions Proposed a general time-evolving relational classification  framework for modeling temporal edges, att...
 Systematically investigate the ensemble methods and  identify when or where to apply each strategy
Thanks!          Questions?         rrossi@purdue.eduhttp://www.cs.purdue.edu/homes/rrossi
U. Sharan et al. (ICDM 2008)Differences Our methods generalize for any dynamics  •   They model only edges Propose and u...
X1;t X2;t X3;t X4;t ... Xm;t                                             …                                             ......
TIME                  Window   Window   Window   G1              G2                  G3           Summary Graph        2. ...
TIME                                 WindowXi1                     Xi2                     Xi3          Summary Attributes...
where t = 1,...,tmax  Temporal-Relational Classification FrameworkRelational Information       Edges     Attributes       ...
Temporal-Relational ModelRelational Information       Edges     Attributes   NodesTemporal Granularity        Timestep    ...
Previous example was actually TVRCRelational Information     Edges     Attributes   NodesTemporal Granularity      Timeste...
Let Gt = (Vt , Et ,Wt E ) be the relational graph at time step tWe define the summary graph GS(t ) = (VS(t ), ES(t ),WS(t)...
Attribute Weighting and SummarizationGraph Weighting and Summarization
Exponential                   t1      t2      t3     t4                     Weighting  (1- l )t-i lNode Attribute     τ1  ...
Time-Evolving Relational Classification and Ensemble Methods
Upcoming SlideShare
Loading in …5
×

Time-Evolving Relational Classification and Ensemble Methods

8,494 views

Published on

Talk I gave at PAKDD12

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

Time-Evolving Relational Classification and Ensemble Methods

  1. 1. Time-evolving Relational Classificationand Ensemble MethodsRyan A. RossiJennifer Neville
  2. 2.  Given a graph G and a set of attributes X for each node, the task is to infer Y the class labels of the nodes class n1 n2 Training n3 . ..3 … B .1 … B . Ignores temporal information! .1 … B .4 … A .5 … A Learning Model Inference .3 … A .9 … A.2 … A .4 … A .1 … B .4 … A ? ? ? ? .3 … A .9 … A Testing
  3. 3. ProblemAttribute prediction in time-varying networks Previous workContributions - only models temporal edges - predicts only static attributeConsider the space - only single-classifier of temporal-relational representations - limited representation!1. Propose temporal-relational classification framework2. Temporal-relational ensembles
  4. 4. What types of information can change?
  5. 5. Edges Most real networks are dynamic • social, information, biological,...⋯ ⋯ ⋯ Time
  6. 6. Edges Most real networks are dynamic Node Attributes .3 … C .2 … C .3 … C .1 … C .6 … L .5 … L .2 … C .2 … C .5 … L .3 … L .6 … L .6 … L .5 … L .2 … C .7 … L .4 … L .5 … L⋯ ⋯ ⋯ Time
  7. 7. Edges Most real networks are dynamic Node Attributes Edge Attributes - … 5 + … 2 + … 8 + … 8 + … 4 + … 2 + … 1⋯ ⋯ ⋯ Time
  8. 8.  Attributes of linked nodes are correlated Temporal locality: Recent events are more influential than distant ones Time u u u u u u u u τ2 τ2 τ1 τ1 τ1 τ1 τ1 v v v v v v v v Temporal recurrence: A regular series of events is more likely to indicate a stronger relationship than an isolated event Time u u u u u u u u v v v v v v v v
  9. 9. How to represent the data with these in mind?
  10. 10. X1 X2 X3 X4 ... Xm Input ... Temporal-Relational ... ... Classification Framework … … Dt= ... … ...Edges Attributes Nodes ⋮ ⋮ ⋮ ⋮ ⋱⋮ Gt Xt where t = 1,...,tmax - Single timestep Temporal Granularity - Window of timesteps - Union (all timesteps) For each: edges, attrs,... 1. Select the past information
  11. 11. TIME Window Window WindowG1 G2 G3
  12. 12. X1 X2 X3 X4 ... Xm Input ... Temporal-Relational ... ... Classification Framework … … Dt= ... … ...Edges Attributes Nodes ⋮ ⋮ ⋮ ⋮ ⋱⋮ Gt Xt where t = 1,...,tmax - Single timestep Temporal Granularity - Window of timesteps - Union (all timesteps) For each: edges, attrs,... - Exponential 1. Select the past information Temporal Influence ⋮ 2. Select weighting - Uniform
  13. 13. TIME Window G1 G2 G3 Summary Graph Temporal Granularity Temporal Influence1. Create summary graph2. Assigns weights to edges (recent/frequent events larger weights)• Weight functions: Exponential, linear, uniform,...
  14. 14. X1 X2 X3 X4 ... Xm Input ... Temporal-Relational ... ... Classification Framework … … Dt= ... … ...Edges Attributes Nodes ⋮ ⋮ ⋮ ⋮ ⋱⋮ Gt Xt where t = 1,...,tmax - Single timestep Temporal Granularity - Window of timesteps - Union (all timesteps) For each: edges, attrs,... - Exponential 1. Select the past information Temporal Influence ⋮ 2. Select weighting - Uniform 3. Select relational classifier - Relational Probability Trees Relational Classifier ⋮ - Relational Bayes Classifier Predictions
  15. 15. The attribute values are weighted by the product of their attribute weight and thecorresponding link weight TVRC (weights only edges) TENC (edges & attributes)Weighted mode feature: TVRC  red (τ3 ) TENC  blue(τ2)
  16. 16.  Learn model on Dt (or set of timesteps)  apply it to Dt+1 Use k-fold cross-validation to learn “best” model (k=10) Avg AUC over 10 trials
  17. 17.  PyComm (Python Developer Communication Network) • Extracted emails and bug discussions from the open-source python development environment Cora Citation Network PyComm (Communication Network) Cora Citiation Network Developers: 1,914 Papers: 16,153 Emails: 13,181 References: 29,603 Bug Messages: 69,435 Authors 21,976 Timespan: Feb 2007 – May 2008 Timespan: 1981 - 1998 In this talk we focus on the more difficult dynamic prediction task Dataset Prediction Task PyComm Is developer productive or not in t+1? (closes bug) [Dynamic] Cora 1. Is paper AI or not? 2. Predict paper topic (1 out of 7 ML topics) [Static]
  18. 18. The simplest temporal relational model still outperforms the others 1.00 TVRC Most primitive RPT Intrinsic temporal-relational model 0.98 Int+time Int+graph Int+topicsAUC 0.96 Important to moderate the relational information with 0.94 the temporal info! 0.92 TVRC RPT DT w/ additional attrs
  19. 19. Complexity in representation/accuracy Temporal influence of edges and attributes 0.95 TENC TVRC Temporal influence of edges TVRC+Union 0.90 Window Model strictly based on Union Model temporal-granularity  Very efficient 0.85AUC 0.80 Main finding: Accuracy generally 0.75 increases as more temporal information is included in the representation 0.70 0.65 T=1 T=2 T=3 T=4
  20. 20. Can we increase accuracy by creating ensembles using temporal-relational information?
  21. 21. Key Idea: Introduce variance in both the temporal andrelational information1. Transform the temporal-relational representation using some process (e.g., randomization)2. Learn models (on different training data)3. Consider a weighted vote from model predictions
  22. 22. 1. Transforming the Temporal Nodes and Links • sampling nodes and edges from discrete timesteps2. Transforming the Temporal Feature Space • randomization of attributes (locally) • varying temporal influence or granularity • resample from temporal features Many possibilities!3. Adding Temporal Noise or Randomness • randomly permute node feature values • randomly permute links across time4. Transforming Time-varying Labels • randomly permuting previously learned labels at t-1 (or more distant) with the true labels at t5. Multiple Classification Algorithms • sampling from a set of classifiers (RPT, RBC,...)
  23. 23. TIME .3 … C .2 … C .3 … C .2 … C .1 … C .6 … L .5 … L .2 … C .2 … C .5 … L .3 … L .6 … L .6 … L .5 … L .2 … C .7 … L .4 … L .5 … L⋯ ⋯ ⋯1. Select edge Randomization Edges2. Randomly permute it over time .3 … C .2 … C .3 … C .2 … C .1 … C .6 … L .5 … L .2 … C .2 … C .5 … L .3 … L .6 … L .6 … L .5 … L .2 … C .7 … L .4 … L .5 … L⋯ ⋯ ⋯
  24. 24. TIME .3 … C .2 … C .3 … C .2 … C .1 … C .6 … L .5 … L .2 … C .2 … C .5 … L .3 … L .6 … L .6 … L .5 … L .2 … C .7 … L .4 … L .5 … L⋯ ⋯ ⋯1. Select node and attribute Randomization Edges2. Randomize its values over time Attributes .3 … C .2 … C .3 … C .2 … C .1 … C .6 … L .5 … L .2 … C .2 … C .5 … L .3 … L .6 … L .6 … L .5 … L .2 … C .5 … L .7 … L .4 … L⋯ ⋯ ⋯
  25. 25. TVRC improvement significant at p < 0.05, 16% reduction in error Temporal-relational ensemble Relational ensemble 1.00 Non-relational ensemble TVRC RPT DT 0.98 Main Findings: Temporal-relational ensembles canAUC 0.96 improve over relational-ensembles, even using the minimum temporal information! 0.94 0.92 T=1 T=2 T=3 T=4 Avg
  26. 26. 1.0 TVRC RPT Slight improvement DT Significant improvement! 0.9 Main Finding: Temporal-relational ensemblesAUC 0.8 perform significantly better than the others when there are more dynamics 0.7 0.6 Communication Team Centrality Topics Relatively stationary attributes Frequently changing attributes
  27. 27. Main Contributions Proposed a general time-evolving relational classification framework for modeling temporal edges, attributes, and nodes Proposed classes of temporal ensemble methodsMain Findings Temporal-relational models are more accurate than relational and non-relational models Incorporating temporal information can increase accuracy of classification for both static and dynamic prediction tasks Temporal-relational ensembles yield better accuracy than relational and non-relational ensembles
  28. 28.  Systematically investigate the ensemble methods and identify when or where to apply each strategy
  29. 29. Thanks! Questions? rrossi@purdue.eduhttp://www.cs.purdue.edu/homes/rrossi
  30. 30. U. Sharan et al. (ICDM 2008)Differences Our methods generalize for any dynamics • They model only edges Propose and use temporal-relational ensembles Static and dynamic prediction tasks
  31. 31. X1;t X2;t X3;t X4;t ... Xm;t … ... … ⇒ ...Dt = … ... ... ... ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ Gt Xt Yt+1 E(Yt+1 | Gt ,Gt-1,..., Xt , Xt-1,...)
  32. 32. TIME Window Window Window G1 G2 G3 Summary Graph 2. Temporal Granularity 3. Temporal Influence1. For each piece of relational data • attributes, edges, or nodes2. Select the amount of past information to use3. Select how the past information is weighted • Weight function and decay parameter
  33. 33. TIME WindowXi1 Xi2 Xi3 Summary Attributes Select the timesteps for learning Temporal weighting /summarization We can also do this for each attribute!
  34. 34. where t = 1,...,tmax Temporal-Relational Classification FrameworkRelational Information Edges Attributes NodesTemporal Granularity Timestep Window UnionTemporal Influence Uniform WeightingRelational Classification RPT ⋯ RBC Predictions
  35. 35. Temporal-Relational ModelRelational Information Edges Attributes NodesTemporal Granularity Timestep Window UnionTemporal Influence Uniform WeightingRelational Classification RPT ⋯ RBC
  36. 36. Previous example was actually TVRCRelational Information Edges Attributes NodesTemporal Granularity Timestep Window UnionTemporal Influence Uniform WeightingRelational Classifier RPT ⋯ RBC
  37. 37. Let Gt = (Vt , Et ,Wt E ) be the relational graph at time step tWe define the summary graph GS(t ) = (VS(t ), ES(t ),WS(t) ) as the weighted Esum of graphs up to time t as follows: VS(t ) = V1 ÈV2 ÈÈVt ES(t ) = E1 È E2 ÈÈ Et t WS(t ) = a1W1E + a 2W2E ++ atWt E = å K E (Gi ;t, q ) E i=1 The α weights determine the contribution of each snapshot We use the kernel function K with parameters θ to determines the influence of each edge in the summary Weights can be viewed as probabilities that a relationship (or attribute value) is still active at the current time step t, given that it was observed at time (t-k)
  38. 38. Attribute Weighting and SummarizationGraph Weighting and Summarization
  39. 39. Exponential t1 t2 t3 t4 Weighting (1- l )t-i lNode Attribute τ1 τ2 τ2 τ1 ⇒ τ1: (1−λ)3λ + λ τ2: λ((1−λ)2 + (1−λ)) ωEdge Attribute τ1 τ2 τ1 ⇒ τ1: (1−λ)3λ + λ τ2: (1−λ)λEdge ⇒ ω = θ(1 + (1−θ) + (1−θ)3)Weights can be viewed as probabilities that a relationship (or attribute value) is stillactive at the current time step t, given that it was observed at time (t-k)

×