Trace Complexity of Network Inference
Bruno Abrahao (Cornell)
Flavio Chierichetti (Sapienza)
Robert Kleinberg (Cornell)
Alessandro Panconesi (Sapienza)
The network inference problem consists of reconstructing the edge set of a network given traces representing the chronology of infection times as epidemics spread through the network. This problem is a paradigmatic representative of prediction tasks in machine learning that require deducing a latent structure from observed patterns of activity in a network, which often require an unrealistically large number of resources (e.g., amount of available data, or computational time). A fundamental question is to understand which properties we can predict with a reasonable degree of accuracy with the available resources, and which we cannot. We define the trace complexity as the number of distinct traces required to achieve high fidelity in reconstructing the topology of the unobserved network or, more generally, some of its properties. We give algorithms that are competitive with, while being simpler and more efficient than, existing network inference approaches. Moreover, we prove that our algorithms are nearly optimal, by proving an information-theoretic lower bound on the number of traces that an optimal inference algorithm requires for performing this task in the general case. Given these strong lower bounds, we turn our attention to special cases, such as trees and bounded-degree graphs, and to property recovery tasks, such as reconstructing the degree distribution without inferring the network. We show that these problems require a much smaller (and more realistic) number of traces, making them potentially solvable in practice. By Bruno Abrahao, Flavio Chierichetti, Robert Kleinberg and Alessandro Panconesi.

  1. Trace Complexity of Network Inference. Bruno Abrahao (Cornell), Flavio Chierichetti (Sapienza), Robert Kleinberg (Cornell), Alessandro Panconesi (Sapienza). Cornell University and Sapienza University. Wednesday, August 14, 2013
  2-3. Influence and diffusion on networks • Network Inference: find influencers, improve marketing, prevent disease outbreaks, and forecast crimes
  4-5. The Network Inference Problem • Learning each edge independently: [Adar, Adamic '2005] • MLE-inspired approaches: [Gomez-Rodriguez, Leskovec, Krause '2010], [Gomez-Rodriguez, Balduzzi, Scholkopf '2011], [Myers, Leskovec '2011], [Du et al. '2012] • Information-theoretic (our work): [Netrapalli, Sanghavi '2012], [Gripon, Rabbat '2013]
  6. The Network Inference Problem • The relationship between the amount of data and the performance of inference algorithms is not well understood • What can be inferred? What amounts of resources are required? How hard is the inference task?
  7. Our goal • Provide a rigorous foundation for network inference: 1. develop a measure that relates the amount of data to the performance of algorithms 2. give information-theoretic performance guarantees 3. develop more efficient algorithms
  8-15. We assume an underlying cascade model. [Figure build over a graph on nodes s, a, b, c, d, e: the cascade starts at node s at t = 0.0; each edge incident to an infected node transmits independently with probability Pr{H} = p, and a transmitting edge delivers the infection after an incubation time drawn from Exp(λ); here c, then a, then b become infected.] The resulting trace is the chronology of infection times: Node s t=0.0, Node c t=0.345, Node a t=1.236, Node b t=1.705.
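The cascade model on these slides can be sketched in a few lines. This is a minimal simulation under the stated assumptions (an independent coin flip Pr{H} = p per edge, Exp(λ) incubation times, a uniformly random source); `simulate_trace`, `adj`, `p`, and `lam` are illustrative names, not the authors' code:

```python
import heapq
import random

def simulate_trace(adj, p, lam, source=None):
    """One cascade over an undirected graph given as adj: node -> list of
    neighbors. Each edge out of a newly infected node is activated
    independently with probability p (the coin flip Pr{H} = p); an activated
    edge delivers the infection after an Exp(lam) incubation time.
    Returns the trace: (node, infection_time) pairs sorted by time."""
    if source is None:
        source = random.choice(list(adj))   # random cascade: uniform source
    infection_time = {source: 0.0}
    heap = []                               # pending (arrival_time, node) events

    def schedule(u):
        for v in adj[u]:
            if v not in infection_time and random.random() < p:
                heapq.heappush(heap, (infection_time[u] + random.expovariate(lam), v))

    schedule(source)
    while heap:
        t, v = heapq.heappop(heap)
        if v not in infection_time:         # first arrival wins; later ones are ignored
            infection_time[v] = t
            schedule(v)
    return sorted(infection_time.items(), key=lambda kv: kv[1])
```

Note that a trace produced this way lists only nodes and times, not which edge carried the infection, which is exactly why inference is nontrivial.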
  16-17. Traces and cascades • Each cascade generates one trace • Random cascade: starts at a node chosen uniformly at random (an assumption in some of our models) • Traces do not directly reflect the underlying network over which the cascade propagates • How much structural information is contained in a trace?
  18. Our Research Question I: How many traces do we need to reconstruct the underlying network? We call this measure the trace complexity of the problem.
  19-22. Our Research Question II: How does trace length play a role for inference? As we keep scanning the trace, it becomes less and less informative. [Figure build over the trace Node s t=0.0, Node c t=0.345, Node a t=1.236, Node b t=1.705: the first hop s-c is unambiguous, but a could have been infected by either s or c, and b by any of s, c, or a.]
  23-24. The head of the trace • First-Edge Algorithm: infers the edge corresponding to the first two nodes in each trace (and ignores the rest of the trace). [Example: from the trace Node s t=0.0, Node c t=0.345, Node a t=1.236, Node b t=1.705, Node d t=1.725, only the pair (s, c) is used.]
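First-Edge as described is essentially a one-liner per trace. A sketch, assuming traces are given as time-sorted (node, time) lists as in the example above:

```python
def first_edge(traces):
    """First-Edge: for each trace (a time-sorted list of (node, time) pairs),
    infer the edge between the first two infected nodes and ignore the rest
    of the trace."""
    edges = set()
    for trace in traces:
        if len(trace) >= 2:
            edges.add(frozenset((trace[0][0], trace[1][0])))  # undirected {u, v}
    return edges
```

With ℓ traces this outputs at most ℓ edges and, under the model, no false positives, which is the conservative behavior the deck revisits later.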
  25. Contributions 1. The head of traces • First-Edge is close to the best we can do for exact reconstruction: Ω(nΔ^{1-ε}) traces are necessary in general 2. The tail of traces • We give algorithms using exponentially fewer traces: O(log n) for trees, O(poly(Δ) log n) for bounded-degree graphs 3. Infer properties without reconstructing the network itself: the degree distribution with O(n) traces
  26. How many traces do we need for exact reconstruction of general graphs?
  27. Lower bound for exact reconstruction of general graphs. [Figure: two candidate graphs on nodes a, b, c, ..., d, e, f: G0 = Kn and G1 = Kn minus the edge {a, b}.] 1. We choose the unknown graph in {G0, G1} 2. Run random cascades on the chosen graph
  28. Lower bound for exact reconstruction of general graphs. Given a set of ℓ random traces T1, ..., Tℓ, Bayes' rule can tell us which of the two alternatives G0 or G1 is the most likely.
  29. Lower bound for exact reconstruction of general graphs. Lemma: Let ℓ < n^{2-ε} for any small positive constant ε. Then with probability 1 - o(1) over the random traces T1, ..., Tℓ, the posterior Pr{G0 | T1, ..., Tℓ} lies in [1/2 - o(1), 1/2 + o(1)].
  30. Lower bound for exact reconstruction of general graphs. Corollary: Let Δ be the largest degree of a node in the network. If ℓ < nΔ^{1-ε}, any algorithm will fail to reconstruct the graph with high probability: Ω(nΔ^{1-ε}) traces are necessary.
  31. The head of the trace. First-Edge reconstructs the graph with O(nΔ log n) traces. First-Edge: O(nΔ log n); lower bound: Ω(nΔ^{1-ε}). First-Edge is close to the best we can do for exact reconstruction!
  32. Can we reconstruct special families of graphs using fewer traces?
  33. The tail of the trace • Useful information to reconstruct special graphs • We give algorithms for inference using exponentially fewer traces: O(log n) for trees, O(poly(Δ) log n) for bounded-degree graphs
  34-42. Maximum Likelihood Tree Estimation. We can perfectly reconstruct trees with high probability using O(log n) traces. Take ℓ traces. 1. Set c(u, v) as the median of the observations |t(u) - t(v)| over all traces. In a tree, {u, v} is the only route of infection between u and v, so the incubation time between u and v is a sample of Exp(λ). If (u, v) ∈ E, then c(u, v) < 1/λ with probability approaching 1 exponentially in ℓ; otherwise*, c(u, v) > 1/λ with probability approaching 1 exponentially in ℓ (*Step 3 omitted). The probability that all these events happen is at least 1 - 1/n^c using ℓ ≥ c·log n traces.
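Step 1 of the tree estimator (the median pairwise cost) can be sketched as follows. Traces are assumed to be dicts mapping node to infection time, and `tree_median_costs` is an illustrative name; the slides' remaining steps (e.g. the omitted step 3) are not reproduced here:

```python
import statistics
from itertools import combinations

def tree_median_costs(traces, nodes):
    """c(u, v) = median over traces of |t(u) - t(v)|. For a tree edge this
    concentrates around the median of an Exp incubation time; for a
    non-adjacent pair the time difference is stochastically larger."""
    costs = {}
    for u, v in combinations(nodes, 2):
        diffs = [t[v] - t[u] for t in traces if u in t and v in t]
        diffs = [abs(d) for d in diffs]
        if diffs:
            costs[(u, v)] = statistics.median(diffs)
    return costs
```

Pairs whose cost falls below the cutoff the slides compare against would then be kept as edges; the point of using the median is that, with ℓ = Θ(log n) traces, edge and non-edge costs separate except with probability polynomially small in n.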
  43. Local MLE for inferring bounded-degree graphs • Think of the potential neighbor sets of u as "forecasters" predicting the infection time of u, given their own infection times • Identify the most accurate using a proper scoring rule • Trace complexity: O(poly(Δ) log n)
  44. Can we recover properties of a network without paying the full price of network reconstruction?
  45. Obtaining network properties more cheaply • Useful to reason about the behavior of processes that take place in the network • Robustness [Cohen et al. '00] • Network evolution [Leskovec, Kleinberg, Faloutsos '05] • ... We can infer the degree distribution with high probability with O(n) traces; the lower bound for reconstructing the whole network is Ω(nΔ^{1-ε}).
  46-53. Reconstructing the degree distribution. [Figure build: node s is the source of traces 1, ..., ℓ; trace i records the waiting time t_i until the first neighbor of s is infected.] Let d be the degree of s. Then T = Σ_{i=1}^{ℓ} t_i is Erlang(ℓ, dλ). Output: d̂ = ℓ/(λT). Using the Poisson tail bound Pr{Erlang(n, λ) < z} = Pr{Pois(z·λ) ≥ n}, we achieve a (1 + ε)-approximation with probability 1 - δ using O(ln(1/δ)/ε²) traces.
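The estimator itself reduces to a sum of first-infection waiting times. A minimal sketch with a small simulation checking it; `estimate_degree` is an illustrative name, and the reasoning step (the minimum of d exponentials is itself exponential) is standard but not spelled out on the slides:

```python
import random

def estimate_degree(waiting_times, lam):
    """Each waiting time t_i until the source's first neighbor is infected
    is the minimum of d i.i.d. Exp(lam) variables, i.e. an Exp(d * lam)
    sample, so T = sum(t_i) is Erlang(l, d * lam) and the natural estimator
    is d_hat = l / (lam * T)."""
    l = len(waiting_times)
    return l / (lam * sum(waiting_times))

# sanity check against simulated waiting times for a degree-7 source
random.seed(1)
d_true, lam = 7, 2.0
samples = [random.expovariate(d_true * lam) for _ in range(20000)]
print(estimate_degree(samples, lam))   # close to 7
```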
  54. Reconstructing the degree distribution • Using 10n traces. [Figure: inferred vs. true degree distributions on Barabasi-Albert (1024 nodes), Facebook-Rice Undergraduate (1220 nodes), and Facebook-Rice Graduate (503 nodes).]
  55. Building on the First-Edge algorithm • First-Edge is close to optimal, but • naive and too conservative: it ignores most of the trace information • predictable performance: at most as many true-positive edges as the number of traces (and no false positives)
  56. Could we discover more true positives if we are willing to take more (calculated) risks?
  57-66. First-Edge+ • Idea: 1. Reconstruct the degree distribution 2. Guess edges by exploiting the memoryless property. [Figure build: s, with |N(s)| = d_s, is infected at t_0 and infects u, with |N(u)| = d_u, at t_1; there are then (d_s - 1) + (d_u - 1) edges waiting at time t_1, and any of these is equally likely to be the first to finish. When a node v is infected at t_2: s infected v with probability p(s,v) = (d_s - 1)/(d_s + d_u - 2), and u infected v with probability p(u,v) = (d_u - 1)/(d_s + d_u - 2).] Infer (x, y) if p(x,y) ≥ 0.5. Given a larger trace prefix u_1, ..., u_k (u_1 is the source), p(u_i, u_{k+1}) ≈ d_{u_i} / Σ_j d_{u_j}.
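The guessing rule can be sketched directly from these attribution probabilities. Function names are illustrative, degrees would come from the reconstructed degree distribution, and this uses the slides' approximation for longer prefixes (the exact two-node case subtracts the already-used edge from each degree):

```python
def attribution_probs(prefix_degrees):
    """Given (estimated) degrees d_{u_1}, ..., d_{u_k} of the nodes already
    infected in a trace prefix (u_1 is the source), memorylessness of the
    Exp incubation times gives p(u_i, u_{k+1}) ~ d_{u_i} / sum_j d_{u_j}
    as the chance that u_i is the one who infected the next node."""
    total = sum(prefix_degrees)
    return [d / total for d in prefix_degrees]

def guess_parent(probs, threshold=0.5):
    """Infer the edge (u_i, u_{k+1}) only when the most likely infector
    clears the threshold (the slides use p >= 0.5); otherwise abstain,
    as First-Edge does for everything past the first hop."""
    i = max(range(len(probs)), key=probs.__getitem__)
    return i if probs[i] >= threshold else None
```

This is the "calculated risk": a guess is emitted only when one already-infected node dominates the attribution mass, so false positives stay rare.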
  67-71. Experimental Inference Results. [Figure: inference performance of NetInf [Gomez-Rodriguez, Leskovec, Krause '2010], First-Edge, and First-Edge+ on Barabasi-Albert (1024 nodes, Δ = 174), a power-law tree (1024 nodes, Δ = 94), and Facebook (1220 nodes, Δ = 287), where Δ = max. degree.] • First-Edge+ exhibits competitive performance • NetInf's performance flattens • Our algorithm perfectly reconstructs trees with ~30 traces • First-Edge+: competitive performance, extremely simple to implement, computationally efficient, preemptive.
  72. Conclusions • Our results have direct implications for the design of network inference algorithms • We provide a rigorous analysis of the relationship between the amount of data and the performance of algorithms • We give algorithms that are competitive with, while being simpler and more efficient than, existing approaches
  73. Open questions and challenges • Performance guarantees for approximate reconstruction • Trace complexity under other distributions of incubation times • Bounded-degree network inference has trace complexity polynomial in Δ, but running time exponential in Δ: can we optimize the algorithm? • Other network properties that can be recovered without reconstructing the network
  74. Trace Complexity of Network Inference. Bruno Abrahao (Cornell), Flavio Chierichetti (Sapienza), Robert Kleinberg (Cornell), Alessandro Panconesi (Sapienza). Cornell University and Sapienza University. Complete version including all proofs: www.arxiv.org/abs/1308.2954 or http://www.cs.cornell.edu/~abrahao