Testing Forest-Isomorphism in the Adjacency List Model

Published in: Science, Technology
  1. Testing Forest-Isomorphism in the Adjacency List Model. Mitsuru Kusumoto†, Yuichi Yoshida†*. † : Preferred Infrastructure, Inc. * : National Institute of Informatics.
  2. Overview. Given two forests G and H, determine whether G ≅ H or G and H are far from being isomorphic, by looking at very small parts of G and H.
     Outline:
     • Introduction: property testing, problem setting
     • Our algorithms
  3. Introduction
  4. Property Testing. We want to solve decision problems as efficiently as possible.
     • Example: graph connectivity.
     • Standard setting: BFS is enough. → Θ(n) time.
     • Property testing: check whether G is connected or far from being connected. → O(1) time!?
     [Figure: a connected graph and a graph that is not connected.]
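For comparison with the property-testing setting, the standard-setting baseline mentioned on this slide can be sketched as follows. This is a minimal illustration, not taken from the slides: the function name `is_connected` and the dictionary-based graph representation are assumptions. On a sparse graph such as a forest the running time is linear in n, because every vertex and edge is visited at most a constant number of times.

```python
from collections import deque

def is_connected(adj):
    """Return True if the undirected graph {vertex: [neighbors]} is connected."""
    if not adj:
        return True  # convention chosen here: the empty graph counts as connected
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    # BFS from one vertex reaches everything exactly when the graph is connected.
    return len(seen) == len(adj)

path = {0: [1], 1: [0, 2], 2: [1]}
two_parts = {0: [1], 1: [0], 2: []}
print(is_connected(path), is_connected(two_parts))
```

A property tester, by contrast, is only allowed to inspect a small part of the graph, which is why it can only separate connected inputs from inputs that are far from connected.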
  5. Property Testing. A property testing algorithm is a (randomized) algorithm that, with high probability (e.g., ≥ 2/3), decides whether the input satisfies a property P or is far from satisfying P, using sublinear query or time complexity.
     Main interest:
     • What kinds of properties are efficiently testable?
     [Figure: graphs labeled connected, close to being connected, and far from being connected; we want to distinguish connected graphs from those far from being connected.]
  6. Graph Property Testing - Review. The efficiency of property testing algorithms depends on the input model.
     Adjacency matrix model:
         [0 1 0 1 0]
         [1 0 1 1 0]
     G = [0 1 0 0 1]
         [1 1 0 0 1]
         [0 0 1 1 0]
     • Input model for dense graphs. [GGR’98]
     • Many properties are testable (e.g., connectivity, △-freeness, ...).
     • Necessary and sufficient conditions for constant-time testability are known. [Alon+’09]
     Adjacency list model: for a vertex v whose 1st, 2nd, and 3rd edges lead to A, B, and C, the oracle answers O(v, 1) = A, O(v, 2) = B, O(v, 3) = C.
     • Input model for sparse graphs. [GR’02] [KKR’04]
     • Many properties are testable (e.g., connectivity, H-minor-freeness).
     • But many results assume the bounded-degree condition: the degrees of vertices must be bounded by some constant.
  7. Graph Property Testing - Review.
     Adjacency list model: for a vertex v whose 1st, 2nd, and 3rd edges lead to A, B, and C, the oracle answers O(v, 1) = A, O(v, 2) = B, O(v, 3) = C.
     • Input model for sparse graphs. [GR’02] [KKR’04]
     • Many properties are testable (e.g., connectivity, H-minor-freeness).
     • But many results assume the bounded-degree condition: the degrees of vertices must be bounded by some constant.
     What happens if we do not assume the bounded-degree condition?
     • Only a few efficient algorithms.
     • Many hardness results: △-freeness, k-colorability, etc., require Ω(√n) queries. [A+08, B+08, K+04]
     Question: Is it possible to obtain efficient algorithms for fundamental problems without the bounded-degree condition?
  8. Forest-Isomorphism. We focus on forest-isomorphism in the adjacency list model.
     Input: Two forests G and H represented by adjacency lists, and a proximity parameter ε > 0.
     Query model: We can access G and H via the following queries:
     • deg(v): returns the degree of vertex v.
     • adj(v, i): returns the vertex adjacent to v via its i-th edge.
     • random(): returns a randomly chosen vertex.
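The three queries above can be pictured as a thin wrapper around an adjacency list that also counts how many queries an algorithm spends. The sketch below is illustrative only: the class name `AdjListOracle`, the `queries` counter, the `seed` parameter, and the 1-indexed edge numbering (chosen to match O(v, 1) = A on the earlier slide) are assumptions, not part of the slides.

```python
import random

class AdjListOracle:
    """Hypothetical wrapper exposing only the three queries of the query model."""

    def __init__(self, adj, seed=None):
        self._adj = adj                    # {vertex: [neighbors in edge order]}
        self._vertices = list(adj)
        self._rng = random.Random(seed)
        self.queries = 0                   # number of queries issued so far

    def deg(self, v):
        """Degree of vertex v."""
        self.queries += 1
        return len(self._adj[v])

    def adj(self, v, i):
        """Vertex reached from v by its i-th edge (1-indexed, an assumption)."""
        self.queries += 1
        return self._adj[v][i - 1]

    def random(self):
        """A vertex drawn uniformly at random (uniformity is an assumption)."""
        self.queries += 1
        return self._rng.choice(self._vertices)

star = {"v": ["A", "B", "C"], "A": ["v"], "B": ["v"], "C": ["v"]}
oracle = AdjListOracle(star, seed=0)
print(oracle.deg("v"), oracle.adj("v", 1), oracle.queries)
```

Counting queries in the wrapper makes it easy to check empirically that a tester stays within a poly(log n) budget.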
  9. Forest-Isomorphism. We focus on forest-isomorphism in the adjacency list model.
     Input: Two forests G and H represented by adjacency lists, and a proximity parameter ε > 0.
     ε-farness: d(G, H) := the number of edge additions and deletions needed to transform G into H (graph edit distance). For ε > 0, (G, H) is ε-far from being isomorphic ⇔ d(G, H) ≥ εn.
     Objective: Decide whether G ≅ H or d(G, H) ≥ εn.
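To make the distance d(G, H) concrete, the following brute-force sketch tries every bijection between the vertex sets and counts the edges present in exactly one of the two graphs. It is exponential in n and only meant for toy inputs; the function names `edit_distance` and `is_eps_far`, and the assumption that both graphs live on vertices 0..n-1, are illustrative choices rather than anything from the slides.

```python
from itertools import permutations

def edit_distance(edges_g, edges_h, n):
    """Brute-force d(G, H) for graphs on vertices 0..n-1 (tiny n only)."""
    g = {frozenset(e) for e in edges_g}
    h = {frozenset(e) for e in edges_h}
    best = None
    for perm in permutations(range(n)):
        # Relabel G by v -> perm[v], then count edges in exactly one graph.
        mapped = {frozenset(perm[v] for v in e) for e in g}
        cost = len(mapped ^ h)
        if best is None or cost < best:
            best = cost
    return best

def is_eps_far(edges_g, edges_h, n, eps):
    """True when (G, H) is eps-far from being isomorphic: d(G, H) >= eps * n."""
    return edit_distance(edges_g, edges_h, n) >= eps * n

path4 = [(0, 1), (1, 2), (2, 3)]
star4 = [(0, 1), (0, 2), (0, 3)]
print(edit_distance(path4, star4, 4))
```

For the path and the star on four vertices, one deletion plus one addition suffices, so the distance is 2.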
  10. Forest-Isomorphism. We focus on forest-isomorphism in the adjacency list model.
      Motivation:
      • The problem is fundamental: a forest is a simple structure, and isomorphism is a theoretically important problem.
      • Isomorphism has been considered several times in the property testing literature. [AS’05, AS’08, NS’11]
  11. Forest-Isomorphism. We focus on forest-isomorphism in the adjacency list model.
      Related work:
      • If there is no restriction on the input, graph isomorphism testing in the adjacency list model requires Ω(√n) queries. [FM’08] This is good motivation for our focus on forests.
      • If the input is a bounded-degree hyperfinite graph, then graph isomorphism is constant-time testable. [NS’11] But if there is no degree bound, testability was unknown.
  12. Our Contribution. Query complexity: upper bound poly(log n); lower bound Ω(√log n).
      Furthermore, we obtained a more general result:
      • If the input is a forest, every graph property is testable with poly(log n) queries in the adjacency list model.
      • We use a technique similar to that of [Newman and Sohler’11].
  13. Overview of Our Algorithm
  14. Overview of Our Method.
      1. Partitioning oracle: We define a procedure that removes a small fraction of the edges to partition the graph into several parts with “good” properties. [Figure: the partitioning oracle applied to G and H.]
      2. We check whether each pair of corresponding parts in G and H is isomorphic or far from isomorphic. If G and H are far from being isomorphic, at least one pair of corresponding parts is also far from being isomorphic.
  15. Partitioning Oracle. Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G s.t.
      • |E(G) – E(G’)| ≤ εn / 3
      • Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.
      s-rooted tree: A tree T with a vertex v ∈ V(T) such that deg(v) ≥ s and every subtree hanging off v has size < s. (We call the vertex v the root.)
      s-bounded-degree tree: A tree in which every vertex has degree < s.
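The two kinds of components can be told apart directly from their definitions. The sketch below checks a single tree given as an adjacency dictionary; it is only a definitional check on a fully known tree, not the partitioning oracle itself, and the names `component_sizes_without` and `classify_tree` as well as the return labels are assumptions made for illustration.

```python
def component_sizes_without(adj, v):
    """Sizes of the subtrees hanging off v (components of the tree minus v)."""
    sizes = []
    for child in adj[v]:
        seen = {v, child}
        stack = [child]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(len(seen) - 1)  # do not count v itself
    return sizes

def classify_tree(adj, s):
    """Return ('bounded', None), ('rooted', root) or ('neither', None)."""
    # s-bounded-degree tree: every vertex has degree < s.
    if all(len(nbrs) < s for nbrs in adj.values()):
        return ("bounded", None)
    # s-rooted tree: some v has deg(v) >= s and all subtrees of size < s.
    for v, nbrs in adj.items():
        if len(nbrs) >= s and all(
            size < s for size in component_sizes_without(adj, v)
        ):
            return ("rooted", v)
    return ("neither", None)

star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(classify_tree(star, 3), classify_tree(path, 3))
```

A tree that is neither kind, such as two degree-3 hubs joined by an edge with s = 3, is exactly the case where the oracle has to remove edges.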
  16. Partitioning Oracle. Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G s.t.
      • |E(G) – E(G’)| ≤ εn / 3
      • Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.
      We can provide query access to G’.
      Alive edge query: Check whether edge (v, i) still exists in G’.
      • The subgraph G’ is chosen deterministically.
      • If G ≅ H, then G’ ≅ H’.
      [Example: for a vertex v with neighbors A, B, C, (v, 1) is not alive, (v, 2) is not alive, (v, 3) is alive.]
  17. Partitioning Oracle. Given ε > 0 and access to G, there exist an integer s = s(ε) and a subgraph G’ ⊆ G s.t.
      • |E(G) – E(G’)| ≤ εn / 3
      • Each connected component of G’ is either an s-bounded-degree tree or an s-rooted tree.
      So…
      • d(G, H) = 0 ⇒ d(G’, H’) = 0, because G’ and H’ are chosen deterministically.
      • d(G, H) ≥ εn ⇒ d(G’, H’) ≥ εn / 3, because we remove at most εn / 3 edges from each of G and H.
      Thus, it is enough to consider the partitioned graphs G’ and H’.
  18. Graph Partition. Suppose that G is obtained through the partitioning oracle. We split G into the following parts for some constants α, γ > 1.
      • G[0] := s-bounded-degree trees in G
      • G[1] := s-rooted trees in G with root degrees in [s, αγ)
      • G[2] := s-rooted trees in G with root degrees in [αγ, αγ^2)
      • G[3] := s-rooted trees in G with root degrees in [αγ^2, αγ^3)
      • ...
      There are O(log n) parts: G[0], G[1], G[2], ...
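The assignment of an s-rooted tree to a part depends only on its root degree, and the geometric growth of the intervals is what keeps the number of parts at O(log n). A minimal sketch of the index computation follows; the function name `part_index` and the concrete values of s, α, and γ in the usage line are hypothetical, since the slides do not fix them.

```python
def part_index(root_degree, s, alpha, gamma):
    """Index i >= 1 of the part G[i] for an s-rooted tree with this root degree."""
    if root_degree < s:
        raise ValueError("an s-rooted tree has root degree at least s")
    i = 1
    upper = alpha * gamma          # G[1] covers [s, alpha*gamma)
    while root_degree >= upper:    # G[i] covers [alpha*gamma^(i-1), alpha*gamma^i)
        upper *= gamma
        i += 1
    return i

# Hypothetical constants: s = 3, alpha = 2, gamma = 2.
print([part_index(d, 3, 2, 2) for d in (3, 4, 7, 8)])
```

Because the upper end of the interval is multiplied by γ at every step and degrees are below n, the loop runs at most about log_γ n times.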
  19. Isomorphism between Corresponding Parts. The graph partition is useful in the following sense.
      Lemma. d(G, H) ≤ Σ_i d(G[i], H[i]).
      Proof. Transforming G[i] into H[i] for every i transforms G into H. □
      Corollary. If d(G, H) ≥ εn, then for any β_i > 0 with Σ_i β_i = ε, ∃i s.t. d(G[i], H[i]) ≥ β_i n. □
      Thus, it suffices to test isomorphism between G[i] and H[i] for each i = 0, 1, 2, ….
      We set β_0 = ε/2 and β_1 = β_2 = … = O(ε / log n).
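The corollary follows from the lemma by contraposition. Suppose that every part is close, that is, d(G[i], H[i]) < β_i n for all i. Summing over i and applying the lemma gives the chain below, which contradicts d(G, H) ≥ εn, so at least one part must satisfy d(G[i], H[i]) ≥ β_i n.

```latex
\[
  d(G,H) \;\le\; \sum_i d\bigl(G[i],H[i]\bigr)
         \;<\; \sum_i \beta_i\, n
         \;=\; \varepsilon n .
\]
```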
  20. Isomorphism between Corresponding Parts. Testing G[i] ≅ H[i]:
      • For i = 0: We can use a tester for the bounded-degree model [NS’11].
      • For i ≥ 1: We develop a new algorithm.
      Sketch: We randomly sample root vertices. For each sampled root, we randomly sample its subtrees and build a histogram of the subtrees. We then compute the minimum matching between the histograms in G and H. This minimum matching turns out to be a good approximation to d(G, H).
      [Figure: a histogram of subtree shapes with counts 2, 2, 1, ….]
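The histogram step can be illustrated on a single root. The sketch below is not the authors' algorithm: it encodes each subtree of a root with an AHU-style canonical string (an assumed choice of canonical form), builds the histogram either exactly or from k uniform samples, and compares two histograms with a plain L1 distance as a simple stand-in for the minimum matching named on the slide. All function names are hypothetical.

```python
from collections import Counter
import random

def canonical_form(adj, v, parent):
    """AHU-style string that is equal for isomorphic rooted subtrees."""
    children = sorted(canonical_form(adj, c, v) for c in adj[v] if c != parent)
    return "(" + "".join(children) + ")"

def subtree_histogram(adj, root):
    """Exact histogram: canonical form of each subtree of root -> count."""
    return Counter(canonical_form(adj, c, root) for c in adj[root])

def sampled_histogram(adj, root, k, rng):
    """Histogram over k subtrees of root sampled uniformly with replacement."""
    picks = [rng.choice(adj[root]) for _ in range(k)]
    return Counter(canonical_form(adj, c, root) for c in picks)

def l1_distance(hist_a, hist_b):
    """Sum of absolute count differences; 0 when the histograms coincide."""
    keys = set(hist_a) | set(hist_b)
    return sum(abs(hist_a[key] - hist_b[key]) for key in keys)

# Root r with two leaf subtrees and one subtree that is a path on two vertices.
g = {"r": ["a", "b", "c"], "a": ["r"], "b": ["r"], "c": ["r", "d"], "d": ["c"]}
# The same shape with different labels and neighbor order.
h = {"x": ["p", "q", "u"], "p": ["x", "w"], "w": ["p"], "q": ["x"], "u": ["x"]}
print(l1_distance(subtree_histogram(g, "r"), subtree_histogram(h, "x")))
print(sampled_histogram(g, "r", 5, random.Random(0)))
```

Using a canonical string means two subtrees land in the same histogram bin exactly when they are isomorphic as rooted trees, which is what makes the histograms of isomorphic inputs agree.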
  21. Conclusion.
      • If the input is a forest, every graph property is testable with poly(log n) queries.
      Query complexity: upper bound poly(log n), actually O(log^{2^{poly(1/ε)}} n); lower bound Ω(√log n).
      Future work: Can we obtain similar results for graph classes larger than forests? Outerplanar graphs, bounded-treewidth graphs, scale-free graphs, …
  22. Appendix: Lower bound
  23. Lower bound - Overview.
      1. We construct two distributions over inputs, D1 and D2:
         ∀(G, H) ∈ D1, G ≅ H
         ∀(G, H) ∈ D2, d(G, H) ≥ n/8
      2. We reduce isomorphism testing to checking whether two probability distributions are the same. This requires Ω(√N) queries for distributions over N elements.
  24. Lower bound. Let F_k := n / (2^k log n) copies of a star graph with 2^k vertices. (Note that |F_k| = n / log n.)
      [Figure: the groups F_0, F_1, F_2, F_3, …, F_{log n}.]
  25. Lower bound. Construct two distributions D1, D2:
      • D1: G = H.
      • D2: randomly assign each F_k to either G or H so that |V(G)| = |V(H)|.
      [Figure: in D1, G = F_0 ∪ F_1 ∪ … ∪ F_{log n} and H = F_0 ∪ F_1 ∪ … ∪ F_{log n}; in D2, each of F_0, F_1, …, F_{log n} goes to either G or H.]
  26. Lower bound. Because we can perform only random sampling and degree/neighbor queries, checking whether G ≅ H is equivalent to checking whether two probability distributions are the same.
      Lemma. We need Ω(√log n) queries to distinguish D1 and D2.
      [Figure: probability of observing each of F_0, F_1, F_2, …, F_{log n} by random sampling, for G and H, compared with the case G = H.]
  27. Lower bound.
      Lemma. ∀(G, H) ∈ D2, d(G, H) ≥ n/8.
      Proof. Let Φ: V(G) → V(H) be a bijection that achieves the minimum graph edit distance. It holds that
         d(G, H) ≥ Σ_{v ∈ V(G)} |deg(v) – deg(Φ(v))| / 2.
      If we restrict v in the sum to the roots of the stars, we obtain
         d(G, H) ≥ Σ_{k=2,3,4,...} (n / (2^k log n)) ∙ 2^(k-1) / 2 ≥ n/8. □
      Thus, the Ω(√log n) lower bound holds.
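As a check on the arithmetic in the last step of the proof, each contributing value of k adds the same amount, because the powers of 2 cancel. The second identity below shows how many contributing values of k are needed to reach n/8; that count of (log n)/2 is an inference from the random assignment in D2 and is not stated on the slide.

```latex
\[
  \frac{n}{2^{k}\log n}\cdot\frac{2^{k-1}}{2} \;=\; \frac{n}{4\log n},
  \qquad
  \frac{\log n}{2}\cdot\frac{n}{4\log n} \;=\; \frac{n}{8}.
\]
```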
