Successfully reported this slideshow.
Upcoming SlideShare
×

# Graph generation using a graph grammar

1,127 views

Published on

This presentation material provides an introduction to graph grammar and its application to learning a graph generative model. Presented at IBIS 2019, Nagoya, Japan.

Published in: Data & Analytics
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Very nice tips on this. In case you need help on any kind of academic writing visit website ⇒ www.HelpWriting.net ⇐ and place your order

Are you sure you want to  Yes  No
• My brother found Custom Writing Service ⇒ www.WritePaper.info ⇐ and ordered a couple of works. Their customer service is outstanding, never left a query unanswered.

Are you sure you want to  Yes  No

### Graph generation using a graph grammar

1. 1. © 2019 IBM Corporation Graph generation using a graph grammar Hiroshi Kajino IBM Research - Tokyo IBIS 2019
2. 2. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 2 Contents
3. 3. © 2019 IBM Corporation /52 A molecular graph should satisfy some constraints to be valid §Learning a generative model of a molecular graph –Input: set of molecular graphs ! = {\$%, \$', … , \$)} –Output: probability distribution +(\$) such that \$. ∼ + §Hard vs. soft constraints on +(\$)’s support –Hard constraint: valence condition ∃ rule-based classifier that judges this constraint –Soft constraint: stability ∄ rule-based classifier, in general 3 Why should we care about a formal language? !" Formal language can help
4. 4. © 2019 IBM Corporation /52 A formal language defines a set of strings with certain properties; an associated grammar tells us how to generate them §Formal language Typically deﬁned as a set of strings –Language point of view = a set of strings with certain properties (= a subset of all possible strings) –Generative point of view A grammar is often associated with a language = how to generate strings in ℒ 4 Why should we care about a formal language? All possible strings Σ∗ Language ℒ Grammar 5 ℒ = 6.7. ∶ 9 ≥ 1 ⊂ 6, 7 ∗ = Σ∗ 5 = = , 6, 7 , =, = → 67, = → 6=7
5. 5. © 2019 IBM Corporation /52 A formal language deﬁnes a set of graphs satisfying hard constraints; an associated grammar tells us how to generate them §Formal language Typically deﬁned as a set of graphs –Language point of view = a set of graphs satisfying the hard constraints (= a subset of all possible graphs) –Generative point of view A grammar is often associated with a language = how to generate graphs in ℒ 5 Why should we care about a formal language? All possible graphs Σ∗ Language ℒ Grammar 5 ℒ = Molecules satisfying the valence conditions ⊂ {All possible graphs} ???
6. 6. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 6 Contents
7. 7. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 7 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = =
8. 8. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 8 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = =
9. 9. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 9 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = 6=7
10. 10. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 10 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = 6=7
11. 11. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 11 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = 66=77
12. 12. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 12 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = 66=77
13. 13. © 2019 IBM Corporation /52 CFG generates a string by repeatedly applying a production rule to a non-terminal, until there exists no non-terminal §Context-free grammar 5 = (T, Σ, U, =) – T: set of non-terminals – Σ: set of terminals – U: set of production rules – = ∈ T: the start symbol 13 Context-free grammar § Example – T = {=} – Σ = 6, 7 – U = {= → 67, = → 6=7} – = 666777
14. 14. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 14 Contents
15. 15. © 2019 IBM Corporation /52 Hypergraph is a generalization of a graph §Hypergraph W = (T, X) consists of… –Node Y ∈ T –Hyperedge Z ∈ X ⊆ 2 ] : Connect an arbitrary number of nodes cf, An edge in a graph connects exactly two nodes 15 Hyperedge replacement grammar HyperedgeNode
16. 16. © 2019 IBM Corporation /52 HRG generates a hypergraph by repeatedly replacing non-terminal hyperedges with hypergraphs §Hyperedge replacement grammar (HRG) 5 = (T, Σ, U, =) –T: set of non-terminals –Σ: set of terminals –=: start symbol –U: set of production rules – A rule replaces a non-terminal hyperedge with a hypergraph 16 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Labels on hyperedges C N S [Feder, 71], [Pavlidis+, 72]
17. 17. © 2019 IBM Corporation /52 Start from start symbol S 17 Hyperedge replacement grammar S 1C 2 H H1N 2 N N NS Production rules ^
18. 18. © 2019 IBM Corporation /52 The left rule is applicable 18 Hyperedge replacement grammar S 1C 2 H H1N 2 N N NS Production rules ^
19. 19. © 2019 IBM Corporation /52 We obtain a hypergraph with three non-terminals 19 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ N N N
20. 20. © 2019 IBM Corporation /52 Apply the right rule to one of the non-terminals 20 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ N N N
21. 21. © 2019 IBM Corporation /52 Two non-terminals remain 21 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ C N N H H
22. 22. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 22 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ C N N H H
23. 23. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 23 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ C C N H H H H
24. 24. © 2019 IBM Corporation /52 Repeat the procedure until there is no non-terminal 24 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ C C N H H H H
25. 25. © 2019 IBM Corporation /52 Graph generation halts when there is no non-terminal 25 Hyperedge replacement grammar 1C 2 H H1N 2 N N NS Production rules ^ C C C H H H H H H
26. 26. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 26 Contents
27. 27. © 2019 IBM Corporation /52 HRG inference algorithm outputs HRG that can reconstruct the input §HRG inference algorithm [Aguiñaga+, 16] –Input: Set of hypergraphs ℋ –Output: HRG such that ℋ ⊆ ℒ(HRG) Minimum requirement of the grammar’s expressiveness –Idea: Infer production rules necessary to obtain each hypergraph Decompose each hypergraph into a set of production rules 27 HRG inference algorithm Language of the grammar
28. 28. © 2019 IBM Corporation /52 Tree decomposition discovers a tree-like structure in a graph §Tree decomposition –All the nodes and edges must be included in the tree –For each node, the tree nodes that contain it must be connected 28 HRG inference algorithm * Digits represent the node correspondence 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H C C H HH H C C H HH H
29. 29. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 29 HRG inference algorithm 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H C C H HH H C C H HH H
30. 30. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 31 HRG inference algorithm 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H 1C 4 H H 1 4 N 1 3 4 1 4 N Production rule = attach
31. 31. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 32 HRG inference algorithm 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H 1C 4 H H 1 4 N 1 3 4 1 4 N Production rule = attach Children
32. 32. © 2019 IBM Corporation /52 Tree decomposition and (a syntax tree of) HRG are equivalent §Relationship between tree decomposition and HRG 1. Connecting hypergraphs in tree recovers the original hypergraph 2. Connection ⇔ Hyperedge replacement 33 HRG inference algorithm 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H 1C 4 H H 1 4 N 1 3 4 1 4 N Production rule = attach N N
33. 33. © 2019 IBM Corporation /52 HRG can be inferred from tree decompositions of input hypergraphs. §HRG inference algorithm [Aguiñaga+, 16] –Algorithm: –Expressiveness: ℋ ⊆ ℒ(HRG) The resultant HRG can generate all input hypergraphs. (∵ clear from its algorithm) 34 HRG inference algorithm 1. Compute tree decompositions of input hypergraphs 2. Extract production rules 3. Compose HRG by taking their union
34. 34. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 35 Contents
35. 35. © 2019 IBM Corporation /52 We want a graph grammar that guarantees hard constraints §Objective Construct a graph grammar that never violates the valence condition §Application: Generative model of a molecule –Grammar-based generation guarantees the valence condition –Probabilistic model could learn soft constraints 36 Molecular hypergraph grammar !"
36. 36. © 2019 IBM Corporation /52 A simple application to molecular graphs doesn’t work §A simple application to molecular graphs –Input: Molecular graphs –Issue: Valence conditions can be violated 37 Molecular hypergraph grammar H C C H H H H H Input C H H C C H H C H C C C C H Tree decomposition → → → S NC NC NH C NC C N N N N C H Extracted rules This rule increases the degree of carbon
37. 37. © 2019 IBM Corporation /52 Our idea is to use a hypergraph representation of a molecule §Conserved quantity –HRG: # of nodes in a hyperedge –Our grammar: # of bonds connected to each atom (valence) ∴ Atom should be modeled as a hyperedge §Molecular hypergraph – Atom = hyperedge – Bond = node 38 Molecular hypergraph grammar C C H H HH H H C C H HH H C C H HH H
38. 38. © 2019 IBM Corporation /52 A language for molecular hypergraphs consists of two properties §Molecular hypergraph as a language A set of hypergraphs with the following properties: 1. Each node has degree 2 (=2-regular) 2. Label on a hyperedge determines # of nodes it has (= valence) 39 Molecular hypergraph grammar H HC H C H H HC H C H C C H H HH H H C C H HH H C C H HH H
39. 39. © 2019 IBM Corporation /52 MHG, a grammar for the language, is deﬁned as a subclass of HRG §Molecular Hypergraph Grammar (MHG) –Deﬁnition: HRG that generates molecular hypergraphs only –Counterexamples: 40 Molecular hypergraph grammar MHG HRG C C H H HH H HH C C H H HH H HH Valence # 2-regularity # This can be avoided by learning HRG from data [Aguiñaga+, 16] Use an irredundant tree decomposition (our contribution)
40. 40. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 41 Contents
41. 41. © 2019 IBM Corporation /52 A naive application of the existing algorithm doesn’t work §Naive application of the HRG inference algorithm [Aguiñaga+, 16] –Input: Set of hypergraphs –Output: HRG w/ the following properties: • All the input hypergraphs are in the language \$ • Guarantee the valence conditions \$ • No guarantee on 2-regularity % 42 MHG inference algorithm C C H H HH H HH This cannot be transformed into a molecular graph
42. 42. © 2019 IBM Corporation /52 Irredundant tree decomposition is a key to guarantee 2-regularity §Irredundant tree decomposition –The connected subgraph induced by a node must be a path –Any tree decomposition can be made irredundant in poly-time 43 MHG inference algorithm 1 3 2 1 3 4 3C 4 H H 3 C 2 H H 1C 4 H H 1 C 2 H H 4 RedundantIrredundant
43. 43. © 2019 IBM Corporation /52 MHG inference algorithm is diﬀerent from the existing one by two steps §MHG Inference algorithm [Kajino, 19] –Input: Set of molecular graphs –Output: MHG w/ the following properties: • All the input hypergraphs are in the language \$ • Guarantee the valence conditions \$ • Guarantee 2-regularity & 44 MHG inference algorithm 1. Convert molecular graphs into molecular hypergraphs 2. Compute tree decompositions of molecular hypergraphs 3. Convert each tree decomposition to be irredundant 4. Extract production rules 5. Compose MHG by taking their union Thanks to HRG Our contribution
44. 44. © 2019 IBM Corporation /52 I will talk about an application of formal language to a graph generation problem §Formal language –Context-free grammar –Hyperedge replacement grammar (HRG) –HRG inference algorithm §Application to molecular graph generation –Molecular hypergraph grammar (a special case of HRG) –MHG inference algorithm –Combination with VAE 45 Contents
45. 45. © 2019 IBM Corporation /52 We obtain (Enc, Dec) between molecule and latent vector by combining MHG and RNN-VAE §MHG-VAE: (Enc, Dec) between molecule & latent vector 46 Combination with VAE EncG EncN EncH Molecular graph Molecular hypergraph Parse Tree according to MHG ! ∈ ℝ\$ Latent vector MHG-VAE encoder MHG Enc of RNN-VAE
46. 46. © 2019 IBM Corporation /52 First, we learn (Enc, Dec) between a molecule and its vector representation using MHG-VAE §Global molecular optimization [Gómez-Bombarelli+, 16] –Find: Molecule that maximizes the target –Method: VAE+BO 1. Obtain MHG from the input molecules 2. Train RNN-VAE on syntax trees 3. Obtain vector representations f. ∈ ℝh .i% ) 1. Some of which have target values {j. ∈ ℝ} 4. BO gives us candidates fk ∈ ℝh ki% l that may maximize the target 5. Decode them to obtain molecules !k ki% l 47 Combination with VAE Image from [Gómez-Bombarelli+, 16]
47. 47. © 2019 IBM Corporation /52 Given vector representations and their target values, we use BO to obtain a vector that optimizes the target §Global molecular optimization [Gómez-Bombarelli+, 16] –Find: Molecule that maximizes the target –Method: VAE+BO 1. Obtain MHG from the input molecules 2. Train RNN-VAE on syntax trees 3. Obtain vector representations f. ∈ ℝh .i% ) 1. Some of which have target values {j. ∈ ℝ} 4. BO gives us candidates fk ∈ ℝh ki% l that may maximize the target 5. Decode them to obtain molecules !k ki% l 48 Combination with VAE Image from [Gómez-Bombarelli+, 16]
48. 48. © 2019 IBM Corporation /52 We evaluate the beneﬁt of our grammar-based representation, compared with existing ones §Empirical study –Purpose: How much does our representation facilitate VAE training? –Baselines: • {C,G,SD}VAE use SMILES (text repr.) • JT-VAE assembles molecular components – It requires NNs other than VAE for scoring –Tasks: • VAE reconstruction • Valid prior ratio • Global molecular optimization 49 Combination with VAE Image from [Jin+, 18]
49. 49. © 2019 IBM Corporation /52 Our grammar-based representation achieves better scores. This result empirically supports the eﬀectiveness of our approach. §Result 50 Combination with VAE Synthetic accessibility score Penalty to a ring larger than six Water solubility Maximizing m(n)
50. 50. © 2019 IBM Corporation /52 A graph grammar can be a building block for a graph generative model §Classify constraints into hard ones and soft ones ML for the soft ones, rules for the hard ones §Deﬁne a language by encoding hard constraints E.g., valence conditions §Design a grammar for the language Sometime, w/ an inference algorithm Code is now public on Github https://github.com/ibm-research-tokyo/graph_grammar 51 Takeaways
51. 51. © 2019 IBM Corporation /52 [Aguiñaga+, 16] Aguiñaga, S., Palacios, R., Chiang, D., and Weninger, T.: Growing graphs from hyperedge replacement graph grammars. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 469–478, 2016. [Feder, 71] Feder, J: Plex languages. Information Sciences, 3, pp. 225-241, 1971. [Gómez-Bombarelli+, 16] Gómez-Bombarelli, R., Wei, J. N., Duvenaud, D., Hernández-Lobato, J. M., Sánchez- Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T. D., Adams, R. P., and Aspuru-Guzik, A.: Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 2018. (ArXiv ver. appears in 2016) [Jin+, 18] Jin, W., Barzilay, R., and Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In Proceedings of the Thirty-ﬁfth International Conference on Machine Learning, 2018. [Kajino, 19] Kajino, H.: Molecular hypergraph grammar with its application to molecular optimization. In Proceedings of the Thirty-sixth International Conference on Machine Learning, 2019. [Pavlidis, 72] Pavlidis, T.: Linear and context-free graph grammars. Journal of the ACM, 19(1), pp.11-23, 1972. [You+, 18] You, J., Liu, B., Ying, Z., Pande, V., and Leskovec, J.: Graph convolutional policy network for goal-directed molecular graph generation. In Advances in Neural Information Processing Systems 31, pp. 6412–6422, 2018. . 52 References