Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
SLATE 2015, Madrid, Spain June 18-19, 2015
1/67
The Application of Grammar
Inference to Software
Language Engineering
Marj...
SLATE 2015, Madrid, Spain June 18-19, 2015
2/67
Outline of the Presentation
• What is a Grammar Inference (GI)?
• A short ...
SLATE 2015, Madrid, Spain June 18-19, 2015
3/67
What is a Grammar Inference (GI)?
• Grammatical inference is a process of
...
SLATE 2015, Madrid, Spain June 18-19, 2015
4/67
What is a Grammar Inference (GI)?
• Given a sentence ps and a grammar G we...
SLATE 2015, Madrid, Spain June 18-19, 2015
5/67
What is a Grammar Inference (GI)?
• Given a set S+ and S-, which might be ...
SLATE 2015, Madrid, Spain June 18-19, 2015
6/67
What is a Grammar Inference (GI)?
print 5
print a where a=10
print b+1 whe...
SLATE 2015, Madrid, Spain June 18-19, 2015
7/67
A short introduction to SLE
• Software Language Engineering (SLE) is a
you...
SLATE 2015, Madrid, Spain June 18-19, 2015
8/67
A short introduction to SLE
Technicalliterature
Existing
implementations
C...
SLATE 2015, Madrid, Spain June 18-19, 2015
9/67
Some applications of GI to SLE
• A. Stevenson, J. R. Cordy. A Survey of gr...
SLATE 2015, Madrid, Spain June 18-19, 2015
10/67
Some applications of GI to SLE
• Inference of GPLs
– Problem is still too...
SLATE 2015, Madrid, Spain June 18-19, 2015
11/67
Some applications of GI to SLE
• Inference of GPLs
– Restrictive assumpti...
SLATE 2015, Madrid, Spain June 18-19, 2015
12/67
Some applications of GI to SLE
• HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, B...
SLATE 2015, Madrid, Spain June 18-19, 2015
13/67
Some applications of GI to SLE
• Initial grammar (ANSI C):
1. translation...
SLATE 2015, Madrid, Spain June 18-19, 2015
14/67
Some applications of GI to SLE
• Initial grammar (ANSI C):
int main() {
c...
SLATE 2015, Madrid, Spain June 18-19, 2015
15/67
Some applications of GI to SLE
• Inferred Grammar:
1. translation unit ::...
SLATE 2015, Madrid, Spain June 18-19, 2015
16/67
Some applications of GI to SLE
• Inference of DSLs
– A DSL grammar is usu...
SLATE 2015, Madrid, Spain June 18-19, 2015
17/67
Some applications of GI to SLE
• 12 input samples of DESK language on whi...
SLATE 2015, Madrid, Spain June 18-19, 2015
18/67
Some applications of GI to SLE
Original grammar:
1. DESK ::= print E C
2....
SLATE 2015, Madrid, Spain June 18-19, 2015
19/67
Some applications of GI to SLE
DSL for hypertree description
SLATE 2015, Madrid, Spain June 18-19, 2015
20/67
Some applications of GI to SLE
Inferred grammar for hypertree description...
SLATE 2015, Madrid, Spain June 18-19, 2015
21/67
Some applications of GI to SLE
• As a model conforms to a metamodel in a
...
SLATE 2015, Madrid, Spain June 18-19, 2015
22/67
Some applications of GI to SLE
• A set of positive samples is denoted wit...
SLATE 2015, Madrid, Spain June 18-19, 2015
23/67
Some applications of GI to SLE
• Given a set of positive samples S+ and s...
SLATE 2015, Madrid, Spain June 18-19, 2015
24/67
Some applications of GI to SLE
• JAVED, Faizan, MERNIK, Marjan, GRAY, Jef...
SLATE 2015, Madrid, Spain June 18-19, 2015
25/67
Some applications of GI to SLE
• ESML (Embedded System Modeling Language)
SLATE 2015, Madrid, Spain June 18-19, 2015
26/67
Some applications of GI to SLE
• Original ESML metamodel - Configuration ...
SLATE 2015, Madrid, Spain June 18-19, 2015
27/67
Some applications of GI to SLE
• Inferred ESML metamodel - Configuration ...
SLATE 2015, Madrid, Spain June 18-19, 2015
28/67
Some applications of GI to SLE
• Inference of Visual Languages (based on ...
SLATE 2015, Madrid, Spain June 18-19, 2015
29/67
Graph grammar inference
Positive and negative samples for hydrocarbons
wi...
SLATE 2015, Madrid, Spain June 18-19, 2015
30/67
Graph grammar inference
Inferred graph grammar
SLATE 2015, Madrid, Spain June 18-19, 2015
31/67
Graph grammar inference
Positive samples
for flowcharts
SLATE 2015, Madrid, Spain June 18-19, 2015
32/67
Graph grammar inference
Inferred graph grammar
SLATE 2015, Madrid, Spain June 18-19, 2015
33/67
Some applications of GI to SLE
• Inference of execution traces
– GI is we...
SLATE 2015, Madrid, Spain June 18-19, 2015
34/67
Some applications of GI to SLE
• Javier Canovas, Jordi Cabot, Jesus Lopez...
SLATE 2015, Madrid, Spain June 18-19, 2015
35/67
Some applications of GI to SLE
• Javier Canovas, Jordi Cabot, Jesus Lopez...
SLATE 2015, Madrid, Spain June 18-19, 2015
36/67
Some applications of GI to SLE
• Model evolution using metamodel inference
SLATE 2015, Madrid, Spain June 18-19, 2015
37/67
Some applications of GI to SE
• Applications to Software Engineering (SE)...
SLATE 2015, Madrid, Spain June 18-19, 2015
38/67
Some applications of GI to SE
• A grammar-based system is any system that...
SLATE 2015, Madrid, Spain June 18-19, 2015
39/67
Some applications of GI to SE
• Applications to Software Engineering (SE)...
SLATE 2015, Madrid, Spain June 18-19, 2015
40/67
Inside GI
• Several theoretical learning models.
– Identification in the ...
SLATE 2015, Madrid, Spain June 18-19, 2015
41/67
Inside GI
• Gold Theorem (1967) - it is impossible to
identify any of the...
SLATE 2015, Madrid, Spain June 18-19, 2015
42/67
Inside GI
• Intuitively, Gold's theorem can be explained
by recognizing t...
SLATE 2015, Madrid, Spain June 18-19, 2015
43/67
Inside GI
• A lot of research has been done on
extraction of context-free...
SLATE 2015, Madrid, Spain June 18-19, 2015
44/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
45/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
46/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
47/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
48/67
Inside GI
SLATE 2015, Madrid, Spain June 18-19, 2015
49/67
Inside GI
• Memetic algorithms are evolutionary
algorithms with local sea...
SLATE 2015, Madrid, Spain June 18-19, 2015
50/67
Inside GI
example n
example 1
regular
definitions
...
initiali-
zation
lo...
SLATE 2015, Madrid, Spain June 18-19, 2015
51/67
Inside GI
• Sequitur: http://sequitur.info/
• abcabdabcabd
0 → 1 1
1 → 2 ...
SLATE 2015, Madrid, Spain June 18-19, 2015
52/67
Inside GI
print a where c=2
print 5+b where b = 10print id where id=num
p...
SLATE 2015, Madrid, Spain June 18-19, 2015
53/67
Inside GI
Apply diff command!
1a2,3
> num
> +
print id where id=num
print...
SLATE 2015, Madrid, Spain June 18-19, 2015
54/67
Inside GI
Start with the grammar
that parses first sample:
print a where ...
SLATE 2015, Madrid, Spain June 18-19, 2015
55/67
Inside GI
• Input samples:
s1,s2,...,sn (true positive)
s1,s2,...,sk,a1,....
SLATE 2015, Madrid, Spain June 18-19, 2015
56/67
Inside GI
• Nx → α1 • α2
– if
Nx ::= α1 N1 α2
N1 ::= ai+1 ... am
N1 ::= ε...
SLATE 2015, Madrid, Spain June 18-19, 2015
57/67
Inside GI
print a where c=2
print 5+b where b = 10
N1 → print • N2 where ...
SLATE 2015, Madrid, Spain June 18-19, 2015
58/67
Inside GI
Production: Nx ::= α1 Ny α2
Option
Nx ::= α1 Nz α2
Nz ::= Ny
Nz...
SLATE 2015, Madrid, Spain June 18-19, 2015
59/67
Inside GI
Nx ::= α Ny Nx ::= Ny Ny
Ny ::= α Ny ::= α
Ny ::= β Ny ::= βWha...
SLATE 2015, Madrid, Spain June 18-19, 2015
60/67
Semantic inference
a) Grammar G:
E ::= T EE
EE ::= + T EE
EE ::= ε
T ::= ...
SLATE 2015, Madrid, Spain June 18-19, 2015
61/67
Semantic inference
// infered attribute grammars
E ::= T EE
{ E[0].vals =...
SLATE 2015, Madrid, Spain June 18-19, 2015
62/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
63/67
Semantic inference
Actually, the prototypical implementation found two ad...
SLATE 2015, Madrid, Spain June 18-19, 2015
64/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
65/67
Semantic inference
SLATE 2015, Madrid, Spain June 18-19, 2015
66/67
Conclusion
Hope that I convinced you
that grammatical inference is
intere...
SLATE 2015, Madrid, Spain June 18-19, 2015
67/67
Conclusion
Collaborators:
M. Črepinšek, D. Hrnčič1,
B. Bryant2, A. Spragu...
Upcoming SlideShare
Loading in …5
×

The Application of Grammar Inference to Software Language Engineering

695 views

Published on

Conferencia impartida por el Dr. Marjan Mernik el 19 de Junio de 2015 dentro del ciclo de conferencias de Posgrado

Published in: Education
  • Be the first to comment

  • Be the first to like this

The Application of Grammar Inference to Software Language Engineering

  1. 1. SLATE 2015, Madrid, Spain June 18-19, 2015 1/67 The Application of Grammar Inference to Software Language Engineering Marjan Mernik University of Maribor, Slovenia
  2. 2. SLATE 2015, Madrid, Spain June 18-19, 2015 2/67 Outline of the Presentation • What is a Grammar Inference (GI)? • A short introduction to Software Language Engineering (SLE) • Some applications of GI to SLE and SE (CFG inference, Metamodel inference, Graph grammar inference, semantic inference) • Inside GI • Conclusion
  3. 3. SLATE 2015, Madrid, Spain June 18-19, 2015 3/67 What is a Grammar Inference (GI)? • Grammatical inference is a process of learning the grammar from positive (and negative) language samples (sentences). • Grammatical inference attracts researchers from different fields such as pattern recognition, computational linguistic, natural language acquisition, software engineering, ...
  4. 4. SLATE 2015, Madrid, Spain June 18-19, 2015 4/67 What is a Grammar Inference (GI)? • Given a sentence ps and a grammar G we can tell whether ps belongs to L(G) (ps ∈ L(G)). Such sentence is called positive sample. • A set of positive samples is denoted with S+. In similar manner we can defined set of negative samples S-. Those samples do not belong to L(G) and can no be derived from starting symbol S.
  5. 5. SLATE 2015, Madrid, Spain June 18-19, 2015 5/67 What is a Grammar Inference (GI)? • Given a set S+ and S-, which might be also empty, the task of grammar inference is to find at least one grammar G such that S+⊆L(G) and S-⊆L (G). • A set of positive samples S+ of a L(G) is structurally complete if each grammar production is used in the generation of at least one sentence in S+.
  6. 6. SLATE 2015, Madrid, Spain June 18-19, 2015 6/67 What is a Grammar Inference (GI)? print 5 print a where a=10 print b+1 where b=1 print a+b+2 where a=1, b=2 What computer language she used? Try out our newly developed grammar inference algorithm! What is a grammar of this language?
  7. 7. SLATE 2015, Madrid, Spain June 18-19, 2015 7/67 A short introduction to SLE • Software Language Engineering (SLE) is a young engineering discipline with the aim of establishing a systematic and rigorous approach to the development, use, and maintenance of computer languages, which comprises specification, modeling and programming languages. • SLE is equally focused on General-Purpose Langauges (GPLs) and Domain-Specific Languages (DSLs)
  8. 8. SLATE 2015, Madrid, Spain June 18-19, 2015 8/67 A short introduction to SLE Technicalliterature Existing implementations Customsurveys Expertadvice Currentandfuture requirements Terminology Concepts Commonalities Variations Syntax Semantics DSLDSL Possible existing implementations Requirements DSL DSL development phases
  9. 9. SLATE 2015, Madrid, Spain June 18-19, 2015 9/67 Some applications of GI to SLE • A. Stevenson, J. R. Cordy. A Survey of grammatical inference in software engineering. Science of Computer Programming 96 (2014) 444-459. – Inference of GPLs – Inference of DSLs (grammar and metamodel based) – Inference of Visual Languages (based on graph grammars) – Inference of execution traces
  10. 10. SLATE 2015, Madrid, Spain June 18-19, 2015 10/67 Some applications of GI to SLE • Inference of GPLs – Problem is still too complex to infer a GPL with several hundreds of production rules. – Several successful attempts have been developed to infer a dialect GPL grammar G’ from existing GPL grammar G. M. Di Penta, P. Lombardi, K. Taneja, L. Troiano. Search- Based Inference of Dialect Grammars. Soft Computing 12 (2008) 51-66. A. Dubej, P. Jalote, S. Aggarval. Learning context free grammar rules from a set of programs. IET Software 2 (2008) 223-240.
  11. 11. SLATE 2015, Madrid, Spain June 18-19, 2015 11/67 Some applications of GI to SLE • Inference of GPLs – Restrictive assumption have been declared in these works: 1. G and G’ share the same set of non-terminals 2. T ⊆ T’ 3. (T’-T) consists only of keywords and operators 4. Production rules can be only added and not removed 5. Missing rules always begin with a keyword
  12. 12. SLATE 2015, Madrid, Spain June 18-19, 2015 12/67 Some applications of GI to SLE • HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard. Embedding DSLS into GPLS: A Grammatical Inference Approach. Information Technology and Control , 2011, vol. 40, no. 4, pp. 307-315. • Our approach can be used also for syntax extensions and for DSL embedding – To embed domain-specific language (e.g, SQL) into another programming language (GPL or DSL)
  13. 13. SLATE 2015, Madrid, Spain June 18-19, 2015 13/67 Some applications of GI to SLE • Initial grammar (ANSI C): 1. translation unit ::= external decl 2. translation unit ::= translation unit external decl 3. external decl ::= function denition 4. external decl ::= decl 6. function denition ::= declarator decl list compound stat 9. decl ::= decl specs init declarator list ; 10. decl ::= decl specs ; 11. decl list ::= decl 12. decl list ::= decl list decl 15. decl specs ::= type spec decl specs 27. type spec ::= int | long | ... 45. init declarator list ::= init declarator 46. init declarator list ::= init declarator list , init declarator 47. init declarator ::= declarator 64. enumerator ::= id 65. enumerator ::= id = const exp 67. declarator ::= direct declarator 68. direct declarator ::= id 69. direct declarator ::= ( declarator ) 70. direct declarator ::= direct declarator [ const exp ] 71. direct declarator ::= direct declarator [ ] 72. direct declarator ::= direct declarator ( param type list ) 73. direct declarator ::= direct declarator ( id list ) 74. direct declarator ::= direct declarator ( ) 88. id list ::= id 89. id list ::= id list , id 90. initializer ::= assignment exp 91. initializer ::= initializer list 93. initializer list ::= initializer 94. initializer list ::= initializer list , initializer 110. stat ::= labeled stat | exp stat | compound stat | selection stat 114. stat ::= iteration stat | jump stat 116. labeled stat ::= id : stat 117. labeled stat ::= case const exp : stat 118. labeled stat ::= default : stat 119. exp stat ::= exp ; 120. exp stat ::= ; 121. compound stat ::= decl list stat list 125. stat list ::= stat 126. stat list ::= stat list stat 127. selection stat ::= if ( exp ) stat 129. selection stat ::= switch ( exp ) stat 130. iteration stat ::= while ( exp ) stat 131. iteration stat ::= do stat while ( exp ) ; 132. iteration stat ::= for ( exp ; exp ; exp ) stat 140. jump stat ::= goto id ; | continue ; | break ; | return exp ; 145. exp ::= assignment exp 146. exp ::= exp , assignment exp 147. assignment exp ::= conditional exp 148. assignment exp ::= conditional exp assignment operator assignment exp 205. const ::= int const | char const | oat const
  14. 14. SLATE 2015, Madrid, Spain June 18-19, 2015 14/67 Some applications of GI to SLE • Initial grammar (ANSI C): int main() { char str[][]; int i; printf("Students:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; } int main() { char str[][] = { SELECT Name FROM Students }; int i; printf("Students:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; } true positive sample false negative samples: int main() { char str[][] = { SELECT Name, Surname FROM Students, Professors }; int i; printf("Students and Professors:"); for(i = 0; i < str.length; i++) { printf(str[i]); } return 0; }
  15. 15. SLATE 2015, Madrid, Spain June 18-19, 2015 15/67 Some applications of GI to SLE • Inferred Grammar: 1. translation unit ::= external decl 2. translation unit ::= translation unit external decl 3. external decl ::= function denition 4. external decl ::= decl 6. function denition ::= declarator decl list compound stat 9. decl ::= decl specs init declarator list ; 10. decl ::= decl specs ; 11. decl list ::= decl 12. decl list ::= decl list decl 15. decl specs ::= type spec decl specs 27. type spec ::= int | long | ... 45. init declarator list ::= init declarator 46. init declarator list ::= init declarator list , init declarator 47. init declarator ::= declarator 64. enumerator ::= id 65. enumerator ::= id = const exp 67. declarator ::= direct declarator NT1 68. direct declarator ::= id 69. direct declarator ::= ( declarator ) 70. direct declarator ::= direct declarator [ const exp ] 71. direct declarator ::= direct declarator [ ] 72. direct declarator ::= direct declarator ( param type list ) 73. direct declarator ::= direct declarator ( id list ) 74. direct declarator ::= direct declarator ( ) 88. id list ::= id 89. id list ::= id list , id 90. initializer ::= assignment exp 91. initializer ::= initializer list 93. initializer list ::= initializer 94. initializer list ::= initializer list , initializer 110. stat ::= labeled stat | exp stat | compound stat | selection stat 114. stat ::= iteration stat | jump stat 116. labeled stat ::= id : stat 117. labeled stat ::= case const exp : stat 118. labeled stat ::= default : stat 119. exp stat ::= exp ; 120. exp stat ::= ; 121. compound stat ::= decl list stat list 125. stat list ::= stat 126. stat list ::= stat list stat 127. selection stat ::= if ( exp ) stat 129. selection stat ::= switch ( exp ) stat 130. iteration stat ::= while ( exp ) stat 131. iteration stat ::= do stat while ( exp ) ; 132. iteration stat ::= for ( exp ; exp ; exp ) stat 140. jump stat ::= goto id ; | continue ; | break ; | return exp ; 145. exp ::= assignment exp 146. exp ::= exp , assignment exp 147. assignment exp ::= conditional exp 148. assignment exp ::= conditional exp assignment operator assignment exp 205. const ::= int const | char const | oat const 208. NT1 ::= = SELECT id NT2 FROM id NT2 | ϵ 210. NT2 ::= , id NT2 | ϵ
  16. 16. SLATE 2015, Madrid, Spain June 18-19, 2015 16/67 Some applications of GI to SLE • Inference of DSLs – A DSL grammar is usually small enough that the inference process is feasible. – Inference of grammar-based DSLs. – Inference of metamodel-base DSLs. HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard, JAVED, Faizan. A memetic grammar inference algorithm for language learning. Applied Soft Computing, 2012, vol. 12, iss. 3, pp. 1006-1020. HRNČIČ, Dejan, MERNIK, Marjan, BRYANT, Barrett Richard. Improving grammar inference by a memetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics - Part C, 2012, vol. 42, no. 5, pp. 692-703.
  17. 17. SLATE 2015, Madrid, Spain June 18-19, 2015 17/67 Some applications of GI to SLE • 12 input samples of DESK language on which the algorithm was tested: 1. print a 2. print 3 3. print b + 14 4. print a + b + c 5. print a where b = 14 6. print 10 where d = 15 7. print 9 + b where b = 16 8. print 1 + 2 where id = 1 9. print a where b = 5, c = 4 10. print 21 where a = 6, b = 5 11. print 5 + 6 where a = 3, c = 14 12. print a + b + c where a = 4, b = 3, c = 2
  18. 18. SLATE 2015, Madrid, Spain June 18-19, 2015 18/67 Some applications of GI to SLE Original grammar: 1. DESK ::= print E C 2. E ::= E + F 3. E ::= F 4. F ::= id 5. F ::= num 6. C ::= where Ds 7. C ::= ε 8. Ds ::= D 9. Ds ::= Ds , D 10. D ::= id = num Inferred grammar: 1: NT1 -> print NT3 NT5 2: NT2 -> + NT3 3: NT2 -> ε 4: NT3 -> num NT2 5: NT3 -> id NT2 6: NT4 -> , id = num NT4 7: NT4 -> ε 8: NT5 -> where id = num NT4 9: NT5 -> ε
  19. 19. SLATE 2015, Madrid, Spain June 18-19, 2015 19/67 Some applications of GI to SLE DSL for hypertree description
  20. 20. SLATE 2015, Madrid, Spain June 18-19, 2015 20/67 Some applications of GI to SLE Inferred grammar for hypertree description DSL
  21. 21. SLATE 2015, Madrid, Spain June 18-19, 2015 21/67 Some applications of GI to SLE • As a model conforms to a metamodel in a similar manner to how a program conforms to a grammar, the metamodel inference can be defined as follows. • The set of all models that conform to a given metamodel MM will be called the language of the metamodel and denoted L(MM). Given a model instance m and a metamodel MM we can tell whether m conforms to MM (m ∈ L(MM)).
  22. 22. SLATE 2015, Madrid, Spain June 18-19, 2015 22/67 Some applications of GI to SLE • A set of positive samples is denoted with S+. Conversely, a negative sample belongs to L(MM), which denotes a set of all models that do not conform to metamodel MM. A set of negative samples is denoted with S-. • A set of positive samples S+ of a metamodel MM is structurally complete if each metamodel element appears in at least one model in S+.
  23. 23. SLATE 2015, Madrid, Spain June 18-19, 2015 23/67 Some applications of GI to SLE • Given a set of positive samples S+ and set of negative samples S-, which might be also empty, the task of metamodel inference is to find at least one metamodel MM such that S+⊆L(MM) and S-⊆L(MM).
  24. 24. SLATE 2015, Madrid, Spain June 18-19, 2015 24/67 Some applications of GI to SLE • JAVED, Faizan, MERNIK, Marjan, GRAY, Jeffrey G., BRYANT, Barrett Richard. MARS: A Metamodel Recovery System Using Grammar Inference. Information and Software Technology, 2008, vol. 50, iss. 9-10, pp. 948-968. • Qichao Liu, Jeff Gray, Marjan Mernik, Barrett R. Bryant. Application of Metamodel Inference with Large-Scale Metamodels. Int. J. Software and Informatics 6(2): 201-231 (2012)
  25. 25. SLATE 2015, Madrid, Spain June 18-19, 2015 25/67 Some applications of GI to SLE • ESML (Embedded System Modeling Language)
  26. 26. SLATE 2015, Madrid, Spain June 18-19, 2015 26/67 Some applications of GI to SLE • Original ESML metamodel - Configuration viewpoint
  27. 27. SLATE 2015, Madrid, Spain June 18-19, 2015 27/67 Some applications of GI to SLE • Inferred ESML metamodel - Configuration viewpoint
  28. 28. SLATE 2015, Madrid, Spain June 18-19, 2015 28/67 Some applications of GI to SLE • Inference of Visual Languages (based on graph grammars) FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Graph grammar induction as a parser-controlled heuristic search process. AGTIVE’12, pp. 121-136. FÜRST, Luka, MERNIK, Marjan, MAHNIČ, Viljan. Converting metamodels to graph grammars: doing without advanced graph grammar features. Software and System Modeling (SoSym), 2013, Article in Press.
  29. 29. SLATE 2015, Madrid, Spain June 18-19, 2015 29/67 Graph grammar inference Positive and negative samples for hydrocarbons with single and double bonds
  30. 30. SLATE 2015, Madrid, Spain June 18-19, 2015 30/67 Graph grammar inference Inferred graph grammar
  31. 31. SLATE 2015, Madrid, Spain June 18-19, 2015 31/67 Graph grammar inference Positive samples for flowcharts
  32. 32. SLATE 2015, Madrid, Spain June 18-19, 2015 32/67 Graph grammar inference Inferred graph grammar
  33. 33. SLATE 2015, Madrid, Spain June 18-19, 2015 33/67 Some applications of GI to SLE • Inference of execution traces – GI is well suited to analyses involving program execution because the events in execution traces are temporally ordered in the same way that tokens are spatially ordered. – GI can be used in the dynamic analysis of software. N. Walkinshaw, K. Bogdanov, M. Holcombe, S. Salahuddin. Improving dynamic software analysis by applying grammar inference principles. J. Softw. Maint. Evol. 20 (2008) 269- 290.
  34. 34. SLATE 2015, Madrid, Spain June 18-19, 2015 34/67 Some applications of GI to SLE • Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging End-Users in the Collaborative Development of Domain- Specific Modelling Languages, CDVE 2013: 101-110.
  35. 35. SLATE 2015, Madrid, Spain June 18-19, 2015 35/67 Some applications of GI to SLE • Javier Canovas, Jordi Cabot, Jesus Lopez-Fernandez, Jesus Sanchez, Cuadrado, Esther Guerra, Juan De Lara. Engaging End-Users in the Collaborative Development of Domain- Specific Modelling Languages. CDVE 2013: 101-110.
  36. 36. SLATE 2015, Madrid, Spain June 18-19, 2015 36/67 Some applications of GI to SLE • Model evolution using metamodel inference
  37. 37. SLATE 2015, Madrid, Spain June 18-19, 2015 37/67 Some applications of GI to SE • Applications to Software Engineering (SE) • Grammar-based systems • Marjan Mernik, Matej Crepinsek, Tomaz Kosar, Damijan Rebernak, Viljem Zumer. Grammar-Based Systems: Definition and Examples. Informatica 28(3): 245-255 (2004)
  38. 38. SLATE 2015, Madrid, Spain June 18-19, 2015 38/67 Some applications of GI to SE • A grammar-based system is any system that uses a grammar and/or sentences produced by this grammar to solve various problems outside the domain of programming language definition and its implementation. The vital component of such a system is well structured and expressed with a grammar or with sentences produced by this grammar in an explicit or implicit manners.
  39. 39. SLATE 2015, Madrid, Spain June 18-19, 2015 39/67 Some applications of GI to SE • Applications to Software Engineering (SE) • Wil van der Aalst. Process Mining: Making Sense of Processes Hidden in Big Event Data. Invited talk at FedCSIS 2013 • Sentence = Trace in event log • Grammar = Process model
  40. 40. SLATE 2015, Madrid, Spain June 18-19, 2015 40/67 Inside GI • Several theoretical learning models. – Identification in the limit (allows the inference algorithm to converge on the target grammar given a sufficiently large quantity of sentences). – Teacher and queries (introduces a teacher who knows the target language and can answer particular types of queries from the learner) – Probably Approximately Correct (PAC) learning (doesn’t guarantee exact identification).
  41. 41. SLATE 2015, Madrid, Spain June 18-19, 2015 41/67 Inside GI • Gold Theorem (1967) - it is impossible to identify any of the four classes of languages in the Chomsky hierarchy in the limit using only positive samples. • Using both negative and positive samples, the Chomsky hierarchy languages can be identified in the limit.
  42. 42. SLATE 2015, Madrid, Spain June 18-19, 2015 42/67 Inside GI • Intuitively, Gold's theorem can be explained by recognizing the fact that the final generalization of positive samples would be an automation that accept all strings. • Singular use of positive samples results in an uncertainty as to when the generalization steps should be stopped. This implies the need for some restrictions or background knowledge on the generalization process.
  43. 43. SLATE 2015, Madrid, Spain June 18-19, 2015 43/67 Inside GI • A lot of research has been done on extraction of context-free grammars, but the problem is still not solved sufficiently mainly due to immense search space.
  44. 44. SLATE 2015, Madrid, Spain June 18-19, 2015 44/67 Inside GI
  45. 45. SLATE 2015, Madrid, Spain June 18-19, 2015 45/67 Inside GI
  46. 46. SLATE 2015, Madrid, Spain June 18-19, 2015 46/67 Inside GI
  47. 47. SLATE 2015, Madrid, Spain June 18-19, 2015 47/67 Inside GI
  48. 48. SLATE 2015, Madrid, Spain June 18-19, 2015 48/67 Inside GI
  49. 49. SLATE 2015, Madrid, Spain June 18-19, 2015 49/67 Inside GI • Memetic algorithms are evolutionary algorithms with local search operator – use of evolutionary concepts (population, evolutionary operators) – improves the search for solutions with local search.
  50. 50. SLATE 2015, Madrid, Spain June 18-19, 2015 50/67 Inside GI example n example 1 regular definitions ... initiali- zation local search mutation generali- zation selection evolutionary cycle found grammars parse positive examples MAGIc evaluate (LISA parser) - simple - Sequitur diff • Memetic Algorithm for Grammatical Inference
  51. 51. SLATE 2015, Madrid, Spain June 18-19, 2015 51/67 Inside GI • Sequitur: http://sequitur.info/ • abcabdabcabd 0 → 1 1 1 → 2 c 2 d 2 → a b • p i w i=n, i=n // print id where id=n, id=n 0 → p 1 w 2, 2 1 → i 2 → 1 = n
  52. 52. SLATE 2015, Madrid, Spain June 18-19, 2015 52/67 Inside GI print a where c=2 print 5+b where b = 10print id where id=num print num+id where id=num
  53. 53. SLATE 2015, Madrid, Spain June 18-19, 2015 53/67 Inside GI Apply diff command! 1a2,3 > num > + print id where id=num print num+id where id=num What is the difference among two samples? print id where id=num print num+id where id=num But where to change the grammar?
  54. 54. SLATE 2015, Madrid, Spain June 18-19, 2015 54/67 Inside GI Start with the grammar that parses first sample: print a where c=2 N1 ::= print N2 where id = num N2 ::= id Use information from LR(1) parsing on 2nd sample. Configurations returned from the LR(1) parser: Nx → α1 • α2 Ny → β • Nz → • γ
  55. 55. SLATE 2015, Madrid, Spain June 18-19, 2015 55/67 Inside GI • Input samples: s1,s2,...,sn (true positive) s1,s2,...,sk,a1,...,am,sk+1,...sn (false negative) – difference: a1,...,am
  56. 56. SLATE 2015, Madrid, Spain June 18-19, 2015 56/67 Inside GI • Nx → α1 • α2 – if Nx ::= α1 N1 α2 N1 ::= ai+1 ... am N1 ::= ε – if Nx ::= α1 N1 N1 ::= α2 N1 ::= ai+1 ... am – if change in this configuration can’t be made )FIRST(αs 21k ∈+ FOLLOW(Nx)s)FIRST(αs 1k21k ∈∧∉ ++ FOLLOW(Nx)s)FIRST(αs 1k21k ∉∧∉ ++
  57. 57. SLATE 2015, Madrid, Spain June 18-19, 2015 57/67 Inside GI print a where c=2 print 5+b where b = 10 N1 → print • N2 where id = num N1 ::= print N2 where id = num N2 ::= id N1 ::= print N3 N2 where id = num N2 ::= id N3 ::= num + N3 ::= ε
  58. 58. SLATE 2015, Madrid, Spain June 18-19, 2015 58/67 Inside GI Production: Nx ::= α1 Ny α2 Option Nx ::= α1 Nz α2 Nz ::= Ny Nz ::= ε But, how mutation is done?
  59. 59. SLATE 2015, Madrid, Spain June 18-19, 2015 59/67 Inside GI Nx ::= α Ny Nx ::= Ny Ny Ny ::= α Ny ::= α Ny ::= β Ny ::= βWhat about generalization step? Nx ::= Ny Ny Nx ::= Ny Ny ::= α Ny ::= α Ny Ny ::= β Ny ::= β Ny Ny ::= ε
  60. 60. SLATE 2015, Madrid, Spain June 18-19, 2015 60/67 Semantic inference a) Grammar G: E ::= T EE EE ::= + T EE EE ::= ε T ::= int b) Attributes and their types: int synthesized vals int inherited inVal c) Set of terminals T = {‘nonterminal.attribute’, ‘lexValue’} d) Set of functions F = {sum} Set of positive programs with associated meanings: (5, 5) (2+5, 7) (10+5+8, 23)
  61. 61. SLATE 2015, Madrid, Spain June 18-19, 2015 61/67 Semantic inference // infered attribute grammars E ::= T EE { E[0].vals = EE[0].vals; EE[0].inVal = T[0].vals;} EE ::= + T EE { EE[0].vals = EE[1].vals; EE[1].inVal = sum(T[0].vals, EE[0].inVal);} EE ::= ε {EE[0].vals = EE[0].inVal;} T ::= int {T[0].vals = Integer.parseInt(#int.value());}
  62. 62. SLATE 2015, Madrid, Spain June 18-19, 2015 62/67 Semantic inference
  63. 63. SLATE 2015, Madrid, Spain June 18-19, 2015 63/67 Semantic inference Actually, the prototypical implementation found two additional attribute grammars, which differ in the semantics for the 2nd production EE ::= + T EE, but both are valid. The following two semantic rules for the 2nd production were found: a) EE ::= + T EE { EE[0].vals = sum(EE[1].vals, T[0].vals); EE[1].inVal = EE[0].inVal;} b) EE ::= + T EE { EE[0].vals = sum(EE[1].vals, EE[0].inVal); EE[1].inVal = T[0].vals;}
  64. 64. SLATE 2015, Madrid, Spain June 18-19, 2015 64/67 Semantic inference
  65. 65. SLATE 2015, Madrid, Spain June 18-19, 2015 65/67 Semantic inference
  66. 66. SLATE 2015, Madrid, Spain June 18-19, 2015 66/67 Conclusion Hope that I convinced you that grammatical inference is interesting and useful. Yes, I will used in my current project on business process mining.
  67. 67. SLATE 2015, Madrid, Spain June 18-19, 2015 67/67 Conclusion Collaborators: M. Črepinšek, D. Hrnčič1, B. Bryant2, A. Sprague2, J. Gray2, J. Faizan2, Q. Liu2 L. Fürst3, V. Mahnič3 1University of Maribor, Slovenia 2The University of Alabama at Birmingham, USA 3University of Ljubljana, Slovenia Sent comments/questions to: marjan.mernik@um.si

×