L0080 - 2010-11-02
Redistribution and other use of this material requires written permission from The RCP Company.
ITU - MDD - Textural Languages and Grammars


This presentation describes the use and design of textual domain-specific languages (DSLs). It has two basic purposes:
Introduce you to some of the more important design criteria in language design
Introduce you to BNF

This presentation was developed for the MDD 2010 course at ITU, Denmark.


  1. L0080 - 2010-11-02
     Redistribution and other use of this material requires written permission from The RCP Company.
     ITU - MDD – Textural Languages and Grammars
     This presentation describes the use and design of textual domain-specific languages (DSLs). It has two basic purposes:
     - Introduce you to some of the more important design criteria in language design
     - Introduce you to BNF
     This presentation was developed for the MDD 2010 course at ITU, Denmark. Some materials are taken from Artur Boronat, University of Leicester, with permission.
  2. Textual DSLs versus Graphical DSLs
     - With both textual and graphical syntax you can
       - model any meta model
       - verify constraints in real time
       - (Eclipse) write ordinary EMF models
     - Graphical editors are good at showing structural relationships
     - Graphical editors are sexy and often appeal to management
     - Textual editors are better for "algorithmic" aspects or hierarchical models
     - Textual editors integrate better with CVS etc. (diff, merge)
     - Textual editors often provide better support for quick assist and quick fixes
     - If you must work with a DSL many hours every day, consider textual DSLs over graphical DSLs!
  3. What is needed for a successful DSL
     - You must know
       - The intended audience
       - The expressiveness of the language
       - The extendibility of the language
  4. The intended audience
     - Who will be using the language
       - Temporary help in a call center
       - Full-time clerks
       - Accountants
       - Engineers, like yourselves
     - Any mismatch between the language and the audience is likely to alienate your audience
     - Keywords:
       - concise vs. redundant
       - intuitive
       - simple to write and read

         (host "heimdal" (ipif 192.167.54.55) (disk:type SSD 500GB))

         Define host name = "heimdal"
           with IP interface
             IPv4 address = 192.167.54.55
           end IP address
           with disk
             type = SSD
             size = 500GB
           end disk
         End host
  5. The expressiveness of the language
     - What type of information must be expressed in the language:
       - Simple static information
       - Complex network of objects
       - Algorithms
     - Must the DSL be Turing complete?
     - What type of constructs must be supported
       - Lists
       - Tables
       - Recursive data structures
       - Variables
       - Macros
       - Conditions and loops
       - Functions
       - Comments

         E = mc²
  6. The extendibility of the language
     - Can you be absolutely certain that the language will never be extended?
       - "The last language…"
     - Backward compatibility
       - If your grammar is successful, then it will be used, so what should you do with all the existing "stuff" when you need to extend your grammar with new functionality?
     - Open versus closed grammars
       - An open grammar can be extended in the future with new constructs or statements in a natural manner
       - A closed grammar can be very difficult to extend while retaining backward compatibility

         heimdal 192.167.54.55 SSD 500

         (host "heimdal" (ipif 192.167.54.55) (disk type SSD 500GB))
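
The two host descriptions on this slide differ in how easily they can grow. As a rough sketch (the function names `read_sexpr` and `host_properties` are made up for illustration and are not from the slides), the keyword-tagged form can be read by a generic reader that never needs to know the set of clause keywords in advance, so a new clause does not break existing input:

```python
import re


def read_sexpr(text):
    """Read one parenthesised form into nested Python lists of strings."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1  # consume the closing parenthesis
            return items
        if token == ")":
            raise SyntaxError("unexpected ')'")
        return token.strip('"')

    result = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input after the first form")
    return result


def host_properties(text):
    """Map each clause keyword to its arguments; unknown keywords are kept."""
    head, name, *clauses = read_sexpr(text)
    if head != "host":
        raise ValueError("expected a (host ...) form")
    properties = {"name": name}
    for clause in clauses:
        if not isinstance(clause, list) or not clause:
            raise ValueError("each clause must look like (keyword args...)")
        keyword, *args = clause
        properties[keyword] = args
    return properties


old = host_properties('(host "heimdal" (ipif 192.167.54.55) (disk type SSD 500GB))')
new = host_properties('(host "heimdal" (ipif 192.167.54.55) (location "rack 4"))')
print(old)
print(new)
```

The positional form `heimdal 192.167.54.55 SSD 500` has no such keywords, so adding a field means changing the meaning of positions in every existing file.
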
  7. Internal versus External DSLs
     - Textual DSLs are normally divided into two distinct groups:
       - Internal DSL (or extension language)
         - Uses the syntax and semantics of a host language or system, e.g. Java or Excel
         - You can argue that any API is in itself a DSL
       - External DSL
         - A completely new language
  8. Some Examples: Internal DSL
     - 4GL: attempt in the 1990s to raise the productivity of programmers by adding more high-level constructs to 3GL languages (such as C, Pascal, etc.)
     - JSP: current attempt to ease the creation of web pages by adding Java as an embedded scripting language
     - Some languages that have been designed to be a good host language for DSLs:
       - Tcl
       - Python
       - Groovy
       - Ruby

     Java:

         final IFormCreator detailsSection = myForm.addSection("Details",
             table.getBinding().getSingleSelection());
         detailsSection.addObjectMessages();
         detailsSection.addField("contact.name(w=200)");
         detailsSection.addField("logoFileName(w=200)");
         detailsSection.addField("loyalty").arg(Constants.ARG_PREFERRED_CONTROL,
             RadioGroup.class.getName());

     Tcl (Expect):

         # Assume $remote_server, $my_user_id, $my_password, and $my_command
         # were read in earlier in the script.
         spawn telnet $remote_server
         expect "username:"
         send "$my_user_id\r"
         expect "password:"
         send "$my_password\r"
         expect "%"
         send "$my_command\r"
         expect "%"
         set results $expect_out(buffer)
         send "exit\r"
         expect eof
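
To make the idea of an internal DSL concrete, here is a minimal sketch in Python, one of the host languages listed above. It re-expresses the host description from the earlier slides as chained method calls. The `Host` class and the `host`, `ipif` and `disk` names are hypothetical and chosen only to mirror that example; they are not part of any existing library.

```python
class Host:
    """Hypothetical builder: each method records a fact and returns self."""

    def __init__(self, name):
        self.name = name
        self.interfaces = []
        self.disks = []

    def ipif(self, address):
        self.interfaces.append(address)
        return self  # returning self is what allows the calls to be chained

    def disk(self, kind, size_gb):
        self.disks.append((kind, size_gb))
        return self


def host(name):
    """Entry point of the mini-language; reads like a keyword."""
    return Host(name)


# The "program" is ordinary Python, parsed and checked by the Python compiler.
heimdal = host("heimdal").ipif("192.167.54.55").disk("SSD", 500)
print(heimdal.name, heimdal.interfaces, heimdal.disks)
```

The trade-off is the one the slides describe: no parser has to be written, but the syntax is limited to what the host language allows.
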
  9. Formal Syntax of a DSL
     - Consists of two layers
       - a lexical layer
       - a grammar layer
     - Lexical syntax: the spelling of words and punctuation
     - Grammars: the key formalism for describing syntax, used universally for programming languages
     - Example
       - if (<expr>) <statement>
       - if <expr> then <statement>
       - <statement> if <expr>
       - (if <expr> <statement>)

         if (a != null) b = a.getArg();

         "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"

     [Figure: parse tree of the statement above. The root "if" has an expression child ("!=" over the variable "a" and the constant "null") and a statement child (an assignment to "b" from an expression).]
  10. Lexical Syntax
     - Tokens/terminals: units in a programming language
     - Lexical syntax: correspondence between the written representation (spelling) and the tokens or terminals in a grammar
     - Keywords:
       - alphabetic character sequences
       - unit in a language, e.g. if, while
     - Punctuation
       - like '(', ')', ':='
     - Whitespace

         if (a != null) b = a.getArg();

         "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
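
The split of the sample statement into tokens can be reproduced with a small regular-expression lexer. This is only a sketch under stated assumptions: the token class names (`KEYWORD`, `ID`, `NUMBER`, `PUNCT`) and the exact keyword and punctuation sets are chosen for this example, and `null` is simply treated as an identifier.

```python
import re

# Token classes, tried in order; keywords must come before identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|while)\b"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("NUMBER", r"\d+"),
    ("PUNCT", r"!=|:=|[().;=]"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not a token itself
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token class, spelling) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("if (a != null) b = a.getArg();")
print([spelling for _, spelling in tokens])
```

The grammar layer then works on the token classes only and never sees whitespace or individual characters.
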
  11. Context-Free Grammars
     - Concrete syntax: describes the written representation of a language, including lexical details such as the placement of keywords and punctuation marks
     - Context-free grammar or grammars: notation for specifying concrete syntax
     - Context-free means that the understanding of the next token may not depend on the current context
     - Example from PL/1 of a context-dependent grammar:
       - IF IF THEN THEN ELSE ELSE;
       - Legal if "IF" is a Boolean variable and "THEN" and "ELSE" are valid procedures
     - Most modern languages have a context-free grammar
       - Notable exceptions are C and C++!
  12. Describing Context-Free Grammars
     - A grammar has four parts
       - A set of tokens or terminals
         - atomic symbols that cannot be subdivided
       - A set of nonterminals
         - identifiers or variables representing constructs
       - A set of rules (called productions)
         - identify the components of a construct
         - <nonterminal> ::= terminal | <nonterminal>
       - A starting nonterminal
         - represents a main construct

         assignment => ID = expression;
         expression => expression + term
                     | expression - term
                     | term
         term       => term * factor
                     | term / factor
                     | factor
         factor     => ( expression )
                     | ID
                     | NUMBER
  13. Writing Grammars
     - The productions are rules for building strings
     - Parse trees: show how a string can be built
     - Notations for writing grammars
       - Backus-Naur Form (BNF)
       - Extended BNF (EBNF)
  14. Backus-Naur Form (Backus Normal Form)
     - Grammar rules or productions: define symbols.

         assignment_stmt ::= id = expression;

       - Left of "::=": the non-terminal symbol being defined
       - Right of "::=": the definition (production)
     - Non-terminal symbols: anything that is defined on the left side of some production.
     - Terminal symbols: things that are not defined by productions. They can be literals, symbols, and other lexemes of the language defined by lexical rules.
       - Identifiers: id ::= [A-Za-z_]+
       - Delimiters: ;
       - Operators: = + - * / %
  15. Backus-Naur Form (Backus Normal Form)
     - Different notations (same meaning):
       - assignment_stmt ::= id = expression + term
       - <assignment-stmt> => <id> = <expr> + <term>
       - AssignmentStmt → id = expression + term
     - ::=, =>, → mean "consists of" or "defined as"
     - Alternatives ( "|" ):

         expression => expression + term
                     | expression - term
                     | term

     - Concatenation:

         number => DIGIT number | DIGIT
  16. Backus-Naur Form (Backus Normal Form)
     - Another way to write alternatives:

         expression => expression + term
                    => expression - term
                    => term

     - Null symbol: ε or @, used to allow a production to match nothing.
     - Example: a variable is an identifier followed by an optional subscript

         variable  => identifier subscript
         subscript => [ expression ] | ε
  17. Arithmetic Grammar
     - Here is a grammar for assignment with arithmetic operations, e.g. y = (2*x + 5)*x - 7;

         assignment => ID = expression;
         expression => expression + term
                     | expression - term
                     | term
         term       => term * factor
                     | term / factor
                     | factor
         factor     => ( expression )
                     | ID
                     | NUMBER
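  18. Example of a Parse Tree

         "if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"

     [Figure: parse tree of the token sequence above. The root "if" has an expression child ("!=" over the variable "a" and the constant "null") and a statement child (an assignment to "b" from an expression).]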
  19. Ambiguity
     - Grammar rules:

         expr => expr + expr
               | expr * expr
               | ( expr )
               | NUMBER

     - Expression: 2 + 3 * 4
     - Two possible parse trees: one with "+" at the root, grouping the input as 2 + (3 * 4), and one with "*" at the root, grouping it as (2 + 3) * 4
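
The ambiguity can be checked mechanically. The sketch below enumerates every parse tree the ambiguous grammar allows for a flat token list and evaluates each one. It is an illustration only: it leaves out the parenthesised alternative, represents NUMBER tokens as Python integers, and the function names are invented for this example.

```python
def parse_trees(tokens):
    """Return every parse tree for expr => expr + expr | expr * expr | NUMBER.

    A tree is either an int (NUMBER) or a tuple (operator, left, right).
    """
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Operators sit at the odd positions; any of them may become the root.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((tokens[i], left, right))
    return trees


def evaluate(tree):
    """Evaluate a tree built by parse_trees."""
    if isinstance(tree, int):
        return tree
    operator, left, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if operator == "+" else a * b


trees = parse_trees([2, "+", 3, "*", 4])
for tree in trees:
    print(tree, "=", evaluate(tree))
```

The two trees give different results, which is why a grammar meant for real use encodes precedence, as the expression/term/factor grammar on the earlier slides does.
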
  20. EBNF: Extended BNF
     - The standard metalanguage: Extended BNF
     - Terminal symbols of the language are quoted
     - [ and ] enclose optional symbols
     - { and } indicate repetition
     - ( and ) are used to group items
     - | is the definition-separator symbol and stands for "or"
     - In EBNF, literal parentheses must be quoted: '(' ... ')'
     - * repetition symbol
     - = defining symbol
     - ; terminator symbol
  21. BNF versus EBNF

     BNF:

         expression ::= expression + term
                      | expression - term
                      | term
         term       ::= term * factor
                      | term / factor
                      | factor
         factor     ::= ( expression )
                      | id
                      | number

     EBNF:

         expression ::= term { ('+'|'-') term }
         term       ::= factor { ('*'|'/') factor }
         factor     ::= '(' expression ')' | id | number
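
One practical reason to prefer the EBNF form is that it maps almost line by line onto a hand-written recursive-descent parser: each rule becomes a method and each `{ ... }` repetition becomes a loop. The sketch below follows the three EBNF rules above and evaluates the expression while parsing. The `Parser` class, the `evaluate` helper and the `variables` dictionary used to look up `id` values are assumptions made for this example.

```python
import re


def tokenize(text):
    # Numbers, identifiers, operators and parentheses; anything else is skipped.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


class Parser:
    def __init__(self, tokens, variables=None):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables or {}

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expression(self):
        # expression ::= term { ('+'|'-') term }
        value = self.term()
        while self.peek() in ("+", "-"):
            operator = self.take()
            right = self.term()
            value = value + right if operator == "+" else value - right
        return value

    def term(self):
        # term ::= factor { ('*'|'/') factor }
        value = self.factor()
        while self.peek() in ("*", "/"):
            operator = self.take()
            right = self.factor()
            value = value * right if operator == "*" else value / right
        return value

    def factor(self):
        # factor ::= '(' expression ')' | id | number
        token = self.take()
        if token == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        if token in self.variables:
            return self.variables[token]
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text, variables=None):
    parser = Parser(tokenize(text), variables)
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("(2*x + 5)*x - 7", {"x": 3}))
print(evaluate("2 + 3 * 4"))
```

The left-recursive BNF rules cannot be turned into methods this directly, because `expression` would call itself before consuming any input.
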
  22. More Information
     - "Model-Driven Software Development" by Thomas Stahl and Markus Völter
       - Chapter 8.1 "DSL Construction"
       - Good high-level introduction to DSLs
     - "Domain-Specific Languages: An Annotated Bibliography"
       - http://homepages.cwi.nl/~arie/papers/dslbib/
       - Not very interesting except for the very long annotated bibliography!
     - "BNF and EBNF: What are they and how do they work?" by Lars Marius Garshol
       - http://www.garshol.priv.no/download/text/bnf.html
       - About Backus-Naur Form and parsing
     - "Internal Domain-Specific Languages"
       - http://fragmental.tw/research-on-dsls/domain-specific-languages-dsls/internal-dsls/
       - Outlines use of DSLs in Java and Ruby
  23. Exercise 1
     - Design a textual language (grammar) for a travel package.
       - How would you like to specify a travel package in text?
       - Do you think you can teach a travel administrator to use the language?
       - Is the grammar open, i.e. can it be extended in the future?
     - For the grammar you can use ID, STRING and NUMBER as terminals where relevant
