This presentation describes the use and design of textual domain-specific languages (DSLs). It has two basic purposes:
Introduce you to some of the more important design criteria in language design
Introduce you to BNF
This presentation was developed for the MDD 2010 course at ITU, Denmark.
ITU - MDD - Textual Languages and Grammars
1. L0080 - 2010-11-02
Redistribution and other use of this material requires written permission from The RCP Company.
Some material is taken from Artur Boronat, University of Leicester, with permission.
Textual DSLs versus Graphical DSLs
With both textual and graphical syntax you can
model any metamodel
verify constraints in real time
(Eclipse) write ordinary EMF models
Graphical editors are good at showing structural relationships
Graphical editors are sexy and often appeal to management
Textual editors are better for "algorithmic" aspects or hierarchical models
Textual editors integrate better with version control systems such as CVS (diff, merge)
Textual editors often provide better support for quick assist and quick fixes
If you must work with a DSL many hours every day, consider textual DSLs over graphical DSLs!
What is needed for a successful DSL
You must know
The intended audience
The expressiveness of the language
The extendibility of the language
The intended audience
Who will be using the language?
Temporary help in a call center
Full-time clerks
Accountants
Engineers, like yourselves
Any mismatch between the language and the audience is likely to alienate your audience
Keywords:
concise vs. redundant
intuitive
simple to write and read
Concise:
(host "heimdal"
  (ipif 192.167.54.55)
  (disk:type SSD 500GB))
Redundant:
Define host name = "heimdal"
  with IP interface
    IPv4 address = 192.167.54.55
  end IP address
  with disk
    type = SSD
    size = 500GB
  end disk
End host
The expressiveness of the language
What type of information must be expressed in the language:
Simple static information
Complex network of objects
Algorithms
Must the DSL be Turing complete?
What type of constructs must be supported?
Lists
Tables
Recursive data structures
Variables
Macros
Conditions and loops
Functions
Comments
E = mc²
The extendibility of the language
Can you be absolutely certain that the language will never be extended?
"The last language…"
Backward compatibility
If your grammar is successful, then it will be used. So what should you do with all the existing "stuff" when you need to extend your grammar with new functionality?
Open versus closed grammars
An open grammar can be extended in the future with new constructs or statements in a natural manner
A closed grammar can be very difficult to extend while retaining backward compatibility
Closed:
heimdal 192.167.54.55 SSD 500
Open:
(host "heimdal"
  (ipif 192.167.54.55)
  (disk type SSD 500GB))
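The difference between the two host descriptions above can be sketched in Python. This is only an illustration under stated assumptions: the function names (parse_closed, tokenize, parse_sexpr, read_host) and the extra (os "linux") clause are hypothetical and do not come from the original material.

```python
import re

def parse_closed(line):
    # Closed form: meaning depends purely on position, so the field count is fixed.
    name, ip, disk_type, disk_size = line.split()
    return {"name": name, "ip": ip, "disk": [disk_type, disk_size]}

def tokenize(text):
    # Split into parentheses, double-quoted strings and bare words.
    return re.findall(r'\(|\)|"[^"]*"|[^\s()"]+', text)

def parse_sexpr(tokens):
    # Turn the token list into nested Python lists (consumes the tokens).
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse_sexpr(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    return token.strip('"')

def read_host(tree):
    # Open form: every clause is tagged, so a reader can skip clauses it does not know.
    host = {"name": tree[1]}
    for clause in tree[2:]:
        if clause[0] == "ipif":
            host["ip"] = clause[1]
        elif clause[0] == "disk":
            host["disk"] = clause[1:]
    return host

old_text = '(host "heimdal" (ipif 192.167.54.55) (disk type SSD 500GB))'
# Hypothetical later extension of the language: an extra (os ...) clause.
new_text = '(host "heimdal" (ipif 192.167.54.55) (disk type SSD 500GB) (os "linux"))'

old_host = read_host(parse_sexpr(tokenize(old_text)))
new_host = read_host(parse_sexpr(tokenize(new_text)))
```

In this sketch the existing reader still accepts the extended open form, whereas adding a fifth word to the closed form makes parse_closed fail with a ValueError, because the positional format has no place for new information.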
Internal versus External DSLs
Textual DSLs are normally divided into two distinct groups:
Internal DSL (or extension language)
Using the syntax and semantics of a host language or system, e.g. Java or Excel
You can argue that any API is in itself a DSL
External DSL
A completely new language
Some Examples
Internal DSL
4GL: an attempt in the 1990s to raise the productivity of programmers by adding more high-level constructs to 3GL languages (such as C, Pascal, etc.)
JSP: a current attempt to ease the creation of web pages by adding Java as an embedded scripting language
Some languages that have been designed to be good host languages for DSLs:
Tcl
Python
Groovy
Ruby
final IFormCreator detailsSection = myForm.addSection("Details",
    table.getBinding().getSingleSelection());
detailsSection.addObjectMessages();
detailsSection.addField("contact.name(w=200)");
detailsSection.addField("logoFileName(w=200)");
detailsSection.addField("loyalty").arg(Constants.ARG_PREFERRED_CONTROL,
    RadioGroup.class.getName());
# Assume $remote_server, $my_user_id, $my_password, and $my_command
# were read in earlier in the script.
spawn telnet $remote_server
expect "username:"
send "$my_user_id\r"
expect "password:"
send "$my_password\r"
expect "%"
send "$my_command\r"
expect "%"
set results $expect_out(buffer)
send "exit\r"
expect eof
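Python is one of the host languages listed above. As a minimal sketch of what an internal DSL can look like there, the host description from the earlier slides could be written with chained method calls. The Host class and its ipif and disk methods are hypothetical names invented for this illustration.

```python
class Host:
    """Hypothetical builder: each method returns self so that calls can be chained."""

    def __init__(self, name):
        self.name = name
        self.interfaces = []
        self.disks = []

    def ipif(self, address):
        # Record one IP interface.
        self.interfaces.append(address)
        return self

    def disk(self, kind, size_gb):
        # Record one disk as a (type, size in GB) pair.
        self.disks.append((kind, size_gb))
        return self

# Reads almost like the external DSL, but it is ordinary Python.
heimdal = Host("heimdal").ipif("192.167.54.55").disk("SSD", 500)
```

The syntax and semantics are those of the host language, so no separate parser is needed, but the notation is limited to what Python allows.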
Formal Syntax of a DSL
Consists of two layers
a lexical layer
a grammar layer
Lexical syntax: the spelling of words and punctuation
Grammars: the key formalism for describing syntax; universal programming language
Example
if (<expr>) <statement>
if <expr> then <statement>
<statement> if <expr>
(if <expr> <statement>)
if (a != null) b = a.getArg();
"if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
[Figure: parse tree for the statement above. Node labels: if, expression, statement, !=, a, variable, constant, null, assignment, expression, …]
Lexical Syntax
Tokens/terminals: units in a programming language
Lexical syntax: correspondence between the written representation (spelling) and the tokens or terminals in a grammar
Keywords:
alphabetic character sequences
units in a language, e.g. if, while
Punctuation
like '(', ')', ':='
Whitespace
if (a != null) b = a.getArg();
"if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
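The step from spelling to tokens can be sketched with a small regular-expression tokenizer in Python. The token classes and their patterns below are assumptions made for this illustration (for instance, null is treated as a keyword here); they are not taken from a particular language definition.

```python
import re

# Hypothetical token classes; order matters because keywords also look like identifiers.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|while|null)\b"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT", r"!=|:=|[()=.;]"),
    ("WS", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token class, spelling) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("if (a != null) b = a.getArg();")
spellings = [spelling for _, spelling in tokens]
```

Whitespace separates tokens but is not itself passed on to the grammar layer, which is why the sketch drops it.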
Context-Free Grammars
Concrete syntax: describes the written representation of a language, including lexical details such as the placement of keywords and punctuation marks
Context-free grammars (or simply grammars): a notation for specifying concrete syntax
Context-free means that the interpretation of the next token may not depend on the current context
Example from PL/1 of a context-dependent grammar:
IF IF THEN THEN ELSE ELSE;
Legal if "IF" is a Boolean variable and "THEN" and "ELSE" are valid procedures
Most modern languages have a context-free grammar
Notable exceptions are C and C++! In C, for example, whether a * b; declares a pointer or multiplies two values depends on whether a names a type.
Describing Context-Free Grammars
A grammar has four parts
A set of tokens or terminals
atomic symbols that cannot be subdivided
A set of nonterminals
identifiers or variables representing constructs
A set of rules (called productions)
identify the components of a construct
<nonterminal> ::= terminal | <nonterminal>
A starting nonterminal
represents a main construct
assignment => ID = expression;
expression => expression + term
            | expression - term
            | term
term => term * factor
      | term / factor
      | factor
factor => ( expression )
        | ID
        | NUMBER
Writing Grammars
The productions are rules for building strings
Parse trees: show how a string can be built
Notations for writing grammars
Backus-Naur Form (BNF)
Extended BNF (EBNF)
Backus-Naur Form (Backus Normal Form)
Grammar rules or productions: define symbols.
Non-terminal symbols: anything that is defined on the left side of some production.
Terminal symbols: things that are not defined by productions. They can be literals, symbols, and other lexemes of the language defined by lexical rules.
Identifiers: id ::= [A-Za-z_]+
Delimiters: ;
Operators: = + - * / %
assignment_stmt ::= id = expression;
Left of ::= is the non-terminal symbol being defined.
Right of ::= is the definition (production).
Backus-Naur Form (Backus Normal Form)
Different notations (same meaning):
assignment_stmt ::= id = expression + term
<assignment-stmt> => <id> = <expr> + <term>
AssignmentStmt → id = expression + term
::=, =>, → mean "consists of" or "defined as"
Alternatives ( " | " ):
expression => expression + term
            | expression - term
            | term
Concatenation:
number => DIGIT number | DIGIT
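The recursive production for number can be read directly as a recursive function. The following is a minimal sketch; the function name is_number and the choice to accept only the ASCII digits 0 to 9 as DIGIT are assumptions made for the illustration.

```python
DIGITS = "0123456789"

def is_number(text):
    """Recognize the production: number => DIGIT number | DIGIT."""
    if text == "" or text[0] not in DIGITS:
        return False  # both alternatives start with a DIGIT
    if len(text) == 1:
        return True  # second alternative: a single DIGIT
    return is_number(text[1:])  # first alternative: DIGIT followed by a number
```

Because both alternatives require at least one DIGIT, the empty string is not a number under this production.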
Backus-Naur Form (Backus Normal Form)
Another way to write alternatives:
expression => expression + term
           => expression - term
           => term
Null symbol: ε or @
used to allow a production to match nothing.
Example: a variable is an identifier followed by an optional subscript
variable => identifier subscript
subscript => [ expression ] | ε
Arithmetic Grammar
Here is a grammar for assignment with arithmetic operations, e.g. y = (2*x + 5)*x - 7;
assignment => ID = expression;
expression => expression + term
            | expression - term
            | term
term => term * factor
      | term / factor
      | factor
factor => ( expression )
        | ID
        | NUMBER
Example of a Parse Tree
"if" "(" "a" "!=" "null" ")" "b" "=" "a" "." "getArg" "(" ")" ";"
[Figure: parse tree for the token sequence above. Node labels: if, expression, statement, !=, a, variable, constant, null, assignment, expression, …]
Ambiguity
Grammar rules:
expr => expr + expr | expr * expr
      | ( expr ) | NUMBER
Expression: 2 + 3 * 4
Two possible parse trees:
[expr [expr NUMBER(2)] + [expr [expr NUMBER(3)] * [expr NUMBER(4)]]]
[expr [expr [expr NUMBER(2)] + [expr NUMBER(3)]] * [expr NUMBER(4)]]
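Why the ambiguity matters can be shown with a small Python sketch. The two parse trees are written by hand as nested tuples (an encoding chosen only for this illustration), and the helper names leaves and evaluate are hypothetical.

```python
def leaves(tree):
    """Read the tokens of a tree from left to right."""
    if isinstance(tree, int):
        return [str(tree)]
    left, operator, right = tree
    return leaves(left) + [operator] + leaves(right)

def evaluate(tree):
    """Compute the value that the tree structure implies."""
    if isinstance(tree, int):
        return tree
    left, operator, right = tree
    if operator == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)

tree_a = (2, "+", (3, "*", 4))  # multiplication grouped first
tree_b = ((2, "+", 3), "*", 4)  # addition grouped first

same_string = leaves(tree_a) == leaves(tree_b)  # True: both spell 2 + 3 * 4
value_a = evaluate(tree_a)  # 14
value_b = evaluate(tree_b)  # 20
```

Both trees are valid under the ambiguous grammar and spell the same string, yet they give different results, so the grammar alone does not fix the meaning of the expression.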
EBNF: Extended BNF
The standard metalanguage: Extended BNF
Terminal symbols of the language are quoted
[ and ] enclose optional symbols
{ and } indicate repetition
( and ) are used to group items; literal parentheses must be quoted as '(' ... ')'
| is the definition-separator symbol and stands for "or"
* is the repetition symbol
= is the defining symbol
; is the terminator symbol
BNF versus EBNF
BNF:
expression ::= expression + term
             | expression - term
             | term
term ::= term * factor
       | term / factor
       | factor
factor ::= ( expression )
         | id
         | number
EBNF:
expression ::= term { ('+'|'-') term }
term ::= factor { ('*'|'/') factor }
factor ::= '(' expression ')'
         | id | number
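One practical consequence of the EBNF form is that it maps almost directly onto a hand-written recursive-descent parser: each { ... } repetition becomes a loop, whereas the left-recursive BNF rules would make such a parser call itself forever. The following Python sketch evaluates expressions while parsing. The Parser class, the evaluate function and the variables dictionary are assumptions made for this illustration, not part of the original material.

```python
import re

def tokenize(text):
    # Sketch only: characters that match no pattern are silently skipped.
    return re.findall(r"[0-9]+|[A-Za-z_][A-Za-z0-9_]*|[-+*/()]", text)

class Parser:
    def __init__(self, tokens, variables):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression ::= term { ('+'|'-') term }
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term ::= factor { ('*'|'/') factor }
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # factor ::= '(' expression ')' | id | number
        token = self.take()
        if token == "(":
            value = self.expression()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        if token in self.variables:
            return self.variables[token]
        raise SyntaxError(f"unexpected token {token!r}")

def evaluate(text, variables=None):
    parser = Parser(tokenize(text), variables or {})
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

result = evaluate("(2*x + 5)*x - 7", {"x": 3})  # 26
```

The loops also keep the operators left-associative, as in the BNF version: evaluate("10 - 4 - 3") gives 3 in this sketch.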
More Information
“Model-Driven Software Development” by Thomas Stahl and Markus Völter
Chapter 8.1 “DSL Construction”
Good high-level introduction to DSLs
“Domain-Specific Languages: An Annotated Bibliography”
http://homepages.cwi.nl/~arie/papers/dslbib/
Not very interesting except for the very long annotated bibliography!
"BNF and EBNF: What are they and how do they work?" by Lars Marius Garshol
http://www.garshol.priv.no/download/text/bnf.html
About Backus-Naur Form and parsing
“Internal Domain-Specific Languages”
http://fragmental.tw/research-on-dsls/domain-specific-languages-dsls/internal-dsls/
Outlines use of DSL in Java and Ruby
Exercise 1
Design a textual language (grammar) for a travel package.
How would you like to specify a travel package in text?
Do you think you can teach a travel administrator to use the language?
Is the grammar open, i.e. can it be extended in the future?
For the grammar you can use ID, STRING and NUMBER as terminals where relevant