Xtext Grammar Language
Jan Köhnlein
2014, Kiel
grammar org.xtextcon.Statemachine
	with org.eclipse.xtext.common.Terminals

generate statemachine
	"http://www.xtextcon.org/Statemachine"

Statemachine:
	name=STRING
	elements+=Element*;

Element:
	Event | State;

Event:
	name=ID description=STRING?;

State:
	'state' name=ID
	transitions+=Transition*;

Transition:
	event=[Event] '=>' state=[State];
Precedences (strongest binding first)
Action, Assignment, Keyword, RuleCall, Parenthesized
Cardinalities *, +, ?
Predicates =>, ->
Group <blank>
Unordered Group &
Alternative |
AssignmentFirst:
	name=ID?;
//	(name=ID)?;

CardinalityFirst:
	'a' 'b'?;
//	'a' ('b'?);

PredicateFirst:
	=>'a' 'b';
//	(=>'a') 'b';

GroupFirst:
	'a' | 'b' 'c';
//	'a' | ('b' 'c');
Syntactic Aspects
grammar org.xtextcon.Statemachine
	with org.eclipse.xtext.common.Terminals

generate statemachine
	"http://www.xtextcon.org/Statemachine"

Statemachine:
	name=STRING
	('events' events+=Event+)?
	states+=State*;

Event:
	name=ID description=STRING?;

State:
	'state' name=ID
	transitions+=Transition*;

Transition:
	event=[Event|ID] '=>' state=[State|ID];
Lexing
Lexer
Splits document text into tokens
Works before and independently of the parser
Token matching
First match wins
Precedence
All keywords
Local terminal rules
Terminal rules from super grammar
As a last resort, you can define a custom lexer
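The precedence list above can be made concrete with a toy lexer. This is only a sketch under stated assumptions, not the ANTLR lexer that Xtext generates: the class `ToyLexer`, its rule table and the simplified regular expressions are all made up for illustration. It assumes that the longest match is taken and that, when two rules match the same text, the rule listed first wins, which is how the keyword `state` beats `ID`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of rule precedence; hypothetical, not the generated Xtext lexer. */
final class ToyLexer {
    // Insertion order encodes precedence: keywords first, then terminal rules.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("'events'", Pattern.compile("events"));
        RULES.put("'state'", Pattern.compile("state"));
        RULES.put("'=>'", Pattern.compile("=>"));
        RULES.put("ID", Pattern.compile("\\^?[a-zA-Z_][a-zA-Z_0-9]*"));
        RULES.put("STRING", Pattern.compile("'[^']*'|\"[^\"]*\"")); // escapes omitted
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
        RULES.put("ANY_OTHER", Pattern.compile(".", Pattern.DOTALL));
    }

    /** Returns the token type names for the given input, in order. */
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the candidate,
                // so on a tie the earlier (higher-precedence) rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            tokens.add(bestRule); // ANY_OTHER guarantees at least one match
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("state on"));
        System.out.println(lex("button_pressed => off"));
    }
}
```

In this model `state` is lexed as a keyword even though `ID` matches the same characters, while a longer word such as `stateful` still becomes an `ID`.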
Lexing
Char Stream
// Example\n
'Lamp'
\n
events button_pressed
\n
state on
\n\t
button_pressed => off
\n
state off
\n\t
button_pressed => light
Token Stream
SL_COMMENT
STRING
WS
events WS ID
WS
state WS ID
WS
ID WS => WS ID
WS
state WS ID
WS
ID WS => WS ID
Tokens
events, state, =>, ID, STRING, ML_COMMENT, SL_COMMENT, WS, ANY_OTHER
Terminal Rules
grammar org.eclipse.xtext.common.Terminals hidden(WS, ML_COMMENT, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

terminal ID         : '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
terminal INT returns ecore::EInt: ('0'..'9')+;
terminal STRING     :
	'"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|'"') )* '"' |
	"'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|"'") )* "'";

terminal ML_COMMENT : '/*' -> '*/';
terminal SL_COMMENT : '//' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS         : (' '|'\t'|'\r'|'\n')+;

terminal ANY_OTHER  : .;
Terminals in the Parser
Hidden Terminals
Are ignored by the parser inside the respective scope
Can be defined on grammar or rule level (inherited)

grammar org.eclipse.xtext.common.Terminals
	hidden(WS, ML_COMMENT, SL_COMMENT)

or on parser rules

QualifiedName hidden(): // strictly no whitespace
	ID ('.' ID)*;
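The effect of a `hidden(...)` declaration can be pictured as a filter between lexer and parser. The following is a minimal sketch, assuming token types are plain strings; `HiddenTokensSketch`, `visible` and `isQualifiedName` are hypothetical names, not Xtext API. It shows why `QualifiedName hidden()` rejects whitespace around the dot while the grammar-level default would skip it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical illustration of hidden terminals; not Xtext API. */
final class HiddenTokensSketch {
    // Mirrors hidden(WS, ML_COMMENT, SL_COMMENT) on grammar level.
    static final Set<String> GRAMMAR_HIDDEN =
            new HashSet<>(Arrays.asList("WS", "ML_COMMENT", "SL_COMMENT"));

    /** Token types the parser actually sees in a scope with the given hidden set. */
    static List<String> visible(List<String> tokenTypes, Set<String> hidden) {
        return tokenTypes.stream()
                .filter(type -> !hidden.contains(type))
                .collect(Collectors.toList());
    }

    /** Checks the visible tokens against ID ('.' ID)*. */
    static boolean isQualifiedName(List<String> tokenTypes, Set<String> hidden) {
        List<String> seen = visible(tokenTypes, hidden);
        if (seen.isEmpty() || !seen.get(0).equals("ID")) {
            return false;
        }
        for (int i = 1; i < seen.size(); i += 2) {
            boolean dotThenId = i + 1 < seen.size()
                    && seen.get(i).equals("'.'")
                    && seen.get(i + 1).equals("ID");
            if (!dotThenId) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> spaced = Arrays.asList("ID", "WS", "'.'", "WS", "ID");
        System.out.println(isQualifiedName(spaced, GRAMMAR_HIDDEN));   // whitespace skipped
        System.out.println(isQualifiedName(spaced, new HashSet<>()));  // hidden(): WS is visible
    }
}
```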
Datatype Rules
Return a value (instead of an EObject)
Are processed by the parser
Superior alternative to terminal rules
Examples

Double returns ecore::EDouble hidden():
	'-'? INT '.' INT (('e'|'E') '-'? INT)?;

QualifiedName: // returns ecore::EString (default)
	ID ('.' ID)*;

ValidID:
	ID | 'keyword0' | 'keyword1';
Value Converter
Transforms textual representation to value and back
For terminal and datatype rules
e.g. strips quotes (STRINGValueConverter),
removes leading ^ (IDValueConverter)
EMF defaults are often sufficient (EFactory)
class MyValueConverterService extends AbstractDeclarativeValueConverterService {

	@Inject MyValueConverter myValueConverter;

	@ValueConverter(rule = "MY_TERMINAL") def MY_TERMINAL() {
		return myValueConverter
	}
}
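What "text to value and back" means for ID and STRING can be sketched in plain Java. The interface `TextValueConverter` below is a hypothetical stand-in, deliberately not Xtext's own converter interface, and the keyword set and the missing escape handling are simplifying assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical two-way converters; a sketch, not Xtext's value converter API. */
final class ValueConverterSketch {
    /** Stand-in interface invented for this example. */
    interface TextValueConverter<T> {
        T toValue(String text);   // parsing direction
        String toText(T value);   // serialization direction
    }

    // Assumed keywords of the statemachine grammar.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList("events", "state"));

    /** ID: drops a leading ^ when parsing, adds it back when the name is a keyword. */
    static final TextValueConverter<String> ID = new TextValueConverter<String>() {
        @Override public String toValue(String text) {
            return text.startsWith("^") ? text.substring(1) : text;
        }
        @Override public String toText(String value) {
            return KEYWORDS.contains(value) ? "^" + value : value;
        }
    };

    /** STRING: strips the surrounding quotes; escape sequences are ignored here. */
    static final TextValueConverter<String> STRING = new TextValueConverter<String>() {
        @Override public String toValue(String text) {
            return text.substring(1, text.length() - 1);
        }
        @Override public String toText(String value) {
            return "\"" + value + "\"";
        }
    };

    public static void main(String[] args) {
        System.out.println(ID.toValue("^state"));
        System.out.println(ID.toText("state"));
        System.out.println(STRING.toValue("'Lamp'"));
    }
}
```

The round trip matters for serialization: a model element named `state` has to be written as `^state`, otherwise the lexer would read it back as a keyword.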
Ambiguities
Example
Expression:	
	 If;	
	 	
If:	
	 Variable | 	
	 'if' condition=Expression 	
	 'then' then=Expression 	
	 ('else' else=Expression)?;	
	 	
Variable:	
	 name=ID;
warning(200): ...: Decision can match input such as "'else'" using multiple alternatives: 1, 2	
As a result, alternative(s) 2 were disabled for that input	
warning(200): ...: Decision can match input such as "'else'" using multiple alternatives: 1, 2	
As a result, alternative(s) 2 were disabled for that input
if cond1 then	
if cond2 then 	
foo	
else // dangling	
bar
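With alternative 2 disabled, the parser always consumes an `else` as soon as it sees one, so the `else` attaches to the nearest `if`. The hand-written recursive-descent sketch below illustrates that greedy behaviour; the class `DanglingElse` is hypothetical and only assumes whitespace-separated tokens, it is not what Xtext or ANTLR generate.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical greedy parser for the If rule; renders the result with parentheses. */
final class DanglingElse {
    private final List<String> tokens;
    private int pos = 0;

    private DanglingElse(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Parses whitespace-separated input and returns a fully parenthesized form. */
    static String parse(String input) {
        DanglingElse parser = new DanglingElse(Arrays.asList(input.trim().split("\\s+")));
        String result = parser.expression();
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + parser.tokens.get(parser.pos));
        }
        return result;
    }

    private String expression() {
        if (!lookingAt("if")) {
            return variable();
        }
        pos++; // 'if'
        String condition = expression();
        expect("then");
        String thenBranch = expression();
        // Greedy decision: an 'else' seen here is always taken by the innermost 'if'.
        if (lookingAt("else")) {
            pos++; // 'else'
            String elseBranch = expression();
            return "(if " + condition + " then " + thenBranch + " else " + elseBranch + ")";
        }
        return "(if " + condition + " then " + thenBranch + ")";
    }

    private String variable() {
        if (pos >= tokens.size() || lookingAt("then") || lookingAt("else")) {
            throw new IllegalArgumentException("variable expected at token " + pos);
        }
        return tokens.get(pos++);
    }

    private boolean lookingAt(String keyword) {
        return pos < tokens.size() && tokens.get(pos).equals(keyword);
    }

    private void expect(String keyword) {
        if (!lookingAt(keyword)) {
            throw new IllegalArgumentException("'" + keyword + "' expected at token " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse("if cond1 then if cond2 then foo else bar"));
    }
}
```

For the input from the slide, the `else bar` ends up inside the inner `if cond2`, not on the outer `if cond1`.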
Ambiguity Analysis
In the workflow, add

fragment = DebugAntlrGeneratorFragment {
	options = auto-inject {}
}

AntlrWorks www.antlr3.org/works/
Ambiguity Resolution
Add keywords
Use syntactic predicates:

“If the predicated element matches in the current
decision, decide for it.”
plain

(=>'else' else=Expression)?; 

=>('else' else=Expression)?; // not so good
first-set

->('else' else=Expression)?;
Correspondence to Ecore
grammar org.xtextcon.Statemachine
	with org.eclipse.xtext.common.Terminals

generate statemachine
	"http://www.xtextcon.org/Statemachine"

Statemachine returns Statemachine:
	name=STRING
	('events' events+=Event+)?
	states+=State*;

Event returns Event:
	name=ID description=STRING?;

State returns State:
	'state' name=ID
	transitions+=Transition*;

Transition returns Transition:
	event=[Event|ID] '=>' state=[State|ID];
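[Class diagram of the EPackage statemachine: Statemachine (name : String) contains states (State, *) and events (Event, *); State (name : String) contains transitions (Transition, *); Event (name : String, description : String); Transition (name : String) references event (Event, 1) and state (State, 1)]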
Supertypes
...

Element:
	Event | State;

Event returns Event:
	name=ID description=STRING?;

State returns State:
	'state' name=ID
	transitions+=Transition*;

...
[Class diagram: Element (name : String) is the supertype of Event (description : String) and State; Transition (name : String)]
Common features are promoted to the super type
The dispatcher rule Element need not be called
A rule can return a subtype of the specified return type
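The promotion of `name` to the supertype can be pictured with a few hand-written Java classes. This is only a sketch of the resulting hierarchy, assuming plain fields instead of the interfaces and implementation classes that EMF would generate; `SupertypeSketch` and `names` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hand-written picture of the inferred hierarchy; not EMF-generated code. */
final class SupertypeSketch {
    /** Supertype inferred from the dispatcher rule Element: Event | State. */
    abstract static class Element {
        String name; // common feature of Event and State, promoted to Element
    }

    static final class Event extends Element {
        String description; // only Event has it, so it stays on Event
    }

    static final class State extends Element {
        final List<Transition> transitions = new ArrayList<>();
    }

    static final class Transition {
        Event event;
        State state;
    }

    /** Works on any Element because name lives on the supertype. */
    static List<String> names(List<? extends Element> elements) {
        List<String> result = new ArrayList<>();
        for (Element element : elements) {
            result.add(element.name);
        }
        return result;
    }

    public static void main(String[] args) {
        Event pressed = new Event();
        pressed.name = "button_pressed";
        State on = new State();
        on.name = "on";
        System.out.println(names(Arrays.asList(pressed, on)));
    }
}
```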
Imported EPackage
Workflow {
	bean = StandaloneSetup {
		...
		registerGeneratedEPackage =
			"org.xtextcon.statemachine.StatemachinePackage"
		registerGenModelFile =
			"platform:/resource/<path-to>/Statemachine.genmodel"
	}
	...
}

//generate statemachine "http://www.xtextcon.org/Statemachine"
import "http://www.xtextcon.org/Statemachine"
AST Creation
EObject Creation
The current pointer
Points to the EObject returned by a parser rule call
Is set to null when entering the rule
On the first assignment the EObject is created and
assigned to current
Further assignments go to current
EObject Creation
Type :
	(Class | DataType) 'is' visibility=('public' | 'private');

DataType :
	'datatype' name=ID;

Class :
	'class' name=ID;
datatype A is public
Type current = null;
current = new DataType();
current.setName("A");
current.setVisibility("public");
return current;
Actions
Alternative without assignment -> no returned object
BooleanLiteral:
	value?='true' | 'false';
The rule 'BooleanLiteral' may be consumed without object instantiation.
Add an action to ensure object creation, e.g. '{BooleanLiteral}'.
Actions make sure an EObject is created
Specified type must be compatible with return type
Created element is assigned to current
BooleanLiteral returns Expression:
	{BooleanLiteral} (value?='true' | 'false');
Left Recursion
Expression :
	Expression '+' Expression |
	Number;

Number :
	value = INT;
The rule 'Expression' is left recursive.
Left Factoring
Expression:
	{Addition} left=Number ('+' right=Number)*;

Number:
	value = INT;
1
// model
Addition {
	left = Number {
		value = 1
	}
}
Assigned Action
Addition returns Expression:	
Number ({Sum.left = current} '+' right=Number)*;	
	
Number:	
value = INT;
1
Expression current = null;	
current = new Number();	
current.setValue(1);	
return current;
// model	
Number {	
	 value = 1	
}
Assigned Action II
Addition returns Expression:	
Number ({Sum.left = current} '+' right=Number)*;	
	
Number:	
value = INT;
1 + 2
Expression current = null;
current = new Number();
current.setValue(1);
Sum temp = new Sum();
temp.setLeft(current);
current = temp;
…
temp.setRight(<second Number>);
return current;
// model
Sum {
	left = Number {
		value = 1
	}
	right = Number {
		value = 2
	}
}
Assigned Action III
Creates a new element,
makes it the parent of current,
and sets current to it
Needed for
normalizing left-factored grammars
infix operators
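The three steps of an assigned action can be replayed in a small hand-written parser for the rule `Number ({Sum.left = current} '+' right=Number)*`. This is a sketch only: `AssignedActionSketch` and `parseAddition` are hypothetical, tokens are assumed to be separated by whitespace, and the model classes are plain Java instead of EMF objects.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical replay of the assigned action {Sum.left = current}. */
final class AssignedActionSketch {
    interface Expression {}

    /** Model class from the slide; shadows java.lang.Number inside this sketch. */
    static final class Number implements Expression {
        final int value;
        Number(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Sum implements Expression {
        Expression left;
        Expression right;
        @Override public String toString() { return "Sum(" + left + ", " + right + ")"; }
    }

    /** Parses input such as "1 + 2 + 3" into a left-associative tree. */
    static Expression parseAddition(String input) {
        List<String> tokens = Arrays.asList(input.trim().split("\\s+"));
        int pos = 0;
        // Unassigned rule call Number: its result becomes current.
        Expression current = new Number(Integer.parseInt(tokens.get(pos++)));
        while (pos < tokens.size() && tokens.get(pos).equals("+")) {
            // Assigned action: create a Sum, make it the parent of current,
            // then let current point to the new Sum.
            Sum sum = new Sum();
            sum.left = current;
            current = sum;
            pos++; // '+'
            if (pos >= tokens.size()) {
                throw new IllegalArgumentException("number expected after '+'");
            }
            sum.right = new Number(Integer.parseInt(tokens.get(pos++))); // right=Number
        }
        if (pos != tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + tokens.get(pos));
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(parseAddition("1"));
        System.out.println(parseAddition("1 + 2 + 3"));
    }
}
```

A lone `1` stays a plain `Number` without a wrapping node, and every further `+` wraps the tree built so far as the left operand, which gives the left-associative shape one would expect from an infix operator.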
Best Practices
Best Practices
Use nsURIs for imported EPackages
Prefer datatype rules to terminal rules
Use syntactic predicates instead of backtracking
Be sloppy in the grammar but strict in validation
