This document discusses syntactic editor services including formatting, syntax coloring, and syntactic completion. It describes how syntactic completion can be provided generically based on a syntax definition. The document also discusses how context-free grammars can be extended with templates to specify formatting layout when pretty-printing abstract syntax trees to text. Templates are used to insert whitespace, line breaks, and indentation to produce readable output.
CS4200 2019 | Lecture 4 | Syntactic Services
1. Lecture 4: Syntactic Editor Services
CS4200 Compiler Construction
Eelco Visser
TU Delft
September 2019
2. This Lecture
Syntactic editor services
- more interpretations of syntax definitions
Formatting specification
- how to map (abstract syntax) trees to text
Syntactic completion
- proposing valid syntactic completions in an editor
4.
The inverse of parsing is unparsing or pretty-printing or
formatting, i.e. mapping a tree representation of a program
to a textual representation. A plain context-free grammar
can be used as specification of an unparser. However, then
it is unclear where the whitespace should go.
This paper extends context-free grammars with templates
that provide hints for layout of program text when
formatting.
https://doi.org/10.1145/2427048.2427056
Based on the master's thesis project of Tobi Vollebregt
5.
Syntax definitions can be used not just for parsing, but
also for many other operations. This paper shows how
syntactic completion can be provided generically given
a syntax definition.
https://doi.org/10.1145/2427048.2427056
Part of the PhD thesis work of Eduardo Amorim
6.
The SDF3 syntax
definition formalism is
documented at the
metaborg.org website.
http://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/index.html
13. Generated Syntax Coloring
// compute the n-th fibonacci number
let function fib(n: int): int =
  if n <= 1 then 1
  else fib(n - 1) + fib(n - 2)
in fib(10)
end
module libspoofax/color/default
imports
libspoofax/color/colors
colorer // Default, token-based highlighting
keyword : 127 0 85 bold
identifier : default
string : blue
number : darkgreen
var : 139 69 19 italic
operator : 0 0 128
layout : 63 127 95 italic
14. Customized Syntax Coloring
module Tiger-Colorer
colorer
red = 255 0 0
green = 0 255 0
blue = 0 0 255
TUDlavender = 123 160 201
colorer token-based highlighting
keyword : red
Id : TUDlavender
StrConst : darkgreen
TypeId : blue
layout : green
// compute the n-th fibonacci number
let function fib(n: int): int =
  if n <= 1 then 1
  else fib(n - 1) + fib(n - 2)
in fib(10)
end
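The token-based coloring shown on these slides can be sketched outside Spoofax as a mapping from token sorts to styles. The following Python sketch is only an illustration: the regular expressions in TOKEN_SPEC are assumptions made for this example (in Spoofax the token sorts come from the syntax definition), and the style strings are copied from the default colorer above.

```python
import re

# Hypothetical token classes for a small Tiger subset (assumption for this sketch).
# Order matters: comments must be tried before operators, keywords before identifiers.
TOKEN_SPEC = [
    ("layout", r"\s+|//[^\n]*"),
    ("keyword", r"\b(?:let|function|if|then|else|in|end)\b"),
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[-+*/<>=:(),]+"),
]

# Styles taken from the default colorer on the slide.
STYLES = {
    "keyword": "127 0 85 bold",
    "identifier": "default",
    "number": "darkgreen",
    "operator": "0 0 128",
    "layout": "63 127 95 italic",
}

MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def colorize(text):
    """Return (token text, token sort, style) triples for the recognized tokens."""
    return [(m.group(), m.lastgroup, STYLES[m.lastgroup]) for m in MASTER.finditer(text)]


if __name__ == "__main__":
    for token, sort, style in colorize("if n <= 1 then 1"):
        print(repr(token), sort, style)
```

Changing the coloring then amounts to editing the STYLES table, which is the role the customized colorer module plays for Tiger.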
19. Pretty-Printing
From ASTs to text
- insert keywords
- insert layout: spaces, line breaks, indentation
- insert parentheses to preserve tree structure
Unparser
- derive transformation rules from context-free grammar
- keywords, literals defined in grammar productions
- parentheses determined by priority, associativity rules
- separate all symbols by a space => not pretty, or even readable
Pretty-printer
- introduce spaces, line breaks, and indentation to produce readable text
- doing that manually is tedious
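The unparser described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the generated Spoofax unparser: the Num and BinOp node types and the PRIORITY table are hypothetical, and all operators are assumed to be left-associative. Symbols are separated by a single space, and parentheses are inserted only where priority or associativity requires them to preserve the tree structure.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Exp"
    right: "Exp"


Exp = Union[Num, BinOp]

# Hypothetical priority table: a higher number binds tighter.
# All operators are assumed to be left-associative.
PRIORITY = {"*": 2, "+": 1, "-": 1}


def unparse(e, min_prio=0):
    """Map a tree to text, separating all symbols by one space."""
    if isinstance(e, Num):
        return str(e.value)
    prio = PRIORITY[e.op]
    # Left-associative: the left operand may have the same priority,
    # the right operand needs a strictly higher one.
    text = f"{unparse(e.left, prio)} {e.op} {unparse(e.right, prio + 1)}"
    return f"( {text} )" if prio < min_prio else text


if __name__ == "__main__":
    print(unparse(BinOp("*", BinOp("+", Num(1), Num(2)), Num(3))))
```

The output is correct but not pretty: every symbol is surrounded by spaces and there are no line breaks or indentation, which is the gap a pretty-printer fills.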
20. Specifying Formatting Layout with Templates
context-free syntax
  Exp.Seq = <
    (
      <{Exp ";\n"}*>
    )
  >
  Exp.If = <
    if <Exp> then
      <Exp>
    else
      <Exp>
  >
  Exp.IfThen = <
    if <Exp> then
      <Exp>
  >
  Exp.While = <
    while <Exp> do
      <Exp>
  >
Inverse quotation
- template quotes literal text with <>
- anti-quotations insert non-terminals with <>
Layout directives
- whitespace (linebreaks, indentation, spaces) in template guides formatting
- is interpreted as LAYOUT? for parsing
Formatter generation
- generate rules for mapping AST to text (via box expressions)
Applications
- code generation; pretty-printing generated AST
- syntactic completions
- formatting
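How template layout drives formatting can be sketched in Python. This is a simplified illustration, not the box-expression pipeline Spoofax generates: the TEMPLATES table, the numbered anti-quotations such as <0>, and the tuple encoding of ASTs are assumptions made for this example. The key idea carried over from the slide is that the indentation of the line containing an anti-quotation is applied to every line of the formatted child.

```python
import re

# Hypothetical templates keyed by constructor name.
# "<0>", "<1>", ... stand in for anti-quotations of the child at that index.
TEMPLATES = {
    "If": "if <0> then\n  <1>\nelse\n  <2>",
    "While": "while <0> do\n  <1>",
    "Call": "<0>(<1>)",
}


def fmt(node):
    """Format an AST given as a leaf string or a (constructor, child, ...) tuple."""
    if isinstance(node, str):
        return node
    cons, *kids = node
    out_lines = []
    for line in TEMPLATES[cons].split("\n"):
        # Indentation of the template line is inherited by multi-line children.
        indent = " " * (len(line) - len(line.lstrip(" ")))

        def sub(match, indent=indent):
            child = fmt(kids[int(match.group(1))])
            return child.replace("\n", "\n" + indent)

        out_lines.append(re.sub(r"<(\d+)>", sub, line))
    return "\n".join(out_lines)


if __name__ == "__main__":
    print(fmt(("While", "c", ("If", "a", "b", "d"))))
```

Nested constructs are indented relative to their parent without any formatting code per construct; only the whitespace in the templates is consulted.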
22. Templates for Tiger: Functions
context-free syntax
  Dec.FunDecs = <<{FunDec "\n"}+>> {longest-match}
  FunDec.ProcDec = <
    function <Id>(<{FArg ", "}*>) =
      <Exp>
  >
  FunDec.FunDec = <
    function <Id>(<{FArg ", "}*>) : <Type> =
      <Exp>
  >
  FArg.FArg = <<Id> : <Type>>
  Exp.Call = <<Id>(<{Exp ", "}*>)>
- No space after function name in call
- Space after comma!
- Function declarations separated by newline
- Indent body of function
23. Templates for Tiger: Bindings and Records
context-free syntax
  Exp.Let = <
    let
      <{Dec "\n"}*>
    in
      <{Exp ";\n"}*>
    end
  >
context-free syntax // records
  Type.RecordTy = <
    {
      <{Field ", \n"}*>
    }
  >
  Field.Field = <<Id> : <TypeId>>
  Exp.NilExp = <nil>
  Exp.Record = <<TypeId>{ <{InitField ", "}*> }>
  InitField.InitField = <<Id> = <Exp>>
  LValue.FieldVar = <<LValue>.<Id>>
Note spacing / layout in separators
31. Tiger Syntax: Whitespace & Comments
module Whitespace
lexical syntax
  LAYOUT = [\ \t\n\r]
context-free restrictions
  LAYOUT? -/- [\ \t\n\r]
module Comments
lexical syntax // multiline comments
  CommentChar = [\*]
  LAYOUT = "/*" InsideComment* "*/"
  InsideComment = ~[\*]
  InsideComment = CommentChar
lexical restrictions
  CommentChar -/- [\/]
context-free restrictions
  LAYOUT? -/- [\/].[\/]
lexical syntax // single line comments
  LAYOUT = "//" ~[\n\r]* NewLineEOF
  NewLineEOF = [\n\r]
  NewLineEOF = EOF
  EOF =
  // end of file since it cannot be followed by any character
  // avoids the need for a newline to close a single line comment
  // at the last line of a file
lexical restrictions
  EOF -/- ~[]
context-free restrictions
  LAYOUT? -/- [\/].[\*]
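As a rough cross-check of what this layout definition accepts, the same language can be approximated with a regular expression. The Python sketch below is an approximation under assumptions, not a translation produced by any tool: the negative lookahead plays the role of the follow restriction on CommentChar, the end-of-input anchor stands in for EOF, and greedy matching stands in for the longest-match restrictions on LAYOUT?. The name skip_layout is hypothetical.

```python
import re

# Whitespace, non-nested /* ... */ comments, and // comments that end at a
# newline or at the end of the input.
LAYOUT = re.compile(
    r"(?:"
    r"[ \t\n\r]"                    # LAYOUT = [\ \t\n\r]
    r"|/\*(?:[^*]|\*(?!/))*\*/"     # "/*" InsideComment* "*/", '*' not followed by '/'
    r"|//[^\n\r]*(?:[\n\r]|\Z)"     # "//" ~[\n\r]* NewLineEOF
    r")*"
)


def skip_layout(text, pos=0):
    """Return the index of the first non-layout character at or after pos."""
    return LAYOUT.match(text, pos).end()


if __name__ == "__main__":
    source = "  /* a * b */ // c\nlet"
    print(source[skip_layout(source):])
```

The second test case in particular mirrors the EOF production: a single-line comment on the last line of a file does not need a closing newline.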
37.
S. Amann, S. Proksch, S. Nadi, and M. Mezini. A study of
Visual Studio usage in practice. In SANER, 2016.
84. Syntactic Services from Syntax Definition
Syntax coloring
- ESV: mapping token sorts to colors
Formatting
- Unparsing: mapping from abstract syntax to concrete syntax
- Pretty-printing: derived from templates
- Parenthesization: derived from disambiguation declarations
Completion
- Make incompleteness explicit, part of the structure
- Generating proposals: derived from structure
- Pretty-printing proposals: derived from templates
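The completion points can be sketched in Python as follows. This is a text-level illustration under assumptions, not the Spoofax implementation: the PRODUCTIONS table, the $Sort placeholder notation, and the functions proposals and expand are hypothetical names for this example. Incompleteness is made explicit as a placeholder, proposals are read off the productions of the placeholder's sort, and each proposal is laid out by its template and re-indented to the position where it is inserted.

```python
import re

# Hypothetical miniature grammar: sort -> list of (constructor, template).
# "$Sort" marks an explicit placeholder for a missing subtree of that sort.
PRODUCTIONS = {
    "Exp": [
        ("If", "if $Exp then\n  $Exp\nelse\n  $Exp"),
        ("While", "while $Exp do\n  $Exp"),
        ("Call", "$Id()"),
    ],
}


def proposals(sort):
    """All expansions of a placeholder of the given sort, one per production."""
    return list(PRODUCTIONS.get(sort, []))


def expand(program, sort, cons):
    """Replace the first $sort placeholder with the template of constructor cons."""
    match = re.search(r"\$" + re.escape(sort) + r"\b", program)
    if match is None:
        raise ValueError(f"no placeholder of sort {sort}")
    template = dict(PRODUCTIONS[sort])[cons]
    # Re-indent the proposal to the indentation of the line it is inserted on.
    line_start = program.rfind("\n", 0, match.start()) + 1
    prefix = program[line_start:match.start()]
    indent = " " * (len(prefix) - len(prefix.lstrip(" ")))
    text = template.replace("\n", "\n" + indent)
    return program[:match.start()] + text + program[match.end():]


if __name__ == "__main__":
    print(expand("while x do\n  $Exp", "Exp", "If"))
```

Because each expansion introduces new placeholders, repeated completion builds up a program that is syntactically valid at every step.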