Matt Ellis

@citizenmatt
How to parse a file
DON’T
@citizenmatt
Why would we write a parser?
• Speed, efficiency
• Reduce dependencies
• Custom or simple formats
• Things that aren’t files - DSLs

Command line options, HTTP headers, stdout, natural language commands

E.g. YouTrack queries
• When we’re just as interested in the structure of a file as its contents
Matt Ellis
Developer advocate

JetBrains

@citizenmatt
JetBrains IDE architecture (kinda)
[Diagram: layers labelled PSI, Features, Project Model, Base Platform]
@citizenmatt
Unity and ShaderLab
@citizenmatt
What are we trying to build?
@citizenmatt
How to parse a file for an IDE
@citizenmatt
Hand rolled parser
var c = ReadChar();
switch (c) {
  case 's':
    c = ReadChar();
    switch (c) {
      case 'h':
        // Parse rest of "Shader", then sub-elements, …
        // Create syntax tree node(s) …
        break;
      default:
        SyntaxError();
        break;
    }
    break;
  case 'p':
    // Parse rest of "Properties", then sub-elements, …
    // Create syntax tree node(s) …
    break;
}
@citizenmatt
Compiler pipeline
Front end: Lexical analysis → Syntactic analysis → Semantic analysis
Back end: Code optimisation → Code generation
@citizenmatt
IDE pipeline
Lexical analysis → Syntactic analysis → Semantic analysis
@citizenmatt
IDE pipeline
Lexer → Parser → Program structure
@citizenmatt
Lexers
@citizenmatt
What is a lexer (aka scanner)?
• Performs lexical analysis

Lexical - relating to the words or vocabulary of a language
• Converts a string into a stream of tokens

Identifier, comment, string literal, braces, parentheses, whitespace, etc.
• Tokens are lightweight - typically integer values

(ReSharper uses singleton object instances)
• Parser pattern matches over tokens

Integer or object reference comparisons
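To make "converts a string into a stream of tokens" concrete, here is a minimal sketch in Python. It is not how ReSharper's lexers are built: it leans on Python's `re` module (a backtracking engine where the first alternative wins, so keyword rules are listed before the identifier rule), and the rule list is a small assumed subset of the ShaderLab token types.

```python
import re
from typing import Iterator, NamedTuple

class Token(NamedTuple):
    type: str     # token type name; a real lexer would use an int or a singleton object
    offset: int   # start offset in the source text
    text: str

# Assumed subset of rules. Order matters: keywords come before IDENTIFIER.
TOKEN_RULES = [
    ("END_OF_LINE_COMMENT", r"//[^\r\n]*"),
    ("NEW_LINE", r"\r\n|\n"),
    ("WHITESPACE", r"[ \t]+"),
    ("SHADER_KEYWORD", r"Shader\b"),
    ("STRING_LITERAL", r'"[^"\r\n]*"'),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES))

def lex(text: str) -> Iterator[Token]:
    """Yield every token in `text`, including whitespace and comments."""
    offset = 0
    while offset < len(text):
        m = MASTER.match(text, offset)
        if m is None:
            # No rule matches: emit a one-character error token and carry on.
            yield Token("BAD_CHARACTER", offset, text[offset])
            offset += 1
            continue
        yield Token(m.lastgroup, offset, m.group())
        offset = m.end()

tokens = list(lex('// Colored vertex lighting\r\nShader "MyShader"\r\n{'))
for token in tokens:
    print(f"{token.offset:04d}: {token.type} {token.text!r}")
```

The parser then only compares `token.type` values; it rarely needs to look at the text.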
@citizenmatt
Lexer output
// Colored vertex lighting
Shader "MyShader"
{
  // a single color property
  Properties {
    _Color ("Main Color", Color) = (1, .5,.5,1)
  }
  // define one subshader
  SubShader
  {
    // a single pass in our subshader
    Pass
    {
      Material
      {
        Diffuse [_Color]
      }
      Lighting On
    }
  }
}

0000: END_OF_LINE_COMMENT '// Colored vertex lighting'
0026: NEW_LINE '\r\n'
0028: SHADER_KEYWORD 'Shader'
0034: WHITESPACE ' '
0035: STRING_LITERAL '"MyShader"'
0045: NEW_LINE '\r\n'
0047: LBRACE '{'
0048: NEW_LINE '\r\n'
0050: WHITESPACE '  '
0052: END_OF_LINE_COMMENT '// a single color property'
0078: NEW_LINE '\r\n'
0080: WHITESPACE '  '
0082: PROPERTIES_KEYWORD 'Properties'
0092: WHITESPACE ' '
0093: LBRACE '{'
0094: NEW_LINE '\r\n'
0096: WHITESPACE '    '
0100: IDENTIFIER '_Color'
0106: WHITESPACE ' '
0107: LPAREN '('
0108: STRING_LITERAL '"Main Color"'
0120: COMMA ','
0121: WHITESPACE ' '
0122: COLOR_KEYWORD 'Color'
0127: RPAREN ')'
0128: WHITESPACE ' '
0129: EQUALS '='
0130: WHITESPACE ' '
0131: LPAREN '('
…
@citizenmatt
Lexers are a solved problem
Use a lexer generator

lex (1975), flex, CsLex, FsLex, JFlex, etc.
@citizenmatt
Anatomy of a lexer input file
User code (e.g. using directives)
%%
directives
  set up namespaces, class names, interfaces
  declare regex macros
  declare states
%%
rules and actions
  <state> rule { action }
@citizenmatt
ShaderLab lexer
Demo
@citizenmatt
How does it work?
• Lexer generates source code
• Rules (regexes) converted into single Finite State Machine

All regexes combined, matched at same time
• Encoded in state transition tables
• Lookup based on state and input char
• Very fast
• Not very maintainable

Seriously
@citizenmatt
a(b|c)d*e+
Pete Jinks - http://www.cs.man.ac.uk/~pjj/cs211/ho/node6.html
@citizenmatt
Rule: a(b|c)d*e+
State  ‘a’   ‘b’   ‘c’   ‘d’   ‘e’   other
0      m(1)  E     E     E     E     E
1      E     m(2)  m(2)  E     E     E
2      E     E     E     m(2)  m(3)  E
3      a     a     a     a     m(3)  a
m(x) - match, move to state x
a - accept
E - error
Pete Jinks - http://www.cs.man.ac.uk/~pjj/cs211/ho/node6.html
@citizenmatt
It gets better
Rules: a(b|c)d*e+ and [0-9]+
[Diagram: new state 4, entered from state 0 on [0-9] and looping on [0-9]]
State  ‘a’   ‘b’   ‘c’   ‘d’   ‘e’   [0-9]  other
0      m(1)  E     E     E     E     m(4)   E
1      E     m(2)  m(2)  E     E     E      E
2      E     E     E     m(2)  m(3)  E      E
3      a     a     a     a     m(3)  a      a
4      a     a     a     a     a     m(4)   a
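The combined table can be driven by a very small loop. The sketch below is in Python and copies the transition table from the slide; the token names `WORD` and `NUMBER`, the `column` helper, and treating end of input as the "other" column are assumptions made for illustration. A generated lexer would emit this kind of table plus a similar loop.

```python
E = "E"  # error: no rule can match this character here
A = "a"  # accept: the token ended just before this character

# Column index for each input class; digits map to 5, anything else to 6.
COLUMNS = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4}

# One row per state. An int n means m(n): consume the character, move to state n.
TABLE = [
    # 'a' 'b' 'c' 'd' 'e' 0-9 other
    [1, E, E, E, E, 4, E],  # state 0
    [E, 2, 2, E, E, E, E],  # state 1
    [E, E, E, 2, 3, E, E],  # state 2
    [A, A, A, A, 3, A, A],  # state 3: accepting state for a(b|c)d*e+
    [A, A, A, A, A, 4, A],  # state 4: accepting state for [0-9]+
]
ACCEPTING = {3: "WORD", 4: "NUMBER"}  # hypothetical token names

def column(ch: str) -> int:
    if "0" <= ch <= "9":
        return 5
    return COLUMNS.get(ch, 6)

def next_token(text: str, start: int):
    """Return (token_type, end_offset) for the token that starts at `start`."""
    state, pos = 0, start
    while True:
        # End of input is treated like the "other" column (an assumption).
        col = column(text[pos]) if pos < len(text) else 6
        action = TABLE[state][col]
        if action == E:
            raise ValueError(f"unexpected input at offset {pos}")
        if action == A:
            return ACCEPTING[state], pos
        state, pos = action, pos + 1

first = next_token("acee42", 0)
second = next_token("acee42", first[1])
```

Both rules are matched at the same time: the loop never knows which regex it is in, only which state.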
@citizenmatt
Parsing
@citizenmatt
What is a parser?
• Performs syntactic analysis

Verifies and matches syntax of a file
• Pattern matching on stream of tokens from lexer

Can look at token offsets and text, too
• Syntax is described by a grammar
• Grammar is represented as a recursive hierarchy of rules

Top level is the whole file, composing down to structures and tokens
@citizenmatt
Example grammar
shaderFile:
  SHADER_KEYWORD
  STRING_LITERAL
  LBRACE
  propertiesBlock?
  tagsBlock?
  …
  RBRACE
;
propertiesBlock:
  PROPERTIES_KEYWORD
  LBRACE
  property*
  RBRACE
;
tagsBlock:
  TAGS_KEYWORD
  LBRACE
  tag*
  RBRACE
;

Shader "MyShader"
{
  Properties { … }
  Tags { … }
  …
}
@citizenmatt
Parsing is NOT a solved problem
Well, it is, kinda. There are just lots of solutions
@citizenmatt
Types of parsers
• Top down/recursive descent

Match the root of the tree, recursively split up into child elements
• Bottom up/recursive ascent

Start with matching the leaves of the tree, combine into larger
constructs as you go
@citizenmatt
Top down parser
parseShaderLabFile()
  parseShaderCommand()
    match(SHADER_KEYWORD)
    parseShaderValue()
      parseShaderValueName()
        match(STRING_LITERAL)
      match(LBRACE)
      if (tokenType == PROPERTIES_KEYWORD)
        parsePropertiesCommand()
      …
      match(RBRACE)
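The call tree above maps almost one-to-one onto code: one method per grammar rule, each calling `match` for tokens and other methods for sub-rules. A minimal sketch in Python follows. The `Parser` class, the tuple-based tree nodes, and the omission of `property*` are assumptions for illustration, and the token list is assumed to be already filtered of whitespace.

```python
class Parser:
    """Recursive descent over a list of (type, text) pairs."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    @property
    def token_type(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def match(self, expected):
        # Consume the current token if it has the expected type, else fail.
        if self.token_type != expected:
            raise ValueError(f"expected {expected}, got {self.token_type} at token {self.pos}")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_shader_file(self):
        children = [
            self.match("SHADER_KEYWORD"),
            self.match("STRING_LITERAL"),
            self.match("LBRACE"),
        ]
        if self.token_type == "PROPERTIES_KEYWORD":  # propertiesBlock?
            children.append(self.parse_properties_block())
        children.append(self.match("RBRACE"))
        return ("shaderFile", children)

    def parse_properties_block(self):
        children = [self.match("PROPERTIES_KEYWORD"), self.match("LBRACE")]
        # property* is left out of this sketch.
        children.append(self.match("RBRACE"))
        return ("propertiesBlock", children)

tokens = [
    ("SHADER_KEYWORD", "Shader"),
    ("STRING_LITERAL", '"MyShader"'),
    ("LBRACE", "{"),
    ("PROPERTIES_KEYWORD", "Properties"),
    ("LBRACE", "{"),
    ("RBRACE", "}"),
    ("RBRACE", "}"),
]
tree = Parser(tokens).parse_shader_file()
```

Because each rule is an ordinary method, a debugger's call stack shows exactly where in the grammar the parser is.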
@citizenmatt
Bottom up parser
Shift/Reduce algorithm
Match token

Shift token onto stack (e.g. INTEGER, OP_PLUS, INTEGER)

Reduce larger construct (e.g. INTEGER + INTEGER becomes EXPRESSION)
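A toy version of shift/reduce fits in a few lines. This Python sketch is a simplification: it reduces greedily whenever the top of the stack matches a rule body, whereas a generated bottom-up parser consults precomputed tables and lookahead to decide between shifting and reducing. The two-rule grammar is an assumption for illustration.

```python
# Grammar (assumed for this sketch):
#   EXPRESSION: EXPRESSION OP_PLUS INTEGER
#   EXPRESSION: INTEGER
RULES = [
    ("EXPRESSION", ["EXPRESSION", "OP_PLUS", "INTEGER"]),
    ("EXPRESSION", ["INTEGER"]),
]

def parse(tokens):
    """tokens: list of (type, text). Returns the final stack of tree nodes."""
    stack = []
    for token in tokens:
        stack.append(token)  # shift
        reduced = True
        while reduced:       # reduce for as long as a rule body sits on top of the stack
            reduced = False
            for name, body in RULES:
                n = len(body)
                if [item[0] for item in stack[-n:]] == body:
                    # Replace the matched items with one node that owns them.
                    stack[-n:] = [(name, stack[-n:])]
                    reduced = True
                    break
    return stack

stack = parse([
    ("INTEGER", "1"), ("OP_PLUS", "+"), ("INTEGER", "2"),
    ("OP_PLUS", "+"), ("INTEGER", "3"),
])
```

A successful parse leaves exactly one `EXPRESSION` node on the stack; anything else on the stack indicates a syntax error.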
@citizenmatt
Building a parser
• Hand rolled

Mechanical process to build. Easy to understand

Usually top down/recursive descent

Can use grammar to build syntax tree classes
• Parser generators

yacc/bison, ANTLR, etc.

Often bottom up (yacc/bison generate LALR parsers; ANTLR generates top-down LL parsers). Table-driven parsers can be hard to debug
• ReSharper mostly uses top-down procedural parsers

Generated and hand rolled

Mainly historical. Easier to maintain, easier error recovery, etc.
@citizenmatt
Parser combinators
• Build a parser by combining other, simpler parsers
• Monads!

Think LINQ - similar idea, similar ease of use, similar cost
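The core idea fits in a few functions. In this Python sketch a parser is just a function from `(text, pos)` to `(value, new_pos)`, or `None` on failure, and combinators build bigger parsers out of smaller ones. All names here (`literal`, `either`, `sequence`, `many1`, `mapped`) are made up for the sketch and do not correspond to any particular library.

```python
def literal(s):
    """Parser that matches the exact string `s`."""
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def either(*parsers):
    """Try each parser in turn; the first success wins."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def sequence(*parsers):
    """Run parsers one after another, collecting their values."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def many1(predicate):
    """One or more characters satisfying `predicate`."""
    def parse(text, pos):
        end = pos
        while end < len(text) and predicate(text[end]):
            end += 1
        return (text[pos:end], end) if end > pos else None
    return parse

def mapped(parser, fn):
    """Transform the value of a successful parse."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        return fn(value), pos
    return parse

spaces1 = many1(str.isspace)
number = mapped(many1(str.isdigit), int)
forward = mapped(
    sequence(either(literal("forward"), literal("fd")), spaces1, number),
    lambda parts: ("Forward", parts[2]),
)

result = forward("fd 10", 0)
```

The cost mentioned above shows up as many small closures and intermediate tuples per character, which is fine for small inputs but adds up on large files.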
@citizenmatt
FParsec for F#
// pstring - parse a string
// pfloat - parse a float
// spaces1 - parse one or more whitespace chars

let pforward = (pstring "fd" <|> pstring "forward") >>. spaces1 >>. pfloat
               |>> fun n -> Forward(int n)
let pleft = (pstring "left" <|> pstring "lt") >>. spaces1 >>. pfloat
            |>> fun x -> Left(int -x)
let pright = (pstring "right" <|> pstring "rt") >>. spaces1 >>. pfloat
             |>> fun x -> Right(int x)
let pcommand = pforward <|> pleft <|> pright
Phil Trelford - http://trelford.com/blog/post/FParsec.aspx
@citizenmatt
Sprache for C#
Parser<string> identifier =
    from leading in Parse.WhiteSpace.Many()
    from first in Parse.Letter.Once()
    from rest in Parse.LetterOrDigit.Many()
    from trailing in Parse.WhiteSpace.Many()
    select new string(first.Concat(rest).ToArray());
var id = identifier.Parse(" abc123  ");
Assert.AreEqual("abc123", id);
@citizenmatt
Problem #1
Whitespace and comments
@citizenmatt
We’d expect this to work:
shaderBlock:
  SHADER_KEYWORD
  STRING_LITERAL
  LBRACE
  …
  RBRACE
;

Shader "MyShader"
{
  …
}
@citizenmatt
But this is the actual input…
Shader\r\n
··········"MyShader"\r\n
·······\n
/* Cool shader! */\n
{···…········}\r\n
@citizenmatt
Which lexes as…
Input line                  Tokens
Shader\r\n                  SHADER_KEYWORD NEW_LINE
··········"MyShader"\r\n    WHITESPACE STRING_LITERAL NEW_LINE
·······\n                   WHITESPACE NEW_LINE
/* Cool shader! */\n        COMMENT NEW_LINE
{···…········}\r\n          LBRACE WHITESPACE … WHITESPACE RBRACE
@citizenmatt
Which doesn’t match the grammar
shaderBlock:
  SHADER_KEYWORD
  STRING_LITERAL
  LBRACE
  …
  RBRACE
;

SHADER_KEYWORD
NEW_LINE
WHITESPACE
STRING_LITERAL
NEW_LINE
WHITESPACE
NEW_LINE
COMMENT
NEW_LINE
LBRACE
WHITESPACE
…
WHITESPACE
RBRACE
@citizenmatt
Filtering lexers
• Filter whitespace and comments from the stream of tokens

ReSharper’s tokens have IsFiltered property
• Decorator pattern

Wrap original lexer, swallow filtered tokens
Lexer → Filtering lexer → Parser → Program structure
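As a rough sketch of the decorator idea in Python: the wrapper consumes any token stream and passes on only the tokens the parser should see. The `Token` shape, the `FILTERED_TYPES` set and the `is_filtered` helper are assumptions for this example, standing in for a per-token property like the one mentioned above.

```python
from typing import Iterable, Iterator, NamedTuple

class Token(NamedTuple):
    type: str
    offset: int
    text: str

# Token types the parser should never see (assumed set for this sketch).
FILTERED_TYPES = {"WHITESPACE", "NEW_LINE", "COMMENT"}

def is_filtered(token: Token) -> bool:
    return token.type in FILTERED_TYPES

def filtering_lexer(tokens: Iterable[Token]) -> Iterator[Token]:
    """Wrap another lexer's token stream and swallow the filtered tokens."""
    for token in tokens:
        if not is_filtered(token):
            yield token

# Raw tokens for: Shader\r\n  "MyShader"\n/* Cool shader! */\n{ }
raw = [
    Token("SHADER_KEYWORD", 0, "Shader"),
    Token("NEW_LINE", 6, "\r\n"),
    Token("WHITESPACE", 8, "  "),
    Token("STRING_LITERAL", 10, '"MyShader"'),
    Token("NEW_LINE", 20, "\n"),
    Token("COMMENT", 21, "/* Cool shader! */"),
    Token("NEW_LINE", 39, "\n"),
    Token("LBRACE", 40, "{"),
    Token("WHITESPACE", 41, " "),
    Token("RBRACE", 42, "}"),
]
parser_input = [t.type for t in filtering_lexer(raw)]
```

The surviving tokens keep their original offsets, which matters later when the filtered tokens have to be put back.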
@citizenmatt
What are we building?
Is it safe to lose the whitespace?
@citizenmatt
IDE requirements, Part 1
• Code editor features

Syntax highlighting, code folding, etc.
• Syntax error highlighting
• Inspections
• Refactoring
• Formatting
• Etc.
@citizenmatt
IDE requirements, Part 1
• Need to work with the contents and structure of a file
• Contents give us semantic information
• Structure allows us to report inspections, refactor, etc.

Map the semantics back to the file
• Need to represent the structure of the file
• Syntax tree is obvious choice

Inspections walk the tree, refactorings rewrite the tree
@citizenmatt
Abstract Syntax Trees
[Diagram: expression trees with + nodes and integer leaves]
@citizenmatt
Concrete Parse Trees
[Diagram: expression tree with + nodes and integer leaves, plus whitespace (WS), new line (NL) and comment (// …) leaves]
@citizenmatt
Side problem #1
No guidance for designing parse trees!
@citizenmatt
Back to Filtering Lexers
• If we filter tokens out, we have to add them back again
• We need a Missing Tokens Inserter to add whitespace and comments back into parse tree
Lexer → Filtering lexer → Parser → Missing tokens inserter → Concrete parse tree
@citizenmatt
Missing Tokens Inserter
• Walk leaf elements of tree

Tokens
• Advance (cached) lexer for each leaf element
• Check current lexer token has same offset as leaf element
• If not, create leaf element and insert into tree
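The steps above can be sketched in Python as a single tree walk with a cursor into the cached, unfiltered token list. The `Node` class and the choice to insert a missing token directly in front of the next leaf, in that leaf's parent, are assumptions of this sketch; a real implementation would pick the owning node more carefully.

```python
from dataclasses import dataclass, field
from typing import NamedTuple

class Token(NamedTuple):
    type: str
    offset: int
    text: str

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # Node or Token items

def insert_missing_tokens(root: Node, all_tokens: list) -> None:
    """Put filtered tokens back into a tree that was built from filtered tokens."""
    cursor = 0  # index into the cached, unfiltered token stream

    def visit(node: Node) -> None:
        nonlocal cursor
        i = 0
        while i < len(node.children):
            child = node.children[i]
            if isinstance(child, Node):
                visit(child)
            else:
                # Cached tokens in front of this leaf were filtered out: re-insert them.
                while all_tokens[cursor].offset != child.offset:
                    node.children.insert(i, all_tokens[cursor])
                    cursor += 1
                    i += 1
                cursor += 1  # step past the leaf's own token
            i += 1

    visit(root)
    # Trailing whitespace or comments after the last leaf go at the end of the root.
    root.children.extend(all_tokens[cursor:])

def text_of(node: Node) -> str:
    return "".join(text_of(c) if isinstance(c, Node) else c.text for c in node.children)

# Source text: Shader "S" { }
all_tokens = [
    Token("SHADER_KEYWORD", 0, "Shader"), Token("WHITESPACE", 6, " "),
    Token("STRING_LITERAL", 7, '"S"'), Token("WHITESPACE", 10, " "),
    Token("LBRACE", 11, "{"), Token("WHITESPACE", 12, " "),
    Token("RBRACE", 13, "}"),
]
tree = Node("shaderFile", [
    all_tokens[0], all_tokens[2],
    Node("block", [all_tokens[4], all_tokens[6]]),
])
insert_missing_tokens(tree, all_tokens)
```

After insertion, concatenating the text of every leaf reproduces the original file exactly, which is a handy invariant to test.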
@citizenmatt
Problem #2
What about significant whitespace?
@citizenmatt
How do we parse this?
There are no end of scope markers!

And we’ve filtered out the whitespace!
let ArraySample() =
  let numLetters = 26
  let results = Array.create numLetters 0
  let data = "The quick brown fox"
  for i = 0 to data.Length - 1 do
    let c = data.Chars(i)
    let c = Char.ToUpper(c)
    if c >= 'A' && c <= 'Z' then
      let i = Char.code c - Char.code 'A'
      results.[i] <- results.[i] + 1
  printf "done!\n"
@citizenmatt
Insert zero-width tokens
• Another lexer decorator
• Keeps track of whitespace before it’s filtered
• Inserts “invisible” tokens into token stream

indicating indent/outdent or block start/end

Possibly also token to indicate invalid indentation
• Token is zero-width. Doesn’t affect parse tree
• Parser can match these invisible tokens in grammar
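A sketch of such a decorator in Python, under simplifying assumptions: indentation is measured as the length of the leading whitespace token, the token names `INDENT`, `DEDENT` and `INVALID_INDENT` are made up here, and the `raw_lex` helper is only a stand-in for a real lexer.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    type: str
    offset: int
    text: str

RAW = re.compile(r"(?P<NEW_LINE>\r?\n)|(?P<WHITESPACE>[ \t]+)|(?P<WORD>\S+)")

def raw_lex(text):
    """Stand-in lexer: new lines, runs of blanks, and everything else as WORD."""
    for m in RAW.finditer(text):
        yield Token(m.lastgroup, m.start(), m.group())

def with_indent_tokens(tokens):
    """Drop whitespace, but emit zero-width tokens where the indentation changes."""
    indents = [0]  # stack of currently open indentation widths
    at_line_start, width, end = True, 0, 0
    for tok in tokens:
        end = tok.offset + len(tok.text)
        if tok.type == "NEW_LINE":
            at_line_start, width = True, 0
        elif tok.type == "WHITESPACE":
            if at_line_start:
                width = len(tok.text)  # remember the indent before it is filtered
        else:
            if at_line_start:
                at_line_start = False
                if width > indents[-1]:
                    indents.append(width)
                    yield Token("INDENT", tok.offset, "")
                while width < indents[-1]:
                    indents.pop()
                    yield Token("DEDENT", tok.offset, "")
                if width != indents[-1]:
                    yield Token("INVALID_INDENT", tok.offset, "")
            yield tok
    for _ in indents[1:]:  # close blocks still open at end of input
        yield Token("DEDENT", end, "")

source = "for i\n  a\n  if c\n    b\nprintf"
stream = [t.text or t.type for t in with_indent_tokens(raw_lex(source))]
```

The grammar can then use `INDENT` and `DEDENT` exactly where a brace language would use LBRACE and RBRACE.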
@citizenmatt
Lexer flexibility
It’s just nice to say
@citizenmatt
Altering tokens
• F# example: 2. and [2..0] ambiguous
• Original lexer matches 2. as FLOAT 

and 2.. as INT_DOT_DOT
• Another lexer decorator

Augment generated rules with custom code
• Decorator recognises INT_DOT_DOT 

Splits into two tokens for parser
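The splitting decorator is small. In this Python sketch the output token names `INT` and `DOT_DOT` are assumptions; only `INT_DOT_DOT` comes from the slide.

```python
from typing import Iterable, Iterator, NamedTuple

class Token(NamedTuple):
    type: str
    offset: int
    text: str

def split_int_dot_dot(tokens: Iterable[Token]) -> Iterator[Token]:
    """Replace each INT_DOT_DOT token (e.g. '2..') with an INT and a DOT_DOT token."""
    for tok in tokens:
        if tok.type == "INT_DOT_DOT":
            digits = tok.text[:-2]  # everything before the trailing '..'
            yield Token("INT", tok.offset, digits)
            yield Token("DOT_DOT", tok.offset + len(digits), "..")
        else:
            yield tok

# Raw tokens for: [2..0]
raw = [
    Token("LBRACK", 0, "["),
    Token("INT_DOT_DOT", 1, "2.."),
    Token("INT", 4, "0"),
    Token("RBRACK", 5, "]"),
]
fixed = list(split_int_dot_dot(raw))
```

The offsets of the two new tokens still add up to the original span, so nothing downstream has to know the split happened.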
@citizenmatt
When regexes aren’t enough
• ShaderLab nested comments
• Not possible to match with regex

Don’t even try
• Rule to match start of comment - /*

Finish lexing by hand, counting start and end comment chars

Ignore START_COMMENT and return different token - COMMENT
• It doesn’t have to be completely machine generated
/*	This	/*	is	*/	valid	*/
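Finishing the comment by hand is a matter of counting. A minimal Python sketch, where the function name and the choice to treat an unterminated comment as running to end of file are assumptions:

```python
def lex_nested_comment(text: str, start: int) -> int:
    """Called once a rule has matched '/*' at `start`.

    Returns the offset just past the matching '*/', so the caller can emit a
    single COMMENT token covering text[start:end].
    """
    assert text.startswith("/*", start)
    depth, pos = 0, start
    while pos < len(text):
        if text.startswith("/*", pos):
            depth += 1          # another level of nesting opens
            pos += 2
        elif text.startswith("*/", pos):
            depth -= 1          # one level closes
            pos += 2
            if depth == 0:
                return pos
        else:
            pos += 1
    return len(text)  # unterminated: treat the rest of the file as comment

source = "/* This /* is */ valid */ Shader"
end = lex_nested_comment(source, 0)
comment_text = source[:end]
```

The counter is the piece a regular expression cannot express, because matching arbitrarily deep nesting needs unbounded memory.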
@citizenmatt
Problem #3
Pre-processor tokens
@citizenmatt
Pre-processor tokens
• Pre-processor tokens can appear anywhere
• How do you add them to the grammar/parser?
• ShaderLab has CGPROGRAM and CGINCLUDE which are essentially pre-processor tokens
• (Also nested language - Cg)
@citizenmatt
Parsing pre-processor tokens
• Two pass parsing
• First pass parses pre-processor tokens
• Filtering lexer strips pre-processor tokens
• Parse normally
• Parsed pre-processor tree nodes inserted as missing tokens
[Pipeline diagram with boxes labelled: Lexer, Filtering lexer, Pre-processor parser, Filtering lexer, Parser, Missing tokens inserter (including pre-processor tokens), Concrete parse tree]
@citizenmatt
Problem #4
IDEs impose constraints
@citizenmatt
IDE Requirements, Part 2
• Error highlighting

The code is broken every time you type
• Incremental lexing + parsing

Performance
• Version tolerance

E.g. multiple versions of C#
• Nested/composable languages
@citizenmatt
Problem #5
Error handling
@citizenmatt
Error handling is more of an art than a science
@citizenmatt
What happens when there’s an error?
• The parser adds an error element into the tree
• Error element spans whatever has been parsed so far

Might just be unexpected token, or incorrect element construct
• Highlighting the error in the editor is trivial

Inspection simply looks for error element, adds highlight
@citizenmatt
How do we find an error?
• Error start is obvious

mismatched rule, unexpected token
• Where does the error stop?

Off by one token could affect rest of file
• IDE must try to recover

How?
@citizenmatt
Error recovery
• Panic mode

Eat tokens until it finds a “follows” token
• Token insertion/removal/substitution
• Error rules in grammar
@citizenmatt
Panic mode
Shader "MyShader" {
  Properties {
    _RealProperty1("Real1", Color) = (1,1,1,1)
    _PropName SyntaxErrorPanicMode = (1,1,1,1)
    _Recovered("Real2", Color) = (1, 1, 1, 1)
  }
}

Shader "MyShader" {
  Properties {
    _RealProperty1("Real1", Color) = (1,1,1,1)
    _PropName SyntaxError _AttemptedRecovery = (1,1,1,1)
    _Recovered("Real2", Color) = (1, 1, 1, 1)
  }
}
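Panic mode can be sketched in Python over a list of token types. To keep it short, this sketch assumes a flat `PROPERTY` rule, treats a value such as `(1,1,1,1)` as a single hypothetical `VECTOR` token, and uses `IDENTIFIER` and `RBRACE` as the tokens that may follow a property. None of this is ReSharper's actual implementation.

```python
PROPERTY = ["IDENTIFIER", "LPAREN", "STRING_LITERAL", "COMMA",
            "COLOR_KEYWORD", "RPAREN", "EQUALS", "VECTOR"]
SYNC = {"IDENTIFIER", "RBRACE"}  # tokens that can follow a property

def parse_properties(types):
    """Parse property* up to RBRACE. Returns (kind, start, end) nodes over token indices."""
    pos, nodes = 0, []
    while pos < len(types) and types[pos] != "RBRACE":
        start = pos
        for expected in PROPERTY:
            if pos < len(types) and types[pos] == expected:
                pos += 1
            else:
                break
        else:
            nodes.append(("property", start, pos))  # whole rule matched
            continue
        # Panic mode: consume at least one token, then eat until a sync token.
        pos = max(pos, start + 1)
        while pos < len(types) and types[pos] not in SYNC:
            pos += 1
        nodes.append(("error", start, pos))
    return nodes

good = list(PROPERTY)
bad = ["IDENTIFIER", "IDENTIFIER", "EQUALS", "VECTOR"]  # _PropName SyntaxErrorPanicMode = (1,1,1,1)
nodes = parse_properties(good + bad + good + ["RBRACE"])
```

On this input the first recovery attempt lands on the stray identifier and fails again straight away, so two error elements are produced before the parser gets back in step at the next real property.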
@citizenmatt
Token insertion
• Expected RPAREN got EQUALS

Assume RPAREN missing (insert it), EQUALS matches, continue
Shader "MyShader" {
  Properties {
    _RealProperty1("Real1", Color = (1,1,1,1)
  }
}
@citizenmatt
Token removal
• Token insertion fails

Inserting EQUALS doesn’t sync back up
• Expected EQUALS got extra RPAREN

Skip RPAREN (remove it), EQUALS matches, continue
Shader "MyShader" {
  Properties {
    _RealProperty1("Real1", Color))	=	(1,1,1,1)
  }
}
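Insertion and removal can share one decision: try insertion first, and fall back to removal only if insertion would not sync back up. The Python sketch below works on a flat rule and a list of token types; the `PROPERTY` rule, the single hypothetical `VECTOR` token for `(1,1,1,1)`, and the one-token lookahead are all simplifying assumptions.

```python
PROPERTY = ["IDENTIFIER", "LPAREN", "STRING_LITERAL", "COMMA",
            "COLOR_KEYWORD", "RPAREN", "EQUALS", "VECTOR"]

def match_with_repair(rule, types):
    """Match `types` against `rule`, repairing single-token errors.

    Returns a list of (action, token_type, index) repairs.
    """
    repairs, pos = [], 0
    for i, expected in enumerate(rule):
        actual = types[pos] if pos < len(types) else "EOF"
        if actual == expected:
            pos += 1
            continue
        following = rule[i + 1] if i + 1 < len(rule) else "EOF"
        if actual == following:
            # Insertion: pretend `expected` was present and do not consume anything.
            repairs.append(("insert", expected, pos))
        elif pos + 1 < len(types) and types[pos + 1] == expected:
            # Removal: skip the stray token, then consume `expected`.
            repairs.append(("remove", actual, pos))
            pos += 2
        else:
            raise ValueError(f"cannot recover: expected {expected}, got {actual}")
    return repairs

# _RealProperty1("Real1", Color = (1,1,1,1)     -> RPAREN missing
missing = ["IDENTIFIER", "LPAREN", "STRING_LITERAL", "COMMA",
           "COLOR_KEYWORD", "EQUALS", "VECTOR"]
# _RealProperty1("Real1", Color)) = (1,1,1,1)   -> extra RPAREN
extra = ["IDENTIFIER", "LPAREN", "STRING_LITERAL", "COMMA",
         "COLOR_KEYWORD", "RPAREN", "RPAREN", "EQUALS", "VECTOR"]

insert_repairs = match_with_repair(PROPERTY, missing)
remove_repairs = match_with_repair(PROPERTY, extra)
```

Each repair would become an error element in the tree, so the editor can still highlight the exact spot while the rest of the file parses normally.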
@citizenmatt
Error production rules
• Create a rule that anticipates an error
• E.g. consume any tokens that shouldn’t be there
emptyBlock:
  LBRACE
  errorElementWithoutRBrace*
  RBRACE
;
@citizenmatt
Problem #6
Incremental lexing and parsing
@citizenmatt
What’s the problem?
• Don’t parse entire file on every change
• Only reparse smallest subtree that encloses change

Block nodes (method bodies, classes, etc. Not if, for, etc.)
• Avoid re-lexing the entire file, too
@citizenmatt
Incremental lexing
• Requires a cache of the original token stream

Token type, offsets and state of lexer (int)
• Copy cached tokens up to change position
• Restart lexer at change position with known state from cache
• Lex until we can match tail of cached tokens
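A sketch of those steps in Python. It assumes a stateless stand-in lexer (so there is no lexer state to cache, only token types, offsets and text) and describes an edit as "old text from `change_start` to `old_end` was replaced, and the replacement ends at `new_end` in the new text". The function and parameter names are made up for illustration.

```python
import re
from typing import NamedTuple

class Token(NamedTuple):
    type: str
    offset: int
    text: str

RULES = re.compile(r"(?P<WORD>\w+)|(?P<WHITESPACE>\s+)|(?P<PUNCT>[^\w\s])")

def lex(text, start=0):
    """Stateless stand-in lexer that can be restarted at any token boundary."""
    for m in RULES.finditer(text, start):
        yield Token(m.lastgroup, m.start(), m.group())

def relex(old_tokens, new_text, change_start, old_end, new_end):
    delta = new_end - old_end
    # 1. Copy cached tokens that end strictly before the change. A token that
    #    touches the change is re-lexed, because the edit may have extended it.
    head = [t for t in old_tokens if t.offset + len(t.text) < change_start]
    restart = head[-1].offset + len(head[-1].text) if head else 0
    # 2. Index cached tokens after the change by their offset in the new text.
    tail = {t.offset + delta: i for i, t in enumerate(old_tokens) if t.offset >= old_end}
    result = list(head)
    # 3. Lex from the restart position until a token lines up with the cached tail.
    for tok in lex(new_text, restart):
        i = tail.get(tok.offset)
        if i is not None:
            # Same boundary, identical remaining text, stateless lexer: reuse the rest.
            result.extend(Token(t.type, t.offset + delta, t.text) for t in old_tokens[i:])
            return result
        result.append(tok)
    return result

old_text = "foo bar baz"
new_text = "foo quux baz"  # "bar" (offsets 4..7) replaced by "quux" (offsets 4..8)
old_tokens = list(lex(old_text))
new_tokens = relex(old_tokens, new_text, change_start=4, old_end=7, new_end=8)
```

With a lexer that has states, the cached state at the restart token would seed the lexer, and the sync check would also compare the lexer state against the cached one.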
@citizenmatt
Incremental parsing
• Walk up syntax tree, find nearest element that can reparse and that encompasses change

E.g. method/class body
• Find start of block

E.g. opening LBRACE ‘{’
• Use updated cached lexer to find end of block

E.g. closing RBRACE ‘}’
• Parse block, add new element into tree

Uses custom entry point into parser
@citizenmatt
Problem #7
Composable languages
@citizenmatt
Three types
• Injected languages

E.g. self-contained islands in a string literal (regex)
• Inherited languages

E.g. TypeScript is a superset of JavaScript
• Nested languages

E.g. JavaScript/CSS nested inside HTML. Razor and C#
@citizenmatt
Injected languages
• Build a parse tree for the contents of another node

E.g. ShaderLab CG_PROGRAM, regular expressions, …
• Provides syntax highlighting, code completion, etc.
• Attaches a new parse tree to the node of another tree
• Changes to injected tree persisted to string and pushed as change to the owning tree
• Changes to owning tree cause full reparse of injected language
@citizenmatt
Inherited languages
• E.g. TypeScript is a superset of JavaScript
• TypeScriptParser derives from JavaScriptParser

Share a lexer
• Custom hand rolled parsers

Recursive descent
• Easier to inherit and override key methods

Gang of Four Template Method pattern
• Also XamlParser, MSBuildParser, WebConfigParser

Custom XML parsers
@citizenmatt
Nested languages
• E.g. .aspx, .cshtml - HTML superset, with C# “islands”
• ReSharper parses .aspx/.cshtml file

Builds parse tree for ASPX/Razor syntax
• HTML superset requires lexer superset
• HtmlCompoundLexer lexes “outer” language’s tokens

When it encounters HTML, switches to standard HTML lexer
• How to handle C# islands?
@citizenmatt
Secondary documents
• ASPX/Razor - C# islands
• Create secondary in-memory C# file

Mirrors what gets generated when .aspx file is compiled
• Maps C# islands in .aspx to in-memory C# file
• Inspections, code completion, etc. work through the mapping
@citizenmatt
How do you parse a file?
@citizenmatt
DON’T
@citizenmatt
Links
https://github.com/JetBrains/resharper-unity
Generating Fast, Error Recovering Parsers

http://www.dtic.mil/dtic/tr/fulltext/u2/a196581.pdf
Effective and Comfortable Error Recovery in Recursive Descent Parsers

http://www.cocolab.com/products/cocktail/doc.pdf/ell.pdf
The Definitive ANTLR 4 Reference - Terence Parr

Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 

How to Parse a File (DDD North 2017)