Use PEG to Write a Programming Language Parser
Outline
• Who am I? Why did I do this?
• Introduction to PEG
• Introduction to programming languages
• Write a parser in PEG
• No demo QQ
About Me
• 葉闆, Yodalee <lc85301@gmail.com>
• Studied EE in college and microwave engineering in graduate school; now a rookie engineer at Synopsys.
Github: yodalee
Blogger: http://yodalee.blogspot.tw
Why Did I Do This
• “Understanding Computation: From Simple Machines to Impossible Programs”
• The book implements a programming language parser and a regular expression parser with Ruby Treetop, which is a PEG parser.
• I rewrote all the code in Rust, so I did a little research on PEG.
https://github.com/yodalee/computationbook-rust
Introduction to PEG
Parsing Expression Grammar, PEG
• Bryan Ford, <Parsing Expression Grammars: A Recognition-Based Syntactic Foundation>, 2004
• A replacement for Chomsky-style generative grammars that removes ambiguity from the grammar.
• Ambiguity is useful for modeling natural language, but not for a precise, unambiguous programming language.
    Language <- Subject Verb Noun
    Subject <- He | Lisa …
    Verb <- is | has | sees …
    Noun <- student | a toy …
PEG Basic Rule
• By definition a PEG is very similar to a CFG: it is composed of rules.
• Applying a rule has one of three outcomes:
  • Match success: the rule consumes input.
  • Match failure: the rule consumes no input.
  • As a predicate: the rule only reports success or failure and never consumes input.
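The three outcomes can be sketched with a hand-written matcher. This is only an illustration: the function names `literal`, `and_pred` and `not_pred` are made up here and do not come from any PEG library. An offset into the input stands for "how much has been consumed".

```rust
/// Try to match the literal `lit` at byte offset `pos`.
/// Success returns the new offset (input consumed); failure returns None
/// and the caller keeps using the old offset (nothing consumed).
fn literal(input: &str, pos: usize, lit: &str) -> Option<usize> {
    if input[pos..].starts_with(lit) {
        Some(pos + lit.len())
    } else {
        None
    }
}

/// And-predicate `&e`: succeeds if `e` matches, but never moves the offset.
fn and_pred(input: &str, pos: usize, lit: &str) -> Option<usize> {
    literal(input, pos, lit).map(|_| pos)
}

/// Not-predicate `!e`: succeeds if `e` does not match; never moves the offset.
fn not_pred(input: &str, pos: usize, lit: &str) -> Option<usize> {
    match literal(input, pos, lit) {
        Some(_) => None,
        None => Some(pos),
    }
}

fn main() {
    let input = "while x";
    println!("{:?}", literal(input, 0, "while"));  // success: offset advances
    println!("{:?}", literal(input, 0, "if"));     // failure: nothing consumed
    println!("{:?}", and_pred(input, 0, "while")); // predicate success: offset unchanged
    println!("{:?}", not_pred(input, 0, "while")); // predicate failure
}
```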
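PEG Basic Rule
• Replace choice ‘|’ with prioritized choice ‘/’.
• Consider the following:
  • CFG: A = "a" | "ab" matches either "a" or "ab".
  • PEG: A = "a" / "ab" can never use the second alternative: on input "ab" the first alternative already succeeds after consuming "a".
  • PEG: A = a* a never matches: a* is greedy and consumes every "a", leaving none for the final a.

    Operator      Meaning
    ""            String literal
    []            Character set
    .             Any character
    (e1 e2 ..)    Grouping
    e? e+ e*      Optional, repetition
    &e            And-predicate
    !e            Not-predicate
    e1 e2         Sequence
    e1 / e2       Prioritized choice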
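A minimal sketch of prioritized choice and greedy repetition, assuming literals are the only primitive. The helper names (`literal`, `choice`, `star`, `star_then_one`) are illustrative and not taken from any library.

```rust
/// Match `lit` at `pos`; return the offset after it, or None.
fn literal(input: &str, pos: usize, lit: &str) -> Option<usize> {
    if input[pos..].starts_with(lit) {
        Some(pos + lit.len())
    } else {
        None
    }
}

/// Prioritized choice `e1 / e2 / ...`: try the alternatives in order and
/// commit to the first one that succeeds.
fn choice(input: &str, pos: usize, alternatives: &[&str]) -> Option<usize> {
    alternatives.iter().find_map(|&lit| literal(input, pos, lit))
}

/// Greedy repetition `e*` (for a non-empty literal): consume as many copies
/// as possible and never give any of them back.
fn star(input: &str, mut pos: usize, lit: &str) -> usize {
    while let Some(next) = literal(input, pos, lit) {
        pos = next;
    }
    pos
}

/// `A <- e* e`: the star has already eaten every copy, so the last `e` fails.
fn star_then_one(input: &str, lit: &str) -> Option<usize> {
    let pos = star(input, 0, lit);
    literal(input, pos, lit)
}

fn main() {
    // "a" / "ab" stops after "a"; swapping the order lets "ab" win.
    println!("{:?}", choice("ab", 0, &["a", "ab"]));
    println!("{:?}", choice("ab", 0, &["ab", "a"]));
    println!("{:?}", star_then_one("aaa", "a"));
}
```

Ordering the longer alternative first is the usual way to make such a rule useful in a PEG.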
Some Examples
• NUMBER <- [1-9] [0-9]*
• COMMENT <- "//" (!"\n" .)* "\n"
• EXPRESSION <- TERM ([+-] TERM)*
  TERM <- FACTOR ([*/] FACTOR)*
• STAT_IF <-
    "if" COND "then" STATEMENT "else" STATEMENT /
    "if" COND "then" STATEMENT
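The NUMBER, TERM and EXPRESSION rules map almost one-to-one onto a hand-written recursive-descent parser. The sketch below is an assumption-laden illustration: FACTOR is not defined on the slide, so it is taken to be just NUMBER; whitespace is not handled; and the `Parser` struct and `eval` function are made-up names.

```rust
struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { src: src.as_bytes(), pos: 0 }
    }

    fn peek(&self) -> Option<u8> {
        self.src.get(self.pos).copied()
    }

    // NUMBER <- [1-9] [0-9]*
    fn number(&mut self) -> Option<i64> {
        let start = self.pos;
        match self.peek() {
            Some(b'1'..=b'9') => self.pos += 1,
            _ => return None,
        }
        while let Some(b'0'..=b'9') = self.peek() {
            self.pos += 1;
        }
        let text = std::str::from_utf8(&self.src[start..self.pos]).ok()?;
        match text.parse::<i64>() {
            Ok(value) => Some(value),
            Err(_) => {
                self.pos = start; // e.g. overflow: fail without consuming
                None
            }
        }
    }

    // TERM <- FACTOR ([*/] FACTOR)*   (FACTOR is assumed to be NUMBER)
    fn term(&mut self) -> Option<i64> {
        let mut value = self.number()?;
        while let Some(op @ (b'*' | b'/')) = self.peek() {
            let saved = self.pos;
            self.pos += 1;
            match self.number() {
                Some(rhs) if op == b'*' => value *= rhs,
                Some(rhs) => value /= rhs, // rhs >= 1 because NUMBER starts with [1-9]
                None => {
                    self.pos = saved; // the group failed: give the operator back
                    break;
                }
            }
        }
        Some(value)
    }

    // EXPRESSION <- TERM ([+-] TERM)*
    fn expression(&mut self) -> Option<i64> {
        let mut value = self.term()?;
        while let Some(op @ (b'+' | b'-')) = self.peek() {
            let saved = self.pos;
            self.pos += 1;
            match self.term() {
                Some(rhs) if op == b'+' => value += rhs,
                Some(rhs) => value -= rhs,
                None => {
                    self.pos = saved;
                    break;
                }
            }
        }
        Some(value)
    }
}

/// Parse the whole input as an EXPRESSION; leftover input counts as failure.
fn eval(src: &str) -> Option<i64> {
    let mut parser = Parser::new(src);
    let value = parser.expression()?;
    if parser.pos == src.len() { Some(value) } else { None }
}

fn main() {
    println!("{:?}", eval("2+3*4"));
    println!("{:?}", eval("007")); // NUMBER cannot start with 0
}
```

Splitting EXPRESSION and TERM into two rules is what gives `*` and `/` a higher precedence than `+` and `-`.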
PEG is not CFG
• PEG is equivalent to the Top-Down Parsing Language (TDPL).
• The language a^n b^n c^n is not context-free, but a PEG can recognize it with the and-predicate:
    S <- &(A "c") "a"+ B !.
    A <- "a" A? "b"
    B <- "b" B? "c"
• In a CFG, A <- aAa | a matches any odd number of "a".
  In a PEG, A <- aAa / a matches only 2^n - 1 "a".
• Whether every CFG can be parsed by a PEG is an open problem.
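A hand-written recognizer for a^n b^n c^n (n >= 1) can make the role of the and-predicate concrete. This sketch follows the commonly cited formulation S <- &(A "c") "a"+ B !., A <- "a" A? "b", B <- "b" B? "c". Requiring a "c" right after A matters: if A were allowed to match the empty string, an input such as "aaabbcc" would slip through the lookahead. The function names `nested` and `is_anbncn` are assumptions made for this example.

```rust
/// X <- open X? close
/// Returns the offset after the match, or None if the rule fails.
fn nested(s: &[u8], pos: usize, open: u8, close: u8) -> Option<usize> {
    if s.get(pos) != Some(&open) {
        return None;
    }
    // X? : use the inner match if there is one, otherwise stay right after `open`
    let after_inner = nested(s, pos + 1, open, close).unwrap_or(pos + 1);
    if s.get(after_inner) == Some(&close) {
        Some(after_inner + 1)
    } else {
        None
    }
}

/// S <- &(A "c") "a"+ B !.
fn is_anbncn(input: &str) -> bool {
    let s = input.as_bytes();

    // &(A "c"): look ahead to check that the a's and b's balance and that a
    // 'c' follows. Nothing is consumed, so parsing restarts at offset 0.
    match nested(s, 0, b'a', b'b') {
        Some(end) if s.get(end) == Some(&b'c') => {}
        _ => return false,
    }

    // "a"+ : the predicate above already guarantees at least one 'a'
    let mut pos = 0;
    while s.get(pos) == Some(&b'a') {
        pos += 1;
    }

    // B must match and reach the end of the input (the `!.` part)
    nested(s, pos, b'b', b'c') == Some(s.len())
}

fn main() {
    for input in ["abc", "aabbcc", "aaabbcc", "aabbc"] {
        println!("{:8} -> {}", input, is_anbncn(input));
    }
}
```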
Using PEG
• Many libraries support PEG:
  • Rust: rust-peg, pest, nom-peg …
  • C++: PEGTL, Boost …
  • Ruby: kpeg, raabro, Treetop …
  • Python: pyPEG, parsimonious …
  • Haskell: Peggy …
  • …
• So why Rust?
Introduction to Programming Language
Simple Language
• 3 types of statements: assign, if else, while.
• Supports integer arithmetic.
• Supports pair, list, and function with one argument.
Simple, but we can actually do some complex things, like recursion and map.
    factorfun = function factor(x) {
        if (x > 1) { x * factor(x - 1) } else { 1 }
    }
    result = factorfun(10); // 3628800

    function last(l) {
        if (isnothing(snd(l))) {
            fst(l)
        } else {
            last(snd(l))
        }
    }
Abstract Syntax Tree
• Use a Rust enum whose variants carry a payload.
• “Programming” then looks like this:
    pub enum Node {
        Number(i64),
        Boolean(bool),
        Add(Box<Node>, Box<Node>),
        Subtract(Box<Node>, Box<Node>),
        LT(Box<Node>, Box<Node>),
        …
    }
    let n = Node::add(Node::number(3), Node::number(4));
[Figure: AST drawn as a tree, with LT at the root and Add(3, 4) and 8 as its children]
Abstract Syntax Tree
• All statements are Nodes as well:
    pub enum Node {
        Variable(String),
        Assign(String, Box<Node>),
        If(Box<Node>, Box<Node>, Box<Node>),
        While(Box<Node>, Box<Node>),
        …
    }
Pair, List and Nothing
• Node::pair(Node::number(3), Node::number(4))
• List [3,4,5] = pair(3, pair(4, pair(5, nothing)))
• Nothing special
[Figure: Pair(3, 4) drawn as a tree; the list [3,4,5] drawn as nested Pairs ending in Nothing]
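The list encoding can be sketched as follows. Only `Number`, `Pair` and `Nothing` are modelled; the `list` helper and the Rust version of `last` are assumptions added for illustration, mirroring the `last` function written in the toy language earlier.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Number(i64),
    Pair(Box<Node>, Box<Node>),
    Nothing,
}

impl Node {
    pub fn number(v: i64) -> Box<Node> {
        Box::new(Node::Number(v))
    }
    pub fn pair(l: Box<Node>, r: Box<Node>) -> Box<Node> {
        Box::new(Node::Pair(l, r))
    }
    pub fn nothing() -> Box<Node> {
        Box::new(Node::Nothing)
    }

    /// Build pair(x0, pair(x1, ... pair(xn, nothing))) from a slice
    /// (hypothetical helper, not shown on the slides).
    pub fn list(items: &[i64]) -> Box<Node> {
        items
            .iter()
            .rev()
            .fold(Node::nothing(), |tail, &x| Node::pair(Node::number(x), tail))
    }
}

/// Walk the second component until it is Nothing, then return the first one.
fn last(l: &Node) -> &Node {
    match l {
        Node::Pair(head, tail) => {
            if **tail == Node::Nothing {
                &**head
            } else {
                last(tail)
            }
        }
        other => other,
    }
}

fn main() {
    let list = Node::list(&[3, 4, 5]);
    println!("{:?}", list);
    println!("{:?}", last(&list));
}
```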
Environment and Machine
• Environment stores a HashMap<String, Box<Node>>, with an <add> and <get> interface.
• A machine takes an AST and an environment, and evaluates the AST inside the machine.
    pub struct Environment {
        pub vars: HashMap<String, Box<Node>>,
    }
    pub struct Machine {
        pub environment: Environment,
        expression: Box<Node>,
    }
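One possible shape for the <add> and <get> interface is sketched below. The exact signatures are not shown on the slides, so `new`, `add(&str, Box<Node>)` and `get(&str) -> Option<Box<Node>>` are assumptions, and `Node` is reduced to a single variant to keep the example short.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Number(i64),
}

#[derive(Debug, Clone)]
pub struct Environment {
    pub vars: HashMap<String, Box<Node>>,
}

impl Environment {
    pub fn new() -> Self {
        Environment { vars: HashMap::new() }
    }

    /// Bind `name` to `node`, replacing any previous binding.
    pub fn add(&mut self, name: &str, node: Box<Node>) {
        self.vars.insert(name.to_string(), node);
    }

    /// Look up `name`; returns a copy of the stored node, or None if unbound.
    pub fn get(&self, name: &str) -> Option<Box<Node>> {
        self.vars.get(name).cloned()
    }
}

fn main() {
    let mut env = Environment::new();
    env.add("x", Box::new(Node::Number(3)));
    println!("{:?}", env.get("x"));
    println!("{:?}", env.get("y"));
}
```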
Evaluate the AST
• Add an evaluate function to every AST node using a trait.
• The result is a new Node.
    fn evaluate(&self, env: &mut Environment) -> Box<Node> {
        match *self {
            Node::Add(ref l, ref r) => {
                Node::number(l.evaluate(env).value() + r.evaluate(env).value())
            }
            …
        }
    }
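A runnable reduction of the evaluator, covering only numbers, booleans, variables, Add and LT. It is a sketch under assumptions: `evaluate` is written as an inherent method rather than a trait, the small constructors and `value()` are guessed from how the slides call them, and an unbound variable simply panics.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Number(i64),
    Boolean(bool),
    Variable(String),
    Add(Box<Node>, Box<Node>),
    LT(Box<Node>, Box<Node>),
}

pub struct Environment {
    pub vars: HashMap<String, Box<Node>>,
}

impl Node {
    // Small constructors so trees can be written without Box::new noise.
    pub fn number(v: i64) -> Box<Node> { Box::new(Node::Number(v)) }
    pub fn boolean(b: bool) -> Box<Node> { Box::new(Node::Boolean(b)) }
    pub fn variable(name: &str) -> Box<Node> { Box::new(Node::Variable(name.to_string())) }
    pub fn add(l: Box<Node>, r: Box<Node>) -> Box<Node> { Box::new(Node::Add(l, r)) }
    pub fn lt(l: Box<Node>, r: Box<Node>) -> Box<Node> { Box::new(Node::LT(l, r)) }

    /// Unwrap a number; any other node is a type error and panics here.
    pub fn value(&self) -> i64 {
        match self {
            Node::Number(v) => *v,
            other => panic!("not a number: {:?}", other),
        }
    }

    /// Evaluating a node always produces a new node.
    pub fn evaluate(&self, env: &mut Environment) -> Box<Node> {
        match self {
            Node::Number(_) | Node::Boolean(_) => Box::new(self.clone()),
            Node::Variable(name) => env.vars[name].clone(),
            Node::Add(l, r) => {
                Node::number(l.evaluate(env).value() + r.evaluate(env).value())
            }
            Node::LT(l, r) => {
                Node::boolean(l.evaluate(env).value() < r.evaluate(env).value())
            }
        }
    }
}

/// Evaluate `x + 4 < 8` with x bound to the given value.
fn demo(x: i64) -> Box<Node> {
    let mut env = Environment { vars: HashMap::new() };
    env.vars.insert("x".to_string(), Node::number(x));
    let ast = Node::lt(Node::add(Node::variable("x"), Node::number(4)), Node::number(8));
    ast.evaluate(&mut env)
}

fn main() {
    println!("{:?}", demo(3));
}
```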
Evaluate the AST
• How to evaluate a While node (condition, body)?
• Evaluate the condition; if it is true, evaluate the body and then the While node itself again.
    x = 3;
    while (x < 9) { x = x * 2; }

    Evaluate x = 3
    Evaluate while (x < 9) x = x * 2
    Evaluate x = x * 2
    Evaluate while (x < 9) x = x * 2
    Evaluate x = x * 2
    Evaluate while (x < 9) x = x * 2
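The While rule can be sketched as a recursive evaluator. This is a simplified stand-in, not the talk's code: it uses a free function and a plain `HashMap<String, Node>` as the environment, and the `Multiply` and `DoNothing` variants are assumptions needed to run the `x = x * 2` program.

```rust
use std::collections::HashMap;

type Env = HashMap<String, Node>;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Boolean(bool),
    Variable(String),
    Multiply(Box<Node>, Box<Node>),
    LT(Box<Node>, Box<Node>),
    Assign(String, Box<Node>),
    While(Box<Node>, Box<Node>),
    DoNothing, // assumed "statement finished" marker
}

fn evaluate(node: &Node, env: &mut Env) -> Node {
    match node {
        Node::Number(_) | Node::Boolean(_) | Node::DoNothing => node.clone(),
        Node::Variable(name) => env[name].clone(),
        Node::Multiply(l, r) => match (evaluate(l, env), evaluate(r, env)) {
            (Node::Number(a), Node::Number(b)) => Node::Number(a * b),
            other => panic!("cannot multiply {:?}", other),
        },
        Node::LT(l, r) => match (evaluate(l, env), evaluate(r, env)) {
            (Node::Number(a), Node::Number(b)) => Node::Boolean(a < b),
            other => panic!("cannot compare {:?}", other),
        },
        Node::Assign(name, expr) => {
            let value = evaluate(expr, env);
            env.insert(name.clone(), value);
            Node::DoNothing
        }
        Node::While(cond, body) => {
            if evaluate(cond, env) == Node::Boolean(true) {
                evaluate(body, env); // run the body once ...
                evaluate(node, env)  // ... then evaluate the same While node again
            } else {
                Node::DoNothing
            }
        }
    }
}

/// Runs `x = 3; while (x < 9) { x = x * 2; }` and returns the final environment.
fn run_demo() -> Env {
    let x = || Box::new(Node::Variable("x".to_string()));
    let mut env = Env::new();
    evaluate(&Node::Assign("x".to_string(), Box::new(Node::Number(3))), &mut env);
    let cond = Node::LT(x(), Box::new(Node::Number(9)));
    let body = Node::Assign(
        "x".to_string(),
        Box::new(Node::Multiply(x(), Box::new(Node::Number(2)))),
    );
    evaluate(&Node::While(Box::new(cond), Box::new(body)), &mut env);
    env
}

fn main() {
    println!("{:?}", run_demo());
}
```

Because the loop is expressed as recursion, a very long-running loop would grow the Rust call stack; a production evaluator would likely use an explicit `while` instead.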
Function
• A function is also a type of Node. Upon evaluation, the function is wrapped into a Closure together with the environment at that time.
• A Call evaluates the function body with the closure's environment.
    Node::Fun(String, String, Box<Node>)
    Node::Closure(Environment, Box<Node>)

    fn evaluate(&self, env: &mut Environment) -> Box<Node> {
        match *self {
            Node::Fun(ref name, ref arg, ref body) => {
                Node::closure(env.clone(), Box::new(self.clone()))
            }
            …
        }
    }
Call a Function
    fn evaluate(&self, env: &mut Environment) -> Box<Node> {
        match *self {
            Node::Call(ref closure, ref arg) => {
                match **closure {
                    // closure_env is the environment captured when the function was evaluated
                    Node::Closure(ref closure_env, ref fun) => {
                        if let Node::Fun(funname, argname, body) = *fun.clone() {
                            let mut newenv = closure_env.clone();
                            // bind the function's own name so that it can recurse
                            newenv.add(&funname, closure.evaluate(env));
                            newenv.add(&argname, arg.evaluate(env));
                            body.evaluate(&mut newenv)
                        }
                    }
                }
            }
        }
    }
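A self-contained sketch of closures and calls, under simplifying assumptions: the environment is a plain `HashMap<String, Node>`, `evaluate` is a free function taking `&Env` (there is no assignment in this subset), and errors panic. The variant shapes follow the `Fun` and `Closure` nodes described above.

```rust
use std::collections::HashMap;

type Env = HashMap<String, Node>;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Variable(String),
    Add(Box<Node>, Box<Node>),
    /// Fun(function name, argument name, body)
    Fun(String, String, Box<Node>),
    /// Closure(captured environment, the Fun node)
    Closure(Env, Box<Node>),
    /// Call(expression yielding a closure, argument expression)
    Call(Box<Node>, Box<Node>),
}

fn evaluate(node: &Node, env: &Env) -> Node {
    match node {
        Node::Number(_) | Node::Closure(..) => node.clone(),
        Node::Variable(name) => env[name].clone(),
        Node::Add(l, r) => match (evaluate(l, env), evaluate(r, env)) {
            (Node::Number(a), Node::Number(b)) => Node::Number(a + b),
            other => panic!("cannot add {:?}", other),
        },
        // Evaluating a function captures the environment at that time.
        Node::Fun(..) => Node::Closure(env.clone(), Box::new(node.clone())),
        Node::Call(callee, arg) => {
            let closure = evaluate(callee, env);
            let arg_value = evaluate(arg, env);
            match &closure {
                Node::Closure(captured, fun) => match &**fun {
                    Node::Fun(fun_name, arg_name, body) => {
                        // The body runs in the captured environment, extended with
                        // the function itself (for recursion) and the argument.
                        let mut new_env = captured.clone();
                        new_env.insert(fun_name.clone(), closure.clone());
                        new_env.insert(arg_name.clone(), arg_value);
                        evaluate(body, &new_env)
                    }
                    other => panic!("closure does not wrap a function: {:?}", other),
                },
                other => panic!("not callable: {:?}", other),
            }
        }
    }
}

/// Defines `function addy(x) { x + y }` while y = 10, then calls it with
/// `arg` from an environment in which y is not defined at all.
fn call_addy(arg: i64) -> Node {
    let var = |n: &str| Box::new(Node::Variable(n.to_string()));
    let mut defining_env = Env::new();
    defining_env.insert("y".to_string(), Node::Number(10));
    let addy = Node::Fun(
        "addy".to_string(),
        "x".to_string(),
        Box::new(Node::Add(var("x"), var("y"))),
    );
    let closure = evaluate(&addy, &defining_env);
    let call = Node::Call(Box::new(closure), Box::new(Node::Number(arg)));
    evaluate(&call, &Env::new())
}

fn main() {
    println!("{:?}", call_addy(1));
}
```

The call succeeds even though the calling environment is empty, because `y` travels with the closure.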
Free Variable
• Compute the free variables of a function, so that the closure does not have to copy the whole environment. The relevant nodes are:
  • Node::Variable
  • Node::Assign
  • Node::Fun
    function addx(x) { function addy(y) { x + y } }
    -> no free variables
    function addy(x) { x + y }
    -> free variable y
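Free-variable analysis is a short recursive walk over the AST. The sketch below is an illustration under assumptions: it handles only Number, Variable, Add and Fun (Assign is left out because its scoping rule is not spelled out on the slide), and the `var`, `add` and `fun` helpers are made-up conveniences.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone)]
enum Node {
    Number(i64),
    Variable(String),
    Add(Box<Node>, Box<Node>),
    /// Fun(function name, argument name, body)
    Fun(String, String, Box<Node>),
}

/// Names that are used inside `node` but not bound by an enclosing function.
fn free_vars(node: &Node) -> BTreeSet<String> {
    match node {
        Node::Number(_) => BTreeSet::new(),
        Node::Variable(name) => BTreeSet::from([name.clone()]),
        Node::Add(l, r) => {
            let mut vars = free_vars(l);
            vars.extend(free_vars(r));
            vars
        }
        Node::Fun(fun_name, arg_name, body) => {
            // The function's own name and its argument are bound inside the body.
            let mut vars = free_vars(body);
            vars.remove(fun_name);
            vars.remove(arg_name);
            vars
        }
    }
}

// Hypothetical helpers to keep the examples readable.
fn var(name: &str) -> Node {
    Node::Variable(name.to_string())
}
fn add(l: Node, r: Node) -> Node {
    Node::Add(Box::new(l), Box::new(r))
}
fn fun(name: &str, arg: &str, body: Node) -> Node {
    Node::Fun(name.to_string(), arg.to_string(), Box::new(body))
}

fn main() {
    // function addx(x) { function addy(y) { x + y } }
    let addx = fun("addx", "x", fun("addy", "y", add(var("x"), var("y"))));
    // function addy(x) { x + y }
    let addy = fun("addy", "x", add(var("x"), var("y")));
    println!("{:?}", free_vars(&addx));
    println!("{:?}", free_vars(&addy));
}
```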
Call a Function
    if let Node::Fun(funname, argname, body) = *fun.clone() {
        // start from an empty environment and copy only the free variables
        let mut newenv = Environment { vars: HashMap::new() };
        for var in free_vars(fun) {
            newenv.add(var, env.get(var));
        }
        newenv.add(&funname, closure.evaluate(env));
        newenv.add(&argname, arg.evaluate(env));
        body.evaluate(&mut newenv)
    }
What is a Language?
• We make some concepts abstract, like a virtual machine. Designing a language is designing the abstraction.
• The function “evaluate” implements the concept; of course, we can implement it as anything, such as returning 42 on every evaluation.

    Concept     Simple (virtual machine)   Real machine
    Number 3    Node::number(3)            0b11 in memory
    +           Node::add(l, r)            add r1 r2
    Choice      Node::if                   branch command
What is a Language?
• Abstraction brings some precision issues, like floating point. We have no way to express the concept of <infinite>.
• We can create a language for geometry as below; which representation of a line is best?
• Consider every pro and con the abstraction will bring.

    Concept        In programming language
    Point          (x: u32, y: u32)
    Line           (Point, Point)
                   (Point, Slope)
                   (Point, Point, type{vertical, horizontal, angled})
    Intersection   Calculate intersection
Implement a Parser with PEG
The Pest Package
• Rust Pest
  • https://github.com/pest-parser/pest
• My simple language parser grammar is at:
  • https://github.com/yodalee/simplelang
• Parsing flow:
  Source code -> Parser (built from the grammar) -> Pest Pair structure -> Simple AST
The Pest Package
    use pest::Parser;
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "simple.pest"]
    struct SimpleParser;

    let pairs = SimpleParser::parse(Rule::simple, "<source code>");
• A pair represents the parse result of a rule.
  • Pair.as_rule() => the rule
  • Pair.as_span() => the matched span
  • Pair.as_str() => the matched text
  • Pair.into_inner() => the sub-rules
Grammar <-> Build AST
    number   = { '1'..'9' ~ ('0'..'9')* }
    variable = { ('a'..'z' | 'A'..'Z') ~ ('a'..'z' | 'A'..'Z' | '0'..'9')* }
    call     = { variable ~ "(" ~ expr ~ ")" }
    factor   = { "(" ~ expr ~ ")" | call | variable | number }

    fn build_factor(pair: Pair<Rule>) -> Box<Node> {
        match pair.as_rule() {
            Rule::number => Node::number(pair.as_str().parse::<i64>().unwrap()),
            Rule::variable => Node::variable(pair.as_str()),
            Rule::expr => ...,
            Rule::call => ...,
        }
    }
Climb the Expression
• An expression can be written as a single rule:
    expr = { factor ~ (op_binary ~ factor)* }
• Pest provides a precedence-climbing helper; you only define:
  • A build-factor function => creates a Factor node
  • An infix-rule function => creates an Operator node
  • Operator precedence => a vector of operators with their precedence and left/right associativity
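To show what precedence climbing does with a flat `factor (op factor)*` sequence, here is a standalone sketch that does not use pest at all. Everything in it is an assumption for illustration: the `Token` and `Expr` types, the operator table in `op_info`, and the `^` operator, which is included only to demonstrate right associativity.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Bin(Box<Expr>, char, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Num(i64),
    Op(char),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Assoc {
    Left,
    Right,
}

/// Operator table: (precedence, associativity). Higher binds tighter.
fn op_info(op: char) -> Option<(u8, Assoc)> {
    match op {
        '+' | '-' => Some((1, Assoc::Left)),
        '*' | '/' => Some((2, Assoc::Left)),
        '^' => Some((3, Assoc::Right)),
        _ => None,
    }
}

/// Parse `factor (op factor)*` starting at `pos`, only accepting operators
/// whose precedence is at least `min_prec`. Panics on malformed input.
fn climb(tokens: &[Token], pos: &mut usize, min_prec: u8) -> Expr {
    // "build factor": in this sketch a factor is just a number
    let mut lhs = match tokens[*pos] {
        Token::Num(n) => Expr::Num(n),
        Token::Op(c) => panic!("expected a number, found operator {}", c),
    };
    *pos += 1;

    while let Some(&Token::Op(op)) = tokens.get(*pos) {
        let (prec, assoc) = match op_info(op) {
            Some(info) if info.0 >= min_prec => info,
            _ => break,
        };
        *pos += 1;
        // Left-associative operators must not absorb another operator of the
        // same precedence on their right-hand side; right-associative ones may.
        let next_min = if assoc == Assoc::Left { prec + 1 } else { prec };
        let rhs = climb(tokens, pos, next_min);
        // "infix rule": combine both sides into an operator node
        lhs = Expr::Bin(Box::new(lhs), op, Box::new(rhs));
    }
    lhs
}

fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Bin(l, op, r) => {
            let (a, b) = (eval(l), eval(r));
            match *op {
                '+' => a + b,
                '-' => a - b,
                '*' => a * b,
                '/' => a / b,
                '^' => a.pow(b as u32),
                other => panic!("unknown operator {}", other),
            }
        }
    }
}

fn main() {
    use Token::{Num, Op};
    // 1 + 2 * 3 - 4
    let tokens = [Num(1), Op('+'), Num(2), Op('*'), Num(3), Op('-'), Num(4)];
    let tree = climb(&tokens, &mut 0, 1);
    println!("{:?} = {}", tree, eval(&tree));
}
```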
Challenges
• Producing good error messages on syntax errors.
• How to deal with optional parts, like the clauses of a C for loop?
• A more systematic way to deal with a large language, like C.
    // Original CFG (left-recursive)
    compound_statement <- block_list
    block_list <- block_list block | ε
    block <- declaration_list | statement_list
    declaration_list <- declaration_list declaration | ε
    statement_list <- statement_list statement | ε
    // Wrong PEG: block can match empty input, so block* never makes progress
    compound_statement <- block*
    block <- declaration* ~ statement*
    // Correct PEG
    compound_statement <- block*
    block <- (declaration | statement)+
Conclusion
• PEG is a newer grammar formalism than CFG and in some respects more powerful: it is unambiguous and can recognize some non-context-free languages. It is fast and convenient for creating a small language parser.
• The most important concept in programming languages? Abstraction.
• Is there a best abstraction? No. It is engineering.
Reference
• <Parsing Expression Grammars: A Recognition-Based Syntactic Foundation>, Bryan Ford
• <Understanding Computation: From Simple Machines to Impossible Programs>
• <Programming Languages, Part B> on Coursera, University of Washington
Thank You for Listening
IB502 1430 – 1510
Build Yourself a Nixie Tube Clock

More Related Content

What's hot

Merge sort algorithm power point presentation
Merge sort algorithm power point presentationMerge sort algorithm power point presentation
Merge sort algorithm power point presentation
University of Science and Technology Chitttagong
 
Teach your kids how to program with Python and the Raspberry Pi
Teach your kids how to program with Python and the Raspberry PiTeach your kids how to program with Python and the Raspberry Pi
Teach your kids how to program with Python and the Raspberry Pi
Juan Gomez
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
Nikita Popov
 
0-1 KNAPSACK PROBLEM
0-1 KNAPSACK PROBLEM0-1 KNAPSACK PROBLEM
0-1 KNAPSACK PROBLEM
i i
 
GitHub - How to "pull request"
GitHub - How to "pull request"GitHub - How to "pull request"
GitHub - How to "pull request"
Naoki Kanazawa
 
Functional Programming with Groovy
Functional Programming with GroovyFunctional Programming with Groovy
Functional Programming with Groovy
Arturo Herrero
 
Python Course for Beginners
Python Course for BeginnersPython Course for Beginners
Python Course for Beginners
Nandakumar P
 
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
Linaro
 
Golang getting started
Golang getting startedGolang getting started
Golang getting started
Harshad Patil
 
Лекция 1. Анализ эффективности алгоритмов
Лекция 1. Анализ эффективности алгоритмовЛекция 1. Анализ эффективности алгоритмов
Лекция 1. Анализ эффективности алгоритмов
Mikhail Kurnosov
 
Regular expression
Regular expressionRegular expression
Regular expression
Rajon
 
Type Classes in Scala and Haskell
Type Classes in Scala and HaskellType Classes in Scala and Haskell
Type Classes in Scala and Haskell
Hermann Hueck
 
Chapter 3 2
Chapter 3 2Chapter 3 2
Chapter 3 2
Ch Farhan
 
Rust: Systems Programming for Everyone
Rust: Systems Programming for EveryoneRust: Systems Programming for Everyone
Rust: Systems Programming for Everyone
C4Media
 
Implementing the IO Monad in Scala
Implementing the IO Monad in ScalaImplementing the IO Monad in Scala
Implementing the IO Monad in Scala
Hermann Hueck
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notationsEhtisham Ali
 
Basics of Computer Coding: Understanding Coding Languages
Basics of Computer Coding: Understanding Coding LanguagesBasics of Computer Coding: Understanding Coding Languages
Basics of Computer Coding: Understanding Coding Languages
Brian Pichman
 
Problema da Mochila 0-1 (Knapsack problem)
Problema da Mochila 0-1 (Knapsack problem)Problema da Mochila 0-1 (Knapsack problem)
Problema da Mochila 0-1 (Knapsack problem)
Marcos Castro
 
how to calclute time complexity of algortihm
how to calclute time complexity of algortihmhow to calclute time complexity of algortihm
how to calclute time complexity of algortihmSajid Marwat
 
Análise de Algoritmos - Método Guloso
Análise de Algoritmos - Método GulosoAnálise de Algoritmos - Método Guloso
Análise de Algoritmos - Método Guloso
Delacyr Ferreira
 

What's hot (20)

Merge sort algorithm power point presentation
Merge sort algorithm power point presentationMerge sort algorithm power point presentation
Merge sort algorithm power point presentation
 
Teach your kids how to program with Python and the Raspberry Pi
Teach your kids how to program with Python and the Raspberry PiTeach your kids how to program with Python and the Raspberry Pi
Teach your kids how to program with Python and the Raspberry Pi
 
A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
 
0-1 KNAPSACK PROBLEM
0-1 KNAPSACK PROBLEM0-1 KNAPSACK PROBLEM
0-1 KNAPSACK PROBLEM
 
GitHub - How to "pull request"
GitHub - How to "pull request"GitHub - How to "pull request"
GitHub - How to "pull request"
 
Functional Programming with Groovy
Functional Programming with GroovyFunctional Programming with Groovy
Functional Programming with Groovy
 
Python Course for Beginners
Python Course for BeginnersPython Course for Beginners
Python Course for Beginners
 
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
BKK16-503 Undefined Behavior and Compiler Optimizations – Why Your Program St...
 
Golang getting started
Golang getting startedGolang getting started
Golang getting started
 
Лекция 1. Анализ эффективности алгоритмов
Лекция 1. Анализ эффективности алгоритмовЛекция 1. Анализ эффективности алгоритмов
Лекция 1. Анализ эффективности алгоритмов
 
Regular expression
Regular expressionRegular expression
Regular expression
 
Type Classes in Scala and Haskell
Type Classes in Scala and HaskellType Classes in Scala and Haskell
Type Classes in Scala and Haskell
 
Chapter 3 2
Chapter 3 2Chapter 3 2
Chapter 3 2
 
Rust: Systems Programming for Everyone
Rust: Systems Programming for EveryoneRust: Systems Programming for Everyone
Rust: Systems Programming for Everyone
 
Implementing the IO Monad in Scala
Implementing the IO Monad in ScalaImplementing the IO Monad in Scala
Implementing the IO Monad in Scala
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notations
 
Basics of Computer Coding: Understanding Coding Languages
Basics of Computer Coding: Understanding Coding LanguagesBasics of Computer Coding: Understanding Coding Languages
Basics of Computer Coding: Understanding Coding Languages
 
Problema da Mochila 0-1 (Knapsack problem)
Problema da Mochila 0-1 (Knapsack problem)Problema da Mochila 0-1 (Knapsack problem)
Problema da Mochila 0-1 (Knapsack problem)
 
how to calclute time complexity of algortihm
how to calclute time complexity of algortihmhow to calclute time complexity of algortihm
how to calclute time complexity of algortihm
 
Análise de Algoritmos - Método Guloso
Análise de Algoritmos - Método GulosoAnálise de Algoritmos - Método Guloso
Análise de Algoritmos - Método Guloso
 

Similar to Use PEG to Write a Programming Language Parser

Functional Programming In Java
Functional Programming In JavaFunctional Programming In Java
Functional Programming In Java
Andrei Solntsev
 
20130530-PEGjs
20130530-PEGjs20130530-PEGjs
20130530-PEGjszuqqhi 2
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай Мозговой
Sigma Software
 
Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1
Robert Stern
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to Gophers
Alessandro Sanino
 
Python basic
Python basicPython basic
Python basic
Saifuddin Kaijar
 
What we can learn from Rebol?
What we can learn from Rebol?What we can learn from Rebol?
What we can learn from Rebol?
lichtkind
 
SymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performancesSymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performances
julien pauli
 
Functional programming ii
Functional programming iiFunctional programming ii
Functional programming ii
Prashant Kalkar
 
Дмитрий Верескун «Синтаксический сахар C#»
Дмитрий Верескун «Синтаксический сахар C#»Дмитрий Верескун «Синтаксический сахар C#»
Дмитрий Верескун «Синтаксический сахар C#»
SpbDotNet Community
 
Geeks Anonymes - Le langage Go
Geeks Anonymes - Le langage GoGeeks Anonymes - Le langage Go
Geeks Anonymes - Le langage Go
Geeks Anonymes
 
What can be done with Java, but should better be done with Erlang (@pavlobaron)
What can be done with Java, but should better be done with Erlang (@pavlobaron)What can be done with Java, but should better be done with Erlang (@pavlobaron)
What can be done with Java, but should better be done with Erlang (@pavlobaron)
Pavlo Baron
 
Python - Getting to the Essence - Points.com - Dave Park
Python - Getting to the Essence - Points.com - Dave ParkPython - Getting to the Essence - Points.com - Dave Park
Python - Getting to the Essence - Points.com - Dave Park
pointstechgeeks
 
Groovy Introduction - JAX Germany - 2008
Groovy Introduction - JAX Germany - 2008Groovy Introduction - JAX Germany - 2008
Groovy Introduction - JAX Germany - 2008
Guillaume Laforge
 
ppt7
ppt7ppt7
ppt7
callroom
 
ppt2
ppt2ppt2
ppt2
callroom
 
name name2 n
name name2 nname name2 n
name name2 n
callroom
 
ppt9
ppt9ppt9
ppt9
callroom
 

Similar to Use PEG to Write a Programming Language Parser (20)

Functional Programming In Java
Functional Programming In JavaFunctional Programming In Java
Functional Programming In Java
 
20130530-PEGjs
20130530-PEGjs20130530-PEGjs
20130530-PEGjs
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай Мозговой
 
Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1
 
The GO Language : From Beginners to Gophers
The GO Language : From Beginners to GophersThe GO Language : From Beginners to Gophers
The GO Language : From Beginners to Gophers
 
C Tutorials
C TutorialsC Tutorials
C Tutorials
 
Python basic
Python basicPython basic
Python basic
 
What we can learn from Rebol?
What we can learn from Rebol?What we can learn from Rebol?
What we can learn from Rebol?
 
SymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performancesSymfonyCon 2017 php7 performances
SymfonyCon 2017 php7 performances
 
Functional programming ii
Functional programming iiFunctional programming ii
Functional programming ii
 
Ch2
Ch2Ch2
Ch2
 
Дмитрий Верескун «Синтаксический сахар C#»
Дмитрий Верескун «Синтаксический сахар C#»Дмитрий Верескун «Синтаксический сахар C#»
Дмитрий Верескун «Синтаксический сахар C#»
 
Geeks Anonymes - Le langage Go
Geeks Anonymes - Le langage GoGeeks Anonymes - Le langage Go
Geeks Anonymes - Le langage Go
 
What can be done with Java, but should better be done with Erlang (@pavlobaron)
What can be done with Java, but should better be done with Erlang (@pavlobaron)What can be done with Java, but should better be done with Erlang (@pavlobaron)
What can be done with Java, but should better be done with Erlang (@pavlobaron)
 
Python - Getting to the Essence - Points.com - Dave Park
Python - Getting to the Essence - Points.com - Dave ParkPython - Getting to the Essence - Points.com - Dave Park
Python - Getting to the Essence - Points.com - Dave Park
 
Groovy Introduction - JAX Germany - 2008
Groovy Introduction - JAX Germany - 2008Groovy Introduction - JAX Germany - 2008
Groovy Introduction - JAX Germany - 2008
 
ppt7
ppt7ppt7
ppt7
 
ppt2
ppt2ppt2
ppt2
 
name name2 n
name name2 nname name2 n
name name2 n
 
ppt9
ppt9ppt9
ppt9
 

More from Yodalee

COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdf
Yodalee
 
rrxv6 Build a Riscv xv6 Kernel in Rust.pdf
rrxv6 Build a Riscv xv6 Kernel in Rust.pdfrrxv6 Build a Riscv xv6 Kernel in Rust.pdf
rrxv6 Build a Riscv xv6 Kernel in Rust.pdf
Yodalee
 
Make A Shoot ‘Em Up Game with Amethyst Framework
Make A Shoot ‘Em Up Game with Amethyst FrameworkMake A Shoot ‘Em Up Game with Amethyst Framework
Make A Shoot ‘Em Up Game with Amethyst Framework
Yodalee
 
Build Yourself a Nixie Tube Clock
Build Yourself a Nixie Tube ClockBuild Yourself a Nixie Tube Clock
Build Yourself a Nixie Tube Clock
Yodalee
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetris
Yodalee
 
Office word skills
Office word skillsOffice word skills
Office word skills
Yodalee
 
Git: basic to advanced
Git: basic to advancedGit: basic to advanced
Git: basic to advanced
Yodalee
 

More from Yodalee (7)

COSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdfCOSCUP2023 RSA256 Verilator.pdf
COSCUP2023 RSA256 Verilator.pdf
 
rrxv6 Build a Riscv xv6 Kernel in Rust.pdf
rrxv6 Build a Riscv xv6 Kernel in Rust.pdfrrxv6 Build a Riscv xv6 Kernel in Rust.pdf
rrxv6 Build a Riscv xv6 Kernel in Rust.pdf
 
Make A Shoot ‘Em Up Game with Amethyst Framework
Make A Shoot ‘Em Up Game with Amethyst FrameworkMake A Shoot ‘Em Up Game with Amethyst Framework
Make A Shoot ‘Em Up Game with Amethyst Framework
 
Build Yourself a Nixie Tube Clock
Build Yourself a Nixie Tube ClockBuild Yourself a Nixie Tube Clock
Build Yourself a Nixie Tube Clock
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetris
 
Office word skills
Office word skillsOffice word skills
Office word skills
 
Git: basic to advanced
Git: basic to advancedGit: basic to advanced
Git: basic to advanced
 

Recently uploaded

Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
ShamsuddeenMuhammadA
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
abdulrafaychaudhry
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
Google
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Nidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, TipsNidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, Tips
vrstrong314
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
abdulrafaychaudhry
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 

Recently uploaded (20)

Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
 
AI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website CreatorAI Genie Review: World’s First Open AI WordPress Website Creator
AI Genie Review: World’s First Open AI WordPress Website Creator
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Nidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, TipsNidhi Software Price. Fact , Costs, Tips
Nidhi Software Price. Fact , Costs, Tips
 
Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)Introduction to Pygame (Lecture 7 Python Game Development)
Introduction to Pygame (Lecture 7 Python Game Development)
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis

Use PEG to Write a Programming Language Parser

  • 2. Outline
      • Who am I? Why did I do this?
      • Introduction to PEG
      • Introduction to programming languages
      • Writing a parser in PEG
      • No demo QQ
  • 3. About Me
      • 葉闆, Yodalee <lc85301@gmail.com>
      • Studied EE in college and microwave engineering in graduate school; now a rookie engineer at Synopsys.
      • GitHub: yodalee
      • Blogger: http://yodalee.blogspot.tw
  • 4. Why Did I Do This
      • “Understanding Computation: From Simple Machines to Impossible Programs”
      • The book implements a programming language parser and a regular expression parser with Ruby Treetop, which is a PEG parser generator.
      • I rewrote all the code in Rust, so I did a little research on PEG.
      • https://github.com/yodalee/computationbook-rust
  • 6. Parsing Expression Grammar, PEG
      • Bryan Ford, <Parsing Expression Grammars: A Recognition-Based Syntactic Foundation>, 2004
      • An alternative to Chomsky-style generative grammars that removes ambiguity from the grammar.
      • Ambiguity is useful for modeling natural language, but not for precise, unambiguous programming languages.
      • Example grammar:
        Language <- Subject Verb Noun
        Subject  <- He | Lisa …
        Verb     <- is | has | sees …
        Noun     <- student | a toy …
  • 7. PEG Basic Rule
      • A PEG definition is very similar to a CFG: it is composed of rules.
      • A rule will either:
          • Match successfully: consume input.
          • Fail to match: consume no input.
          • Act as a predicate: only report success or failure, consuming no input.
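  • 8. PEG Basic Rule
      • Replace the choice ‘|’ with the prioritized choice ‘/’.
      • Consider the following:
          • CFG: A = "a" | "ab" can match either "a" or "ab".
            PEG: A = "a" / "ab" never reaches the second alternative, because "a" already succeeds on any input that starts with "ab".
          • PEG: A = a* a never matches, because the greedy a* consumes every "a" and leaves none for the final a.
      • Operators:
        Operator      Meaning
        ""            String literal
        []            Character set
        .             Any character
        (e1 e2 ..)    Grouping
        e? e+ e*      Optional, repetition
        &e            And-predicate
        !e            Not-predicate
        e1 e2         Sequence
        e1 / e2       Prioritized choice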
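A minimal sketch of how prioritized choice and greedy repetition behave, using hand-written matchers rather than any PEG library. The helper names (`literal`, `choice`, `rule_a_first`, `rule_ab_first`, `rule_star_then_a`) are made up for illustration; each matcher returns the number of bytes it consumed, or `None` on failure.

```rust
// Match a fixed string at the start of the input.
fn literal(expected: &'static str) -> impl Fn(&str) -> Option<usize> {
    move |input: &str| {
        if input.starts_with(expected) {
            Some(expected.len())
        } else {
            None
        }
    }
}

// Prioritized choice e1 / e2: e2 is tried only if e1 fails.
fn choice(
    first: impl Fn(&str) -> Option<usize>,
    second: impl Fn(&str) -> Option<usize>,
) -> impl Fn(&str) -> Option<usize> {
    move |input: &str| first(input).or_else(|| second(input))
}

// A <- "a" / "ab"
fn rule_a_first(input: &str) -> Option<usize> {
    choice(literal("a"), literal("ab"))(input)
}

// A <- "ab" / "a"
fn rule_ab_first(input: &str) -> Option<usize> {
    choice(literal("ab"), literal("a"))(input)
}

// A <- "a"* "a": the greedy star eats every 'a', so the final "a" cannot match.
fn rule_star_then_a(input: &str) -> Option<usize> {
    let mut pos = 0;
    while input[pos..].starts_with('a') {
        pos += 1;
    }
    if input[pos..].starts_with('a') {
        Some(pos + 1)
    } else {
        None
    }
}

fn main() {
    println!("{:?}", rule_a_first("ab")); // Some(1): only "a" is consumed
    println!("{:?}", rule_ab_first("ab")); // Some(2)
    println!("{:?}", rule_star_then_a("aaa")); // None
}
```

Putting the longer alternative first is the usual way to make both strings reachable.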
  • 9. Some Examples
      • NUMBER <- [1-9] [0-9]*
      • COMMENT <- "//" (!"\n" .)* "\n"
      • EXPRESSION <- TERM ([+-] TERM)*
        TERM <- FACTOR ([*/] FACTOR)*
      • STAT_IF <-
          "if" COND "then" STATEMENT "else" STATEMENT /
          "if" COND "then" STATEMENT
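The NUMBER and COMMENT rules can be sketched by hand as below. This is not generated by a PEG tool; the function names `number` and `comment` are illustrative, and each returns the number of bytes matched from the start of the input, or `None`.

```rust
// NUMBER <- [1-9] [0-9]*
fn number(input: &str) -> Option<usize> {
    let bytes = input.as_bytes();
    match bytes.first().copied() {
        Some(b'1'..=b'9') => {}
        _ => return None,
    }
    let rest = bytes[1..].iter().take_while(|b| b.is_ascii_digit()).count();
    Some(1 + rest)
}

// COMMENT <- "//" (!"\n" .)* "\n"
fn comment(input: &str) -> Option<usize> {
    if !input.starts_with("//") {
        return None;
    }
    let mut pos = 2;
    loop {
        let rest = &input[pos..];
        if rest.starts_with('\n') {
            break; // the not-predicate !"\n" fails, so the repetition stops
        }
        match rest.chars().next() {
            Some(c) => pos += c.len_utf8(), // "." consumes one character
            None => break,                  // "." fails at end of input
        }
    }
    // The closing "\n" is required by the rule.
    if input[pos..].starts_with('\n') {
        Some(pos + 1)
    } else {
        None
    }
}

fn main() {
    println!("{:?}", number("120x")); // Some(3)
    println!("{:?}", number("0")); // None: the rule has no leading zero
    println!("{:?}", comment("// hi\nx")); // Some(6)
}
```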
  • 10. PEG is not CFG
      • PEG is equivalent in power to the Top-Down Parsing Language (TDPL).
      • The language a^n b^n c^n is not context-free, but a PEG can recognize it with the and-predicate:
        A <- a A? b
        B <- b B? c
        S <- &(A !b) a+ B !.
      • In a CFG, A <- aAa | a matches any odd number (2n-1) of "a".
        In a PEG, A <- aAa / a matches only 2^n - 1 "a".
      • It is an open problem whether every context-free language can be recognized by a PEG.
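Both claims can be checked with a small hand-written recognizer. This is a sketch, not library code, and all function names are illustrative. The a^n b^n c^n grammar used here is S <- &(A !b) a+ B !. with A <- a A? b and B <- b B? c. The non-empty form of A matters: if A could fall back to ε, the predicate would succeed without checking anything on an input such as "aaabbcc".

```rust
// Each rule returns the position after the match, or None on failure.

// A <- "a" A "a" / "a"
fn odd_a(s: &[u8], pos: usize) -> Option<usize> {
    if s.get(pos) != Some(&b'a') {
        return None;
    }
    // First alternative: "a" A "a".
    if let Some(mid) = odd_a(s, pos + 1) {
        if s.get(mid) == Some(&b'a') {
            return Some(mid + 1);
        }
    }
    // Second alternative: a single "a".
    Some(pos + 1)
}

// A <- "a" A? "b"
fn rule_a(s: &[u8], pos: usize) -> Option<usize> {
    if s.get(pos) != Some(&b'a') {
        return None;
    }
    let after_inner = rule_a(s, pos + 1).unwrap_or(pos + 1);
    if s.get(after_inner) == Some(&b'b') {
        Some(after_inner + 1)
    } else {
        None
    }
}

// B <- "b" B? "c"
fn rule_b(s: &[u8], pos: usize) -> Option<usize> {
    if s.get(pos) != Some(&b'b') {
        return None;
    }
    let after_inner = rule_b(s, pos + 1).unwrap_or(pos + 1);
    if s.get(after_inner) == Some(&b'c') {
        Some(after_inner + 1)
    } else {
        None
    }
}

// S <- &(A !"b") "a"+ B !.
fn anbncn(input: &str) -> bool {
    let s = input.as_bytes();
    // And-predicate: A must match and must not be followed by "b"; nothing is consumed.
    let lookahead_ok = match rule_a(s, 0) {
        Some(end) => s.get(end) != Some(&b'b'),
        None => false,
    };
    if !lookahead_ok {
        return false;
    }
    // "a"+ (the predicate already guarantees at least one "a").
    let mut pos = 0;
    while s.get(pos) == Some(&b'a') {
        pos += 1;
    }
    // B must consume everything that is left (!. means end of input).
    rule_b(s, pos) == Some(s.len())
}

// True when A <- aAa / a consumes the whole string of `len` copies of "a".
fn odd_a_matches_all(len: usize) -> bool {
    let s = vec![b'a'; len];
    odd_a(&s, 0) == Some(len)
}

fn main() {
    println!("{}", anbncn("aabbcc")); // true
    println!("{}", anbncn("aaabbcc")); // false
    for len in [1, 3, 5, 7] {
        println!("{} -> {}", len, odd_a_matches_all(len));
    }
}
```

Running the loop shows that lengths 1, 3 and 7 are fully consumed while length 5 is not, even though 5 is odd.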
  • 11. Using PEG
      • Many libraries support PEG:
          • Rust: rust-peg, pest, nom-peg …
          • C++: PEGTL, Boost …
          • Ruby: kpeg, raabro, Treetop …
          • Python: pyPEG, parsimonious …
          • Haskell: Peggy …
          • …
      • So why Rust?
  • 13. Simple Language
      • 3 types of statements: assign, if else, while.
      • Supports integer arithmetic.
      • Supports pairs, lists, and functions with one argument.
      • Simple, but we can still do some complex things, like recursion and map.
        factorfun = function factor(x) {
          if (x > 1) { x * factor ( x-1 ) } else { 1 }
        }
        result = factorfun(10); // 3628800

        function last(l) {
          if (isnothing(snd(l))) { fst(l) } else { last(snd(l)) }
        }
  • 14. Abstract Syntax Tree
      • Use a Rust enum so that each node can store a payload.
        pub enum Node {
            Number(i64),
            Boolean(bool),
            Add(Box<Node>, Box<Node>),
            Subtract(Box<Node>, Box<Node>),
            LT(Box<Node>, Box<Node>),
            …
        }
      • “Programming” then looks like this:
        let n = Node::add(Node::number(3), Node::number(4));
      • [Figure: AST for (3 + 4) < 8, drawn as LT(Add(3, 4), 8)]
  • 15. Abstract Syntax Tree
      • All statements are Nodes as well:
        pub enum Node {
            Variable(String),
            Assign(String, Box<Node>),
            If(Box<Node>, Box<Node>, Box<Node>),
            While(Box<Node>, Box<Node>),
            …
        }
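A self-contained sketch of such an enum with lowercase helper constructors, so that trees can be built as in `Node::add(Node::number(3), Node::number(4))`. Only a few variants are shown, and the constructor bodies are an assumption about how the helpers might be written.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Number(i64),
    Add(Box<Node>, Box<Node>),
    LT(Box<Node>, Box<Node>),
    Variable(String),
    Assign(String, Box<Node>),
}

impl Node {
    // Each helper boxes the node so that constructors can be nested directly.
    pub fn number(value: i64) -> Box<Node> {
        Box::new(Node::Number(value))
    }
    pub fn add(l: Box<Node>, r: Box<Node>) -> Box<Node> {
        Box::new(Node::Add(l, r))
    }
    pub fn lt(l: Box<Node>, r: Box<Node>) -> Box<Node> {
        Box::new(Node::LT(l, r))
    }
    pub fn variable(name: &str) -> Box<Node> {
        Box::new(Node::Variable(name.to_string()))
    }
    pub fn assign(name: &str, value: Box<Node>) -> Box<Node> {
        Box::new(Node::Assign(name.to_string(), value))
    }
}

fn main() {
    // (3 + 4) < 8
    let expr = Node::lt(Node::add(Node::number(3), Node::number(4)), Node::number(8));
    // x = x + 1
    let stmt = Node::assign("x", Node::add(Node::variable("x"), Node::number(1)));
    println!("{:?}", expr);
    println!("{:?}", stmt);
}
```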
  • 16. Pair, List and Nothing
      • Node::pair(Node::number(3), Node::number(4))
      • List [3,4,5] = pair(3, pair(4, pair(5, nothing)))
      • Nothing special
      • [Figure: Pair(3, 4), and the list as nested pairs Pair(3, Pair(4, Pair(5, Nothing)))]
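The list encoding can be sketched as a right fold over pairs. The helpers `list` and `last` are illustrative names; `last` mirrors the `last` function written in the Simple language earlier in the deck.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Pair(Box<Node>, Box<Node>),
    Nothing,
}

// Build pair(x0, pair(x1, ... pair(xn, nothing))) by folding from the right.
fn list(items: &[i64]) -> Node {
    items.iter().rev().fold(Node::Nothing, |tail, &x| {
        Node::Pair(Box::new(Node::Number(x)), Box::new(tail))
    })
}

// Walk the second component until it is Nothing, then return the first component.
fn last(l: &Node) -> Option<&Node> {
    match l {
        Node::Pair(head, tail) => match **tail {
            Node::Nothing => Some(&**head),
            _ => last(tail),
        },
        _ => None,
    }
}

fn main() {
    let l = list(&[3, 4, 5]);
    println!("{:?}", l);
    println!("{:?}", last(&l)); // Some(Number(5))
}
```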
  • 17. Environment and Machine
      • The environment stores a HashMap<String, Box<Node>>, with an <add> and <get> interface.
      • A machine takes an AST and an environment, and evaluates the AST inside the machine.
        pub struct Environment {
            pub vars: HashMap<String, Box<Node>>
        }
        pub struct Machine {
            pub environment: Environment,
            expression: Box<Node>
        }
  • 18. Evaluate the AST
      • Add an evaluate function to every AST node using a trait.
      • The result is a new Node.
        fn evaluate(&self, env: &mut Environment) -> Box<Node>;

        match *self {
            Node::Add(ref l, ref r) => {
                Node::number(l.evaluate(env).value() + r.evaluate(env).value())
            }
            …
        }
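A compact sketch tying the environment and the evaluate trait together. The trait name `Evaluate`, the `value` helper, and the choice to panic on unbound variables or non-numbers are assumptions made to keep the example short.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Number(i64),
    Add(Box<Node>, Box<Node>),
    Variable(String),
}

#[derive(Debug, Clone, Default)]
pub struct Environment {
    pub vars: HashMap<String, Box<Node>>,
}

impl Environment {
    pub fn add(&mut self, name: &str, value: Box<Node>) {
        self.vars.insert(name.to_string(), value);
    }
    pub fn get(&self, name: &str) -> Option<Box<Node>> {
        self.vars.get(name).cloned()
    }
}

pub trait Evaluate {
    fn evaluate(&self, env: &mut Environment) -> Box<Node>;
}

impl Node {
    // Unwrap a number; panics on any other node (kept simple for the sketch).
    pub fn value(&self) -> i64 {
        match self {
            Node::Number(v) => *v,
            other => panic!("not a number: {:?}", other),
        }
    }
}

impl Evaluate for Node {
    fn evaluate(&self, env: &mut Environment) -> Box<Node> {
        match self {
            Node::Number(_) => Box::new(self.clone()),
            Node::Add(l, r) => {
                let sum = l.evaluate(env).value() + r.evaluate(env).value();
                Box::new(Node::Number(sum))
            }
            Node::Variable(name) => env.get(name).expect("unbound variable"),
        }
    }
}

// Evaluate x + 4 in an environment where x = 3.
pub fn demo() -> Box<Node> {
    let mut env = Environment::default();
    env.add("x", Box::new(Node::Number(3)));
    let expr = Node::Add(
        Box::new(Node::Variable("x".to_string())),
        Box::new(Node::Number(4)),
    );
    expr.evaluate(&mut env)
}

fn main() {
    println!("{:?}", demo()); // Number(7)
}
```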
  • 19. Evaluate the AST
      • How do we evaluate a While node (condition, body)?
      • Evaluate the condition; if it is true, evaluate the body and then the While node itself again.
        x = 3; while (x < 9) { x = x * 2; }
      • Evaluation trace:
          Evaluate x = 3
          Evaluate while (x < 9) x = x * 2
          Evaluate x = x * 2
          Evaluate while (x < 9) x = x * 2
          Evaluate x = x * 2
          Evaluate while (x < 9) x = x * 2
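The recursive treatment of While can be sketched as follows. To stay short, this version uses a free function and a plain `HashMap` as the environment instead of the trait and struct from the slides, and it introduces a `DoNothing` node as the result of statements; all of these are assumptions of the sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Boolean(bool),
    Variable(String),
    Multiply(Box<Node>, Box<Node>),
    LT(Box<Node>, Box<Node>),
    Assign(String, Box<Node>),
    While(Box<Node>, Box<Node>),
    DoNothing,
}

type Environment = HashMap<String, Node>;

fn number(node: &Node) -> i64 {
    match node {
        Node::Number(v) => *v,
        other => panic!("not a number: {:?}", other),
    }
}

fn evaluate(node: &Node, env: &mut Environment) -> Node {
    match node {
        Node::Number(_) | Node::Boolean(_) | Node::DoNothing => node.clone(),
        Node::Variable(name) => env.get(name).cloned().expect("unbound variable"),
        Node::Multiply(l, r) => {
            Node::Number(number(&evaluate(l, env)) * number(&evaluate(r, env)))
        }
        Node::LT(l, r) => Node::Boolean(number(&evaluate(l, env)) < number(&evaluate(r, env))),
        Node::Assign(name, expr) => {
            let value = evaluate(expr, env);
            env.insert(name.clone(), value);
            Node::DoNothing
        }
        // If the condition holds, run the body, then evaluate the same While node again.
        Node::While(cond, body) => match evaluate(cond, env) {
            Node::Boolean(true) => {
                evaluate(body, env);
                evaluate(node, env)
            }
            _ => Node::DoNothing,
        },
    }
}

// x = 3; while (x < 9) { x = x * 2; }
fn run_example() -> Environment {
    let var = |name: &str| Box::new(Node::Variable(name.to_string()));
    let mut env = Environment::new();
    evaluate(&Node::Assign("x".to_string(), Box::new(Node::Number(3))), &mut env);
    let body = Node::Assign(
        "x".to_string(),
        Box::new(Node::Multiply(var("x"), Box::new(Node::Number(2)))),
    );
    let cond = Node::LT(var("x"), Box::new(Node::Number(9)));
    evaluate(&Node::While(Box::new(cond), Box::new(body)), &mut env);
    env
}

fn main() {
    println!("{:?}", run_example().get("x")); // Some(Number(12))
}
```

The loop body runs twice (3 to 6, then 6 to 12), which matches the three While evaluations in the trace: two that succeed and a final one that stops.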
  • 20. Function
      • A function is also a type of Node. Upon evaluation, the function is wrapped into a Closure together with the environment at that time.
      • A call evaluates the function body in the closure's environment.
        Node::Fun(String, String, Box<Node>)
        Node::Closure(Environment, Box<Node>)

        fn evaluate(&self, env: &mut Environment) -> Box<Node> {
            match *self {
                Node::Fun(ref name, ref arg, ref body) => {
                    Node::closure(env.clone(), Box::new(self.clone()))
                }
                …
            }
        }
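  • 21. Call a Function
        fn evaluate(&self, env: &mut Environment) -> Box<Node> {
            match *self {
                Node::Call(ref closure, ref arg) => match **closure {
                    Node::Closure(ref closure_env, ref fun) => {
                        if let Node::Fun(funname, argname, body) = *fun.clone() {
                            // Start from the environment captured by the closure.
                            let mut newenv = closure_env.clone();
                            // Bind the function's own name so the body can recurse.
                            newenv.add(&funname, closure.evaluate(env));
                            newenv.add(&argname, arg.evaluate(env));
                            body.evaluate(&mut newenv)
                        } else {
                            panic!("closure does not wrap a function")
                        }
                    }
                    _ => panic!("call target is not a closure"),
                },
                …
            }
        }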
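A runnable sketch of closures and calls under simplifying assumptions: a free `evaluate` function, a plain `HashMap` as the environment, and panics for ill-typed programs. Unlike the slide code, the call target is evaluated first, so it may be any expression that yields a closure.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Variable(String),
    Add(Box<Node>, Box<Node>),
    // Fun(function name, argument name, body)
    Fun(String, String, Box<Node>),
    // Closure(captured environment, the Fun node)
    Closure(Environment, Box<Node>),
    // Call(expression yielding a closure, argument expression)
    Call(Box<Node>, Box<Node>),
}

type Environment = HashMap<String, Node>;

fn evaluate(node: &Node, env: &mut Environment) -> Node {
    match node {
        Node::Number(_) | Node::Closure(_, _) => node.clone(),
        Node::Variable(name) => env.get(name).cloned().expect("unbound variable"),
        Node::Add(l, r) => match (evaluate(l, env), evaluate(r, env)) {
            (Node::Number(a), Node::Number(b)) => Node::Number(a + b),
            other => panic!("cannot add {:?}", other),
        },
        // Evaluating a function captures the environment at that moment.
        Node::Fun(..) => Node::Closure(env.clone(), Box::new(node.clone())),
        Node::Call(callee, arg) => {
            let closure = evaluate(callee, env);
            let arg_value = evaluate(arg, env);
            match &closure {
                Node::Closure(captured, fun) => match &**fun {
                    Node::Fun(fun_name, arg_name, body) => {
                        let mut new_env = captured.clone();
                        // Bind the function's own name so the body can recurse.
                        new_env.insert(fun_name.clone(), closure.clone());
                        new_env.insert(arg_name.clone(), arg_value);
                        evaluate(body, &mut new_env)
                    }
                    other => panic!("closure does not wrap a function: {:?}", other),
                },
                other => panic!("not callable: {:?}", other),
            }
        }
    }
}

// function addy(x) { x + y } is closed over y = 10; y is later rebound to 100.
fn demo() -> Node {
    let var = |name: &str| Box::new(Node::Variable(name.to_string()));
    let mut env = Environment::new();
    env.insert("y".to_string(), Node::Number(10));
    let fun = Node::Fun(
        "addy".to_string(),
        "x".to_string(),
        Box::new(Node::Add(var("x"), var("y"))),
    );
    let closure = evaluate(&fun, &mut env);
    env.insert("y".to_string(), Node::Number(100));
    let call = Node::Call(Box::new(closure), Box::new(Node::Number(1)));
    evaluate(&call, &mut env)
}

fn main() {
    println!("{:?}", demo()); // Number(11): the closure still sees y = 10
}
```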
  • 22. Free Variable
      • Compute the free variables of a function to avoid copying the whole environment.
      • Nodes that matter:
          • Node::Variable
          • Node::Assign
          • Node::Function
      • Examples:
        function addx(x) { function addy(y) { x + y }} -> no free variables
        function addy(x) { x + y } -> free variable y
  • 23. Call a Function
        if let Node::Fun(funname, argname, body) = *fun.clone() {
            // Copy only the free variables instead of the whole environment.
            let mut newenv = Environment { vars: HashMap::new() };
            for var in free_vars(fun) {
                newenv.add(var, env.get(var));
            }
            newenv.add(&funname, closure.evaluate(env));
            newenv.add(&argname, arg.evaluate(env));
            body.evaluate(&mut newenv)
        }
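One way `free_vars` might be written is sketched below. The slides do not show its body, so this is an assumption: a variable is free unless it is the function's own name or its argument, and an assignment contributes the free variables of its right-hand side.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Variable(String),
    Add(Box<Node>, Box<Node>),
    Assign(String, Box<Node>),
    // Fun(function name, argument name, body)
    Fun(String, String, Box<Node>),
}

fn free_vars(node: &Node) -> BTreeSet<String> {
    match node {
        Node::Number(_) => BTreeSet::new(),
        Node::Variable(name) => BTreeSet::from([name.clone()]),
        Node::Add(l, r) => {
            let mut vars = free_vars(l);
            vars.extend(free_vars(r));
            vars
        }
        Node::Assign(_, expr) => free_vars(expr),
        Node::Fun(name, arg, body) => {
            // The function name and its argument are bound inside the body.
            let mut vars = free_vars(body);
            vars.remove(name);
            vars.remove(arg);
            vars
        }
    }
}

fn var(name: &str) -> Box<Node> {
    Box::new(Node::Variable(name.to_string()))
}

// function addy(x) { x + y }
fn addy_example() -> Node {
    Node::Fun("addy".into(), "x".into(), Box::new(Node::Add(var("x"), var("y"))))
}

// function addx(x) { function addy(y) { x + y } }
fn addx_example() -> Node {
    let inner = Node::Fun("addy".into(), "y".into(), Box::new(Node::Add(var("x"), var("y"))));
    Node::Fun("addx".into(), "x".into(), Box::new(inner))
}

fn main() {
    println!("{:?}", free_vars(&addy_example())); // {"y"}
    println!("{:?}", free_vars(&addx_example())); // {}
}
```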
  • 24. What is a Language?
      • We make some concepts abstract, like a virtual machine. Designing a language means designing the abstraction.
      • The function “evaluate” implements the concept; of course we could implement it as anything, such as returning 42 on every evaluation.
        Concept     Simple, virtual machine    Real machine
        Number 3    Node::number(3)            0b11 in memory
        +           Node::add(l, r)            add r1 r2
        Choice      Node::if                   branch instruction
  • 25. What is a Language?
      • Abstraction brings precision issues, as with floating point: we have no way to express the concept of <infinite>.
      • We can create a language for geometry as below. Which representation of a line is best?
      • Consider every pro and con that the abstraction brings.
        Concept         In programming language
        Point           (x: u32, y: u32)
        Line            (Point, Point)
                        (Point, Slope)
                        (Point, Point, type{vertical, horizontal, angled})
        Intersection    Calculate intersection
  • 26. Implement a Parser with PEG
  • 27. The Pest Package
      • Rust Pest
          • https://github.com/pest-parser/pest
      • My simple language parser grammar:
          • https://github.com/yodalee/simplelang
      • Parsing flow: grammar + source code -> parser -> Pest Pair structure -> Simple AST
  • 28. The Pest Package
        use pest::Parser;
        use pest_derive::Parser; // derive macro from the pest_derive crate

        #[derive(Parser)]
        #[grammar = "simple.pest"]
        struct SimpleParser;

        let pairs = SimpleParser::parse(Rule::simple, "<source code>");
      • A Pair represents the parse result from a rule.
          • Pair.as_rule() => the rule
          • Pair.as_span() => the matched span
          • Pair.as_str() => the matched text
          • Pair.into_inner() => the sub-rules
  • 29. Grammar <-> Build AST
        Number = { [1-9] ~ [0-9]* }
        Variable = { [A-Za-z] ~ [A-Za-z0-9]* }
        Call = { Variable ~ "(" ~ Expr ~ ")" }
        Factor = { "(" ~ Expr ~ ")" | Call | Variable | Number }

        fn build_factor(pair: Pair<Rule>) -> Box<Node> {
            match pair.as_rule() {
                Rule::number => Node::number(pair.as_str().parse::<i64>().unwrap()),
                Rule::variable => Node::variable(pair.as_str()),
                Rule::expr => ...,
                Rule::call => ...,
            }
        }
  • 30. Climb the Expression
      • An expression can be written as a single rule:
        Expr = { Factor ~ (op_binary ~ Factor)* }
      • Pest provides a template; you just define:
          • A build-factor function => creates a Factor node
          • An infix-rule function => creates an Operator node
          • Operator precedence => a vector of operator precedences with left/right associativity
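For intuition, here is a plain-Rust sketch of precedence climbing over a flat `Factor (op Factor)*` token list. It does not use Pest; the `Token` type, the `precedence` table, and the `climb` function are illustrative assumptions, and all operators are treated as left-associative.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Number(i64),
    Binary(char, Box<Node>, Box<Node>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Num(i64),
    Op(char),
}

// Higher numbers bind tighter; 0 means "not a known operator".
fn precedence(op: char) -> u8 {
    match op {
        '<' => 1,
        '+' | '-' => 2,
        '*' | '/' => 3,
        _ => 0,
    }
}

// Parse starting at tokens[*pos], accepting only operators whose precedence
// is at least `min_prec`. Panics on malformed input to keep the sketch short.
fn climb(tokens: &[Token], pos: &mut usize, min_prec: u8) -> Node {
    let mut lhs = match tokens[*pos] {
        Token::Num(n) => Node::Number(n),
        Token::Op(op) => panic!("expected a factor, found operator {}", op),
    };
    *pos += 1;
    while let Some(&Token::Op(op)) = tokens.get(*pos) {
        let prec = precedence(op);
        if prec < min_prec {
            break;
        }
        *pos += 1;
        // Left-associative: the right-hand side may only contain tighter operators.
        let rhs = climb(tokens, pos, prec + 1);
        lhs = Node::Binary(op, Box::new(lhs), Box::new(rhs));
    }
    lhs
}

fn parse(tokens: &[Token]) -> Node {
    let mut pos = 0;
    climb(tokens, &mut pos, 1)
}

fn main() {
    use Token::{Num, Op};
    // 1 + 2 * 3 groups as 1 + (2 * 3)
    println!("{:?}", parse(&[Num(1), Op('+'), Num(2), Op('*'), Num(3)]));
    // 8 - 3 - 2 groups as (8 - 3) - 2
    println!("{:?}", parse(&[Num(8), Op('-'), Num(3), Op('-'), Num(2)]));
}
```

The precedence table plays the role of the "vector of operator precedences", and the two match arms in `climb` correspond to the build-factor and infix-rule functions.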
  • 31. Challenges
      • Error messages for syntax errors.
      • How to deal with optional parts, like those in a C for loop?
      • A more systematic way to deal with a large language, like C.
        compound_statement <- block_list
        block_list <- block_list block | ε
        block <- declaration_list | statement_list
        declaration_list <- declaration_list declaration | ε
        statement_list <- statement_list statement | ε

        // Wrong PEG
        compound_statement <- block*
        block <- declaration* ~ statement*

        // Correct PEG
        compound_statement <- block*
        block <- (declaration | statement)+
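One likely reason the first PEG translation is wrong: `declaration* ~ statement*` can succeed without consuming anything, so `block*` never stops making "successful" matches. The sketch below illustrates this with single-byte tokens (`d` for a declaration, `s` for a statement) and an iteration cap; the token encoding, function names, and the cap are all assumptions made for the illustration.

```rust
// Each rule returns the position after the match, or None on failure.

fn many(s: &[u8], mut pos: usize, token: u8) -> usize {
    while s.get(pos) == Some(&token) {
        pos += 1;
    }
    pos
}

// block <- declaration* statement*   (can succeed without consuming anything)
fn block_nullable(s: &[u8], pos: usize) -> Option<usize> {
    Some(many(s, many(s, pos, b'd'), b's'))
}

// block <- (declaration / statement)+   (always consumes at least one token)
fn block_progress(s: &[u8], pos: usize) -> Option<usize> {
    let mut end = pos;
    while matches!(s.get(end).copied(), Some(b'd') | Some(b's')) {
        end += 1;
    }
    if end > pos {
        Some(end)
    } else {
        None
    }
}

// compound_statement <- block*
// Returns the end position, or None if `cap` iterations pass without block
// ever failing, which means the repetition would not terminate.
fn compound(s: &[u8], block: fn(&[u8], usize) -> Option<usize>, cap: usize) -> Option<usize> {
    let mut pos = 0;
    for _ in 0..cap {
        match block(s, pos) {
            Some(next) => pos = next,
            None => return Some(pos),
        }
    }
    None
}

fn main() {
    println!("{:?}", compound(b"dsds", block_nullable, 1000)); // None: never terminates
    println!("{:?}", compound(b"dsds", block_progress, 1000)); // Some(4)
}
```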
  • 33. Conclusion
      • PEG is a newer grammar formalism than CFG, and a powerful one: it is fast and convenient for creating a small language parser.
      • The most important concept in programming languages? Abstraction.
      • Is there a best abstraction? No. It is engineering.
  • 34. Reference
      • <Parsing Expression Grammars: A Recognition-Based Syntactic Foundation>, Bryan Ford
      • <Understanding Computation: From Simple Machines to Impossible Programs>
      • <Programming Language Part B> on Coursera, University of Washington
  • 35. Thank You for Listening
  • 36. IB502 1430 – 1510 Build Yourself a Nixie Tube Clock