Instaduction to Instaparse
CAP-CLUG - 2019-03-13
Instaparse is a Clojure library for building
parsers from context-free grammars
What is a parser?
● Program that takes some input data (usually a string), and produces a
data-structure (usually a parse tree), based on some grammar (usually a
context-free grammar)
What’s a context-free grammar?
Formal definition: a context-free grammar is a 4-tuple (V, Σ, R, S), where
V = finite set of non-terminals or variables. Each variable represents a clause, a
phrase, or a syntactic category
Σ = finite set of terminals. The set of terminals is the alphabet of the language
R = finite relation from V to (V ∪ Σ)*. Each member of R is a rewrite rule or
production
S = the starting symbol, which must be an element of V
Adapted from https://en.wikipedia.org/wiki/Context-free_grammar#Formal_definitions
What’s a context-free grammar?
“Context-free” means that a rule can be applied to a non-terminal wherever it
appears, regardless of the symbols around it (its context).
There are other kinds of grammars, some more powerful than CFGs and some less. See
the Chomsky hierarchy for more.
A simple CFG
S  ::= AB*
AB ::= A B
A  ::= 'a'+
B  ::= 'b'+
Non-terminals: S, AB, A, B
Terminals: 'a', 'b'
Productions: each of the four rules above
Starting symbol: S
Productions
Each production is a rule.
You can replace the symbol on the left with the
symbol(s) on the right.
'+' means “one or more”; '*' means “zero or
more”.
Non-terminals can be defined recursively, and can
appear on both the left and right sides of rules.
Terminals appear only on the right side of a rule.
If you imagine a tree, non-terminals are interior
nodes and terminals are leaf nodes (we’ll see more
of this later).
Productions example
S
⇒ AB            (S ::= AB*, choosing a single AB)
⇒ A B           (AB ::= A B)
⇒ aaaaaa B      (A ::= 'a'+)
⇒ aaaaaabb      (B ::= 'b'+)
Some other strings that this grammar can generate
aaabbbbababaaaabbbb
abbbbbbbbb
abababababababab
aabbaabbbbbaaaaaab
Running a grammar “forwards”
generates strings that conform to a
grammar
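A minimal sketch of running this grammar “forwards” in plain Clojure, with one function per non-terminal. The function names (`rand-run`, `gen-A`, `gen-B`, `gen-AB`, `gen-S`) and the repetition limits are assumptions made for illustration; they are not part of the talk or of instaparse.

```clojure
;; Each non-terminal becomes a function that returns a random string
;; derivable from that non-terminal.

(defn rand-run
  "'+' means one or more: repeat ch between 1 and 5 times (limit is arbitrary)."
  [ch]
  (apply str (repeat (inc (rand-int 5)) ch)))

(defn gen-A [] (rand-run \a))            ; A  ::= 'a'+
(defn gen-B [] (rand-run \b))            ; B  ::= 'b'+
(defn gen-AB [] (str (gen-A) (gen-B)))   ; AB ::= A B

(defn gen-S
  "'*' means zero or more: emit between 0 and 3 AB groups (limit is arbitrary)."
  []
  (apply str (repeatedly (rand-int 4) gen-AB)))   ; S ::= AB*

;; Print a few generated strings; the output differs on every run.
(dotimes [_ 3]
  (println (pr-str (gen-S))))
```

Every string this produces is a sequence of a-runs each followed by a b-run, including the empty string when zero AB groups are chosen.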
Running a grammar “backwards” over a
string tells us if that string is valid,
according to the grammar
Running the CFG backwards
aaaaaabbaab
A bbaab         (aaaaaa reduces to A)
A B A B         (each remaining run reduces to A or B)
AB AB           (AB ::= A B)
S               (S ::= AB*)
VALID!
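The reduction above can be mimicked by hand for this particular grammar, without any parsing library. This is only a sketch: `valid?` is a hypothetical helper, and the trick of collapsing runs works because this grammar is so simple, not because it is how a general CFG parser operates.

```clojure
(defn valid?
  "True when s can be reduced to S: collapse each run of identical
   characters to a single character (the A and B reductions), then check
   that the runs alternate a, b, a, b, ... and pair up into AB groups."
  [s]
  (let [runs (map first (partition-by identity s))  ; "aabba" -> (\a \b \a)
        n    (count runs)]
    (and (even? n)                                  ; every A has a matching B
         (= runs (take n (cycle [\a \b]))))))       ; runs alternate, starting with a

(valid? "aaaaaabbaab") ; => true
(valid? "ba")          ; => false
```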
Let’s see that again, but without overwriting
the string
Running the CFG backwards
aaaaaa   bb     aa    b
   A     B      A     B
     AB           AB
           S
Parse Tree
What is a parser?
● Program that takes some input data (usually a string), and produces a
data-structure (usually a parse tree), based on some grammar (usually a
context-free grammar)
Instaparse is a Clojure library for building
parsers from context-free grammars
A grammar that recognizes runs of a’s and b’s
S  ::= AB*
AB ::= A B
A  ::= 'a'+
B  ::= 'b'+
ABNF notation for instaparse
Hello, instaparse
instaparse-talk.core=> (require '[instaparse.core :as insta])
nil
instaparse-talk.core=> (def as-and-bs
#_=> (insta/parser
#_=> "S = AB*
#_=> AB = A B
#_=> A = 'a'+
#_=> B = 'b'+"))
#'instaparse-talk.core/as-and-bs
Hello, instaparse
instaparse-talk.core=> (as-and-bs "aaabbbaabb")
[:S [:AB [:A "a" "a" "a"] [:B "b" "b" "b"]] [:AB [:A "a" "a"] [:B "b" "b"]]]
instaparse-talk.core=> (pprint *1)
[:S
[:AB [:A "a" "a" "a"] [:B "b" "b" "b"]]
[:AB [:A "a" "a"] [:B "b" "b"]]]
nil
instaparse-talk.core=> (insta/visualize (as-and-bs "aaabbbaabb"))
nil
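When the input does not match the grammar, the parser returns a failure object rather than a tree, which gives the “is this string valid?” check from earlier. A short sketch, assuming instaparse is on the classpath; `valid?` is a hypothetical helper name, while `insta/parser` and `insta/failure?` are instaparse functions.

```clojure
(require '[instaparse.core :as insta])

(def as-and-bs
  (insta/parser
    "S = AB*
     AB = A B
     A = 'a'+
     B = 'b'+"))

(defn valid?
  "True when the whole string parses under the grammar."
  [s]
  (not (insta/failure? (as-and-bs s))))

(valid? "aaaaaabbaab") ; => true
(valid? "ba")          ; => false
```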
Walking the parse tree
Parse trees are just Clojure data! We have a TON of great ways to handle them:
● Recursive or iterative processing using case or core.match (pattern
matching)
● Zippers (functional navigation and “editing” of trees)
● insta/transform
● seq / tree-seq
● Enlive (CSS-style selectors for Clojure data structures)
● Any other way that you want to walk nested vectors in Clojure!
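As one illustration of the tree-seq option, here is a sketch that walks a parse tree written out as a literal vector (copied from the REPL output for "aaabbbaabb"), so it runs without instaparse. The names `tree`, `nodes`, `leaves` and `tag-counts` are made up for this example.

```clojure
;; The parse tree from the earlier REPL session, as plain Clojure data.
(def tree
  [:S
   [:AB [:A "a" "a" "a"] [:B "b" "b" "b"]]
   [:AB [:A "a" "a"] [:B "b" "b"]]])

;; Depth-first walk: vectors are branches, and their children are
;; everything after the leading tag keyword.
(def nodes (tree-seq vector? rest tree))

;; Leaves are the terminal strings; joined together they give back the input.
(def leaves (filter string? nodes))
(apply str leaves)
;; => "aaabbbaabb"

;; Interior nodes are the vectors; count how often each tag occurs.
(def tag-counts (frequencies (map first (filter vector? nodes))))
;; => {:S 1, :AB 2, :A 2, :B 2}
```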
Example: replacing a node with a zipper
instaparse-talk.core=> (-> (zip/vector-zip (as-and-bs "aaaabbbbaabbabbb"))
pprint)
[[:S
[:AB [:A "a" "a" "a" "a"] [:B "b" "b" "b" "b"]]
[:AB [:A "a" "a"] [:B "b" "b"]]
[:AB [:A "a"] [:B "b" "b" "b"]]]
nil]
nil
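The code from the slide itself is not preserved in this transcript, so what follows is a sketch of how a node replacement with clojure.zip might look, not the speaker's original example. The tree is a small hand-written literal in the shape instaparse returns, and the replacement node is arbitrary.

```clojure
(require '[clojure.zip :as zip])

(def tree
  [:S
   [:AB [:A "a" "a"] [:B "b"]]
   [:AB [:A "a"] [:B "b" "b"]]])

(def edited
  (-> (zip/vector-zip tree)
      zip/down                                  ; now at the :S tag
      zip/right                                 ; first [:AB ...] node
      zip/right                                 ; second [:AB ...] node
      (zip/replace [:AB [:A "x"] [:B "y"]])     ; swap in a new subtree
      zip/root))                                ; rebuild the whole tree

edited
;; => [:S [:AB [:A "a" "a"] [:B "b"]] [:AB [:A "x"] [:B "y"]]]
```

The original `tree` is left unchanged; the zipper returns a new tree that shares the untouched parts.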
Example: infix to postfix using case statements
Infix:   1 + 2 * 3 - 4 / 5
Postfix: 1 2 3 * + 4 5 / -
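The slide code for this example is not preserved either. The sketch below shows the general idea under stated assumptions: the tree is written by hand in the shape an arithmetic grammar might produce, and the tag names (:expr, :add, :sub, :mul, :div, :number) and the helper names `ops` and `postfix` are assumptions, not the speaker's code.

```clojure
(require '[clojure.string :as str])

;; Operator symbol for each binary node tag.
(def ops {:add "+" :sub "-" :mul "*" :div "/"})

(defn postfix
  "Returns the tokens of the expression tree in postfix order."
  [[tag & children]]
  (case tag
    :expr   (postfix (first children))
    :number [(first children)]
    (:add :sub :mul :div)
    (let [[left right] children]
      ;; Postfix order: left operand, right operand, then the operator.
      (concat (postfix left) (postfix right) [(ops tag)]))))

;; Hand-written tree for 1 + 2 * 3 - 4 / 5 with the usual precedence.
(def tree
  [:expr
   [:sub
    [:add [:number "1"]
          [:mul [:number "2"] [:number "3"]]]
    [:div [:number "4"] [:number "5"]]]])

(str/join " " (postfix tree))
;; => "1 2 3 * + 4 5 / -"
```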
Example: insta/transform
insta/transform takes a map from node tags to functions:
apply this fn to nodes that match.
Magic!
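A small sketch of insta/transform applied to the a's-and-b's grammar from earlier, assuming instaparse is on the classpath. The transform map and the name `run-lengths` are made up for illustration; each function receives the already-transformed children of a matching node as its arguments.

```clojure
(require '[instaparse.core :as insta])

(def as-and-bs
  (insta/parser
    "S = AB*
     AB = A B
     A = 'a'+
     B = 'b'+"))

(def run-lengths
  (insta/transform
    {:A  (fn [& chars] (count chars))   ; [:A "a" "a" "a"] -> 3
     :B  (fn [& chars] (count chars))   ; [:B "b" "b"]     -> 2
     :AB (fn [a b] {:a a :b b})         ; children are already numbers here
     :S  (fn [& pairs] (vec pairs))}
    (as-and-bs "aaabbbaabb")))

run-lengths
;; => [{:a 3, :b 3} {:a 2, :b 2}]
```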
Wrapping up
● Parsers turn text into trees
● Clojure is great at walking through trees
● Instaparse makes it easy to parse things
○ Programming languages
○ Config files
○ Data
○ Lots more!
The docs for instaparse are amazing. A lot of my examples were lifted straight
from them. Read the docs. They’re great. Everyone on the project did a fantastic job.
https://github.com/Engelberg/instaparse
Thanks!
A cool way to visualize a CFG is with a railroad
diagram
