DSLs for Fun and Profit
Why minilanguages?
Overview
Why DSLs?
What are some of the most commonly used DSLs? Why do they exist?
A practical example from a project. How to create a parser?
DSL = Domain Specific Language
General purpose languages are general; you can do anything with them, but
they're not optimized for anything specific.
Domain specific languages, also called mini languages, are made to do one
thing well.
Sometimes it’s hard to say which box a language falls into.
General vs domain specific: a trade-off
General languages can do many things, and do
them pretty well too.
But they’re not optimized for any particular
thing.
A domain specific language is optimized for
solving a particular problem. This means that it
usually doesn’t work well for other kinds of
problems.
Benefits of a DSL in the right place:
● Can be easier to learn
● Smaller feature set
● Limitations to what the user can do
● Easier to reason about the problem
(metalinguistic abstraction)
● Can reduce complexity
● Easier to divide labor with non-programmers
Some of the most commonly used DSLs for
web programmers
● HTML
○ Reduces complexity a lot. Think of the alternative.
○ But not specific enough, we need more
● CSS
○ Yet another mini-language, because there was a need for it
○ Not enough
○ SASS, LESS
● Markdown
● Django Templating Language, Jinja
● Front-end templating languages of various frameworks
Other examples
● SQL
● Regular expressions
● jQuery selector expressions
● Unix scripting
● TeX, LaTeX, PostScript, XML, XSLT, ...
Example of a custom-made DSL
We need to match job ads to job titles on a web site that uses Django
We want non-programmers to be able to understand and write the rules that
match job ads to job titles
Solution: search expressions
Search expression that matches ads with the word
“driver” in the heading, but not “truck” or “bus”:
(heading ~ driver)
NOT
(
heading ~ truck
heading ~ bus
)
Compiles to a Django query something like this:
Q(heading__icontains='driver') & ~(Q(heading__icontains='truck') | Q(heading__icontains='bus'))
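In plain Python terms, the rule above behaves roughly like the predicate below. This is only a sketch to show the semantics: `icontains` is Django's case-insensitive substring lookup, and the function name `matches_driver_rule` is made up for illustration.

```python
def matches_driver_rule(heading):
    """Plain-Python reading of: (heading ~ driver) NOT (heading ~ truck heading ~ bus)."""
    text = heading.lower()
    return "driver" in text and not ("truck" in text or "bus" in text)


print(matches_driver_rule("Experienced Driver wanted"))   # True
print(matches_driver_rule("Truck driver, night shifts"))  # False
print(matches_driver_rule("Business driver"))             # False: "bus" is a substring of "business"
```

The last call shows a consequence of substring matching that rule writers need to keep in mind.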
Agent for an artist
(
heading ~ agent
)
AND
(
descr ~ music
descr ~ culture
descr ~ art
descr ~ theater
descr ~ entertainment
)
Compiles to:
Q(heading__icontains='agent') &
(Q(descr__icontains='music') |
Q(descr__icontains='culture') ...)
Ok, how to make this work?
A few weeks ago, the recommendation was not to roll your own parser using
regular expressions
Rolling your own by hand without using regular expressions is not necessarily
better. The previous implementation was very hard to understand or maintain.
Lexer and parser generation lex/yacc style
A lexer converts the string, or sequence of characters, to a sequence of tokens
Lex is a tool for generating lexers
A parser generator creates a parser that can go through the sequence of tokens
and generate a structured result from it, such as a syntax tree
Yacc is a parser generator
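To make the lexer step concrete, here is a minimal sketch of a tokenizer written with the standard library only. It is not the PLY lexer from this talk: the names `TOKEN_SPEC` and `tokenize` are hypothetical, and TEXT is simplified to a single word.

```python
import re

# Token types and their patterns; keywords come before TEXT so they win.
TOKEN_SPEC = [
    ("OPENPAREN", r"\("),
    ("CLOSEPAREN", r"\)"),
    ("EQUALS", r"="),
    ("APPROXIMATES", r"~"),
    ("NOT", r"\bNOT\b"),
    ("AND", r"\bAND\b"),
    ("TEXT", r"[-\w]+"),   # simplified: one word, no spaces or quotes
    ("SKIP", r"\s+"),      # whitespace between tokens is dropped
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Turn a string into a list of (token_type, value) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError("Unknown text %r" % text[pos:])
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("(heading ~ driver) NOT (heading ~ bus)"):
    print(token)
```

A parser then only has to deal with token types such as OPENPAREN or NOT, not with individual characters.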
List of tools shamelessly stolen from
previous presentation
Lex/Yacc, Flex/Bison, PLY (Python Lex/Yacc)
ANTLR
pyPEG
tdparser for Python
PEG.js
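PLY lexer for search expressions
import re

import ply.lex as lex

# Token names that the lexer can produce.
tokens = (
    "OPENPAREN",
    "CLOSEPAREN",
    "EQUALS",
    "APPROXIMATES",
    "NOT",
    "AND",
    "TEXT",
)

# Characters skipped between tokens: newline, space and tab.
t_ignore = "\n \t"

t_OPENPAREN = r"\("
t_CLOSEPAREN = r"\)"
t_EQUALS = r"="
t_APPROXIMATES = r"~"
t_NOT = r"NOT"
t_AND = r"AND"
# Either unquoted text that does not start with the keyword AND or NOT,
# or text wrapped in double quotes.
t_TEXT = r"((?!\bAND\b)(?!\bNOT\b)[-\w *.,;:&/]+)|(\"[-\w *.,;:&/]+\")"


def t_error(t):
    raise TypeError("Unknown text '%s'" % (t.value,))


lex.lex(reflags=re.UNICODE)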
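The TEXT pattern is the least obvious part of the lexer, so here is a small sketch that exercises it with the standard `re` module alone. It assumes the pattern reads as below once its escape characters are in place; `TEXT` and `first_text` are names made up for this example.

```python
import re

# Assumed form of the slide's TEXT pattern with its backslashes in place.
TEXT = re.compile(
    r'((?!\bAND\b)(?!\bNOT\b)[-\w *.,;:&/]+)|("[-\w *.,;:&/]+")',
    re.UNICODE,
)


def first_text(source):
    """Return the text matched at the start of source, or None."""
    match = TEXT.match(source)
    return match.group() if match else None


print(repr(first_text("heading ~ driver")))      # 'heading ' (stops at "~", keeps the space)
print(repr(first_text("truck\nheading ~ bus")))  # 'truck' (a newline ends the text)
print(repr(first_text('"bus driver" rest')))     # '"bus driver"' (quoted form)
print(repr(first_text("AND heading")))           # None (a keyword cannot start a TEXT)
print(repr(first_text("driver NOT bus")))        # 'driver NOT bus'
```

The last line shows that the lookaheads only guard the start of a match: a keyword in the middle of unquoted text is swallowed. That is consistent with the examples in this talk, where AND and NOT always follow a closing parenthesis.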
PLY parser
import ply.yacc as yacc


# Each function's docstring is a grammar rule; the body builds the syntax tree node.
def p_expr(p):
    """expr : paren_expr
            | and_expr
            | or_expr
            | not_expr
            | equals
            | approximates"""
    p[0] = p[1]


def p_and_expr(p):
    """and_expr : expr AND expr"""
    p[0] = ('and', p[1], p[3])


def p_or_expr(p):
    """or_expr : expr expr"""
    # Two expressions next to each other, with no keyword, mean OR.
    p[0] = ('or', p[1], p[2])


def p_paren_expr(p):
    """paren_expr : OPENPAREN expr CLOSEPAREN"""
    p[0] = p[2]


def p_not_expr(p):
    """not_expr : expr NOT expr"""
    p[0] = ('andnot', p[1], p[3])


def p_equals(p):
    """equals : field EQUALS value"""
    p[0] = ('=', p[1], p[3])
PLY parser (continued)
def p_approximates(p):
    """approximates : field APPROXIMATES value"""
    p[0] = ('~', p[1], p[3])


def p_field(p):
    """field : TEXT"""
    field = p[1].strip()
    # INDEXED (not shown on the slides) holds the accepted field names.
    if field not in INDEXED:
        raise TypeError("Field {} is not among the accepted ones".format(field))
    p[0] = field


def p_value(p):
    """value : TEXT"""
    p[0] = p[1]


def p_error(p):
    # PLY passes None when the input ends unexpectedly.
    if p is None:
        raise TypeError("Error parsing: unexpected end of expression")
    raise TypeError("Error parsing '%s': %s" % (p.value, p))


_parser = yacc.yacc()


class Parser(object):
    @staticmethod
    def parse(expression):
        if not expression.strip():
            return tuple()
        # Normalize Windows line endings before parsing.
        return _parser.parse(str(expression.replace('\r\n', '\n')))


parser = Parser()
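To make the parser's output concrete, here is a minimal sketch that walks a tuple tree of the shape the grammar actions build. The tree for the “driver” example is written by hand, which is an assumption about what the parser returns for that input. `evaluate` and the dict-based ad are hypothetical helpers, not part of the project, and wildcards and quoting are left out.

```python
# Hand-written tree for: (heading ~ driver) NOT (heading ~ truck heading ~ bus)
DRIVER_RULE = (
    "andnot",
    ("~", "heading", "driver"),
    ("or", ("~", "heading", "truck"), ("~", "heading", "bus")),
)


def evaluate(node, ad):
    """Return True if the ad (a plain dict, for illustration) satisfies the tree."""
    operator = node[0]
    if operator == "and":
        return evaluate(node[1], ad) and evaluate(node[2], ad)
    if operator == "andnot":
        return evaluate(node[1], ad) and not evaluate(node[2], ad)
    if operator == "or":
        return evaluate(node[1], ad) or evaluate(node[2], ad)
    if operator in ("=", "~"):
        field, value = node[1], node[2].strip().lower()
        text = (ad.get(field) or "").lower()
        # '=' is a case-insensitive exact match, '~' a case-insensitive substring match.
        return text == value if operator == "=" else value in text
    raise ValueError("Unknown operator '{}'".format(operator))


print(evaluate(DRIVER_RULE, {"heading": "Taxi Driver"}))  # True
print(evaluate(DRIVER_RULE, {"heading": "Bus driver"}))   # False
```

Every backend for the DSL follows this pattern: one branch per node type, recursing into the child nodes.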
Now we can convert the
syntax tree to a Q
expression
from django.db.models import Q


def to_q_expression(expression_str):
    try:
        expression = parser.parse(expression_str)
    except TypeError as e:
        # Lexer and parser errors are raised as TypeError; report them as ValueError.
        raise ValueError(str(e))
    return _to_q_expression(expression)


def _to_q_expression(expression):
    if not expression:
        return Q()
    operator = expression[0]
    if operator == 'and':
        return _to_q_expression(expression[1]) & _to_q_expression(expression[2])
    elif operator == 'andnot':
        return _to_q_expression(expression[1]) & ~_to_q_expression(expression[2])
    elif operator == 'or':
        return _to_q_expression(expression[1]) | _to_q_expression(expression[2])
    elif operator in ('=', '~'):
        field = expression[1].strip()
        value = expression[2]
        if value.startswith(u'"') and value.endswith(u'"'):
            value = value.strip(u'"')
        field_query_type = 'iexact' if operator == '=' else 'icontains'
        # A leading or trailing * acts as a wildcard and overrides the lookup type.
        if value.startswith(u'*') and value.endswith(u'*'):
            value = value[1:-1]
            field_query_type = 'icontains'
        elif value.startswith(u'*'):
            value = u'.+{}$'.format(value[1:])
            field_query_type = 'iregex'
        elif value.endswith(u'*'):
            value = u'^{}.*'.format(value[:-1])
            field_query_type = 'iregex'
        return Q(**{u'{}__{}'.format(field, field_query_type): value})
    raise ValueError("Unknown operator '{}'".format(operator))
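The leaf branch is easier to follow in isolation. The sketch below mirrors its wildcard handling without Django, returning the lookup type and value that would end up in the `Q` keyword. The helper name `lookup_for` is hypothetical.

```python
def lookup_for(operator, value):
    """Return (lookup_type, value) for a leaf node, mirroring the rules above."""
    if value.startswith('"') and value.endswith('"'):
        value = value.strip('"')
    if value.startswith("*") and value.endswith("*"):
        return "icontains", value[1:-1]
    if value.startswith("*"):
        return "iregex", ".+{}$".format(value[1:])
    if value.endswith("*"):
        return "iregex", "^{}.*".format(value[:-1])
    return ("iexact" if operator == "=" else "icontains"), value


for operator, value in [("=", "driver"), ("~", "driver"), ("=", "*driver*"),
                        ("~", "*driver"), ("~", "driver*"), ("=", '"bus driver"')]:
    lookup, cleaned = lookup_for(operator, value)
    print("heading__{} = {!r}".format(lookup, cleaned))
```

Note that the value is placed into the regular expression unescaped, so a character such as `.` acts as a regex wildcard; `re.escape` would be one way to treat it literally.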
These query expressions have a weakness: think about the case where
you're matching hundreds of job titles against thousands of different job ads.
That takes many queries.
Benefit of having an abstract syntax tree: we can turn it into something
other than query expressions. Say, Python comparison functions
def _to_python_expression(expression):
    if not expression:
        return u''
    operator = expression[0]
    if operator == 'and':
        return u'({} and {})'.format(_to_python_expression(expression[1]), _to_python_expression(expression[2]))
    elif operator == 'andnot':
        return u'({} and not {})'.format(_to_python_expression(expression[1]), _to_python_expression(expression[2]))
    elif operator == 'or':
        return u'({} or {})'.format(_to_python_expression(expression[1]), _to_python_expression(expression[2]))
    elif operator in ('=', '~'):
        field = expression[1]
        value = expression[2]
        if value.startswith(u'"') and value.endswith(u'"'):
            value = value.strip(u'"')
        value = value.lower()
        # Each template guards against an empty field before calling .lower() on it.
        field_comparison = (u'{field}.lower() == u"{value}" if {field} else False'
                            if operator == '=' else
                            u'u"{value}" in {field}.lower() if {field} else False')
        if value.startswith(u'*') and value.endswith(u'*'):
            value = value[1:-1]
            field_comparison = u'u"{value}" in {field}.lower() if {field} else False'
        elif value.startswith(u'*'):
            # "*text" means the field ends with text, as in the Q expression version.
            value = value[1:]
            field_comparison = u'{field}.lower().endswith(u"{value}") if {field} else False'
        elif value.endswith(u'*'):
            # "text*" means the field starts with text.
            value = value[:-1]
            field_comparison = u'{field}.lower().startswith(u"{value}") if {field} else False'
        # Django-style "a__b" field paths become dotted attribute paths.
        field_reference = u'entry.access_attr("{}")'.format(field.replace(u'__', u'.'))
        comparison = field_comparison.format(field=field_reference, value=value)
        return u'({})\n'.format(comparison)
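One way such a generated string could be used is to compile it once and then run it against many ads in memory. The sketch below makes several assumptions: the `Entry` class and its `access_attr` are stand-ins for project code not shown in the talk, `compile_rule` is a hypothetical helper, and `GENERATED` is a hand-written example of the kind of string the function above would emit for the “driver” rule.

```python
class Entry:
    """Hypothetical stand-in for the project's job-ad object."""

    def __init__(self, **fields):
        self._fields = fields

    def access_attr(self, path):
        # The real access_attr is not shown in the talk; here it is a dict lookup.
        return self._fields.get(path)


# Hand-written example of a generated expression for the "driver" rule.
GENERATED = (
    '((u"driver" in entry.access_attr("heading").lower() '
    'if entry.access_attr("heading") else False) '
    'and not ((u"truck" in entry.access_attr("heading").lower() '
    'if entry.access_attr("heading") else False) '
    'or (u"bus" in entry.access_attr("heading").lower() '
    'if entry.access_attr("heading") else False)))'
)


def compile_rule(source):
    """Compile the expression once and return a function of one entry."""
    code = compile(source, "<rule>", "eval")

    def rule(entry):
        # No builtins are exposed; the expression only sees the entry.
        return bool(eval(code, {"__builtins__": {}}, {"entry": entry}))

    return rule


is_driver = compile_rule(GENERATED)
ads = [Entry(heading="Taxi driver"), Entry(heading="Truck driver"), Entry(heading=None)]
print([is_driver(ad) for ad in ads])  # [True, False, False]
```

Evaluating generated code is a trust decision: it is only reasonable here because the lexer restricts which characters can reach the string.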
Could you benefit from adding a domain
specific language to a current project of
yours?
How?


DSLs for fun and profit by Jukka Välimaa
