1. Creating a Compiler with
Irony and RunSharp
James M. Curran
NYC CodeCamp
Sept 13, 2014
3. Who the Heck am I?
Simple Country Programmer
20+ Years in Software Development
Assembler, C, C++, C#
Worked in Medical, Financial, Online
Retailing, and Embedded systems
Microsoft MVP C++ (1994-2004)
Currently Itinerant Programmer at One
Call Medical
Write blog: HonestIllusion.com
Designer: NJTheater.Com
Resume'n'stuff: NovelTheory.com
6. “Naked Came the Null Delegate”
➲ Collaborative blog story telling a tale of sex, betrayal
and .NET coding.
➲ Serialized across the blogs of several noted .NET
writers including Charles Petzold and Jon Skeet.
➲ Dormant since 2010, but new writers wanted.
➲ http://nakedcamethenulldelegate.wordpress.com
7. Agenda
The Language
The Parser
The Code Generators (plural)
The Command-line interface.
8. Shakespeare Programming Language -
History
Designed by Jon Åslund and Karl Hasselström in 2001.
The design goal was to make a language with beautiful source
code that resembled Shakespeare plays.
Original compiler written in YACC (actually Bison), producing C
code.
I wrote the C# compiler in Sept 2013. Apparently I was the
first person to touch the code in 12 years.
Since then, JavaScript, Python, Perl, and Java versions have been
open-sourced and http://shakespearelang.org/ was launched.
9. Variables
Must be the name of an actual character from one of
Shakespeare's plays.
That’s exactly 152 specific allowable variable names.
Some names are more than one word!
Only data type is an integer, but it can be used for a
character.
It's really a stack of integers.
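A minimal sketch of what "a stack of integers" could look like in C#. The `Character` class and its member names are hypothetical, chosen for illustration; this is not the class the compiler actually uses. It assumes the usual SPL reading in which "Remember" pushes a value and "Recall" pops one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one SPL variable; not taken from the real compiler.
public sealed class Character
{
    private readonly Stack<int> _memory = new Stack<int>();

    public Character(string name) { Name = name; }

    public string Name { get; }

    // The character's current value (an integer that may stand for a char).
    public int Value { get; set; }

    // "Remember ...": push a value onto this character's stack.
    public void Remember(int value) => _memory.Push(value);

    // "Recall ...": pop the top of the stack into the current value.
    public void Recall() => Value = _memory.Pop();
}

public static class CharacterDemo
{
    public static void Main()
    {
        var othello = new Character("Othello") { Value = 72 };
        othello.Remember(othello.Value);        // stack now holds 72
        othello.Value = 105;
        Console.WriteLine((char)othello.Value); // prints "i"
        othello.Recall();                       // value is 72 again
        Console.WriteLine((char)othello.Value); // prints "H"
    }
}
```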
10. Numbers/Constants
Nouns that are “nice” or neutral equal 1.
Nouns that are “not nice” equal -1.
Adjectives (nice or not) multiply by 2.
Examples:
“big hairy hound” = 2 * 2 * (+1) = 4
“sorry little codpiece” = 2 * 2 * (-1) = -4
Can be combined with “Sum of ___ and ___”,
“Difference between”, “Square of”, “Cube of”
41 neutral, 13 positive, and 25 negative nouns
20 neutral, 36 positive and 32 negative adjectives
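The arithmetic above can be sketched in a few lines of C#. The class name `SplConstants` and the tiny word lists are assumptions for illustration only; the word classifications simply follow the two examples on this slide, and the real language loads much longer lists.

```csharp
using System;
using System.Collections.Generic;

public static class SplConstants
{
    // Illustrative subset only, classified as in the slide's examples.
    private static readonly HashSet<string> Adjectives =
        new HashSet<string> { "big", "hairy", "sorry", "little" };

    private static readonly Dictionary<string, int> Nouns =
        new Dictionary<string, int> { { "hound", 1 }, { "codpiece", -1 } };

    // Evaluates "<adjective>* <noun>": the noun gives +1 or -1,
    // and every adjective in front of it doubles the value.
    public static int Evaluate(string phrase)
    {
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0)
            throw new ArgumentException("Empty phrase.", nameof(phrase));

        string noun = words[words.Length - 1];
        if (!Nouns.TryGetValue(noun, out int value))
            throw new ArgumentException("Unknown noun: " + noun, nameof(phrase));

        for (int i = 0; i < words.Length - 1; i++)
        {
            if (!Adjectives.Contains(words[i]))
                throw new ArgumentException("Unknown adjective: " + words[i], nameof(phrase));
            value *= 2;
        }
        return value;
    }

    public static void Main()
    {
        Console.WriteLine(Evaluate("big hairy hound"));       // prints 4
        Console.WriteLine(Evaluate("sorry little codpiece")); // prints -4
    }
}
```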
11. Input and Output
"Open your heart" outputs the variable's value as a number.
"Speak your mind" outputs the corresponding ASCII character.
"Listen to your heart" causes the variable to receive input from the user
as a number.
"Open your mind" causes it to receive input as a character.
Conditional Statements and Gotos
An if/then statement is phrased as a question posed by a
character.
The words "as [any adjective] as" represent a test for equality;
"better" and "worse" correspond to greater than and less than.
A subsequent line, starting "if so" or "if not," determines what
happens in response to the truth or falsehood of the original
condition.
A goto statement begins "Let us," "We shall," or "We must,"
continues "return to" or "proceed to," and then gives an act or
scene.
12. Example
Outputting Input Reversedly.
Othello, a stacky man.
Lady Macbeth, who pushes him around till he pops.
Act I: The one and only.
Scene I: In the beginning, there was nothing.
[Enter Othello and Lady Macbeth]
Othello:
You are nothing!
Scene II: Pushing to the very end.
Lady Macbeth:
Open your mind! Remember yourself.
Othello:
You are as hard as the sum of yourself and a stone wall. Am I as horrid
as a flirt-gill?
Lady Macbeth:
If not, let us return to scene II. Recall your imminent death!
13. The original compiler
Like most languages, specified in Backus–Naur Form
(BNF)
Compiler coded in YACC (“Yet Another Compiler
Compiler”) (actually GNU’s version, Bison)
Its object code (output) is C source code, which would
then need to be compiled.
14. BNF & YACC
BNF is made up of “Terminals” and “Non-terminals”
Terminal symbols are the elementary symbols of the
language
Non-terminals are made by combining terminals and
other non-terminals.
Example:
digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
integer = ['-'] digit+
YACC is an LALR parser generator.
YACC source files are BNF with C code attached to
some non-terminals.
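To make the two example rules concrete, here is a hand-written recognizer for them in C#. This is only a sketch of what the rules accept; it is not what YACC generates (YACC emits a table-driven LALR parser), and the names `IntegerRule` and `IsInteger` are made up for this example.

```csharp
using System;

public static class IntegerRule
{
    // digit = '0' | '1' | ... | '9'   (a terminal-level rule)
    private static bool IsDigit(char c) => c >= '0' && c <= '9';

    // integer = ['-'] digit+          (a non-terminal built from digit)
    public static bool IsInteger(string text)
    {
        if (string.IsNullOrEmpty(text)) return false;

        int pos = 0;
        if (text[pos] == '-') pos++;                             // optional sign

        int firstDigit = pos;
        while (pos < text.Length && IsDigit(text[pos])) pos++;   // digit+

        // At least one digit was consumed and nothing is left over.
        return pos > firstDigit && pos == text.Length;
    }

    public static void Main()
    {
        Console.WriteLine(IsInteger("-42")); // prints True
        Console.WriteLine(IsInteger("-"));   // prints False
        Console.WriteLine(IsInteger("4x2")); // prints False
    }
}
```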
16. 1st tool: Irony
Irony is a development kit for implementing languages
on the .NET platform.
Name is a play on IronPython, IronRuby, IronScheme :
“Use Irony to create your own Iron* language”
Tries to replicate BNF in C# code via operator
overloading.
Project home: https://irony.codeplex.com/
Available on NuGet: “irony”
17. Irony Example:
public class ShakespeareGrammar : InterpretedLanguageGrammar
{
    public ShakespeareGrammar()
        : base(false)
    {
        string[] endSymbols = { ".", "?", "!" };
        KeyTerm COLON = ToTerm(":", "colon");
        KeyTerm COMMA = ToTerm(",");
        // :
        var Act = new NonTerminal("Act");
        var Acts = new NonTerminal("Acts");
        Acts.Rule = MakePlusRule(Acts, Act);
        var Play = new NonTerminal("Play");
        Play.Rule = Title + CharacterDeclarationList + Acts;
19. Multi-Word keywords
BnfTerm MultiWordTermial(string term)
{
    var parts = term.Split(' ');
    if (parts.Length == 1)
        return ToTerm(term);
    var nonterm = new NonTerminal(term);
    var expr = new BnfExpression(ToTerm(parts[0]));
    foreach (var part in parts.Skip(1))
    {
        expr += ToTerm(part);
    }
    nonterm.Rule = expr;
    return nonterm;
}
20. Equivalent Keywords
NonTerminal BuildTerminal(string name, string filename)
{
    var termList = new NonTerminal(name);
    // `assembly` is presumably a field holding the assembly with the embedded word lists.
    var strm = assembly.GetManifestResourceStream(filename);
    var sw = new StreamReader(strm);
    var block = sw.ReadToEnd();
    // One word (or multi-word phrase) per line.
    var lines = block.Split('\n', '\r');
    if (lines.Any())
    {
        var expr = new BnfExpression(MultiWordTermial(lines[0]));
        foreach (var line in lines.Skip(1))
        {
            // Each remaining entry becomes another alternative of the rule.
            if (!String.IsNullOrWhiteSpace(line))
                expr |= MultiWordTermial(line);
        }
        termList.Rule = expr;
    }
    return termList;
}
22. Code Generation
• Ok, You’ve got a parser. Now what?
• Irony documentation drops off here.
• Here’s what I learned single-stepping with the
debugger.
23. Create AST tree from source.
var grammar = new ShakespeareGrammar();
var parser = new Parser(grammar);
var text = File.ReadAllText(filename);
var tree = parser.Parse(text, filename);
var app = new ScriptApp(parser.Language);
var thread = new ScriptThread(app);
var output = tree.Root.AstNode.Evaluate(thread);
24. Walking the Tree
As tree.Root.AstNode.Evaluate(thread) runs,
node.DoEvaluate() is called for each node in the tree.
By default, the tree holds plain AstNode objects.
So we need to put our own node types in the tree.
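The Evaluate/DoEvaluate pattern can be sketched without Irony. Everything below is a hypothetical stand-in: `Node`, `NounNode`, and `AdjectiveNode` are not Irony types, and a plain list replaces the ScriptThread. The point is only that a public Evaluate dispatches to a per-type DoEvaluate override, and that each override decides when to evaluate its children.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a base AST node; not Irony's AstNode.
public abstract class Node
{
    public List<Node> Children { get; } = new List<Node>();

    // Shared entry point: records the visit, then dispatches to the subclass.
    public object Evaluate(List<string> visited)
    {
        visited.Add(GetType().Name);
        return DoEvaluate(visited);
    }

    protected abstract object DoEvaluate(List<string> visited);
}

// A noun emits the text of its constant: "1" or "(-1)".
public sealed class NounNode : Node
{
    private readonly int _sign;
    public NounNode(int sign) { _sign = sign; }

    protected override object DoEvaluate(List<string> visited) =>
        _sign < 0 ? "(-1)" : "1";
}

// An adjective emits "2*" followed by whatever its child emits.
public sealed class AdjectiveNode : Node
{
    public AdjectiveNode(Node inner) { Children.Add(inner); }

    protected override object DoEvaluate(List<string> visited) =>
        "2*" + Children[0].Evaluate(visited);
}

public static class TreeWalkDemo
{
    public static void Main()
    {
        // Roughly "sorry little codpiece": two adjectives around a negative noun.
        Node tree = new AdjectiveNode(new AdjectiveNode(new NounNode(-1)));
        var visited = new List<string>();

        Console.WriteLine(tree.Evaluate(visited));     // prints 2*2*(-1)
        Console.WriteLine(string.Join(", ", visited)); // prints AdjectiveNode, AdjectiveNode, NounNode
    }
}
```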
25. ASTNode (Fixed)
var Act = new NonTerminal("Act", typeof(ActNode));
var NegativeConstant = new NonTerminal("NegativeConstant",
    typeof(NegativeConstantNode));

public class NegativeConstantNode : AstNode
{
    protected override object DoEvaluate(ScriptThread thread)
    {
        var tw = thread.tc().Writer;
        if (AstNode1 is NegativeNounNode)
            return "(-1)";
        else
            return string.Format("2*{0}",
                AstNode2.ToString(thread));
    }
}
27. A second code generator
That works fine if you only need one code
generator.
But I wanted to write a C code generator, a C# code
generator, and an MSIL code generator.
{more digging with debugger}
I’ve got a great tool for switching between several
generators. Check the GitHub…..
28. 2nd tool: RunSharp
RunSharp is a layer above the standard .NET
Reflection.Emit API, allowing you to generate and compile
dynamic code at runtime very quickly and efficiently
(unlike using CodeDOM and invoking the C# compiler).
Project home : https://code.google.com/p/runsharp/
(but it’s stagnant, so check github for forks)
Not on NuGet.
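For a sense of what RunSharp sits on top of, here is a small sketch that uses raw Reflection.Emit directly, with no RunSharp involved. It builds a dynamic method that computes 2 * 2 * (-1), the value of "sorry little codpiece". The class and method names (`EmitDemo`, `BuildConstant`) are made up for this example.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Emits IL equivalent to: int SorryLittleCodpiece() { return 2 * 2 * -1; }
    public static Func<int> BuildConstant()
    {
        var method = new DynamicMethod("SorryLittleCodpiece", typeof(int), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldc_I4_2);  // push 2   (first adjective)
        il.Emit(OpCodes.Ldc_I4_2);  // push 2   (second adjective)
        il.Emit(OpCodes.Mul);       // 2 * 2
        il.Emit(OpCodes.Ldc_I4_M1); // push -1  (negative noun)
        il.Emit(OpCodes.Mul);       // 4 * -1
        il.Emit(OpCodes.Ret);       // return the value on the stack

        return (Func<int>)method.CreateDelegate(typeof(Func<int>));
    }

    public static void Main()
    {
        Func<int> constant = BuildConstant();
        Console.WriteLine(constant()); // prints -4
    }
}
```

Every operation is spelled out as an opcode on an evaluation stack, which is the bookkeeping a higher-level layer such as RunSharp is meant to hide.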
29.
public class PlayNode : AstNode
{
    protected override object DoEvaluate(Irony.Interpreter.ScriptThread thread)
    {
        var rs = thread.rs();
        var ag = rs.AssemblyGen;
        using (ag.Namespace("Shakespeare.Program"))
        {
            TypeGen scriptClass = ag.Public.Class("Script", typeof(Shakespeare.Support.Dramaturge));
            {
                rs.Script = scriptClass;
                var console = typeof(Console);
                CodeGen ctor = scriptClass.Public.Constructor();
                ctor.InvokeBase(Static.Property(console, "In"), Static.Property(console, "Out"));
                {
                }
                CodeGen action = scriptClass.Public.Void("Action");
                {
                    rs.Action = action;
                    action.At(Span);
                    AstNode1.Evaluate(thread);
                    var cdl = AstNode2 as CharacterDeclarationListNode;
                    foreach (var ch in cdl.Characters)
                        ch.Evaluate(thread);
                    AstNode3.Evaluate(thread);
                }
            }
            scriptClass.Complete();
            TypeGen MyClass = ag.Public.Class("Program");
            CodeGen Main = MyClass.Public.Static.Void("Main").Parameter<string[]>("args");
            {
                var script = Main.Local(Exp.New(scriptClass.GetCompletedType()));
                Main.Invoke(script, "Action");
            }
        }
        ag.Save();
        return ag.GetAssembly().GetName().Name;
    }
}
30.
public class NegativeConstantNode : AstNode
{
    protected override object DoEvaluate(ScriptThread thread)
    {
        if (AstNode1 is NegativeNounNode)
            return ((Operand)0 - 1);
        else
            return (AstNode2.Evaluate(thread) as Operand) * 2;
    }
}