1. A brand new way of
Perl product, and its future
Masaaki Goshima (@goccy54)
mixi Inc.
2. Me
• Name : Masaaki Goshima (@goccy54)
• Job : mixi (Tanpopo Group)
– Development for Developers
– Developing a platform for refining legacy software
– Managing Jenkins
• Personally, interested in Perl and developing
yet another Perl, gperl
– (YAPC::Asia 2012) "Meeting Perl, Making a Perl" (Perlと出会いPerlを作る)
3. gperl
• Fastest perl-like language
– Aiming for perl5-compatible syntax
– Written in C++
• The last commit to the repository was 11 months
ago...
– "Has it already departed!?"
4. NO! IT'S STILL ALIVE!!!
• However, gperl itself has departed
• Its modules (lexical analyzer, parser, code
generator, interpreter) are still useful
– Lexical analyzer: implementing a perl5 syntax
highlighter
– Parser: static analysis tool (used for
refining legacy perl5 code)
Input -> Lexer -> Parser -> CodeGenerator -> VM (JIT) -> Output
5. This presentation's Goal
• Today, we're going to present the "spinout
projects" for each module
[Diagram: gperl's Lexer, Parser, and CodeGenerator spin out as Compiler::Lexer, Compiler::Parser, and Compiler::CodeGenerator::LLVM; the next module is shown as ???::???::???]
6. Agenda
• Section 1
– Introduction of Compiler::* Modules
• Section 2
– (Application1) Running on multiple platforms
• Section 3
– (Application2) Static analysis tool
7. Section1
• Introduction of Compiler::* Modules
1. Compiler::Lexer
– Lexical Analyzer for Perl5
2. Compiler::Parser
– Create Abstract Syntax Tree for Perl5
3. Compiler::CodeGenerator::LLVM
– Create LLVM IR for Perl5
8. Compiler::Lexer
• Lexical Analyzer for Perl5
• Features
– Simple (returns an array reference of token objects)
– Fast (10 to 100 times faster than PPI::Tokenizer)
• Written in C++
• Perfect hashing for reserved keywords
• Memory pool for token objects
– Readable code
• Does not use a parser generator such as yacc/bison
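As an illustration of the "array reference of token objects" output shape, here is a minimal pure-Perl sketch. It is not Compiler::Lexer's implementation (which is written in C++); the rule table, the `tokenize` function, and the `name`/`data`/`line` fields are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical token kinds, tried in order; the first matching pattern wins.
my @RULES = (
    [ Var    => qr/[\$\@%]\w+/ ],
    [ Int    => qr/\d+/        ],
    [ Key    => qr/\w+/        ],
    [ Symbol => qr/\S/         ],   # any other single non-space character
);

# Returns an array reference of token hashes: { name, data, line }.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    my $line = 1;
    pos($src) = 0;
    while (pos($src) < length $src) {
        if ($src =~ /\G(\s+)/gc) {      # skip whitespace, counting newlines
            my $ws = $1;
            $line += ($ws =~ tr/\n//);
            next;
        }
        for my $rule (@RULES) {
            my ($name, $re) = @$rule;
            if ($src =~ /\G($re)/gc) {
                push @tokens, { name => $name, data => $1, line => $line };
                last;
            }
        }
    }
    return \@tokens;
}

my $tokens = tokenize("my \$x = 42;\nprint \$x;\n");
printf "%-6s %-5s line %d\n", @{$_}{qw(name data line)} for @$tokens;
```

A real Perl5 lexer needs context to classify many characters, which this table-driven sketch deliberately ignores.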
10. Tips
• Patterns that cannot be tokenized reliably
1. func*v
• "*" is Glob or Mul
2. func/ ..
• "/" is RegexDelimiter or Div
3. func<<FLAG
• "FLAG" is HereDocumentTag or Constant
The lexer may get these wrong
11. Future plan
• Support recursive tokenizing
– Covers a wider range of parsing applications
– Recursive parsing takes time; optimizing it
is the next piece of future work
func*v => func(*v) or func() * v
func/ … => func(/../) or func() / v
func<<FLAG => func(<<FLAG) or func() << FLAG
Which reading is correct depends on the declaration: sub func($) {} or sub func() {}
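The ambiguity can be observed in perl itself: the prototype of an already-declared sub decides how the following `*` or `/` is read. The sketch below uses two hypothetical subs, `nullary` and `unary`, invented for this example; it shows perl's own behavior, not Compiler::Lexer's.

```perl
use strict;
use warnings;

our $v = 2;                      # package variable, so *v names a real glob

sub nullary() { 10 }             # empty prototype: takes no arguments
sub unary($)  { "arg=$_[0]" }    # one scalar argument

my $product  = nullary * $v;     # read as nullary() * $v  (multiplication)
my $quotient = nullary / $v;     # read as nullary() / $v  (division)
my $glob_arg = unary *v;         # read as unary(*v)       (glob)

local $_ = "abc";
my $match_arg = unary /b/;       # read as unary(/b/)      (regex match on $_)

print "$product $quotient $glob_arg $match_arg\n";
```

A lexer that sees only `func*v`, without the declaration of `func`, has no way to choose between these readings, which is why tokenizing has to consult earlier declarations.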
12. Compiler::Parser
• Create Abstract Syntax Tree for Perl5
• Features
– Fast (faster than PPI)
– Readable code
• Simple design
• Does not use a parser generator
Virtual machine code can be generated
by walking the tree in post-order
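[Diagram: AST for $a->{b}->[0]->c(@args). Three nested "->" nodes: the innermost has left "$a" and right "{}" (data: b); the middle one has the innermost "->" as left and right "[]" (data: 0); the outermost has the middle "->" as left and right "c" (args: @args).]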
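A minimal sketch of the post-order walk for this tree, assuming a hypothetical node layout (plain hashes with `token` plus optional `left`, `data`, `args`, and `right` children). It does not reflect Compiler::Parser's actual node classes.

```perl
use strict;
use warnings;

# Hypothetical node constructor: a token plus named children.
sub node { my ($token, %kids) = @_; return { token => $token, %kids } }

# AST for: $a->{b}->[0]->c(@args)
my $ast = node('->',
    left => node('->',
        left => node('->',
            left  => node('$a'),
            right => node('{}', data => node('b')),
        ),
        right => node('[]', data => node('0')),
    ),
    right => node('c', args => node('@args')),
);

# Post-order: visit every child first, then emit the node itself.
sub post_order {
    my ($n, $out) = @_;
    return $out unless $n;
    post_order($n->{$_}, $out) for qw(left data args right);
    push @$out, $n->{token};
    return $out;
}

my $order = post_order($ast, []);
print join(' ', @$order), "\n";   # prints: $a b {} -> 0 [] -> @args c ->
```

Because operands are always emitted before their operator, the resulting sequence maps directly onto stack-style virtual machine instructions.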
13. Example
my $v = sin $v + $v * $v / $v - $v && $v
my ($v = (sin((($v + (($v * $v) / $v)) - $v)) && $v))
[Diagram: AST root "=" with left child "$v" and right child "&&", which has its own left and right children.]
The most difficult part of a Perl5 parser:
hidden (optional) parentheses
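The hidden parentheses can be cross-checked against perl's own parser with the core module B::Deparse and its `-p` option. This is only a way to see the reference grouping; it does not use Compiler::Parser, and the variable `$w` is introduced for this example.

```perl
use strict;
use warnings;
use B::Deparse;

my $v = 2;

# Same expression shape as the slide, assigned to a separate variable $w.
my $code = sub { my $w = sin $v + $v * $v / $v - $v && $v; return $w };

# -p makes B::Deparse print the parentheses that the parser implied.
my $text = B::Deparse->new('-p')->coderef2text($code);
print $text, "\n";
```

The printed body shows `sin` applied to the whole arithmetic expression, with `&&` applied to the result.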
14. Example 2) Can parse BlackPerl
BEFOREHAND: close door, each window &exit;
wait until time;
open spell book; study;
read (spell, $scan, select); tell us;
write it, print(the hex) while each watches,
reverse length, write again;
kill spiders, pop them, chop,
split, kill them. unlink arms, shift,
wait and listen (listening, wait).
sort the flock (then,
warn "the goats", kill "the sheep");
kill them, dump qualms,
shift moralities, values aside, each one;
die sheep; die (to, reverse the => system
you accept (reject, respect));
next step, kill next sacrifice,
each sacrifice, wait, redo ritual
until "all the spirits are pleased";
do it ("as they say").
do it(*everyone***must***participate***in***forbidden**s*e*x*).
return last victim; package body;
exit crypt
(time, times&"half a time")
&close it.
select (quickly) and
warn next victim;
AFTERWARDS: tell nobody.
wait, wait until time;
wait until next year, next decade;
sleep, sleep, die yourself,
die @last
BlackPerl for Perl5
15. Future plan
• Replace PPI!!!
– Support the remaining expressions
• ThreeTermOperator, Glob, given/when, goto, etc.
– Provide methods compatible with those of PPI
• PPI::Document::find
• PPI::Document::prune
16. Compiler::CodeGenerator::LLVM
• Create LLVM IR for Perl5
[Diagram: the AST for $a->{b}->[0]->c(@args) from Compiler::Parser, with its ten nodes numbered 1 to 10 in the order the code generator visits them.]
Compiler::Parser -> Compiler::CodeGenerator::LLVM -> LLVM IR -> running with JIT
17. LLVM (Low Level Virtual Machine)
• Compiler infrastructure
• Better than GNU GCC
[Diagram: LLVM at the center, surrounded by the labels clang, C/C++/Objective-C, MacRuby, py2llvm, JavaScript, Other Language, and Native Code (X86, ARM, Power).]
18. How to use
• Dependency: clang/llvm version 3.2 or 3.3
use Compiler::Lexer;
use Compiler::Parser;
use Compiler::CodeGenerator::LLVM;
my $code = 'print "Hello, World\n";'; # sample input (placeholder, not from the slide)
my $tokens = Compiler::Lexer->new('')->tokenize($code); # tokenize the source
my $ast = Compiler::Parser->new->parse($tokens); # build the AST
my $generator = Compiler::CodeGenerator::LLVM->new();
my $llvm_ir = $generator->generate($ast); # generate LLVM IR
$generator->debug_run($ast); # run with JIT
21. 1. Running on Web Browser
• Recently, JavaScript is no longer the only way to
write code that runs in a web browser
– e.g. CoffeeScript, JSX, TypeScript, Dart
• I wrote a module for Perl5!
SEE ALSO: perl.js (@gfx)
22. Compiler::Tools::Transpiler
• Translates Perl5 code to JavaScript code
– (Step 1) Translate Perl5 code to LLVM IR with
the Compiler::* modules
– (Step 2) Translate LLVM IR to JavaScript code with
emscripten
[Diagram: Perl5 -> LLVM -> emscripten]
23. How to use
use Compiler::Tools::Transpiler;
my $transpiler = Compiler::Tools::Transpiler->new({
    engine       => 'JavaScript',
    library_path => ['lib']
});
# read the Perl5 source to translate
open my $fh, '<', 'target_application.pl' or die $!;
my $perl5_code = do { local $/; <$fh> };
close $fh;
my $javascript_code = $transpiler->transpile($perl5_code);
# write the generated JavaScript
open $fh, '>', 'target_application.js' or die $!;
print $fh $javascript_code;
close $fh;
※ Requires clang 3.2 and LLVM 3.2 (not 3.3 or later), because emscripten does not yet
support LLVM 3.3 or later
24. ROADMAP/Repository
• Fix some emscripten bugs
• Supporting accessors for HTML objects
• Supporting Canvas API
• etc..
• Repository
– https://github.com/goccy/p5-Compiler-Tools-Transpiler.git
25. 2. Running on iOS and OSX
• Recently, tools for developing iOS or OS X
applications in lightweight languages have
been released
– RubyMotion
– mocl
– Titanium
• I wrote a module for Perl5!
29. Current Status
• Not yet supported
– Symbol management system: Glob
– Garbage collection
– Regexp
– Ternary (three-term) operator
– etc.
• Not supported
– Dynamic evaluation such as eval, s/../../e, require
• Supported
– Primitive types: Int, Double, String, Array,
Hash, ArrayReference, HashReference,
IOHandler, BlessedObject …
– Operators: binary operators, unary operators …
– Built-in functions: print, say, shift, push, sin,
open …
– Variable definition, function definition
– OOP system: package, bless, @ISA
30. ROADMAP
• Supporting deployment on device
• Preparing debugging environment
– (stdout/stderr/gdb..etc)
• Supporting existing framework (e.g. UIKit)
• Supporting CPAN modules written in pure Perl
Release planned for April 2014!
31. Welcome your Contribution!
• I want to discuss the design and objectives
– In particular, I need advice on "What does the Perl
community like?" (naming rules, etc.)
https://github.com/goccy/perl-motion.git
32. Conclusion of Sections 1 and 2
• Perl5 code runs anywhere!!!
– iOS, OS X, web browsers, and others
• Contributions are welcome
– Compiler::Lexer
• https://github.com/goccy/p5-Compiler-Lexer.git
– Compiler::Parser
• https://github.com/goccy/p5-Compiler-Parser.git
– Compiler::CodeGenerator::LLVM
• https://github.com/goccy/p5-Compiler-CodeGenerator.git
– Compiler::Tools::Transpiler
• https://github.com/goccy/p5-Compiler-Tools-Transpiler.git
– PerlMotion
• https://github.com/goccy/perl-motion.git
34. Compiler::Tools::CopyPasteDetector
• Detects copy-and-paste in Perl5 code
• Features
– Fast (detection engine written in C++ with
POSIX threads)
– Accurate (based on Compiler::Lexer and
B::Deparse)
– Feature-rich visualizer
(metrics, scattergram, etc.)
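To make the idea of copy-and-paste detection concrete, here is a generic sketch: normalize each line, hash every window of consecutive lines, and report windows whose hashes collide. This is a common textbook approach and not the module's actual algorithm; `normalize`, `find_clones`, and the window size are all assumptions made for this example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Naive normalization: drop comments, hide variable names, strip whitespace.
sub normalize {
    my ($line) = @_;
    $line =~ s/#.*$//;
    $line =~ s/[\$\@%]\w+/\$VAR/g;
    $line =~ s/\s+//g;
    return $line;
}

# Returns [first_index, duplicate_index] pairs over the normalized,
# non-empty lines, for every repeated window of $window lines.
sub find_clones {
    my ($source, $window) = @_;
    my @lines = grep { length } map { normalize($_) } split /\n/, $source;
    my (%seen, @clones);
    for my $i (0 .. @lines - $window) {
        my $key = md5_hex(join "\n", @lines[$i .. $i + $window - 1]);
        push @clones, [$seen{$key}, $i] if exists $seen{$key};
        $seen{$key} //= $i;
    }
    return \@clones;
}

my $source = <<'END';
my $total = 0;
$total += $_ for @prices;
print $total;
my $sum = 0;
$sum += $_ for @costs;
print $sum;
END

my $clones = find_clones($source, 3);
print "lines from $_->[0] repeat at $_->[1]\n" for @$clones;
```

Working on tokens or on B::Deparse output instead of raw lines, as the slide describes, would make the comparison robust against formatting differences that this line-based sketch cannot see.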
36. Other Modules
• Perl::MinimumVersion::Fast (@tokuhirom)
– Finds the minimum required version of perl for Perl code
• Test::LocalFunctions::Fast (@papix)
– Detects unused local functions
With Compiler::Lexer or Compiler::Parser, you can write
modules that are faster than existing PPI-based ones
37. Future plan
• Perl::Metrics::Simple::Fast
– A faster alternative to
Perl::Metrics::Simple, built on
Compiler::Lexer and Compiler::Parser, will be released
• In the near future, I will release to CPAN the static analysis
module that has been used at mixi
– It includes a rich visualizer
38. Conclusion of Section 3
• Introduced a copy-and-paste detection
tool for Perl5
– Compiler::Tools::CopyPasteDetector
• With Compiler::Lexer or Compiler::Parser,
you can write modules that are faster than
existing PPI-based ones