SlideShare a Scribd company logo
1 of 37
Download to read offline
The TclQuadcode
Compiler
Status report on Tcl type
analysis and code
generation
Donal Fellowsorcid.org/0000-0002-9091-5938
Kevin Kenny
What is going on?
Ü  Want to Make Tcl Faster
Ü  Everyone benefits
Ü  Lehenbauer Challenges
Ü  2 times faster (“Perl” territory)
Ü  Better algorithms
Ü  Better buffer management
Ü  Bytecode optimization
Ü  10 times faster (“C” territory)
Ü  Needs more radical approach
2
Generating Native Code is Hard
Ü  Going to 10 times faster requires native code
Ü  Bytecode work simply won’t do it
Ü  But Tcl is a very dynamic language
Ü  Even ignoring command renaming tricks
Ü  Native code needs types
Ü  Many platforms
3
Let’s Go to LLVM!
Ü  Solves many problems
Ü  Optimization
Ü  Native code issuing
Ü  Runtime loading
Ü  LLVM Intermediate Representation (IR)
Ü  Effectively a virtual assembly language target
Ü  Existing Tcl package!
Ü  llvmtcl by Jos Decoster
Ü  Introduces problems though
Ü  LLVM’s idea of “throw an error” is to panic with a
gnostic error message
4
How to get to LLVM?
Ü  Still need those pesky types
Ü  Still need fixed semantics
Ü  We need a new bytecode!
Quadcode
Ü  Designed to help:
Ü  Simple translation from Tcl bytecode
Ü  More amenable to analysis
5
6
Quadcode
Tcl Analysis with
Quadcode
Ü  Based on three-address code assembly
Ü  The Tcl code:
set	
  a	
  [expr	
  {	
  $b	
  +	
  1	
  }]	
  
Ü  Equivalent Tcl bytecode:
loadScalar	
  %b;	
  push	
  “1”;	
  add;	
  storeScalar	
  %a	
  
Ü  Equivalent (optimized) quadcode:
add	
  {var	
  a}	
  {var	
  b}	
  {literal	
  1}
Ü  No stack
Ü  Temporary variables used as required
8
Example: Tcl code to bytecode
proc cos {x {n 16}} {
set x [expr {double($x)}]
set n [expr {int($n)}]
set j 0
set s 1.0
set t 1.0
set i 0
while {[incr i] < $n} {
set t [expr {
-$t*$x*$x / [incr j] / [incr j]
}]
set s [expr {$s + $t}]
}
return $s
}
...
29: startCommand {pc 42} 1
38: push1 {literal 0}
40: storeScalar1 {scalar j}
42: pop
43: startCommand {pc 56} 1
52: push1 {literal 1.0}
54: storeScalar1 {scalar s}
56: pop
57: startCommand {pc 70} 1
66: push1 {literal 1.0}
68: storeScalar1 {scalar t}
70: pop
71: startCommand {pc 84} 1
80: push1 {literal 0}
82: storeScalar1 {scalar i}
84: pop
85: startCommand {pc 179} 1
94: jump1 {pc 160}
96: startCommand {pc 142} 2
105: loadScalar1 {scalar t}
107: uminus
108: loadScalar1 {{scalar arg} x}
110: mult
111: loadScalar1 {{scalar arg} x}
113: mult
114: startCommand {pc 126} 1
123: incrScalar1Imm {scalar j} 1
126: div
127: startCommand {pc 139} 1
136: incrScalar1Imm {scalar j} 1
139: div
140: storeScalar1 {scalar t}
142: pop
...
9
Example: bytecode to
quadcode
29: startCommand {pc 42} 1
38: push1 {literal 0}
40: storeScalar1 {scalar j}
42: pop
43: startCommand {pc 56} 1
52: push1 {literal 1.0}
54: storeScalar1 {scalar s}
56: pop
57: startCommand {pc 70} 1
66: push1 {literal 1.0}
68: storeScalar1 {scalar t}
70: pop
71: startCommand {pc 84} 1
80: push1 {literal 0}
82: storeScalar1 {scalar i}
84: pop
85: startCommand {pc 179} 1
94: jump1 {pc 160}
96: startCommand {pc 142} 2
105: loadScalar1 {scalar t}
107: uminus
108: loadScalar1 {{scalar arg} x}
110: mult
111: loadScalar1 {{scalar arg} x}
113: mult
114: startCommand {pc 126} 1
123: incrScalar1Imm {scalar j} 1
126: div
127: startCommand {pc 139} 1
136: incrScalar1Imm {scalar j} 1
139: div
140: storeScalar1 {scalar t}
142: pop
11: copy {temp 0} {literal 0}
12: copy {var j} {temp 0}
13: copy {temp 0} {literal 1.0}
14: copy {var s} {temp 0}
15: copy {temp 0} {literal 1.0}
16: copy {var t} {temp 0}
17: copy {temp 0} {literal 0}
18: copy {var i} {temp 0}
19: jump {pc 37}
20: copy {temp 0} {var t}
21: uminus {temp 0} {temp 0}
22: copy {temp 1} {var x}
23: mult {temp 0} {temp 0} {temp 1}
24: copy {temp 1} {var x}
25: mult {temp 0} {temp 0} {temp 1}
26: add {var j} {var j} {literal 1}
27: copy {temp 1} {var j}
28: div {temp 0} {temp 0} {temp 1}
29: add {var j} {var j} {literal 1}
30: copy {temp 1} {var j}
31: div {temp 0} {temp 0} {temp 1}
32: copy {var t} {temp 0}
10
Quadcode Analysis
Ü  Code is converted to Static Single
Assignment (SSA) form
Ü  Variables assigned only once
Ü  Phi (φ) instructions used to merge variables at
convergences (after if-branches and in loops)
Ü  Lifetime analysis
Ü  Corresponds to where to use Tcl_DecrRefCount	
  
Ü  Type analysis
Ü  What type of data actually goes in a variable?
11
Example: Tcl code to
cleaned-up quadcode
proc cos {x {n 16}} {
set x [expr {double($x)}]
set n [expr {int($n)}]
set j 0
set s 1.0
set t 1.0
set i 0
while {[incr i] < $n} {
set t [expr {
-$t*$x*$x / [incr j] / [incr j]
}]
set s [expr {$s + $t}]
}
return $s
}
0: param {var x} {arg 0}
1: param {var n} {arg 1}
2: invoke {var x} {literal tcl::mathfunc::double} {var x}
3: invoke {var n} {literal tcl::mathfunc::int} {var n}
4: copy {var j} {literal 0}
5: copy {var s} {literal 1.0}
6: copy {var t} {literal 1.0}
7: copy {var i} {literal 0}
8: jump {pc 18}
9: uminus {temp 0} {var t}
10: mult {temp 0} {temp 0} {var x}
11: mult {temp 0} {temp 0} {var x}
12: add {var j} {var j} {literal 1}
13: div {temp 0} {temp 0} {var j}
14: add {var j} {var j} {literal 1}
15: div {temp 0} {temp 0} {var j}
16: copy {var t} {temp 0}
17: add {var s} {var s} {temp 0}
18: add {var i} {var i} {literal 1}
19: lt {temp 0} {var i} {var n}
20: jumpTrue {pc 9} {temp 0}
21: return {} {var s}
12
Note that this is before SSA analysis
Example: In SSA form
0: param {var x 0} {arg 0}
1: param {var n 1} {arg 1}
2: invoke {var x 2} {literal tcl::mathfunc::double} {var x 0}
3: invoke {var n 3} {literal tcl::mathfunc::int} {var n 1}
4: copy {var j 4} {literal 0}
5: copy {var s 5} {literal 1.0}
6: copy {var t 6} {literal 1.0}
7: copy {var i 7} {literal 0}
8: jump {pc 18}
9: uminus {temp 0 9} {var t 21}
10: mult {temp 0 10} {temp 0 9} {var x 2}
11: mult {temp 0 11} {temp 0 10} {var x 2}
12: add {var j 12} {var j 19} {literal 1}
13: div {temp 0 13} {temp 0 11} {var j 12}
14: add {var j 14} {var j 12} {literal 1}
15: div {temp 0 15} {temp 0 13} {var j 14}
16: copy {var t 16} {temp 0 15}
17: add {var s 17} {var s 20} {temp 0 15}
18: confluence
19: phi {var j 19} {var j 4} {pc 8} {var j 14} {pc 17}
20: phi {var s 20} {var s 5} {pc 8} {var s 17} {pc 17}
21: phi {var t 21} {var t 6} {pc 8} {var t 16} {pc 17}
22: phi {var i 22} {var i 7} {pc 8} {var i 23} {pc 17}
23: add {var i 23} {var i 22} {literal 1}
24: lt {temp 0 24} {var i 23} {var n 3}
25: jumpTrue {pc 9} {temp 0 24}
26: return {} {var s 20}
13
The Types of Tcl
Ü  Tcl isn’t entirely
typeless
Ü  Our values have types
Ü  String, Integer, Double-
precision float,
Boolean, List,
Dictionary, etc.
Ü  But everything is a
string
Ü  All other types are
formally subtypes of
string
14
string
double integer
booleannumeric
bool int
BOTTOM
list
dict
Example: Determined Types
Ü  Variable types inferred:
Ü  DOUBLE (i.e., proven to only ever contain a floating
point number)
Ü  var x 0, var x 2,var t 8, var t 37, temp 0 16, …
Ü  INT (i.e., proven to only ever contain an integer of
unknown width)
Ü  var n 1, var n 4, var j 10, var i 12, var j 35, var j 22, var j 26, …
Ü  INT BOOLEAN (i.e., proven to only ever contain the
values 0 or 1)
Ü  var j 6, var i 9, temp 0 41, …
Ü  Return type inferred:
Ü  DOUBLE (i.e., always succeeds, always produces a
floating point number)
15
Neat Tech along the Way
Ü  Uses TclBDD as Reasoning Engine
Ü  Datalog is clean way to express complex programs
Ü  Good for computing properties
Ü  Stops us from going mad!
Ü  (presented last year)
Ü  Might be possible to use quadcode itself as
an bytecode-interpreted execution target
Ü  Totally not our aim, but it is quite a bit cleaner
Ü  Not yet studied
16
We’re at the Station…
17
LLVM IR
Generating
18
Generating LLVM
Ü  LLVM Intermediate Representation (IR) is very
concrete
Ü  Lower level than C
Ü  Virtual Assembler
Ü  Each Tcl procedure goes to two functions
1.  Body of procedure
2.  “Thunk” to connect body to Tcl
Ü  Each quadcode instruction goes to a non-
branching sequence of IR opcodes
Ü  Keep pattern of basic blocks
Ü  Except branches, which branch of course
Compiling Instructions: Add
Ü  Adding two floats is trivial conversion
%s = fadd double %phi_s, %tmp.08
Ü  Adding two integers is not, as we don’t know
the bit width
Ü  So call a helper function!
%j = call %INT @tcl.add(%INT %phi_j, %INT %k)
Ü  The INT type is really a discriminated union
20
Compiling Instructions: Add
Ü  Many ways to add
Ü  Which to use in particular situation?
Ü  How we do it:
Ü  Look at the argument types (guaranteed known)
Ü  Look up TclOO method in code issuer to actually
get how to issue code
Ü  Add the types to the method name
Ü  Unknown method handler generates normal
typecasts
Ü  Just need to specify the “interesting” cases
21
Example: Issuing an Add
Ü  Want to issue an add:
add {var a 1} {var b 2} {var c 3}
Ü  Look up argument types:
{var b 2} à DOUBLE
{var c 3} à INT
Ü  Call issuer method add(DOUBLE,INT)
Ü  Doesn’t exist, build from add(DOUBLE,DOUBLE) and
typecaster
Ü  End up with required instructions, perhaps:
%45 = call double @tcl.typecast.dbl(%INT %c.3)
%a.1 = fadd double %b.2 %45
22
Internal
Standard
Library
Function
The Internal Standard Library
Ü  Collection of Functions to be Inlined by LLVM
Optimizer
Ü  Implement many more complex operations
23
; casts from our structured INT to a double-precision float
define hidden double @tcl.typecast.dbl(%INT %x) #0 {
; extract the fields
%x.flag = extractvalue %INT %x, 0
%x.32 = extractvalue %INT %x, 1
%x.64 = extractvalue %INT %x, 2
; determine what the 64-bit value is
%is32bit = icmp eq i32 %x.flag, 0
%value32bit = sext i32 %x.32 to i64
%value = select i1 %is32bit, i64 %value32bit, i64 %x.64
; perform the cast and return it
%casted = sitofp i64 %value to double
ret double %casted
}
Optimization
Ü  A critical step of IR generation is to run the
optimizer
Ü  Cleans up the code hugely
Ü  Inlines functions
Ü  Removes dead code paths
Ü  We have much of Tcl API annotated to help
the optimizer understand it
Ü  Documents guarantees and assumptions
24
Example: Optimized COS body
%6 = fmul double %phi_t64, %x
%7 = fmul double %6, %x
%tmp.04 = fsub double -0.000000e+00, %7
%8 = extractvalue %INT %phi_j62, 0
%9 = icmp eq i32 %8, 0
%10 = extractvalue %INT %phi_j62, 1
%11 = sext i32 %10 to i64
%12 = extractvalue %INT %phi_j62, 2
%x.6425.i43 = select i1 %9, i64 %11, i64 %12
%z.643.i44 = add i64 %x.6425.i43, 1
%cast = sitofp i64 %z.643.i44 to double
%tmp.05 = fdiv double %tmp.04, %cast
%z.643.i = add i64 %x.6425.i43, 2
%13 = insertvalue %INT { i32 1, i32 undef, i64 undef }, i64 %z.643.i, 2
%cast7 = sitofp i64 %z.643.i to double
%tmp.08 = fdiv double %tmp.05, %cast7
%s = fadd double %phi_s63, %tmp.08
25
The Other Types
Ü  Lists and Dictionaries are treated as Strings
Ü  Mapped to a Tcl_Obj* reference
Ü  Lifetime management used to control reference
counting efficiently
Ü  Failing operations become tagged derived
types
Ü  Failures cause jump to exception handling code
Ü  The BOTTOM type only occurs in functions
that cannot return
Ü  If they return, they do so by an error
26
Neat Tech along the Way
Ü  Closures
Ü  Callbacks which
capture local variables
Ü  Locally-scoped
Variables
Ü  Easy way to stop
variables from one
place spreading
elsewhere
Ü  Prevented many nasty
bugs
Example from Standard Library Builder
# Create local LLVM function: tcl.int.32
set f [$m local "tcl.int.32" int<-INT ReadNone]
params x
build {
my condBr [my isInt32 $x] $x32 $x64
label x32:
my ret [my int.32 $x]
label x64:
my ret [my cast(int) [my int.64 $x]]
}
# Make closure to create call to tcl.int.32
my closure getInt32 {arg {resultName ""}} {
my call [$f ref] [list $arg] $resultName
}
27
Heading Out…
28
FAST?
So, is this thing
Performance Categories
Ü  Numeric Code
Ü  Test pure integer functions with iterative fibonacci
Ü  Test floating point functions with cosines
Ü  Reference-handling Code
Ü  Test string handling functions with complex string
replacement
Ü  Test list handling functions with list joining
Ü  Test dictionary/array functions with counting words in
a list
Ü  Error-path Code
Ü  Test exception handling with non-trivial try usage
Performance
Category Test
Time (µs) Acceleration
(%)
Target
Reached?Uncompiled Compiled
Numeric fib 12.15 0.4758 2453 ✓✓
Numeric cos 6.277 0.3936 1495 ✓✓
Reference replacing 1.233 0.8792 40 ✗
Reference listjoin 2.300 0.6946 231 ✓
Reference wordcount 18.67 5.660 230 ✓
Error errortester 13.73 4.999 175 ✓-ish
31
Looking
Great!
Summary and Analysis
Ü  Numeric code is hugely faster
Ü  Typically much more than 10 times faster!
Ü  Reference management code is nicely faster
Ü  Often around 2–3 times faster
Ü  Automatically detecting how to unshare objects
Ü  String code largely unaffected
Ü  Critically dependent on buffer management
Ü  Might also be due to code used for testing
Ü  Error code mostly faster
Ü  Could still do better, but not usually critical path
32
Going Fast!
33
Future
Looking to the
34
Where Next?
Ü  Finish filling out translation from bytecode
Ü  Unset
Ü  Introspection
Ü  Address slow speed of compilation
Ü  Resulting code is fast, but process to get to it…
Ü  How to integrate into Tcl?
Ü  When to compile?
Ü  When to cache?
Ü  How to use LLVM practically?
Ü  What extensions to Tcl’s C API are needed?
Advanced Compilation
Ü  Compilation between procedures
Ü  Can we use type info more extensively?
Ü  Access to global variables
Ü  Currently local-variable only
Ü  Traces, variable scopes, etc.
Ü  Other types of compileable things
Ü  Lambdas, methods, …
36
Longer-term Questions
Ü  What changes should we do in Tcl in light of
this?
Ü  Already some ideas:
Ü  Change incr to support floats
Ü  Some way to annotate suggested types on arguments
Ü  If we bite the LLVM bullet, what other
changes follow?
Ü  Need to link to C++ libraries to use LLVM
Ü  Implement official C++ API to Tcl?
37

More Related Content

What's hot

Java 7, 8 & 9 - Moving the language forward
Java 7, 8 & 9 - Moving the language forwardJava 7, 8 & 9 - Moving the language forward
Java 7, 8 & 9 - Moving the language forwardMario Fusco
 
Pragmatic Real-World Scala (short version)
Pragmatic Real-World Scala (short version)Pragmatic Real-World Scala (short version)
Pragmatic Real-World Scala (short version)Jonas Bonér
 
Java Keeps Throttling Up!
Java Keeps Throttling Up!Java Keeps Throttling Up!
Java Keeps Throttling Up!José Paumard
 
FP in Java - Project Lambda and beyond
FP in Java - Project Lambda and beyondFP in Java - Project Lambda and beyond
FP in Java - Project Lambda and beyondMario Fusco
 
Map(), flatmap() and reduce() are your new best friends: simpler collections,...
Map(), flatmap() and reduce() are your new best friends: simpler collections,...Map(), flatmap() and reduce() are your new best friends: simpler collections,...
Map(), flatmap() and reduce() are your new best friends: simpler collections,...Chris Richardson
 
Logic programming a ruby perspective
Logic programming a ruby perspectiveLogic programming a ruby perspective
Logic programming a ruby perspectiveNorman Richards
 
The Ring programming language version 1.7 book - Part 35 of 196
The Ring programming language version 1.7 book - Part 35 of 196The Ring programming language version 1.7 book - Part 35 of 196
The Ring programming language version 1.7 book - Part 35 of 196Mahmoud Samir Fayed
 
The Sincerest Form of Flattery
The Sincerest Form of FlatteryThe Sincerest Form of Flattery
The Sincerest Form of FlatteryJosé Paumard
 
Protocol-Oriented Programming in Swift
Protocol-Oriented Programming in SwiftProtocol-Oriented Programming in Swift
Protocol-Oriented Programming in SwiftGlobalLogic Ukraine
 
AST Transformations at JFokus
AST Transformations at JFokusAST Transformations at JFokus
AST Transformations at JFokusHamletDRC
 
The Ring programming language version 1.8 book - Part 37 of 202
The Ring programming language version 1.8 book - Part 37 of 202The Ring programming language version 1.8 book - Part 37 of 202
The Ring programming language version 1.8 book - Part 37 of 202Mahmoud Samir Fayed
 

What's hot (20)

Java 7, 8 & 9 - Moving the language forward
Java 7, 8 & 9 - Moving the language forwardJava 7, 8 & 9 - Moving the language forward
Java 7, 8 & 9 - Moving the language forward
 
Meta Object Protocols
Meta Object ProtocolsMeta Object Protocols
Meta Object Protocols
 
Pragmatic Real-World Scala (short version)
Pragmatic Real-World Scala (short version)Pragmatic Real-World Scala (short version)
Pragmatic Real-World Scala (short version)
 
Initial Java Core Concept
Initial Java Core ConceptInitial Java Core Concept
Initial Java Core Concept
 
Java Keeps Throttling Up!
Java Keeps Throttling Up!Java Keeps Throttling Up!
Java Keeps Throttling Up!
 
FP in Java - Project Lambda and beyond
FP in Java - Project Lambda and beyondFP in Java - Project Lambda and beyond
FP in Java - Project Lambda and beyond
 
Java Class Design
Java Class DesignJava Class Design
Java Class Design
 
Why Haskell
Why HaskellWhy Haskell
Why Haskell
 
Map(), flatmap() and reduce() are your new best friends: simpler collections,...
Map(), flatmap() and reduce() are your new best friends: simpler collections,...Map(), flatmap() and reduce() are your new best friends: simpler collections,...
Map(), flatmap() and reduce() are your new best friends: simpler collections,...
 
Logic programming a ruby perspective
Logic programming a ruby perspectiveLogic programming a ruby perspective
Logic programming a ruby perspective
 
The Ring programming language version 1.7 book - Part 35 of 196
The Ring programming language version 1.7 book - Part 35 of 196The Ring programming language version 1.7 book - Part 35 of 196
The Ring programming language version 1.7 book - Part 35 of 196
 
What's New In C# 7
What's New In C# 7What's New In C# 7
What's New In C# 7
 
The Sincerest Form of Flattery
The Sincerest Form of FlatteryThe Sincerest Form of Flattery
The Sincerest Form of Flattery
 
Java Full Throttle
Java Full ThrottleJava Full Throttle
Java Full Throttle
 
Protocol-Oriented Programming in Swift
Protocol-Oriented Programming in SwiftProtocol-Oriented Programming in Swift
Protocol-Oriented Programming in Swift
 
Comparing JVM languages
Comparing JVM languagesComparing JVM languages
Comparing JVM languages
 
Java Generics
Java GenericsJava Generics
Java Generics
 
AST Transformations at JFokus
AST Transformations at JFokusAST Transformations at JFokus
AST Transformations at JFokus
 
C# 7
C# 7C# 7
C# 7
 
The Ring programming language version 1.8 book - Part 37 of 202
The Ring programming language version 1.8 book - Part 37 of 202The Ring programming language version 1.8 book - Part 37 of 202
The Ring programming language version 1.8 book - Part 37 of 202
 

Similar to The TclQuadcode Compiler

EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5PRADEEP
 
Classical programming interview questions
Classical programming interview questionsClassical programming interview questions
Classical programming interview questionsGradeup
 
Class 16: Making Loops
Class 16: Making LoopsClass 16: Making Loops
Class 16: Making LoopsDavid Evans
 
01 - DAA - PPT.pptx
01 - DAA - PPT.pptx01 - DAA - PPT.pptx
01 - DAA - PPT.pptxKokilaK25
 
Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Languagevsssuresh
 
C++ and OOPS Crash Course by ACM DBIT | Grejo Joby
C++ and OOPS Crash Course by ACM DBIT | Grejo JobyC++ and OOPS Crash Course by ACM DBIT | Grejo Joby
C++ and OOPS Crash Course by ACM DBIT | Grejo JobyGrejoJoby1
 
Scala Back to Basics: Type Classes
Scala Back to Basics: Type ClassesScala Back to Basics: Type Classes
Scala Back to Basics: Type ClassesTomer Gabel
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...Positive Hack Days
 
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Chris Adamson
 
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...ETH Zurich
 
Container and algorithms(C++)
Container and algorithms(C++)Container and algorithms(C++)
Container and algorithms(C++)JerryHe
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Béo Tú
 
C Programming Interview Questions
C Programming Interview QuestionsC Programming Interview Questions
C Programming Interview QuestionsGradeup
 
sonam Kumari python.ppt
sonam Kumari python.pptsonam Kumari python.ppt
sonam Kumari python.pptssuserd64918
 
Legacy projects: how to win the race
Legacy projects: how to win the raceLegacy projects: how to win the race
Legacy projects: how to win the raceVictor_Cr
 
T02 a firstcprogram
T02 a firstcprogramT02 a firstcprogram
T02 a firstcprogramprincepavan
 

Similar to The TclQuadcode Compiler (20)

C tutorial
C tutorialC tutorial
C tutorial
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
 
Classical programming interview questions
Classical programming interview questionsClassical programming interview questions
Classical programming interview questions
 
Class 16: Making Loops
Class 16: Making LoopsClass 16: Making Loops
Class 16: Making Loops
 
01 - DAA - PPT.pptx
01 - DAA - PPT.pptx01 - DAA - PPT.pptx
01 - DAA - PPT.pptx
 
Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Language
 
C++ and OOPS Crash Course by ACM DBIT | Grejo Joby
C++ and OOPS Crash Course by ACM DBIT | Grejo JobyC++ and OOPS Crash Course by ACM DBIT | Grejo Joby
C++ and OOPS Crash Course by ACM DBIT | Grejo Joby
 
Scala Back to Basics: Type Classes
Scala Back to Basics: Type ClassesScala Back to Basics: Type Classes
Scala Back to Basics: Type Classes
 
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
The System of Automatic Searching for Vulnerabilities or how to use Taint Ana...
 
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
Oh Crap, I Forgot (Or Never Learned) C! [CodeMash 2010]
 
C++ in 10 Hours.pdf.pdf
C++ in 10 Hours.pdf.pdfC++ in 10 Hours.pdf.pdf
C++ in 10 Hours.pdf.pdf
 
Matlab Script - Loop Control
Matlab Script - Loop ControlMatlab Script - Loop Control
Matlab Script - Loop Control
 
Verilogforlab
VerilogforlabVerilogforlab
Verilogforlab
 
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
Abstracting Vector Architectures in Library Generators: Case Study Convolutio...
 
Container and algorithms(C++)
Container and algorithms(C++)Container and algorithms(C++)
Container and algorithms(C++)
 
Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014Verilog Lecture3 hust 2014
Verilog Lecture3 hust 2014
 
C Programming Interview Questions
C Programming Interview QuestionsC Programming Interview Questions
C Programming Interview Questions
 
sonam Kumari python.ppt
sonam Kumari python.pptsonam Kumari python.ppt
sonam Kumari python.ppt
 
Legacy projects: how to win the race
Legacy projects: how to win the raceLegacy projects: how to win the race
Legacy projects: how to win the race
 
T02 a firstcprogram
T02 a firstcprogramT02 a firstcprogram
T02 a firstcprogram
 

Recently uploaded

Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 

The TclQuadcode Compiler

  • 1. The TclQuadcode Compiler Status report on Tcl type analysis and code generation Donal Fellowsorcid.org/0000-0002-9091-5938 Kevin Kenny
  • 2. What is going on? Ü  Want to Make Tcl Faster Ü  Everyone benefits Ü  Lehenbauer Challenges Ü  2 times faster (“Perl” territory) Ü  Better algorithms Ü  Better buffer management Ü  Bytecode optimization Ü  10 times faster (“C” territory) Ü  Needs more radical approach 2
  • 3. Generating Native Code is Hard Ü  Going to 10 times faster requires native code Ü  Bytecode work simply won’t do it Ü  But Tcl is a very dynamic language Ü  Even ignoring command renaming tricks Ü  Native code needs types Ü  Many platforms 3
  • 4. Let’s Go to LLVM! Ü  Solves many problems Ü  Optimization Ü  Native code issuing Ü  Runtime loading Ü  LLVM Intermediate Representation (IR) Ü  Effectively a virtual assembly language target Ü  Existing Tcl package! Ü  llvmtcl by Jos Decoster Ü  Introduces problems though Ü  LLVM’s idea of “throw an error” is to panic with a gnostic error message 4
  • 5. How to get to LLVM? Ü  Still need those pesky types Ü  Still need fixed semantics Ü  We need a new bytecode! Quadcode Ü  Designed to help: Ü  Simple translation from Tcl bytecode Ü  More amenable to analysis 5
  • 6. 6
  • 8. Quadcode Ü  Based on three-address code assembly Ü  The Tcl code: set  a  [expr  {  $b  +  1  }]   Ü  Equivalent Tcl bytecode: loadScalar  %b;  push  “1”;  add;  storeScalar  %a   Ü  Equivalent (optimized) quadcode: add  {var  a}  {var  b}  {literal  1} Ü  No stack Ü  Temporary variables used as required 8
  • 9. Example: Tcl code to bytecode proc cos {x {n 16}} { set x [expr {double($x)}] set n [expr {int($n)}] set j 0 set s 1.0 set t 1.0 set i 0 while {[incr i] < $n} { set t [expr { -$t*$x*$x / [incr j] / [incr j] }] set s [expr {$s + $t}] } return $s } ... 29: startCommand {pc 42} 1 38: push1 {literal 0} 40: storeScalar1 {scalar j} 42: pop 43: startCommand {pc 56} 1 52: push1 {literal 1.0} 54: storeScalar1 {scalar s} 56: pop 57: startCommand {pc 70} 1 66: push1 {literal 1.0} 68: storeScalar1 {scalar t} 70: pop 71: startCommand {pc 84} 1 80: push1 {literal 0} 82: storeScalar1 {scalar i} 84: pop 85: startCommand {pc 179} 1 94: jump1 {pc 160} 96: startCommand {pc 142} 2 105: loadScalar1 {scalar t} 107: uminus 108: loadScalar1 {{scalar arg} x} 110: mult 111: loadScalar1 {{scalar arg} x} 113: mult 114: startCommand {pc 126} 1 123: incrScalar1Imm {scalar j} 1 126: div 127: startCommand {pc 139} 1 136: incrScalar1Imm {scalar j} 1 139: div 140: storeScalar1 {scalar t} 142: pop ... 9
  • 10. Example: bytecode to quadcode 29: startCommand {pc 42} 1 38: push1 {literal 0} 40: storeScalar1 {scalar j} 42: pop 43: startCommand {pc 56} 1 52: push1 {literal 1.0} 54: storeScalar1 {scalar s} 56: pop 57: startCommand {pc 70} 1 66: push1 {literal 1.0} 68: storeScalar1 {scalar t} 70: pop 71: startCommand {pc 84} 1 80: push1 {literal 0} 82: storeScalar1 {scalar i} 84: pop 85: startCommand {pc 179} 1 94: jump1 {pc 160} 96: startCommand {pc 142} 2 105: loadScalar1 {scalar t} 107: uminus 108: loadScalar1 {{scalar arg} x} 110: mult 111: loadScalar1 {{scalar arg} x} 113: mult 114: startCommand {pc 126} 1 123: incrScalar1Imm {scalar j} 1 126: div 127: startCommand {pc 139} 1 136: incrScalar1Imm {scalar j} 1 139: div 140: storeScalar1 {scalar t} 142: pop 11: copy {temp 0} {literal 0} 12: copy {var j} {temp 0} 13: copy {temp 0} {literal 1.0} 14: copy {var s} {temp 0} 15: copy {temp 0} {literal 1.0} 16: copy {var t} {temp 0} 17: copy {temp 0} {literal 0} 18: copy {var i} {temp 0} 19: jump {pc 37} 20: copy {temp 0} {var t} 21: uminus {temp 0} {temp 0} 22: copy {temp 1} {var x} 23: mult {temp 0} {temp 0} {temp 1} 24: copy {temp 1} {var x} 25: mult {temp 0} {temp 0} {temp 1} 26: add {var j} {var j} {literal 1} 27: copy {temp 1} {var j} 28: div {temp 0} {temp 0} {temp 1} 29: add {var j} {var j} {literal 1} 30: copy {temp 1} {var j} 31: div {temp 0} {temp 0} {temp 1} 32: copy {var t} {temp 0} 10
  • 11. Quadcode Analysis Ü  Code is converted to Static Single Assignment (SSA) form Ü  Variables assigned only once Ü  Phi (φ) instructions used to merge variables at convergences (after if-branches and in loops) Ü  Lifetime analysis Ü  Corresponds to where to use Tcl_DecrRefCount   Ü  Type analysis Ü  What type of data actually goes in a variable? 11
  • 12. Example: Tcl code to cleaned-up quadcode proc cos {x {n 16}} { set x [expr {double($x)}] set n [expr {int($n)}] set j 0 set s 1.0 set t 1.0 set i 0 while {[incr i] < $n} { set t [expr { -$t*$x*$x / [incr j] / [incr j] }] set s [expr {$s + $t}] } return $s } 0: param {var x} {arg 0} 1: param {var n} {arg 1} 2: invoke {var x} {literal tcl::mathfunc::double} {var x} 3: invoke {var n} {literal tcl::mathfunc::int} {var n} 4: copy {var j} {literal 0} 5: copy {var s} {literal 1.0} 6: copy {var t} {literal 1.0} 7: copy {var i} {literal 0} 8: jump {pc 18} 9: uminus {temp 0} {var t} 10: mult {temp 0} {temp 0} {var x} 11: mult {temp 0} {temp 0} {var x} 12: add {var j} {var j} {literal 1} 13: div {temp 0} {temp 0} {var j} 14: add {var j} {var j} {literal 1} 15: div {temp 0} {temp 0} {var j} 16: copy {var t} {temp 0} 17: add {var s} {var s} {temp 0} 18: add {var i} {var i} {literal 1} 19: lt {temp 0} {var i} {var n} 20: jumpTrue {pc 9} {temp 0} 21: return {} {var s} 12 Note that this is before SSA analysis
  • 13. Example: In SSA form 0: param {var x 0} {arg 0} 1: param {var n 1} {arg 1} 2: invoke {var x 2} {literal tcl::mathfunc::double} {var x 0} 3: invoke {var n 3} {literal tcl::mathfunc::int} {var n 1} 4: copy {var j 4} {literal 0} 5: copy {var s 5} {literal 1.0} 6: copy {var t 6} {literal 1.0} 7: copy {var i 7} {literal 0} 8: jump {pc 18} 9: uminus {temp 0 9} {var t 21} 10: mult {temp 0 10} {temp 0 9} {var x 2} 11: mult {temp 0 11} {temp 0 10} {var x 2} 12: add {var j 12} {var j 19} {literal 1} 13: div {temp 0 13} {temp 0 11} {var j 12} 14: add {var j 14} {var j 12} {literal 1} 15: div {temp 0 15} {temp 0 13} {var j 14} 16: copy {var t 16} {temp 0 15} 17: add {var s 17} {var s 20} {temp 0 15} 18: confluence 19: phi {var j 19} {var j 4} {pc 8} {var j 14} {pc 17} 20: phi {var s 20} {var s 5} {pc 8} {var s 17} {pc 17} 21: phi {var t 21} {var t 6} {pc 8} {var t 16} {pc 17} 22: phi {var i 22} {var i 7} {pc 8} {var i 23} {pc 17} 23: add {var i 23} {var i 22} {literal 1} 24: lt {temp 0 24} {var i 23} {var n 3} 25: jumpTrue {pc 9} {temp 0 24} 26: return {} {var s 20} 13
  • 14. The Types of Tcl Ü  Tcl isn’t entirely typeless Ü  Our values have types Ü  String, Integer, Double- precision float, Boolean, List, Dictionary, etc. Ü  But everything is a string Ü  All other types are formally subtypes of string 14 string double integer booleannumeric bool int BOTTOM list dict
  • 15. Example: Determined Types Ü  Variable types inferred: Ü  DOUBLE (i.e., proven to only ever contain a floating point number) Ü  var x 0, var x 2,var t 8, var t 37, temp 0 16, … Ü  INT (i.e., proven to only ever contain an integer of unknown width) Ü  var n 1, var n 4, var j 10, var i 12, var j 35, var j 22, var j 26, … Ü  INT BOOLEAN (i.e., proven to only ever contain the values 0 or 1) Ü  var j 6, var i 9, temp 0 41, … Ü  Return type inferred: Ü  DOUBLE (i.e., always succeeds, always produces a floating point number) 15
  • 16. Neat Tech along the Way Ü  Uses TclBDD as Reasoning Engine Ü  Datalog is clean way to express complex programs Ü  Good for computing properties Ü  Stops us from going mad! Ü  (presented last year) Ü  Might be possible to use quadcode itself as an bytecode-interpreted execution target Ü  Totally not our aim, but it is quite a bit cleaner Ü  Not yet studied 16
  • 17. We’re at the Station… 17
  • 19. Generating LLVM Ü  LLVM Intermediate Representation (IR) is very concrete Ü  Lower level than C Ü  Virtual Assembler Ü  Each Tcl procedure goes to two functions 1.  Body of procedure 2.  “Thunk” to connect body to Tcl Ü  Each quadcode instruction goes to a non- branching sequence of IR opcodes Ü  Keep pattern of basic blocks Ü  Except branches, which branch of course
  • 20. Compiling Instructions: Add Ü  Adding two floats is trivial conversion %s = fadd double %phi_s, %tmp.08 Ü  Adding two integers is not, as we don’t know the bit width Ü  So call a helper function! %j = call %INT @tcl.add(%INT %phi_j, %INT %k) Ü  The INT type is really a discriminated union 20
  • 21. Compiling Instructions: Add Ü  Many ways to add Ü  Which to use in particular situation? Ü  How we do it: Ü  Look at the argument types (guaranteed known) Ü  Look up TclOO method in code issuer to actually get how to issue code Ü  Add the types to the method name Ü  Unknown method handler generates normal typecasts Ü  Just need to specify the “interesting” cases 21
  • 22. Example: Issuing an Add Ü  Want to issue an add: add {var a 1} {var b 2} {var c 3} Ü  Look up argument types: {var b 2} à DOUBLE {var c 3} à INT Ü  Call issuer method add(DOUBLE,INT) Ü  Doesn’t exist, build from add(DOUBLE,DOUBLE) and typecaster Ü  End up with required instructions, perhaps: %45 = call double @tcl.typecast.dbl(%INT %c.3) %a.1 = fadd double %b.2 %45 22 Internal Standard Library Function
  • 23. The Internal Standard Library Ü  Collection of Functions to be Inlined by LLVM Optimizer Ü  Implement many more complex operations 23 ; casts from our structured INT to a double-precision float define hidden double @tcl.typecast.dbl(%INT %x) #0 { ; extract the fields %x.flag = extractvalue %INT %x, 0 %x.32 = extractvalue %INT %x, 1 %x.64 = extractvalue %INT %x, 2 ; determine what the 64-bit value is %is32bit = icmp eq i32 %x.flag, 0 %value32bit = sext i32 %x.32 to i64 %value = select i1 %is32bit, i64 %value32bit, i64 %x.64 ; perform the cast and return it %casted = sitofp i64 %value to double ret double %casted }
  • 24. Optimization Ü  A critical step of IR generation is to run the optimizer Ü  Cleans up the code hugely Ü  Inlines functions Ü  Removes dead code paths Ü  We have much of Tcl API annotated to help the optimizer understand it Ü  Documents guarantees and assumptions 24
  • 25. Example: Optimized COS body %6 = fmul double %phi_t64, %x %7 = fmul double %6, %x %tmp.04 = fsub double -0.000000e+00, %7 %8 = extractvalue %INT %phi_j62, 0 %9 = icmp eq i32 %8, 0 %10 = extractvalue %INT %phi_j62, 1 %11 = sext i32 %10 to i64 %12 = extractvalue %INT %phi_j62, 2 %x.6425.i43 = select i1 %9, i64 %11, i64 %12 %z.643.i44 = add i64 %x.6425.i43, 1 %cast = sitofp i64 %z.643.i44 to double %tmp.05 = fdiv double %tmp.04, %cast %z.643.i = add i64 %x.6425.i43, 2 %13 = insertvalue %INT { i32 1, i32 undef, i64 undef }, i64 %z.643.i, 2 %cast7 = sitofp i64 %z.643.i to double %tmp.08 = fdiv double %tmp.05, %cast7 %s = fadd double %phi_s63, %tmp.08 25
  • 26. The Other Types Ü  Lists and Dictionaries are treated as Strings Ü  Mapped to a Tcl_Obj* reference Ü  Lifetime management used to control reference counting efficiently Ü  Failing operations become tagged derived types Ü  Failures cause jump to exception handling code Ü  The BOTTOM type only occurs in functions that cannot return Ü  If they return, they do so by an error 26
  • 27. Neat Tech along the Way Ü  Closures Ü  Callbacks which capture local variables Ü  Locally-scoped Variables Ü  Easy way to stop variables from one place spreading elsewhere Ü  Prevented many nasty bugs Example from Standard Library Builder # Create local LLVM function: tcl.int.32 set f [$m local "tcl.int.32" int<-INT ReadNone] params x build { my condBr [my isInt32 $x] $x32 $x64 label x32: my ret [my int.32 $x] label x64: my ret [my cast(int) [my int.64 $x]] } # Make closure to create call to tcl.int.32 my closure getInt32 {arg {resultName ""}} { my call [$f ref] [list $arg] $resultName } 27
  • 30. Performance Categories Ü  Numeric Code Ü  Test pure integer functions with iterative fibonacci Ü  Test floating point functions with cosines Ü  Reference-handling Code Ü  Test string handling functions with complex string replacement Ü  Test list handling functions with list joining Ü  Test dictionary/array functions with counting words in a list Ü  Error-path Code Ü  Test exception handling with non-trivial try usage
  • 31. Performance Category Test Time (µs) Acceleration (%) Target Reached?Uncompiled Compiled Numeric fib 12.15 0.4758 2453 ✓✓ Numeric cos 6.277 0.3936 1495 ✓✓ Reference replacing 1.233 0.8792 40 ✗ Reference listjoin 2.300 0.6946 231 ✓ Reference wordcount 18.67 5.660 230 ✓ Error errortester 13.73 4.999 175 ✓-ish 31 Looking Great!
  • 32. Summary and Analysis Ü  Numeric code is hugely faster Ü  Typically much more than 10 times faster! Ü  Reference management code is nicely faster Ü  Often around 2–3 times faster Ü  Automatically detecting how to unshare objects Ü  String code largely unaffected Ü  Critically dependent on buffer management Ü  Might also be due to code used for testing Ü  Error code mostly faster Ü  Could still do better, but not usually critical path 32
  • 35. Where Next? Ü  Finish filling out translation from bytecode Ü  Unset Ü  Introspection Ü  Address slow speed of compilation Ü  Resulting code is fast, but process to get to it… Ü  How to integrate into Tcl? Ü  When to compile? Ü  When to cache? Ü  How to use LLVM practically? Ü  What extensions to Tcl’s C API are needed?
  • 36. Advanced Compilation Ü  Compilation between procedures Ü  Can we use type info more extensively? Ü  Access to global variables Ü  Currently local-variable only Ü  Traces, variable scopes, etc. Ü  Other types of compileable things Ü  Lambdas, methods, … 36
  • 37. Longer-term Questions Ü  What changes should we do in Tcl in light of this? Ü  Already some ideas: Ü  Change incr to support floats Ü  Some way to annotate suggested types on arguments Ü  If we bite the LLVM bullet, what other changes follow? Ü  Need to link to C++ libraries to use LLVM Ü  Implement official C++ API to Tcl? 37