This document discusses various techniques for optimizing metaprogramming and dynamic languages, including reflection, gradual typing, metaobject protocols, and dynamic optimizations like just-in-time compilation. It describes how techniques like polymorphic inline caches, hidden classes/shapes, and gradual typing checks can help optimize reflective operations and dynamic language features while preserving their flexibility. The document argues these "unoptimizable" features can achieve excellent performance through caching dynamically gathered type information and generating efficient machine code.
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "Unoptimizable" Using Old Ideas
1. Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the “Unoptimizable” Using Old Ideas
Stefan Marr
Athens, October 2019
Creative Commons Attribution-ShareAlike 4.0 License
10. Gradual Typing
async addMessage(user: User, message) {
  const msg = `<img src="/image/${user.profilePicture}">
    <span>${message}</span>`;
  this.outputElem.insertAdjacentHTML('beforeend', msg);
}
[Figure labels: Gradual Typing; Excellent Performance]
Somewhat true… The whole truth is a little more complex.
11. The Knights Found New Homes
[Figure labels: Excellent Performance; Reflection; Metaprogramming; Gradual Typing; Metaobject Protocols; AOP; Land of Engineering Short Cuts]
15. The (Movie) Heroes of ‘91
Technique                    Movie of 1991
Polymorphic Inline Caches    Terminator 2
Just-in-time Compilation     The Naked Gun 2 1/2
Maps (Hidden Classes)        Star Trek VI
18. A Class Hierarchy of Widgets
class Widget {
  constructor(width) {
    this.width = width;
  }
  fitsInto(width) {
    return this.width <= width;
  }
}
class Button extends Widget {}
class RadioButton extends Button {}

function findAllThatFit(arr, width) {
  const result = [];
  for (const w of arr) {
    if (w.fitsInto(width)) {
      result.push(w); // JavaScript arrays use push, not append
    }
  }
  return result;
}
19. Lookups can be frequent and costly
Class chain: RadioButton → superclass → Button → superclass → Widget (defines fitsInto)
For each fitsInto call on a RadioButton:
• hasMethod: 3x
• getSuperclass: 2x
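The lookup cost above can be reproduced with a small model. This is only a sketch: ClassModel, hasMethod, getSuperclass, lookup, and the counters object are hypothetical names for illustration, not part of any real VM.

```javascript
// Hypothetical mini object model: each class has a method table and a superclass link.
class ClassModel {
  constructor(name, superclass, methods) {
    this.name = name;
    this.superclass = superclass;
    this.methods = new Map(Object.entries(methods));
  }
}

const counters = { hasMethod: 0, getSuperclass: 0 };

function hasMethod(cls, selector) {
  counters.hasMethod++;
  return cls.methods.has(selector);
}

function getSuperclass(cls) {
  counters.getSuperclass++;
  return cls.superclass;
}

// Uncached lookup: walk up the hierarchy until a class defines the method.
function lookup(cls, selector) {
  let current = cls;
  while (current !== null) {
    if (hasMethod(current, selector)) return current.methods.get(selector);
    current = getSuperclass(current);
  }
  return undefined;
}

const widgetClass = new ClassModel("Widget", null, {
  fitsInto: (self, width) => self.width <= width,
});
const buttonClass = new ClassModel("Button", widgetClass, {});
const radioButtonClass = new ClassModel("RadioButton", buttonClass, {});

// One lookup starting at RadioButton: hasMethod runs 3 times, getSuperclass 2 times.
const fitsIntoMethod = lookup(radioButtonClass, "fitsInto");
console.log(counters);
```

Without caching, every call in the loop of findAllThatFit would pay this walk again.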
20. Solution: Lookup Caching
w.fitsInto(width) could be various functions, but we don’t need to do the same lookup repeatedly.
The call site caches the method found for the receiver's class, with room for further methods in case we see a widget of a different class.
PIC: check the receiver and jump to the method directly in machine code.
Useful because:
• Most sends are monomorphic
• Few are polymorphic
• And just a couple are megamorphic
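A polymorphic inline cache can be imitated at the JavaScript level to show the idea. This is a minimal sketch under assumptions: makeCallSite, MAX_ENTRIES, and the slowLookups counter are hypothetical, and a real PIC lives in generated machine code rather than in a closure.

```javascript
class Widget {
  constructor(width) { this.width = width; }
  fitsInto(width) { return this.width <= width; }
}
class Button extends Widget {}
class RadioButton extends Button {}

const MAX_ENTRIES = 4; // assumed limit; beyond it the site is treated as megamorphic

// One cache per call site, keyed by the receiver's class.
function makeCallSite(selector) {
  const site = { entries: [], slowLookups: 0 };
  site.call = (receiver, ...args) => {
    const cls = receiver.constructor;
    for (const entry of site.entries) {
      if (entry.cls === cls) return entry.method.apply(receiver, args); // cache hit
    }
    // Cache miss: full lookup (delegated here to JavaScript's own property lookup).
    site.slowLookups++;
    const method = receiver[selector];
    if (site.entries.length < MAX_ENTRIES) site.entries.push({ cls, method });
    return method.apply(receiver, args);
  };
  return site;
}

const fitsIntoSite = makeCallSite("fitsInto");

function findAllThatFit(arr, width) {
  const result = [];
  for (const w of arr) {
    if (fitsIntoSite.call(w, width)) result.push(w);
  }
  return result;
}

const widgets = [new RadioButton(10), new RadioButton(50), new Button(20), new RadioButton(5)];
const fitting = findAllThatFit(widgets, 20);
```

With this input the site sees two receiver classes, so it performs two full lookups and serves the remaining calls from the cache.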
22. JUST IN TIME COMPILATION
Generating Machine Code at Run Time
23. Just-in-time Compilation
• Produces optimized native code, avoiding the overhead of interpretation
• At run time, can utilize knowledge about program execution
• Ahead-of-time compilation, i.e., classic static compilation, can only guess how a program is used
24. With PICs, we can know
Types observed while running findAllThatFit from slide 18:
• arr: Array
• w: RadioButton
• width: Integer
• w.fitsInto(width): for a RadioButton receiver, the call resolves to Widget.fitsInto
25. And Generate Efficient Code
function findAllThatFit(arr, width) {
  const result = [];
  for (const w of arr)
    if (w.width /* direct field read */ <= /* integer comparison */ width)
      result.push(w);
  return result;
}
fitsInto is inlined and the code is specialized to the observed types, resulting in efficient machine code.
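The specialization can be pictured at the source level as a fast path behind a guard. This is only an illustration under assumptions: the function names are hypothetical, the prototype comparison stands in for the receiver check a PIC would perform, and a real JIT does this in machine code and deoptimizes rather than calling a second function.

```javascript
class Widget {
  constructor(width) { this.width = width; }
  fitsInto(width) { return this.width <= width; }
}
class Button extends Widget {}
class RadioButton extends Button {}

// Generic version: works for any receiver, with a dynamic call per element.
function findAllThatFitGeneric(arr, width) {
  const result = [];
  for (const w of arr) if (w.fitsInto(width)) result.push(w);
  return result;
}

// Specialized version: assumes every element is a plain RadioButton instance.
// If the guard fails, it gives up and reruns the generic code from the start.
function findAllThatFitSpecialized(arr, width) {
  const result = [];
  for (let i = 0; i < arr.length; i++) {
    const w = arr[i];
    if (Object.getPrototypeOf(w) !== RadioButton.prototype) {
      return findAllThatFitGeneric(arr, width); // guard failed
    }
    if (w.width <= width) result.push(w); // inlined body of fitsInto
  }
  return result;
}
```

Restarting with the generic version is safe here only because the loop has no side effects outside its local result array.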
29. The Power of Dynamic Languages
o = {foo: 33}          // object with 1 field
o.bar = new Object()   // object with 2 fields
o.float = 4.2          // object with 3 fields
o.float = "string"     // and you can store anything
30. Data Representation for Objects
o = {foo: 33}
o.bar = new Object()
o.baz = "string"
o.float = 4.2
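[Figure: object Obj with numbered storage slots (1, 2, 3, …, 8); the field names foo, bar, baz, and float map to slots holding values such as 33, "string", and 4.2]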
Full Power of Dynamic Languages: Rarely Used
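Because objects are rarely reshaped in arbitrary ways, the field layout can be shared between objects. The following is a minimal sketch of maps (hidden classes, or shapes), assuming hypothetical Shape and DynObject classes; unlike the shapes shown later in the deck, it does not record field types.

```javascript
// A shape maps field names to slot indices and remembers its transitions.
class Shape {
  constructor(fields) {
    this.fields = fields;          // Map: field name -> slot index
    this.transitions = new Map();  // Map: added field name -> successor Shape
  }
  withField(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const fields = new Map(this.fields);
      fields.set(name, fields.size);
      next = new Shape(fields);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const EMPTY_SHAPE = new Shape(new Map());

class DynObject {
  constructor() {
    this.shape = EMPTY_SHAPE;
    this.slots = [];
  }
  set(name, value) {
    if (!this.shape.fields.has(name)) this.shape = this.shape.withField(name);
    this.slots[this.shape.fields.get(name)] = value;
  }
  get(name) {
    const index = this.shape.fields.get(name);
    return index === undefined ? undefined : this.slots[index];
  }
}

// Objects built the same way end up sharing one shape.
function makeObject() {
  const o = new DynObject();
  o.set("foo", 33);
  o.set("bar", {});
  o.set("baz", "string");
  o.set("float", 4.2);
  return o;
}

const a = makeObject();
const b = makeObject();
```

Since a and b share a shape, a single identity check on the shape is enough to know where every field lives.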
33. METAPROGRAMMING
Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
[Figure labels: Reflection; Metaprogramming; Excellent Performance]
37. Zero-Overhead Metaprogramming: Reflection and Metaobject Protocols Fast and without Compromises.
Marr, S., Seaton, C. & Ducasse, S. (2015). PLDI’15
Simple Metaprogramming: Zero Overhead
http://stefan-marr.de/papers/pldi-marr-et-al-zero-overhead-metaprogramming-artifacts/
39. METAOBJECT PROTOCOLS
Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
[Figure labels: Excellent Performance; Metaobject Protocols]
40. Metaobject Protocols
class WriteLogging extends Metaclass {
  writeToField(obj, fieldName, value) {
    console.log(`${fieldName}: ${value}`);
    obj.setField(fieldName, value);
  }
}
Redefine the Language from within the Language
41. Problem
obj.field = 12;
turns into
writeToField(obj, "field", 12);

function writeToField(obj, fieldName, value) {
  console.log(`${fieldName}: ${value}`);
  obj.setField(fieldName, value);
}
[Figure label: AOP]
Looks very hard!
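In plain JavaScript, a comparable interception of field writes can be approximated with the standard Proxy and Reflect APIs. This is a sketch for illustration, not the mechanism the talk describes; withWriteLogging and the log array are hypothetical names.

```javascript
const log = [];

// Hypothetical helper: wraps an object so every field write first passes
// through a writeToField-style hook, then performs the actual store.
function withWriteLogging(obj) {
  return new Proxy(obj, {
    set(target, fieldName, value, receiver) {
      log.push(`${String(fieldName)}: ${value}`);
      return Reflect.set(target, fieldName, value, receiver);
    },
  });
}

const obj = withWriteLogging({});
obj.field = 12; // goes through the set trap before the value is stored
```

The extra indirection on every write is exactly the kind of overhead that looks hard to remove.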
42. Ownership-based Metaobject Protocol
Building a Safe Actor Framework
class ActorDomain extends Domain {
  writeToField(obj, fieldIdx, value) {
    if (Domain.current() === this) {
      obj.setField(fieldIdx, value);
    } else {
      throw new IsolationError(obj);
    }
  }
  /* ... */
}
http://stefan-marr.de/research/omop/
43. An Actor Example
actor.fieldA := 1
The semantics of this write depend on the metaobject: it may be AD.writeToField or the standard write.
Caching the desired language semantics at the write eliminates the potential variability.
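Caching the write semantics per write site can be sketched as follows. Everything here is an assumption for illustration: currentDomain and runIn stand in for Domain.current(), objects carry an explicit domain reference, and the cache holds a single entry keyed by the domain rather than whatever a real implementation would key on.

```javascript
class IsolationError extends Error {}

let currentDomain = null; // hypothetical stand-in for Domain.current()

function runIn(domain, fn) {
  const previous = currentDomain;
  currentDomain = domain;
  try { return fn(); } finally { currentDomain = previous; }
}

class Domain {
  // Standard semantics: a plain field write.
  writeToField(obj, name, value) { obj.fields[name] = value; }
}

class ActorDomain extends Domain {
  // Actor semantics: only code running in the owning domain may write.
  writeToField(obj, name, value) {
    if (currentDomain === this) obj.fields[name] = value;
    else throw new IsolationError(`write to ${name} from a foreign domain`);
  }
}

function makeObject(domain) { return { domain, fields: {} }; }

// A write site that remembers the handler for the last seen domain,
// so repeated writes skip the metaobject lookup.
function makeWriteSite(name) {
  const site = { cachedDomain: null, cachedHandler: null, lookups: 0 };
  site.write = (obj, value) => {
    if (site.cachedDomain !== obj.domain) {
      site.lookups++; // cache miss: look up the semantics again
      site.cachedDomain = obj.domain;
      site.cachedHandler = obj.domain.writeToField;
    }
    site.cachedHandler.call(obj.domain, obj, name, value);
  };
  return site;
}

const standard = new Domain();
const actorDomain = new ActorDomain();
const plain = makeObject(standard);
const actor = makeObject(actorDomain);
const writeFieldA = makeWriteSite("fieldA");

runIn(actorDomain, () => {
  writeFieldA.write(actor, 1);
  writeFieldA.write(actor, 2); // served from the cache
});
```

Once the handler is fixed at the site, a compiler has a concrete target it could inline, just as with an ordinary method call.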
46. GRADUAL TYPING
Is there hope for our Knight from the Land of Engineering Shortcuts to find the true treasure?
[Figure labels: Excellent Performance; Gradual Typing]
47. Gradual Typing without Run-Time Semantics
async addMessage(user: User, message) {
  const msg = `<img src="/image/${user.profilePicture}">
    <span>${message}</span>`;
  this.outputElem.insertAdjacentHTML('beforeend', msg);
}
Very useful in practice, and rather popular.
48. Transient Gradual Typing
type Vehicle = interface {
  registration
  registerTo(_)
}
type Department = interface { code }

var companyCar: Vehicle := object {
  method registerTo(d: Department) { print "{d.code}" }
}

companyCar.registerTo(
  object { var name := "R&D" })

Types are shallow: method names matter, but arguments don’t.
The argument object only has name, no code method, so passing it to registerTo(d: Department) should error.
49. Transient Gradual Typing
tmp := object {
  method registerTo(d) {
    typeCheck d is Department
    print "{d.code}"
  }
}
typeCheck tmp is Vehicle
var companyCar := tmp

companyCar.registerTo(
  object { var name := "R&D" })

Very simple semantics. Other gradual systems have blame and are more complex.
Possibly many, many checks.
[Figure label: Gradual Typing]
Looks very hard!
50. How to get rid of these checks without losing run-time semantics?
51. Shapes to the Rescue
Shape
  1: foo(int)
  2: baz(ptr)
  3: float(float)
  4: bar(ptr)
Implicitly compatible to:
- Type 1
- Type 2
1. Check object is compatible
2. Shape implies compatibility
52. Final optimized code
tmp := object {
  method registerTo(d) {
    check d hasShape s1
    print "{d.code}"
  }
}
check tmp hasShape s2
var companyCar := tmp

companyCar.registerTo(
  object { var name := "R&D" })

The type check is needed only once per lexical location; afterwards the shape check suffices (s1 implies code, s2 implies registerTo).
The JIT compiler can remove redundant checks.
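The idea of checking once per lexical location and then trusting the shape can be sketched in JavaScript. This is an illustration under assumptions: objects carry an explicit shape descriptor, a structural type is a list of required member names, and makeTypeCheckSite and fullChecks are hypothetical names.

```javascript
// Hypothetical shapes: one shared descriptor per object layout.
const shapeWithCode = { members: new Set(["code"]) };
const shapeWithName = { members: new Set(["name"]) };

// A structural type is modeled as the list of member names it requires.
const Department = ["code"];

let fullChecks = 0;

// Each check site keeps its own cache from shape to check result, so the
// structural check runs once per shape at that lexical location.
function makeTypeCheckSite(type) {
  const cache = new Map();
  return (obj) => {
    let ok = cache.get(obj.shape);
    if (ok === undefined) {
      fullChecks++;
      ok = type.every((member) => obj.shape.members.has(member));
      cache.set(obj.shape, ok);
    }
    if (!ok) throw new TypeError("object is not compatible with the expected type");
    return obj;
  };
}

const checkIsDepartment = makeTypeCheckSite(Department);

function registerTo(d) {
  checkIsDepartment(d); // after the first hit this is a shape lookup only
  return d.fields.code;
}

const rnd = { shape: shapeWithCode, fields: { code: "R&D" } };
const ops = { shape: shapeWithCode, fields: { code: "OPS" } };
const unnamed = { shape: shapeWithName, fields: { name: "R&D" } };

registerTo(rnd);
registerTo(ops); // same shape: no second structural check
```

Negative results are cached as well, so an incompatible shape still raises the error but does not repeat the structural comparison.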
57. Things I didn’t talk about
• Failure cases: Deoptimization
  An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. Chambers, C., Ungar, D. & Lee, E. (1989). OOPSLA’89
• Object shapes are useful for other things
  Efficient and Thread-Safe Objects for Dynamically-Typed Languages. B. Daloze, S. Marr, D. Bonetta, and H. Mössenböck. OOPSLA'16
• And many other modern optimizations
58. Our Knights Made it With Some Help of our 90’s Heroes
[Figure labels: Excellent Performance; Reflection; Metaprogramming; Gradual Typing; Metaobject Protocols]
60. Research and Literature
• Efficient Implementation of the Smalltalk-80 System. Deutsch, L. P. & Schiffman, A. M. (1984). POPL’84
• Optimizing Dynamically-Typed Object-Oriented Languages With Polymorphic Inline Caches. Hölzle, U., Chambers, C. & Ungar, D. (1991). ECOOP’91
• Zero-Overhead Metaprogramming: Reflection and Metaobject Protocols Fast and without Compromises. Marr, S., Seaton, C. & Ducasse, S. (2015). PLDI’15
• Optimizing prototypes in V8. https://mathiasbynens.be/notes/prototypes
• https://mathiasbynens.be/notes/shapes-ics
• https://mrale.ph/blog/2012/06/03/explaining-js-vms-in-js-inline-caches.html
61. Research and Literature
• An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. Chambers, C., Ungar, D. & Lee, E. (1989). OOPSLA’89
• An Object Storage Model for the Truffle Language Implementation Framework. A. Wöß, C. Wirth, D. Bonetta, C. Seaton, C. Humer, and H. Mössenböck. PPPJ’14
• Storage Strategies for Collections in Dynamically Typed Languages. C. F. Bolz, L. Diekmann, and L. Tratt. OOPSLA’13
• Memento Mori: Dynamic Allocation-site-based Optimizations. Clifford, D., Payer, H., Stanton, M. & Titzer, B. L. (2015). ISMM’15
62. Research and Literature
• Virtual Machine Warmup Blows Hot and Cold. Barrett, E., Bolz-Tereick, C. F., Killick, R., Mount, S. & Tratt, L. (2017). OOPSLA’17
• Quantifying Performance Changes with Effect Size Confidence Intervals. Kalibera, T. & Jones, R. (2012). Technical Report, University of Kent.
• Rigorous Benchmarking in Reasonable Time. Kalibera, T. & Jones, R. (2013). ISMM’13
• How Not to Lie With Statistics: The Correct Way to Summarize Benchmark Results. Fleming, P. J. & Wallace, J. J. (1986). Commun. ACM
• SIGPLAN Empirical Evaluation Guidelines. https://www.sigplan.org/Resources/EmpiricalEvaluation/
• Systems Benchmarking Crimes. Gernot Heiser. https://www.cse.unsw.edu.au/~gernot/benchmarking-crimes.html
• Benchmarking Crimes: An Emerging Threat in Systems Security. van der Kouwe, E., Andriesse, D., Bos, H., Giuffrida, C. & Heiser, G. (2018). arxiv:1801.02381
• http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
• Generating an Artefact From a Benchmarking Setup as Part of CI. https://stefan-marr.de/2019/05/artifacts-from-ci/
Editor's Notes
I know, questions are hard.
So, let’s practice together.
Lt. Frank Drebin (Leslie Nielsen)
Use of runtime programming sneaks into all kinds of places, including performance-sensitive parts.
This is a Ruby library for processing Photoshop files. We chose a number of kernels that do layer composition to show that optimizing reflective operations gives major benefits.
This library is not artificial: it is widely used, and arguably, layer composition can be performance critical.
It has more than 500 forks on GitHub…