The slides of my PhD defense. More information is available here: http://soft.vub.ac.be/~smarr/2013/01/supporting-concurrency-abstractions-in-high-level-language-virtual-machines/
Supporting Concurrency Abstractions in High-level Language Virtual Machines
1. Supporting Concurrency Abstractions in
High-level Language Virtual Machines
Stefan Marr
Promotor: Prof. Dr. Theo D’Hondt
Copromotor: Dr. Michael Haupt
Software Languages Lab
Public PhD Defense, 2013-01-18
10. How To Support Concurrent and Parallel
Programming Abstractions in a VM?
Main
Contribution
Virtual Machine + OMOP
Virtual Machine
Context 10
11. Thesis Statement
There exists a
relevant and significant subset of concurrent and
parallel programming concepts
that can be realized on top of a unifying substrate.
This substrate enables the
flexible definition of language semantics
that build on the identified set of concepts, and this
substrate
lends itself to an efficient implementation.
12. Agenda
Survey Problems Requirements The OMOP Evaluation Implementation Evaluation
Substrate Performance
There exists a
relevant and significant subset of concurrent and parallel programming concepts
that can be realized on top of a unifying substrate.
This substrate enables the
flexible definition of language semantics
that build on the identified set of concepts, and this substrate
lends itself to an efficient implementation.
12
13. What do we need for a Multi-Language Virtual Machine?
2 SURVEYS: VMS AND CONCEPTS
14. Survey 1: Today’s VM Support
Always a mismatch with some concepts
VM | Threads & Locks | Communicating Threads | Communicating Isolates
CLI X X
DisVM X
Erlang X
GHC X
JavaScript X
JVM + Dalvik X
Mozart/Oz X X
Python X X
Ruby X
Self X
Squeak X
* Table is abbreviated (Diss. Tab. 3.2)
Marr, S.; Haupt, M.; and D’Hondt, T. (2009), Intermediate language design of high-level language virtual machines: Towards
comprehensive concurrency support. In VMIL’09 Workshop, pages 3:1–3:2. ACM. (extended abstract)
Surveys 14
16. Survey Goal: Understand Concepts
LIBRARY Can it be implemented as a library? Solved
Problems
PRIOR ART Is it supported by a mainstream VM?
SEMANTICS Does it require runtime support to
guarantee its semantics?
PERFORMANCE Would runtime support enable significant
performance improvements?
Marr, S. and D’Hondt, T. (2012), Identifying A Unifying Mechanism for the Implementation of Concurrency Abstractions on Multi-
Language Virtual Machines, in TOOLS’12, Springer, pp. 171–186.
Surveys 16
17. Existing and Library Solutions
• 97 concepts identified, 66 considered distinct
Prior Art: Asynchronous Operations, Atomic Primitives, Co-routines, Condition Variables, Critical Sections, Fences, Global Address Spaces, Global Interpreter Lock, Green Threads, Immutability, Join, Locks, Memory Model, Method Invocation, Race-And-Repair, Thread Pools, Thread-local Variables, Threads, Volatiles, Wrapper Objects
Library Solutions: Agents, Atoms, Concurrent Objects, Event-Loop, Events, Far-References, Futures, Guards, MVars, Message Queue, Parallel Bulk Operations, Reducers, Single Blocks, State Reconciliation
Surveys 17
18. VM Support Required for:
Performance Improvements Semantic Guarantees
APGAS Implicit Parallelism Active Objects Message sends
Actors No-Intercession
Barriers Locality
Asynchronous Persistent Data
Clocks Mirrors Invocation Structures
Data Movement One-sided Axum-Domains Replication
Communication By-Value Side-Effect Free
Channels Speculative
Data-Flow Graphs Ownership Execution
Data-Flow Variables PGAS Data Streams Transactions
Isolation Tuple Spaces
Fork/Join Vector Operations
Map/Reduce Vats
Surveys 18
19. Results: Two Independent Sets of
Requirements
Improving Performance
• Optimization infrastructure
• Monitoring facilities
Ensuring Correct Semantics
• Custom language behavior
• Semantic enforcement
Exposed to language implementer
Surveys 19
20. Focus on Semantic Aspects
Improving Performance
• Optimization infrastructure
• Monitoring facilities
Ensuring Correct Semantics
• Custom language behavior
• Semantic enforcement
Exposed to language implementer
Surveys 20
22. Need to Solve Common
Language-Implementation Problems
• Isolation
– State encapsulation
– Safe message passing
• Scheduling guarantees
– Fairness, ordering,…
• Immutability
• Reflection violates
concurrency properties
[1] Karmani, R. K.; Shali, A. & Agha, G. (2009), Actor Frameworks for the JVM Platform: A Comparative Analysis, in PPPJ '09, ACM.
[2] Marr, S.; De Wael, M.; Haupt, M. & D'Hondt, T. (2011), Which Problems Does a Multi-Language Virtual Machine Need to Solve in the Multicore/Manycore Era?, in VMIL’11, ACM.
22
23. Isolation?
Broken State Encapsulation
object semaphore {
  class SemaActor() extends Actor {
    def enter() {
      if (num < MAX) { // Race Condition!
        num = num + 1; } } }

  def main() : Unit = {
    var gate = new SemaActor()
    gate.start
    gate ! enter // gate's thread
    gate.enter // main thread
  } }
Weak Semantics are Common
• Scala
• Akka
• JCSP
• Kilim
• Clojure
• Swing UI
• …
Example from: Karmani, R. K.; Shali, A. & Agha, G. (2009), Actor Frameworks for the JVM Platform: A Comparative Analysis, in PPPJ '09, ACM.
23
24. Reflection?
Voids Concurrency Semantics
Client actor (objects: customer, order)
  // ...
  report<-add(customer.order)
  // ...
Message sent to the report actor: add(order)
Report actor (object: report)
  void add(obj) {
    fields = obj.class.getFields()
    for (field : fields) {
      field.setAccessible(true)
      // breaks isolation
      list += field.get(obj)
    }
  }
Problems 24
26. What does the VM need to support?
DEDUCING REQUIREMENTS
26
27. Deduced Requirements for VMs
Based on Survey and Common Problems
Managed State | Managed Execution | Notion of Ownership | Controlled Enforcement
Requirements 27
28. Required: Managed State
• Variety of state access policies
– Actors, agents, axum domains, by-value, immutability, isolation, side-effect freeness, transactions, vats
• Problematic on today’s VMs
– Isolation
– Immutability
Requirements 28
41. Case Studies: Cover complete OMOP
Case study | Clojure Agents | STM (LRSTM) | Event-Loop Actors (AmbientTalkST)
Ad Hoc | No Guarantees | Custom AST Transformation | Wrapper Objects for Safe Message Passing
OMOP-based | Full Guarantees | Custom State Access Policies | Custom Execution Policies based on Ownership
LRSTM ported from: Renggli, L. & Nierstrasz, O. (2007), Transactional Memory for Smalltalk, in ‘Proceedings of the International Conference on Dynamic Languages 2007’, ACM, pp. 207-221.
Evaluation: Semantics 41
42. Evaluation Thesis Statement Part 1
“relevant and significant subset of concurrent
and parallel programming concepts”
Relevant:
Novel set of supported
concepts
Significant:
Support all concepts X +
Evaluation: Semantics 42
43. Novel Set of Supported Concepts
VM | Threads & Locks | Communicating Threads | Communicating Isolates
CLI X X
DisVM X
Erlang X
GHC X
JavaScript X
JVM + Dalvik X
Mozart/Oz X X
Python X X
Ruby X
OMOP X X X
* Table is abbreviated (Diss. Tab. 3.2)
Marr, S.; Haupt, M.; and D’Hondt, T. (2009), Intermediate language design of high-level language virtual machines: Towards
comprehensive concurrency support. In VMIL’09 Workshop, pages 3:1–3:2. ACM. (extended abstract)
Surveys 43
44. Degree of Support for Concepts
Legend: X = direct support, + = partial support
Concept whose semantics require support | OMOP support | Concept whose semantics require support | OMOP support
Active Objects X Message sends +
Actors X No-Intercession X
Asynchronous Invocation X Ownership X
Axum-Domains X Persistent Data Structures +
By-Value + Replication +
Channels + Side-Effect Free X
Data Streams + Speculative Execution +
Implicit Parallelism + Transactions X
Isolation X Tuple Spaces +
Map/Reduce + Vats X
Evaluation: Semantics 44
45. Evaluation Thesis Statement Part 1
“relevant and significant subset of concurrent
and parallel programming concepts”
Relevant:
Novel set of supported
concepts ✔
Significant:
Support all concepts X + ✔
Evaluation: Semantics 45
46. Evaluation Thesis Statement Part 2
“on top of a unifying substrate”
Abstraction:
No 1-to-1 mapping of concepts ✔
Minimalism:
All elements required ✔
Evaluation: Semantics 46
47. Implementation Size in LOC
Case study | Ad hoc | Ownership-based MOP
Agents | 36* | 113
LRSTM | 990 | 262
AmbientTalkST | 183+ | 83+
ActiveObjects | - | 68
CSP+π | - | 53
*) without enforcement of semantics
+) incomplete state encapsulation
Evaluation: Semantics 47
48. Evaluation Thesis Statement Part 3
“enables the
flexible definition of language semantics”
Demonstration:
Case studies ✔
Concepts X + ✔
Implementation Assessment:
Size ✔
Evaluation: Semantics 48
50. Impl 1: AST Transformation Applied
cell := Cell new. "unenforced"
cell set: 1.
immDomain adopt: cell.
result := [ "enforced"
    "cell set: 2"
    cell domain reqExecOf: #set:
                with: 2
                on: cell.
] enforced.
Marr, S. and D’Hondt, T. (2012), Identifying A Unifying Mechanism for the Implementation of Concurrency Abstractions on Multi-
Language Virtual Machines, in TOOLS’12, Springer, pp. 171–186.
Implementation 50
51. Impl 3: Bytecodes + Primitives Adapted
Impl 2: does a test in every bytecode
void unenforced_extendedSendBytecode() {
  messageSelector = literal(fetchByte());
  set_argCount(fetchByte());
  normalSend();
}
void enforced_extendedSendBytecode() {
  int args = fetchByte();
  set_argCount(args + 2);
  Oop rcvr = stackValue(args);
  Oop domain = rcvr.domain();
  set_stackValue(args, domain);
  push(literal(fetchByte()));
  push(rcvr);
  messageSelector = req_exec_selector(args);
  normalSend();
}
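The stack rewriting done by the enforced send bytecode can be traced with a small simulation. This is only a sketch in Python, not the RoarVM code: the operand stack is a plain list whose last element is the top, and `enforced_send`, `owners` and the returned name `"requestExec"` (standing in for `req_exec_selector(args)`) are hypothetical.

```python
def enforced_send(stack, arg_count, selector, domain_of):
    """Rewrite the operand stack the way the enforced send bytecode does."""
    receiver_index = len(stack) - 1 - arg_count   # stackValue(args)
    receiver = stack[receiver_index]
    stack[receiver_index] = domain_of(receiver)   # the domain becomes the receiver
    stack.append(selector)                        # push the original selector
    stack.append(receiver)                        # push the original receiver
    # The send now targets the domain with two extra arguments.
    return "requestExec", arg_count + 2


owners = {"cell": "immDomain"}                    # hypothetical ownership table
stack = ["cell", 2]                               # receiver, then one argument
send = enforced_send(stack, 1, "set:", owners.__getitem__)
```

After the call, the domain sits in the receiver slot and the original selector and receiver travel along as additional arguments, so the normal send machinery delivers the request to the domain.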
Implementation 51
54. Performance
Runtime, normalized to the corresponding Ad Hoc implementation
[Plot: series AST-OMOP on CogVM and RoarVM+OMOP3 (opt); y-axis ticks 1.00, 3.16, 10.00; benchmarks Binary Trees, Fannkuch, Fasta, NBody, each for AmbientTalkST (AT) and STM]
Overall slowdown: 25% (min -3%, max 3x)
Performance Sweet Spot
• Custom state access policies
• Within 5% on average for LRSTM (min -3%, max. 12%)
Evaluation: Performance 54
56. Identified Requirements
for Unifying Abstraction
Managed State
Managed Execution
Notion of Ownership
Controlled Enforcement
Requirements based on surveys and common problems.
Conclusion 56
57. Ownership-based MOP as Concurrency
Abstraction for VMs
Agents
AmbientTalkST
ActiveObjects
CSP+π
STM
Conclusion 57
58. Benefits of the OMOP
Solves Common Problems
• Isolation
• Immutability
• Reflection
• Improves Enforceability of Scheduling Policies
Supports Semantics of Concepts
• Active objects, actors, async invocation, axum domains, isolation, side-effect free, transactions, vats, …
Conclusion 58
59. Future Work
• VM support for parallel performance
• Evaluate OMOP overhead in JIT’ed VMs, JVM
• Widen sweet spot (actors, CSP, …): use MMU
• Limitations
– Interactions between domains: Actors + STM,…
– Interface for scheduling policies
Conclusion 59
60. Main Contributions
Determined Requirements | Devised the OMOP | Solved Common Problems
Main Publications
Marr, S. and D’Hondt, T. (2012), Identifying A Unifying Mechanism for the Implementation of Concurrency Abstractions on Multi-
Language Virtual Machines, in TOOLS’12, Springer, pp. 171–186.
Marr, S.; De Wael, M.; Haupt, M. & D'Hondt, T. (2011), Which Problems Does a Multi-Language Virtual Machine Need to Solve in
the Multicore/Manycore Era?, in VMIL’11, ACM.
Marr, S.; Haupt, M.; and D’Hondt, T. (2009), Intermediate language design of high-level language virtual machines: Towards
comprehensive concurrency support. In VMIL’09 Workshop, pages 3:1–3:2. ACM. (extended abstract)
Editor's Notes
Ok, so the title of my dissertation is “Supporting Concurrency Abstractions in High-level Language Virtual Machines”. <next-slide>
Let me start with a bit of introduction. So, what are Virtual Machines? People use them because they provide a number of interesting benefits. Let’s look at an intuitive one. If you build an application, a game such as Angry Birds, you want to reach as many people as possible, on all kinds of devices. <next-animation> The problem is, besides building the game, you need to adapt it for each platform. And there are quite a number of them out there, which all differ in some way or another. So, in the end, you build, in addition to your application, quite a few platform-specific parts you would like to avoid. <next-slide>
And Virtual Machines help here. So in an ideal world, you develop only your application, and don’t worry about platform-specific parts anymore. That’s what the VM takes care of. <next-slide>
But that’s only one aspect. More important is that VMs provide modern tools, compared to programming with classic C/C++. These tools, or programming language features, include: Just-in-time compilation to make sure your programs run fast. Automated memory management, because it simplifies software development a lot. <next-slide>
We also see a trend here: people actually develop all kinds of different languages, so, better programming tools, on top of these VMs. So multi-language VMs can for instance improve productivity. And these multi-language VMs are the kind of VMs I am interested in. <next-slide>
Besides VMs, there is another important aspect to my work. We have seen an interesting trend in the last decade with the development of processors. They reached a speed limit around 2005. <next-slide>
They reached a point where current technology just produces too much heat. They can’t go faster anymore. Now, instead of having faster processors, we get them with more cores. So they can do multiple things at the same time. <next-slide>
That’s unfortunately a real problem. If you imagine what developers need to think about and keep track of in their heads, it looks perhaps something like this. A very complex system, where all kinds of stuff could go wrong at any point at the same time. Rather complicated to work with. <next-slide>
So, what do people do? Well, I was talking about VMs. They basically go in that direction and propose better tools to manage the complexity. And my problem is that they proposed a large bunch of things over the last 7 decades. <next-slide>
So, my question is, how can we support all the things we need in a VM? The problem is that a VM cannot just support all of them directly. That’s too complex and not maintainable. So, the question is, which ones to support or, how to support them all in a better way? <next-animation> My proposal is an Ownership-based Metaobject Protocol as a minimal unifying substrate. It’s the main contribution of this dissertation. And, one important notion here is the notion of ownership. But we will see the details later. <next-slide>
In this presentation, I’ll highlight the main parts of my dissertation, based on my thesis statement: There exists a relevant and significant subset of concurrent and parallel programming concepts. I’ll give the used definitions later in the evaluation. These programming concepts can be realized on top of a unifying substrate. That is my OMOP. <next-animation> To evaluate that: The substrate needs to enable the flexible definition of language semantics. And it needs to lend itself to an efficient implementation. <next-slide>
I map the thesis statement onto the following parts of my presentation: First, I give an overview of the supporting surveys, the identified problems, and the concrete requirements that I derived from them. Then, I am going to discuss the solution, and the different parts of my evaluation strategy. <next-slide>
Ok, so let’s start with the two surveys.<next-slide>
In the first survey, we investigated the state of the art in VM support. The main insight we gained is that VMs do not support the wide range of concepts out there. They are typically designed for one specific language, which makes specific choices in the concepts it supports. To simplify the discussion, I categorized the VM support into Threads & Locks, Communicating Threads, and Communicating Isolates. <point out table heading> The main point I want to make is that trying to map concepts from one category onto another always leads to a mismatch. These mismatches lead to either higher implementation complexity or semantic insufficiencies, if we try to implement these concepts on top of these VMs. So, it’s not an ideal situation, and none of the VMs actually supports all three categories. <next-slide>
The second survey concentrates on the actual concepts proposed in the field. People have been doing research for 70 years. And they developed a wide range of different approaches and solutions. <next-slide>
So, the goal of this survey is to understand these concepts and to derive requirements. The approach of this survey is to classify the concepts based on the four questions here <point out slide>. If we can implement a concept without problems as a library, or if it is already widely supported in VMs, so prior art, we consider it a solved problem. <next-animation> So, we can concentrate on the last two questions: We evaluate whether a concept requires VM support to guarantee its properties, its semantics. And, we evaluate whether a VM could provide better performance by having greater knowledge about certain concepts. This is based on literature that proposes optimizations. <next-slide>
So, the overall result is that we have roughly 100 concepts that we identified in the literature and in programming languages. 66 of them have been discussed in detail. The ones listed on this slide fall in the first category I described. They are solved problems for this dissertation. So, we can move on to the remaining concepts. <next-slide>
The concepts listed here can benefit from VM support. On the one hand, we have the ones for which we identified ways to improve performance with VM support. And on the other hand, we have the ones that require VM support to ensure correct semantics. Examples are isolation of actors, asynchronous invocation on active objects. These tables give of course only the rough categorization. In the text, I detail them further to extract the relevant characteristics. And we use that discussion to derive requirements for VM support. <next-slide>
Based on the two groups of concepts, we derive two independent sets of requirements: requirements for performance improvements, and requirements to ensure the correct semantics of the desired concepts. The result of our analysis shows that the VM could provide a range of mechanisms to expose the adaptive optimization infrastructure to the language developers. So, making the JIT compiler accessible to them. On the other hand, we can imagine customizable performance monitoring facilities that enable dynamic optimizations similar to what is done today on a much more restricted level. Independent of that are the semantic issues. <next-slide>
We decided to focus in this dissertation on those. To be able to consider all the aspects and dive in deeply. Those two parts are basically two separate research projects, both interesting, and plenty of cool things we could do in future work. Anyway, let’s focus on the semantic aspects first. While the surveys yielded some general requirements, we wanted to refine them first. Refine based on common problems language implementers are facing when targeting today’s VMs and multi-language environments. <next-slide>
So, based on the survey, we look now into common problems, to refine our requirements.<next-slide>
The identified problems are based on a survey done for Actor languages, and on our own work and experience. We identified the four problems listed here. I am going to give an example for the problem of isolation, and show the way it is usually broken on today’s VMs and languages. Scheduling guarantees are another set of problems, which come from the fact that there is typically no way to enforce scheduling semantics such as fairness or ordering. Immutability, and its weak realization, is a common problem as well. And, it is problematic for the Java people. They very recently started a proposal to get real immutability in the form of value objects: http://openjdk.java.net/jeps/169 For the problem with Reflection, I am also giving an example. The idea is that these problems are common when language implementers target multi-language VMs, and there isn’t a general solution available so far. <next-slide>
Ok, so the first example is isolation. By isolation I mean that it is guaranteed that no shared state access is possible between isolated entities. Thus, on the one hand, state encapsulation needs to be ensured; that’s this example. And on the other hand, we care about safe message passing that does not introduce problematic shared state. The example given here is a simple actor program written in Scala. On the side here, I list a couple of languages and frameworks that have such problems. It is fairly common, especially when the JVM or .NET are targeted. What happens is that we create a gate actor that’s supposed to be a semaphore. <point-out-lower-part-of-slide> However, we actually send it an asynchronous message, and afterwards execute the same message synchronously as well. This is an inherent race condition, and the language does not prevent it. And we get a problem with the num field, because it is not read and updated atomically. One reason to allow such situations is to avoid the additional implementation complexity to enforce the rules. Other aspects might be performance considerations, etc. <next-slide>
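The broken encapsulation can be made visible with a small sketch. It is written in Python rather than Scala and is not the code from the cited paper: `SemaActor`, its mailbox, and the `executed_on` bookkeeping are hypothetical stand-ins. It only shows that the direct call runs on the caller's thread while the message runs on the actor's thread, so the check-then-increment on `num` is no longer serialized by the mailbox.

```python
import queue
import threading


class SemaActor:
    MAX = 1

    def __init__(self):
        self.num = 0
        self.executed_on = []            # names of the threads that ran enter()
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, name="gate")

    def start(self):
        self._thread.start()

    def send(self, selector):
        """Asynchronous send: the actor's own thread runs the method later."""
        self._mailbox.put(selector)

    def stop(self):
        self._mailbox.put(None)          # sentinel that ends the event loop
        self._thread.join()

    def _run(self):
        while (selector := self._mailbox.get()) is not None:
            getattr(self, selector)()

    def enter(self):
        self.executed_on.append(threading.current_thread().name)
        if self.num < self.MAX:          # check ...
            self.num += 1                # ... then act: not atomic across threads


gate = SemaActor()
gate.start()
gate.send("enter")   # processed on the gate's thread
gate.enter()         # runs on the calling thread and bypasses the mailbox
gate.stop()
```

Nothing in the sketch stops the second call, which mirrors the point made above: the guarantee exists only by convention unless the platform enforces it.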
The second problem I want to detail here is reflection. Reflection was designed for sequential code, and most applications use it to circumvent language restrictions. However, if we look at this example here, that’s not exactly what we want. So, we have here two actors. All of their interaction has to happen via messages that are processed asynchronously by the receiving actor. However, in this scenario here, we have that customer and that order object inside the client actor, and we send the order over to a report actor, to have it persisted for reporting reasons. The problem now is, while standard field accesses would respect the encapsulation, reflection does not. In Java, you use things like setAccessible(true) to disable security checks, and then you do whatever you want. In the worst case, you actually end up with inconsistent state. That’s no good, so reflection needs to be more flexible. We need to be able to access private fields, but we still want to be subject to the concurrency-related guarantees. <next-slide>
Ok, what I was talking about so far was the analysis part of my work. Let’s move on and put the pieces together.
With the concepts identified, and the problems just discussed, I am now going over the requirements for a unifying substrate for a VM. <next-slide>
I will briefly go over the four concrete requirements: managed state, managed execution, a notion of ownership, and controlled enforcement. <next-slide>
Managed state is the first one. If we look at the different concepts, one aspect that is important is that they define different rules with respect to state access. There is a lot of variation, and a language implementer needs a way to implement these variations and define different rules, for instance for the reading and writing of object fields. As discussed with the problems, isolation and immutability are typically problematic to guarantee on top of the JVM and similar VMs. <next-slide>
Managed execution is the second one. Very similar to managed state, but this time it is about variations to the method invocation semantics. One specifically problematic case is reflective invocation of methods. Even when a method is called via reflection, you still want to make sure that it is only executed asynchronously in case of active objects. <next-slide>
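A rough sketch of what managed execution could look like for active objects, written in Python. All names here (`ActiveObjectDomain`, `request_exec`, `reflective_invoke`, `Counter`) are assumptions for illustration and not the interface from the dissertation. The point is only that a reflective call is routed through the same handler as a normal send, so it cannot become a synchronous invocation.

```python
import queue


class ActiveObjectDomain:
    """Sketch of a domain that makes every execution request asynchronous."""

    def __init__(self):
        self.pending = queue.Queue()

    def request_exec(self, selector, receiver, *args):
        # Enqueue the request instead of running it on the caller's thread.
        self.pending.put((selector, receiver, args))

    def process_next(self):
        # Would be driven by the active object's own thread or event loop.
        selector, receiver, args = self.pending.get()
        return getattr(receiver, selector)(*args)


class Counter:
    def __init__(self):
        self.count = 0

    def increment(self, by):
        self.count += by


def reflective_invoke(domain, receiver, selector, *args):
    # Hypothetical reflective entry point: it takes the same path as a
    # normal send, so reflection does not bypass the execution policy.
    domain.request_exec(selector, receiver, *args)


domain = ActiveObjectDomain()
counter = Counter()
reflective_invoke(domain, counter, "increment", 5)
count_before = counter.count      # the call is only queued at this point
domain.process_next()
count_after = counter.count
```

The sketch processes the queue on the same thread to stay deterministic; a real active object would drain it from its own thread.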
The third thing we require is a notion of ownership. What we see in our survey is that the different concepts typically define their rules with respect to an owning entity. So, the most typical thing is that on the object level there is some notion of owner, which has certain rights on the object, while other entities have different rights. One example is again the event-loop actor model, where we have full access in the same actor, but can only interact asynchronously with objects in other actors. <next-slide>
The last requirement is what I call ‘Controlled Enforcement’. Reflection is way too useful to just forbid it. I listed a couple of use cases here. The main point we need to support is that we do not have an all-or-nothing approach to reflection. We still want to be able to enforce concurrency semantics, and easily use reflection where it is not problematic. Ok, and that completes our requirements. <next-slide>
So, based on the surveys, the discussed problems, and the derived requirements: The proposed solution, and the main contribution of this dissertation, is the ‘Ownership-based Metaobject Protocol’, or for short: the OMOP. <next-slide>
Ah yes, and all the stuff I am discussing here is implemented. And it is implemented in Smalltalk. There are, depending on how you count, now four implementations that work and enabled me to experiment with the ideas. <next-slide>
I will detail the OMOP by showing its elements and the related requirements. Ok, so the basic parts of the system are of course objects, and threads, which execute the actual program. <next-animation> Based on the ownership requirement, we have a Domain. The domain is the metaobject that defines the language behavior. Every object has one owner, and domains can own an arbitrary number of objects. A thread is said to execute inside a domain. And as you can see, the domain provides intercession handlers to react to threads starting or resuming execution inside a domain, and a handler to define the initial domain of objects created during execution in this domain. <next-animation> To satisfy the managed state requirement, we have these two intercession handlers to define the behavior of field read and write operations. <next-animation> Managed execution is represented by the requestExec handler, which allows us to give special semantics to method execution. For instance, to enforce asynchronous execution. <next-animation> The last requirement is ‘controlled enforcement’. To realize that, the thread has conceptually a bit to indicate the current execution mode. So, either enforced or unenforced execution. I say conceptually, also with respect to other details here; it can be realized differently, as I describe in the text. This here is the model, straightforward, without considering optimizations etc. <next-animation> I want to point out some VM-specific mechanisms such as primitives. So, the idea is that the OMOP is general, and can be applied for instance also to JVMs and so on. Primitives are critical in that regard, also global state, because if you leave them out, your guarantees won’t be complete. <next-animation> The last point before I give an example are helper functions. They are not essential to the OMOP. I added them because during the implementation of my case studies, those things were the most useful ones. <next-slide>
So, the domain is the customizable part. This here is an example of how we can guarantee immutability by overriding all the mutating operations, and instead of performing them, we raise an error. So, we need to cover field writes, and all the mutating primitives. <next-slide>
So, how does the interaction flow with the domain? Let’s look at a small sequence diagram. It includes the main execution, and a current domain, and the immutable domain over there. <next-animation> The first thing I do is creating a cell object. Just a simple mutable thing. That code is part of the initial execution, let’s call it the setup phase where we initialize the application. <next-animation> And, the initial value for our cell is going to be 1. <next-animation> The intention is to have it immutable, so we are asking the immutable domain to adopt it. So, we change the owner of the cell. Now, we are basically done with the setup. <next-animation> The next step is then to start the actual application, and it is going to execute in the enforced mode. <next-animation> Ok, so, we try to do the `set: 2` in the enforced mode. But the enforced mode means that the OMOP is active, and we actually invoke the intercession handler for methods on the domain object instead of executing any code directly. <next-animation> For immutability, that’s not a problem. So, the Domain just does the standard implementation. <next-animation> We perform the set method on the object. <next-animation> At some point, the set method actually reaches the `set/field write` bytecode. But instead of just performing the operation, the domain receives the write:toField:of: message and can refine the semantics. In our case: booom! <next-animation> We get the desired immutability error. <next-slide>
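The sequence just described can be sketched in a few lines of Python. This is an illustration under assumptions, not the Smalltalk implementation: `write_field` and `request_exec` stand in for the write:toField:of: and requestExec handlers, the per-thread flag models the conceptual enforcement bit, and `Owned`, `run_enforced` and `ImmutabilityError` are hypothetical helpers.

```python
import threading

_mode = threading.local()      # per-thread execution mode (the conceptual bit)


def is_enforced():
    return getattr(_mode, "enforced", False)


def run_enforced(block):
    """Run `block` with the OMOP active, then restore the previous mode."""
    previous = is_enforced()
    _mode.enforced = True
    try:
        return block()
    finally:
        _mode.enforced = previous


class Domain:
    """Metaobject with standard semantics; subclasses refine the handlers."""

    def adopt(self, obj):
        obj.owner = self

    def request_exec(self, selector, receiver, *args):
        return getattr(receiver, selector)(*args)

    def write_field(self, obj, name, value):
        obj.__dict__[name] = value


class ImmutabilityError(Exception):
    pass


class ImmutableDomain(Domain):
    def write_field(self, obj, name, value):
        raise ImmutabilityError(f"cannot write field {name!r}")


class Owned:
    """Field writes are delegated to the owner domain while enforced."""
    owner = Domain()

    def __setattr__(self, name, value):
        if is_enforced() and name != "owner":
            self.owner.write_field(self, name, value)
        else:
            object.__setattr__(self, name, value)


class Cell(Owned):
    def set(self, value):
        self.value = value


cell = Cell()                   # setup phase, unenforced
cell.set(1)
ImmutableDomain().adopt(cell)   # change the owner of the cell
try:
    run_enforced(lambda: cell.owner.request_exec("set", cell, 2))
    outcome = "written"
except ImmutabilityError:
    outcome = "rejected"
```

As in the walkthrough, the execution request itself is handled with standard semantics; only the field write inside `set` is refused by the immutable domain.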
Here I have highlighted the elements of the OMOP that are required to realize isolation, one of the big current problems. Ownership is one aspect we need to realize isolation. The other aspect is of course manageable state. Based on these two, we define different rules for inter- and intra-domain object access. Typically, we also need to take care of VM-specifics such as primitives and global state. But, that’s it! And we have solved the isolation problem easily. <next-slide>
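Combining ownership with managed state could look roughly as follows. This Python sketch makes several assumptions: `IsolatingDomain`, `Obj`, `execute_in` and the `read`/`write` helpers are hypothetical, all field access is assumed to go through the helpers, and primitives and global state are not covered.

```python
import threading

_current = threading.local()     # domain the current thread executes in


class IsolationError(Exception):
    pass


class IsolatingDomain:
    def __init__(self, name):
        self.name = name

    def _check(self, field):
        # Intra-domain access is allowed, inter-domain access is rejected.
        if getattr(_current, "domain", None) is not self:
            raise IsolationError(f"{field!r} is owned by domain {self.name!r}")

    def read_field(self, obj, field):
        self._check(field)
        return obj.fields[field]

    def write_field(self, obj, field, value):
        self._check(field)
        obj.fields[field] = value


class Obj:
    def __init__(self, owner, **fields):
        self.owner = owner
        self.fields = dict(fields)


def read(obj, field):
    return obj.owner.read_field(obj, field)


def write(obj, field, value):
    obj.owner.write_field(obj, field, value)


def execute_in(domain, block):
    """Run `block` as if the current thread executed inside `domain`."""
    previous = getattr(_current, "domain", None)
    _current.domain = domain
    try:
        return block()
    finally:
        _current.domain = previous


client, report = IsolatingDomain("client"), IsolatingDomain("report")
order = Obj(client, total=10)

execute_in(client, lambda: write(order, "total", 12))    # same domain: allowed
try:
    execute_in(report, lambda: read(order, "total"))      # other domain: rejected
    foreign_read = "allowed"
except IsolationError:
    foreign_read = "rejected"
```

Because the rule is attached to the owner domain rather than to the accessing code, the same check would apply to any access path that is routed through the handlers.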
The problem of reflection is actually handled implicitly. As a consequence of managed execution, we also reify the primitive execution. And since reflection is typically realized via primitives, they are covered as well. And we can decide on a case-by-case basis which language restrictions to uphold, and which to bypass. As an example, we can now distinguish between isolation and private field protection. <next-slide>
Ok, here a last overview of the key elements of the OMOP, and how the parts map to the requirements. The main elements are that we handle state access, execution, ownership, and that we provide control over the policy enforcement. Let me also mention that the presented OMOP is only one possible design; other variations are possible to satisfy the requirements. The interface design can vary, and we can imagine different optimizations. Possible variations: handle primitives in a unified way like methods. Or explicitly different handlers based on ownership, which could also be a performance optimization. Important is that the requirements are satisfied and that the key elements are represented in the OMOP in one way or another. <next-slide>
Ok, so, I hope that gave a good impression of how the OMOP works. Now, I want to talk about the evaluation approach used. Here, I start with looking at the semantics part. <next-slide>
The evaluation is based on three case studies, all implemented in Smalltalk. I implemented Clojure’s agents, an STM, and event-loop actors similar to AmbientTalk. Once in an ad-hoc fashion, the classical way, and once based on the OMOP. I chose those three case studies to cover all aspects of the OMOP, and investigate different models. With the agents, I experimented with providing extended guarantees. The STM is a nice example of changing the semantics for all state access. And the event-loop actors utilize the notion of ownership to define the policies. The details on how they map to the OMOP are in the text. Here I will focus on the evaluation approach for the thesis statement. <next-slide>
The first part of the thesis statement asks that the set of concepts is relevant and significant. The first part, the relevance, is supported by our surveys. I argue that the set of concepts that are supported is relevant in a multi-language VM setting and it is novel. For the second part, we discuss again the concepts that require semantic support. <next-slide>
So, you might remember this table. The first survey we did. I just want to emphasize here that the OMOP is actually the first to provide support for all the different categories.
The goal of this evaluation is to assess the degree to which the OMOP supports the various concepts. As you can see here, I discussed all the concepts that require support for semantics. I marked them either as directly supported, or as partially supported. With direct support I mean that the main concept maps onto the OMOP directly, while partial support means that the concept is mostly orthogonal to the OMOP, but one key property is supported by it. Active objects are an example for the direct support. Here the main aspect is directly covered by the OMOP. So, you can make sure that all operations are done asynchronously. And, that’s more or less it in terms of active objects. On the other hand, we have concepts such as Persistent Data Structures, which are mostly orthogonal to the OMOP. However, there is a major aspect that is problematic. For persistent data structures, that’s the immutability, and here the OMOP provides the necessary support. Thus, we say it is supported partially. <next-slide>
Ok, so since we cover all the concepts that require support for semantics, either completely or for the problematic aspect, we conclude that the chosen subset is actually significant and relevant.
Another main point of the thesis statement is the unification aspect. Here we simply argue in the text that there is no 1-to-1 mapping, and that the set of provided abstractions is actually minimal. Thus, all of its elements are required to fulfill the stated requirements.
Another aspect I evaluated is implementation size in lines of code. As you can see, providing the extended guarantees to the ad hoc implementation of agents adds a couple of LOC, so that goes beyond what agents typically guarantee. However, it improves the engineering significantly when we actually know, for instance, that everything is immutable. For the STM, the numbers here are a bit superficial. We get the small implementation size because we do not have to transform the AST, as is done by the ad hoc implementation. I detail those numbers much more in the text. The size reduction is clearer for the event-loop actor implementation. Here we stay below 100 lines with the OMOP, while the original version is almost double the size. I also did these two additional implementations to see how flexible the OMOP is, and they stay easily below 100 lines of code.<next-slide>
Looking at the implementation size was one aspect of my evaluation of the flexibility of the OMOP. So, the implementation size seems to be smaller. And the implementation feels more direct, which is subjective, but the OMOP looks pretty good to my eyes. The other aspects with which I argue for the flexibility are the case studies and the supported concepts, which cover a large design space and are a good indication of the flexibility the OMOP provides.<next-slide>
Ok, after evaluating the semantic aspect of the OMOP, I want first to briefly go over the implementation, and then talk about the performance evaluation.<next-slide>
The first implementation is based on AST transformation. Here is the simple cell example from earlier again. As we can see, the set: 2 in the enforced execution block is transformed into a lookup of the owner domain and then a send of the reqExecOf: message with the corresponding parameters. It’s important to note that this implementation runs, and provides the desired semantics, without changing the VM.<next-slide>
The second and third implementations are on the VM level. I show here a code snippet from an implementation variant that I added after submitting the dissertation text. I added it here because the performance is slightly better. We avoid active checks by having an extended bytecode set where the dispatch takes into account whether we execute enforced or unenforced. Ok, so that’s changing the VM, and adapting bytecodes and primitives to invoke the OMOP’s intercession handlers as necessary.<next-slide>
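The rewrite described above can be sketched with Python's `ast` module. This is only an analogy to the Smalltalk implementation, under stated assumptions: `Domain`, `Cell`, `owner_of`, `req_exec_of`, and `run_enforced` are hypothetical names, and the transformer naively rewrites every method call in the given source. The receiver expression is also evaluated twice, which a real implementation would avoid.

```python
import ast
import copy


class Domain:
    """Hypothetical stand-in for an OMOP domain; logs each requested execution."""
    def __init__(self):
        self.log = []

    def req_exec_of(self, selector, receiver, *args):
        # Intercession handler: record the request, then do the default send.
        self.log.append((selector, args))
        return getattr(receiver, selector)(*args)


class Cell:
    def __init__(self, owner):
        self.owner = owner   # every object is owned by exactly one domain
        self.value = None

    def set(self, value):
        self.value = value


def owner_of(obj):
    return obj.owner


class EnforceSends(ast.NodeTransformer):
    """Rewrite `recv.sel(args)` into `owner_of(recv).req_exec_of('sel', recv, args)`."""
    def visit_Call(self, node):
        self.generic_visit(node)
        if not isinstance(node.func, ast.Attribute):
            return node  # plain function calls stay as they are
        recv = node.func.value
        owner = ast.Call(func=ast.Name(id="owner_of", ctx=ast.Load()),
                         args=[copy.deepcopy(recv)], keywords=[])
        handler = ast.Attribute(value=owner, attr="req_exec_of", ctx=ast.Load())
        return ast.Call(func=handler,
                        args=[ast.Constant(value=node.func.attr), recv, *node.args],
                        keywords=node.keywords)


def run_enforced(source, env):
    """Transform the source, then execute it on the unmodified interpreter."""
    tree = EnforceSends().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    exec(compile(tree, "<enforced>", "exec"), env)


domain = Domain()
cell = Cell(domain)
run_enforced("cell.set(2)", {"cell": cell, "owner_of": owner_of})
```

As in the notes, nothing in the underlying interpreter changes here. The semantics come entirely from the rewritten code calling the handler of the owner domain.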
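The idea of selecting between enforced and unenforced bytecodes at dispatch time can be illustrated with a toy interpreter. This is a sketch under assumptions, not the RoarVM code: the opcodes, the `ENFORCED_OFFSET` layout, and the `write_field` handler are all hypothetical. The point is that the mode is folded into the dispatch index, so the individual bytecode bodies contain no "am I enforced?" check.

```python
PUSH, STORE_FIELD = 0, 1
ENFORCED_OFFSET = 2   # hypothetical layout: enforced variants follow the base set


class Domain:
    """Hypothetical domain whose handler records every intercepted field write."""
    def __init__(self):
        self.writes = []

    def write_field(self, obj, name, value):
        self.writes.append((name, value))
        obj.fields[name] = value


class Obj:
    def __init__(self, owner):
        self.owner = owner
        self.fields = {}


def op_push(stack, arg):
    stack.append(arg)


def op_store(stack, arg):
    # Unenforced variant: write directly, never consult the domain.
    value = stack.pop()
    obj = stack.pop()
    obj.fields[arg] = value


def op_store_enforced(stack, arg):
    # Enforced variant: always delegate to the owner's intercession handler.
    value = stack.pop()
    obj = stack.pop()
    obj.owner.write_field(obj, arg, value)


# Base bytecodes first, then their enforced counterparts at the same relative index.
DISPATCH = [op_push, op_store, op_push, op_store_enforced]


def run(code, enforced):
    offset = ENFORCED_OFFSET if enforced else 0   # decided once, not per bytecode
    stack = []
    for opcode, arg in code:
        DISPATCH[opcode + offset](stack, arg)


domain = Domain()
obj = Obj(domain)
program = [(PUSH, obj), (PUSH, 2), (STORE_FIELD, "value")]

run(program, enforced=False)
writes_after_unenforced = list(domain.writes)
run(program, enforced=True)
```

The same program runs in both modes. Only the enforced run reaches the domain's handler.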
Ok, so with these implementation strategies in mind, the next question is of course the evaluation of the thesis statement, and to see whether the OMOP can be implemented efficiently.<next-slide>
For the performance evaluation, I selected a number of micro and kernel benchmarks and prepared them for the STM and AmbientTalkST implementations. I also carefully prepared the actual measurement setup to avoid all unrelated and unnecessary disturbance in the results, which could lead to misinterpretations. One important thing I did is that I disabled a number of features in the RoarVM to improve performance. That includes the ability to execute code in parallel. This gives a performance boost of 23% [Speedup of RoarVM (opt) over RoarVM: 23.2% (min 6.5%, max 43.7%)]. The goal is to reduce the noise of the interpreter and do what we can to make any performance impact coming from my changes stick out more clearly. It’s not supposed to hide in the noise. With that, I claim that the results I obtain are generalizable to other interpreters. But of course, not beyond that. So, they don’t have any predictive power for other classes of VMs.<next-slide>
Ok, so I am only showing the graph for the most relevant experiment here. I compare the performance of the ad hoc implementations to the OMOP-based implementations. The graph here uses a log scale, because we compare ratios. The AST implementation, that is the gray bars, is pretty bad overall. But that was to be expected. As we can see, the blue bars look pretty good in the case of the STM. For AmbientTalkST we do not get numbers that good, so it seems that the current implementation strategy has a sweet spot for STM-like things, that is, when the state access is heavily customized. Here we get an average of 5% overhead, which is close to on-par performance. Overall we have 25% worse numbers unfortunately, but I feel confident to claim that there is at least one sweet spot in which we do not have to give up performance. And putting the distinction between inter- and intra-domain access into the VM should speed up things like event-loop actors as well.<next-slide>
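A short sketch of why ratios go on a log scale: a benchmark that is twice as slow and one that is twice as fast should cancel out, which they do under a geometric mean but not under an arithmetic one. The function names and the runtimes below are hypothetical placeholders, not measurements from the thesis, and the choice of the geometric mean as the averaging method is an assumption made for this illustration.

```python
import math


def overhead_ratios(adhoc_runtimes, omop_runtimes):
    """Runtime of the OMOP-based version divided by the ad hoc version, per benchmark."""
    return {name: omop_runtimes[name] / adhoc_runtimes[name]
            for name in adhoc_runtimes}


def geometric_mean(values):
    """Average of ratios: the mean of the logs, mapped back with exp."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Placeholder runtimes in seconds, chosen only to show the symmetry.
adhoc = {"bench_a": 1.0, "bench_b": 2.0}
omop = {"bench_a": 2.0, "bench_b": 1.0}

ratios = overhead_ratios(adhoc, omop)           # one 2.0x and one 0.5x
mean_ratio = geometric_mean(ratios.values())    # the two cancel out
arithmetic = sum(ratios.values()) / len(ratios) # 1.25, suggests a slowdown that is not there
```

On a log axis the 2.0x and 0.5x bars have the same length in opposite directions, which matches what the geometric mean reports.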
To conclude this presentation, let me briefly recap the results.<next-slide>
One important aspect of my work is the surveys and the analysis of what a VM needs to support. On the one hand, we have a list of general requirements. On the other hand, focusing on the semantic aspects, we have a very concrete list of requirements to support the semantics of the concepts and to solve common problems people encounter when implementing languages on top of multi-language VMs.<next-slide>
The OMOP, my main contribution, is designed based on these requirements. And with the case studies, I showed that it provides the flexibility to implement a number of abstractions, as listed here, and that it fulfills, of course, the requirements.<next-slide>
The main benefit of the OMOP is that it solves common problems. And with that, it makes the life of a language implementer a lot easier, from my perspective. People can now go and implement all their fancy abstractions on top of the OMOP, and voilà, the semantics are as desired, no problems with reflection, etc. You can go and invent and experiment with domain-specific abstractions, and try to help solve the multicore challenge.<next-slide>
Of course, there are new questions that wait to be answered. The performance on the JVM and other JITed VMs needs to be investigated. We need to investigate how to widen the sweet spot for the OMOP, and I have a couple of ideas for that. And then there are a number of things that are not yet comfortable. One example is having STM and actors in the same application. The OMOP domains need to be prepared for that. So, reuse of domain definitions is not as trivial as it should be.<next-slide>
Again, the main contributions are that we determined the requirements for a multi-language VM, that we proposed the OMOP, and that we have shown that it solves the common problems people have today when targeting things like the JVM and .NET. Questions?