Parallel First-Order Operations

Parallel first-order operations
Sina Madani, Dimitris Kolovos, Richard Paige
{sm1748, dimitris.kolovos, richard.paige}@york.ac.uk
Enterprise Systems, Department of Computer Science
OCL 2018, Copenhagen

Outline
• Background and related work
• Epsilon Object Language (EOL)
• Parallelisation challenges and solutions
• Performance evaluation
• Future work
• Questions

Motivation
• Scalability is an active research area in model-driven engineering
• Collaboration and versioning
• Persistence and distribution
• Continuous event processing
• Queries and transformations
• Very large models / datasets common in complex industrial projects
• First-order operations frequently used in model management tasks
• Diminishing gains in single-thread performance, increasing number of cores
• Vast majority of operations on collections are pure functions
• i.e. inherently thread-safe and parallelisable

Related Work
• Parallel ATL (Tisi et al., 2013)
• Task-parallel approach to model transformation
• Parallel OCL (Vajk et al., 2011)
• Automated parallel code generation based on CSP and C#
• Lazy OCL (Tisi et al., 2015)
• Iterator-based lazy evaluation of expressions on collections
• Parallel Streams (Java 8+, 2014)
• Rich and powerful API for general queries and transformations
• Combines lazy semantics with divide-and-conquer parallelism

Epsilon Object Language (EOL)
• Powerful imperative programming constructs
• Independent of underlying modelling technology
• Interpreted, model-oriented Java + OCL-like language
• Base query language of Epsilon
• Global variables
• Cached operations
• ...and more

General challenges / assumptions
• Need to capture state prior to parallel execution
• e.g. Any declared variables need to be accessible
• Side-effects need not be persisted
• e.g. through operation invocations
• Operations should not depend on mutable global state
• Caches need to be thread-safe
• Through synchronization or atomicity
• Mutable engine internals (e.g. frame stack) are thread-local
• Intermediate variables’ scope is limited to each parallel “job”
• No nested parallelism
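
The requirement that caches be thread-safe "through atomicity" can be illustrated with the standard ConcurrentHashMap. This is a minimal sketch, not Epsilon's implementation: the class CachedOperation and its methods are hypothetical names, and a plain java.util.function.Function stands in for a cached EOL operation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.IntStream;

// Hypothetical illustration: a memoising wrapper whose cache can be shared between threads.
final class CachedOperation<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final AtomicInteger invocations = new AtomicInteger();
    private final Function<K, V> body;

    CachedOperation(Function<K, V> body) {
        this.body = body;
    }

    // computeIfAbsent is atomic per key, so the body runs at most once for each argument
    V call(K argument) {
        return cache.computeIfAbsent(argument, key -> {
            invocations.incrementAndGet();
            return body.apply(key);
        });
    }

    int invocationCount() {
        return invocations.get();
    }

    public static void main(String[] args) {
        CachedOperation<Integer, Integer> square = new CachedOperation<>(x -> x * x);
        // Many parallel calls, but only ten distinct arguments
        IntStream.range(0, 10_000).parallel().forEach(i -> square.call(i % 10));
        System.out.println(square.call(7) + " after " + square.invocationCount() + " evaluations");
    }
}
```

The body passed to computeIfAbsent must not itself update the same map, which is one reason cached operations should not depend on mutable global state.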

Collection<T> select (Expression<Boolean> predicate, Collection<T> source)
• Filters the collection based on a predicate applied to each element
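// One job per element; an empty Optional marks an element that was filtered out
var jobs = new ArrayList<Callable<Optional<T>>>(source.size());
for (T element : source) {
    jobs.add(() -> {
        if (predicate.execute(element))
            return Optional.of(element);
        else return Optional.empty();
    });
}
// Results arrive in source order; keep only the elements that matched
var results = new ArrayList<T>();
context.executeParallel(jobs).forEach(opt -> opt.ifPresent(results::add));
return results;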

context.executeParallel (ordered)
EolThreadPoolExecutor executorService = getExecutorService();
// Submit all jobs up front so that they can run concurrently
List<Future<T>> futureResults = jobs.stream()
    .map(executorService::submit).collect(Collectors.toList());
// Collect in submission order: get() blocks until each job has finished
List<T> actualResults = new ArrayList<>(futureResults.size());
for (Future<T> future : futureResults) {
    actualResults.add(future.get());
}
return actualResults;
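
For readers without the Epsilon sources at hand, the ordered execution strategy and the select operation built on it can be approximated with the standard java.util.concurrent API alone. The sketch below is an assumption-laden stand-in: OrderedParallelSelect is a hypothetical class, java.util.function.Predicate replaces the EOL Expression, and a plain fixed thread pool replaces EolThreadPoolExecutor.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical stand-in for the ordered executeParallel + select combination.
final class OrderedParallelSelect {

    // Submits every job first, then collects the results in submission order
    static <T> List<T> executeParallel(ExecutorService executor, Collection<Callable<T>> jobs) {
        List<Future<T>> futures = new ArrayList<>(jobs.size());
        for (Callable<T> job : jobs) {
            futures.add(executor.submit(job));
        }
        List<T> results = new ArrayList<>(futures.size());
        try {
            for (Future<T> future : futures) {
                results.add(future.get()); // blocks until this particular job is done
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Parallel job failed", e);
        }
        return results;
    }

    static <T> List<T> select(Predicate<T> predicate, Collection<T> source) {
        ExecutorService executor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Optional<T>>> jobs = new ArrayList<>(source.size());
            for (T element : source) {
                jobs.add(() -> predicate.test(element) ? Optional.of(element) : Optional.<T>empty());
            }
            List<T> results = new ArrayList<>();
            executeParallel(executor, jobs).forEach(opt -> opt.ifPresent(results::add));
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(select(n -> n % 2 == 0, List.of(5, 2, 8, 3, 6)));
    }
}
```

Because futures are read back in the order they were submitted, the output preserves the source order regardless of which job finishes first.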

T selectOne (Expression<Boolean> predicate, Collection<T> source)
• Finds any* element matching the predicate
• Same as select, except with short-circuiting
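for (T element : source) {
    jobs.add(() -> {
        // The first job to find a match signals completion with its element
        if (predicate.execute(element))
            context.completeShortCircuit(Optional.of(element));
    });
}
// Blocks until a match is signalled or all jobs have finished
Optional<T> result = context.awaitShortCircuit(jobs);
hasResult = result != null;
if (hasResult) return result.get();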

context.shortCircuit
• ExecutionStatus object used for signalling completion
• “AwaitCompletion” thread waits for completion of jobs
• Also checks whether the completion status has been signalled
• Main thread waits for the ExecutionStatus to be signalled
• Call to context.completeShortCircuit() signals the ExecutionStatus
• “AwaitCompletion” terminates upon interruption
• After control returns to main thread, remaining jobs are cancelled
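
The signalling scheme described above can be approximated with a CompletableFuture playing the role of the ExecutionStatus object. This is only a sketch under stated assumptions: ShortCircuitSelectOne is a hypothetical class, the pool size of 4 is arbitrary, and the real engine's status object and cancellation logic may differ.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical sketch of short-circuiting selectOne using only java.util.concurrent.
final class ShortCircuitSelectOne {

    static <T> Optional<T> selectOne(Predicate<T> predicate, Collection<T> source) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        // Stand-in for the status object: completed by the first match, or once all jobs finish
        CompletableFuture<Optional<T>> status = new CompletableFuture<>();
        List<Future<?>> jobs = new ArrayList<>(source.size());
        for (T element : source) {
            jobs.add(executor.submit(() -> {
                if (!status.isDone() && predicate.test(element)) {
                    status.complete(Optional.of(element)); // later completions are ignored
                }
            }));
        }
        // "AwaitCompletion" thread: waits for every job, then signals that nothing matched
        Thread awaitCompletion = new Thread(() -> {
            try {
                for (Future<?> job : jobs) {
                    job.get();
                }
                status.complete(Optional.empty());
            } catch (Exception e) {
                // Interrupted or cancelled after a match: a no-op, status is already complete.
                // A failed job: propagates the failure to the waiting main thread.
                status.completeExceptionally(e);
            }
        });
        awaitCompletion.start();
        try {
            return status.join(); // main thread blocks until the status is signalled
        } finally {
            awaitCompletion.interrupt();
            jobs.forEach(job -> job.cancel(true)); // cancel whatever has not finished yet
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(selectOne(n -> n > 3, List.of(1, 2, 3, 4, 5)).isPresent());
    }
}
```

Which matching element is returned depends on thread scheduling, so the sketch finds any match rather than the first one in source order.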

Boolean nMatch (Expression<Boolean> predicate, int n, Collection<T> source)
• Returns true iff the collection contains exactly n elements satisfying the predicate
AtomicInteger matches = new AtomicInteger(), evaluated = new AtomicInteger();
for (T element : source) {
    jobs.add(() -> {
        int evaluatedInt = evaluated.incrementAndGet();
        // Stop early once there are too many matches, or too few elements left to reach n
        if (predicate.execute(element) && (matches.incrementAndGet() > n ||
                sourceSize - evaluatedInt < n - matches.get())) {
            context.completeShortCircuit();
        }
    });
}
return matches.get() == n;

Boolean exists (Expression<Boolean> predicate, Collection<T> source)
• Returns true if any element matches the predicate
• Same as selectOne, but returns a Boolean
var selectOne = new ParallelSelectOneOperation();
selectOne.execute(source, predicateExpression);
return selectOne.hasResult();

Boolean forAll (Expression<Boolean> predicate, Collection<T> source)
• Returns true iff all elements match the predicate
• Delegate to nMatch to benefit from short-circuiting
var nMatch = new ParallelNMatchOperation(source.size());
return nMatch.execute(source, predicateExpression);
• Alternatively, delegate to exists with inverted predicate
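
The alternative delegation rests on the identity forAll(p) = not exists(not p). A minimal sketch, using Java parallel streams as a stand-in for the parallel exists operation (ForAllViaExists is a hypothetical class, and Predicate replaces the EOL Expression):

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: forAll expressed through exists with an inverted predicate.
final class ForAllViaExists {

    // True if any element satisfies the predicate; anyMatch may stop at the first match
    static <T> boolean exists(Predicate<T> predicate, Collection<T> source) {
        return source.parallelStream().anyMatch(predicate);
    }

    // All elements match exactly when no counterexample exists
    static <T> boolean forAll(Predicate<T> predicate, Collection<T> source) {
        return !exists(predicate.negate(), source);
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(2, 4, 6, 8);
        System.out.println(forAll(n -> n % 2 == 0, numbers));
        System.out.println(forAll(n -> n > 2, numbers));
    }
}
```

With this formulation the short-circuit fires on the first counterexample, and an empty collection yields true.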

Collection<R> collect (Expression<R> mapFunction, Collection<T> source)
• Transforms each element T into R, returning the result collection
• Computationally similar to select, but simpler
• No wrapper required, since we’re performing a one-to-one mapping
var jobs = new ArrayList<Callable<R>>(source.size());
for (T element : source) {
    jobs.add(() -> mapFunction.execute(element));
}
context.executeParallel(jobs).forEach(results::add);
return results;

List<T> sortBy (Expression<Comparable<?>> property, Collection<T> source)
• Sorts the collection according to the derived Comparable
• Maps each element to a Comparable using collect
• Sorts the derived collection based on the Comparator property of each derived element
• Sorting can be parallelised using java.util.Arrays.parallelSort
• Divide-and-conquer approach, sequential threshold = 8192 elements
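
The two phases of sortBy (derive a Comparable per element, then sort) can be sketched as follows. This is an illustration under assumptions, not the Epsilon code: ParallelSortBy is a hypothetical class, Function replaces the EOL Expression, a parallel stream stands in for the parallel collect, and sorting an index array is just one convenient way to keep elements and keys paired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch of sortBy: parallel key derivation followed by Arrays.parallelSort.
final class ParallelSortBy {

    static <T, K extends Comparable<? super K>> List<T> sortBy(
            Function<T, K> property, Collection<T> source) {
        List<T> elements = new ArrayList<>(source);
        // Phase 1: derive one Comparable key per element in parallel (encounter order is kept)
        List<K> keys = elements.parallelStream().map(property).collect(Collectors.toList());
        // Phase 2: sort the element positions by their derived keys
        Integer[] order = new Integer[elements.size()];
        Arrays.setAll(order, i -> i);
        Arrays.parallelSort(order, (a, b) -> keys.get(a).compareTo(keys.get(b)));
        // Read the elements back in sorted order
        List<T> sorted = new ArrayList<>(order.length);
        for (int position : order) {
            sorted.add(elements.get(position));
        }
        return sorted;
    }

    public static void main(String[] args) {
        Function<String, Integer> length = String::length;
        System.out.println(sortBy(length, List.of("ccc", "a", "bb")));
    }
}
```

Each key is computed exactly once, so an expensive property expression is not re-evaluated during comparisons.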

Map<K, Collection<T>> mapBy (Expression<K> keyExpr, Collection<T> source)
• Groups elements based on the derived key expression
var jobs = new ArrayList<Callable<Map.Entry<K, T>>>(source.size());
for (T element : source) {
    jobs.add(() -> {
        K result = keyExpr.execute(element);
        return new SimpleEntry<>(result, element);
    });
}
// Pair each element with its key in parallel, then group the pairs by key
Collection<Map.Entry<K, T>> intermediates = context.executeParallel(jobs);
Map<K, Sequence<T>> result = mergeByKey(intermediates);
return result;
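
The slide does not show mergeByKey itself. One plausible shape for it, offered purely as an assumption, is a sequential fold of the (key, element) pairs into groups; in this sketch java.util.List stands in for Epsilon's Sequence and MergeByKey is a hypothetical class.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the merge step that follows the parallel key derivation.
final class MergeByKey {

    // Groups elements by key, keeping keys and elements in the order they were produced
    static <K, T> Map<K, List<T>> mergeByKey(Collection<Map.Entry<K, T>> intermediates) {
        Map<K, List<T>> result = new LinkedHashMap<>();
        for (Map.Entry<K, T> entry : intermediates) {
            result.computeIfAbsent(entry.getKey(), key -> new ArrayList<>()).add(entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>(1, "a"));
        pairs.add(new SimpleEntry<>(2, "bb"));
        pairs.add(new SimpleEntry<>(1, "c"));
        System.out.println(mergeByKey(pairs));
    }
}
```

Keeping the merge on a single thread avoids shared mutable state in the jobs: only the key derivation, which is assumed to be the expensive part, runs in parallel.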

Testing for correctness
• EUnit – JUnit-style tests for Epsilon
• Testing of all operations, with corner cases
• Equivalence test of sequential and parallel operations
• Testing of scope capture, operation calls, exception handling etc.
• Repeated many times with no failures

Performance evaluation
• Execution time on X axis
• Speedup indicated on data points (higher is better)
• Number of threads indicated in parentheses on Y axis
• All tests performed on following system:
• AMD Threadripper 1950X (16 cores / 32 threads)
• 32 GB (4 x 8GB) DDR4-3000MHz RAM
• Oracle JDK 11 HotSpot VM
• Fedora 28 OS

[Figure: select (3.53 million elements). Execution time (seconds), axis range 0 to 9000, for Sequential, Parallel (16) and Parallel (32). Speedup labels on the data points, in that order: 1, 12.438, 13.334.]

Future Work
• closure
• aggregate and iterate
• Identify bottlenecks to improve performance
• Combine with lazy solution
• More comprehensive performance evaluation
• Test all operations
• Compare with Eclipse OCL
• More varied and complex models / queries

Questions?
• Data-parallelisation of first-order operations on collections
• Short-circuiting operations more complex to deal with
• Stateful operations, such as mapBy, require different approach
• Significant performance improvement with more cores
• Open-source
sm1748@york.ac.uk
eclipse.org/epsilon
github.com/epsilonlabs/parallel-erl

Thread-local base delegation example
• Can be used to solve variable scoping
• Each thread has its own frame stack (used for storing variables)
• Each thread-local frame stack has a reference to the main thread’s frame stack
• If a variable in the thread-local frame stack can’t be found, look in the main thread frame stack
• Main thread frame stack should be thread-safe, but thread-local frame stacks needn’t be
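
The delegation scheme can be sketched with a ThreadLocal. The sketch simplifies heavily and is not Epsilon's frame stack: DelegatingFrameStack is a hypothetical class, and each "stack" is reduced to a single map of variable names to values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: per-thread variable frames that fall back to a shared base frame.
final class DelegatingFrameStack {
    // Main thread's variables: read by every worker, so the map must be thread-safe
    private final Map<String, Object> base = new ConcurrentHashMap<>();
    // One private frame per thread: never shared, so a plain HashMap is enough
    private final ThreadLocal<Map<String, Object>> local = ThreadLocal.withInitial(HashMap::new);

    void putBase(String name, Object value) {
        base.put(name, value);
    }

    void putLocal(String name, Object value) {
        local.get().put(name, value);
    }

    // Looks in the calling thread's own frame first, then in the main thread's frame
    Object get(String name) {
        Map<String, Object> frame = local.get();
        if (frame.containsKey(name)) {
            return frame.get(name);
        }
        return base.get(name);
    }

    public static void main(String[] args) throws InterruptedException {
        DelegatingFrameStack frames = new DelegatingFrameStack();
        frames.putBase("threshold", 5); // declared before the parallel operation starts
        Thread worker = new Thread(() -> {
            frames.putLocal("tmp", 42); // intermediate variable, visible to this job only
            System.out.println("worker sees " + frames.get("tmp") + " and " + frames.get("threshold"));
        });
        worker.start();
        worker.join();
        System.out.println("main sees tmp = " + frames.get("tmp"));
    }
}
```

Variables written by a job stay in that job's thread-local frame, which matches the assumption that intermediate variables are scoped to each parallel job.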

Control Flow Traceability
• Different parts of the program could be executing simultaneously
• Need execution trace for all threads
• Solution:
• Each thread has its own execution controller
• Record the trace when exception occurs
• Parallel execution terminates when any thread encounters an exception
