Performance Test Driven Development
with Oracle Coherence
Alexey Ragozin
London, 18 Jul 2013
Presentation outline
• Motivation
• PTDD philosophy
• Oracle Coherence under test
 Coherence Vs. Testing
 Small cluster Vs. Big cluster
 Areas to keep an eye on
• Automation challenge
• Common pitfalls of performance testing
This code works
Filter keyFilter = new InFilter(new KeyExtractor(), keySet);
EntryAggregator aggregator = new Count();
Object count = cache.aggregate(keyFilter, aggregator);

ValueExtractor[] extractors = {
    new PofExtractor(String.class, TRADE_ID),
    new PofExtractor(String.class, SIDE),
    new PofExtractor(String.class, SECURITY),
    new PofExtractor(String.class, CLIENT),
    new PofExtractor(String.class, TRADER),
    new PofExtractor(String.class, STATUS),
};
MultiExtractor projector = new MultiExtractor(extractors);
ReducerAggregator reducer = new ReducerAggregator(projector);
Object result = cache.aggregate(keyFilter, reducer);
This code also works
public static class MyNextExpiryExtractor implements ValueExtractor {
    @Override
    public Object extract(Object obj) {
        MyPortfolio pf = (MyPortfolio) obj;
        long nextExpiry = Long.MAX_VALUE;
        for (MyOption opt : pf.getOptions()) {
            if (opt.getExpiry() < nextExpiry) {
                nextExpiry = opt.getExpiry();
            }
        }
        return nextExpiry;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName();
    }
}
And this also looks Ok
@LiveObject
public static class MyLiveObject implements PortableObject {
    // ...
    @EventProcessorFor(events = {EntryInsertedEvent.class})
    public void inserted(
            EventDispatcher dispatcher,
            EntryInsertedEvent event) {
        DefaultCommandSubmitter.getInstance()
            .submitCommand(
                "subscription",
                new MySubscribeCommand(event.getEntry().getKey()));
    }
}
Another slide to scare you
[Diagram: approximate sequence for a cache get operation. A single get crosses many hops — client thread (serialization) → API → service thread → packet publisher → packet speaker → OS, over the network to OS → packet listener → packet receiver → service thread / worker thread of the cache service, then back through the same packet publisher → speaker → listener → receiver chain for the response, ending with deserialization on the client thread.]
Approximate sequence diagram for cache get operation
Functional Vs. Fast
 You have paid for Coherence
 You have paid for gigabytes of RAM
 You have spent time developing solution
and you want to be REALLY FAST
 Do not be a hostage of your beliefs
 Test early
 Test often
PTDD Philosophy
Working cycle
 Write simplest functional code
 Benchmark it
 Improve based on test measurements
Never optimize unless you can measure outcome
Never speculate, measure
Saves time and improves work/life balance 
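The "write, benchmark, improve" cycle above can be sketched without any Coherence dependency. A minimal timing harness (the workload below is a stand-in, not a real cache call) that separates the warm-up pass from the measured run:

```java
import java.util.concurrent.TimeUnit;

// Minimal sketch of the benchmark step of the PTDD cycle:
// time a candidate operation, warming up before trusting numbers.
public class MicrobenchSketch {

    // Average microseconds per invocation of op over the given iterations
    static long timeMicros(Runnable op, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            op.run();
        }
        return TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        // Stand-in workload; in a real test this would be the cache operation
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100; i++) sb.append(i);
            sb.toString();
        };
        timeMicros(workload, 10_000);                 // warm-up, result discarded
        long micros = timeMicros(workload, 10_000);   // measured run
        System.out.println("avg micros per op: " + micros);
    }
}
```

The warm-up call matters: measuring before the JIT has compiled the hot path produces numbers you cannot act on.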
Testing Coherence
Challenges
 Cluster required
 Sensitive to network
 Database is usually part of solution
 Massive parallel load generation required
Testing Coherence
Benefits
 Pure Java, fewer hurdles with OS tweaking etc
 Nodes are usually plain J2SE processes
 you can avoid app server setup pain
 No disk persistence
 managing data is usually hardest part of test setup
Benchmarking and cluster size
Single node cluster may reveal
 server side processing issues
Small cluster 2-8 physical servers
 latency related problems
 scalability anomalies
 partitioning anomalies
Large cluster > 8 physical servers
 my terra incognita, so far
Areas to keep an eye on
Extractors, queries, indexes
• query index usage
• query plans for complex filters
• POF extractors
Server side processing
• backing map listeners
• storage side transformations
• cross cache access
Network
• effective network throughput
Capacity
• large messages in cluster
• Coherence*Extend buffers
Mixed operation loads
• cache service thread pool saturation
• cache service lock contention
Scale out
• broadcast requests
• hash quality
• 100% utilization of network thread
Automation
“Classic” approach
 bash + SSH + log scraping
Problems
 not reusable
 short “half-life” of test harness
 Java and bash/awk are totally different skill sets
Automation
Stock performance test tools
 Deployment is not covered
 Often “web oriented”
 Insufficient performance of the tool itself
Automation – Game changer
cloud = CloudFactory.createSimpleSshCloud();
cloud.node("cbox1");
cloud.node("cbox2");
cloud.node("cbox3");
cloud.node("**").touch();

// Say hello
cloud.node("**").exec(new Callable<Void>() {
    @Override
    public Void call() throws Exception {
        String jvmName =
            ManagementFactory.getRuntimeMXBean().getName();
        System.out.println("My name is '" + jvmName + "'. Hello!");
        return null;
    }
});
Automation – Game changer
NanoCloud - http://code.google.com/p/gridkit/wiki/NanoCloudTutorial
• Managing slave nodes
 in-process, local JVM process, remote JVM process
• Deploy free remote execution
• Classpath management
 automatic master classpath replication
 include/exclude classpath elements
• Pain free master/slave communications
• Just works! 
Automation – Game changer
Full stack
• Maven – ensure test portability
• Nanocloud – painless remote execution
• JUnit – test entry point
• Java – all test logic
starting nodes, starting clients, load generation, result processing …
• Jenkins – execution scheduling
Simple microbenchmark
@Before
public void setupCluster() {
    // Simple cluster configuration template
    // Single host cluster config preset
    cloud.all().presetFastLocalCluster();
    cloud.all().pofEnabled(true);
    cloud.all().pofConfig("benchmark-pof-config.xml");

    // DSL for cache config XML generation
    DistributedScheme scheme = CacheConfig.distributedSheme();
    scheme.backingMapScheme(CacheConfig.localScheme());
    cloud.all().mapCache("data", scheme);

    // Configuring roles
    cloud.node("storage*").localStorage(true);
    cloud.node("client").localStorage(false);

    // Storage nodes will run as separate processes
    cloud.node("storage*").outOfProcess(true);
}
*https://github.com/gridkit/coherence-search-common/blob/master/src/test/java/org/gridkit/coherence/search/bench/FilterPerformanceMicrobench.java
Simple microbenchmark
@Test
public void verify_full_vs_index_scan() {
    // Tweak JVM arguments for storage nodes
    JvmProps.addJvmArg(cloud.node("storage-*"),
        "|-Xmx1024m|-Xms1024m|-XX:+UseConcMarkSweepGC");

    // Generating data for benchmark
    // ...

    cloud.node("client").exec(new Callable<Void>() {
        @Override
        public Void call() throws Exception {
            NamedCache cache = CacheFactory.getCache("data");
            System.out.println("Cache size: " + cache.size());
            calculate_query_time(tagFilter); // warm-up run, result discarded
            long time =
                TimeUnit.NANOSECONDS.toMicros(calculate_query_time(tagFilter));
            System.out.println("Exec time for [tagFilter] no index - " + time);
            // ...
            return null;
        }
    });
}
*https://github.com/gridkit/coherence-search-common/blob/master/src/test/java/org/gridkit/coherence/search/bench/FilterPerformanceMicrobench.java
Monitoring
 Server CPU usage
 Process CPU usage
 Network bandwidth usage
 Coherence threads CPU usage
 Packet Publisher/Speaker/Receiver
 Cache service thread
 Cache service thread pool
 Coherence MBeans
 Cache service task backlog
 TCMP, *Extend IO throughput
etc
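Most of the JVM-side numbers above can be pulled from standard JMX beans. A minimal sketch using only JDK management APIs; Coherence-specific MBeans (task backlog, TCMP throughput) would be read the same way through an MBeanServerConnection, with their own object names:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Sketch of polling process-level metrics from inside a test harness.
public class MonitorSketch {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        System.out.println("cpus: " + os.getAvailableProcessors());
        System.out.println("load avg: " + os.getSystemLoadAverage());

        // Per-thread CPU time lets you attribute load to specific
        // Coherence threads (PacketPublisher, PacketReceiver, workers, ...)
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.isThreadCpuTimeSupported()
                    ? threads.getThreadCpuTime(id) : -1;
            ThreadInfo info = threads.getThreadInfo(id);
            if (cpuNanos > 0 && info != null) {
                System.out.println(info.getThreadName()
                        + " cpu ms: " + cpuNanos / 1_000_000);
            }
        }
    }
}
```

In a cluster test the same code would run on each node via the cloud's exec(), pushing samples back to the master for aggregation.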
Flavors of testing
 Distributed micro benchmarks
 Performance regression tests
 Bottleneck analysis
 Performance sign off
Flavors of testing
 Distributed micro benchmarks
• Micro benchmark using real cluster
• Proving ideas
• To be run manually by developer
 Performance regression tests
 Bottleneck analysis
 Performance sign off
Flavors of testing
 Distributed micro benchmarks
 Performance regression tests
• To be run by CI
• Execute several stable test scenarios
• Fixed load scenarios, not for stress testing
• GOAL: track impact of code changes
• GOAL: keep test harness compatible with code base
 Bottleneck analysis
 Performance sign off
Flavors of testing
 Distributed micro benchmarks
 Performance regression tests
 Bottleneck analysis
• Testing through an N-dimensional space of parameters
• Fully autonomous execution of the whole test grid !!!
• Analyzing correlations to pinpoint bottlenecks
• To be performed regularly to prioritize optimization
• Also used to measure/prove effect of optimizations
 Performance sign off
Flavors of testing
 Distributed micro benchmarks
 Performance regression tests
 Bottleneck analysis
 Performance sign off
• Executing performance acceptance tests aligned to release goals
• Activity driven by QA
• Can share infrastructure with dev team owned tests
Flavors of testing
 Distributed micro benchmarks
 Performance regression tests
 Bottleneck analysis
 Performance sign off
Common pitfalls
 Measuring “exception generation” performance
 always validate operation results
 build functional checks into performance tests
 Fixed users Vs. Fixed request rate
 serious problems may go unnoticed
 Ignoring environment health and side load
Common pitfalls
Fixed users Vs. Fixed request rate
Case:
 Operation mean time: 1ms
 Throughput: 5k ops/s
 Test time: 1 minute
 GC pause of 10 seconds in middle of run
Fixed users
 5 threads
 only 5 operations out of 50k
fall out of the time envelope
 99th percentile would be ~1ms
Fixed request rate
 300k operations in total
 250k around 1 ms
 50k between 1ms and 10 s
 99th percentile would be ~9.4 s
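The percentile arithmetic on this slide can be reproduced with a small simulation. The latency distributions below are idealized assumptions (flat 1 ms outside the pause; requests issued during the pause wait uniformly up to 10 s under fixed rate; one stuck in-flight operation per thread under fixed users):

```java
import java.util.Arrays;

// Reproduces the slide's numbers: a 10 s pause in a 60 s run at 5k ops/s.
// Fixed users: only one in-flight op per thread is stuck -> 5 slow samples.
// Fixed rate: every op scheduled during the pause (50k of them) is delayed.
public class PercentileSketch {

    static double p99(double[] latenciesMs) {
        double[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        return sorted[(int) (sorted.length * 0.99)];
    }

    public static void main(String[] args) {
        int ratePerSec = 5_000;
        int pauseSec = 10, runSec = 60;

        // Fixed request rate: 300k ops total; ops issued during the pause
        // queue up and wait for its end, latency spread up to 10 s.
        int total = ratePerSec * runSec;        // 300_000
        int delayed = ratePerSec * pauseSec;    // 50_000
        double[] rate = new double[total];
        for (int i = 0; i < total; i++) {
            rate[i] = (i < delayed)
                    ? 10_000.0 * (i + 1) / delayed  // delayed ops, up to 10 s
                    : 1.0;                          // normal ops, ~1 ms
        }
        System.out.printf("fixed rate p99: %.0f ms%n", p99(rate));

        // Fixed users: 5 threads x 50k ops; only 5 samples hit the pause.
        int threads = 5, perThread = 50_000;
        double[] users = new double[threads * perThread];
        Arrays.fill(users, 1.0);
        for (int t = 0; t < threads; t++) users[t] = 10_000.0;
        System.out.printf("fixed users p99: %.0f ms%n", p99(users));
    }
}
```

The same 10 s pause yields a ~1 ms p99 under fixed users but a ~9.4 s p99 under fixed rate, which is why closed-loop (fixed users) load generators can hide serious problems.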
Links
• Nanocloud
 http://code.google.com/p/gridkit/wiki/NanoCloudTutorial
• ChTest – Coherence oriented wrapper for Nanocloud
 http://code.google.com/p/gridkit/wiki/ChTest
 http://blog.ragozin.info/2013/03/chtest-is-out.html
 https://speakerdeck.com/aragozin/chtest-feature-outline
• GridKit @ GitHub
 https://github.com/gridkit
Thank you
Alexey Ragozin
alexey.ragozin@gmail.com
http://blog.ragozin.info
- my blog about JVM, Coherence and other stuff