Performance modeling and simulation for accumulo applications

Performance Modeling and
Simulation for Accumulo
Applications
Adam Fuchs
CTO, Sqrrl Data, Inc.
October 16, 2017

© 2017 Sqrrl Data, Inc. All rights reserved. 2
Modeling and Simulation with Accumulo
Goals:
Optimize performance in code and configuration
Discover unexpected factors affecting performance and scalability
Reduce risk when extrapolating performance to large customer sizes (10+PB) over long periods of
time (years)
Types used by Sqrrl:
Micro-benchmarking
Drive scoped-down tests with actual code
Iterate with code changes to optimize
Analog Simulator
Track scheduling and side-effects of operations without actually running them
Simulate large clusters and long periods of time quickly on a single machine
Predictive model validation
Predict performance and scalability mathematically (e.g. spreadsheet model)
Full-scope tests instrumented with measurements to validate model

Technique #1: Micro-Benchmarking
Optimize the crap out of operations in tight loops
Isolate components to simplify interpretation of results
Example: RFileSortingOutputFormat
Produce RFiles from randomly ordered Key/Value Pairs
Input
Data
(Split)
Record
Reader
Mapper
Context
(Collector)
In-Memory
Buffer
(~1GB)
Sorted
RFile1
Sorted
RFileN
Sort / Spill
...
Micro-Benchmarking
Candidates

Micro Benchmarking Setup
Process:
1. Benchmark
2. Compare
3. Rewrite
4. Repeat
UsefulTools:
• JMH (Java Microbenchmarking Harness)
• Sampling profiler, like JVisualVM
main() {
startTime = now()
for(i = 0; i < many; i++) {
do.thing()
}
endTime = now()
print(endTime - startTime)
}

Insights GainedThrough Micro-Benchmarking
Serialization/deserialization of Keys can be very expensive
Holding onto objects makes Java sad
Sort boils down to compare and swap
minimize the number of bytes to compare for sorting via hierarchical organization
minimize the cost of swapping
Other attempted optimization: Quicksort vs. merge sort
Merge sort has tighter worst-case bound
Quicksort faster in practice (probably dominated by cache fetches)
Quicker to merge in ideas like this with micro-benchmarking
Next steps:
1. Code with best known algorithms and design patterns
2. Measure again

RFileSortingOutputFormat Sort Buffer Design
Key/Value Buffer
tuple(CQ,Vis)
value length
value
Metadata Buffers
...
tuple(CQ,Vis)
value length
value
...
Row ColFam
Length Offset
Length Offset
Length Offset
Length Offset
Length Offset
Length Offset
...
...
Row ColFam
Length Offset
Length Offset
Length Offset
Length Offset
Length Offset
Length Offset
...
...

Micro-Benchmarking: RFileSortingOutputFormat Performance
10 million random keys sorted and written in 17-20 seconds per thread
Up to ~588,000 key/value pairs per second per thread
On one 20-core node, scaled up to ~7,000,000 key/value pairs per second
generated, sorted, and written to disk

Technique #2: Analog Simulation
Test bigger than you can afford to test!
Example: Optimize Query and Compaction I/O
Explore performance of various key designs and configuration
Optimize Write-Amplification Factor (WAF) due to long-term compactions
Optimize query latency, tied to number of RFiles per tablet at any time
Input Organization and
Maintenance
Output

Document-Distributed IndexTable Design
Document Query
Ingest Process Query Process
Document StoreTable
Forward Index
Inverted Index
Forward Index
Inverted Index
Forward Index
Inverted Index
Forward Index
Inverted Index
Forward Index
Inverted Index

Tablet Files
Compaction Algorithm
File
Size
1. Pick the maximal set of files F that satisfies:
sum(f) / max(f) > Ratio
2. Compact F into a new file f’
Candidate Set F
4.92
1.52
= 3.24 > 3.0 þ

LSMTree
RFile
RFile
RFile
RFile
RFile
RFile
Compact!
Compact!
Time

Simulator Loop
main() { ...
while (queue not
empty) {
event =
queue.pop();
event.process();
}
}
Event {
final long eventTime;
process() {
modState();
genEvents();
...
}
}
PriorityQueue<Event>
(ordered by time)

Simulator Design
Simulate:
Ingest
Triggered compactions
Timed events for:
Bulk load of new files
Compaction start, end
Measure:
Files to tablet counts over time
Total input
Total bytes
Designs to explore:
Key design, partitioning
Compaction strategies
Ingest Event
Generator
Compaction
Simulator
Ingest
Simulator
Tablet State
Model
Measurement
Apparatus
Event Queue

Example Simulator Execution
• Inputs from default configs,
micro-benchmarking, and
other simulations
• Intermediate measurements
give hints about possible
issues
• Simplified optimization
scores focus on small
number of dimensions
• 7 days and 20GB simulated
in under a second
Input Parameters:
Compaction ratio: 4.0
Compaction rate: 2643.0 bytes/ms
Max compacted file size: 1073741824000 bytes
Last input time: 604800000 ms
Num files per drop: 1.1
Bytes per drop: 10485760
File size stddev: 1048576
File drop period: 300000 ms
Results:
Last Input Time: 604800000 ms
End time: 604800000 ms
Time compacting: 35179342 ms
Compaction portion: 0.058166901455026454
Active tablets supported per compaction thread: 17
Max concurrent files: 32
Max files per compaction: 20
Max compaction size: 16211724213 bytes
Max compaction time: 6133834 ms
Final file count: 7
Total Events: 2486
Total Compactions: 469
Total MajC Writes: 92979631364 bytes
Total Input Writes: 21204489777 bytes
Final Size (GB): 19.748220012523234
Write Amplification Factor: 5.384903024870363
Average files over time: 8.666971808862433

Simulator Insights
Different compression codec for small vs. large compactions
Limit maximum file size for compaction
WAF optimized with roughly 1-week time partitioning in our application
Other recent uses of simulators:
Query readahead threadpool improvements
Secondary compaction threadpool
Bulk index creation partitioning optimization

Technique #3: Predictive ModelValidation
Use knowledge of application to build predictive performance model
Avoid “explaining” observations
Minimize model fitting parameters
Instrument whole-stack application with key performance indicators
Run performance tests with enough variety to fit parameter and validate model
Extra credit: include some factors that aren’t model parameters
Validates non-relevance
Prioritize by “most likely relevant”
Analyze the difference between the prediction and the observations
Refine model
Iterate

Predictive ModelValidation: Example
Query: filtered graph search
implemented in Accumulo iterators
Accumulo’s iterators support both
next() and seek()
Time to call next() and seek() is
relatively uniform (big hand-wave here)
Prediction: I/O-bound query runtime
proportional to total next and seek calls
Modeling performance ó accounting
for next() and seek() calls

Predictive ModelValidation: Spreadsheets!
Model Version 7:

Predictive ModelValidation: Insights
Some iterators are seeking/scanning more than optimal
Logic tweak cuts seeks in half
Total of all scans and seeks is way more than expected
Re-ordering filter hierarchy allows for fewer, bigger seeks
Next steps:
1. Improve the code
2. Re-validate with new code

Predictive ModelValidation: Improvements
25X Predicted
Improvement
New Iterator
Model Version 12:

Takeaways
Performance is not good until you prove it!
Don’t expect performance to be optimal the first time
Measure early
Leave time for optimization
Measure often
Micro-benchmarking, simulation, and predictive model validation:
Learn ’em
Use ’em
Love ’em
Ping me if you want help getting organized or getting started with code and
spreadsheets

Performance modeling and simulation for accumulo applications

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Performance modeling and simulation for accumulo applications

Similar to Performance modeling and simulation for accumulo applications (20)

Recently uploaded

Recently uploaded (20)

Performance modeling and simulation for accumulo applications