1. Juggling with Bits and Bytes
How Apache Flink operates on binary data
Fabian Hueske
fhueske@apache.org @fhueske
2. Big Data frameworks on JVMs
• Many (open source) Big Data frameworks run on JVMs
– Hadoop, Drill, Spark, Hive, Pig, and ...
– Flink as well
• Common challenge: How to organize data in-memory?
– In-memory processing (sorting, joining, aggregating)
– In-memory caching of intermediate results
• Memory management of a system influences
– Reliability
– Resource efficiency, performance & performance predictability
– Ease of configuration
3. The straightforward approach
Store and process data as objects on the heap
• Put objects in an Array and sort it
A few notable drawbacks
• Predicting memory consumption is hard
– If you fail, an OutOfMemoryError will kill you!
• High garbage collection overhead
– Easily 50% of time spent on GC
• Objects have space overhead
– At least 8 bytes for each (nested) object! (depends on architecture)
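The straightforward approach can be sketched in a few lines. `Record` and `HeapSort` below are hypothetical stand-ins for a Tuple2-style type, not Flink code: every element is a separate heap object, which is exactly where the per-object header overhead and the GC pressure come from.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Minimal sketch of the objects-on-heap approach: records live as
// individual objects and are sorted with Java's collection sort.
// Each record is a separate heap object, so each one carries object-header
// overhead and contributes to GC pressure.
public class HeapSort {
    // Hypothetical record type standing in for a Tuple2<Integer, String>.
    static final class Record {
        final int key;
        final String value;
        Record(int key, String value) { this.key = key; this.value = value; }
    }

    public static List<Record> sortByKey(List<Record> input) {
        List<Record> list = new ArrayList<>(input); // pre-sized copy
        list.sort(Comparator.comparingInt(r -> r.key));
        return list;
    }

    public static void main(String[] args) {
        List<Record> data = new ArrayList<>();
        data.add(new Record(3, "c"));
        data.add(new Record(1, "a"));
        data.add(new Record(2, "b"));
        List<Record> sorted = sortByKey(data);
        System.out.println(sorted.get(0).key + "," + sorted.get(1).key + "," + sorted.get(2).key);
        // prints 1,2,3
    }
}
```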
5. Flink adopts DBMS technology
• Allocates fixed number of memory segments upfront
• Data objects are serialized into memory segments
• DBMS-style algorithms work on binary representation
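As a rough illustration of the idea (not Flink's actual API), a record can be serialized into a byte array "segment" with a fixed layout, so that a key field sits at a known offset and can be read without deserializing the whole record:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch (not Flink's actual API): serialize a data object into a
// byte-array segment with a known layout, then read the key field
// directly from the binary representation.
public class SegmentWriteRead {
    public static byte[] serialize(int key, String value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(key);   // 4 bytes, big-endian, at fixed offset 0
        out.writeUTF(value); // length-prefixed string starting at offset 4
        return bos.toByteArray();
    }

    public static int readKey(byte[] segment) {
        // The key is read straight from the binary layout without
        // deserializing the record into an object.
        return ((segment[0] & 0xFF) << 24) | ((segment[1] & 0xFF) << 16)
             | ((segment[2] & 0xFF) << 8)  |  (segment[3] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] seg = serialize(42, "flink");
        System.out.println(readKey(seg)); // prints 42
    }
}
```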
6. Why is that good?
• Memory-safe execution
– Used and available memory segments are easy to count
• Efficient out-of-core algorithms
– Memory segments can be efficiently written to disk
• Reduced GC pressure
– Memory segments are never deallocated
– Data objects are short-lived or reused
• Space-efficient data representation
• Efficient operations on binary data
7. What does it cost?
• Significant implementation investment
– Using java.util.HashMap
vs.
– Implementing a spillable hash table backed by byte arrays
and a custom serialization stack
• Other systems use similar techniques
– Apache Drill, Apache Ignite, Apache Geode
• Apache Spark plans to evolve in a similar direction
9. Memory segments
• Unit of memory distribution in Flink
– Fixed number allocated when worker starts
• Backed by a regular byte array (default 32KB)
• R/W access through Java’s efficient sun.misc.Unsafe methods
• Multiple memory segments can be concatenated into
a larger chunk of memory
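A simplified sketch of the segment abstraction: a fixed-size chunk of memory backed by a plain byte array, with typed access at absolute offsets. `MiniSegment` is a hypothetical name; Flink's real MemorySegment uses sun.misc.Unsafe for faster access, while this sketch uses portable array code.

```java
// Simplified sketch of a memory segment: a fixed-size chunk of memory
// backed by a plain byte array. Flink's real MemorySegment uses
// sun.misc.Unsafe for fast access; this uses portable array operations.
public class MiniSegment {
    private final byte[] memory;

    public MiniSegment(int size) { this.memory = new byte[size]; }

    public int size() { return memory.length; }

    // Big-endian int access at an absolute offset within the segment.
    public void putInt(int offset, int value) {
        memory[offset]     = (byte) (value >>> 24);
        memory[offset + 1] = (byte) (value >>> 16);
        memory[offset + 2] = (byte) (value >>> 8);
        memory[offset + 3] = (byte) value;
    }

    public int getInt(int offset) {
        return ((memory[offset] & 0xFF) << 24)
             | ((memory[offset + 1] & 0xFF) << 16)
             | ((memory[offset + 2] & 0xFF) << 8)
             |  (memory[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        MiniSegment seg = new MiniSegment(32 * 1024); // default segment size: 32 KB
        seg.putInt(0, 123456789);
        System.out.println(seg.getInt(0)); // prints 123456789
    }
}
```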
12. Custom de/serialization stack
• Many alternatives for Java object serialization
– Kryo, Apache Avro, Apache Thrift, Protobufs, …
• But Flink has its own serialization stack
– Operating on serialized data requires knowledge of layout
– Control over layout can improve efficiency of operations
– Data types are known before execution
13. Rich & extensible type system
• Serialization framework requires knowledge of types
• Flink analyzes return types of functions
– Java: Reflection based type analyzer
– Scala: Compiler information
• Rich type system
– Atomics: Primitives, Writables, Generic types, …
– Composites: Tuples, Pojos, CaseClasses
– Extensible by custom types
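A toy sketch of what reflection-based type analysis looks like: inspect a POJO's fields to learn its layout before execution, the way a serialization framework must. Flink's actual TypeExtractor is far more elaborate (generics, tuples, type hierarchies); this only lists declared field names and types, and `Person` is a made-up example class.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Toy sketch of reflection-based type analysis: enumerate a POJO's
// declared fields so a serializer can be chosen per field type.
public class TypeAnalyzer {
    // Hypothetical POJO used only for illustration.
    public static class Person {
        public int age;
        public String name;
    }

    public static List<String> analyze(Class<?> clazz) {
        List<String> fieldTypes = new ArrayList<>();
        for (Field f : clazz.getDeclaredFields()) {
            fieldTypes.add(f.getName() + ":" + f.getType().getSimpleName());
        }
        return fieldTypes;
    }

    public static void main(String[] args) {
        System.out.println(analyze(Person.class));
        // e.g. [age:int, name:String]
    }
}
```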
14. Serializers & comparators
• All types have dedicated de/serializers
– Primitives are natively serialized
– Writables use their own serialization functions
– Generic types use Kryo
– …
• Serialization automatically goes through Java’s Unsafe methods
• Comparators compare and hash objects
– On binary representation if possible
• Composite serializers and comparators delegate to
serializers and comparators of member types
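The delegation pattern for composite types can be sketched as follows: a Tuple2-style serializer that defers field by field to an int serializer and a String serializer. The `Serializer` interface and class names here are illustrative, not Flink's actual classes, though Flink's TupleSerializer follows the same idea.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of a composite serializer delegating to member-type serializers.
public class CompositeSerializer {
    interface Serializer<T> {
        void write(T value, DataOutput out) throws IOException;
        T read(DataInput in) throws IOException;
    }

    static final Serializer<Integer> INT = new Serializer<Integer>() {
        public void write(Integer v, DataOutput out) throws IOException { out.writeInt(v); }
        public Integer read(DataInput in) throws IOException { return in.readInt(); }
    };

    static final Serializer<String> STRING = new Serializer<String>() {
        public void write(String v, DataOutput out) throws IOException { out.writeUTF(v); }
        public String read(DataInput in) throws IOException { return in.readUTF(); }
    };

    // Composite: serializes the pair by delegating field by field.
    static byte[] writePair(int f0, String f1) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        INT.write(f0, out);
        STRING.write(f1, out);
        return bos.toByteArray();
    }

    static Object[] readPair(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return new Object[] { INT.read(in), STRING.read(in) };
    }

    public static void main(String[] args) throws IOException {
        Object[] pair = readPair(writePair(7, "seven"));
        System.out.println(pair[0] + "/" + pair[1]); // prints 7/seven
    }
}
```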
17. Data Processing Algorithms
• Flink’s algorithms are based on RDBMS technology
– External Merge Sort, Hybrid Hash Join, Sort Merge Join, …
• Algorithms receive a budget of memory segments
• Operate in-memory as long as data fits into budget
– And gracefully spill to disk if data exceeds memory
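The external-merge-sort idea can be condensed into a sketch: sort bounded runs that fit the memory budget, then k-way merge the runs. In Flink the runs would be serialized records spilled to disk; to keep the example short, the runs here stay as in-memory lists and the "budget" is just a run length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Condensed sketch of external merge sort: produce sorted runs bounded
// by a memory budget, then k-way merge them with a priority queue.
public class BudgetedSort {
    public static List<Integer> sort(List<Integer> input, int budget) {
        // Phase 1: produce sorted runs no larger than the budget.
        List<List<Integer>> runs = new ArrayList<>();
        for (int i = 0; i < input.size(); i += budget) {
            List<Integer> run =
                new ArrayList<>(input.subList(i, Math.min(i + budget, input.size())));
            Collections.sort(run);
            runs.add(run); // a real implementation would spill this run to disk
        }
        // Phase 2: k-way merge via a priority queue of (run, position) cursors.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>(Comparator.comparingInt((int[] c) -> runs.get(c[0]).get(c[1])));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) heap.add(new int[] { r, 0 });
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] c = heap.poll();
            out.add(runs.get(c[0]).get(c[1]));
            if (c[1] + 1 < runs.get(c[0]).size()) heap.add(new int[] { c[0], c[1] + 1 });
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(5, 3, 8, 1, 9, 2, 7), 3));
        // prints [1, 2, 3, 5, 7, 8, 9]
    }
}
```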
22. Sort benchmark
• Task: Sort 10 million Tuple2<Integer, String> records
– String length 12 chars
• Tuple has 16 bytes of raw data
• ~152 MB raw data
– Integers uniformly, Strings long-tail distributed
– Sort on Integer field and on String field
• Input provided as mutable object iterator
• Use JVM with 900 MB heap size
– Minimum size to reliably run the benchmark
23. Sorting methods
1. Objects-on-Heap:
– Put cloned data objects in an ArrayList and use Java’s Collections.sort().
– The ArrayList is initialized with the right size.
2. Flink-serialized:
– Using Flink’s custom serializers.
– Integer with full binary sorting key, String with 8 byte prefix key.
3. Kryo-serialized:
– Serialize fields with Kryo.
– No binary sorting keys, objects are deserialized for comparison.
• All implementations use a single thread
• Average execution time of 10 runs reported
• GC triggered between runs (not included in the measured time)
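The 8-byte prefix key used for the String sort in the Flink-serialized variant can be sketched like this: the first eight bytes of each string are packed into a long, so most comparisons become a single unsigned long comparison on the binary key, falling back to a full string comparison only when the prefixes tie. This sketch assumes ASCII strings (it truncates each char to one byte); Flink's actual normalized-key machinery is more general.

```java
// Sketch of an 8-byte prefix sorting key for strings: pack the first
// eight bytes into a long so most comparisons are a single unsigned
// long comparison, with a full compare only on prefix ties.
public class PrefixKey {
    static long prefixKey(String s) {
        long key = 0L;
        for (int i = 0; i < 8; i++) {
            // Truncate each char to one byte (ASCII assumption);
            // pad strings shorter than 8 chars with zero bytes.
            int b = i < s.length() ? (s.charAt(i) & 0xFF) : 0;
            key = (key << 8) | b;
        }
        return key;
    }

    static int compare(String a, String b) {
        int c = Long.compareUnsigned(prefixKey(a), prefixKey(b));
        return c != 0 ? c : a.compareTo(b); // full compare only on prefix tie
    }

    public static void main(String[] args) {
        System.out.println(compare("apple", "banana") < 0);   // prints true
        System.out.println(compare("abcdefgh1", "abcdefgh0")); // prints 1 (prefixes tie, full compare decides)
    }
}
```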
28. We’re not done yet!
• Move memory segments to off-heap memory
– Smaller JVM, lower GC pressure, easier configuration
• Table API provides full semantics for execution
– Use code generation to operate fully on binary data
• Serialization layouts tailored towards operations
– More efficient operations on binary data
• …
29. Summary
• Active memory management avoids OutOfMemoryErrors.
• Highly efficient data serialization stack
– Facilitates operations on binary data
– Makes more data fit into memory
• DBMS-style operators operate on binary data
– High performance in-memory processing
– Graceful destaging to disk if necessary
• Read the full story:
http://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html