Preon
A declarative data binding framework for binary
encoded data
Binary Encoded Data?
Any file for which 'file -bi {filename}' does
not return a mimetype starting with 'text/'
Non-binary:            Binary:
text/plain (*.txt)     application/pdf (*.pdf)
text/html (*.html)     application/octet-stream (*.mp4)
text/xml (*.xml)       image/png (*.png)
Not only byte stream
Always octets (8-bit values)
[Figure: a stream divided into 8-bit values]
But also bit stream
Always octets (8-bit values)
[Figure: a stream divided into 5-bit values]
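To make the byte stream versus bit stream distinction concrete, here is a minimal sketch of reading fields that do not align with byte boundaries. It is not Preon's BitBuffer; the class name BitReaderSketch and the most-significant-bit-first ordering are assumptions made for illustration.

```java
// Illustrative sketch only; this is not Preon's BitBuffer.
// Assumption: bits are consumed most-significant-bit first.
class BitReaderSketch {
    private final byte[] data;
    private long bitPos = 0; // current position, counted in bits

    BitReaderSketch(byte[] data) {
        this.data = data;
    }

    /** Reads the next nrBits (1..32) as an unsigned value. */
    long readBits(int nrBits) {
        if (nrBits < 1 || nrBits > 32) {
            throw new IllegalArgumentException("nrBits must be in 1..32");
        }
        if (bitPos + nrBits > data.length * 8L) {
            throw new IllegalStateException("not enough bits left");
        }
        long value = 0;
        for (int i = 0; i < nrBits; i++) {
            int byteIndex = (int) (bitPos >>> 3);
            int bitIndex = 7 - (int) (bitPos & 7); // MSB of each byte comes first
            int bit = (data[byteIndex] >> bitIndex) & 1;
            value = (value << 1) | bit;
            bitPos++;
        }
        return value;
    }

    public static void main(String[] args) {
        // Bits: 10101 01011 00000 0 (three 5-bit fields across two bytes)
        byte[] bytes = {(byte) 0b10101010, (byte) 0b11000000};
        BitReaderSketch reader = new BitReaderSketch(bytes);
        System.out.println(reader.readBits(5)); // 21
        System.out.println(reader.readBits(5)); // 11
        System.out.println(reader.readBits(5)); // 0
    }
}
```

The second field straddles the byte boundary, which is exactly the bookkeeping that becomes tedious when written by hand for every field.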
Compressed Data
2 kg
1021 pages
Network Traffic
TomTom Map Files
Approx. 300 pages of C++ source code,
just for decoding...
Why?
Binary 45.71%
Non‐binary 54.29%
Binary vs. non‐binary distribution on a
random directory on my system
Why?
100%
Binary files only
...and on my wife's system
Challenges
● Decoding from bit stream not for the faint‐hearted
● Encoding to bit stream not for the faint‐hearted
● Hard to maintain
● Hard to extend
● Java doesn't help (Bug 4504839)
● Decoder and encoder easily go out of sync
● Documentation and software easily go out of sync
What if....
..this could all be
solved easily?
Preon Ambitions
● Declaratively map data structure to encoding
format
● Get the decoder/encoder/documentation,
free of charge
5-Second User Guide
Decoding a BitMap:
File file = new File(...);
Codec<BitMap> codec = Codecs.create(BitMap.class);
BitMap bitmap = Codecs.decode(codec, file);
What just happened?
File file = new File(...);
// Create a Codec for instances of BitMap...
Codec<BitMap> codec = Codecs.create(BitMap.class);
// ... and use it to decode a BitMap.
BitMap bitmap = Codecs.decode(codec, file);
Find the specification
The data structure IS the specification:
class BitMap {
    @Bound int width;
    @Bound int height;
    @Bound int nrColors;
    @BoundList(size="nrColors") Color[] colors;
    @BoundList(size="width*height") byte[] pixels;
}
class Color {
    @Bound int red;
    @Bound int green;
    @Bound int blue;
}
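For comparison, this is roughly the decoder one would otherwise write by hand for the same layout. It is a sketch, not what Preon generates: it assumes every int is 32 bits little endian (the default int Codec settings mentioned elsewhere in these slides), and the class name HandWrittenBitMapDecoder and the field name colors are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch only; not code produced by Preon.
// Assumption: every int is stored as 32 bits, little endian.
class HandWrittenBitMapDecoder {
    static class Color {
        int red, green, blue;
    }

    static class BitMap {
        int width, height, nrColors;
        Color[] colors;
        byte[] pixels;
    }

    /** Decodes a BitMap; note that this changes the byte order of the given buffer. */
    static BitMap decode(ByteBuffer in) {
        in.order(ByteOrder.LITTLE_ENDIAN);
        BitMap bitmap = new BitMap();
        bitmap.width = in.getInt();
        bitmap.height = in.getInt();
        bitmap.nrColors = in.getInt();
        // Equivalent of size="nrColors"
        bitmap.colors = new Color[bitmap.nrColors];
        for (int i = 0; i < bitmap.nrColors; i++) {
            Color color = new Color();
            color.red = in.getInt();
            color.green = in.getInt();
            color.blue = in.getInt();
            bitmap.colors[i] = color;
        }
        // Equivalent of size="width*height"
        bitmap.pixels = new byte[bitmap.width * bitmap.height];
        in.get(bitmap.pixels);
        return bitmap;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(28).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(2).putInt(2).putInt(1);     // width, height, nrColors
        buf.putInt(255).putInt(128).putInt(0); // one color
        buf.put(new byte[] {0, 0, 0, 0});      // 2 * 2 pixels
        buf.flip();
        BitMap bitmap = decode(buf);
        System.out.println(bitmap.width + "x" + bitmap.height + ", "
                + bitmap.colors.length + " color(s), " + bitmap.pixels.length + " pixels");
    }
}
```

Every size dependency that the annotations state once has to be repeated here in imperative code, and again in any encoder and in the documentation.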
Demo Decoding
But what is actually happening?
A Codec is nothing but a
facade to a chain of
Codecs.
Each Codec only
understands its own task.
Codecs will delegate to
other Codecs
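The delegation idea can be sketched in a few lines of plain Java. The Codec interface below is a hypothetical, decode-only stand-in and not Preon's actual interface; colors are read as three unsigned bytes purely to keep the example short.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the Codec interface here is a simplified stand-in.
class CodecChainSketch {
    interface Codec<T> {
        T decode(ByteBuffer in);
    }

    static final class Color {
        final int red, green, blue;

        Color(int red, int green, int blue) {
            this.red = red;
            this.green = green;
            this.blue = blue;
        }
    }

    /** Knows only how to read one unsigned byte. */
    static final Codec<Integer> UNSIGNED_BYTE = in -> in.get() & 0xFF;

    /** Knows only that a Color is three numbers; delegates reading each one. */
    static Codec<Color> colorCodec(Codec<Integer> component) {
        return in -> new Color(component.decode(in), component.decode(in), component.decode(in));
    }

    /** Knows only how to repeat another codec a fixed number of times. */
    static <T> Codec<List<T>> listCodec(Codec<T> element, int size) {
        return in -> {
            List<T> result = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                result.add(element.decode(in));
            }
            return result;
        };
    }

    public static void main(String[] args) {
        Codec<List<Color>> palette = listCodec(colorCodec(UNSIGNED_BYTE), 2);
        ByteBuffer in = ByteBuffer.wrap(new byte[] {(byte) 255, 0, 0, 0, (byte) 128, 64});
        List<Color> colors = palette.decode(in);
        System.out.println(colors.get(1).green); // 128
    }
}
```

The caller only sees the outermost codec; the list codec, the color codec and the number codec each handle their own concern.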
Codecs
Encapsulate everything there is to know about the
mapping between in‐memory representation and
encoded representation
Convention over Configuration
// Default settings for int Codec:
// 32 bits, little endian
@Bound int foo;
// Read an int, but construct the int
// from 5 bits only
@BoundNumber(size="5") int foo;
Expressions (1)
/**
 * An icon of at most 15 * 15 pixels. Each pixel
 * has a color in the range 0-255.
 */
public class Icon {
    @BoundNumber(size="4") int height;
    @BoundNumber(size="4") int width;
    @BoundList(size="height * width") byte[] pixels;
}
By accepting Strings instead of booleans and
integers, Preon allows you to pass in expressions.
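As a rough illustration of what the size expression stands for, the sketch below decodes the same layout by hand. It is not Preon code. It assumes the two 4-bit fields share the first byte with height in the high nibble, which is a guess about bit order made only for this example.

```java
import java.util.Arrays;

// Illustrative sketch only; not Preon code.
// Assumption: height occupies the high nibble and width the low nibble of byte 0.
class IconByHand {
    int height;
    int width;
    byte[] pixels;

    static IconByHand decode(byte[] data) {
        IconByHand icon = new IconByHand();
        icon.height = (data[0] >> 4) & 0x0F; // first 4 bits
        icon.width = data[0] & 0x0F;         // next 4 bits
        // The expression "height * width" is evaluated against values decoded earlier.
        int size = icon.height * icon.width;
        if (data.length < 1 + size) {
            throw new IllegalArgumentException("truncated icon");
        }
        icon.pixels = Arrays.copyOfRange(data, 1, 1 + size);
        return icon;
    }

    public static void main(String[] args) {
        byte[] data = {0x23, 1, 2, 3, 4, 5, 6}; // height 2, width 3, six pixels
        IconByHand icon = decode(data);
        System.out.println(icon.height + " x " + icon.width + ", " + icon.pixels.length + " pixels");
    }
}
```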
References
Backward references only
Reference the “outer” object
addresses[1].street
outer.driversLicenses[1].state
driversLicenses[nrDriverLicenses - 1].state
Variable Introductions
@Bound int[] offsets;
@BoundList(offset="offsets[index]",...)
List<Node> nodes;
Introduction
● Preon will inject List implementation
● List implementation will load Nodes lazily, on demand
● Calling nodes.get(3) will cause the List implementation to
– Calculate the node's position: offsets[index=3] = 120
– Call Codec<Node> to start decoding from that point
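The lazy loading behaviour described above could look roughly like this. It is a sketch, not Preon's List implementation: the names LazyList and Node, the use of byte offsets rather than bit offsets, and the one-byte Node layout are all assumptions.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;
import java.util.function.Function;

// Illustrative sketch only; not Preon's List implementation.
// Assumption: offsets are byte positions into the buffer.
class LazyListSketch {
    /** Hypothetical element type: a node that stores a single byte. */
    static final class Node {
        final int value;

        Node(int value) {
            this.value = value;
        }
    }

    /** Decodes element i only when get(i) is called. */
    static final class LazyList<T> extends AbstractList<T> {
        private final ByteBuffer data;
        private final int[] offsets;
        private final Function<ByteBuffer, T> decoder;
        int decodeCount = 0; // exposed so the laziness is observable

        LazyList(ByteBuffer data, int[] offsets, Function<ByteBuffer, T> decoder) {
            this.data = data;
            this.offsets = offsets;
            this.decoder = decoder;
        }

        @Override
        public T get(int index) {
            ByteBuffer view = data.duplicate(); // own position, shared content
            view.position(offsets[index]);      // the offsets[index] expression
            decodeCount++;
            return decoder.apply(view);
        }

        @Override
        public int size() {
            return offsets.length;
        }
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50});
        int[] offsets = {4, 0, 2};
        LazyList<Node> nodes = new LazyList<>(data, offsets, in -> new Node(in.get()));
        System.out.println(nodes.get(0).value); // offsets[0] = 4, so this prints 50
        System.out.println(nodes.decodeCount);  // 1
    }
}
```

Nothing is decoded when the list is created; only the element that is asked for gets read.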
Preon Layers
● Data binding
● A fluent interface for generating documents. (pecia.sourceforge.net)
● An expression language capable of rendering itself to human readable text. (limbo.sourceforge.net)
● BitBuffer abstractions
If not Preon, then what else?
Declarative approach is not new:
– BSDL (Bitstream Syntax Description Language)
– Flavor (http://flavor.sourceforge.net/)
– XFlavor
– BFlavor (http://multimedialab.elis.ugent.be/bflavor/)
None of them would have worked in our case:
– Overly complicated
– No solid implementation
– Assumptions are surreal:
● “everything will just fit in memory”
● “there will only be a single thread”
● “our current feature set is all you ever need”
The Preon Answer
“everything will just fit in memory”
Preon is capable of using memory‐mapped data. (Default
BitBuffer implementation wraps around ByteBuffer, and
hence also MappedByteBuffer.)
“there will only be a single thread”
In Preon, every thread can have its own reference to the
current position in the BitBuffer.
“our current feature set is all you ever need”
Preon is incredibly open: implement CodecFactory or
CodecDecorator to create your own codecs based on type
information or annotations.
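The memory-mapping and per-thread position points above rest on standard java.nio behaviour, which the following sketch demonstrates without involving Preon at all. The class name and the temporary file are made up for the example; the two views stand in for what two threads could each hold.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Illustrative sketch only; shows plain java.nio, not Preon's BitBuffer.
class MappedViewsSketch {
    static int[] demo() {
        try {
            Path file = Files.createTempFile("preon-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {1, 2, 3, 4});
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // The file content is mapped, not copied onto the Java heap.
                ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                // Each duplicate shares the content but has its own position,
                // so every thread could be handed its own view.
                ByteBuffer first = mapped.duplicate();
                ByteBuffer second = mapped.duplicate();
                second.position(2);
                return new int[] {first.get(), second.get(), first.get()};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [1, 3, 2]
    }
}
```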
CodecFactory
interface CodecFactory {
    /**
     * Returns null if it does not have a way to construct a Codec.
     *
     * @param metadata annotations on the object expecting data
     *                 to be decoded by the Codec
     * @param type     the type of object to be decoded
     * @param context  the object for constructing references
     */
    <T> Codec<T> create(AnnotatedElement metadata,
                        Class<T> type,
                        ResolverContext context);
}
CodecFactory Usage
● CodecFactories can be passed to Codecs.create(...)
● ... which will cause the Codecs class to consider
these factories when constructing the various
Codecs.
● Internally, a big chain of commands
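The "return null if you cannot help" contract lends itself to a simple chain, sketched below. The interfaces are deliberately simplified stand-ins: the create method here takes only the type, whereas the signature shown on the CodecFactory slide also takes metadata and a ResolverContext. All class names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; simplified stand-ins, not Preon's interfaces.
class FactoryChainSketch {
    interface Codec<T> {
        T decode(ByteBuffer in);
    }

    /** Simplified: takes only the type to decode. */
    interface CodecFactory {
        <T> Codec<T> create(Class<T> type);
    }

    static final class ByteFactory implements CodecFactory {
        @Override
        @SuppressWarnings("unchecked")
        public <T> Codec<T> create(Class<T> type) {
            if (!Byte.class.equals(type)) {
                return null; // not our type, let the next factory try
            }
            Codec<Byte> codec = in -> in.get();
            return (Codec<T>) codec;
        }
    }

    static final class IntFactory implements CodecFactory {
        @Override
        @SuppressWarnings("unchecked")
        public <T> Codec<T> create(Class<T> type) {
            if (!Integer.class.equals(type)) {
                return null;
            }
            Codec<Integer> codec = in -> in.getInt();
            return (Codec<T>) codec;
        }
    }

    /** Asks each factory in turn; the first non-null answer wins. */
    static <T> Codec<T> create(Class<T> type, List<CodecFactory> factories) {
        for (CodecFactory factory : factories) {
            Codec<T> codec = factory.create(type);
            if (codec != null) {
                return codec;
            }
        }
        throw new IllegalArgumentException("No codec for " + type.getName());
    }

    public static void main(String[] args) {
        List<CodecFactory> factories = Arrays.asList(new ByteFactory(), new IntFactory());
        Codec<Integer> codec = create(Integer.class, factories);
        // ByteBuffer is big endian by default, so these four bytes decode to 42.
        System.out.println(codec.decode(ByteBuffer.wrap(new byte[] {0, 0, 0, 42})));
    }
}
```

Adding support for a new type then means adding one more factory to the list, without touching the existing ones.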
Current Status
● http://preon.sourceforge.net/
● Interfaces fairly stable
● Current version 1.0-SNAPSHOT
● Complete Java class example before first release
candidate
● Bugs...
● I want you
Future work
● The encode operation (after 1.0)
● Better debugging
● Annotation driven support for other compression
techniques
● More hyperlinking in the documentation
● Better algebraic simplification of expressions
● Descriptions from JavaDoc
If you only remember a couple of things...
● XML is just a lame way of binary encoding
● Preon cuts out the Infoset model, preventing
unnecessary transformations
● Preon makes binary encoding easy
● Preon is extensible
● Preon (probably) scales quite well
● Preon is friendly to Java developers