Preon
Bit Syntax for Java
Preon




  A declarative data binding framework for binary
                   encoded data
Binary Encoded Data?

Any file for which 'file -bi {filename}' does
 not return a MIME type starting with 'text/'



  Non-binary:          Binary:
  text/plain (*.txt)   application/pdf (*.pdf)
  text/html (*.html)   application/octet-stream (*.mp4)
  text/xml (*.xml)     image/png (*.png)
Not only byte stream

Always octets (8-bit values)

             8 bits   8 bits
But also bit stream

Always octets (8-bit values)

            5 bits   5 bits
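
To make the difference between byte-oriented and bit-oriented reading concrete, here is a minimal sketch of pulling 5-bit values out of a byte array. The class name BitReaderSketch and the most-significant-bit-first ordering are assumptions made for illustration; this is not Preon's own BitBuffer.

```java
/** Hypothetical helper: reads unsigned values of arbitrary bit width, MSB first. */
final class BitReaderSketch {
    private final byte[] data;
    private long bitPos = 0; // current position, counted in bits rather than bytes

    BitReaderSketch(byte[] data) {
        this.data = data;
    }

    /** Reads nrBits (1 to 32) bits and returns them as an unsigned value. */
    long readBits(int nrBits) {
        if (nrBits < 1 || nrBits > 32) {
            throw new IllegalArgumentException("nrBits must be between 1 and 32");
        }
        if (bitPos + nrBits > (long) data.length * 8) {
            throw new IndexOutOfBoundsException("not enough bits left");
        }
        long result = 0;
        for (int i = 0; i < nrBits; i++) {
            int currentByte = data[(int) (bitPos >>> 3)] & 0xFF;       // byte holding the bit
            int bit = (currentByte >>> (7 - (int) (bitPos & 7))) & 1;  // pick the bit, MSB first
            result = (result << 1) | bit;
            bitPos++;
        }
        return result;
    }

    public static void main(String[] args) {
        // Bits: 10110 011|01 000000 -> the second 5-bit value straddles the byte boundary
        BitReaderSketch reader =
            new BitReaderSketch(new byte[] {(byte) 0b10110011, (byte) 0b01000000});
        System.out.println(reader.readBits(5)); // prints 22 (binary 10110)
        System.out.println(reader.readBits(5)); // prints 13 (binary 01101)
    }
}
```

The second read is the interesting one: its bits come from two different bytes, which is exactly the bookkeeping that makes hand-written bit stream decoders tedious.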
Decoding TomTom Map Files




Approx. 300 pages of C++ source code,
  just for decoding...
Compressed Data




                  2 kg
                  1021 pages
Challenges
●   Decoding from bit stream not for the faint-hearted
●   Encoding to bit stream not for the faint-hearted
●   Hard to maintain
●   Hard to extend
●   Java doesn't help (Bug 4504839)
●   Decoder and encoder easily go out of sync
●   Documentation and software easily go out of sync
What if....



  ..this could all be
     solved easily?
Preon Ambitions
●   Declaratively map data structure to encoded
    representation
●   Get the decoder/encoder/documentation,
    free of charge
5-Second User Guide

Decoding a BitMap:

File file = new File(...);
Codec<BitMap> codec = Codecs.create(BitMap.class);
BitMap bitmap = Codecs.decode(codec, file);
What just happened?

        Create a Codec for instances of BitMap

File file = new File(...);
Codec<BitMap> codec = Codecs.create(BitMap.class);
BitMap bitmap = Codecs.decode(codec, file);



        ... and use it to decode a BitMap.
Find the specification

The data structure IS the specification:
class BitMap {
  @Bound int width;
  @Bound int height;
  @Bound int nrColors;
  @BoundList(size="nrColors") Color[] colors;
  @BoundList(size="width*height") byte[] pixels;
}

class Color {
  @Bound int red;
  @Bound int green;
  @Bound int blue;
}
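
For comparison, a hand-written decoder for the same layout might look like the sketch below. It assumes 32-bit little-endian ints (the defaults named on the "Convention over Configuration" slide) and one byte per pixel. The class names and the field name colors are hypothetical; this is the kind of code the declarative mapping is meant to replace, not code that Preon generates.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class HandWrittenBitMapDecoder {
    static final class Color {
        int red, green, blue;
    }

    static final class BitMap {
        int width, height, nrColors;
        Color[] colors;
        byte[] pixels;
    }

    /** Decodes the whole structure eagerly, field by field, in declaration order. */
    static BitMap decode(byte[] encoded) {
        ByteBuffer in = ByteBuffer.wrap(encoded).order(ByteOrder.LITTLE_ENDIAN);
        BitMap bitmap = new BitMap();
        bitmap.width = in.getInt();
        bitmap.height = in.getInt();
        bitmap.nrColors = in.getInt();

        // size="nrColors": the palette length depends on a field read earlier
        bitmap.colors = new Color[bitmap.nrColors];
        for (int i = 0; i < bitmap.nrColors; i++) {
            Color color = new Color();
            color.red = in.getInt();
            color.green = in.getInt();
            color.blue = in.getInt();
            bitmap.colors[i] = color;
        }

        // size="width*height": one byte per pixel
        bitmap.pixels = new byte[bitmap.width * bitmap.height];
        in.get(bitmap.pixels);
        return bitmap;
    }
}
```

Every size expression from the annotations reappears here as imperative code, and a matching encoder would have to repeat all of it in reverse.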
Demo
But what is actually happening?
                 Codec is nothing but a
                 facade to a chain of
                 Codecs.

                 Each Codec only
                 understands its own task.

                 Codecs will delegate to
                 other Codecs
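
The delegation idea can be sketched in a few lines of plain Java. The Decoder interface below is a hypothetical stand-in, much simpler than Preon's real Codec; it only shows how a decoder for a list can stay ignorant of how its elements are decoded.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

final class DelegationSketch {
    /** Hypothetical, simplified stand-in for a codec: it can only decode. */
    interface Decoder<T> {
        T decode(ByteBuffer in);
    }

    /** Knows only how to read one int from the current position. */
    static final Decoder<Integer> INT = in -> in.getInt();

    /** Knows only how to repeat: each element is delegated to another decoder. */
    static <T> Decoder<List<T>> listOf(int size, Decoder<T> element) {
        return in -> {
            List<T> result = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                result.add(element.decode(in));
            }
            return result;
        };
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        in.putInt(255).putInt(128).putInt(0);
        in.flip(); // switch the buffer from writing to reading

        Decoder<List<Integer>> rgb = listOf(3, INT);
        System.out.println(rgb.decode(in)); // prints [255, 128, 0]
    }
}
```

Because listOf accepts any Decoder, the same list decoder works for ints, for nested lists, or for whole objects.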
While Decoding
Codecs
Encapsulate everything there is to know about the
  mapping between in-memory representation and
  encoded representation
Demo Generated Documentation
So, now....




      Forget everything I just told you
Preon just works
Annotations
●   @Bound
●   @BoundList
●   @BoundString
●   @BoundNumber
●   @BoundObject
●   @BoundExplicitly
●   @If
●   @LazyLoading
●   @LengthPrefix
●   @TypePrefix
●   @ByteAlign
●   @Slice
●   @Import

Types
●   int, byte, short, long, Integer, Byte, Short, Long, boolean,
    Boolean, String, List<T>, T[], type-safe enums, Object

Expressions
●   attributes: .{name}
●   items: [{number}], or [{expr}]
●   arithmetic: +, -, /, *, ^
●   combinatorial: &&, ||
●   relational: >=, <=, ==, <, >
●   literals: 'foobar', 0x0f, 0b01101
Convention over Configuration
// Default settings for int Codec:
// 32-bits, little endian
@Bound int foo;

// Read an int, but construct the int
// from 5 bits only
@BoundNumber(size="5") int foo;
Expressions (1)
/**
 * An icon of at most 15 * 15 pixels. Each pixel
 * has a color in the range 0-255.
 */
public class Icon {
  @BoundNumber(size="4") int height;
  @BoundNumber(size="4") int width;
  @BoundList(size="height * width") byte[] pixels;
}



              By accepting Strings instead of booleans and
         integers, Preon allows you to pass in expressions.
Expressions (2)
@BoundString(size="{expr}", ...)
@BoundList(size="{expr}", offset="{expr}", ...)
@BoundNumber(size="{expr}", ...)
@Slice("{expr}")
...
Boolean expressions
@Bound
private int mapVersion;

@If("mapVersion >= 700")
@Bound
private MapFlags flags;

//   But also:
//   mapVersion + 1 >= 700
//   mapVersion > 3 && mapVersion < 300
//   etc.
References


                          Backward references only
                          Reference the “outer” object




  addresses[1].street
  outer.driversLicenses[1].state
  driversLicenses[nrDriverLicenses - 1].state
Variable Introductions

    @Bound int[] offsets;
    @BoundList(offset="offsets[index]",...)
      List<Node> nodes;
                                               Introduction
●   Preon will inject List implementation
●   List implementation will load Nodes lazily, on demand
●   Calling nodes.get(3) will cause the List implementation to
     –   Calculate the node's position: offsets[index=3] = 120
     –   Call Codec<Node> to start decoding from that point
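
The lazy-loading behaviour described above can be approximated with java.util.AbstractList. This is a sketch under assumptions: Node here is a hypothetical type holding a single int, the offsets are byte offsets into a ByteBuffer, and nothing is cached. It is not the List implementation that Preon injects.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;
import java.util.List;

final class LazyListSketch {
    /** Hypothetical element type: a node carrying one int payload. */
    static final class Node {
        final int value;

        Node(int value) {
            this.value = value;
        }
    }

    /** Counts decode calls; exists only to make the laziness observable. */
    static int decodeCount = 0;

    /** Returns a read-only list that decodes an element only when it is requested. */
    static List<Node> lazyNodes(ByteBuffer data, int[] offsets) {
        return new AbstractList<Node>() {
            @Override
            public Node get(int index) {
                decodeCount++;
                // Look up where the element starts, then decode from that point.
                // The absolute getInt(int) does not move the buffer's position.
                return new Node(data.getInt(offsets[index]));
            }

            @Override
            public int size() {
                return offsets.length;
            }
        };
    }
}
```

Creating the list decodes nothing; only a call such as nodes.get(3) touches the underlying data, and only at the offset for that one element.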
Inheritance

 Java Classes   Preon Perspective
Codecs class is your friend
static <T> T decode(Codec<T> codec, byte[] buffer)
static <T> T decode(Codec<T> codec, ByteBuffer buffer)
static <T> T decode(Codec<T> codec, File file)

static <T> Codec<T> create(Class<T> type)
static <T> Codec<T> create(Class<T> type, CodecFactory...
  factories)
static <T> Codec<T> create(Class<T> type, CodecDecorator...
  decorators)

static <T> void document(Codec<T> codec, ArticleDocument
  document)
static <T> void document(Codec<T> codec, DocumentType type,
  OutputStream out)
static <T> void document(Codec<T> codec, DocumentType type, File
  file)
If not Preon, then what else?
The declarative approach is not new:
– BSDL (Bitstream Syntax Description Language)
– Flavor (http://flavor.sourceforge.net/)
– XFlavor
– BFlavor (http://multimedialab.elis.ugent.be/bflavor/)

None of them would have worked in our case:
– Overly complicated
– No free/solid implementation
– Assumptions are surreal:
   ● “everything will just fit in memory”

   ● “there will only be a single thread”

   ● “our current feature set is all you ever need”
The Preon Answer
“everything will just fit in memory”
   Preon loads data lazily by default, and when it does, it relies
     on memory-mapped data. (The default BitBuffer implementation
     wraps a ByteBuffer, and hence can also wrap a
     MappedByteBuffer.)
“there will only be a single thread”
   In Preon, every thread can have its own reference to the
      current position in the BitBuffer.
“our current feature set is all you ever need”
   Preon is incredibly open: implement CodecFactory or
     CodecDecorator to create your own codecs based on type
     information or annotations.
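
As background for the memory-mapping point, the sketch below shows the plain java.nio calls for mapping a file and reading from an arbitrary offset without loading the whole file onto the heap. The class and method names are hypothetical, and the slides do not show how Preon sets up its own buffers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedReadSketch {
    /** Maps the file read-only and reads one big-endian int at the given byte offset. */
    static int readIntAt(Path file, int offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The operating system pages data in on demand; nothing is copied up front.
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.getInt(offset); // absolute read, leaves the position untouched
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".bin");
        try {
            Files.write(file, ByteBuffer.allocate(8).putInt(1).putInt(42).array());
            System.out.println(readIntAt(file, 4)); // prints 42
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In plain NIO, ByteBuffer.duplicate() yields a view with its own position over the same content, which is one way to give each thread an independent position; whether Preon does exactly this is not stated in the slides.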
CodecFactory
interface CodecFactory {
<T> Codec<T> create(AnnotatedElement metadata,
                    Class<T> type,
                    ResolverContext context);
}

●   metadata: annotations on the object expecting data
    to be decoded by the Codec
●   type: the type of object to be decoded
●   context: the object for constructing references
●   create returns null if it does not have a way to construct a Codec
Chain of CodecFactories
CodecFactory Usage
●   CodecFactories can be passed to Codecs.create(...)
●   ... which will cause the Codecs class to consider
    these factories when constructing the various
    Codecs.
●   Internally, a big chain of commands
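
The "return null if you cannot help" convention lends itself to a simple lookup loop. The sketch below uses a hypothetical, much-reduced factory interface (SimpleFactory, taking only a type) rather than Preon's CodecFactory, purely to show how a chain of factories can be consulted in order.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

final class FactoryChainSketch {
    /** Hypothetical, simplified factory: returns null when it cannot handle the type. */
    interface SimpleFactory {
        Function<byte[], Object> create(Class<?> type);
    }

    /** Handles Byte only: decodes the first byte of the input. */
    static final SimpleFactory BYTE_FACTORY = type -> {
        if (type != Byte.class) {
            return null;
        }
        return bytes -> Byte.valueOf(bytes[0]);
    };

    /** Handles String only: decodes the input as ASCII text. */
    static final SimpleFactory STRING_FACTORY = type -> {
        if (type != String.class) {
            return null;
        }
        return bytes -> new String(bytes, StandardCharsets.US_ASCII);
    };

    /** Asks each factory in turn; the first non-null answer wins. */
    static Function<byte[], Object> create(Class<?> type, List<SimpleFactory> chain) {
        for (SimpleFactory factory : chain) {
            Function<byte[], Object> decoder = factory.create(type);
            if (decoder != null) {
                return decoder;
            }
        }
        throw new IllegalArgumentException("No factory can handle " + type.getName());
    }
}
```

Adding support for a new type then means adding one more factory to the list, without touching the existing ones.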
Current Status

Preon 1.0 released
(http://preon.flotsam.nl/)

Big Theme for next Release:
    Encoding
Other Future Work

●   Improved debugging
●   Annotation-driven support for other compression
    techniques
●   More hyperlinking in the documentation
●   Better algebraic simplification of expressions
●   “In-place” editing?
Conclusions
●   It can be done!!!
●   Breaking down Codecs into smaller codecs results
    in some interesting properties
●   The ideas in Preon could work in other Java data
    binding frameworks as well
●   Encoding and decoding bitstream compressed data
    is still hard, but Preon truly hides that complexity
●   Preon is interesting enough to deserve more
    support
Preon
  Bit Syntax for Java


http://preon.flotsam.nl/
  wilfred@flotsam.nl
Preon Layers

●   Data binding
●   A fluent interface for generating documents
    (pecia.flotsam.nl)
●   An expression language capable of rendering itself to
    human-readable text (limbo.sourceforge.net)
●   BitBuffer abstractions
Preon License

  GPL + Classpath Exception
  Apache 2.0
  Apache 2.0
  GPL + Classpath Exception

OOPSLA Talk on Preon
