CBOR
The Better JSON
www.hazelcast.com@noctarius2k
Who’s that dude?
• Chris Engelbert
• Manager of Developer Relations @Hazelcast
• Java-Passionate (10+ years)
• Performance
• Garbage Collection
• JVM / Benchmark Fairytales
Brief look at the history
In the beginning there was darkness…
…and binary data
Debugging binary data is hard
Binary data is just a big blob
Possible with custom tooling
Often:
• unique to this specific use case
• requires the model classes to be known
• relies on workarounds
• generally feels weird :-)
“I need to understand it!”
Human Readability, Expressiveness
<fun>
  <with>
    <xml dont:you="think" />
  </with>
</fun>
This is not a flame graph!
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>${maven.failsafe.plugin.version}</version>
  <configuration combine.self="override">
    <redirectTestOutputToFile>true</redirectTestOutputToFile>
    <argLine>
      ${jacoco.agent.argLine}
      -Xms129m -Xmx1G -XX:MaxPermSize=129M
      -Dxyz.feature1.enabled=false
      -Dxyz.feature2.enabled=false
      -Dxyz.feature3.type=none
      -Dxyz.feature4=true
    </argLine>
    <includes>
      <include>**/TestCast*.java</include>
      <include>**/IntegrationTestCase*.java</include>
    </includes>
    <groups>
      com.xyz.test.annotation.QuickTest,
      com.xyz.test.annotation.SlowTest,
      com.xyz.test.annotation.NightlyTest
    </groups>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
      </goals>
    </execution>
  </executions>
</plugin>
Too verbose!
Too understandable?!
Just gross?
Grossbusters!
There is no Dana, only Zuul!
JSON
Human Readability, Concise
base64 encoded image
Does not play “well” with binary data :-(
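To put a rough number on the binary-data problem, here is a minimal Python sketch. It assumes a hypothetical 3000-byte random payload standing in for an image (the size is made up for illustration). JSON has no byte-string type, so the bytes have to travel as base64 text, which turns every 3 bytes into 4 characters.

```python
import base64
import json
import os

# Hypothetical payload standing in for image bytes (not taken from the talk).
payload = os.urandom(3000)

# JSON cannot carry raw bytes, so they are wrapped as base64 text.
encoded = base64.b64encode(payload).decode("ascii")
document = json.dumps({"image": encoded})

print(len(payload), "raw bytes")
print(len(encoded), "base64 characters")
print(len(document), "characters of JSON")
```

The payload grows by about a third before any JSON syntax is added, and the reader has to decode the base64 text again to get the bytes back.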
But let’s be honest…
…do we need to store data in human-readable form?
Binary but understandable!
So what’s the next evolution step?
C(oncise) B(inary) O(bject) R(epresentation)
Features
• Stores JSON-like content
• Extremely concise
• Binary
• Streamable
• Type safe
• Schemaless
• Highly Extensible
• Well defined specification
• Standard (by the IETF, RFC 7049)
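To make "concise" and "type safe" a bit more tangible, here is a minimal encoder sketch in Python. It is an illustration written against the encoding rules of RFC 7049, not the API of any library listed in this deck; the names `_head` and `encode` are made up, and tags, half/single precision floats and indefinite-length items are left out. Every item starts with a byte holding a 3-bit major type and a 5-bit argument, with larger arguments following in 1, 2, 4 or 8 bytes.

```python
import struct


def _head(major, value):
    """Encode the initial byte(s): 3-bit major type plus an unsigned argument."""
    if value < 24:
        return bytes([major << 5 | value])           # argument fits in the initial byte
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if value < 1 << (8 * struct.calcsize(fmt)):
            return bytes([major << 5 | info]) + struct.pack(fmt, value)
    raise ValueError("argument does not fit into 64 bits")


def encode(item):
    """Encode a small subset of Python values as CBOR (illustrative sketch only)."""
    if item is False:
        return b"\xf4"
    if item is True:
        return b"\xf5"
    if item is None:
        return b"\xf6"
    if isinstance(item, int):                         # major 0 (unsigned) or 1 (negative)
        return _head(0, item) if item >= 0 else _head(1, -1 - item)
    if isinstance(item, float):                       # always written as a 64-bit double here
        return b"\xfb" + struct.pack(">d", item)
    if isinstance(item, bytes):                       # major 2: byte string
        return _head(2, len(item)) + item
    if isinstance(item, str):                         # major 3: UTF-8 text string
        raw = item.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(item, (list, tuple)):               # major 4: array
        return _head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):                        # major 5: map, keys of any type
        return _head(5, len(item)) + b"".join(
            encode(k) + encode(v) for k, v in item.items())
    raise TypeError(f"unsupported type: {type(item).__name__}")


print(encode({"a": 1, "b": [2, 3]}).hex())  # a26161016162820203
```

The printed bytes match the corresponding example in Appendix A of RFC 7049. Because the type sits in the first byte of every item, a reader needs no schema to tell an integer from a string.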
Real World (Size) Example
www.pathway-game.com
• Stores Geometry Information
• Mainly Float Data
• Often floatToIntBits for Speed
• Lots of Sequences
• Only a few Dictionaries
Really simple data structure
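The floatToIntBits bullet can be illustrated with a small Python sketch in the spirit of Java's `Float.floatToIntBits`. The helper names are made up for this example, and the sample values are assumptions rather than data from the game: the four bytes of a single-precision float are reinterpreted as a signed 32-bit integer, so the value can be stored as a plain integer item.

```python
import struct


def float_to_int_bits(value):
    """Reinterpret the IEEE 754 single-precision bits of value as a signed 32-bit int."""
    return struct.unpack(">i", struct.pack(">f", value))[0]


def int_bits_to_float(bits):
    """Inverse operation: turn the signed 32-bit pattern back into a float."""
    return struct.unpack(">f", struct.pack(">i", bits))[0]


print(float_to_int_bits(1.0))    # 1065353216
print(float_to_int_bits(-11.0))  # -1053818880
print(int_bits_to_float(-1053818880))  # -11.0
```

A float with the sign bit set becomes a negative integer, which is why such data shows up as a mix of unsigned and negative integer items when dumped.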
Serialized size per format (bar chart):
Format     Size (bytes)   Relative
MSGPACK    1,408,737      1.0x
CBOR       1,415,215      1.0x
BSON       5,575,783      3.9x
JSON       5,755,188      4.0x
XML        8,592,541      6.0x
Debugging capability built-in
Dictionary [
  ByteString{ ANI0 }=Sequence [
    Sequence [
      UInt{ 7264 },
      UInt{ 4 },
      UInt{ 2724 },
      UInt{ 37 },
      Sequence [
        NInt{ -1053818880 },
        UInt{ 1077214225 },
        UInt{ 0 },
        UInt{ 0 },
        UInt{ 1068825617 },
      ]
    ]
  ]
]
Since items carry their type information, writing a quick debugger
or using the to-string capabilities provided by libraries is easy.
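As a rough idea of what such a quick debugger could look like, here is a minimal Python sketch. It is not the tool that produced the dump above: the function names are made up, the labels merely imitate the slide, keys and values of a dictionary are printed on separate lines, and only integers, strings, sequences and dictionaries with definite lengths are handled.

```python
def read_head(data, pos):
    """Return (major type, argument, position after the head) for the item at pos."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:                      # argument stored in the initial byte
        return major, info, pos
    if info > 27:
        raise ValueError("indefinite lengths are not handled in this sketch")
    size = 1 << (info - 24)            # 1, 2, 4 or 8 argument bytes follow
    return major, int.from_bytes(data[pos:pos + size], "big"), pos + size


def dump(data, pos=0, depth=0, out=None):
    """Append one line per item to out; return (out, position after the item)."""
    out = [] if out is None else out
    pad = "  " * depth
    major, arg, pos = read_head(data, pos)
    if major == 0:
        out.append(f"{pad}UInt{{ {arg} }}")
    elif major == 1:
        out.append(f"{pad}NInt{{ {-1 - arg} }}")
    elif major in (2, 3):
        name = "ByteString" if major == 2 else "TextString"
        text = data[pos:pos + arg].decode("utf-8", "replace")
        out.append(f"{pad}{name}{{ {text} }}")
        pos += arg
    elif major in (4, 5):
        # A dictionary with n entries holds 2 * n nested items (key, value, ...).
        name, count = ("Sequence", arg) if major == 4 else ("Dictionary", 2 * arg)
        out.append(f"{pad}{name} [")
        for _ in range(count):
            _, pos = dump(data, pos, depth + 1, out)
        out.append(f"{pad}]")
    else:
        raise ValueError(f"major type {major} is not handled in this sketch")
    return out, pos


# Hand-built sample: a dictionary mapping the byte string "ANI0" to [[7264, 4]].
lines, end = dump(bytes.fromhex("a144414e49308182191c6004"))
print("\n".join(lines))
```

No model classes or schema are involved; the dump is driven purely by the type information in each item's first byte.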
Implementations
• JavaScript (browser, node.js)
• Julia
• C#
• Java
• Lua
• PHP
• Python
• Go
• C
• C++
• Rust
• Perl
• Ruby
• D
• Swift
• Erlang
• Elixir
CBOR-java: https://github.com/peteroupc/CBOR-Java
Jackson Dataformat: https://github.com/FasterXML/jackson-dataformat-cbor
cbor-java: https://github.com/c-rack/cbor-java
JACOB: https://github.com/jawi/jacob
borabora: https://github.com/noctarius/borabora
borabora
• Skip-scan parser
• Simple, fluent API
• Supports object queries
• Reading + writing
• Supports mutating (hacky prototype)
• Extended datatype support
Not just a blur of color
but the island ;-)
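The general idea behind a skip-scan parser can be sketched in a few lines of Python. This is a generic illustration with made-up function names, not borabora's actual API or implementation, and it ignores indefinite-length items: because every item announces its type and length up front, a reader can jump over items it does not need without decoding their payloads.

```python
def head(data, pos):
    """Return (major type, argument, position after the head) for the item at pos."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("indefinite lengths are not handled in this sketch")
    size = 1 << (info - 24)            # 1, 2, 4 or 8 argument bytes follow
    return major, int.from_bytes(data[pos:pos + size], "big"), pos + size


def skip(data, pos):
    """Return the offset just past the item at pos without decoding its payload."""
    major, arg, pos = head(data, pos)
    if major in (2, 3):                # strings: jump over the payload bytes
        return pos + arg
    if major == 4:                     # array: arg nested items
        nested = arg
    elif major == 5:                   # map: arg key/value pairs
        nested = 2 * arg
    elif major == 6:                   # tag: wraps exactly one item
        nested = 1
    else:                              # ints, simple values, floats: head is the whole item
        nested = 0
    for _ in range(nested):
        pos = skip(data, pos)
    return pos


def element_offset(data, pos, index):
    """Return the byte offset of element `index` of the array starting at pos."""
    major, count, pos = head(data, pos)
    if major != 4 or index >= count:
        raise IndexError("not an array or index out of range")
    for _ in range(index):
        pos = skip(data, pos)
    return pos


# Sample bytes for [[1, 2, 3], "IETF", 1000]
data = bytes.fromhex("838301020364494554461903e8")
print(element_offset(data, 0, 2))  # 10
```

Only the heads of the skipped items are read; the text of "IETF" is never decoded on the way to the third element.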
Time for some demos, isn’t it?
Comparisons
“Alternative Facts”
in the comparisons,
are all fake news!
MessagePack
• Extremely concise
• Available in almost every language
• UTF-8 and binary strings
• No map keys datatype restriction
• Schemaless
• Binary
• 32bit size limitation for certain data types
• Specification is concise too (~450 lines)
• No official standard
Scoreboard: 1:0, 2:0, Draw, Draw, Draw, Draw, 2:1, 2:2, 2:3
JSON
• Human readable
• Official standard (ECMA-404)
• Subset of JavaScript (strong support)
• Schemaless
• Kind of type safe
• Human readable
• Text-based
• Map keys can be strings only
• Large bytesize
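The "map keys can be strings only" point is easy to reproduce with Python's standard `json` module. The sample dictionary is made up for illustration: integer keys are silently turned into strings on the way out, so the round trip does not give back the original structure.

```python
import json

# Hypothetical data with integer keys.
original = {1: "one", 2: "two"}

# JSON object keys must be strings, so the encoder converts them.
restored = json.loads(json.dumps(original))

print(restored)              # {'1': 'one', '2': 'two'}
print(restored == original)  # False
```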
Scoreboard: 1:0, Draw, 2:0, Draw, Draw, 2:1, 2:2, 2:3, 2:4
BSON
• Binary
• Available in a lot of languages
• Derived from JSON, interoperability 1+
• Schemaless
• Type safe
• Supports in-place updates
• No official standard
• Extremely concise specification
• Map keys can be strings only
• Large bytesize
Scoreboard: Draw, 1:0, 2:0, Draw, Draw, 3:0, 3:1, 3:2, 3:3, 3:4
XML
• Human readable
• Official standard (W3C REC-xml)
• Strong commercial support (e.g. IBM)
• Supported everywhere
• Type safe or schemaless
• Human readable
• Text-based
• Very verbose
• Large bytesize
Scoreboard: 1:0, Draw, 2:0, 3:0, 3:1, 3:2, 3:3, 3:4, 3:5
More alternatives
Text-based:
• bencode: ASCII-based encoding, unaffected by endianness, type safe
• YAML: commonly used for configuration, superset of JSON, type safe
• …
Binary:
• Amazon ION: skip-scan parsable, type safe, commercial support (Amazon)
• Smile: based on JSON, property-name back-references
• Apache Thrift: DSL, schema-bound, developed by Facebook, RPC services
• Apache Avro: DSL, schema-bound, primarily used by Hadoop
• Protobuf: DSL, schema-bound, developed by Google, RPC services
• …
More information:
• http://cbor.io/
• http://bsonspec.org/
• http://www.json.org/
• https://www.w3.org/TR/REC-xml/
• http://msgpack.org/
• https://github.com/noctarius/borabora
• https://en.wikipedia.org/wiki/Bencode
• https://en.wikipedia.org/wiki/YAML
• https://amznlabs.github.io/ion-docs/
• https://en.wikipedia.org/wiki/Apache_Thrift
• https://en.wikipedia.org/wiki/Apache_Avro
• https://en.wikipedia.org/wiki/Protocol_Buffers
• https://en.wikipedia.org/wiki/Smile_(data_interchange_format)
Thank You!
Any Questions?
@noctarius2k
http://www.sourceprojects.org
http://github.com/noctarius
@hazelcast
http://www.hazelcast.com
http://www.hazelcast.org
http://github.com/hazelcast
