3. Problem space
• Writing a buffer control that needs to persist struct
data to disk
• Struct data is simple (will not change in the near
future)
• Program needs
• Low memory footprint
• Low CPU usage
4. A bunch of options
• encoding/gob (based on encoding/binary)
• gogoprotobuf
• capnproto (glycerine/go-capnproto)
• ugorji/go/codec
• mgo.v2/bson
• .....
5. Some problems
• Some are overly complex
• Cryptic error messages
• Some are fast, but do not support every data structure (map)
• flatbuffer (a vector could be used instead, but lookup is not
O(1))
• Every library adds some abstraction, which makes debugging hard
• A write to disk fails midway: some bytes are written,
some are not
• Using a library means giving up fine-grained control
• You may want special behaviour for certain fields
• You may want special behaviour when serialization fails
8. Example struct
type A struct {
	Name     string
	BirthDay time.Time
	Phone    string
	Siblings int
	Spouse   bool
	Money    float64
}
9. Struct layout matters
type A struct {
	Name     string
	BirthDay time.Time
	Phone    string
	Siblings int
	Spouse   bool
	Money    float64
}
size + order
10. In general, there are 2 types
• Dynamic layout: just pass the struct, the serializer will do
everything for you
• encoding/gob, encoding/json
• The library has to figure out "what type" first,
then serialize
• Fixed layout: you have to tell the serializer about your
struct first
• protobuf, capnproto, messagepack...
• The library already knows the type and just uses
generated code to serialize
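The difference between the two types can be sketched in a few lines. This is only an illustration: `Person`, `dynamicMarshal` and `fixedMarshal` are hypothetical names, and `fixedMarshal` is hand-written to stand in for what a code generator might emit, not the output of any real tool.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// Person is a hypothetical struct used only for this sketch.
type Person struct {
	Name     string
	Siblings int
}

// dynamicMarshal passes the struct to encoding/json, which discovers the
// field names and types at run time via reflection.
func dynamicMarshal(p Person) ([]byte, error) {
	return json.Marshal(p)
}

// fixedMarshal knows the field order and types at compile time:
// 2-byte length + name bytes, then Siblings as 8 little-endian bytes.
func fixedMarshal(p Person) []byte {
	out := make([]byte, 0, 2+len(p.Name)+8)
	var size [2]byte
	binary.LittleEndian.PutUint16(size[:], uint16(len(p.Name)))
	out = append(out, size[:]...)
	out = append(out, p.Name...)
	var num [8]byte
	binary.LittleEndian.PutUint64(num[:], uint64(p.Siblings))
	return append(out, num[:]...)
}

func main() {
	p := Person{Name: "Ann", Siblings: 2}
	dyn, err := dynamicMarshal(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(dyn))
	fmt.Println(len(fixedMarshal(p)))
}
```

The dynamic version needs no knowledge of `Person`; the fixed version must be kept in sync with the struct by hand or by a generator.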
11. Dynamic layout vs. fixed layout
• Dynamic layout
• Advantages: easy to use, easy support for nested structs..., no additional step
• Disadvantages: harder to optimize, needs reflection (performance penalty)
• Fixed layout
• Advantages: easy to optimize, manageable protocol file (.proto or .flatbuffer)
• Disadvantages: needs code generation
12. What should we use
• What we have
• Fixed protocol
• Need low memory footprint / low CPU usage
• So I decided on a serialization
method which is
• Fixed layout in code
• But without codegen
13. Your struct field order is fixed in code
func MarshalRaw(a *A, buf *bytes.Buffer) {
	encodeString(a.Name, buf)
	encodeUint64(uint64(a.BirthDay.UnixNano()), buf)
	encodeString(a.Phone, buf)
	encodeUint64(uint64(a.Siblings), buf)
	encodeBool(a.Spouse, buf)
	encodeFloat64(a.Money, buf)
}
Field order in the output: Name, Birthday, Phone, ...
15. First try
• Use encoding/binary to convert each value to a byte
array
• Write the byte array to the buffer
• For dynamically sized data (vector, map..)
• Write the size first as an int, then write the payload
• When decoding, read the size first, then read the
payload
16. uint64
func encodeUint64(v uint64, w io.Writer) error {
	b := [64 / 8]byte{}
	binary.LittleEndian.PutUint64(b[:], v)
	_, err := w.Write(b[:])
	return err
}
func decodeUint64(r io.Reader) (uint64, error) {
	var l uint64
	err := binary.Read(r, binary.LittleEndian, &l)
	if err != nil {
		return 0, err
	}
	return l, nil
}
20. Let's benchmark
• Using
• https://github.com/alecthomas/go_serialization_benchmarks
• Add our own serialization method and
compare it with the others
• Call it `raw`
• Let's see the results
23. Slow pattern
• Use GODEBUG=allocfreetrace=1 to find
redundant allocation patterns
func encodeUint64(v uint64, w io.Writer) error {
	// b typically escapes to the heap because b[:] is passed
	// through the io.Writer interface
	b := [64 / 8]byte{}
	binary.LittleEndian.PutUint64(b[:], v)
	_, err := w.Write(b[:])
	return err
}
func encodeString(v string, w io.Writer) error {
	l := len(v)
	err := encodeUint16(uint16(l), w)
	if err != nil {
		return err
	}
	// []byte(v) copies the string into a newly allocated slice
	_, err = w.Write([]byte(v))
	return err
}
// From the Go runtime: used by the []byte(v) conversion above,
// allocates through mallocgc
func rawbyteslice(size int) (b []byte) {
	cap := roundupsize(uintptr(size))
	p := mallocgc(cap, nil, false)
	if cap != uintptr(size) {
		memclrNoHeapPointers(add(p, uintptr(size)), cap-uintptr(size))
	}
	*(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(cap)}
	return
}
24. Slow pattern
• Took a look at some fast serializers
• They just copy bytes around, no allocation
• In our case we write to the file right after
encoding, so we do not need a buffer per serialization,
one global buffer is enough
25. Second try
• Prepare a global buffer
• Grow if needed
• Clear buffer each run
• Just copy bytes around, no more allocations
26. var bufferByte = make([]byte, DEFAULT_BUFFER_CAP)
func (rs Raw2Serializer) Marshal(o interface{}) []byte {
	a := o.(*A)
	cleanBuffer()
	// idx is the next free position in bufferByte
	idx := 0
	idx += WriteString(idx, a.Name)
	idx += WriteUint64(idx, uint64(a.BirthDay.UnixNano()))
	idx += WriteString(idx, a.Phone)
	idx += WriteUint64(idx, uint64(a.Siblings))
	idx += WriteBool(idx, a.Spouse)
	idx += WriteFloat64(idx, a.Money)
	// the fields of a are now copied into bufferByte[0:idx];
	// the returned slice aliases the global buffer
	return bufferByte[0:idx]
}
A small difference: an index is needed to know where to copy next,
and the buffer must be cleaned on each run
27. func WriteUint64(idx int, n uint64) int {
	if (idx + 8) > currentCap {
		growBufferIfneeded()
	}
	// little-endian: byte i of n goes to offset idx+i
	for i := 0; i < 8; i++ {
		bufferByte[idx+i] = byte(n >> (uint(i) * 8))
	}
	return 8
}
func WriteString(idx int, s string) int {
	l := len(s)
	// room is needed for the 8-byte length prefix plus the payload
	if (idx + 8 + l) > currentCap {
		growBufferIfneeded()
	}
	n := WriteUint64(idx, uint64(l))
	// NOTE: copy works without conversion
	copy(bufferByte[idx+n:idx+n+l], s)
	return l + n
}
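A self-contained sketch of the same global-buffer idea, for experimenting outside the benchmark harness. The names `scratch`, `reserve`, `putUint64`, `putString` and `encode` are hypothetical, and the doubling growth policy is an assumption; the slides do not show how the buffer grows. This sketch overwrites from index 0 on every call, so it skips the explicit clear step.

```go
package main

import (
	"fmt"
	"math"
)

// scratch is one reusable buffer shared by every call (hypothetical name).
var scratch = make([]byte, 16)

// reserve makes sure scratch can hold at least n bytes, doubling as needed.
func reserve(n int) {
	if n <= len(scratch) {
		return
	}
	size := len(scratch) * 2
	for size < n {
		size *= 2
	}
	grown := make([]byte, size)
	copy(grown, scratch)
	scratch = grown
}

// putUint64 writes v little-endian at scratch[idx:] and returns bytes written.
func putUint64(idx int, v uint64) int {
	reserve(idx + 8)
	for i := 0; i < 8; i++ {
		scratch[idx+i] = byte(v >> (uint(i) * 8))
	}
	return 8
}

// putString writes an 8-byte length prefix followed by the bytes of s.
func putString(idx int, s string) int {
	n := putUint64(idx, uint64(len(s)))
	reserve(idx + n + len(s))
	copy(scratch[idx+n:], s) // copy accepts a string source directly
	return n + len(s)
}

// encode serializes two fields in a fixed order. The result aliases scratch
// and is only valid until the next call.
func encode(name string, money float64) []byte {
	idx := 0
	idx += putString(idx, name)
	idx += putUint64(idx, math.Float64bits(money))
	return scratch[:idx]
}

func main() {
	out := encode("Alice", 12.5)
	fmt.Println(len(out)) // 8 (length) + 5 (name) + 8 (money)
}
```

Once the buffer has grown to its working size, further calls only copy bytes. The price is that the result must be written out (or copied) before the next call, and that the function is not safe for concurrent use.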
29. What I learned
• Hidden allocations reduce performance
• Pitfalls when serializing to a file
• Need a thread-safe implementation to prevent a corrupted file
• Need versioning (write the version first, then the payload)
for backward compatibility
• Checksums matter
• You can calculate the checksum directly from the struct,
no need to calculate it from the bytes
• Use fnv to hash each field and add the hashes together,
instead of running CRC32 over the whole
byte array
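A minimal sketch of the per-field checksum idea, assuming 64-bit FNV-1a from the standard `hash/fnv` package. The struct `Record` and the helpers `hashString`, `hashUint64` and `checksum` are hypothetical; the slides do not say which FNV variant or width was used.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Record is a hypothetical struct used only for this sketch.
type Record struct {
	Name     string
	Phone    string
	Siblings int
}

// hashString returns the 64-bit FNV-1a hash of s.
func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// hashUint64 hashes the 8 little-endian bytes of v.
func hashUint64(v uint64) uint64 {
	var b [8]byte
	for i := range b {
		b[i] = byte(v >> (uint(i) * 8))
	}
	h := fnv.New64a()
	h.Write(b[:])
	return h.Sum64()
}

// checksum adds the per-field hashes together (wrapping on overflow),
// so it never needs the serialized bytes.
func checksum(r *Record) uint64 {
	return hashString(r.Name) + hashString(r.Phone) + hashUint64(uint64(r.Siblings))
}

func main() {
	r := &Record{Name: "Ann", Phone: "555-0100", Siblings: 2}
	fmt.Printf("%016x\n", checksum(r))
}
```

One caveat of a plain sum: it does not depend on field order, so swapping the contents of `Name` and `Phone` gives the same checksum. Mixing a per-field constant into each hash would be one way around that, if it matters.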
31. varint (protobuf)
• Available in a lot of software (protobuf, sqlite,
WebAssembly (LEB128 of LLVM), golang encoding/binary)
• Compresses positive integers (a negative number in two's
complement takes more bytes)
• Idea:
• most integers in our app are small ("not very big")
• Use as few bits as possible
• 7 bits of payload per byte, MSB as the "continuation bit"
• Cons:
• CPU cost
• decoding is a bit complex
32. varint (protobuf)
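t := uint64(l)
for t >= 0x80 {
	// low 7 bits of t, with the continuation bit set
	buf[i+8] = byte(t) | 0x80
	t >>= 7
	i++
}
// last byte: continuation bit clear
buf[i+8] = byte(t)
i++
Many variants exist (group varint encoding,
prefix varint encoding etc ... )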
33. zigzag encoding (protobuf)
• varint is only compact for non-negative numbers
• Zigzag encoding maps each signed number to an unsigned one,
so that a small absolute value gives a small encoded value
n zigzag(n)
0 0
-1 1
1 2
-2 3
2147483647 4294967294
-2147483648 4294967295
zigzag = (n << 1) ^ (n >> (BIT_WIDTH - 1))
Remember that arithmetic shift replicates the sign bit
(n >> (BIT_WIDTH - 1)) -> 11111...1 for negative n
(n >> (BIT_WIDTH - 1)) -> 00000...0 for non-negative n
So for a negative n, the XOR flips the leading 1 bits of (n << 1) to 0
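The formula can be checked against the table with a short sketch for 32-bit integers. The function names `zigzagEncode32` and `zigzagDecode32` are hypothetical; the decode expression is the usual inverse and is not shown in the slides.

```go
package main

import "fmt"

// zigzagEncode32 applies (n << 1) ^ (n >> 31). In Go, >> on a signed
// integer is an arithmetic shift, so n >> 31 is all ones for negative n.
func zigzagEncode32(n int32) uint32 {
	return uint32((n << 1) ^ (n >> 31))
}

// zigzagDecode32 reverses the mapping: the lowest bit carries the sign.
func zigzagDecode32(u uint32) int32 {
	return int32(u>>1) ^ -int32(u&1)
}

func main() {
	for _, n := range []int32{0, -1, 1, -2, 2147483647, -2147483648} {
		z := zigzagEncode32(n)
		fmt.Println(n, z, zigzagDecode32(z))
	}
}
```

After zigzag, the value can be passed to a varint encoder, so -1 costs one byte rather than the maximum.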
34. float reverse (gob)
// floatBits returns a uint64 holding the bits of a floating-point number.
// Floating-point numbers are transmitted as uint64s holding the bits
// of the underlying representation. They are sent byte-reversed, with
// the exponent end coming out first, so integer floating point numbers
// (for example) transmit more compactly. This routine does the
// swizzling.
func floatBits(f float64) uint64 {
	u := math.Float64bits(f)
	var v uint64
	for i := 0; i < 8; i++ {
		v <<= 8
		v |= u & 0xFF
		u >>= 8
	}
	return v
}
35. unsafe (andyleap/gencode)
v := *(*uint64)(unsafe.Pointer(&(d.Height)))
Unmarshal a number without copying or allocating
You could use the same technique for strings too
http://qiita.com/mattn/items/176459728ff4f854b165
36. Finally
• Writing your own serialization is not hard, and it is fun
• You can learn a lot from existing methods
• There are tons of techniques that can be used to
improve performance
• When there is no strong preference, use
fixed-layout serialization
• Version-controlled proto file
• High performance