How to Write the Fastest JSON
Parser/Writer in the World
Milo Yip
Tencent
28 Mar 2015
Milo Yip 叶劲峰
• Expert Engineer (2011 to now)
– Engine Technology Center, R & D Department,
Interactive Entertainment Group (IEG), Tencent
• Master of Philosophy in System Engineering &
Engineering Management, CUHK
• Bachelor of Cognitive Science, HKU
• https://github.com/miloyip
• http://www.cnblogs.com/miloyip
• http://www.zhihu.com/people/miloyip
Table of Contents
1. Introduction
2. Benchmark
3. Design
4. Limitations
5. Thoughts
6. References
1. INTRODUCTION
JSON
• JavaScript Object Notation
• Alternative to XML
• Human-readable text to transmit/persist data
• RFC 7159/ECMA-404
• Common uses
– Open API (e.g. Twitter, Facebook, etc.)
– Data storage/exchange (e.g. GeoJSON)
RapidJSON
• https://github.com/miloyip/rapidjson
• MIT License
• C++ Header-only Library
• Started in Nov 2011
• Inspired by RapidXML
• Will release 1.0 under Tencent *soon*
Features
• Both SAX and DOM style API
• Fast
• Cross platform/compiler
• No dependencies
• Memory friendly
• UTF-8/16/32/ASCII and transcoding
• In-situ Parsing
• More at http://miloyip.github.io/rapidjson/md_doc_features.html
Hello RapidJSON!
#include "rapidjson/document.h"
#include "rapidjson/writer.h"
#include "rapidjson/stringbuffer.h"
#include <iostream>
using namespace rapidjson;
int main() {
    // 1. Parse a JSON string into DOM.
    const char* json = "{\"project\":\"rapidjson\",\"stars\":10}";
    Document d;
    d.Parse(json);
    // 2. Modify it by DOM.
    Value& s = d["stars"];
    s.SetInt(s.GetInt() + 1);
    // 3. Stringify the DOM.
    StringBuffer buffer;
    Writer<StringBuffer> writer(buffer);
    d.Accept(writer);
    // Output {"project":"rapidjson","stars":11}
    std::cout << buffer.GetString() << std::endl;
    return 0;
}
Fast, AND Reliable
• 103 Unit Tests
• Continuous Integration
– Travis on Linux
– AppVeyor on Windows
– Valgrind (Linux) for memory leak checking
• Use in real applications
– Use in client and server applications at Tencent
– A user reported parsing 50 million JSON daily
Public Projects using RapidJSON
• Cocos2D-X: Cross-Platform 2D Game Engine
http://cocos2d-x.org/
• Microsoft Bond: Cross-Platform Serialization
https://github.com/Microsoft/bond/
• Google Angle: OpenGL ES 2 for Windows
https://chromium.googlesource.com/angle/angle/
• CERN LHCb: Large Hadron Collider beauty
http://lhcb-comp.web.cern.ch/lhcb-comp/
• Tell me if you know more
2. BENCHMARK
Benchmarks for Native JSON libraries
• https://github.com/miloyip/nativejson-benchmark
• Compare 20 open source C/C++ JSON libraries
• Evaluate speed, memory and code size
• For parsing, stringify, traversal, and more
Libraries
• CAJUN
• Casablanca
• cJSON
• dropbox/json11
• FastJson
• gason
• jansson
• json-c
• json spirit
• Json Box
• JsonCpp
• JSON++
• parson
• picojson
• RapidJSON
• simplejson
• udp/json
• ujson4c
• vincenthz/libjson
• YAJL
Results: Parsing Speed
Results: Parsing Memory
Results: Stringify Speed
Results: Code Size
Benchmarks for Spine
• Spine is a 2D skeletal animation tool
• Spine-C is the official runtime in C
https://github.com/EsotericSoftware/spine-runtimes/tree/master/spine-c
• It uses JSON as data format
• It has a custom JSON parser
• Adapt RapidJSON and compare loading time
Test Data
• http://esotericsoftware.com/forum/viewtopic.php?f=3&t=2831
• Original 80KB JSON
• Interpolate to get
multiple JSON files
• Load 100 times
Results
3. DESIGN
The Zero Overhead Principle
• Bjarne Stroustrup[1]:
“What you don't use, you don't pay for.”
• RapidJSON tries to obey this principle
– SAX and DOM
– Combinable options, configurations
SAX
StartObject()
Key("hello", 5, true)
String("world", 5, true)
Key("t", 1, true)
Bool(true)
Key("f", 1, true)
Bool(false)
Key("n", 1, true)
Null()
Key("i")
Uint(123)
Key("pi")
Double(3.1416)
Key("a")
StartArray()
Uint(1)
Uint(2)
Uint(3)
Uint(4)
EndArray(4)
EndObject(7)
DOM
When parsing JSON into a DOM, use SAX events to build the DOM.
When stringifying a DOM, traverse it and generate SAX events.
{"hello":"world", "t":true, "f":false, "n":null,
"i":123, "pi":3.1416, "a":[1, 2, 3, 4]}
DOM
SAX
Architecture
[Diagram: classes Value, Document, Reader, Writer; concepts Handler, Stream, Encoding, Allocator; relations labelled "calls", "implements", "accepts", "has"]
Handler: Template Parameter
• Handler handles SAX event callbacks
• How to implement callbacks?
– Traditional: virtual function
– RapidJSON: template parameter
template <unsigned parseFlags, typename InputStream, typename Handler>
ParseResult Reader::Parse(InputStream& is, Handler& handler);
• No virtual function overhead
• Inline callback functions
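The idea of passing the handler as a template parameter can be illustrated with a minimal sketch. This is not RapidJSON's Reader: the names SketchCountingHandler and ParseBoolArray are invented for illustration, and the toy grammar only accepts a flat array of true/false with no whitespace. The point is that handler.Bool(...) is resolved at compile time, so the compiler is free to inline it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical handler (not part of RapidJSON): counts the events it receives.
struct SketchCountingHandler {
    std::size_t trues = 0, falses = 0, arrays = 0;
    bool StartArray() { ++arrays; return true; }
    bool Bool(bool b) { (b ? trues : falses)++; return true; }
    bool EndArray(std::size_t) { return true; }
};

// Toy reader for inputs such as "[true,false,true]" (no whitespace, no nesting).
// Handler is a template parameter, so every callback is a direct,
// statically bound call; no virtual dispatch is involved.
template <typename Handler>
bool ParseBoolArray(const char* s, Handler& handler) {
    assert(s != nullptr);
    if (*s++ != '[' || !handler.StartArray()) return false;
    if (*s == ']') return handler.EndArray(0);
    std::size_t count = 0;
    for (;;) {
        if (std::strncmp(s, "true", 4) == 0) {
            s += 4;
            if (!handler.Bool(true)) return false;
        } else if (std::strncmp(s, "false", 5) == 0) {
            s += 5;
            if (!handler.Bool(false)) return false;
        } else {
            return false;  // unexpected token
        }
        ++count;
        if (*s == ',') { ++s; continue; }
        if (*s == ']') return handler.EndArray(count);
        return false;      // missing ',' or ']'
    }
}
```

Any type that provides the same member functions can be plugged in, for example a handler that builds a DOM or one that writes JSON text.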
Parsing Options: Template Argument
• Many parse options -> Zero overhead principle
• Use integer template argument
template <unsigned parseFlags, typename InputStream, typename Handler>
ParseResult Reader::Parse(InputStream& is, Handler& handler);
if (parseFlags & kParseInsituFlag) {
// ...
}
else {
// ...
}
• Compiler optimization eliminates unused code
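A small sketch of combinable options as an integer template argument. The flags kSketchTrimFlag and kSketchLowerFlag and the function ReadKey are hypothetical and exist only for this example; they are not RapidJSON options. Because flags is a compile-time constant, each instantiation keeps only the branches it actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical option flags, invented for this sketch.
enum SketchFlag : unsigned {
    kSketchNoFlags   = 0,
    kSketchTrimFlag  = 1u << 0,  // strip leading/trailing spaces
    kSketchLowerFlag = 1u << 1   // convert ASCII letters to lower case
};

template <unsigned flags>
std::string ReadKey(const std::string& raw) {
    std::string key = raw;
    if (flags & kSketchTrimFlag) {   // constant condition: removed when the flag is unset
        const std::size_t first = key.find_first_not_of(' ');
        const std::size_t last = key.find_last_not_of(' ');
        key = (first == std::string::npos) ? std::string()
                                           : key.substr(first, last - first + 1);
    }
    if (flags & kSketchLowerFlag) {  // likewise resolved at compile time
        for (char& c : key)
            if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
    }
    return key;
}
```

A caller that writes ReadKey<kSketchNoFlags>(s) pays for neither option, which is the zero overhead principle applied to configuration.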
Recursive SAX Parser
• Simple to write/optimize by hand
• Use program stack to maintain parsing state of
the tree structure
• Prone to stack overflow
– So also provide an iterative parser
(Contributed by Don Ding @thebusytypist)
Normal Parsing
In situ Parsing
No allocation and copying for strings! Cache Friendly!
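In situ parsing can be sketched as decoding a string inside the input buffer itself. The function below is an illustration under stated assumptions, not RapidJSON's code: DecodeStringInSitu is a hypothetical name, the buffer must be mutable and null-terminated, and only a few escapes are handled (no \uXXXX). The decoded text is never longer than the encoded text, so it can overwrite the source bytes.

```cpp
#include <cassert>
#include <cstddef>

// Decodes the JSON string starting at the opening quote `p`, in place.
// Returns a pointer to the decoded, null-terminated text (inside the same
// buffer), or nullptr on error. *length receives the decoded length and
// *next points just past the closing quote.
inline char* DecodeStringInSitu(char* p, std::size_t* length, char** next) {
    assert(p && length && next);
    if (*p != '"') return nullptr;
    char* const begin = p + 1;
    char* src = begin;  // read cursor
    char* dst = begin;  // write cursor, never ahead of src
    for (;;) {
        char c = *src++;
        if (c == '\0') return nullptr;  // unterminated string
        if (c == '"') break;            // closing quote
        if (c == '\\') {
            switch (*src++) {
                case '"':  c = '"';  break;
                case '\\': c = '\\'; break;
                case '/':  c = '/';  break;
                case 'n':  c = '\n'; break;
                case 't':  c = '\t'; break;
                default:   return nullptr;  // escape not supported in this sketch
            }
        }
        *dst++ = c;
    }
    *dst = '\0';  // lands on or before the closing quote
    *length = static_cast<std::size_t>(dst - begin);
    *next = src;
    return begin;
}
```

The caller must keep the buffer alive for as long as the returned pointer is used, which is the usual price of in situ parsing.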
Parsing Number: the Pain ;(
• RapidJSON supports parsing JSON number to
uint32_t, int32_t, uint64_t, int64_t, double
• Difficult to detect in single pass
• Even more difficult for double (strtod() is slow)
• Implemented kFullPrecision option using
1. Fast-path
2. DIY-FP (https://github.com/floitsch/double-conversion)
3. Big Integer method [2]
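The fast path (step 1) can be sketched as follows. This is a minimal illustration of the classic technique, not RapidJSON's implementation; FastPathDouble is a hypothetical name. It assumes IEEE 754 doubles evaluated in double precision: when the decimal significand fits in 53 bits and the power of ten is at most 10^22, both operands are exact, so a single multiplication or division gives the correctly rounded result. All other inputs must fall back to the slower methods.

```cpp
#include <cassert>
#include <cstdint>

// Tries to convert significand * 10^exp10 to double with one exact operation.
// Returns false when the fast path does not apply.
inline bool FastPathDouble(std::uint64_t significand, int exp10, double* out) {
    // Powers of ten up to 1e22 are exactly representable as doubles.
    static const double kPow10[] = {
        1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,  1e8,  1e9,  1e10, 1e11,
        1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
    };
    assert(out != nullptr);
    if (significand > (1ull << 53) || exp10 < -22 || exp10 > 22)
        return false;  // needs DIY-FP or big integers
    const double d = static_cast<double>(significand);  // exact conversion
    *out = (exp10 >= 0) ? d * kPow10[exp10] : d / kPow10[-exp10];
    return true;
}
```

For example, the text "3.1416" is read as significand 31416 with exponent -4 and converted with one division.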
How difficult?
• PHP Hangs On Numeric Value 2.2250738585072011e-308
http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/
• Java Hangs When Converting 2.2250738585072012e-308
http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/
• "2.22507385850720113605740979670913197593481954635164564e-308" → 2.2250738585072009e-308
• "2.22507385850720113605740979670913197593481954635164565e-308" → 2.2250738585072014e-308
• And need to be fast…
DOM Designed for Fast Parsing
• A JSON value can be one of 6 types
– object, array, number, string, boolean, null
• An inheritance-based design needs a new (heap allocation) for each value
• RapidJSON uses a single variant type Value
Layout of Value
String:  Ch* str | SizeType length | unsigned flags
Number:  int i + 0 / unsigned u + 0 / int64_t i64 / uint64_t u64 / double d | unsigned flags
Object:  Member* members | SizeType size | SizeType capacity | unsigned flags
Array:   Value* values | SizeType size | SizeType capacity | unsigned flags
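A single variant type along these lines might be sketched as a union plus a flags field. This is a simplified illustration, not RapidJSON's actual definition: SketchValue and the kSketch... constants are invented names, and objects, ownership and allocators are left out. Because every value has the same size, an array of values is one contiguous block rather than one heap object per element.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned SizeType;

// Hypothetical type tags for this sketch.
enum SketchFlags : unsigned {
    kSketchNull, kSketchString, kSketchInt, kSketchDouble, kSketchArray
};

struct SketchValue {
    struct String { const char* str; SizeType length; };
    struct Array  { SketchValue* values; SizeType size; SizeType capacity; };
    union Number  { int i; unsigned u; std::int64_t i64; std::uint64_t u64; double d; };
    union Data    { String s; Number n; Array a; };

    Data data;       // storage shared by all JSON types
    unsigned flags;  // says which member of data is active

    SketchValue() : flags(kSketchNull) { data.n.u64 = 0; }

    void SetInt(int v)       { data.n.i = v; flags = kSketchInt; }
    void SetDouble(double v) { data.n.d = v; flags = kSketchDouble; }
    void SetString(const char* s, SizeType length) {
        data.s.str = s;  // not copied: the caller keeps the text alive
        data.s.length = length;
        flags = kSketchString;
    }

    bool IsInt() const { return flags == kSketchInt; }
    int GetInt() const { assert(IsInt()); return data.n.i; }
};
```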
Move Semantics
• Deep copying object/array/string is slow
• RapidJSON enforces move semantics
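Enforced move semantics can be sketched with an assignment operator that takes a non-const reference and steals the source's buffer. SketchString is a hypothetical stand-in for a DOM value that owns memory; it is not RapidJSON's Value class, only an illustration of the ownership transfer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

class SketchString {
public:
    SketchString() : str_(nullptr), length_(0) {}
    explicit SketchString(const char* s) : str_(nullptr), length_(0) {
        assert(s != nullptr);
        length_ = std::strlen(s);
        str_ = static_cast<char*>(std::malloc(length_ + 1));
        assert(str_ != nullptr);
        std::memcpy(str_, s, length_ + 1);
    }
    ~SketchString() { std::free(str_); }

    // Copy construction is disabled; there is no implicit deep copy.
    SketchString(const SketchString&) = delete;

    // Assignment moves: the right-hand side gives up its buffer and becomes null.
    SketchString& operator=(SketchString& rhs) {
        if (this != &rhs) {
            std::free(str_);
            str_ = rhs.str_;
            length_ = rhs.length_;
            rhs.str_ = nullptr;
            rhs.length_ = 0;
        }
        return *this;
    }

    bool IsNull() const { return str_ == nullptr; }
    const char* GetString() const { return str_; }
    std::size_t GetLength() const { return length_; }

private:
    char* str_;
    std::size_t length_;
};
```

After b = a, the text lives in b and a is null, so no characters were copied.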
The Default Allocator
• Internally allocates a singly linked list of
buffers
• Does not free objects (thus FAST!)
• Suitable for parsing (creating values
consecutively)
• Not suitable for DOM manipulation
Custom Initial Buffer
• User can provide a custom initial buffer
– For example, buffer on stack, scratch buffer
• The allocator uses that buffer first until it is full
• Possible to achieve zero allocation in parsing
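The allocator described on the last two slides can be sketched as a bump allocator that starts in a user-supplied buffer and falls back to a singly linked list of heap chunks. SketchPoolAllocator and its members are hypothetical names, the 8-byte alignment and 4096-byte chunk size are assumptions of this sketch, and the user buffer is assumed to be 8-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

const std::size_t kSketchAlign = 8;  // assumed alignment for every allocation

class SketchPoolAllocator {
public:
    SketchPoolAllocator(void* userBuffer, std::size_t userSize,
                        std::size_t chunkSize = 4096)
        : cur_(static_cast<char*>(userBuffer)), end_(cur_ + userSize),
          chunks_(nullptr), chunkSize_(chunkSize), heapChunks_(0) {
        assert(reinterpret_cast<std::uintptr_t>(userBuffer) % kSketchAlign == 0);
    }
    ~SketchPoolAllocator() {  // everything is released at once
        while (chunks_) { Chunk* next = chunks_->next; std::free(chunks_); chunks_ = next; }
    }
    SketchPoolAllocator(const SketchPoolAllocator&) = delete;
    SketchPoolAllocator& operator=(const SketchPoolAllocator&) = delete;

    void* Malloc(std::size_t size) {
        size = (size + kSketchAlign - 1) & ~(kSketchAlign - 1);
        if (static_cast<std::size_t>(end_ - cur_) < size &&
            !AddChunk(size > chunkSize_ ? size : chunkSize_))
            return nullptr;
        void* p = cur_;  // bump the cursor; no per-object bookkeeping
        cur_ += size;
        return p;
    }
    void Free(void*) {}  // intentionally a no-op
    std::size_t HeapChunkCount() const { return heapChunks_; }

private:
    struct Chunk { Chunk* next; };

    bool AddChunk(std::size_t capacity) {
        const std::size_t header = (sizeof(Chunk) + kSketchAlign - 1) & ~(kSketchAlign - 1);
        Chunk* c = static_cast<Chunk*>(std::malloc(header + capacity));
        if (!c) return false;
        c->next = chunks_;  // push onto the singly linked list
        chunks_ = c;
        cur_ = reinterpret_cast<char*>(c) + header;
        end_ = cur_ + capacity;
        ++heapChunks_;
        return true;
    }

    char* cur_;
    char* end_;
    Chunk* chunks_;
    std::size_t chunkSize_;
    std::size_t heapChunks_;
};
```

As long as all requests fit in the initial buffer, HeapChunkCount() stays at zero, which corresponds to the zero allocation case.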
Short String Optimization
• Many JSON keys are short
• Contributor @Kosta-Github submitted a PR to
optimize short strings
String
Ch* str
SizeType length
unsigned flags
ShortString
Ch str[11];
uint8_t x;
unsigned flags
Let length = 11 – x
So for an 11-char string, x = 0 and doubles as the terminating '\0'
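The length trick can be sketched as below. SketchShortString is a hypothetical, simplified struct (without the flags field), not RapidJSON's definition. Note that reading the terminator of an 11-character string through str relies on x being laid out directly after str, which real implementations depend on but which strict C++ does not guarantee for array access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::size_t kSketchMaxChars = 11;

struct SketchShortString {
    char str[11];    // characters stored inline, no heap allocation
    std::uint8_t x;  // 11 - length; equals 0 ('\0') when the string is full

    // Stores s if it fits; returns false so the caller can use the long form.
    bool Set(const char* s, std::size_t length) {
        assert(s != nullptr);
        if (length > kSketchMaxChars) return false;
        std::memcpy(str, s, length);
        if (length < kSketchMaxChars) str[length] = '\0';
        x = static_cast<std::uint8_t>(kSketchMaxChars - length);
        return true;
    }
    std::size_t GetLength() const { return kSketchMaxChars - x; }
    const char* GetString() const { return str; }
};
```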
SIMD Optimization
• Using SSE2/SSE4 to skip whitespaces
(space, tab, LF, CR)
• Each iteration compares 16 chars against the 4 whitespace chars
• Fast for JSON with indentation
• Visual C++ 2010 32-bit test, skipping 1M whitespace characters (ms):
  strlen() (for reference)    752
  strspn()                   3011
  RapidJSON (no SIMD)        1349
  RapidJSON (SSE2)            170
  RapidJSON (SSE4)            102
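The SSE2 variant can be sketched roughly as follows. This is an illustration, not RapidJSON's routine: SkipWhitespace here takes an explicit end pointer and uses unaligned loads so that it never reads past the buffer, and it falls back to a scalar loop when SSE2 is not available. The SSE4 variant from the table is not shown.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

inline bool IsJsonWhitespace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Returns a pointer to the first non-whitespace character in [p, end).
inline const char* SkipWhitespace(const char* p, const char* end) {
    assert(p && end && p <= end);
#ifdef SKETCH_HAS_SSE2
    const __m128i space = _mm_set1_epi8(' ');
    const __m128i tab   = _mm_set1_epi8('\t');
    const __m128i lf    = _mm_set1_epi8('\n');
    const __m128i cr    = _mm_set1_epi8('\r');
    while (end - p >= 16) {
        const __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
        // Compare all 16 bytes against each of the 4 whitespace characters.
        __m128i ws = _mm_cmpeq_epi8(chunk, space);
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, tab));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, lf));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, cr));
        // One bit per byte; a set bit marks a non-whitespace byte.
        unsigned nonWs = static_cast<unsigned>(~_mm_movemask_epi8(ws)) & 0xFFFFu;
        if (nonWs != 0) {
            while ((nonWs & 1u) == 0) { nonWs >>= 1; ++p; }
            return p;
        }
        p += 16;
    }
#endif
    while (p != end && IsJsonWhitespace(*p)) ++p;  // scalar tail or fallback
    return p;
}
```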
Integer-to-String Optimization
• Integer-To-String conversion is simple
– E.g. 123 -> “123”
• But standard library is quite slow
– E.g. sprintf(), _itoa(), etc.
• Tried various implementations
My implementations
• https://github.com/miloyip/itoa-benchmark
• Visual C++ 2013 on Windows 64-bit
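One common way to beat sprintf() for integers is to emit two digits per division using a lookup table. The sketch below shows that technique under the hypothetical name U32ToA; it is not claimed to be the fastest variant measured in the benchmark above.

```cpp
#include <cassert>
#include <cstdint>

// Writes the decimal form of value into buffer (room for 10 digits plus '\0'
// is required) and returns a pointer to the terminating '\0'.
inline char* U32ToA(std::uint32_t value, char* buffer) {
    // "00".."99": the two characters for n start at index 2 * n.
    static const char kDigitPairs[201] =
        "00010203040506070809"
        "10111213141516171819"
        "20212223242526272829"
        "30313233343536373839"
        "40414243444546474849"
        "50515253545556575859"
        "60616263646566676869"
        "70717273747576777879"
        "80818283848586878889"
        "90919293949596979899";
    assert(buffer != nullptr);
    char temp[10];  // a 32-bit value has at most 10 digits
    char* p = temp + 10;
    while (value >= 100) {  // peel off two digits per iteration
        const unsigned i = (value % 100) * 2;
        value /= 100;
        *--p = kDigitPairs[i + 1];
        *--p = kDigitPairs[i];
    }
    if (value < 10) {
        *--p = static_cast<char>('0' + value);
    } else {
        const unsigned i = value * 2;
        *--p = kDigitPairs[i + 1];
        *--p = kDigitPairs[i];
    }
    while (p != temp + 10) *buffer++ = *p++;  // digits were produced in reverse
    *buffer = '\0';
    return buffer;
}
```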
Double-to-String Optimization
• Double-to-string conversion is very slow
– E.g. 3.14 -> “3.14”
• Grisu2 is a fast algorithm for this[3]
– 100% cases give correct results
– >99% cases give optimal results
• Google V8 has an implementation
– https://github.com/floitsch/double-conversion
– But not header-only, so…
My Grisu2 Implementation
• https://github.com/miloyip/dtoa-benchmark
• Visual C++ 2013 on Windows 64-bit:
4. LIMITATIONS
Tradeoff: User-Friendliness
• DOM only supports move semantics
– Cannot copy-construct Value/Document
– So, cannot pass them by value, put in containers
• DOM APIs need an allocator as a parameter, e.g.
numbers.PushBack(1, allocator);
• User must manage the life cycle of the allocator
and its allocated values
Pausing in Parsing
• Cannot pause in parsing and resume it later
– Not keeping all parsing states explicitly
– Doing so will be much slower
• Typical Scenario
– Streaming JSON from network
– Don’t want to store the JSON in memory
• Solution
– Parse in a separate thread
– Block the input stream to pause
5. THOUGHTS
Origin
• RapidJSON started as my hobby project in 2011
• Also my first open source project
• First version released in 2 weeks
Community
• Google Code helped with tracking bugs, but made it
hard to attract contributions
• After migrating to GitHub in 2014
– Community much more active
– Issue tracking more powerful
– Pull requests ease contributions
Future
• Official Release under Tencent
– 1.0 beta → 1.0 release (after 3+ years…)
– Can work on it in working time
– Involve marketing and other colleagues
– Establish Community in China
• Post-1.0 Features
– Easy DOM API (but slower)
– JSON Schema
– Relaxed JSON syntax
– Optimization on Object Member Access
• Open source our internal projects at Tencent
To Establish an Open Source Project
• Courage
• Start Small
• Be Different
– Innovative Idea?
– Easy to Use?
– Good Performance?
• Embrace Community
• Learn
References
1. Stroustrup, Bjarne. The Design and Evolution of C++. Pearson Education India, 1994.
2. Clinger, William D. "How to read floating point numbers accurately." ACM SIGPLAN Notices 25.6 (1990).
3. Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM SIGPLAN Notices 45.6 (2010): 233-243.
Q&A

Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 

How to Write the Fastest JSON Parser/Writer in the World

  • 1. How to Write the Fastest JSON Parser/Writer in the World Milo Yip Tencent 28 Mar 2015
  • 2. Milo Yip 叶劲峰 • Expert Engineer (2011 to now) – Engine Technology Center, R & D Department, Interactive Entertainment Group (IEG), Tencent • Master of Philosophy in System Engineering & Engineering Management, CUHK • Bachelor of Cognitive Science, HKU • https://github.com/miloyip • http://www.cnblogs.com/miloyip • http://www.zhihu.com/people/miloyip
  • 4. Table of Contents 1. Introduction 2. Benchmark 3. Design 4. Limitations 5. Thoughts 6. References
  • 6. JSON • JavaScript Object Notation • Alternative to XML • Human-readable text to transmit/persist data • RFC 7159/ECMA-404 • Common uses – Open API (e.g. Twitter, Facebook, etc.) – Data storage/exchange (e.g. GeoJSON)
  • 7. RapidJSON • https://github.com/miloyip/rapidjson • MIT License • C++ Header-only Library • Started in Nov 2011 • Inspired by RapidXML • Will release 1.0 under Tencent *soon*
  • 8. Features • Both SAX and DOM style API • Fast • Cross platform/compiler • No dependencies • Memory friendly • UTF-8/16/32/ASCII and transcoding • In-situ Parsing • More at http://miloyip.github.io/rapidjson/md_doc_features.html
  • 9. Hello RapidJSON!
    #include "rapidjson/document.h"
    #include "rapidjson/writer.h"
    #include "rapidjson/stringbuffer.h"
    #include <iostream>
    using namespace rapidjson;
    int main() {
        // 1. Parse a JSON string into DOM.
        const char* json = "{\"project\":\"rapidjson\",\"stars\":10}";
        Document d;
        d.Parse(json);
        // 2. Modify it by DOM.
        Value& s = d["stars"];
        s.SetInt(s.GetInt() + 1);
        // 3. Stringify the DOM.
        StringBuffer buffer;
        Writer<StringBuffer> writer(buffer);
        d.Accept(writer);
        // Output {"project":"rapidjson","stars":11}
        std::cout << buffer.GetString() << std::endl;
        return 0;
    }
  • 10. Fast, AND Reliable • 103 Unit Tests • Continuous Integration – Travis on Linux – AppVeyor on Windows – Valgrind (Linux) for memory leak checking • Used in real applications – Used in client and server applications at Tencent – A user reported parsing 50 million JSON documents daily
  • 11. Public Projects using RapidJSON • Cocos2D-X: Cross-Platform 2D Game Engine http://cocos2d-x.org/ • Microsoft Bond: Cross-Platform Serialization https://github.com/Microsoft/bond/ • Google Angle: OpenGL ES 2 for Windows https://chromium.googlesource.com/angle/angle/ • CERN LHCb: Large Hadron Collider beauty http://lhcb-comp.web.cern.ch/lhcb-comp/ • Tell me if you know more
  • 13. Benchmarks for Native JSON libraries • https://github.com/miloyip/nativejson-benchmark • Compare 20 open source C/C++ JSON libraries • Evaluate speed, memory and code size • For parsing, stringify, traversal, and more
  • 14. Libraries • CAJUN • Casablanca • cJSON • dropbox/json11 • FastJson • gason • jansson • json-c • json spirit • Json Box • JsonCpp • JSON++ • parson • picojson • RapidJSON • simplejson • udp/json • ujson4c • vincenthz/libjson • YAJL
  • 19. Benchmarks for Spine • Spine is a 2D skeletal animation tool • Spine-C is the official runtime in C https://github.com/EsotericSoftware/spine-runtimes/tree/master/spine-c • It uses JSON as data format • It has a custom JSON parser • Adapt RapidJSON and compare loading time
  • 20. Test Data • http://esotericsoftware.com/forum/viewtopic.php?f=3&t=2831 • Original 80KB JSON • Interpolate to get multiple JSON files • Load 100 times
  • 23. The Zero Overhead Principle • Bjarne Stroustrup[1]: “What you don't use, you don't pay for.” • RapidJSON tries to obey this principle – SAX and DOM – Combinable options, configurations
  • 24. SAX and DOM • JSON: {"hello":"world", "t":true, "f":false, "n":null, "i":123, "pi":3.1416, "a":[1, 2, 3, 4]} • SAX events: StartObject() Key("hello", 5, true) String("world", 5, true) Key("t", 1, true) Bool(true) Key("f", 1, true) Bool(false) Key("n", 1, true) Null() Key("i") Uint(123) Key("pi") Double(3.1416) Key("a") StartArray() Uint(1) Uint(2) Uint(3) Uint(4) EndArray(4) EndObject(7) • When parsing JSON into a DOM, the SAX events are used to build the DOM. • When stringifying a DOM, it is traversed to generate events for a SAX handler.
  • 26. Handler: Template Parameter • Handler handles SAX event callbacks • How to implement callbacks? – Traditional: virtual function – RapidJSON: template parameter template <unsigned parseFlags, typename InputStream, typename Handler> ParseResult Reader::Parse(InputStream& is, Handler& handler); • No virtual function overhead • Inline callback functions
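The template-parameter approach can be illustrated with a toy example. This is a minimal sketch, not RapidJSON's actual Reader: `ParseUintArray` and `SumHandler` are hypothetical names, and the "parser" only understands arrays of unsigned integers without whitespace. The point is that the handler type is known at compile time, so each callback is a direct call the compiler can inline, with no virtual base class involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical handler: sums the values and counts the events it receives.
// It has no base class and no virtual functions.
struct SumHandler {
    unsigned sum = 0;
    std::size_t events = 0;
    bool StartArray() { ++events; return true; }
    bool Uint(unsigned u) { sum += u; ++events; return true; }
    bool EndArray(std::size_t /*elementCount*/) { ++events; return true; }
};

// Toy reader for input such as "[1,2,3]" (no whitespace, no overflow checks).
// Handler is a template parameter, so handler.Uint(...) is resolved statically.
template <typename Handler>
bool ParseUintArray(const char* s, Handler& handler) {
    if (*s++ != '[') return false;
    if (!handler.StartArray()) return false;
    if (*s == ']') return handler.EndArray(0);
    std::size_t count = 0;
    for (;;) {
        if (*s < '0' || *s > '9') return false;
        unsigned value = 0;
        while (*s >= '0' && *s <= '9')
            value = value * 10 + static_cast<unsigned>(*s++ - '0');
        if (!handler.Uint(value)) return false;
        ++count;
        if (*s == ',') { ++s; continue; }
        if (*s == ']') return handler.EndArray(count);
        return false;
    }
}
```

Any type that provides the three member functions can be passed as the handler; a mismatch is reported at compile time rather than at run time.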
  • 27. Parsing Options: Template Argument • Many parse options -> Zero overhead principle • Use integer template argument template <unsigned parseFlags, typename InputStream, typename Handler> ParseResult Reader::Parse(InputStream& is, Handler& handler); if (parseFlags & kParseInsituFlag) { // ... } else { // ... } • Compiler optimization eliminates unused code
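A small sketch of the same pattern outside RapidJSON: the flag below (`kSketchAllowPlusSignFlag`) and the function `ParseInt` are made up for illustration. Because `parseFlags` is a compile-time constant, the branch that tests it has a constant condition, and an optimizing compiler can drop the unused side entirely from each instantiation.

```cpp
#include <cassert>

// Hypothetical option flags, combinable with bitwise OR.
enum : unsigned {
    kSketchDefaultFlags = 0,
    kSketchAllowPlusSignFlag = 1u << 0   // accept a leading '+'
};

// Parses a whole string as a decimal int (no overflow checks in this sketch).
template <unsigned parseFlags>
bool ParseInt(const char* s, int& out) {
    bool negative = false;
    if (*s == '-') {
        negative = true;
        ++s;
    } else if (parseFlags & kSketchAllowPlusSignFlag) {
        // Constant condition: this block is dead code when the flag is unset.
        if (*s == '+') ++s;
    }
    if (*s < '0' || *s > '9') return false;
    int value = 0;
    for (; *s >= '0' && *s <= '9'; ++s) value = value * 10 + (*s - '0');
    if (*s != '\0') return false;
    out = negative ? -value : value;
    return true;
}
```

Each combination of flags produces its own instantiation, so users who never enable an option do not pay for its checks.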
  • 28. Recursive SAX Parser • Simple to write/optimize by hand • Use program stack to maintain parsing state of the tree structure • Prone to stack overflow – So also provide an iterative parser (Contributed by Don Ding @thebusytypist)
  • 30. In situ Parsing • No allocation and copying for strings! • Cache friendly!
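To make "no allocation and copying" concrete, here is a minimal sketch of decoding a JSON string inside the input buffer itself. `DecodeStringInsitu` is a hypothetical helper, not RapidJSON code, and it skips `\uXXXX` escapes for brevity. Decoded text is never longer than its escaped form, so it can be written back over the source bytes and terminated in place.

```cpp
#include <cassert>
#include <cstddef>

// Decodes the JSON string whose opening quote is at `begin`, writing the result
// back into the same buffer. Returns a pointer to the decoded, null-terminated
// text and stores its length, or returns nullptr on malformed input.
inline char* DecodeStringInsitu(char* begin, std::size_t& length) {
    if (*begin != '"') return nullptr;
    char* src = begin + 1;      // read cursor
    char* dst = src;            // write cursor, never ahead of src
    char* const start = dst;
    for (;;) {
        char c = *src++;
        if (c == '"') break;            // closing quote
        if (c == '\0') return nullptr;  // unterminated string
        if (c == '\\') {
            switch (*src++) {
                case '"':  c = '"';  break;
                case '\\': c = '\\'; break;
                case '/':  c = '/';  break;
                case 'b':  c = '\b'; break;
                case 'f':  c = '\f'; break;
                case 'n':  c = '\n'; break;
                case 'r':  c = '\r'; break;
                case 't':  c = '\t'; break;
                default:   return nullptr;  // includes \u, unsupported here
            }
        }
        *dst++ = c;
    }
    *dst = '\0';   // overwrites the closing quote or an earlier byte
    length = static_cast<std::size_t>(dst - start);
    return start;
}
```

The caller keeps pointers into the original buffer, which is why the buffer must be mutable and must outlive the parsed values.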
  • 31. Parsing Number: the Pain ;( • RapidJSON supports parsing JSON number to uint32_t, int32_t, uint64_t, int64_t, double • Difficult to detect in single pass • Even more difficult for double (strtod() is slow) • Implemented kFullPrecision option using 1. Fast-path 2. DIY-FP (https://github.com/floitsch/double-conversion) 3. Big Integer method [2]
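The slide does not spell out what the fast path is. A common meaning, and the one assumed in this sketch, is the case described by Clinger [2]: when the decimal significand fits in 53 bits and the power of ten is between 10^-22 and 10^22, both operands are exactly representable as double, so a single multiplication or division yields the correctly rounded result. `FastPathToDouble` is a hypothetical name, and the sketch assumes IEEE 754 double arithmetic without extended intermediate precision.

```cpp
#include <cassert>
#include <cstdint>

// Powers of ten that are exactly representable as double: 10^0 .. 10^22.
static const double kPow10[23] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,  1e8,  1e9,  1e10, 1e11,
    1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

// Computes significand * 10^exp10 when the fast path applies.
// Returns false otherwise, in which case a slower exact method is needed.
inline bool FastPathToDouble(uint64_t significand, int exp10, double& out) {
    const uint64_t kMaxExact = uint64_t(1) << 53;
    if (significand > kMaxExact || exp10 < -22 || exp10 > 22) return false;
    const double d = static_cast<double>(significand);   // exact conversion
    out = (exp10 >= 0) ? d * kPow10[exp10] : d / kPow10[-exp10];
    return true;
}
```

For example, "3.1416" is read as significand 31416 with exponent -4 and handled by one division; inputs outside these bounds fall through to the slower methods listed on the slide.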
  • 32. How difficult? • PHP Hangs On Numeric Value 2.2250738585072011e-308 http://www.exploringbinary.com/php-hangs-on-numeric-value-2-2250738585072011e-308/ • Java Hangs When Converting 2.2250738585072012e-308 http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/ • "2.22507385850720113605740979670913197593481954635164564e-308" → 2.2250738585072009e-308 • "2.22507385850720113605740979670913197593481954635164565e-308" → 2.2250738585072014e-308 • And it needs to be fast…
  • 33. DOM Designed for Fast Parsing • A JSON value can be one of 6 types – object, array, number, string, boolean, null • Inheritance needs new for each value • RapidJSON uses a single variant type Value
  • 34. Layout of Value • String: Ch* str | SizeType length | unsigned flags • Number: int i / unsigned u / int64_t i64 / uint64_t u64 / double d | 0 0 | unsigned flags • Object: Member* members | SizeType size | SizeType capacity | unsigned flags • Array: Value* values | SizeType size | SizeType capacity | unsigned flags
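The layout above can be approximated with a tagged union. This is a simplified sketch using the field names from the slide; `SketchValue` and `SketchMember` are hypothetical types, `SizeType` is assumed to be a 32-bit unsigned integer, and the flag values are invented for illustration. Every JSON type shares the same storage, so an array of values is one contiguous block and no per-value `new` is needed.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned SizeType;   // assumed to be 32 bits wide

struct SketchMember;         // name/value pair, defined below

struct SketchValue {
    enum : unsigned { kNullFlag, kStringFlag, kInt64Flag, kDoubleFlag, kArrayFlag, kObjectFlag };

    struct String { const char* str; SizeType length; };
    union  Number { int i; unsigned u; int64_t i64; uint64_t u64; double d; };
    struct Array  { SketchValue* values; SizeType size; SizeType capacity; };
    struct Object { SketchMember* members; SizeType size; SizeType capacity; };
    union  Data   { String s; Number n; Array a; Object o; };

    Data data;        // payload, interpreted according to flags
    unsigned flags;   // type tag

    SketchValue() : flags(kNullFlag) { data.n.u64 = 0; }

    void SetInt64(int64_t v) { data.n.i64 = v; flags = kInt64Flag; }
    void SetDouble(double v) { data.n.d = v; flags = kDoubleFlag; }
    // Stores a pointer only; the caller keeps ownership of the characters.
    void SetString(const char* s, SizeType len) { data.s.str = s; data.s.length = len; flags = kStringFlag; }

    bool IsNull() const   { return flags == kNullFlag; }
    bool IsString() const { return flags == kStringFlag; }
    bool IsInt64() const  { return flags == kInt64Flag; }
    bool IsDouble() const { return flags == kDoubleFlag; }
};

struct SketchMember { SketchValue name; SketchValue value; };
```

On common 32-bit and 64-bit targets this sketch occupies at most 24 bytes per value, regardless of which JSON type it holds.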
  • 35. Move Semantics • Deep copying object/array/string is slow • RapidJSON enforces move semantics
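One way to enforce move semantics, sketched here with a hypothetical `SketchString` class rather than RapidJSON's Value, is to disable copy construction and make assignment take its right-hand side by non-const reference, transferring ownership and leaving the source null. The names and the use of `malloc` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical owning string value.
class SketchString {
public:
    SketchString() : data_(nullptr), length_(0) {}
    explicit SketchString(const char* s) : data_(nullptr), length_(std::strlen(s)) {
        data_ = static_cast<char*>(std::malloc(length_ + 1));
        assert(data_ != nullptr);
        std::memcpy(data_, s, length_ + 1);
    }
    ~SketchString() { std::free(data_); }

    // No deep copies: copy construction is disabled.
    SketchString(const SketchString&) = delete;

    // Assignment steals the buffer from rhs; rhs becomes null afterwards.
    SketchString& operator=(SketchString& rhs) {
        if (this != &rhs) {
            std::free(data_);
            data_ = rhs.data_;
            length_ = rhs.length_;
            rhs.data_ = nullptr;
            rhs.length_ = 0;
        }
        return *this;
    }

    bool IsNull() const { return data_ == nullptr; }
    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t length() const { return length_; }

private:
    char* data_;
    std::size_t length_;
};
```

After `b = a;` the characters have not been copied: `b` owns the original buffer and `a` is null, which is cheap but surprising to users who expect copy semantics.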
  • 36. The Default Allocator • Internally allocates a singly linked list of buffers • Does not free objects (thus FAST!) • Suitable for parsing (creating values consecutively) • Not suitable for DOM manipulation
  • 37. Custom Initial Buffer • User can provide a custom initial buffer – For example, buffer on stack, scratch buffer • The allocator uses that buffer first until it is full • Possible to achieve zero allocation in parsing
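The two allocator slides can be combined into one small sketch. `SketchPoolAllocator` is a hypothetical class, not RapidJSON's allocator: it serves requests from a user-supplied buffer first, then from heap chunks kept in a singly linked list, and `Free()` does nothing. The 8-byte alignment, the default chunk size, and the assumption that the user buffer is itself 8-byte aligned are choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

class SketchPoolAllocator {
    struct alignas(8) Chunk { Chunk* next; std::size_t capacity; std::size_t used; };
    static const std::size_t kAlign = 8;

    Chunk* head_;            // newest heap chunk, linked to older ones
    char* userCur_;          // next free byte in the user buffer
    char* userEnd_;          // end of the user buffer
    std::size_t chunkCapacity_;
    std::size_t heapChunks_;

    bool AddChunk(std::size_t capacity) {
        Chunk* c = static_cast<Chunk*>(std::malloc(sizeof(Chunk) + capacity));
        if (!c) return false;
        c->next = head_; c->capacity = capacity; c->used = 0;
        head_ = c;
        ++heapChunks_;
        return true;
    }

public:
    SketchPoolAllocator(void* userBuffer, std::size_t userSize, std::size_t chunkCapacity = 64 * 1024)
        : head_(nullptr), userCur_(static_cast<char*>(userBuffer)),
          userEnd_(userCur_ + userSize), chunkCapacity_(chunkCapacity), heapChunks_(0) {}
    SketchPoolAllocator(const SketchPoolAllocator&) = delete;
    SketchPoolAllocator& operator=(const SketchPoolAllocator&) = delete;

    // Everything is released at once when the allocator is destroyed.
    ~SketchPoolAllocator() {
        while (head_) { Chunk* next = head_->next; std::free(head_); head_ = next; }
    }

    void* Malloc(std::size_t size) {
        size = (size + kAlign - 1) & ~(kAlign - 1);   // round up to alignment
        if (static_cast<std::size_t>(userEnd_ - userCur_) >= size) {
            void* p = userCur_;                       // served from the user buffer
            userCur_ += size;
            return p;
        }
        if (!head_ || head_->used + size > head_->capacity) {
            if (!AddChunk(size > chunkCapacity_ ? size : chunkCapacity_)) return nullptr;
        }
        void* p = reinterpret_cast<char*>(head_ + 1) + head_->used;
        head_->used += size;
        return p;
    }

    void Free(void*) {}   // intentionally a no-op

    std::size_t HeapChunkCount() const { return heapChunks_; }
};
```

If the whole document fits in the user buffer, `HeapChunkCount()` stays at zero, which is the "zero allocation" case from the slide. The no-op `Free()` is also why this design suits parsing better than repeated DOM edits: memory from replaced values is not reclaimed until the allocator goes away.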
  • 38. Short String Optimization • Many JSON keys are short • Contributor @Kosta-Github submitted a PR to optimize short strings • String layout: Ch* str | SizeType length | unsigned flags • ShortString layout: Ch str[11] | uint8_t x | unsigned flags • Let length = 11 - x, so an 11-character string has x = 0, which doubles as the terminating '\0'
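The length trick is easier to see in code. This sketch models the 12 bytes as a single `char` array so that the last byte can serve both as `x` and as the terminator; `SketchShortString` is a hypothetical type and assumes one-byte characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct SketchShortString {
    enum { kMaxChars = 11 };
    // bytes[0..10] hold the characters; bytes[11] holds x = 11 - length.
    char bytes[kMaxChars + 1];

    bool Set(const char* s, std::size_t len) {
        if (len > kMaxChars) return false;            // too long for inline storage
        std::memcpy(bytes, s, len);
        if (len < kMaxChars) bytes[len] = '\0';       // explicit terminator
        bytes[kMaxChars] = static_cast<char>(kMaxChars - len);
        return true;
    }

    // When length == 11, bytes[11] == 0 and acts as the terminator.
    std::size_t Length() const { return kMaxChars - static_cast<unsigned char>(bytes[kMaxChars]); }
    const char* c_str() const { return bytes; }
};
```

Storing `11 - length` instead of `length` is what lets a full 11-character string remain null-terminated without spending an extra byte.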
  • 39. SIMD Optimization • Using SSE2/SSE4 to skip whitespace (space, tab, LF, CR) • Each iteration compares 16 chars against the 4 whitespace chars • Fast for JSON with indentation • Visual C++ 2010 32-bit test, Skip 1M whitespace (ms): – strlen() (for reference): 752 – strspn(): 3011 – RapidJSON (no SIMD): 1349 – RapidJSON (SSE2): 170 – RapidJSON (SSE4): 102
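A rough sketch of the SSE2 variant follows. It is not RapidJSON's implementation: the function name `SkipWhitespace`, the explicit `end` pointer, the unaligned loads, and the scalar fallback are choices made for this example. Each loop iteration compares 16 input bytes against the four whitespace characters and uses the resulting bit mask to find the first byte that is not whitespace.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

inline bool IsJsonWhitespace(char c) {
    return c == ' ' || c == '\n' || c == '\r' || c == '\t';
}

// Returns a pointer to the first non-whitespace character in [p, end),
// or end if the whole range is whitespace.
inline const char* SkipWhitespace(const char* p, const char* end) {
#ifdef SKETCH_HAS_SSE2
    const __m128i space = _mm_set1_epi8(' ');
    const __m128i lf    = _mm_set1_epi8('\n');
    const __m128i cr    = _mm_set1_epi8('\r');
    const __m128i tab   = _mm_set1_epi8('\t');
    while (end - p >= 16) {
        const __m128i chunk = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
        // 16 bytes x 4 comparisons; each lane becomes 0xFF where it matches.
        __m128i ws = _mm_cmpeq_epi8(chunk, space);
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, lf));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, cr));
        ws = _mm_or_si128(ws, _mm_cmpeq_epi8(chunk, tab));
        // One bit per byte; set bits mark non-whitespace after inversion.
        const unsigned nonWs = ~static_cast<unsigned>(_mm_movemask_epi8(ws)) & 0xFFFFu;
        if (nonWs != 0) {
            unsigned i = 0;
            while (((nonWs >> i) & 1u) == 0) ++i;   // index of the lowest set bit
            return p + i;
        }
        p += 16;
    }
#endif
    // Scalar tail, and full fallback on targets without SSE2.
    while (p != end && IsJsonWhitespace(*p)) ++p;
    return p;
}
```

The gain comes from indented JSON, where long runs of spaces and newlines are consumed 16 bytes at a time instead of one.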
  • 40. Integer-to-String Optimization • Integer-To-String conversion is simple – E.g. 123 -> “123” • But standard library is quite slow – E.g. sprintf(), _itoa(), etc. • Tried various implementations
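The slide does not say which implementations were tried. One widely used technique, shown here only as an illustration, is a lookup table of two-digit pairs, which halves the number of divisions compared with emitting one digit at a time. `U32ToString` and `kDigitPairs` are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// "00", "01", ..., "99" packed back to back.
static const char kDigitPairs[201] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

// Writes the decimal form of value into buffer (at least 11 bytes) and
// returns a pointer to the terminating '\0'.
inline char* U32ToString(uint32_t value, char* buffer) {
    char temp[10];              // a uint32_t has at most 10 decimal digits
    char* p = temp + 10;        // digits are produced from least significant
    while (value >= 100) {
        const unsigned i = (value % 100) * 2;
        value /= 100;
        *--p = kDigitPairs[i + 1];
        *--p = kDigitPairs[i];
    }
    if (value < 10) {
        *--p = static_cast<char>('0' + value);
    } else {
        const unsigned i = value * 2;
        *--p = kDigitPairs[i + 1];
        *--p = kDigitPairs[i];
    }
    const std::size_t n = static_cast<std::size_t>(temp + 10 - p);
    std::memcpy(buffer, p, n);
    buffer[n] = '\0';
    return buffer + n;
}
```

Unlike `sprintf()`, this avoids format-string parsing and locale handling, which is where much of the standard library's overhead tends to come from.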
  • 42. Double-to-String Optimization • Double-to-string conversion is very slow – E.g. 3.14 -> “3.14” • Grisu2 is a fast algorithm for this[3] – Correct results in 100% of cases – Optimal results in >99% of cases • Google V8 has an implementation – https://github.com/floitsch/double-conversion – But not header-only, so…
  • 43. My Grisu2 Implementation • https://github.com/miloyip/dtoa-benchmark • Visual C++ 2013 on Windows 64-bit:
  • 45. Tradeoff: User-Friendliness • DOM only supports move semantics – Cannot copy-construct Value/Document – So, cannot pass them by value or put them in containers • DOM APIs need an allocator as a parameter, e.g. numbers.PushBack(1, allocator); • Users need to manage the life cycle of the allocator and the values it allocates
  • 46. Pausing in Parsing • Cannot pause parsing and resume it later – Not all parsing states are kept explicitly – Doing so would be much slower • Typical Scenario – Streaming JSON from network – Don’t want to store the JSON in memory • Solution – Parse in a separate thread – Block the input stream to pause
  • 48. Origin • RapidJSON is my hobby project in 2011 • Also my first open source project • First version released in 2 weeks
  • 49. Community • Google Code helped with tracking bugs, but made it hard to involve contributors • After migrating to GitHub in 2014 – Community much more active – Issue tracking more powerful – Pull requests ease contributions
  • 50. Future • Official Release under Tencent – 1.0 beta → 1.0 release (after 3+ years…) – Can work on it in working time – Involve marketing and other colleagues – Establish Community in China • Post-1.0 Features – Easy DOM API (but slower) – JSON Schema – Relaxed JSON syntax – Optimization on Object Member Access • Open source our internal projects at Tencent
  • 51. To Establish an Open Source Project • Courage • Start Small • Make Different – Innovative Idea? – Easy to Use? – Good Performance? • Embrace Community • Learn
  • 52. References 1. Stroustrup, Bjarne. The Design and Evolution of C++. Pearson Education India, 1994. 2. Clinger, William D. "How to read floating point numbers accurately." ACM SIGPLAN Notices 25.6 (1990): 92-101. 3. Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM SIGPLAN Notices 45.6 (2010): 233-243.
  • 53. Q&A