The document discusses how an easy-to-use and fast database can have a complicated implementation for developers. It outlines four key areas: 1) Flexible writing schema requires schema merging at read time. 2) Fast reads prune non-covered data chunks through predicate push-down. 3) Loading duplicated data necessitates data deduplication and compaction operations. 4) Quick data deletion still needs data elimination at read time or in the background. The document provides examples to illustrate the tradeoffs between user and developer requirements.
3. Users: We want a Database that is ...
Easy to Use
Fast (Great Performance)
4. Developers: How do we build it?
Can it be simple, or does it have to be complex?
5. There must be trade-offs
A complicated black box is needed to meet the easy-to-use & fast requirements
6. Outline
   Easy-to-Use & Fast for Users        Complicated Implementation for Developers
1  Flexible Writing Schema (No DDL)    Need Schema Merging at Read Time
2  Fast Read                           Prune Non-Covered Data Chunks (Predicate Push-Down)
3  Able to Load Duplicated Data        Need Data Deduplication & Compaction Operations
4  Quick Data Deletion                 No deletion right away, but data must be eliminated at read time or in the background
15. Flexible Writing Schema
Loading Data

Load 1:
weather,location=east temp=82,humidity=67 1465839830100400200
weather,location=west temp=70 1465839830100400200
weather,location=east temp=82,humidity=69
host,state=MA,city=Boston cpu=10 1465839830200400200
host,state=MA,city=Andover cpu=12 1465839830400400200
weather,location=midwest temp=70,humidity=57 1465839830400400200

Load 2:
host,state=MA,city=Boston disk=100 1465839830200500200
host,state=NY,city=New\ York disk=200 1465839830400600200

IOx Storage: Table Chunks (*)

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

(*) Chunk Types: Mutable Buffer, Read Buffer, Object Store (see previous talks)
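The loading step above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not IOx's implementation: `parse_line` and `load` are hypothetical helpers, the parser handles only the subset of line protocol used on the slide (numeric fields, backslash-escaped spaces in tag values), and a chunk is modelled as a plain dict.

```python
import re
from collections import defaultdict


def parse_line(line):
    """Parse one simplified line-protocol record.

    Returns (table, tags, fields, timestamp); timestamp is None when omitted.
    """
    # Split on spaces that are not escaped with a backslash.
    parts = re.split(r"(?<!\\) ", line.strip())
    head, field_part = parts[0], parts[1]
    timestamp = int(parts[2]) if len(parts) > 2 else None
    table, *tag_items = head.split(",")
    tags = {}
    for item in tag_items:
        key, value = item.split("=", 1)
        tags[key] = value.replace("\\ ", " ")
    fields = {}
    for item in field_part.split(","):
        key, value = item.split("=", 1)
        fields[key] = float(value)
    return table, tags, fields, timestamp


def load(lines):
    """Turn one load into one chunk per table (hypothetical chunk model).

    No DDL is involved: a chunk's schema is simply the set of columns
    that appeared in the rows written to it.
    """
    rows_by_table = defaultdict(list)
    for line in lines:
        table, tags, fields, timestamp = parse_line(line)
        rows_by_table[table].append({**tags, **fields, "timestamp": timestamp})
    chunks = {}
    for table, rows in rows_by_table.items():
        columns = []
        for row in rows:
            for name in row:
                if name != "timestamp" and name not in columns:
                    columns.append(name)
        chunks[table] = {"columns": columns + ["timestamp"], "rows": rows}
    return chunks


load_1 = [
    "weather,location=east temp=82,humidity=67 1465839830100400200",
    "weather,location=west temp=70 1465839830100400200",
    "host,state=MA,city=Boston cpu=10 1465839830200400200",
]
load_2 = [r"host,state=NY,city=New\ York disk=200 1465839830400600200"]

print(load(load_1)["host"]["columns"])  # ['state', 'city', 'cpu', 'timestamp']
print(load(load_2)["host"]["columns"])  # ['state', 'city', 'disk', 'timestamp']
```

The two host chunks end up with different schemas (`cpu` versus `disk`), which is exactly the situation the following slides have to resolve at read time.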
18. Flexible Writing Schema → Schema Merging at Read Time
IOx Storage: Different Host Chunk Schema

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

User issues: Read everything from Host
→ IOx merges Chunk Schema at Scan Step

Host’s Chunk 1 (merged schema)
state  city     cpu   disk  timestamp
MA     Boston   10    NULL  2016-06-13T17:43:50.2004002Z
MA     Andover  12    NULL  2016-06-13T17:43:50.4004002Z

Host’s Chunk 2 (merged schema)
state  city      cpu   disk  timestamp
MA     Boston    NULL  100   2016-06-13T17:43:50.5004002Z
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
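A minimal sketch of schema merging at the scan step, assuming a chunk is a dict with a `columns` list and a list of row dicts. The function names `merge_schemas` and `scan` are hypothetical and only illustrate the idea shown on the slide: take the union of the chunk schemas, then pad each row with NULL (`None`) for columns its chunk does not have.

```python
def merge_schemas(chunks):
    """Union of column names across chunks, first-seen order, 'timestamp' last."""
    merged = []
    for chunk in chunks:
        for name in chunk["columns"]:
            if name != "timestamp" and name not in merged:
                merged.append(name)
    return merged + ["timestamp"]


def scan(chunk, schema):
    """Return the chunk's rows padded to the merged schema (missing -> None)."""
    return [tuple(row.get(name) for name in schema) for row in chunk["rows"]]


host_chunk_1 = {
    "columns": ["state", "city", "cpu", "timestamp"],
    "rows": [
        {"state": "MA", "city": "Boston", "cpu": 10,
         "timestamp": "2016-06-13T17:43:50.2004002Z"},
        {"state": "MA", "city": "Andover", "cpu": 12,
         "timestamp": "2016-06-13T17:43:50.4004002Z"},
    ],
}
host_chunk_2 = {
    "columns": ["state", "city", "disk", "timestamp"],
    "rows": [
        {"state": "MA", "city": "Boston", "disk": 100,
         "timestamp": "2016-06-13T17:43:50.5004002Z"},
        {"state": "NY", "city": "New York", "disk": 200,
         "timestamp": "2016-06-13T17:43:50.6004002Z"},
    ],
}

schema = merge_schemas([host_chunk_1, host_chunk_2])
print(schema)  # ['state', 'city', 'cpu', 'disk', 'timestamp']
for chunk in (host_chunk_1, host_chunk_2):
    for row in scan(chunk, schema):
        print(row)
```

Every chunk now produces rows of the same shape, so the outputs can be combined by a union without the user ever declaring a schema.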
19. 2. Fast Read
→ Prune Non-Covered Data Chunks
(Predicate Push-Down)
22. Fast Read → Prune Non-Covered Data Chunks
IOx Storage: Different Host Chunk Schema

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

User issues: Read everything from Host with “disk > 100”
→ IOx prunes Chunk 1: it has no “disk” data, so none of its rows can satisfy “disk > 100”
→ IOx then applies the predicate “disk > 100” to the rows of Chunk 2 and returns 1 row

Host’s Chunk 2
state  city      cpu   disk  timestamp
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
23. Fast Read → Prune Non-Covered Data Chunks

Chunk Scan without Pruning
FilterExec (disk > 100)
  UnionExec
    IOxReadFilterNode (chunk_id = 1)
    IOxReadFilterNode (chunk_id = 2)

Chunk Scan with Pruning
FilterExec (disk > 100)
  IOxReadFilterNode (chunk_id = 2)

Previous IOx Talk: Query Processing in InfluxDB IOx
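Chunk pruning can be illustrated with per-chunk column statistics. The sketch below assumes each chunk carries a (min, max) pair for every column it contains; the slides do not say which statistics IOx keeps, so `column_stats`, `may_contain_greater_than`, and `read_greater_than` are hypothetical names for illustration only.

```python
def column_stats(rows):
    """Per-column (min, max) over the values present in a chunk's rows."""
    stats = {}
    for row in rows:
        for name, value in row.items():
            low, high = stats.get(name, (value, value))
            stats[name] = (min(low, value), max(high, value))
    return stats


def may_contain_greater_than(stats, column, threshold):
    """False only when no row of the chunk can satisfy `column > threshold`."""
    if column not in stats:
        return False  # the column is entirely NULL in this chunk
    return stats[column][1] > threshold


def read_greater_than(chunks, column, threshold):
    """Prune chunks first, then filter the rows of the surviving chunks."""
    scanned, result = [], []
    for chunk in chunks:
        if not may_contain_greater_than(chunk["stats"], column, threshold):
            continue  # pruned: this chunk is never scanned
        scanned.append(chunk["id"])
        for row in chunk["rows"]:
            if row.get(column) is not None and row[column] > threshold:
                result.append(row)
    return scanned, result


rows_1 = [
    {"state": "MA", "city": "Boston", "cpu": 10},
    {"state": "MA", "city": "Andover", "cpu": 12},
]
rows_2 = [
    {"state": "MA", "city": "Boston", "disk": 100},
    {"state": "NY", "city": "New York", "disk": 200},
]
chunks = [
    {"id": 1, "rows": rows_1, "stats": column_stats(rows_1)},
    {"id": 2, "rows": rows_2, "stats": column_stats(rows_2)},
]

scanned, result = read_greater_than(chunks, "disk", 100)
print(scanned)  # [2]
print(result)   # [{'state': 'NY', 'city': 'New York', 'disk': 200}]
```

Pruning decides which chunks to open; the row-level filter is still needed afterwards, because a chunk that survives pruning may contain rows that do not match (here, the Boston row with `disk = 100`).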
24. 3. Able to Load Duplicated Data
→ Deduplicate & Compact Operators
25. Able to Load Duplicated Data
Loading Data: rows with the same tag values are duplicates

Load 1:
weather,location=east temp=82,humidity=67 1465839830100400200
weather,location=west temp=70 1465839830100400200
weather,location=east temp=82,humidity=69
host,state=MA,city=Boston cpu=10 1465839830200400200
host,state=MA,city=Andover cpu=12 1465839830400400200
weather,location=midwest temp=70,humidity=57 1465839830400400200

Load 2:
host,state=MA,city=Boston disk=100 1465839830200500200
host,state=NY,city=New\ York disk=200 1465839830400600200

IOx Storage: Table Chunks

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z
30. Able to Load Duplicated Data → Deduplicate at Read Time
IOx Storage: Table Chunks

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

User issues: Read Weather Data
→ 3 rows returned
location  temp  humidity  timestamp
east      82    69        2016-06-13T17:43:50.5004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

User issues: Read Host Data
→ 3 rows returned
state  city      cpu   disk  timestamp
MA     Boston    10    100   2016-06-13T17:43:50.5004002Z
MA     Andover   12    NULL  2016-06-13T17:43:50.4004002Z
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
31. Able to Load Duplicated Data → Deduplicate at Read Time

Chunk Scan without Deduplication
UnionExec
  IOxReadFilterNode (chunk_id = 1)
  IOxReadFilterNode (chunk_id = 2)

Chunk Scan with Deduplication (*)
DeduplicateExec
  SortPreservingMerge
    UnionExec
      SortExec (optional), Sort_key: tags
        IOxReadFilterNode (chunk_id = 1)
      SortExec (optional), Sort_key: tags
        IOxReadFilterNode (chunk_id = 2)

(*) Previous IOx Talk: Query Processing in InfluxDB IOx
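The effect of the deduplication plan can be approximated in one small function. This is a sketch under the slides' simplification that rows with the same tag values are duplicates, and it assumes that for each field the most recently loaded non-null value wins (which matches the results shown earlier). `deduplicate` is a hypothetical name; it collapses the sort, merge, and deduplicate steps into a single in-memory pass rather than modelling the separate operators.

```python
from itertools import groupby


def deduplicate(rows, tag_columns):
    """Collapse rows sharing the same tag values; later non-null values win."""

    def key(row):
        return tuple(row[tag] for tag in tag_columns)

    # sorted() is stable, so rows with equal keys keep their load order.
    ordered = sorted(rows, key=key)
    result = []
    for _, group in groupby(ordered, key=key):
        merged = {}
        for row in group:
            merged.update({k: v for k, v in row.items() if v is not None})
        result.append(merged)
    return result


# Rows of Host's Chunk 1 followed by Host's Chunk 2, in load order.
host_rows = [
    {"state": "MA", "city": "Boston", "cpu": 10,
     "timestamp": "2016-06-13T17:43:50.2004002Z"},
    {"state": "MA", "city": "Andover", "cpu": 12,
     "timestamp": "2016-06-13T17:43:50.4004002Z"},
    {"state": "MA", "city": "Boston", "disk": 100,
     "timestamp": "2016-06-13T17:43:50.5004002Z"},
    {"state": "NY", "city": "New York", "disk": 200,
     "timestamp": "2016-06-13T17:43:50.6004002Z"},
]

for row in deduplicate(host_rows, ["state", "city"]):
    print(row)
```

The four stored rows come back as three: the two Boston rows are merged into one row carrying both `cpu = 10` and `disk = 100`.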
33. Able to Load Duplicated Data → Compact from time to time
IOx Storage: before compaction

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

IOx Storage: after compaction

Weather’s Chunk
location  temp  humidity  timestamp
east      82    69        2016-06-13T17:43:50.5004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk
state  city      cpu   disk  timestamp
MA     Boston    10    100   2016-06-13T17:43:50.5004002Z
MA     Andover   12    NULL  2016-06-13T17:43:50.4004002Z
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
34. Able to Load Duplicated Data → Compact from time to time
● Compaction Operation ≃
Deduplication Operation (= Chunk Scan) +
Create a new chunk to store deduplicated data +
Drop old chunks
● Compaction runs in the background based on compaction policy
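The three steps of the compaction operation can be sketched as follows. This is an illustrative model only: chunks are plain dicts, `compact` is a hypothetical function, the new chunk id is an arbitrary choice, and the compaction policy that decides when to run it is left out. Rows are assumed to hold only the fields that were actually written.

```python
def compact(chunks, tag_columns):
    """Deduplicate across chunks, store the result in a new chunk, drop the old ones.

    `chunks` must be ordered oldest first so that newer values overwrite older ones.
    Returns the new chunk list that replaces the old one.
    """
    merged = {}
    for chunk in chunks:
        for row in chunk["rows"]:
            key = tuple(row[tag] for tag in tag_columns)
            # Step 1: deduplicate (same tags -> one row, newer fields win).
            merged.setdefault(key, {}).update(row)
    # Step 2: create a new chunk holding the deduplicated rows.
    new_chunk = {"id": max(c["id"] for c in chunks) + 1,
                 "rows": list(merged.values())}
    # Step 3: drop the old chunks by returning only the new one.
    return [new_chunk]


host_chunks = [
    {"id": 1, "rows": [
        {"state": "MA", "city": "Boston", "cpu": 10,
         "timestamp": "2016-06-13T17:43:50.2004002Z"},
        {"state": "MA", "city": "Andover", "cpu": 12,
         "timestamp": "2016-06-13T17:43:50.4004002Z"},
    ]},
    {"id": 2, "rows": [
        {"state": "MA", "city": "Boston", "disk": 100,
         "timestamp": "2016-06-13T17:43:50.5004002Z"},
        {"state": "NY", "city": "New York", "disk": 200,
         "timestamp": "2016-06-13T17:43:50.6004002Z"},
    ]},
]

host_chunks = compact(host_chunks, ["state", "city"])
print(len(host_chunks), len(host_chunks[0]["rows"]))  # 1 3
```

After compaction a read touches one chunk with no duplicates left, so the deduplication work no longer has to be repeated on every query.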
35. 4. Quick Data Deletion
→ No Online Deletion
→ Eliminate data at Read Time &
Actual Deletion during Compaction
36. Quick Data Deletion
● User issues a Delete
○ Nothing is deleted right away (a classic technique in analytic/big data systems)
○ The Delete Predicate is stored as a Tombstone
● At Read Time (Chunk Scan)
○ The Tombstone is applied at the Scan step so that deleted data is not returned
● During Compaction
○ The newly created chunk does not include deleted data, because it is built from the result of the chunk scan
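The delete flow described above can be sketched with a toy table. Everything here is hypothetical and simplified: the `Table` class, equality-only predicates stored as `(column, value)` pairs, and tombstones attached to the chunks that exist when the delete is issued. Deduplication is left out to keep the focus on deletes.

```python
class Table:
    """Toy table: a list of chunks, each with rows and tombstones."""

    def __init__(self):
        self.chunks = []

    def write(self, rows):
        """Each load becomes a new chunk with no tombstones."""
        self.chunks.append({"rows": list(rows), "tombstones": []})

    def delete(self, column, value):
        """Store the delete predicate as a tombstone; no row is touched."""
        for chunk in self.chunks:
            chunk["tombstones"].append((column, value))

    def scan(self):
        """Return rows that are not matched by a tombstone of their chunk."""
        result = []
        for chunk in self.chunks:
            for row in chunk["rows"]:
                deleted = any(row.get(col) == val
                              for col, val in chunk["tombstones"])
                if not deleted:
                    result.append(row)
        return result

    def compact(self):
        """Rewrite the scan result into one new chunk; deleted rows are gone."""
        self.chunks = [{"rows": self.scan(), "tombstones": []}]


host = Table()
host.write([{"state": "MA", "city": "Boston", "cpu": 10},
            {"state": "MA", "city": "Andover", "cpu": 12}])
host.write([{"state": "MA", "city": "Boston", "disk": 100},
            {"state": "NY", "city": "New York", "disk": 200}])

host.delete("city", "Boston")
print(sum(len(c["rows"]) for c in host.chunks))  # 4 rows are still stored
print([row["city"] for row in host.scan()])      # ['Andover', 'New York']

host.compact()
print(sum(len(c["rows"]) for c in host.chunks))  # 2 rows remain after compaction
```

The delete itself is cheap because it only records a predicate; the cost is paid later, a little on every scan and once more when compaction rewrites the data.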
38. Quick Data Deletion → Add Tombstone at Delete Time
IOx Storage: before delete

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

IOx Storage: after delete from host “city = Boston”

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1 (Tombstone: “city = Boston”)
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2 (Tombstone: “city = Boston”)
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z
40. Quick Data Deletion → Eliminate Data at Read Time
IOx Storage: Host Chunks with Tombstones

Host’s Chunk 1 (Tombstone: “city = Boston”)
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2 (Tombstone: “city = Boston”)
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

User issues: Read everything from Host
→ IOx applies tombstones to eliminate data
→ 2 rows returned
state  city      cpu   disk  timestamp
MA     Andover   12    NULL  2016-06-13T17:43:50.4004002Z
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
41. Quick Data Deletion → Eliminate Data at Read Time

Chunk Scan without Delete
DeduplicateExec
  SortPreservingMerge
    UnionExec
      SortExec (optional), Sort_key: tags
        IOxReadFilterNode (chunk_id = 1)
      SortExec (optional), Sort_key: tags
        IOxReadFilterNode (chunk_id = 2)

Chunk Scan with Delete (*) (city = Boston)
DeduplicateExec
  SortPreservingMerge
    UnionExec
      SortExec (optional), Sort_key: tags
        FilterExec (city = Boston)
          IOxReadFilterNode (chunk_id = 1)
      SortExec (optional), Sort_key: tags
        FilterExec (city = Boston)
          IOxReadFilterNode (chunk_id = 2)

(*) Previous IOx Talk: Query Processing in InfluxDB IOx
43. Quick Data Deletion → Compact from time to time
IOx Storage: before compaction

Weather’s Chunk 1
location  temp  humidity  timestamp
east      82    67        2016-06-13T17:43:50.1004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
east      82    69        2016-06-13T17:43:50.5000000Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk 1 (Tombstone: “city = Boston”)
state  city     cpu  timestamp
MA     Boston   10   2016-06-13T17:43:50.2004002Z
MA     Andover  12   2016-06-13T17:43:50.4004002Z

Host’s Chunk 2 (Tombstone: “city = Boston”)
state  city      disk  timestamp
MA     Boston    100   2016-06-13T17:43:50.5004002Z
NY     New York  200   2016-06-13T17:43:50.6004002Z

IOx Storage: after compaction (deduplication + delete)

Weather’s Chunk
location  temp  humidity  timestamp
east      82    69        2016-06-13T17:43:50.5004002Z
west      70    NULL      2016-06-13T17:43:50.1004002Z
midwest   70    57        2016-06-13T17:43:50.4004002Z

Host’s Chunk
state  city      cpu   disk  timestamp
MA     Andover   12    NULL  2016-06-13T17:43:50.4004002Z
NY     New York  NULL  200   2016-06-13T17:43:50.6004002Z
44. Summary:
   Easy-to-Use & Fast for Users        Complicated Implementation for Developers
1  Flexible Writing Schema (No DDL)    Need Schema Merging at Read Time
2  Fast Read                           Prune Non-Covered Data Chunks (Predicate Push-Down)
3  Able to Load Duplicated Data        Need Data Deduplication & Compaction Operations
4  Quick Data Deletion                 No deletion right away, but data must be eliminated at read time or in the background
Simplicity is the Ultimate Sophistication
But InfluxData is committed to bringing Simplicity to Users