Storing 16 Bytes at Scale
Fabian Reinartz
Software Engineer
@fabxc
2,981 views

An overview of the new time series storage engine powering Prometheus 2.0

Published in: Technology
  1. Storing 16 Bytes at Scale (Fabian Reinartz, Software Engineer, @fabxc)
  2. Time Series [diagram; axis: time]
  3. Time Series [diagram; axes: time, series]
  4. Time Series:
     requests_total{path="/status", method="GET", instance="10.0.0.1:80"}
     requests_total{path="/status", method="POST", instance="10.0.0.3:80"}
     requests_total{path="/", method="GET", instance="10.0.0.2:80"}
     ...
  5. Time Series: same series; Select: requests_total
  6. Time Series: same series; Select: requests_total{method="GET"}
  7. Time Series [diagram; axes: time, series] (slides 8–10 repeat this diagram)
  11. Order of Scale: 5 million active time series, 30-second scrape interval, 1 month of retention, 2,000–15,000 targets ⇒ 166,000 samples/second, 432 billion samples
  12. Order of Scale: the same numbers, plus 8-byte timestamp + 8-byte value ⇒ 7 TB on disk. Seems OK.
  13. Order of Scale: 5 million active time series, 30-second scrape interval, 6 months of retention, 2,000–15,000 targets ⇒ 166,000 samples/second, 2,592 billion samples; 8-byte timestamp + 8-byte value ⇒ 42 TB on disk
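The raw-size arithmetic on these slides can be checked directly; a quick sketch (approximating a month as 30 days):

```python
# Back-of-the-envelope check of the "Order of Scale" numbers.
ACTIVE_SERIES = 5_000_000
SCRAPE_INTERVAL_S = 30
BYTES_PER_SAMPLE = 16          # 8-byte timestamp + 8-byte value
MONTH_S = 30 * 24 * 3600       # one month approximated as 30 days

samples_per_second = ACTIVE_SERIES / SCRAPE_INTERVAL_S        # ~166,667
samples_per_month = samples_per_second * MONTH_S              # ~432 billion
raw_tb_1_month = samples_per_month * BYTES_PER_SAMPLE / 1e12  # ~6.9 TB
raw_tb_6_months = 6 * raw_tb_1_month                          # ~41.5 TB

print(round(samples_per_second), round(samples_per_month / 1e9),
      round(raw_tb_1_month, 1), round(raw_tb_6_months, 1))
```

The slide rounds 6.9 TB up to 7 TB and 41.5 TB up to 42 TB.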
  14. Sample Compression – Timestamps: 1496163646 1496163676 1496163706 1496163735 1496163765
  15. Sample Compression – Timestamps, Δ (deltas): 1496163646, then +30 +30 +29 +30
  16. Sample Compression – Timestamps, ΔΔ (delta-of-deltas): 1496163646, then +30, then +0 -1 +1
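The delta-of-delta step above can be sketched as follows (illustrative Python; the actual encoder in prometheus/tsdb is Go and additionally bit-packs the results into variable-width fields):

```python
def delta_of_delta(timestamps):
    """Encode timestamps as: first raw value, first delta, then delta-of-deltas."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], dods

# The example timestamps from the slide; regular 30s scrapes with slight jitter.
ts = [1496163646, 1496163676, 1496163706, 1496163735, 1496163765]
print(delta_of_delta(ts))  # (1496163646, 30, [0, -1, 1])
```

Because scrapes happen at a near-constant interval, the delta-of-deltas are tiny numbers near zero, which compress far better than raw 8-byte timestamps.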
  17. Sample Compression – Values. Source: the Facebook Gorilla paper, http://www.vldb.org/pvldb/vol8/p1816-teller.pdf (XOR compression of consecutive float64 bit patterns)
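The value compression from the Gorilla paper XORs the IEEE-754 bit patterns of consecutive float64 samples; slowly changing values yield mostly-zero XOR results that bit-pack well. A minimal illustration of the XOR step only (function names are mine; the full scheme also encodes leading/trailing zero counts):

```python
import struct

def float_bits(x: float) -> int:
    """Return the 64-bit IEEE-754 bit pattern of a float as an int."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def xor_stream(values):
    """XOR each value's bit pattern with its predecessor's."""
    bits = [float_bits(v) for v in values]
    return [b ^ a for a, b in zip(bits, bits[1:])]

# Identical consecutive samples XOR to 0 (a single control bit in Gorilla);
# nearby values differ only in a few mantissa bits.
print(xor_stream([12.0, 12.0, 12.0, 13.0]))
```

Counters and gauges that repeat or drift slowly are the common case in monitoring, which is why this averages out to far less than 8 bytes per value.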
  18. Sample Compression: raw 16 bytes/sample; compressed 1.37 bytes/sample (credit: @beorn7, @dgryski)
  19. Order of Scale: 5 million active time series, 30-second scrape interval, 1 month of retention, 2,000–15,000 targets ⇒ 166,000 samples/second, 432 billion samples; 8-byte timestamp + 8-byte value ⇒ 7 TB on disk raw, 0.8 TB compressed
  20. Storing 16 Bytes at Scale (Fabian Reinartz, Software Engineer, CoreOS, @fabxc); the "16" in the title is overlaid with "1.37"
  21. Time Series – Chunks [diagram; axes: time, series]: 1 file per series, filled with chunks (repeated on slide 22)
  23. Churn [diagram; axes: time, series] (repeated on slide 24)
  25. Order of Scale – with Churn: 5 million active time series, 150 million total time series, 30-second scrape interval, 1 month of retention ⇒ 166,000 samples/second, 432 billion samples
  26. Churn [diagram; axes: time, series]
  27. Blocks [diagram; axes: time, series] (repeated on slides 28–29)
  30. Blocks – on-disk layout:
      data
      ├── 01BX6V6H5JQN8RWVXM5ABYXY9N
      │   ├── chunks
      │   │   ├── 000001
      │   │   ├── 000002
      │   │   └── 000003
      │   ├── index
      │   ├── meta.json
      │   └── tombstones
      ├── 01BX6V6TY06G5MFQ0GPH7EMXRH
      │   ├── chunks
      │   │   └── 000001
      │   ├── index
      │   ├── meta.json
      │   └── tombstones
      └── wal
          ├── 000004
          ├── 000005
          └── 000006
  31. Compaction [diagram; axes: time, series] (repeated on slide 32)
  33. Retention [diagram; axes: time, series] (repeated on slide 34)
  35. Tree [diagram: Block 1 holds chunk 1, chunk 2, chunk 3; time axis shows Block 2, Block 3, Block 4, Block 5]
  36. Tree [diagram: Block 1 holds chunk 1 through chunk 5; time axis shows Block 1, Block 2, Block 2, Block 3]
  37. Index: { __name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET" }
  38. Index: the same label set; in each block, a series has a unique ID (here: ID 5)
  39. Index: maintain sorted lists from label pair to IDs, e.g. status="200": 1 2 5 99 1000 1001 1500 1502 500000; method="GET": 2 3 4 5 6 9 10 1502 999999; ...
  40. Index: the sorted lists allow efficient k-way set operations (merge, intersect)
  41. Index: Intersect: (the walk over both lists begins; slide 42 repeats this frame)
  43. Index: Intersect: 2 (slides 44–45 repeat this frame)
  46. Index: Intersect: 2 5 (slides 47–49 repeat this frame)
  50. Index: Intersect: 2 5 1502 (slide 51 repeats this frame)
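The intersection of the two postings lists above can be sketched as a two-pointer merge over sorted IDs (illustrative Python; the real implementation in prometheus/tsdb is Go and generalizes to k lists):

```python
def intersect(a, b):
    """Intersect two ascending-sorted ID lists with a two-pointer walk."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # advance the list with the smaller head
        else:
            j += 1
    return out

# Postings lists from the slide.
status_200 = [1, 2, 5, 99, 1000, 1001, 1500, 1502, 500000]
method_get = [2, 3, 4, 5, 6, 9, 10, 1502, 999999]
print(intersect(status_200, method_get))  # [2, 5, 1502]
```

Because both lists are sorted, the intersection runs in a single linear pass over each list, with no per-element lookups.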
  52. Benchmarks
  53. Benchmarks setup: Kubernetes cluster + dedicated Prometheus nodes; 800 microservice instances + Kubernetes components; 120,000 samples/second; 300,000 active time series; swap out 50% of pods every 10 minutes
  54. Benchmarks (memory)
  55. Benchmarks (CPU)
  56. Benchmarks (disk writes)
  57. Benchmarks (on-disk size)
  58. Benchmarks (query latency)
  59. Try it out! We’re hiring: coreos.com/careers
  60. More: https://fabxc.org/blog/2017-04-10-writing-a-tsdb/ ; https://github.com/prometheus/tsdb
