Writing a TSDB from scratch: performance optimizations
Roman Khavronenko | github.com/hagen1778
Roman Khavronenko
Co-founder of VictoriaMetrics
Software engineer with experience in distributed systems,
monitoring and high-performance services.
https://github.com/hagen1778
https://twitter.com/hagen1778
What is a metric?
What is a metric? Scrape target
> curl http://service:port/metrics
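A scrape target is an HTTP endpoint that returns the current metric values as text. The following is a minimal sketch, assuming the Prometheus text exposition format; the counter `requests_total`, its `path` label and the helper `formatSample` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// requestsTotal is a hypothetical counter used only for this illustration.
var requestsTotal atomic.Uint64

// formatSample renders one sample as `name{label="value"} value`.
// %q is good enough for simple label values; real exposition code
// needs the escaping rules of the format.
func formatSample(name, labelName, labelValue string, value uint64) string {
	return fmt.Sprintf("%s{%s=%q} %d\n", name, labelName, labelValue, value)
}

// metricsHandler writes the current value of the counter on every scrape.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, formatSample("requests_total", "path", "/api", requestsTotal.Load()))
}

func main() {
	requestsTotal.Add(42)
	fmt.Print(formatSample("requests_total", "path", "/api", requestsTotal.Load()))
	// To expose it over HTTP, register the handler and start a server:
	//   http.HandleFunc("/metrics", metricsHandler)
	//   http.ListenAndServe(":8080", nil)
}
```

Each scrape returns the current value, so the collector only needs to poll the endpoint at a fixed interval.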
Collecting metrics
Delivering collected metrics to TSDB
Workload pattern for TSDB: writes
Workload pattern for TSDB: reads
Workload pattern for TSDB
● TSDBs process tremendous amounts of data
● They are usually write-heavy applications, optimized for ingestion
● Read load is usually much lower than write load
● Read queries are sporadic and unpredictable
How to deal with such workload?
System design tailored to time series data:
1. Log-structured merge (LSM) data structure
2. Data for each column is stored separately
3. Append-only writes
How to deal with such workload?
And some optimizations that are not specific to this design:
1. String interning
2. Function results caching
3. Concurrency limiting for CPU-bound operations
4. sync.Pool for CPU-bound operations
String interning
Store only one unique copy in memory!
String interning: naive implementation
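package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// internStringsMap holds one canonical copy of every string seen so far.
var internStringsMap = make(map[string]string)

// intern returns the canonical copy of s, storing s itself on first sight.
func intern(s string) string {
	m := internStringsMap
	if v, ok := m[s]; ok {
		return v
	}
	m[s] = s
	return s
}

// ptr returns the address of the bytes backing s.
// reflect.StringHeader is deprecated; on Go 1.20+ prefer unsafe.StringData(s).
func ptr(s string) uintptr {
	return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
}

func main() {
	s1 := intern("42")
	s2 := intern(fmt.Sprintf("%d", 42))
	fmt.Println(ptr(s1) == ptr(s2)) // true
}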
String interning: naive implementation
1. A plain Go map isn't safe for concurrent use
2. A map guarded by a lock doesn't scale with the number of CPUs
String interning: sync.Map
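var internStringsMap = sync.Map{}

// intern returns the canonical copy of s. LoadOrStore atomically stores s
// on first sight and returns the already stored copy afterwards.
func intern(s string) string {
	m := &internStringsMap
	interned, _ := m.LoadOrStore(s, s)
	return interned.(string)
}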
String interning: sync.Map
sync.Map is optimized for two common use cases:
1. When the entry for a given key is only ever written once but read many times
2. When multiple goroutines read, write, and overwrite entries for disjoint sets of keys.
In these two cases, use of a Map reduces lock contention and improves performance compared to a Go map paired with a separate Mutex or RWMutex.
Interning fits the first case: each string is stored once and then looked up many times.
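The write-once, read-many behaviour can be checked with a small experiment. This is a sketch rather than the talk's code: the helpers `dataPtr` and `internConcurrently` are hypothetical, and `unsafe.StringData` requires Go 1.20 or newer.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"unsafe"
)

var internStringsMap sync.Map

// intern returns the canonical copy of s.
func intern(s string) string {
	interned, _ := internStringsMap.LoadOrStore(s, s)
	return interned.(string)
}

// dataPtr returns the address of the bytes backing s.
func dataPtr(s string) *byte { return unsafe.StringData(s) }

// internConcurrently interns a fresh copy of label from n goroutines and
// reports whether all of them ended up with the same backing bytes.
func internConcurrently(label string, n int) bool {
	if n == 0 {
		return true
	}
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// strings.Clone gives every goroutine its own allocation.
			results[i] = intern(strings.Clone(label))
		}(i)
	}
	wg.Wait()
	for _, r := range results[1:] {
		if dataPtr(r) != dataPtr(results[0]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(internConcurrently(`job="node_exporter"`, 64))
}
```

Only one of the racing goroutines wins the store; every other caller receives the winner's copy, so the duplicates become garbage right away.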
String interning: gotchas
1. The map will grow over time:
a. Rotate maps once in a while
b. Add TTL logic to purge cold entries
2. Sanity-check the arguments:
a. At some point, someone will try to intern a string built from a byte slice, or a substring:
*(*string)(unsafe.Pointer(&b)) or str[:n]
The first shares memory with a buffer that may be overwritten later; the second keeps the whole parent string alive.
b. Make sure to clone received strings:
strings.Clone(s)
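Both gotchas can be handled in a few lines. The following is a minimal sketch under stated assumptions, not the implementation from the talk: the `interner` type, its two-generation rotation scheme and the `demo` helper are hypothetical. It needs Go 1.20+ for `unsafe.StringData` and assumes `rotate` is called from a single goroutine, for example from a `time.Ticker` loop.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"sync/atomic"
	"unsafe"
)

// interner keeps two generations of interned strings. rotate drops the
// older one, so entries that were not requested again are eventually freed.
type interner struct {
	curr atomic.Pointer[sync.Map]
	prev atomic.Pointer[sync.Map]
}

func newInterner() *interner {
	in := &interner{}
	in.curr.Store(&sync.Map{})
	in.prev.Store(&sync.Map{})
	return in
}

func (in *interner) intern(s string) string {
	curr := in.curr.Load()
	if v, ok := curr.Load(s); ok {
		return v.(string)
	}
	// Reuse the copy from the previous generation if the entry is still warm.
	if v, ok := in.prev.Load().Load(s); ok {
		interned, _ := curr.LoadOrStore(v, v)
		return interned.(string)
	}
	// Clone so the map never pins a larger buffer that s points into.
	c := strings.Clone(s)
	interned, _ := curr.LoadOrStore(c, c)
	return interned.(string)
}

// rotate makes the current generation the previous one and starts a new one.
func (in *interner) rotate() {
	in.prev.Store(in.curr.Load())
	in.curr.Store(&sync.Map{})
}

// sameData reports whether a and b share the same backing bytes.
func sameData(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

// demo exercises cloning, a warm entry surviving one rotation, and a cold
// entry being dropped after two rotations without a lookup.
func demo() (cloned, warmKept, coldDropped bool) {
	in := newInterner()
	big := strings.Repeat("x", 1<<20) + "instance"
	sub := big[1<<20:] // substring that keeps the 1 MiB parent alive
	a := in.intern(sub)
	cloned = a == sub && !sameData(a, sub)

	in.rotate()
	warmKept = sameData(a, in.intern("instance"))

	in.rotate()
	in.rotate()
	coldDropped = !sameData(a, in.intern(sub))
	return
}

func main() {
	fmt.Println(demo())
}
```

Rotation is cheaper than per-entry TTL tracking: a lookup only touches two maps, and purging is a pointer swap. The price is that a cold entry can live for up to two rotation periods.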
String interning: summary
● We use string interning for storing time series metadata (aka labels)
● It helps to reduce memory usage during metadata parsing
● Interning works best for read-intensive workloads with a limited number of variants and a high hit rate
Function results caching
Function results caching: relabeling
Function results caching: caching Transformer
type Transformer struct {
	m             sync.Map
	transformFunc func(s string) string
}

func (t *Transformer) Transform(s string) string {
	v, ok := t.m.Load(s)
	if ok {
		// Fast path - the transformed s is found in the cache.
		return v.(string)
	}
	// Slow path - transform s and store it in the cache.
	sTransformed := t.transformFunc(s)
	t.m.Store(s, sTransformed)
	return sTransformed
}
Function results caching: example
// SanitizeName replaces unsupported by Prometheus chars
// in metric names and label names with _.
func SanitizeName(name string) string {
	return promSanitizer.Transform(name)
}

var promSanitizer = NewTransformer(func(s string) string {
	return unsupportedPromChars.ReplaceAllString(s, "_")
})
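The slides do not show `NewTransformer` or `unsupportedPromChars`. The sketch below fills them in so the example can be run end to end. Both definitions are assumptions, as is the `slowPathCalls` counter, which exists only to make the cache effect visible.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
	"sync/atomic"
)

type Transformer struct {
	m             sync.Map
	transformFunc func(s string) string
}

// NewTransformer is assumed here; the slides only show its call site.
func NewTransformer(f func(s string) string) *Transformer {
	return &Transformer{transformFunc: f}
}

func (t *Transformer) Transform(s string) string {
	if v, ok := t.m.Load(s); ok {
		return v.(string) // fast path: cached result
	}
	sTransformed := t.transformFunc(s) // slow path: run the transform
	t.m.Store(s, sTransformed)
	return sTransformed
}

// unsupportedPromChars is an assumed pattern: every character outside
// [a-zA-Z0-9_:] is treated as unsupported.
var unsupportedPromChars = regexp.MustCompile(`[^a-zA-Z0-9_:]`)

// slowPathCalls counts how many times the regex actually ran.
var slowPathCalls atomic.Int64

var promSanitizer = NewTransformer(func(s string) string {
	slowPathCalls.Add(1)
	return unsupportedPromChars.ReplaceAllString(s, "_")
})

func SanitizeName(name string) string {
	return promSanitizer.Transform(name)
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(SanitizeName("http.request-duration"))
	}
	fmt.Println("regex executions:", slowPathCalls.Load())
}
```

Two goroutines that miss the cache at the same moment will both run the transform. That is harmless for a pure function, and it keeps the fast path to a single `Load`.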
Function results caching: summary
● Saves CPU time at the cost of increased memory usage
● Works best for heavy usage of string transforms, regex matching, etc.
● And when the number of distinct arguments is limited
● Doesn't work well when the number of distinct inputs is unbounded or inconsistent, as in query processing
Limiting concurrency for CPU-bound load
Volatile number of scrape targets
Spikes in ingestion stream
Limiting concurrency for CPU-intensive operations
+ Makes the system more stable and efficient
+ Helps to control memory usage during load spikes (which are expected in monitoring)
+ Improves the processing speed of each goroutine by reducing the number of context switches
- The downside is complexity: it is easy to make a mistake and end up with a deadlock or inefficient resource utilization.
Limited concurrency: workers
var concurrencyLimit = runtime.NumCPU()

func main() {
	workCh := make(chan work, concurrencyLimit*2)
	// Start a fixed number of workers; each one pulls tasks from workCh.
	for i := 0; i < concurrencyLimit; i++ {
		go func() {
			for {
				processData(<-workCh)
			}
		}()
	}
}
Limited concurrency: workers
+ Workers could have scoped buffers, metrics, etc.
- Code becomes complicated: start and stop procedures for workers
- Additional synchronization to distribute work via channels
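The start and stop procedures mentioned above can be sketched as follows. This is an illustration, not code from the talk: `runWorkers`, the `work` type with its single field, and the summing stand-in for `processData` are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// work is a hypothetical unit of work.
type work struct{ n int64 }

// runWorkers starts a fixed number of workers, feeds them all items,
// shuts them down and returns the sum of the processed values.
func runWorkers(workers int, items []work) int64 {
	var sum atomic.Int64
	workCh := make(chan work, workers*2)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The loop ends once workCh is closed and drained.
			for w := range workCh {
				sum.Add(w.n) // stand-in for processData(w)
			}
		}()
	}

	for _, w := range items {
		workCh <- w // blocks when all workers are busy and the buffer is full
	}
	close(workCh) // stop procedure: no more work will arrive
	wg.Wait()     // wait until in-flight work is finished
	return sum.Load()
}

func main() {
	items := make([]work, 100)
	for i := range items {
		items[i] = work{n: int64(i + 1)}
	}
	fmt.Println(runWorkers(runtime.NumCPU(), items))
}
```

Closing the channel is the only stop signal the workers need, but the producer must be the one to close it, and only after the last send.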
Limited concurrency: channel
var concurrencyLimitCh = make(chan struct{}, runtime.NumCPU())

// This function is CPU-bound and may allocate a lot of memory.
// We limit the number of concurrent calls to limit memory
// usage under high load without sacrificing the performance.
func processData(src, dst []byte) error {
	// Acquire a slot; blocks while runtime.NumCPU() calls are already running.
	concurrencyLimitCh <- struct{}{}
	defer func() {
		<-concurrencyLimitCh // release the slot
	}()
	// heavy processing...
	return nil
}
Limited concurrency: summary
● Works best for CPU-bound operations
● Bounds resource usage: a fixed number of goroutines work at full speed instead of wasting CPU on context switches
● Helps to prevent excessive memory usage during load spikes
● Do not apply limiting to IO-bound (disk, network) operations
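The effect of the channel-based limiter can be observed directly by tracking how many goroutines are inside the guarded section at once. This is a sketch under stated assumptions; `runLimited` and `busyWork` are hypothetical helpers, not part of the talk.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// busyWork is a stand-in for a CPU-bound operation.
func busyWork() uint64 {
	var x uint64
	for i := uint64(0); i < 100_000; i++ {
		x += i * i
	}
	return x
}

// runLimited starts `calls` goroutines but lets at most `limit` of them
// execute busyWork at the same time. It returns the highest number of
// goroutines observed inside the guarded section.
func runLimited(limit, calls int) int64 {
	sem := make(chan struct{}, limit)
	var running, maxRunning atomic.Int64

	var wg sync.WaitGroup
	for i := 0; i < calls; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it on return

			n := running.Add(1)
			// Record the high-water mark.
			for {
				m := maxRunning.Load()
				if n <= m || maxRunning.CompareAndSwap(m, n) {
					break
				}
			}
			busyWork()
			running.Add(-1)
		}()
	}
	wg.Wait()
	return maxRunning.Load()
}

func main() {
	limit := runtime.NumCPU()
	fmt.Printf("limit=%d, max concurrent=%d\n", limit, runLimited(limit, 1000))
}
```

All goroutines are still created, so the limiter bounds CPU and working memory rather than the number of goroutines. Waiting goroutines are parked on the channel and cost only their stacks.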
sync.Pool for CPU bound operations
sync.Pool is widely used in VictoriaMetrics (VM)
grep -r "sync.Pool" ./app ./lib | wc -l
118
grep -r "bytesutil.ByteBufferPool" ./app ./lib | wc -l
34
sync.Pool for CPU-bound operations in one thread
● Everything is processed on a single CPU core
● No object stealing
● Fewer objects allocated, better pool utilization
● Lower GC pressure
sync.Pool for synchronous processing
● An object is retrieved, used and released by different goroutines
● Goroutines are likely to be scheduled onto different threads
● Object stealing is likely
sync.Pool for IO-bound operations
● Objects retrieved from sync.Pool are held for the duration of IO operations
● IO operations are slow and sporadic
● So sync.Pool can allocate a large number of objects, which results in uncontrolled memory usage
● Higher pressure on GC
sync.Pool - lib/bytesbuffer
type ByteBufferPool struct {
	p sync.Pool
}

var (
	// Verify ByteBuffer implements the given interfaces.
	_ io.Writer           = &ByteBuffer{}
	_ fs.MustReadAtCloser = &ByteBuffer{}
	_ io.ReaderFrom       = &ByteBuffer{}
)
// Get returns a buffer from the pool, or a new one if the pool is empty.
func (bbp *ByteBufferPool) Get() *ByteBuffer {
	bbv := bbp.p.Get()
	if bbv == nil {
		return &ByteBuffer{}
	}
	return bbv.(*ByteBuffer)
}

// Put resets bb and returns it to the pool.
func (bbp *ByteBufferPool) Put(bb *ByteBuffer) {
	bb.Reset()
	bbp.p.Put(bb)
}
sync.Pool - lib/bytesbuffer: usage
bb := bbPool.Get() // acquire from pool
bb.B, err = DecompressZSTD(bb.B[:0], src)
if err != nil {
	return nil, fmt.Errorf("cannot decompress: %w", err)
}
// unmarshal from buffer to dst
dst, err = unmarshalInt64NearestDelta(dst, bb.B)
bbPool.Put(bb) // release to pool
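The fragments above depend on VictoriaMetrics internals. A self-contained sketch of the same get, use, put cycle could look like this. The minimal `ByteBuffer` type and the `invertSum` function are assumptions made for illustration; `invertSum` stands in for CPU-bound work that needs scratch space.

```go
package main

import (
	"fmt"
	"sync"
)

// ByteBuffer is a minimal stand-in for the real type: a reusable byte slice.
type ByteBuffer struct {
	B []byte
}

// Reset empties the buffer but keeps the allocated capacity.
func (bb *ByteBuffer) Reset() { bb.B = bb.B[:0] }

type ByteBufferPool struct {
	p sync.Pool
}

func (bbp *ByteBufferPool) Get() *ByteBuffer {
	bbv := bbp.p.Get()
	if bbv == nil {
		return &ByteBuffer{}
	}
	return bbv.(*ByteBuffer)
}

func (bbp *ByteBufferPool) Put(bb *ByteBuffer) {
	bb.Reset()
	bbp.p.Put(bb)
}

var bbPool ByteBufferPool

// invertSum copies src into a pooled scratch buffer, inverts every byte
// and returns the sum of the result. src is left untouched.
func invertSum(src []byte) uint32 {
	bb := bbPool.Get()
	defer bbPool.Put(bb) // release on every return path

	bb.B = append(bb.B[:0], src...)
	var sum uint32
	for i := range bb.B {
		bb.B[i] ^= 0xff
		sum += uint32(bb.B[i])
	}
	return sum
}

func main() {
	fmt.Println(invertSum([]byte{0x00, 0x01}))
}
```

The buffer is acquired and released inside one short CPU-bound function, which matches the favourable single-thread case described earlier. Using `defer` for the release also covers early returns.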
Bytebuffer pool issues
1. sync.Pool assumes all entries it contains are "the same"
2. In the real world, byte buffers usually have different sizes
3. Mixing big and small byte buffers in a single pool can result in:
a. Excessive memory usage
b. Suboptimal object reuse
Leveled (bucketized) bytebuffer pool
// pools contains pools for byte slices of various capacities.
//
// pools[0] is for capacities from 0 to 8
// pools[1] is for capacities from 9 to 16
// pools[2] is for capacities from 17 to 32
// ...
// pools[n] is for capacities from 2^(n+2)+1 to 2^(n+3)
//
// Limit the maximum capacity to 2^18, since there are no performance benefits
// in caching byte slices with bigger capacities.
var pools [17]sync.Pool
Leveled (bucketized) bytebuffer pool
func (sw *scrapeWork) scrape() {
	body := leveledbytebufferpool.Get(sw.prevBodyLen)
	body.B = sw.ReadData(body.B[:0])
	sw.processScrapedData(body)
	leveledbytebufferpool.Put(body)
}
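The bucket selection behind such a leveled pool can be sketched as follows. This is a simplified illustration that follows the bucket boundaries listed in the comment above; the function names and the minimal `ByteBuffer` type are assumptions, and the real package may differ in details.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

// ByteBuffer is a minimal stand-in for the real type.
type ByteBuffer struct {
	B []byte
}

var pools [17]sync.Pool

// poolIDAndCapacity maps a size to its bucket index and to the largest
// capacity that bucket holds: 0..8 -> 0, 9..16 -> 1, 17..32 -> 2, ...
func poolIDAndCapacity(size int) (id, capacity int) {
	size--
	if size < 0 {
		size = 0
	}
	size >>= 3
	id = bits.Len(uint(size))
	if id >= len(pools) {
		id = len(pools) - 1
	}
	return id, 1 << (id + 3)
}

// Get returns a buffer from the bucket that matches the requested capacity.
func Get(capacity int) *ByteBuffer {
	id, bucketCap := poolIDAndCapacity(capacity)
	if v := pools[id].Get(); v != nil {
		return v.(*ByteBuffer)
	}
	return &ByteBuffer{B: make([]byte, 0, bucketCap)}
}

// Put returns bb to the bucket that matches its capacity. Buffers larger
// than the biggest bucket are dropped and left to the garbage collector.
func Put(bb *ByteBuffer) {
	capacity := cap(bb.B)
	id, bucketCap := poolIDAndCapacity(capacity)
	if capacity <= bucketCap {
		bb.B = bb.B[:0]
		pools[id].Put(bb)
	}
}

func main() {
	for _, size := range []int{8, 9, 16, 17, 32, 100} {
		id, c := poolIDAndCapacity(size)
		fmt.Printf("size=%d -> pools[%d], capacity %d\n", size, id, c)
	}
	bb := Get(100)
	bb.B = append(bb.B, "scraped body"...)
	Put(bb)
}
```

Passing the previous body length as the size hint, as `scrape` does, means a target with a stable response size keeps reusing buffers from the same bucket.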
Ingestion of 100Mil samples/s benchmark
Summary
1. String interning reduces GC pressure and memory usage for read-intensive workloads
2. Function results caching reduces CPU usage during string transformations
3. Concurrency limiting gives better performance and predictable memory usage
4. sync.Pool reduces GC pressure and improves the performance of CPU-bound operations
Questions?
● VictoriaMetrics scaling to 100M samples/s
● https://github.com/VictoriaMetrics
● https://github.com/hagen1778

Writing a TSDB from scratch_ performance optimizations.pdf

  • 1. Writing a TSDB from scratch performance optimizations Roman Khavronenko | github.com/hagen1778
  • 2. Roman Khavronenko Co-founder of VictoriaMetrics Software engineer with experience in distributed systems, monitoring and high-performance services. https://github.com/hagen1778 https://twitter.com/hagen1778
  • 3. What is a metric?
  • 4. What is a metric? Scrape target > curl http://service:port/metrics
  • 8. Workload pattern for TSDB: writes
  • 9. Workload pattern for TSDB: reads
  • 10. Workload pattern for TSDB ● TSDBs process tremendous amounts of data ● They are usually write-heavy applications, optimized for ingestion ● Read load is usually much lower than write load ● Read queries are sporadic and unpredictable
  • 11. How to deal with such workload? System design oriented for time series data: 1. Log Structured Merge (LSM) data structure 2. Data for each column is stored separately 3. Append-only writes
  • 12. How to deal with such workload? And some more non-design-specific optimizations: 1. Strings interning 2. Function results caching 3. Concurrency limiting for CPU-bound operations 4. Sync pool for CPU-bound operations
  • 15. Store only one unique copy in memory!
  • 17. String interning: naive implementation var internStringsMap = make(map[string]string) func intern(s string) string { m := internStringsMap if v, ok := m[s]; ok { return v } m[s] = s return s }
  • 18. func ptr (s string) uintptr { return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data } func main() { s1 := intern("42") s2 := intern(fmt.Sprintf("%d", 42)) fmt.Println(ptr(s1) == ptr(s2)) // true } String interning: naive implementation
  • 19. 1. Map isn't thread safe String interning: naive implementation
  • 20. 1. Map isn't thread safe 2. Map with lock doesn't scale with number of CPUs String interning: naive implementation
  • 21. String interning: sync.Map var internStringsMap = sync.Map{} func intern(s string) string { m := &internStringsMap interned, _ := m.LoadOrStore(s, s) return interned.(string) }
  • 22. String interning: sync.Map sync.Map is optimized for two common use cases: 1. When the entry for a given key is only ever written once but read many times
  • 23. String interning: sync.Map sync.Map is optimized for two common use cases: 1. When the entry for a given key is only ever written once but read many times 2. When multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map reduces lock contention and improves performance compared to a Go map paired with a separate Mutex or RWMutex.
  • 24. String interning: gotchas 1. Map will grow over time: a. Rotate maps once in a while
  • 25. String interning: gotchas 1. Map will grow over time: a. Rotate maps once in a while b. Add TTL logic to purge cold entries
  • 26. String interning: gotchas 1. Map will grow over time: a. Rotate maps once in a while b. Add TTL logic to purge cold entries 2. Sanity check of arguments: a. At some point, someone will try to intern byte slice or substring: *(*string)(unsafe.Pointer(&b)) or str[:n]
  • 27. String interning: gotchas 1. Map will grow over time: a. Rotate maps once in a while b. Add TTL logic to purge cold entries 2. Sanity check of arguments: a. At some point, someone will try to intern byte slice or substring: *(*string)(unsafe.Pointer(&b)) or str[:n] b. Make sure to clone received strings: strings.Clone(s)
  • 28. String interning: summary ● We use string interning for storing time series metadata (aka labels). ● It helps to reduce memory usage during metadata parsing ● Interning works the best for read-intensive workload with limited number of variants with high hit rate
  • 31. Function results caching: caching Transformer type Transformer struct { m sync.Map transformFunc func(s string) string }
  • 32. func (t *Transformer) Transform(s string) string { v, ok := t.m.Load(s) if ok { // Fast path - the transformed s is found in the cache. return v.(string) } // Slow path - transform s and store it in the cache. sTransformed := t.transformFunc(s) t.m.Store(s, sTransformed) return sTransformed } Function results caching: caching Transformer
  • 33. // SanitizeName replaces unsupported by Prometheus chars // in metric names and label names with _. func SanitizeName(name string) string { return promSanitizer.Transform(name) } var promSanitizer = NewTransformer(func(s string) string { return unsupportedPromChars.ReplaceAllString(s, "_") }) Function results caching: example
  • 34. Function results caching: summary ● Helps to save CPU time in the cost of increased mem usage ● Works best for heavy usage of string transforms, regex matching, etc ● And when the number of arguments and their variants is limited ● Doesn't work good when number of transformations is unlimited or inconsistent - like query processing
  • 35. Limiting concurrency for CPU-bound load
  • 36. Volatile number of scrape targets
  • 38. Limiting concurrency for CPU intensive operations
      + Makes the system more stable and efficient
      + Helps to control memory usage on load spikes (which are expected in monitoring)
      + Improves the processing speed of each goroutine by reducing the number of context switches
      - The downside is complexity: it is easy to make a mistake and end up with a deadlock or inefficient resource utilization
  • 39. Limited concurrency: workers
      var concurrencyLimit = runtime.NumCPU()

      func main() {
          workCh := make(chan work, concurrencyLimit*2)
          for i := 0; i < concurrencyLimit; i++ {
              go func() {
                  for {
                      processData(<-workCh)
                  }
              }()
          }
      }
  • 40. Limited concurrency: workers
      + Workers could have scoped buffers, metrics, etc.
      - Code becomes complicated: start and stop procedures for workers
      - Additional synchronization to distribute work via channels
  • 41. Limited concurrency: channel
      var concurrencyLimitCh = make(chan struct{}, runtime.NumCPU())

      // This function is CPU-bound and may allocate a lot of memory.
      // We limit the number of concurrent calls to limit memory
      // usage under high load without sacrificing the performance.
      func processData(src, dst []byte) error {
          concurrencyLimitCh <- struct{}{}
          defer func() {
              <-concurrencyLimitCh
          }()
          // heavy processing...
  • 42. Limited concurrency: summary
      ● Works best for CPU-bound operations
      ● Bounds resource usage: a limited number of goroutines run at full speed instead of many goroutines wasting resources on context switches
      ● Helps to prevent excessive memory usage during load spikes
      ● Do not apply limiting to IO-bound (disk, network) operations
  • 43. sync.Pool for CPU bound operations
  • 44. sync.Pool is widely used in VM
      grep -r "sync.Pool" ./app ./lib | wc -l
      118
      grep -r "bytesutil.ByteBufferPool" ./app ./lib | wc -l
      34
  • 45. sync.Pool for CPU bound operations in one thread
      ● All processed on a single CPU core
      ● No object stealing
      ● Lower number of objects allocated, better pool utilization
      ● Lower GC pressure
  • 46. sync.Pool for synchronous processing
      ● An object is retrieved, used and released by different goroutines
      ● High chance of goroutines being scheduled on different threads
      ● High chance of object stealing
  • 47. sync.Pool for IO bound operations
      ● Objects retrieved from sync.Pool are used for IO operations
      ● IO operations are slow and sporadic
      ● So sync.Pool can allocate a large number of objects, resulting in uncontrolled memory usage
      ● Higher pressure on GC
  • 48. sync.Pool - lib/bytesbuffer
      type ByteBufferPool struct {
          p sync.Pool
      }

      var (
          // Verify ByteBuffer implements the given interfaces.
          _ io.Writer           = &ByteBuffer{}
          _ fs.MustReadAtCloser = &ByteBuffer{}
          _ io.ReaderFrom       = &ByteBuffer{}
      )
  • 49. sync.Pool - lib/bytesbuffer
      func (bbp *ByteBufferPool) Get() *ByteBuffer {
          bbv := bbp.p.Get()
          if bbv == nil {
              return &ByteBuffer{}
          }
          return bbv.(*ByteBuffer)
      }

      func (bbp *ByteBufferPool) Put(bb *ByteBuffer) {
          bb.Reset()
          bbp.p.Put(bb)
      }
  • 50. sync.Pool - lib/bytesbuffer
      bb := bbPool.Get() // acquire from pool
      bb.B, err = DecompressZSTD(bb.B[:0], src)
      if err != nil {
          return nil, fmt.Errorf("cannot decompress: %w", err)
      }
      // unmarshal from buffer to dst
      dst, err = unmarshalInt64NearestDelta(dst, bb.B)
      bbPool.Put(bb) // release to pool
  • 51. Bytebuffer pool issues
      1. sync.Pool assumes all entries it contains are "the same"
      2. In the real world, byte buffers usually have different sizes
      3. Mixing big and small byte buffers in a single pool can result in:
         a. Excessive memory usage
         b. Suboptimal object reuse
  • 53. Leveled (bucketized) bytebuffer pool
      // pools contains pools for byte slices of various capacities.
      //
      //   pools[0] is for capacities from 0 to 8
      //   pools[1] is for capacities from 9 to 16
      //   pools[2] is for capacities from 17 to 32
      //   ...
      //   pools[n] is for capacities from 2^(n+2)+1 to 2^(n+3)
      //
      // Limit the maximum capacity to 2^18, since there are no performance benefits
      // in caching byte slices with bigger capacities.
      var pools [17]sync.Pool
  • 54. Leveled (bucketized) bytebuffer pool
      func (sw *scrapeWork) scrape() {
          body := leveledbytebufferpool.Get(sw.prevBodyLen)
          body.B = sw.ReadData(body.B[:0])
          sw.processScrapedData(body)
          leveledbytebufferpool.Put(body)
      }
  • 55. Ingestion of 100Mil samples/s benchmark
  • 56. Summary
      1. String interning reduces GC pressure and memory usage for read-intensive workloads
      2. Function results caching reduces CPU usage during string transformations
      3. Concurrency limiting gives better performance and predictable memory usage
      4. sync.Pool reduces GC pressure and improves the performance of CPU-bound operations
  • 57. Questions?
      ● VictoriaMetrics scaling to 100M samples/s
      ● https://github.com/VictoriaMetrics
      ● https://github.com/hagen1778