SignalFx
Scaling ingest pipelines with
high performance computing principles
Rajiv Kurian, Software Engineer
rajiv@signalfx.com
Agenda
1. Why we need to scale ingest
2. Basic properties and limitations of modern
hardware
3. Optimization techniques inspired by HPC
4. Results!
5. Q&A (hopefully!)
Why we need to scale ingest
• High resolution:
• Up to 1 sec
• Streaming analytics:
• Charts/analytics update @1sec
• Real time
• Multidimensional metrics:
• Dimensions: represent customer, server, etc.
• Filter, aggregate: 99th-pct-latency-by-service,customer
SignalFx is an advanced monitoring platform for modern applications
Ingest pipeline
[Diagram: pipeline stages REST/rate control, rollups, persist. Input: raw time series data. Output: processed data to analytics.]
SignalFx ingest library
[Diagram: raw data in; the library holds TimeSeries 0 rollup to TimeSeries 8 rollup; rollup data out]
Issues identified (before applying HPC techniques)
• Expensive - too many servers

• Exhibits parallel slowdown
• More threads = worse performance

• What did the profile say?
• Death by a thousand cuts
• The core library = 35% of profile
Basic properties and limitations
of modern hardware
[Diagram: Core 1 and Core 2, each with its own L1 data cache, L1 instruction cache, and L2 cache, sharing an L3 cache in front of main memory]
Cache Lines
• Data is transferred between memory and cache in blocks of
fixed size called cache lines, usually 64 bytes
• When the processor needs to read or write a location in
main memory, it first checks for a corresponding entry in the
cache. In the case of:
• a cache hit, the processor immediately reads or writes
the data in the cache line
• a cache miss, the cache allocates a new entry and
copies in data from main memory, then the request (read
or write) is fulfilled from the contents of the cache
• The memory subsystem makes two kinds of bets to help us:
• Temporal locality
• Spatial locality
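The spatial-locality bet can be illustrated with a small sketch. It sums the same flat array twice: once in storage order, where (assuming 64-byte cache lines) 16 consecutive ints share a line, and once with a stride of a whole row, where consecutive reads land on different lines. The class name `LocalityDemo` and the grid size are hypothetical, and any timing difference depends on the machine and the JIT, so the sketch prints timings without promising a ratio.

```java
/** Sums one flat array in two traversal orders to contrast memory access patterns. */
final class LocalityDemo {
    static final int N = 1024; // hypothetical grid size: N x N ints stored row by row

    /** Builds the grid as one contiguous array; cell (row, col) lives at row * N + col. */
    static int[] makeGrid() {
        int[] grid = new int[N * N];
        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                grid[row * N + col] = row + col;
            }
        }
        return grid;
    }

    /** Sequential walk: neighbouring reads are 4 bytes apart, so they mostly hit the same cache line. */
    static long sumRowMajor(int[] grid) {
        long sum = 0;
        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                sum += grid[row * N + col];
            }
        }
        return sum;
    }

    /** Strided walk: neighbouring reads are N * 4 bytes apart, so each read lands on a different line. */
    static long sumColumnMajor(int[] grid) {
        long sum = 0;
        for (int col = 0; col < N; col++) {
            for (int row = 0; row < N; row++) {
                sum += grid[row * N + col];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] grid = makeGrid();
        long start = System.nanoTime();
        long rowSum = sumRowMajor(grid);
        long middle = System.nanoTime();
        long colSum = sumColumnMajor(grid);
        long end = System.nanoTime();
        System.out.println("row-major sum " + rowSum + " in " + (middle - start) / 1000 + " us");
        System.out.println("column-major sum " + colSum + " in " + (end - middle) / 1000 + " us");
    }
}
```

Both methods do identical arithmetic; only the order of memory accesses differs, which is the point of the comparison.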
Reference latency numbers for comparison
By Jeff Dean: http://research.google.com/people/jeff/
Operation                            Latency (ns)   Latency (ms)   Comparison
L1 Cache                                      0.5
Branch mispredict                               5
L2 Cache                                        7                  14x L1 Cache
Mutex lock/unlock                              25
Main memory                                   100                  20x L2 Cache, 200x L1 Cache
Compress 1K bytes (Zippy)                   3,000
Send 1K bytes over 1Gbps                   10,000           0.01
Read 4K randomly from SSD                 150,000           0.15
Read 1MB sequentially from memory         250,000           0.25
Round trip within same DC                 500,000           0.5
Read 1MB sequentially from SSD          1,000,000           1      4x memory
Disk seek                              10,000,000          10      20x DC roundtrip
Read 1MB sequentially from disk        20,000,000          20      80x memory, 20x SSD
Send packet CA->Netherlands->CA       150,000,000         150
[Diagrams: the core next to L1, then L2, then main memory]
Our optimization goal
Convert a memory bandwidth bound
application to a CPU bound application
Things we kept in mind
• Measure, measure, measure!

• Don’t rely on micro benchmarks alone
Benchmark
SignalFx library benchmark
Raw data in, in random order, one per time series.
50x
Rollup data out:
Key      Value
ID 0     TimeSeries rollup 0
ID 1     TimeSeries rollup 1
ID 2     TimeSeries rollup 2
ID 3     TimeSeries rollup 3
ID 4     TimeSeries rollup 4
...      ...
ID 1M    TimeSeries rollup 1M
35% of the profile of the entire application
Techniques inspired by HPC that have
improved our pipeline
Single threaded, event based architectures:
parallelize by running multiple copies of
single threaded code
Single threaded event based architectures
• Threads work on their own private
data (as much as possible)

• Communicate with other threads
using events/messages
[Diagram: Network In thread (receive data) -> events -> Processor thread(s) (process data, with local key/value data) -> events -> Network out thread (write batched data)]
[Diagram: the same three stages, with ring buffers carrying the events between threads]
Single threaded event based architectures advantages
• It enables many other optimal choices like
• Compact array based data structures
• Buffer/object re-use

• Loosely coupled - easy to test

• Run multiple copies for parallelism
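The hand-off between threads in the diagrams above can be sketched as a single-producer/single-consumer ring buffer. This is a minimal illustration, not the implementation used in the talk: the class name `SpscRingBuffer`, the `long` payload, the capacity, and the batch size are all assumptions. It relies on `AtomicLong.lazySet` as an ordered store so that a slot write is visible before the index that publishes it.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Single-producer/single-consumer ring buffer of longs (a sketch, not a production queue). */
final class SpscRingBuffer {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to read; written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // next slot to write; written only by the producer

    SpscRingBuffer(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the buffer is full. */
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.lazySet(t + 1); // ordered store: the slot write becomes visible before the new tail
        return true;
    }

    /** Consumer side: copies up to out.length pending values into out and returns how many. */
    int drainTo(long[] out) {
        long h = head.get();
        int n = (int) Math.min(tail.get() - h, out.length);
        for (int i = 0; i < n; i++) {
            out[i] = slots[(int) ((h + i) & mask)];
        }
        head.lazySet(h + n); // hands the drained slots back to the producer
        return n;
    }

    /** Demo: one producer thread sends 0..count-1; the calling thread consumes in batches and sums. */
    static long transfer(int count) {
        SpscRingBuffer ring = new SpscRingBuffer(1024);
        Thread producer = new Thread(() -> {
            for (long v = 0; v < count; v++) {
                while (!ring.offer(v)) {
                    Thread.yield(); // back off while the consumer catches up
                }
            }
        });
        producer.start();
        long[] batch = new long[256]; // reused for every drain, so the loop allocates nothing
        long sum = 0;
        for (int received = 0; received < count; ) {
            int n = ring.drainTo(batch);
            if (n == 0) {
                Thread.yield();
            }
            for (int i = 0; i < n; i++) {
                sum += batch[i];
            }
            received += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(transfer(1_000_000));
    }
}
```

Each index has exactly one writer, so no locks or compare-and-swap loops are needed, and the consumer's private batch array is an instance of the buffer re-use the slide mentions.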
[Diagram: Network In thread (receive data), Worker thread(s) (process data, with local key/value data), and Network out thread (write batched data), connected by ring buffers, with the steps numbered 1 to 4]
[Diagram: three worker threads, each of which receives data using async IO, processes data synchronously, and writes data using async IO; each thread has its own local data (keys 1-4, keys 5-8, keys 9-12)]
[Diagram: Network thread (receive data), Processor thread(s) (process data, with local key/value data), and Async IO thread (batched IO calls), connected by ring buffers, with the steps numbered 1 to 7]
Advice for threaded applications
• Threads should ideally reflect the actual
parallelism of the system.
• Avoid gratuitous oversubscription
• Exception: IO threads?

• DO NOT communicate unless you have to
Techniques inspired by HPC that have
improved our pipeline
Use compact, cache-conscious, array based
data structures with minimal indirection
Basic principles
• Strive for smaller data structures
• Extra computation is ok
• E.g. Compressing network data

• Design data structures that facilitate
processing multiple entries—big
arrays!

• Layout should reflect access patterns
Hash maps
• Hash map lookups are NOT free!

• A lookup in a well-implemented hash
map is by definition a cache miss, since a good
hash function scatters keys across the whole table

• Popular implementations like
java.util.HashMap can cause multiple
cache misses
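To make the contrast concrete, here is a minimal sketch of an open-addressing `long -> int` map that keeps each key next to its value in one primitive array, so a lookup without collisions touches a single cache line. It is an illustration only, not Koloboke's or any other library's implementation. The class name `LongIntOpenMap`, the fixed capacity, the multiplicative hash, and the reserved `EMPTY_KEY` sentinel are all assumptions made to keep it short.

```java
/** Fixed-capacity long -> int map using linear probing over one interleaved array (a sketch). */
final class LongIntOpenMap {
    static final long EMPTY_KEY = Long.MIN_VALUE; // assumption: never used as a real key
    static final int NOT_FOUND = -1;              // assumption: never stored as a real value

    private final long[] table; // layout: key0, value0, key1, value1, ...
    private final int mask;
    private int size;

    LongIntOpenMap(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        table = new long[2 * capacity];
        for (int i = 0; i < table.length; i += 2) {
            table[i] = EMPTY_KEY;
        }
        mask = capacity - 1;
    }

    private int firstSlot(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // multiplicative mixing of the key bits
        return ((int) (h ^ (h >>> 32))) & mask;
    }

    void put(long key, int value) {
        if (key == EMPTY_KEY) {
            throw new IllegalArgumentException("reserved key");
        }
        int slot = firstSlot(key);
        while (true) {
            long k = table[2 * slot];
            if (k == EMPTY_KEY) {
                if (size == mask) { // keep one slot empty so probing always terminates
                    throw new IllegalStateException("map is full");
                }
                table[2 * slot] = key;
                table[2 * slot + 1] = value;
                size++;
                return;
            }
            if (k == key) {
                table[2 * slot + 1] = value; // overwrite in place, no allocation
                return;
            }
            slot = (slot + 1) & mask; // collision: probe the adjacent slot
        }
    }

    int get(long key) {
        int slot = firstSlot(key);
        while (true) {
            long k = table[2 * slot];
            if (k == EMPTY_KEY) {
                return NOT_FOUND;
            }
            if (k == key) {
                return (int) table[2 * slot + 1];
            }
            slot = (slot + 1) & mask;
        }
    }

    int size() {
        return size;
    }
}
```

There are no per-entry nodes and no boxed `Long` or `Integer` objects, so neither `put` nor `get` allocates, and a collision only moves the probe to the neighbouring slot instead of following a pointer.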
Typical hash map implemented as an array of lists of key* | value*
Cache misses in a typical hash-map implementation
[Diagram, built up over several slides: an array of lists whose nodes hold key* and value* pointers to separately allocated keys and values; the memory accesses made by one lookup are numbered 1 to 4]
Array of co-located key/value
Key      Value
Key 0    Value 0
Key 1    Value 1
Key 2    Value 2
Key 3    Value 3
Key 4    Value 4
Key 5    Value 5
Key 6    Value 6
Key 7    Value 7
Cache misses with no collision
[Diagram: the same array with one marked access (1)]
Cache misses with collisions
[Diagram: the same array with two marked accesses (1, 2)]
Hash map of key to index into an array of structs
[Diagram: a hash map from key to index (Key 0 -> 1, Key 1 -> 6, Key 2 -> 4, Key 3 -> 8) next to a contiguous array of Value 0 to Value 8]
Cache misses with collision
[Diagram, built up over two slides: the same structures with two marked accesses (1, 2)]
New library memory layout
[Diagram: a hash map from ID to index (ID 0 -> 1, ID 1 -> 6, ID 2 -> 4, ID 3 -> 8) pointing into a contiguous array of TimeSeries rollup 0 to TimeSeries rollup 8]
Raw data in Rollup out
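The layout above, an index from ID to position plus contiguous rollup storage, might look roughly like the following on the JVM. Java has no arrays of structs, so this sketch uses parallel primitive arrays as a stand-in. The class name `RollupTable`, the rollup fields (`count`, `sum`, `max`), and the small inline linear-probing index are all hypothetical and are not taken from the SignalFx library.

```java
import java.util.Arrays;

/** Rollup state for many time series: an ID -> position index plus parallel primitive arrays. */
final class RollupTable {
    private final long[] indexKeys;  // open-addressing index: time series ID ...
    private final int[] indexSlots;  // ... to its position in the rollup arrays (-1 = empty)
    private final int mask;

    private final long[] count;      // hypothetical rollup fields, one entry per time series
    private final double[] sum;
    private final double[] max;
    private int used;

    RollupTable(int maxSeries) {
        int capacity = 2;
        while (capacity < 2 * maxSeries) {
            capacity <<= 1; // at most half full, so probing always finds an empty slot
        }
        indexKeys = new long[capacity];
        indexSlots = new int[capacity];
        Arrays.fill(indexSlots, -1);
        mask = capacity - 1;
        count = new long[maxSeries];
        sum = new double[maxSeries];
        max = new double[maxSeries];
    }

    /** Finds the position for an ID, assigning the next free position on first sight. */
    private int positionOf(long id) {
        int slot = ((int) ((id * 0x9E3779B97F4A7C15L) >>> 32)) & mask;
        while (indexSlots[slot] != -1) {
            if (indexKeys[slot] == id) {
                return indexSlots[slot];
            }
            slot = (slot + 1) & mask;
        }
        if (used == count.length) {
            throw new IllegalStateException("too many time series");
        }
        indexKeys[slot] = id;
        indexSlots[slot] = used;
        max[used] = Double.NEGATIVE_INFINITY;
        return used++;
    }

    /** Raw data in: one index lookup, then in-place updates. Returns the series position. */
    int add(long id, double value) {
        int i = positionOf(id);
        count[i]++;
        sum[i] += value;
        if (value > max[i]) {
            max[i] = value;
        }
        return i;
    }

    long count(int position) { return count[position]; }
    double sum(int position) { return sum[position]; }
    double max(int position) { return max[position]; }
}
```

Because positions are handed out densely, a pass that emits every rollup can walk the arrays from 0 to `used - 1` sequentially without consulting the index at all.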
Changing hash map implementations
• java.util.HashMap (uses separate chaining and boxes
primitives) to make a long -> int lookup
• Allocations galore

• net.openhft.koloboke primitive open hash map
• 45% improvement

For the JVM use libraries like https://github.com/OpenHFT/Koloboke.
For C++ try https://github.com/preshing/CompareIntegerMaps or similar.
Access patterns
[Diagram: objects 0 to 3, each laid out as Field 0 to Field 4, with hot data and cold data interleaved inside every object]
Group fields accessed together
[Diagram: the hot fields (Field 0, Field 1, Field 2) of all objects stored together, and the cold fields (Field 3, Field 4) of all objects stored separately]
Results of separating hot and cold data
A hot loop run about once every 500 ms
• Old - Hot and cold data kept together
• 5 cache lines per time series
• Took anywhere between 62-70 ms

• New - Hot and cold data kept separate
• 3 cache lines of hot data per time series
• Took anywhere between 40-45 ms

• 35% improvement
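A rough sketch of the hot/cold split follows, assuming a hypothetical series with two hot fields (running sum and count) and two cold ones (a label and a creation time). None of these names come from the SignalFx library. The hot fields of all series are packed into one array that the periodic loop scans front to back, while the cold fields live in separate arrays that the loop never touches.

```java
/** Keeps frequently updated fields apart from rarely read metadata (a sketch). */
final class HotColdSeries {
    private static final int SUM = 0;
    private static final int COUNT = 1;
    private static final int HOT_STRIDE = 2; // hot fields per series, stored side by side

    private final double[] hot;           // hot: sum0, count0, sum1, count1, ...
    private final String[] label;         // cold: only needed for display or debugging
    private final long[] createdAtMillis; // cold

    HotColdSeries(int seriesCount) {
        hot = new double[seriesCount * HOT_STRIDE];
        label = new String[seriesCount];
        createdAtMillis = new long[seriesCount];
    }

    /** Cold path: called once per series. */
    void describe(int series, String seriesLabel, long createdAt) {
        label[series] = seriesLabel;
        createdAtMillis[series] = createdAt;
    }

    /** Hot path: called for every data point. */
    void record(int series, double value) {
        int base = series * HOT_STRIDE;
        hot[base + SUM] += value;
        hot[base + COUNT] += 1;
    }

    /** Hot loop: writes each series' mean into out (NaN if no data) and resets the accumulators. */
    void flushMeans(double[] out) {
        for (int series = 0, base = 0; base < hot.length; series++, base += HOT_STRIDE) {
            double n = hot[base + COUNT];
            out[series] = (n == 0) ? Double.NaN : hot[base + SUM] / n;
            hot[base + SUM] = 0;
            hot[base + COUNT] = 0;
        }
    }

    String label(int series) { return label[series]; }
    long createdAtMillis(int series) { return createdAtMillis[series]; }
}
```

The flush loop reads only the `hot` array, so every cache line it pulls in is filled with data it actually uses; the cold arrays cost nothing until someone asks for a label.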
Results (library)!
Old vs New
• Concurrent -> single threaded
• Locks gone
• Array based data structures
• Zero allocations
• Extensive batching and hardware prefetching



• Multiple hash maps -> a single hash map look up
76 K/sec vs 2.1 M/sec
27x
35 %
Results (application)!
Amdahl’s law - 35%
[Chart: overall speedup (y-axis, 0 to 1.6) against library speed up (x-axis, 1 to ∞)]
[Charts: CPU, two panels]
3.4x
35% of the profile but 3.4x improvement?
• Amdahl’s law
• Max 1.54x improvement if 35% => 0%

• Why 3.4x ?
• When you use less cache, you leave more for
others - thus speeding up other code too

• Lesson
• A profiler is a necessary tool, but not a substitute for
informed design
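The 1.54x ceiling can be checked with a few lines. This sketch applies the standard Amdahl formula, overall speedup = 1 / ((1 - p) + p / s), with p = 0.35 for the library's share of the profile and s for the library speedup. The class and method names are made up for the illustration.

```java
/** Amdahl's law: overall speedup when a fraction of the runtime is made faster. */
final class Amdahl {
    /**
     * @param fraction    share of the original runtime spent in the optimized part (0 to 1)
     * @param partSpeedup how many times faster that part became
     */
    static double overallSpeedup(double fraction, double partSpeedup) {
        return 1.0 / ((1.0 - fraction) + fraction / partSpeedup);
    }

    public static void main(String[] args) {
        // The library was 35% of the profile and got about 27x faster.
        System.out.printf("27x library speedup: %.2fx overall%n", overallSpeedup(0.35, 27.0));
        // Upper bound: the library's 35% shrinks to nothing.
        System.out.printf("infinite library speedup: %.2fx overall%n",
                overallSpeedup(0.35, Double.POSITIVE_INFINITY));
    }
}
```

With these inputs the formula predicts about 1.51x for a 27x library speedup and about 1.54x in the limit, so the measured 3.4x cannot come from the library's own cycles alone.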
[Chart: Heap growth]
Closing remarks / rant
• “Write code first, profile later” = BAD

• Excessive encapsulation leads to
myopic decisions being made re: perf
• allocations
• “thread safe” code

• Beware of micro benchmarks
Thank You!
Rajiv Kurian
rajiv@signalfx.com
@rzidane360
WE’RE HIRING
jobs@signalfx.com
@SignalFx - signalfx.com
Bonus slides
Composition in C
[Diagram: Struct A, and Struct B with Struct B1 and Struct B2 embedded; the ints of the embedded structs sit inline in the containing struct]
Composition in Java
[Diagram: Object A, and Object B with fields B1 and B2 referring to Object B1 and Object B2, each holding two ints]
Actual layout?
[Diagram, built up over several slides: B (header), B1*, B2*; then B1 (header), int, int; then B2 (header), int, int]
Potential layout after GC
[Diagram: B (header), B1*, B2*; other data; B1 (header), int, int; other data; B2 (header), int, int]
Techniques inspired by HPC that have
improved our pipeline
Separate the control and data planes
A networking concept
[Diagram: packets in -> routing table (key/value) -> packets out, marked Frequent; routing data updating the routing table, marked Infrequent]
What the control and data planes do
In networking terminology:
• Data plane - Defines the part that
decides what to do with packets
arriving on an inbound interface—
Frequent
• Control plane - Defines the part that is
concerned with drawing the network
map or routing table—Infrequent
The goal of control and data plane separation
DO NOT slow the frequent path because
of the infrequent path
Runtime configuration variables
[Diagram: a setter thread writes the configuration variables (volatile/atomic), Flag 0 to Flag 3; the worker thread reads them]
Worker thread:
while (1) {
    process_data_using_configuration_variables();
}
Runtime configuration variables
[Diagram: the same setter thread and configuration variables (volatile/atomic), Flag 0 to Flag 3; the worker thread keeps its own cached configuration variables]
Worker thread:
while (1) {
    cache_configuration_variables();
    process_a_ton_of_stuff();
}
Volatile/atomic flag vs cached local flag
• Before: all run time flags (used on every data point) are
volatile/atomic loads
• After: all run time flags are cached and refreshed on each
run loop
• About 8% improvement in datapoint/second. Others
might see more or less
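The two variants compared above can be sketched in Java as follows. The flag `dropNegative` and the summing workload are invented for the illustration; only the pattern, one volatile load per data point versus one per batch, reflects the slide.

```java
/** Contrasts reading a volatile flag per data point with caching it once per batch (a sketch). */
final class FlagCachingWorker {
    // Hypothetical runtime flag, written occasionally by a setter thread (control plane).
    private volatile boolean dropNegative;

    void setDropNegative(boolean value) {
        dropNegative = value;
    }

    /** Before: every data point pays for a volatile load of the flag. */
    long sumReadingFlagEveryTime(long[] batch) {
        long sum = 0;
        for (long value : batch) {
            if (dropNegative && value < 0) {
                continue;
            }
            sum += value;
        }
        return sum;
    }

    /** After: one volatile load per batch; a flag change takes effect on the next batch. */
    long sumWithCachedFlag(long[] batch) {
        final boolean drop = dropNegative; // cached in a plain local for the whole loop
        long sum = 0;
        for (long value : batch) {
            if (drop && value < 0) {
                continue;
            }
            sum += value;
        }
        return sum;
    }
}
```

The trade-off is latency of configuration changes: with the cached variant a new flag value is only observed at the next batch boundary, which is usually acceptable for settings that change rarely.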

 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 

Recently uploaded

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 

Recently uploaded (20)

Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 

Scaling ingest pipelines with high performance computing principles - Rajiv Kurian

  • 2. SignalFx Scaling ingest pipelines with high performance computing principles Rajiv Kurian, Software Engineer rajiv@signalfx.com
  • 3. Agenda 1. Why we need to scale ingest 2. Basic properties and limitations of modern hardware 3. Optimization techniques inspired by HPC 4. Results! 5. Q&A (hopefully!)
  • 4. SignalFx Why we need to scale ingest
  • 5. • High resolution: • Up to 1 sec • Streaming analytics: • Charts/analytics update @1sec • Real time • Multidimensional metrics: • Dimensions : representing customer, server etc • Filter, aggregate : 99th-pct-latency-by-service,customer SignalFx is an advanced monitoring platform for modern applications
  • 6. Ingest pipeline ROLLUPS PERSIST REST/RATE CONTROL Raw time series data Processed data to analytics
  • 7. SignalFx ingest library Raw data in Rollup data out TimeSeries 0 rollup TimeSeries 1 rollup TimeSeries 2 rollup TimeSeries 3 rollup TimeSeries 4 rollup TimeSeries 5 rollup TimeSeries 6 rollup TimeSeries 7 rollup TimeSeries 8 rollup
  • 8. Issues identified (before applying HPC techniques) • Expensive - too many servers
 • Exhibits parallel slow down • More threads = worse performance
 • What did the profile say? • Death by a thousand cuts • The core library = 35% of profile
  • 9. SignalFx Basic properties and limitations of modern hardware
  • 11. Cache Lines • Data is transferred between memory and cache in blocks of fixed size, called cache lines. Usually 64 bytes • When the processor needs to read or write a location in main memory, it first checks for a corresponding entry in the cache. In the case of: • a cache hit, the processor immediately reads or writes the data in the cache line • a cache miss, the cache allocates a new entry and copies in data from main memory, then the request (read or write) is fulfilled from the contents of the cache • The memory subsystem makes two kinds of bets to help us: • Temporal locality • Spatial locality
  • 12. Reference latency numbers for comparison, by Jeff Dean (http://research.google.com/people/jeff/): L1 cache 0.5 ns; branch mispredict 5 ns; L2 cache 7 ns (14x L1 cache); mutex lock/unlock 25 ns; main memory 100 ns (20x L2 cache, 200x L1 cache); compress 1K bytes (Zippy) 3,000 ns; send 1K bytes over 1 Gbps 10,000 ns (0.01 ms); read 4K randomly from SSD 150,000 ns (0.15 ms); read 1 MB sequentially from memory 250,000 ns (0.25 ms); round trip within same DC 500,000 ns (0.5 ms); read 1 MB sequentially from SSD 1,000,000 ns (1 ms, 4x memory); disk seek 10,000,000 ns (10 ms, 20x DC round trip); read 1 MB sequentially from disk 20,000,000 ns (20 ms, 80x memory, 20x SSD); send packet CA->Netherlands->CA 150,000,000 ns (150 ms)
  • 16. Our optimization goal Convert a memory bandwidth bound application to a CPU bound application
  • 17. Things we kept in mind • Measure, measure, measure!
 • Don’t rely on micro benchmarks alone
  • 19. SignalFx library benchmark Rollup data out Key Value ID 0 TimeSeries rollup 0 ID 1 TimeSeries rollup 1 ID 2 TimeSeries rollup 2 ID 3 TimeSeries rollup 3 ID 4 TimeSeries rollup 4 ….. ….. ….. ….. Key 1M TimeSeries rollup 1M Raw data in, in random order, one per Time Series. 50x
  • 20. SignalFx library benchmark Rollup data out Key Value ID 0 TimeSeries rollup 0 ID 1 TimeSeries rollup 1 ID 2 TimeSeries rollup 2 ID 3 TimeSeries rollup 3 ID 4 TimeSeries rollup 4 ….. ….. ….. ….. Key 1M TimeSeries rollup 1M Raw data in, in random order, one per Time Series. 50x 35% of the profile of the entire application
  • 21. SignalFx Techniques inspired by HPC that have improved our pipeline Single threaded, event based architectures: parallelize by running multiple copies of single threaded code
  • 22. Single threaded event based architectures • Threads work on their own private data (as much as possible)
 • Communicate with other threads using events/messages
  • 23. SignalFx local data Network In thread Processor thread(s) Network out thread Receive data Process data Write batched data Events Events Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 24. SignalFx Network In thread Processor thread(s) Network out thread Receive data Process data Write batched data local data Ring Buffer Ring Buffer Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
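The ring buffers between threads on the slides above can be sketched as a single-producer/single-consumer queue over a fixed array. This is a minimal illustration, not the SignalFx implementation: the class name `SpscRingBuffer`, the `long` payload, and the `offer`/`drain` methods are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-producer/single-consumer ring buffer of long values.
// Exactly one thread may call offer() and exactly one thread may call drain().
final class SpscRingBuffer {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to read, written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // next slot to write, written only by the producer

    SpscRingBuffer(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Producer side: returns false when the buffer is full.
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile store publishes the slot to the consumer
        return true;
    }

    // Consumer side: copies up to out.length pending values into out, returns the count.
    int drain(long[] out) {
        long h = head.get();
        int n = (int) Math.min(tail.get() - h, out.length);
        for (int i = 0; i < n; i++) {
            out[i] = slots[(int) ((h + i) & mask)];
        }
        head.set(h + n); // volatile store frees the slots for the producer
        return n;
    }

    public static void main(String[] args) throws InterruptedException {
        SpscRingBuffer ring = new SpscRingBuffer(1024);
        final int total = 100_000;
        Thread producer = new Thread(() -> {
            for (long v = 1; v <= total; v++) {
                while (!ring.offer(v)) {
                    Thread.yield(); // buffer full, let the consumer catch up
                }
            }
        });
        producer.start();
        long[] batch = new long[256]; // reused buffer: no allocation per event
        long sum = 0;
        int seen = 0;
        while (seen < total) {
            int n = ring.drain(batch);
            for (int i = 0; i < n; i++) {
                sum += batch[i];
            }
            seen += n;
        }
        producer.join();
        System.out.println("sum of received values = " + sum);
    }
}
```

Draining in batches means the consumer pays for one pair of shared-counter accesses per batch rather than per event, which is in the spirit of the "write batched data" stage on the slides.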
  • 25. Single threaded event based architectures advantages • It enables many other optimal choices like • Compact array based data structures • Buffer/object re-use
 • Loosely coupled - easy to test
 • Run multiple copies for parallelism
  • 26. SignalFx Ring Buffer Network In thread Worker thread(s) Network out thread Receive data Process data Write batched data local data 1 2 3 4 Ring Buffer Ring Buffer Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 27. SignalFx Worker thread Receive data using Async IO Process data synchronously Write data using Async IO Receive data using Async IO Process data synchronously Write data using Async IO Receive data using Async IO Process data synchronously Write data using Async IO Worker thread Worker thread local data local data local data Key Value key 5 value 5 key 6 value 6 key 7 value 7 key 8 value 8 Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4 Key Value key 9 value 9 key 10 value 10 key 11 value 11 key 12 value 12
  • 28. SignalFx Ring Buffer Network thread Processor thread(s) Async IO thread Receive data Process data Batched IO calls local data 1 2 3 4 Ring Buffer Ring Buffer 5 Ring Buffer 6 7 Key Value key 1 value 1 key 2 value 2 key 3 value 3 key 4 value 4
  • 29. Advice for threaded applications • Threads should ideally reflect the actual parallelism of the system. • Avoid gratuitous over subscribing • Exception: IO threads?
 • DO NOT communicate unless you have to
  • 30. SignalFx Techniques inspired by HPC that have improved our pipeline Use compact, cache-conscious, array based data structures with minimal indirection
  • 32. Basic principles • Strive for smaller data structures • Extra computation is ok • E.g. Compressing network data
 • Design data structures that facilitate processing multiple entries—big arrays!
 • Layout should reflect access patterns
  • 33. Hash maps • Hash map lookups are NOT free!
 • A lookup in a well implemented hash map is by definition a cache miss
 • Popular implementations like java.util.HashMap can cause multiple cache misses
  • 34. Typical hash map implemented as an array of lists of key* | value* (diagram)
  • 35-39. Cache misses in a typical hash-map implementation (diagram builds up numbered cache misses, 1 through 4)
  • 40. Hash map implemented as an array of lists of key* | value* (diagram)
  • 41. Array of co-located key/value Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7
  • 42. Cache misses with no collision Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7 1
  • 43. Cache misses with collisions Key Value Key 0 Value 0 Key 1 Value 1 Key 2 Value 2 Key 3 Value 3 Key 4 Value 4 Key 5 Value 5 Key 6 Value 6 Key 7 Value 7 1 2
  • 44. Hash map of key to index to an array of structs Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8
  • 45. Cache misses with collision Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8 1
  • 46. Cache misses with collision Value 0 Value 1 Value 2 Value 3 Value 4 Value 5 Value 6 Value 7 Value 8 Key Index Key 0 1 Key 1 6 Key 2 4 Key 3 8 1 2
  • 47. New library memory layout TimeSeries rollup 0 TimeSeries rollup 1 TimeSeries rollup 2 TimeSeries rollup 3 TimeSeries rollup 4 TimeSeries rollup 5 TimeSeries rollup 6 TimeSeries rollup 7 TimeSeries rollup 8 ID Index ID 0 1 ID 1 6 ID 2 4 ID 3 8 Raw data in Rollup out
  • 48. Changing hash map implementations • java.util.HashMap (uses separate chaining and boxes primitives) to make a long -> int lookup • Allocations galore
 • net.openhft.koloboke primitive open hash map • 45% improvement
 For the JVM use libraries like https://github.com/OpenHFT/Koloboke. For C++ try https://github.com/preshing/CompareIntegerMaps or similar.
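To make the "hash map of ID to index into an array of rollups" layout concrete, here is a minimal sketch of an open-addressing `long -> int` map with linear probing and keys and values co-located in one primitive array. It is not Koloboke's API or implementation; the class `LongIntMap`, the reserved `EMPTY` key, the fixed capacity (no resizing or removal), and the `sums` array are all assumptions made for the example.

```java
// Hypothetical fixed-capacity open-addressing map from long keys to int values.
final class LongIntMap {
    private static final long EMPTY = Long.MIN_VALUE; // assumption: never used as a real key
    static final int MISSING = -1;                     // returned when a key is absent

    private final long[] table; // interleaved: [key0, value0, key1, value1, ...]
    private final int mask;
    private int size;

    LongIntMap(int capacityPowerOfTwo) {
        if (Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        table = new long[capacityPowerOfTwo * 2];
        for (int i = 0; i < table.length; i += 2) {
            table[i] = EMPTY;
        }
        mask = capacityPowerOfTwo - 1;
    }

    private int slotOf(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // multiplicative mixing of the key bits
        return ((int) (h >>> 32)) & mask;
    }

    void put(long key, int value) {
        if (key == EMPTY) {
            throw new IllegalArgumentException("reserved key");
        }
        int slot = slotOf(key);
        while (true) {
            long k = table[2 * slot];
            if (k == EMPTY) {
                if (size == mask) {
                    throw new IllegalStateException("map full"); // keep one slot empty so probes terminate
                }
                table[2 * slot] = key;
                table[2 * slot + 1] = value;
                size++;
                return;
            }
            if (k == key) {
                table[2 * slot + 1] = value; // overwrite existing entry
                return;
            }
            slot = (slot + 1) & mask; // linear probe: the next slot is adjacent in memory
        }
    }

    int get(long key) {
        int slot = slotOf(key);
        while (true) {
            long k = table[2 * slot];
            if (k == EMPTY) {
                return MISSING;
            }
            if (k == key) {
                return (int) table[2 * slot + 1]; // value sits next to its key
            }
            slot = (slot + 1) & mask;
        }
    }

    public static void main(String[] args) {
        LongIntMap idToIndex = new LongIntMap(16);
        double[] sums = new double[16]; // stand-in for the array of time series rollups
        idToIndex.put(1001L, 0);
        idToIndex.put(2002L, 1);
        sums[idToIndex.get(2002L)] += 3.5; // one map lookup, then a direct array access
        System.out.println(sums[1]);
    }
}
```

Because nothing is boxed and there are no per-entry list nodes, a lookup without collisions touches one region of the table, and a colliding lookup usually probes a neighbouring slot instead of chasing a pointer.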
  • 49. Access patterns Hot data Cold data object 0 object 1 object 2 object 3 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4 Field 0 Field 1 Field 2 Field 3 Field 4
  • 50. Group fields accessed together Hot fields Cold fields object 1 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 0 Field 1 Field 2 Field 3 Field 4 Field 3 Field 4 Field 3 Field 4 Field 3 Field 4
  • 51. Results of separating hot and cold data A hot loop run about once every 500 ms • Old - Hot and cold data kept together • 5 cache lines per time series • Took anywhere between 62-70 ms
 • New - Hot and cold data kept separate • 3 cache lines of hot data per time series • Took anywhere between 40-45 ms
 • 35% improvement
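One way to picture the hot/cold split is a store that packs the per-datapoint fields of every time series into one contiguous primitive array and keeps rarely read metadata in separate arrays. The sketch below is illustrative only: the class `HotColdStore` and the fields `sum`, `max`, `count`, `name`, and `createdAtMillis` are hypothetical, not the fields of the real library.

```java
// Hypothetical store that keeps hot (per-datapoint) fields apart from cold metadata.
final class HotColdStore {
    private static final int HOT_STRIDE = 3; // hot fields per series
    private static final int SUM = 0;
    private static final int MAX = 1;
    private static final int COUNT = 2;      // stored as a double, exact up to 2^53

    private final double[] hot;              // [sum0, max0, count0, sum1, max1, count1, ...]
    private final String[] name;             // cold: read only for display or debugging
    private final long[] createdAtMillis;    // cold

    HotColdStore(int seriesCount) {
        hot = new double[seriesCount * HOT_STRIDE];
        name = new String[seriesCount];
        createdAtMillis = new long[seriesCount];
        for (int i = 0; i < seriesCount; i++) {
            hot[i * HOT_STRIDE + MAX] = Double.NEGATIVE_INFINITY;
        }
    }

    void describe(int series, String seriesName, long createdAt) {
        name[series] = seriesName;
        createdAtMillis[series] = createdAt;
    }

    // Hot path: touches only the packed hot array.
    void add(int series, double value) {
        int base = series * HOT_STRIDE;
        hot[base + SUM] += value;
        if (value > hot[base + MAX]) {
            hot[base + MAX] = value;
        }
        hot[base + COUNT] += 1;
    }

    double sum(int series) { return hot[series * HOT_STRIDE + SUM]; }
    double max(int series) { return hot[series * HOT_STRIDE + MAX]; }
    long count(int series) { return (long) hot[series * HOT_STRIDE + COUNT]; }
    String name(int series) { return name[series]; }

    public static void main(String[] args) {
        HotColdStore store = new HotColdStore(4);
        store.describe(2, "latency.service-a", 0L);
        store.add(2, 10.0);
        store.add(2, 30.0);
        System.out.println(store.name(2) + " sum=" + store.sum(2) + " max=" + store.max(2));
    }
}
```

A loop over all series that only reads or updates the hot fields then pulls in no cache lines occupied by names or timestamps, which is the effect the 5 versus 3 cache lines comparison above describes.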
  • 53. Old vs New • Concurrent -> single threaded • Locks gone • Array based data structures • Zero allocations • Extensive batching and hardware prefetching
 
 • Multiple hash maps -> a single hash map look up
  • 55. Old vs New 76 K/sec VS 2.1 M/sec
  • 59. Amdahl’s law - 35% (plot: overall speedup, 0 to 1.6, against library speedup, 1 to ∞)
  • 60. CPU
  • 62. 35% of the profile but 3.4x improvement? • Amdahl’s law • Max 1.54x improvement if 35% => 0%
 • Why 3.4x ? • When you use less cache, you leave more for others - thus speeding up other code too
 • Lesson • A profiler is a necessary tool, but not a substitute for informed design
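The 1.54x ceiling follows directly from Amdahl's law. As a worked example, with $p$ the fraction of the profile spent in the library and $s$ the speedup of the library alone (symbols chosen here for illustration):

```latex
S_{\text{overall}}(s) = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad p = 0.35

\lim_{s \to \infty} S_{\text{overall}}(s) = \frac{1}{1 - p} = \frac{1}{0.65} \approx 1.54
```

Even an infinitely fast library would predict at most about 1.54x overall, so the measured 3.4x cannot come from the library's own 35% share alone; the rest has to come from the remaining 65% running faster too, which is the cache-pressure argument made above.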
  • 64. Closing remarks / rant • “Write code first, profile later” = BAD
 • Excessive encapsulation leads to myopic decisions being made re: perf • allocations • “thread safe” code
 • Beware of micro benchmarks
  • 65. SignalFx Thank You! Rajiv Kurian rajiv@signalfx.com @rzidane360 WE’RE HIRING jobs@signalfx.com @SignalFx - signalfx.com
  • 67. Composition in C Struct A Struct B Struct B1 embedded Struct B2 embedded int int int int int int int int
  • 68. int int int int Composition in Java Object A Object B Object B1 Object B2 int int int int B1 B2
  • 71. Actual layout? int int Object B Object B1 B1 B2 B (header) B1* B2*
  • 72. Actual layout? int int Object B Object B1 B1 B2 B (header) B1* B2* B1 (header) int int
  • 73. Actual layout? int int Object B Object B1 B1 B2 B (header) B1* B2* B1 (header) int int
  • 74. Actual layout? int int int int Object B Object B1 Object B2 B1 B2 B (header) B1* B2* B1 (header) int int
  • 75. Actual layout? int int int int Object B Object B1 Object B2 B1 B2 B (header) B1* B2* B1 (header) int int B2 (header) int int
  • 76. Actual layout? int int int int Object B Object B1 Object B2 B1 B2 B (header) B1* B2* B1 (header) int int B2 (header) int int
  • 77. Potential layout after GC int int int int Object B Object B1 Object B2 B1 B2 B (header) B1* B2* Other data B1 (header) int int Other data B2 (header) int int
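On the JVM, one common way to get back the contiguous layout that C struct embedding gives is to flatten the fields into a primitive array and address them by offset. The sketch below assumes, for illustration, that B1 and B2 each hold two ints; the class `FlatComposite` and its accessors are hypothetical names.

```java
// Hypothetical flattened replacement for "Object B holding references to B1 and B2".
final class FlatComposite {
    private static final int STRIDE = 4; // b1.x, b1.y, b2.x, b2.y per record
    private final int[] data;            // all records back to back, no headers or pointers

    FlatComposite(int recordCount) {
        data = new int[recordCount * STRIDE];
    }

    void set(int record, int b1x, int b1y, int b2x, int b2y) {
        int base = record * STRIDE;
        data[base] = b1x;
        data[base + 1] = b1y;
        data[base + 2] = b2x;
        data[base + 3] = b2y;
    }

    int b1x(int record) { return data[record * STRIDE]; }
    int b2y(int record) { return data[record * STRIDE + 3]; }

    // Sequential scan over every field of every record: one linear walk through memory.
    long sumAll() {
        long total = 0;
        for (int v : data) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        FlatComposite records = new FlatComposite(2);
        records.set(0, 1, 2, 3, 4);
        records.set(1, 5, 6, 7, 8);
        System.out.println(records.sumAll());
    }
}
```

The trade-off is that the record is no longer an object with named fields, so the accessors carry the layout knowledge; in exchange the garbage collector cannot scatter the parts of one record across the heap.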
  • 78. SignalFx Techniques inspired by HPC that have improved our pipeline Separate the control and data planes
  • 79. Frequent Infrequent A networking concept Routing table Packets in Packets out Routing data Key Value
  • 80. What the control and data planes do In networking terminology: • Data plane - Defines the part that decides what to do with packets arriving on an inbound interface— Frequent • Control plane - Defines the part that is concerned with drawing the network map or routing table—Infrequent
  • 81. The goal of control and data plane separation DO NOT slow the frequent path because of the infrequent path
  • 82. Runtime configuration variables Worker thread Configuration variables (volatile/atomic) Setter thread while (1) { process_data_using_configuration_variables(); } Flag 0 Flag 1 Flag 2 Flag 3
  • 83. Flag 0 Flag 1 Flag 2 Flag 3 Runtime configuration variables Worker thread Configuration variables (volatile/atomic) Setter thread Flag 0 Flag 1 Flag 2 Flag 3 while (1) { cache_configuration_variables(); process_a_ton_of_stuff(); } Cached configuration variables
  • 84. Volatile/atomic flag vs cached local flag • Before: all run time flags (used on every data point) are volatile/atomic loads • After: all run time flags are cached and refreshed on each run loop • About 8% improvement in datapoints/second. Others might see more or less
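The cached-flag pattern can be sketched in Java as follows. This is a minimal illustration under assumptions: the class `FlagCachingWorker`, the flag `dropNegative`, and the batch-of-doubles workload are hypothetical and stand in for whatever the real worker loop processes.

```java
// Hypothetical worker that reads a runtime flag once per batch instead of once per datapoint.
final class FlagCachingWorker {
    // Control plane: written rarely by a setter thread.
    private volatile boolean dropNegative;

    void setDropNegative(boolean value) {
        dropNegative = value;
    }

    // Data plane: processes one batch and returns the sum of accepted values.
    double processBatch(double[] batch) {
        final boolean drop = dropNegative; // single volatile load, cached in a local
        double sum = 0;
        for (double v : batch) {
            if (drop && v < 0) {
                continue; // the hot loop only touches the local copy
            }
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        FlagCachingWorker worker = new FlagCachingWorker();
        double[] batch = {1.0, -2.0, 3.0};
        System.out.println(worker.processBatch(batch));
        worker.setDropNegative(true); // takes effect on the next batch
        System.out.println(worker.processBatch(batch));
    }
}
```

The cost of this design is that a flag change is only observed at the next batch boundary, so the batch size bounds how stale a setting can be.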