AWS Pop Up Loft Munich
Analyzing and Processing Financial Market Data on AWS with Kinesis
Florian Benz
Engineering Team Lead
November 14, 2019
■ Founded in 2014
■ 115 Employees
■ Managing 1.8 billion euros
■ Developed in Munich
Our Company.
Our product.
International B2C + B2B Business.
AWS - Going Serverless.
Real-Time Market Data.
½ MiB/s
Show real time
24 GiB/day
Store all data
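The two figures above can be related with a quick calculation. The slides do not say whether ½ MiB/s is a peak or an average rate, so the sketch below only checks what each reading would imply; the variable names are illustrative.

```python
# Back-of-the-envelope check of the two figures on the slide.
MIB_PER_GIB = 1024
SECONDS_PER_DAY = 24 * 60 * 60

stream_rate_mib_s = 0.5   # from the slide
stored_gib_per_day = 24   # from the slide

# Average rate implied by 24 GiB spread evenly over a full 24-hour day.
avg_rate_mib_s = stored_gib_per_day * MIB_PER_GIB / SECONDS_PER_DAY

# Hours of streaming at a constant 0.5 MiB/s needed to accumulate 24 GiB.
hours_at_stream_rate = stored_gib_per_day * MIB_PER_GIB / stream_rate_mib_s / 3600

print(f"average over 24 h: {avg_rate_mib_s:.2f} MiB/s")    # 0.28 MiB/s
print(f"hours at 0.5 MiB/s: {hours_at_stream_rate:.1f}")   # 13.7
```

Under these numbers, 24 GiB/day corresponds to roughly 14 hours of data at ½ MiB/s, or to an average of about 0.28 MiB/s around the clock.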
{
  "isin": "DE0007480204",
  "currency": "EUR",
  "bid": 26.96,
  "ask": 27.2,
  "time": "2019-11-07T17:18:51Z"
  ...
}
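As a rough illustration of how a consumer might read such a record, here is a minimal Python sketch. It uses only the five fields shown above (the elided fields are left out), and the `Quote` class, `parse_quote` function, and derived `spread` and `mid` properties are hypothetical names, not part of the talk.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Quote:
    isin: str
    currency: str
    bid: Decimal
    ask: Decimal
    time: datetime

    @property
    def spread(self) -> Decimal:
        # Difference between the ask and the bid price.
        return self.ask - self.bid

    @property
    def mid(self) -> Decimal:
        # Midpoint between bid and ask.
        return (self.bid + self.ask) / 2


def parse_quote(raw: str) -> Quote:
    # parse_float=Decimal keeps prices exact instead of using binary floats.
    data = json.loads(raw, parse_float=Decimal)
    return Quote(
        isin=data["isin"],
        currency=data["currency"],
        bid=Decimal(data["bid"]),
        ask=Decimal(data["ask"]),
        # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z".
        time=datetime.fromisoformat(data["time"].replace("Z", "+00:00")),
    )


sample = (
    '{"isin":"DE0007480204","currency":"EUR",'
    '"bid":26.96,"ask":27.2,"time":"2019-11-07T17:18:51Z"}'
)
quote = parse_quote(sample)
print(quote.spread, quote.mid)  # 0.24 27.08
```

Parsing prices as `Decimal` is a design choice of this sketch; it avoids rounding artifacts when computing spreads.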
Real-Time Market Data.
Our Journey.
Timestream, Aurora, TimescaleDB, DynamoDB, Bigtable
● Looks like a good solution
● Data organized in time intervals
● Not available yet
Timestream
● PostgreSQL
● Handled the load w/ high resource usage
● Performance degraded over time
Aurora
● Managed service available
● Better than pure PostgreSQL for time-series data
● High resource usage
TimescaleDB
● Couldn’t handle the load
● Not easy to design for
● High costs for our use case
DynamoDB
● Advertised for market data
● Overhead due to connecting Clouds
● Minimal production setup is rather big
Bigtable
Separate Processing and Storage.
Streams?
Streams!
Amazon Kinesis
● Ingest data as it’s generated
● Process data on the fly
● Real-time analytics
Amazon Kinesis.
Data Streams
● Capture, process and store data streams
Data Firehose
● Load data streams into data stores
Amazon Kinesis.
Producer
● 1 MiB/s/shard or
● 1,000 PUT records/s/shard
Consumer
● 2 MiB/s/shard
● 5 GetRecords/s/shard
Amazon Kinesis Data Streams.
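The per-shard limits above translate into a simple sizing rule. The following is a minimal sketch, assuming traffic is spread evenly across shards and that all consumers read the full stream through the shared 2 MiB/s limit; the function name and the example record rates are hypothetical, since the slides give only the byte rate.

```python
import math

# Per-shard limits from the slide.
PRODUCER_MIB_PER_SHARD = 1.0
PRODUCER_RECORDS_PER_SHARD = 1000
CONSUMER_MIB_PER_SHARD = 2.0


def shards_needed(ingest_mib_s: float, records_per_s: float, consumers: int = 1) -> int:
    """Smallest shard count that satisfies the write and shared-read limits."""
    by_bytes = math.ceil(ingest_mib_s / PRODUCER_MIB_PER_SHARD)
    by_records = math.ceil(records_per_s / PRODUCER_RECORDS_PER_SHARD)
    # Every shared consumer reads the whole stream, so read volume scales with them.
    by_reads = math.ceil(ingest_mib_s * consumers / CONSUMER_MIB_PER_SHARD)
    return max(1, by_bytes, by_records, by_reads)


print(shards_needed(0.5, 800))               # 1
print(shards_needed(0.5, 2500))              # 3 (record rate is the bottleneck)
print(shards_needed(0.5, 800, consumers=5))  # 2 (shared reads are the bottleneck)
```

With small records, the 1,000 records/s limit can be reached well before the 1 MiB/s limit.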
Consumer - Enhanced Fan-Out
● Subscribe via HTTP/2
● Dedicated throughput per consumer, not shared with other consumers
● Low latency
Amazon Kinesis Data Streams.
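To see why push-based consumption matters for latency, consider the shared polling limit of 5 GetRecords calls per second per shard. The sketch below, with a hypothetical helper name, computes the shortest polling interval each consumer can sustain when several of them poll the same shard; it assumes the call budget is split evenly.

```python
GET_RECORDS_PER_SHARD_PER_S = 5  # shared polling limit per shard


def min_poll_interval_ms(consumers: int) -> float:
    """Shortest sustainable polling interval per consumer on one shard."""
    if consumers < 1:
        raise ValueError("at least one consumer is required")
    return 1000 * consumers / GET_RECORDS_PER_SHARD_PER_S


for n in (1, 2, 3):
    print(n, min_poll_interval_ms(n))
# 1 200.0
# 2 400.0
# 3 600.0
```

Consumers that subscribe via enhanced fan-out receive records over HTTP/2 instead of polling, so this shared call budget does not apply to them.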
Our Kinesis Setup.
Data Stream Firehose
Kinesis in Action.
Producer
● Kotlin w/ Spring Boot
● Kinesis Producer Library (KPL)
Processor
● Kotlin
Infrastructure
● Terraform
Choices around Kinesis.
KPL aggregates
● Daemon
● RecordMaxBufferedTime, default = 100ms
● Firehose deaggregates by default
● Lambda has to deaggregate
AWS SDK vs Kinesis Producer Library (KPL).
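The buffering behaviour behind RecordMaxBufferedTime can be illustrated with a small sketch. This is not the KPL and does not produce its aggregated record format; the `BufferingAggregator` class and its parameters are hypothetical and only show the idea of flushing a batch once the oldest record has waited long enough or a size limit would be exceeded.

```python
import time
from typing import Callable, List, Optional


class BufferingAggregator:
    """Collects records and releases them as a batch on a time or size trigger."""

    def __init__(
        self,
        max_buffered_ms: int = 100,       # mirrors the 100 ms default on the slide
        max_bytes: int = 1024 * 1024,     # assumed size limit for this sketch
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_buffered_s = max_buffered_ms / 1000
        self.max_bytes = max_bytes
        self._clock = clock
        self._records: List[bytes] = []
        self._size = 0
        self._first_at: Optional[float] = None

    def add(self, record: bytes) -> Optional[List[bytes]]:
        """Buffer a record; return the previous batch if it had to be flushed first."""
        flushed = None
        if self._records and self._size + len(record) > self.max_bytes:
            flushed = self.flush()
        if not self._records:
            # Remember when the oldest record of the new batch arrived.
            self._first_at = self._clock()
        self._records.append(record)
        self._size += len(record)
        return flushed

    def poll(self) -> Optional[List[bytes]]:
        """Return the batch if the oldest record has been buffered long enough."""
        if self._records and self._clock() - self._first_at >= self.max_buffered_s:
            return self.flush()
        return None

    def flush(self) -> List[bytes]:
        """Hand out the current batch and reset the buffer."""
        batch = self._records
        self._records = []
        self._size = 0
        self._first_at = None
        return batch
```

The trade-off is the one named on the slide: a longer buffer time packs more records into each PUT but adds latency, and every consumer downstream has to unpack the batch again.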
Issues.
● Written in C++
● Child process of the main process
● Java release with precompiled binaries
Issue - KPL Language Support.
Issue - Processing Performance.
Issue - Terraform Support.
Evolution.
● Stream data to all clients
● Use data to improve UX
Awesome UI Components.
● Try to reduce latency further
● Reduce configuration options
Kinesis Enhanced Fan-Out.
● Optimize data on S3 for Athena
● Employ Athena to get insights
Amazon Athena.
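The slides do not say how the data on S3 is to be optimized. One common approach, shown here purely as an assumption, is to lay out objects under date-based `key=value` prefixes so that Athena can treat them as partitions and scan only the days a query needs. The function name and the `quotes` base prefix are hypothetical.

```python
from datetime import datetime, timezone


def partition_prefix(ts: datetime, base: str = "quotes") -> str:
    """Build a Hive-style, date-partitioned S3 key prefix for a timestamp."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    # Normalize to UTC so that a record always lands in the same partition.
    ts = ts.astimezone(timezone.utc)
    return f"{base}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"


ts = datetime(2019, 11, 7, 17, 18, 51, tzinfo=timezone.utc)
print(partition_prefix(ts))  # quotes/year=2019/month=11/day=07/
```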
Questions?
Andreas Schranzhofer, CTO
andreas@scalable.capital
@Schranzhofer
