Pavel Odintsov introduces FastNetMon DDoS prevention and explains how the team migrated its analytics to ClickHouse to handle large data volumes. Pavel is CTO and co-founder of FastNetMon LTD.
FastNetMon: https://fastnetmon.com/
Meetup: https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/282872933/
6. FastNetMon: Detection Logic
Detection type:
• Threshold-based (based on the host's average traffic)
Threshold types:
• Using total traffic
• Using total pps rate
• Per protocol
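The threshold logic above can be sketched roughly as follows. This is a minimal illustration, not FastNetMon's actual implementation: the `HostTraffic` and `Thresholds` structures, the field names, and the threshold values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HostTraffic:
    """Average traffic counters for one host (hypothetical structure)."""
    mbps: float             # total traffic, Mbit/s
    pps: float              # total packet rate, packets/s
    per_protocol_pps: dict  # packet rate per protocol, e.g. {"udp": 50.0}


@dataclass
class Thresholds:
    """Configured limits (hypothetical structure and values)."""
    mbps: float
    pps: float
    per_protocol_pps: dict


def exceeded_thresholds(traffic: HostTraffic, limits: Thresholds) -> list:
    """Return the names of all thresholds the host's average traffic exceeds."""
    hits = []
    if traffic.mbps > limits.mbps:
        hits.append("total_mbps")
    if traffic.pps > limits.pps:
        hits.append("total_pps")
    # Only protocols that have a configured limit are checked.
    for proto, limit in limits.per_protocol_pps.items():
        if traffic.per_protocol_pps.get(proto, 0.0) > limit:
            hits.append(f"{proto}_pps")
    return hits


limits = Thresholds(mbps=1000, pps=100_000, per_protocol_pps={"udp": 50_000})
host = HostTraffic(mbps=300, pps=150_000,
                   per_protocol_pps={"udp": 120_000, "tcp": 30_000})
print(exceeded_thresholds(host, limits))  # ['total_pps', 'udp_pps']
```

Here the host stays under the total traffic limit but trips both the total packet rate limit and the per-protocol UDP limit.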
10. Our Challenges with Metrics
● Cardinality: a medium-sized network has ~1m IPv4 hosts with 32 metrics each
● Value range from 0 to UINT64_MAX: traffic in Mbit/s or packets/s
● 1 s precision
● Insert in very large batches
● Customers love top-k queries
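To give a sense of scale, the figures above can be multiplied out. This is back-of-the-envelope arithmetic only: it assumes every host reports every metric every second (an upper bound) and that each value is stored as a raw 8-byte unsigned integer before compression, neither of which is stated on the slide.

```python
HOSTS = 1_000_000        # ~1m IPv4 hosts (from the slide)
METRICS_PER_HOST = 32    # from the slide
BYTES_PER_VALUE = 8      # assumption: one UInt64 per value, uncompressed
SECONDS_PER_DAY = 86_400

series = HOSTS * METRICS_PER_HOST
points_per_second = series                 # 1 s precision: one point per series per second
points_per_day = points_per_second * SECONDS_PER_DAY
raw_bytes_per_second = points_per_second * BYTES_PER_VALUE

print(f"unique series:      {series:,}")
print(f"points per second:  {points_per_second:,}")
print(f"points per day:     {points_per_day:,}")
print(f"raw values per sec: {raw_bytes_per_second / 1e6:.0f} MB")
```

Under these assumptions the upper bound is 32 million unique series and roughly 2.76 trillion data points per day.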
11. Graphite As Storage for Metrics, 2015
● “.” is the path delimiter, so IPs and prefixes have to be escaped and look ugly: 10_1_2_3, 10_1_2_3_24
● Limited by performance of single CPU core
● Disk space hungry datastore format
● Whisper is simple and easy to implement
● Graphite is not well maintained and is broken in recent Debian / Ubuntu
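Because Graphite splits metric names on “.”, an IP address or prefix has to be rewritten before it can be used as a single path component. A minimal sketch of such an escaping step is shown below; the function name `graphite_safe` is hypothetical, and the only thing taken from the slide is the resulting format (10_1_2_3 and 10_1_2_3_24).

```python
def graphite_safe(address: str) -> str:
    """Replace characters that Graphite would treat as separators.

    "." is Graphite's path delimiter and "/" appears in CIDR prefixes,
    so both are mapped to "_".
    """
    return address.replace(".", "_").replace("/", "_")


print(graphite_safe("10.1.2.3"))     # 10_1_2_3
print(graphite_safe("10.1.2.3/24"))  # 10_1_2_3_24
```

Note that the mapping is lossy: a host and a prefix are only distinguishable by the number of components in the escaped name.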
15. InfluxDB As Storage for Metrics, 2016
● Allows “.” in metric names
● Pretty compact datastore format
● Automated retention
● Native Grafana support
● Can use multiple CPU cores
● Easy installation, just a single binary
● May need tens of minutes to load a large database
● Uses lots of memory
● Top-k query is extremely slow
● Does not scale beyond 2m metrics per second
● Queries over a few days of data are very slow
16. ClickHouse As Storage for Metrics, 2022
● Allows “.” in metric names
● Pretty compact datastore format
● Automated retention
● Plugin for Grafana
● Can use multiple CPU cores
● Requires SSE 4.2 :(
● Top-k query is pretty fast
● Supports unlimited cardinality
● Queries over a few days of data finish in reasonable time
● Ability to store flows!
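As an illustration of the kind of top-k query mentioned above, the sketch below builds a ClickHouse SQL statement that returns the k hosts with the highest peak value of a metric over a recent time window. The table name `host_metrics` and the columns `host`, `ts`, `packets_per_second` and `bits_per_second` are hypothetical; the talk does not show FastNetMon's actual schema.

```python
# Hypothetical column names; the real schema is not shown in the slides.
ALLOWED_METRICS = {"packets_per_second", "bits_per_second"}


def top_k_query(metric: str, k: int = 10, minutes: int = 60) -> str:
    """Build a ClickHouse query for the k hosts with the highest peak of `metric`."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {metric}")
    # Force integers so nothing unexpected is interpolated into the SQL text.
    k, minutes = int(k), int(minutes)
    return (
        f"SELECT host, max({metric}) AS peak "
        "FROM host_metrics "  # hypothetical table name
        f"WHERE ts >= now() - INTERVAL {minutes} MINUTE "
        "GROUP BY host "
        "ORDER BY peak DESC "
        f"LIMIT {k}"
    )


print(top_k_query("packets_per_second", k=5, minutes=15))
```

The resulting string could be sent through any ClickHouse client; the whitelist of metric names is there because column names cannot be passed as ordinary query parameters.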
17. ClickHouse vs InfluxDB
                    InfluxDB                 ClickHouse
Cardinality         < 1m of unique series    Battle tested with 16m+ unique series
Metrics / s         < 1m per second          10m+ per second
Top-k performance   Extremely slow           Good
Data format         Inefficient, text        Very efficient, binary
Query syntax        Counterintuitive         Well-known SQL
Multi CPU support   Limited                  Brilliant, scales linearly
Grafana             Native                   Plugin based