Tempesta FW
Alexander Krizhanovsky
NatSys Lab.
ak@natsys-lab.com
What Is Tempesta FW?
FireWall: layer 3 (IP) – layer 7 (HTTP) filter
FrameWork: high performance and flexible platform to build intelligent
DDoS mitigation systems and Web Application Firewalls (WAF)
First and only hybrid of HTTP accelerator and FireWall
Directly embedded into Linux TCP/IP stack
JIT Domain Specific Language (DSL) for traffic processing
This is Open Source (GPLv2)
Challenges
! per-request resource consumption
! drop early or die
! high concurrency
Mostly about application-layer (HTTP) DDoS:
● small HTTP requests and short-lived TCP connections
● requests prevail over responses
● a lot of concurrent connections
● fine-grained filtration rules at all network
layers
Existing Solutions:
How To Filter HTTP requests?
Modules on Application HTTP servers
Firewalls
Deep Packet Inspection (DPI)
Existing Solutions
Deep Packet Inspection (DPI) - not an active TCP participant
● can't accelerate content to relieve the protected Web resource under DDoS
● SSL termination is hard
User-space HTTP accelerators are too slow because of context switches
and copies, and are designed for old hardware
Firewalls – low layers only (IP and partially TCP)
● rules generation for app. layer is messy (fail2ban etc.)
● no persistence for dynamic rules
L7 DDoS Is About Performance:
How to Accelerate a Web Application
DDoS mitigation CDN
Filter
● DPI
● FireWall
+ HTTP accelerator
Accelerator
● HTTP server
Extra communications
Hard to manage
Web Application Firewall (WAF)
Modern WAF:
● Heavy buzzwords: XHTML,
WSDL,...
● Machine learning
● Tons of regexps
● Run on top of common Web
server
WAF Accelerator!
(~ Web accelerator)
What's Wrong With Traditional
Web Servers & Firewalls
User-space & monolithic OS kernel (an exokernel approach helps a lot):
● context switches
● copies
● no uniform access to information on all network layers
No flexibility to analyze and filter traffic on all layers
Designed for old hardware and/or oblivious to hardware features
Tempesta FW Architecture
Synchronous Sockets
Reading from a socket in a context other
than deferred interrupt context is
asynchronous to arrival of TCP segments
Synchronous Sockets:
● process packets while they're hot in
CPU caches
● no queues – do work when data is
ready
http://natsys-lab.blogspot.ru/2013/03/whats-wrong-with-sockets-performance.html
Faster HTTP Parser
Switch-driven (widespread):
poor I-cache (instruction cache) usage & CPU intensive
Table-driven (with compression):
poor D-cache usage
Hybrid State Machine
(combinations of two previous)
Direct jumps (Ragel)
PCMPxSTRx (SSE4.2 string instructions, ~strspn(3); very limited)
http://natsys-lab.blogspot.ru/2014/11/the-fast-finite-state-machine-for-http.html
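Switch-driven parser loop (steps 1-4 show the cost of one state transition):
while (*++str_ptr) {
    switch (state) {        /* (3) state lookup on every character */
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
            break;
        case 'b':
            ...
            state = 2;      /* (1) set the next state */
            break;
        }
        break;              /* (2) jump back to the while */
    case 2:                 /* (4) only now reach the code for state 2 */
        ...
    }
}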
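For contrast, the direct-jump style listed above can be sketched in plain C with labels and goto. This is only an illustration, not Tempesta's parser: the function name starts_with_get and the tiny grammar ("GET", one or more spaces, then "/") are assumptions made for the example. Each state is a code location, so moving to the next state is a single jump with no switch (state) lookup.

```c
#include <stddef.h>

/*
 * Hypothetical direct-jump FSM: returns 1 if the buffer starts with
 * "GET", followed by one or more spaces and then '/', otherwise 0.
 * Every state is a label; a transition is a plain goto.
 */
static int starts_with_get(const char *p, size_t len)
{
    const char *end = p + len;

    /* initial state: expect 'G' */
    if (p == end)
        return 0;
    if (*p == 'G') { ++p; goto state_e; }
    return 0;

state_e:
    if (p == end)
        return 0;
    if (*p == 'E') { ++p; goto state_t; }
    return 0;

state_t:
    if (p == end)
        return 0;
    if (*p == 'T') { ++p; goto state_sp; }
    return 0;

state_sp:
    /* at least one space is required after the method */
    if (p == end)
        return 0;
    if (*p == ' ') { ++p; goto state_uri; }
    return 0;

state_uri:
    if (p == end)
        return 0;
    if (*p == ' ') { ++p; goto state_uri; }  /* stay here: skip extra spaces */
    return *p == '/';
}
```

Calling starts_with_get(buf, len) on a request buffer touches each byte once and never re-dispatches on a state variable.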
Generic Finite State Machine (GFSM)
Context switching between protocol FSMs (ICAP etc.):
(1) HTTP FSM: receive & process HTTP request;
(2) ICAP FSM: the callback is called at particular HTTP state,
current HTTP FSM state is push()'ed to stack
(3) ICAP FSM: send the request to ICAP server and get results
(4) HTTP FSM: the callback is called at particular ICAP state,
stored HTTP FSM state is pop()'ed back
Foundation for TL programs execution (~coroutine)
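The push()/pop() switching in steps (1)-(4) can be sketched as a small state stack. This is a minimal illustration under assumptions: the type and function names (struct gfsm, gfsm_switch, gfsm_resume) and the fixed stack depth are hypothetical and are not taken from the Tempesta sources.

```c
#include <stddef.h>

#define GFSM_STACK_DEPTH 4  /* assumed small fixed nesting depth */

enum fsm_id { FSM_HTTP, FSM_ICAP };

struct fsm_frame {
    enum fsm_id fsm;  /* which protocol FSM is running */
    int state;        /* its current state */
};

struct gfsm {
    struct fsm_frame cur;                      /* FSM executing now */
    struct fsm_frame stack[GFSM_STACK_DEPTH];  /* suspended FSMs */
    size_t depth;
};

static void gfsm_init(struct gfsm *g, enum fsm_id fsm, int state)
{
    g->cur.fsm = fsm;
    g->cur.state = state;
    g->depth = 0;
}

/* Suspend the current FSM (push) and start another one.
 * Returns 0 on success, -1 if the stack is full. */
static int gfsm_switch(struct gfsm *g, enum fsm_id fsm, int state)
{
    if (g->depth == GFSM_STACK_DEPTH)
        return -1;
    g->stack[g->depth++] = g->cur;
    g->cur.fsm = fsm;
    g->cur.state = state;
    return 0;
}

/* Resume the most recently suspended FSM (pop).
 * Returns 0 on success, -1 if nothing is suspended. */
static int gfsm_resume(struct gfsm *g)
{
    if (g->depth == 0)
        return -1;
    g->cur = g->stack[--g->depth];
    return 0;
}
```

In this sketch an HTTP callback would call gfsm_switch(&g, FSM_ICAP, 0) at step (2), and the ICAP callback would call gfsm_resume(&g) at step (4) to continue the HTTP FSM from its saved state.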
Tempesta DB:
Web-cache & Filter
mmap()'ed & mlock()'ed in-memory persistent database –
no disk IO (size is limited, but can be processed in softirq)
Cache conscious Burst Hash Trie:
● NUMA-aware: independent databases for each node
(the node is selected by the least significant bits);
● Can be made lock-free
● Almost zero-copy (only NIC → disk)
● Suitable to store fixed- and variable-size records
● Quick for large string keys (e.g. URI) as well as for integer keys
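One way to read the NUMA bullet is that the low bits of a key hash pick the per-node database. The sketch below shows that idea only. It assumes the bits come from a hash of the key, assumes a power-of-two node count, and uses FNV-1a purely as a stand-in hash; none of this is confirmed by the slides, and node_for_key is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit hash, used here only as a stand-in for the real hash. */
static uint64_t fnv1a64(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint64_t h = 14695981039346656037ULL;  /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;             /* FNV prime */
    }
    return h;
}

/*
 * Pick the per-node database from the least significant bits of the
 * key hash. Assumption: nodes is a power of two, so masking works.
 */
static unsigned node_for_key(const void *key, size_t len, unsigned nodes)
{
    return (unsigned)(fnv1a64(key, len) & (nodes - 1));
}
```

The same key always maps to the same node, so a lookup for a URI only has to touch the database that is local to that node.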
Filtering
Dynamic persistent rules with eviction (Tempesta DB)
Set of callbacks on all network layers:
● classify_ipv{4,6} - called for each received IPv4/IPv6 client packet
● classify_tcp - called for each received TCP segment
● classify_conn_{estab,close} - a client connection is
established/closed
● classify_tcp_timer_retrans - called on retransmissions to client
● …and other TCP stuff
● and, of course, HTTP processing phases
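A callback set like the one listed above could be modelled as a table of function pointers. The hook names classify_ipv4 and classify_tcp come from the slide, but the signatures, the verdict enum and the run_* helpers below are assumptions for illustration, not the actual Tempesta FW interface.

```c
#include <stddef.h>
#include <stdint.h>

enum verdict { PASS, DROP };

/* Hypothetical hook table: names mirror the slide, signatures are assumed. */
struct classifier {
    enum verdict (*classify_ipv4)(uint32_t src_addr);
    enum verdict (*classify_tcp)(uint16_t src_port, size_t payload_len);
};

/* Run the IPv4 hook if the module registered one; otherwise pass. */
static enum verdict run_ipv4(const struct classifier *c, uint32_t src_addr)
{
    return c->classify_ipv4 ? c->classify_ipv4(src_addr) : PASS;
}

/* Run the TCP hook if the module registered one; otherwise pass. */
static enum verdict run_tcp(const struct classifier *c, uint16_t src_port,
                            size_t payload_len)
{
    return c->classify_tcp ? c->classify_tcp(src_port, payload_len) : PASS;
}

/* Example module hook: drop a single address at the IP layer. */
static enum verdict drop_test_addr(uint32_t src_addr)
{
    return src_addr == 0x01010101u ? DROP : PASS;  /* 1.1.1.1 */
}
```

A module fills in only the hooks it needs; hooks left as NULL default to PASS, so a packet is dropped as early as the lowest layer that has a matching rule.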
Tempesta Language
# One-shot function called for each ingress IPv4 packet
if (tdb.select("ip_filter", pkt.src))
    filter(pkt, DROP);
# Sample (deliberately senseless) multi-layer rule
if ((req.user_agent =~ /firefox/i && client.addr == 1.1.1.1)
    || length(req.uri) > 256)
    # Block the client at the IP layer, so it will be filtered
    # efficiently w/o further HTTP processing.
    tdb.insert("ip_filter", client.addr);
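The two rules above can be traced in plain C to show the intended effect: an HTTP-layer match inserts the client address into a table that the IP-layer rule consults first. This is a sketch under assumptions, not generated Tempesta Language output: the fixed-size array stands in for the "ip_filter" Tempesta DB table, and all names (struct ip_filter, http_rule, ip_rule) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILTER_CAP 64  /* assumed capacity of the stand-in table */

struct ip_filter {
    uint32_t addr[FILTER_CAP];
    size_t n;
};

static int ip_filter_has(const struct ip_filter *f, uint32_t a)
{
    for (size_t i = 0; i < f->n; i++)
        if (f->addr[i] == a)
            return 1;
    return 0;
}

static void ip_filter_add(struct ip_filter *f, uint32_t a)
{
    if (!ip_filter_has(f, a) && f->n < FILTER_CAP)
        f->addr[f->n++] = a;
}

/* Case-insensitive substring search, standing in for =~ /.../i. */
static int contains_nocase(const char *hay, const char *needle)
{
    size_t n = strlen(needle);

    for (; *hay; hay++) {
        size_t i = 0;
        while (i < n && hay[i] &&
               tolower((unsigned char)hay[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return n == 0;
}

/* IP-layer rule: returns 1 if the packet must be dropped. */
static int ip_rule(const struct ip_filter *f, uint32_t src)
{
    return ip_filter_has(f, src);
}

/* HTTP-layer rule: on a match, block the client at the IP layer.
 * Returns 1 if the client was blocked. */
static int http_rule(struct ip_filter *f, uint32_t client,
                     const char *user_agent, const char *uri)
{
    if ((contains_nocase(user_agent, "firefox") && client == 0x01010101u)
        || strlen(uri) > 256) {
        ip_filter_add(f, client);
        return 1;
    }
    return 0;
}
```

After http_rule() blocks a client once, every later packet from that address is rejected by ip_rule() without any HTTP parsing.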
Benchmark (very outdated)
10-core Intel Xeon E7-4850
2.4GHz, 64GB RAM (one CPU
with 10 cores)
NIC RX and TX queues bound to
CPU cores
RFS enabled
Nginx: 10 workers, multi_accept,
sendfile, epoll, tcp_nopush and
tcp_nodelay
Features & TODO
Simple HTTP proxy, GFSM, classification hooks
Load balancing
Simple rate limiting module
Cluster failover
Filtering & simple HTTP DDoS protection
Web-cache – in progress
SSL/TLS (libressl) – in progress
Tempesta Language (advanced traffic processing) – in progress
Thanks!
Availability: https://github.com/natsys/tempesta
Contact: ak@natsys-lab.com

Tempesta FW: a FrameWork and FireWall for HTTP DDoS mitigation and Web Application Firewalls (WAF)
