Как разработать DBFW с нуля

ptsecurity.com
Database Firewall From Scratch
Arseny Reutov
areutov@ptsecurity.com
Denis Kolegov
dkolegov@ptsecurity.com

About us
• Arseny Reutov
• Head of application security research at Positive Technologies
• Member of Positive Hack Days (https://phdays.com) conference board
• Occasional web security blogger (https://raz0r.name)
• Denis Kolegov
• Team lead of Application Firewall research at Positive Technologies
• PhD, associate professor at Tomsk State University
• Web security micro blogger (https://twitter.com/dnkolegov)

Outline
• The stuff we are going to talk about is joint work of PT Application Firewall
Research Team developing a database firewall prototype as a part of our application
firewall
• Thanks to
§ Arseny Reutov
§ Denis Kolegov
§ Vladimir Kochetkov
§ Igor Kanygin
§ Nikolay Tkachenko
§ Ivan Hudyashov
§ Sergey Grechnev
§ Sergey Reshetnikov

Agenda
• Intro
• WAF and DBFW
• Related Work
• Our Prototype
• Parser
• Protectors
§ Profiler
§ Dejector
§ SQLi
§ Access Control
§ IAM
• Roadmap

What is WAF for most people?
WAF is a blackbox system that applies a
set of rules to an HTTP conversation

WAF protection is pattern
matching

For a WAF web application is just a
series of HTTP transactions

Web Applications in 2017
by Jeff Atwood @codinghorror

WAF nowadays
Web
Application
Firewall

WAF nowadays
Client-Side
Firewall
(waf.js)
Web
Application
Firewall

WAF nowadays
Inspected Application
Module (IAM)
Client-Side
Firewall
(waf.js)
Web
Application
Firewall

WAF nowadays
Inspected Application
Module (IAM)
Client-Side
Firewall
(waf.js)
Database Firewall
(DBFW)
Web
Application
Firewall
(WAF)

What is Database Firewall?
Database firewalls are a type of application firewalls which
• Monitor database activity
• Detect database specific attacks
• Protect sensitive information stored in the databases
• Implement adequate access control models

Database Firewall Deployment
Like WAFs database firewalls can be deployed
• in proxy mode
• in sniffer mode via a SPAN port (mirrored traffic)
• as a host-based agent

What Database Firewall Can Do?
Database firewalls can do several actions on each query:
• Pass
• Log for monitoring purposes
• Alert
• Rewrite query
• Block (either by dropping connection or by generating a native error code)

WAF vs DBFW
WAF
XSS
XXE
RCE
LFI
SSRF
IDOR
CSRF
Path Traversal
Open Redirect
Object Injection
Session Fixation
…

WAF vs DBFW
WAF
XSS
XXE
RCE
LFI
SSRF
IDOR
CSRF
Path Traversal
Open Redirect
Object Injection
Session Fixation
…
DBFW
Segregation of
Duties
Audit &
Monitoring
Sensitive Data
Discovery

WAF vs DBFW
WAF
XSS
XXE
RCE
LFI
SSRF
IDOR
CSRF
Path Traversal
Open Redirect
Object Injection
Session Fixation
…
DBFW
Segregation of
Duties
Audit &
Monitoring
Sensitive Data
Discovery
Access Control
SQL Injection
Buffer Overflow
Data Leakage Prevention
Data Masking

SQL Injection Detection: Green SQL
• Green SQL have been mod_security of DBFWs for many
years, but open source project is no longer maintained
• SQL Injection detection is based on risk score using metrics:
§ SQL comments
§ Sensitive tables
§ OR token
§ UNION token
§ Variable comparison
§ Always true expressions
§ and more

SQL Injection Detection: Machine Learning
SOFIA: An Automated Security Oracle for Black-Box Testing of SQL-Injection Vulnerabilities

SQL Injection Detection: Machine Learning
• ”SOFIA is significantly more accurate than antiSQLi and GreenSQL and
significantly faster than antiSQLi in classifying legitimate SQL
statements and SQLi attacks.”
• However, it takes lots of computing power to train the model since tree
operations are time expensive
• The algorithm is not tolerant to attacks during training

Architecture
• Core
§ Go
§ Python/Twisted
• Parser
§ ANTLR 4
§ ANTLR Grammars-v4
§ Python/C++
• Storage
§ Mongo
• UI
§ React
§ Redux
§ GraphQL

DBFW Data Flow
dbfwgo
(1) xsql query
(6) check
result
dbfwpython
(2) request
context
Web
server
Parser
DBMS
(10) xsql result
(7) response
context
(3) parser
request
(4) response
(AST, tokens, etc.)
(8) xsql query
(9) xsql result
Protector 1
…
Protector N
(5) query
attributes

Naïve Approach
Parser
Syntax analysis Protectors
Protector 1
…
Protector N
Concrete
syntax tree
mysql
query

Naïve Approach
Syntax analysis Protectors
Protector 1
…
Protector N
Concrete
syntax tree
Parser
mysql
Parser
tsql
Parser
plsql
query

CST vs AST
…
select 1
CST AST

SQL Parsing Flow
Mysql parser
AST
tsqlParser
plsqlParser
Syntax analysis Semantic analysis
…
Collector 1
…
Collector N
CST AST
Abstract syntax
analysis
query

SQL Parsing Implementation
ANTLR4 xsql parser
Code
generator
xsql grammar
SQL AST
specification
Parser’s
classes
AST Node
classes
query
xsql visitor
CST
AST

MySQL Parsing Time Results
Python 2.7
~ 0.3 / 0.001 sec
~ 0.4 / 0.001 sec
~ 1.5 / 0.003 sec
~ 1.6 / 0.004 sec
C++ 11
~ 0.01 / 0.00008 sec
~ 0.01 / 0.00029 sec
~ 0.06 / 0.0001 sec
~ 0.06 / 0.004 sec
Test
Minimal query* parsing
Minimal query full parsing
Heavy query** parsing
Heavy query full parsing
~ 983 sec
~ 2 KB/sec
~ 48 sec
~ 40 KB/sec
Bitrix queries file (2 MB)
~ 432 sec
~ 420 query/sec
~ 16 sec
~ 11000 query/sec
Wordpress queries (181540)
* Minimal query: select 1
** Heavy query: select * from (((((((select col1 from t1) as ttt))))))

Profiler
• SQL profiler is a basic protection mechanism implemented in all database firewalls
• It works like linting utilities or linters (e.g. eslint, pylint, cppint, etc.), but analyses
SQL queries and check if they satisfy security policy (SQL profile)
• The main goal is to prevent using of SQLi automatic tools and exploits
• SQL profile can be
§ Static: created by manual configuration
§ Dynamic: created by source code analysis tools

Example of SQL Profile Rules
Options
["log", 20]
Rule
no-union
["block", 150]max-query-length
["block", 6]max-columns
["block", 0]max-union
["block", "benchmark", "md5", "if"]function-blacklist
["block"]
entity-length
"off"no-subqueries
["block"]no-hex
["block"]no-stacked-queries
["block"]no-comments
["block"]no-os-commands

AST Example
Dejector is a context-free parse tree validation approach to preventing SQL Injection,
proposed by Hansen and Patterson in 2005
Given a set of known-good queries and the base formal grammar, Dejector builds a
new subgrammar that contains only the rules required to produce exactly the queries
in the known-good set
Strings recognized by the subgrammar are guaranteed to be structurally identical to
those in the known-good set
The subgrammar is then used with a parser generator such as bison or ANTLR to
produce a recognizer for the sublanguage

ANTLR-based Dejector
xsql Parser
Compiler
(3) Trusted
xsql queries
Aggregator
(4) CST
list
(6) Union CST
(7) Grammar
subxsql.g4
ANTLR
(8) subxsql
parser
(1) xsql.g4
(2) xsql
Parser
(5) Dejector
mode
Application
Security
language
mechanism
Developer

Strict Mode
a
b c
f
hg
i j
l
k
m
UCST ANTLR v4 grammar:
a: b | c | pd ;
c: f;
f: g | h;
g: i | j;
i: l;
…
p d

Strict Mode
a
b c
f
d
hg
i j
l
k
m
a
b
f
d
UCST CST

Strict Mode
a
b c
f
hg
i j
l
k
m
b1
f
d1
new UCST ANTLR v4 grammar:
a: b | c | pd | b1 | d1 ;
b1: f;
c: f;
f: g | h;
g: i | j;
i: l;
…
p d

Strict Dejector Parsing Time Results
Python 2.7 MySQL
~ 0.643 / 0.0019 sec
~ 0.67 / 0.002 sec
~ 0.33 / 0.003 sec
~ 0.32 / 0.009 sec
Python 2.7 SubMySQL
~ 0.09 / 0.0011 sec
~ 0.102 / 0.0011 sec
~ 0.09 / 0.001 sec
~ 0.18 / 0.005 sec
Test
SELECT * FROM a WHERE b='c'
SELECT * FROM a WHERE b BETWEEN 'c' AND 'd'
INSERT INTO passbook VALUES('a','b','c','d','e','f','g','h')
CREATE TABLE a (b int(5) AUTO_INCREMENT, c date, d
VARCHAR(255), e VARCHAR(255), f VARCHAR(255), g int(10), h
int(10), i float(10,2), j VARCHAR(255), PRIMARY KEY (b))
~ 1.54 / 0.003 sec ~ 0.09 / 0.001 secSELECT * FROM (((((((SELECT col1 FROM t1) AS ttt))))))*
* Query can not be derived in SubMySQL grammar

WAF + DBFW
• Suppose that we have both WAF and DBFW deployed:
Client WAF Web Server DBFW Database

HTTP & SQL Correlation
In order to correlate SQL queries with HTTP packets a host-based module can be
deployed on the web server which will append session cookie into each SQL query in
a comment section

• When these modified queries reach DBFW it can look up those
session identifiers in the database shared with WAF
• WAF holding access control policy for web users acts as
information point, i.e. it provides user information given a
session cookie
• DBFW serves as enforcement point, effectively blocking or
allowing queries

• What if we do not have a chance to deploy a host module (agent)?
• We can still try to correlate HTTP and SQL using time-throttled request processing
• Idea is that we process HTTP requests synchronously, observe emitted SQL queries, and associate
them with HTTP requests

SQL Injection Detection
• Using host-based agent we can effectively detect SQL Injections
• Agent injects into an SQL comment data about HTTP parameters that were
observed when executing SQL query

SQLi Detection Approach
• DBFW replaces each occurrence of HTTP parameter value
found in SQL query with a constant
• Then it tries to parse and get tokens firstly for the original query
and then for the second one with replaced constants
• If a number of tokens is different, an SQL Injection is reported
since constant replacement have caused changes in the query
structure

AST-based Detection
• A better approach is to compare ASTs instead
• After traversal of the ASTs, if differences are found, an SQL Injection is
reported because constant replacement have caused changes in AST

It decreases number of false positives. Does this mechanism decrease
false negatives too?
One of bypasses for owasp-modsecurity-crs found by Ivan Novikov
It is not detected by libinjection too due to the context issue
From Theory to Practice
curl 'localhost/index.html?id=1%20or%20true'
1%20or%20true
id=1.or-id
id=.1or-UTC_DATE—
)-sleep(9999
sleep(9999)
*/UNION SELECT password FROM users--

Show Me Impact
Michael Stepankin. Advanced Web Application Fuzzing

SQLi Detection Example
GET /app/?id=1%20or%20true HTTP/1.1
Host: example.com
Connection: close
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4

Host: example.com
Connection: close
/*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true

Host: example.com
Connection: close
/*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true
select * from users where clientid = ""
select * from users where clientid = 1 or true

select
*
from
users
where
id
=
""
Lexems
select
*
from
users
where
id
=
1
or
true
Lexems

select
*
from
users
where
id
=
""
Lexems
select
*
from
users
where
id
=
1
or
true
Lexems
8 ≠ 10

select * from users where id = ""
select * from users where id = 1 or true
select
*
from
users
where
clientid
=
""
Lexems
select
*
from
users
where
clientid
=
1
or
true
Lexems
8 ≠ 10

Access Control
• All types of application firewalls should have access control mechanisms
• The main statement of any access policy: All entities must be identified
• Entities identification in account-based systems: at least it is necessary to identify
web application subjects (users) that initiate queries to DBMS
• Approaches
§ Many-to-many applications
§ HTTP and SQL user tracking
§ RASP
• Angine - ABAC eNgine

Angine Data Flow Model
Subjects,
resources,
environment
PAP
(6) attribute query
(1) ALFAScript
policy
(5) request context
(12) response
(4) request
Compiler
Access
requester
PDP
(2) Lua code
Context
handler
PEP
(3) access
request
PIP
(9) attribute
(7) attribute
query
(11) response context
(10) attributes
(8) attributes
Generator
Data
structures
Language
ALFAScript
IDL policy
spec

ALFAScript Policy Example
interface Entity {
abstract id: String;
}
interface SQLEntity <: Entity {
database: String;
table: String;
column: String;
level: Number;
tags: [String];
}
interface Subject <: Entity {
name: String;
level: Number;
tags: [String];
}

ALFAScript Policy Example
namespace example {
export policy Main {
target clause action == "select“
apply denyUnlessPermit
rule mls {
permit
target clause subject.level > entity.level
}
rule rbac {
permit
condition rbacCheckRoles() == true
}
}
}

Generated Classes
class Entity(object):
def __init__(self):
super(Entity, self).__init__()
@abstractproperty
def id(self):
pass
class SQLEntity(Entity):
def __init__(self, database=None, table=None, column=None, level=None, tags=None):
super(SQLEntity, self).__init__()
self.database = database
self.table = table
self.column = column
self.level = level
self.tags = tags
class Subject(Entity):
def __init__(self, name=None, level=None, tags=None):
super(Subject, self).__init__()
self.name = name
self.level = level
self.tags = tags

Angine Example
from Angine.policy import Policy
from Angine.pip import PIP
from Angine.pdp import PDP
def pep():
...
request = get_request(network)
policy = Policy(alfa_mysql_policy)
pip = PIP.init_data(mongo_connection)
pdp = PDP(policy.get_lua_policy())
ctx = pip.create_ctx(request)
response = pdp.evaluate(ctx)
if response["result"]["decision"] != "permit":
return None
else:
return process(request)

Angine Results
• ALFAScript IDL to runtime language code generator
• ALFAScript language
• ALFAScript to Lua transcompiler
• PDP and PIP implementations for runtime language
• Common parsers (HTTP, mysql, tsql)

Roadmap
• Host agents for C#, Java
• ANTLR-based C++ parser
• Release MySQL grammar for ANTLR4
• PT Application Firewall integration
• SQL user tracking
• Machine learning for sensitive data discovery
• Inspected Application Module for DBFW

Inspected Application Module
Vladimir Kochetkov. Do WAFs dream of static analyzers?
Peculiarities
§ Web-only IAM can not process non HTTP attack vectors
§ There are some cases when CompFG is not adequate to detect attacks
• Loops, recursion
• Internal and external dependencies
The idea is to build SQL profile based on application code, compile it to binary module
and run on the DBFW
This approach can be used to detect second order SQL injection attacks

Inspected Application Module Flow
Application
Inspector
Deployed
application
Configuration
WAF IAM WAF Front-end
Web client
(2) HTTP request context
Application
CompFG (5) HTTP response context
(11) SQL response context
Computation flow
model
(8) SQL request context
DBMS
(7) SQL
request
(6) HTTP
request
(12) SQL
request
(13) SQL
response
(14) SQL
response
(15) HTTP
response
(16) HTTP
response (1) HTTP
request
Compiled
binary
module
Source code
Application Firewall

SQL IAM Example
…
$sql = "select * from data where id= intval(".$_POST["id"])
$result = mysql_query($sql, $connection)
$row = mysql_fetch_row($result)
$sql = "select * from data where fname=' ".$row[2]. " ' "

SQL IAM Example
…
Untrusted data read from database.
What if fname is ' or '1' = '1 ?

SQL IAM Example
…
The main SQL injection feature:
a number of tokens more that one

SQL IAM Example
…
(concat "select * from data where fname=" (
concat (index-access row 2) "'"))

SQL IAM Example
…
(concat "select * from data where fname=" (
concat (index-access row 2) "'"))
(call mysql_fetch_row (call mysql_query (concat
"select * from data where id=intval(" (concat (index-
access POST, "id") ")")) connection))

SQL IAM Example
…
№ Query hash Index Tokens
1 87248237482347 [(28,-1)] 1
2 13475837458758 [(32,-1)] 1

SQLi IAM Example
GET /app/?id=1000 HTTP/1.1
Host: example.com
Connection: close
select * from data where id=1000
select * from data where fname='john' or '1'='1'
1 ≠ 2

Как разработать DBFW с нуля

More Related Content

What's hot

Similar to Как разработать DBFW с нуля

More from Positive Hack Days

Recently uploaded

Как разработать DBFW с нуля