Why Haven’t We Stamped Out SQL Injection and XSS Yet?
Romain Gaucher, Coverity
Session ID: ASEC-F42
Session Classification: Intermediate
► Lead security researcher at Coverity
► Have spent a fair amount of time on automated analysis of
sanitizers, framework analysis, precise remediation advice,
context-aware static analysis, etc.
► Officer at WASC, contributed to several community
projects (Script Mapping, Threat Classification 2,
CAPEC, etc.)
► Previously at Cigital and NIST
► On the web:
Twitter: @rgaucher
Coverity SRL blog: https://communities.coverity.com/blogs/security
About me
► This talk is not about
► what XSS or SQL injection are
► how to exploit XSS or SQLi
► static analysis heuristics to find these issues
► etc.
► This talk is about
► what developers need to do when they write a web application
► importance of understanding SQL and HTML contexts
► how these contexts relate to XSS and SQLi
► remediation advice we usually hear from security professionals
Agenda
► We analyzed 28 Java web applications and 154 versions
over time. Most were open source projects. That’s
6MLOC Java and JSP code.
► For SQL injection:
► focus on queries embedded in applications
► we did not analyze the code of stored procedures
► For XSS:
► focus on server-side pages
► most pages have JavaScript but we did not try to understand the
impact of JavaScript code (i.e., after it executes)
What we studied (and what we didn’t)
► These applications used several different control
frameworks, such as Spring MVC, Struts, etc.
► ORMs such as JPA or Hibernate are used in two-thirds
of the applications
► All projects had JSP files, but how much they use JSP varies widely:
► One project has 24 lines of JSP, another 84,000
Analyzed projects
Jira Tatami Memoire M2 jforum3 Ecuries du Loup
Cosmo Nuxeo OpenMRS psi-probe Liferay
Scrumter Jackrabbit dotCMS ala-portal Jahia
Trading System Aid Pebble Pyramus WiseMapping jforum2
Nacre Threadfix YouWho JSPwiki Roller
Master MusicStudio Ubanist Connect
► This research relies heavily on the ability to statically
compute contexts for HTML and SQL
► We reported the contexts every time some dynamic data is
inserted in HTML or SQL (injection site). We do not report the
number of possible paths, just injection sites.
► Contexts computations
► HTML contexts are derived from an HTML5-based parser [1] and
simple CSS/JavaScript parsers
► SQL contexts are derived from a parser that handles generically
SQL, HQL, and JPQL
► Limitations
► The contexts are computed in the Java program; we do not try to
understand how JavaScript impacts the HTML contexts at
runtime
Analysis
► The injection site is our unit for measuring the implicit or
explicit string concatenations in a program
► We only care about and report injection sites that are related to
a sub-language we want to analyze: SQL, JPQL, HQL,
HTML, JavaScript, and CSS
► Example of injection sites related to SQL:
String sql = "select id from users where 1=1";
if (condition1)
    sql += " and name='" + user_name + "'";
if (condition2)
    sql += " and password=?";
Concept of “injection site”
2 injection sites in one
SQL query
SQL Injection
► When parameterized queries are used correctly there
are no real opportunities for SQL injection
► The root of SQL injection therefore is string concatenation of a
query string with tainted data
► Bright idea: If we could eliminate the habit of developers using
string concatenation for queries, we could eliminate SQL
injection
► Let's examine the "common" security advice given to
developers:
► “Use an ORM”
► “Use parameterized queries”
The root of all evil
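As a minimal sketch of what "used correctly" can mean for a query assembled from optional conditions, the class below builds the SQL text with `?` placeholders only and collects the values separately. The class name, the method, and the `users` columns are illustrative assumptions, not code from the analyzed projects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a query with optional filters without ever
// concatenating a data value into the SQL text.
final class UserQueryBuilder {

    /** SQL text plus the values to bind, in placeholder order. */
    static final class Query {
        final String sql;
        final List<Object> params;

        Query(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    // A null argument means "this condition was not requested".
    static Query build(String userName, String passwordHash) {
        StringBuilder sql = new StringBuilder("select id from users where 1=1");
        List<Object> params = new ArrayList<>();
        if (userName != null) {
            sql.append(" and name=?");      // placeholder, not the value
            params.add(userName);
        }
        if (passwordHash != null) {
            sql.append(" and password=?");
            params.add(passwordHash);
        }
        return new Query(sql.toString(), params);
    }
}
```

With JDBC, the collected values would then be bound in order, for example with `PreparedStatement.setObject(i + 1, params.get(i))`. The query string still grows by concatenation, but only constant fragments are appended.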
"Use an ORM"
► ORMs are not typically vulnerable to SQLi
► Most projects use both ORM and non-ORM
Project ID: 2 4 6 7 9 10 11 13 17 18 20 22 23 24 27
JDBC API ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Hibernate ORM ✓ ✓ ✓
Hibernate Non-ORM ✓ ✓ ✓ ✓ ✓ ✓ ✓
JPA ORM ✓ ✓ ✓ ✓ ✓ ✓ ✓
JPA Non-ORM ✓
► Overall combination of ORM and non-ORM SQL access
► Non-ORM: JDBC, Hibernate SQL / HQL, and JPA JPQL queries
► ORM: EntityManager.find(), EntityManager.persist(),
Hibernate Criteria, etc.
► ORMs are common in our dataset (67%)
► Most use JPA, some use Hibernate exclusively
► Use of a query language (HQL, SQL, etc.) is common to
all projects
► SQL accounts for 63% of actual queries
► HQL is second, with 31% queries
► Conclusion: string-based query construction is still very
common in our dataset, even in applications using
ORMs.
ORM versus Non-ORM
► “Use an ORM”
► Projects use a mixture of ORM and query languages. There is a
need for query languages!
► “Use parameterized queries”
Eliminate String Concatenation
► Analysis of 985 injection sites, and 545 unique queries
(including JDBC, JPA, and Hibernate)
► There is an average of 1.8 injection sites per query with a
maximum of 22 injection sites in one query.
► We identified 10 different SQL contexts
► Good: 85% of the injection sites are associated with a SQL context
that can be parameterized
► Problem: 15% of the injection sites cannot be parameterized
SQL Contexts and Parameterization
Parameterizable vs. Not parameterizable
“Use parameterized queries”
► Out of the 833 parameterizable contexts, 94% are
parameterized
► It’s still not perfect; here's the breakdown of the
remaining 6% that could be parameterized
Parameterizable Contexts
► Constants: 19%
► Constant identifiers
► Class names
► Strings: 34%
► Good ol' strings
► Safe types: 47%
► Numeric, Date, etc.
► SQL_FULL_STATEMENT (5.8%): the entire SQL query
is not resolvable statically (DB, SQL file, configuration
files, user, etc.)
► It’s not a “real” context and most likely a design decision
► For web requests, they should have anti-CSRF and access
controls in place
► SQL_TABLE (4.1%): table name
► SQL_IDENTIFIER (1.4%): column name or a variable
► SQL_GENERIC (4.2%): keyword, etc.
Non-Parameterizable Contexts
► Table name:
stmt1= c.createStatement();
stmt1.execute("TRUNCATE TABLE `"+ table + "`");
► Trigger name:
Statement st = conn.createStatement();
schema = StringUtils.quoteIdentifier(schema);
String trigger = schema + '.'
+ StringUtils.quoteIdentifier(PREFIX + table);
st.execute("DROP TRIGGER IF EXISTS " + trigger);
► Conclusion: developers still need to concatenate in some
queries.
Parameterize This!
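One way to keep concatenation safe in a table-name context is to accept only names from a fixed allow-list before building the statement. The sketch below assumes a hypothetical set of table names and MySQL-style backtick quoting, as in the TRUNCATE example above; it is an illustration, not the approach used by any of the analyzed projects.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical filter for the SQL_TABLE context: the table name cannot be a
// bind parameter, so it is checked against known names before concatenation.
final class TableNames {

    // Assumed list of tables the application is willing to truncate.
    private static final Set<String> ALLOWED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("audit_log", "session_cache")));

    static String truncateStatement(String table) {
        if (table == null || !ALLOWED.contains(table)) {
            throw new IllegalArgumentException("table not allowed");
        }
        // Safe to concatenate: the value is one of our own constants.
        return "TRUNCATE TABLE `" + table + "`";
    }
}
```

A filter like this rejects the data instead of escaping it, which is the appropriate obligation when the set of valid values is known in advance.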
► “Use an ORM”
► Projects use a mixture of ORM and non-ORM SQL technologies.
There is a need for query languages.
► “Use parameterized queries”
► 15% of injection sites cannot be parameterized
► There's hope: most parameterizable ones are parameterized
(94%)!
► 2% of parameterizable injection sites use dynamic strings that
aren't parameterized; opportunity for SQL injection
Eliminate String Concatenation
► String concatenation cannot easily be eliminated
► How can its effects be mitigated?
► Let's examine some other "common" security advice:
► “Do input validation”
► “Escape special characters”
Now What?
► Input validation = “Security through serendipity”
► Does the domain of valid input values happen to
preclude security problems for all contexts where the
data is used? Not necessarily.
► The input validation of data should be dictated by the
functional requirements of the application, not by the
security obligations of SQL contexts.
► “Why can’t my password contain % ?”
► “Why can’t Miles O’Brien create a profile?”
► Conclusion: you happen to be safe sometimes, you just
cannot guarantee it throughout the application.
“Do Input Validation”
► “Do input validation”
► Different contexts have different security obligations
► Functional requirements should drive input validation
► “Sanitize special characters”
Mitigate String Concatenation
► What's a special character? Different dialects have
different requirements.
► Does backslash escaping work for default PostgreSQL >= 9.1?
► What needs escaping in a quoted identifier?
► Escapers might be useful in a special case where the
application is doing a lot of dynamic queries and
parameterization would require large scale refactoring.
► Not all contexts can use an escaper. Some characters
just need to be filtered out.
Sanitize special characters
► “Do input validation”
► Different contexts have different security obligations
► Application requirements should drive input validation
► “Sanitize special characters”
► Different dialects have different requirements
► Can be applied incorrectly
Mitigate String Concatenation
► String concatenation cannot be eradicated from SQL
queries.
► Input validation may not be adequate; sanitizers can help
but can also hinder.
► Security ought to provide developers helpful advice:
► Code, code, code…
► Specific to a technology (JDBC, Hibernate HQL, etc.)
► And specific to a context (table name, IN clause, etc.)
► Working with developers to build this kind of
documentation for your organization’s specific use of
technologies and query styles will help reduce SQL
injection defects.
SQL Conclusions
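As an illustration of advice that is specific to a context, here is a minimal sketch for the IN clause: the number of values is dynamic, so the query text has to be generated, but it can be generated from placeholders alone. The class, the method names, and the `users` table are assumptions made for the example.

```java
import java.util.Collections;

// Hypothetical helper for the IN-clause context: generate one "?" per value
// and bind the values afterwards, instead of concatenating them.
final class InClause {

    static String placeholders(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    static String selectByIds(int count) {
        return "SELECT name FROM users WHERE id IN (" + placeholders(count) + ")";
    }
}
```

Each value is then bound by position on the prepared statement, so the generated text never contains data.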
Cross-Site Scripting
► From a code perspective, XSS isn’t a single vulnerability.
It’s a group of vulnerabilities that mostly involve injection
of tainted data into various HTML contexts.
► There is no one way to fix all XSS vulnerabilities with
one magic “cleansing function”. Developers really need
to understand this.
► Let's examine the "common" security advice given to
developers:
► “Use HTML escaping”
► “Don’t insert user data in <random HTML location here>”
► “Use auto-escaping template engines”
XSS is confusing
► HTML entity escaping can be used in several locations in
an HTML document
► It can be used when the web browser will be able to
process the HTML entities. For example, it is correct for
the value of a <div> or an HTML attribute.
► However, HTML escaping cannot be used in all cases:
you wouldn’t escape an attribute name, or the content of
a <style>
► To better understand when we can use this HTML
escaper and when we cannot, we need to talk about
HTML contexts
“Use HTML escaping”
► No documentation really describes what HTML contexts
are or gives a good list of them
► Practically, many places in an HTML page (inc. CSS, JS,
etc.) have the same security obligation. These are the
HTML contexts.
► Simple example, double quoted HTML attribute value
<div id="${inj_var}">…
<pre class="${inj_var}">…
► Some of the contexts:
► JavaScript string, HTML tag name, HTML attribute name, CSS
string, CSS code, JavaScript code, HTML RCDATA, HTML
PCDATA, HTML script tag body, URL, etc.
HTML Contexts?
<a href="javascript:hello('${inj_var}')">
The joy of nested contexts
HTML contexts stack (innermost to outermost): single-quoted JavaScript string, JavaScript code, URI, double-quoted HTML attribute value
► Simplified: Our data is inserted inside a JavaScript String
inside a URI inside an HTML attribute
► We’ll note: HTML Attribute -> URI -> JS String
<a href="javascript:hello('${inj_var}')">
stack := {HTML Attribute -> URI -> JS String}
► The web browser will:
► Take the content of the attribute href and HTML decode it
► Analyze the URI and recognize the javascript: scheme
► Take the content of the URI (i.e., after the scheme) and URL
decode it
► Since it’s supposed to be JavaScript, the extracted content
hello('${inj_var}') is sent to the JS engine for rendering
► The JS engine will parse the program and especially the string
that contains our ${inj_var} by doing a JS string decoding
► Based on this flow we know what needs to be done to
make a safe insertion for ${inj_var} in this context
The joy of nested contexts
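The decoding flow above suggests the escapers must be applied in the reverse order: JavaScript string escaping first, then URI encoding, then HTML escaping. The sketch below uses deliberately simple, hand-written escapers so that it is self-contained; all names are hypothetical, and a real application would more likely rely on a vetted library such as those listed in the references.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustrative escapers for the stack HTML Attribute -> URI -> JS String.
final class NestedContextEscaping {

    private static boolean isAsciiAlnum(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // JS string: every char except ASCII letters and digits becomes a
    // backslash-u escape with four hex digits.
    static String jsString(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isAsciiAlnum(c)) out.append(c);
            else out.append(String.format(Locale.ROOT, "\\u%04x", (int) c));
        }
        return out.toString();
    }

    // URI: percent-encode every UTF-8 byte outside the unreserved set.
    static String uriComponent(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            if (isAsciiAlnum(v) || v == '-' || v == '_' || v == '.' || v == '~') out.append((char) v);
            else out.append(String.format(Locale.ROOT, "%%%02X", v));
        }
        return out.toString();
    }

    // HTML: entity-escape the five characters that matter in quoted attributes.
    static String html(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // The browser decodes HTML, then URI, then JS; so we encode JS, then URI, then HTML.
    static String forJsStringInUriInHtmlAttribute(String s) {
        return html(uriComponent(jsString(s)));
    }
}
```

With these escapers a single quote never reaches the JavaScript engine as a quote: it is first turned into a JavaScript escape sequence, whose backslash is then percent-encoded for the URI layer.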
Well put grand’ma!
Distribution of HTML contexts
HTML contexts matter more now!
► “Use HTML escaping”
► Projects are using a lot of different HTML contexts, and it’s not
getting better
► “Don’t insert user data in <insert HTML location here>”
► “Use auto-escaping template engines”
Fix XSS
► HTML escaping is important and can be correctly used in
50% of the injection sites
► In addition, when implemented correctly it makes 70% of them
safe (although the data itself may then be incorrect)
► However, let’s look at the different implementations of
HTML escapers
Focusing on the HTML escaping…
Dive into HTML escapers…
► HTML escaping should be simple
► We found 76 different HTML escapers in our dataset
► Most of them belong to a framework/library used by the application
► On average, a project has 5.7 HTML escapers
► Not all HTML escapers are equivalent…
► “Use HTML escaping”
► Projects are using a lot of different HTML contexts, and it’s not
getting better
► HTML escaping seems to mean a lot of different things… Only
41% of the implementations seem sufficient
► “Don’t insert user data in <insert HTML location here>”
► “Use auto-escaping template engines”
Fix XSS
► We found that much security guidance just says “don’t
do this”
► OWASP XSS cheat sheet [2] has the following rule:
“Don’t insert data into <?>”
Never put data HERE            “Context” name
<script>...HERE...</script>    Directly in script
<!--...HERE...-->              HTML comment
<div ...HERE...=test />        Attribute name
<HERE... href="/test" />       Tag name
<style>...HERE...</style>      Directly in CSS
Observed HTML contexts and nesting
Observed HTML contexts and nesting
► “Use HTML escaping”
► Projects are using many different HTML contexts, and it’s not
getting better
► HTML escaping seems to mean a lot of different things… Only
41% of the implementations seem sufficient
► “Don’t insert user data in <insert HTML location here>”
► Developers need to add data into many HTML contexts
► “Use auto-escaping template engines”
Fix XSS
► A good trend with template engines is to provide auto-
escaping.
► Auto-escaping means that the engine will take the
dynamic data and escape it without any directive from
the developer.
► Such reasoning is great and clearly makes a typical web
application more secure.
► However, do they understand the contexts and provide
the required escaping/filtering to be safe for XSS?
Auto-escaping template libraries
Auto-escaping template libraries
Engine           Contextual autoescaping   HTML autoescaping   On by default   Manual escaping   Escapes '
GCT              Yes                       N/A                 Yes             N/A               Yes
.NET Razor       No                        Yes                 Yes             Yes               Yes
Ruby on Rails    No                        Yes                 Yes             Yes               No
HAML             No                        Yes                 Yes             Yes               No
NHAML            No                        Yes                 Yes             Yes               No
Facelets         No                        Yes                 Yes             Yes               N/A
Mustache         No                        Yes                 Yes             Yes               No
Twig             No                        Yes                 No              Yes               Yes
Smarty           No                        Yes                 No              Yes               Yes
Spring           No                        No                  N/A             Yes               N/A
.NET WebForms    No                        No                  N/A             No                N/A
JSP              No                        No                  N/A             Yes               N/A
► Some don’t even escape ‘ properly
► Only GCT provides a fairly good context aware auto-
escaping, others essentially perform HTML escaping all
the time
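To see why the "Escapes '" column matters, consider a single-quoted attribute such as `<div id='...'>`. The sketch below compares two hypothetical escapers, one that leaves the single quote alone and one that encodes it; neither is taken from the engines in the table.

```java
// Illustrative comparison of an HTML escaper that skips the single quote
// with one that handles it, for a single-quoted attribute value.
final class QuoteEscaping {

    // Escapes & < > " only; the ampersand must be replaced first.
    static String htmlWithoutApostrophe(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Same as above, plus the single quote.
    static String htmlWithApostrophe(String s) {
        return htmlWithoutApostrophe(s).replace("'", "&#39;");
    }

    // Hypothetical template fragment using a single-quoted attribute.
    static String singleQuotedAttribute(String escapedValue) {
        return "<div id='" + escapedValue + "'>";
    }
}
```

With the input `x' onclick='alert(1)`, the first escaper yields `<div id='x' onclick='alert(1)'>`, where the data has closed the attribute and added an event handler. The second yields `<div id='x&#39; onclick=&#39;alert(1)'>`, where it stays inside the attribute value.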
► “Use HTML escaping”
► Projects are using a lot of different HTML contexts, and it’s not
getting better
► HTML escaping seems to mean a lot of different things… Only
41% of the implementations we saw seem sufficient
► “Don’t insert user data in <insert HTML location here>”
► Developers need to add data into many HTML contexts
► “Use auto-escaping template engines”
► Sometimes a false sense of security
Fix XSS
► I’m not saying that we have been all wrong until now. We
just need to come up with a more complete solution for
XSS:
► Sadly, there is no one silver bullet for XSS
► We cannot rely on one escaping function or filter
► We cannot yet rely solely on most auto-escaping template engines
► For the time being, we need:
► Libraries that allow developers to escape the data properly for
many contexts, and filter the data when an escaper cannot be
used
► Give actionable guidance to the developers
Sad truth about XSS
► List of required escapers based on our dataset:
► HTML escaping (^85%)
► URI encoding (^30%)
► JavaScript string escaping (^13%)
► CSS string escaping (^0.16%)
► JavaScript regex escaping (^0.01%)
► Libraries like OWASP ESAPI [3] or Coverity Security
Library [4] already have these escapers; promote them!
Escapers should be available
► HTML contexts that require a filter based on our dataset:
► HTML Attribute -> URI (^30%)
► HTML Attribute -> JS code (0.2%)
► HTML Attribute -> CSS (0.3%)
► Etc.
► How to handle URIs?
► Filter the scheme to make sure you allow it (http, https, etc.)
► Make sure the authority makes sense for your application
► URI encode each element of the path
► URI encode query string parameter names and values
► ESAPI has an isValidURL which seems very restrictive, and
CSL asUrl rewrites the URL to make it safe (but you need
to manually encode the params & paths)
Filters should be available
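The first two URI-handling steps (filter the scheme, check the authority) could be sketched as follows with `java.net.URI`. The class name and the host allow-list are assumptions for illustration; this sketch rejects relative URLs outright and does not encode path segments or query parameters, which still need URI encoding as listed above.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Hypothetical filter for the HTML Attribute -> URI context.
final class UrlFilter {

    private static final List<String> ALLOWED_SCHEMES = Arrays.asList("http", "https");

    static boolean isAllowed(String candidate, Collection<String> allowedHosts) {
        final URI uri;
        try {
            uri = new URI(candidate);
        } catch (URISyntaxException e) {
            return false; // not parseable: reject rather than guess
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false; // relative or opaque URI such as javascript:...
        }
        return ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))
                && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
    }
}
```

Rejecting by default keeps schemes such as `javascript:` out without having to enumerate them.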
► XSS is complex because it can happen in many different
locations in the HTML pages (HTML contexts).
► Input validation may not be adequate
► Sanitizers can help but it’s very difficult to find a
complete library for XSS
► OWASP XSS cheat sheet is a good start, but I believe
we should do more:
► We started blogging [5] about these kinds of issues and will
continue doing it as we improve our technologies or just come
across “interesting stuff”
► We will also continue to improve CSL [4] by adding filters, etc.
► We need to be more serious about raising awareness of HTML
contexts
XSS Conclusions
Developers need
some <3
► It’s getting better. More and more developers have a
basic understanding of security issues like XSS and
SQLi.
► For SQL, developers parameterized 94% of the time when
possible
► For HTML, we see many places where HTML escaping is used
► But it’s just too complex and convoluted for them to
understand everything
► Security obligations of each context
► Nested contexts and ordering of escapers
► There are gaps in the way security information is
presented online; StackOverflow is a scary place.
However, if you don't give them advice, who will?
What’s up with developers?!
► Focus developer communication on the code point of
view, as opposed to the attacker’s point of view:
► Attacks and threats are the "why" we need to fix
► Developers need the "how" and "where" to fix
► Provide helpful advice:
► Actionable (i.e. code, code, and more code)
► To a specific technology (JSTL, Hibernate HQL, jQuery, etc.)
► In a specific context or contexts (IN clause, URI, CSS string,
etc.)
► What does helpful advice look like?
► Not this: "Parameterize all SQL statements."
► How about this?
Ways forward
How to Parameterize Data Values
Example: "SELECT * FROM table WHERE name = '" + userName + "'"
JDBC
String paramQuery = "SELECT * FROM table WHERE name = ?";
PreparedStatement prepStmt = connection.prepareStatement(paramQuery);
prepStmt.setString(1, userName);
Hibernate Native Query
String paramQuery = "SELECT * FROM table WHERE name = :username";
SQLQuery query = session.createSQLQuery(paramQuery);
query.setParameter("username", userName);
JPA Native Query
String paramQuery = "SELECT * FROM table WHERE name = :username";
Query query = entityManager.createNativeQuery(paramQuery);
query.setParameter("username", userName);
Helpful Advice
How to add data in a CSS string within a CSS block using CSL.
Provide several examples that show the common usages for different HTML
contexts, and technologies your developers are using
Expression Language
<style>
a[href *= "${cov:cssStringEscape(param.foobar)}"] {
background-color: pink!important;
}
</style>
Java or JSP Scriptlet
<style>
<%
String parm = request.getParameter("foobar");
String cssEscaped = Escape.cssString(parm);
%>
a[href *= "<%= cssEscaped %>"] {
background-color: pink!important;
}
</style>
Helpful Advice
► XSS and SQLi can be nuanced and complex. XSS is
actually becoming more and more complex over time as
the number of contexts and nesting increases.
► The advice we currently see is either:
► Generic remediation
► Precisely how to perform the attack
► Guidance needs to be both more specific and technical,
yet also simpler and shorter. The only way to get that is
to take tailored, detailed advice and crop it down for the
development team’s needs.
► Remember: your developers need some <3
Conclusions
Questions?
► [1] HTML 5 tokenizer specification
► [2] OWASP XSS Cheat Sheet
► [3] OWASP ESAPI Java
► [4] Coverity Security Library Java
► [5] Coverity Security Research Blog
► [6] OWASP SQLi Prevention Cheat Sheet
References

More Related Content

What's hot

Secure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago HenriquesSecure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago Henriques
Tiago Henriques
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Websecurify
 
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
Xlator
 

What's hot (13)

XSS - Attacks & Defense
XSS - Attacks & DefenseXSS - Attacks & Defense
XSS - Attacks & Defense
 
Secure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago HenriquesSecure coding - Balgan - Tiago Henriques
Secure coding - Balgan - Tiago Henriques
 
Secure programming with php
Secure programming with phpSecure programming with php
Secure programming with php
 
Secure Coding 101 - OWASP University of Ottawa Workshop
Secure Coding 101 - OWASP University of Ottawa WorkshopSecure Coding 101 - OWASP University of Ottawa Workshop
Secure Coding 101 - OWASP University of Ottawa Workshop
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best Practices
 
Microsoft Fakes, Unit Testing the (almost) Untestable Code
Microsoft Fakes, Unit Testing the (almost) Untestable CodeMicrosoft Fakes, Unit Testing the (almost) Untestable Code
Microsoft Fakes, Unit Testing the (almost) Untestable Code
 
Breaking AngularJS Javascript sandbox
Breaking AngularJS Javascript sandboxBreaking AngularJS Javascript sandbox
Breaking AngularJS Javascript sandbox
 
Sql Injection - Vulnerability and Security
Sql Injection - Vulnerability and SecuritySql Injection - Vulnerability and Security
Sql Injection - Vulnerability and Security
 
Do WAFs dream of static analyzers
Do WAFs dream of static analyzersDo WAFs dream of static analyzers
Do WAFs dream of static analyzers
 
Ten Commandments of Secure Coding
Ten Commandments of Secure CodingTen Commandments of Secure Coding
Ten Commandments of Secure Coding
 
Modular ObjectLens
Modular ObjectLensModular ObjectLens
Modular ObjectLens
 
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
Jim Manico: Developer Top 10 Core Controls, web application security @ OWASP ...
 
Security for developers
Security for developersSecurity for developers
Security for developers
 

Similar to Why haven't we stamped out SQL injection and XSS yet

owasp top 10 security risk categories and CWE
owasp top 10 security risk categories and CWEowasp top 10 security risk categories and CWE
owasp top 10 security risk categories and CWE
Arun Voleti
 
OWASP_Top_Ten_Proactive_Controls_v2.pptx
OWASP_Top_Ten_Proactive_Controls_v2.pptxOWASP_Top_Ten_Proactive_Controls_v2.pptx
OWASP_Top_Ten_Proactive_Controls_v2.pptx
cgt38842
 
SQLCLR For DBAs and Developers
SQLCLR For DBAs and DevelopersSQLCLR For DBAs and Developers
SQLCLR For DBAs and Developers
webhostingguy
 
Dr. Jekyll and Mr. Hyde
Dr. Jekyll and Mr. HydeDr. Jekyll and Mr. Hyde
Dr. Jekyll and Mr. Hyde
webhostingguy
 

Similar to Why haven't we stamped out SQL injection and XSS yet (20)

Understanding and preventing sql injection attacks
Understanding and preventing sql injection attacksUnderstanding and preventing sql injection attacks
Understanding and preventing sql injection attacks
 
Null meet Code Review
Null meet Code ReviewNull meet Code Review
Null meet Code Review
 
Nguyen Phuong Truong Anh - Some new vulnerabilities in modern web application
Nguyen Phuong Truong Anh  - Some new vulnerabilities in modern web applicationNguyen Phuong Truong Anh  - Some new vulnerabilities in modern web application
Nguyen Phuong Truong Anh - Some new vulnerabilities in modern web application
 
SAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldSAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security world
 
Coding Security: Code Mania 101
Coding Security: Code Mania 101Coding Security: Code Mania 101
Coding Security: Code Mania 101
 
owasp top 10 security risk categories and CWE
owasp top 10 security risk categories and CWEowasp top 10 security risk categories and CWE
owasp top 10 security risk categories and CWE
 
Prevoty NYC Java SIG 20150730
Prevoty NYC Java SIG 20150730Prevoty NYC Java SIG 20150730
Prevoty NYC Java SIG 20150730
 
Node.js for enterprise - JS Conference
Node.js for enterprise - JS ConferenceNode.js for enterprise - JS Conference
Node.js for enterprise - JS Conference
 
Top Ten Java Defense for Web Applications v2
Top Ten Java Defense for Web Applications v2Top Ten Java Defense for Web Applications v2
Top Ten Java Defense for Web Applications v2
 
Mitigating Java Deserialization attacks from within the JVM (improved version)
Mitigating Java Deserialization attacks from within the JVM (improved version)Mitigating Java Deserialization attacks from within the JVM (improved version)
Mitigating Java Deserialization attacks from within the JVM (improved version)
 
DevBeat 2013 - Developer-first Security
DevBeat 2013 - Developer-first SecurityDevBeat 2013 - Developer-first Security
DevBeat 2013 - Developer-first Security
 
STIX Patterning: Viva la revolución!
STIX Patterning: Viva la revolución!STIX Patterning: Viva la revolución!
STIX Patterning: Viva la revolución!
 
Writing Secure Code – Threat Defense
Writing Secure Code – Threat DefenseWriting Secure Code – Threat Defense
Writing Secure Code – Threat Defense
 
OWASP_Top_Ten_Proactive_Controls_v32.pptx
OWASP_Top_Ten_Proactive_Controls_v32.pptxOWASP_Top_Ten_Proactive_Controls_v32.pptx
OWASP_Top_Ten_Proactive_Controls_v32.pptx
 
OWASP_Top_Ten_Proactive_Controls_v2.pptx
OWASP_Top_Ten_Proactive_Controls_v2.pptxOWASP_Top_Ten_Proactive_Controls_v2.pptx
OWASP_Top_Ten_Proactive_Controls_v2.pptx
 
Introduction to threat_modeling
Introduction to threat_modelingIntroduction to threat_modeling
Introduction to threat_modeling
 
2012-10-16 Mil-OSS Working Group: Introduction to SCAP Security Guide
2012-10-16 Mil-OSS Working Group: Introduction to SCAP Security Guide2012-10-16 Mil-OSS Working Group: Introduction to SCAP Security Guide
2012-10-16 Mil-OSS Working Group: Introduction to SCAP Security Guide
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Security
 
SQLCLR For DBAs and Developers
SQLCLR For DBAs and DevelopersSQLCLR For DBAs and Developers
SQLCLR For DBAs and Developers
 
Dr. Jekyll and Mr. Hyde
Dr. Jekyll and Mr. HydeDr. Jekyll and Mr. Hyde
Dr. Jekyll and Mr. Hyde
 

Recently uploaded

Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 

Recently uploaded (20)

FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
300003-World Science Day For Peace And Development.pptx
Why haven't we stamped out SQL injection and XSS yet

  • 1. Session ID: Session Classification: Romain Gaucher Coverity ASEC-F42 Intermediate Why Haven’t We Stamped Out SQL Injection and XSS Yet?
  • 2. ► Lead security researcher at Coverity ► Have spent a fair amount of time on automated analysis of sanitizers, framework analysis, precise remediation advices, context-aware static analysis, etc. ► Officer at WASC, contributed to several community projects (Script Mapping, Threat Classification 2, CAPEC, etc.) ► Previously at Cigital and NIST ► On the web: Twitter: @rgaucher Coverity SRL blog: https://communities.coverity.com/blogs/security About me
  • 3. ► This talk is not about ► what XSS or SQL injection are ► how to exploit XSS or SQLi ► static analysis heuristics to find these issues ► etc. ► This talk is about ► what developers need to do when they write a web application ► importance of understanding SQL and HTML contexts ► how these contexts relate to XSS and SQLi ► remediation advice we usually hear from security professionals Agenda
  • 4. ► We analyzed 28 Java web applications and 154 versions over time. Most were open source projects. That’s 6MLOC Java and JSP code. ► For SQL injection: ► focus on queries embedded in applications ► we did not analyze the code of stored procedures ► For XSS: ► focus on server-side pages ► most pages have JavaScript but we did not try to understand the impact of JavaScript code (i.e., after it executes) What we studied (and what we didn’t)
  • 5. ► These applications used several different control frameworks, such as Spring MVC, Struts, etc. ► ORMs such as JPA or Hibernate are used in two-thirds of the applications ► All projects had JSP files, but their use varies widely: ► One project has 24 lines of JSP, another 84,000 Analyzed projects Jira Tatami Memoire M2 jforum3 Ecuries du Loup Cosmo Nuxeo OpenMRS psi-probe Liferay Scrumter Jackrabbit dotCMS ala-portal Jahia Trading System Aid Pebble Pyramus WiseMapping jforum2 Nacre Threadfix YouWho JSPwiki Roller Master MusicStudio Ubanist Connect
  • 6. ► This research relies heavily on the ability to statically compute contexts for HTML and SQL ► We reported the contexts every time some dynamic data is inserted in HTML or SQL (injection site). We do not report the number of possible paths, just injection sites. ► Contexts computations ► HTML contexts are derived from an HTML5-based parser [1], and simple CSS/JavaScript parsers ► SQL contexts are derived from a parser that generically handles SQL, HQL, and JPQL ► Limitations ► The contexts are computed in the Java program; we do not try to understand how JavaScript impacts the HTML contexts at runtime Analysis
  • 7. ► The injection site is our unit for measuring the implicit or explicit string concatenations in a program ► We only care about and report injection sites that are related to a sub-language we want to analyze: SQL, JPQL, HQL, HTML, JavaScript, and CSS ► Example of injection sites related to SQL: String sql = "select id from users where 1=1"; if (condition1) sql += " and name='" + user_name + "'"; if (condition2) sql += " and password=?"; Concept of “injection site” 2 injection sites in one SQL query
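The two injection sites in the slide's query can both be bound as parameters while the query is still built conditionally. The following is a minimal sketch, not code from the studied applications: the class name `UserQuery` and the use of `null` to mean "condition not set" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the slide's query with both injection sites
// expressed as placeholders, and collects the values to bind in order.
final class UserQuery {
    final String sql;
    final List<Object> params;

    private UserQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    // A null argument stands in for "condition1" / "condition2" being false.
    static UserQuery build(String userName, String password) {
        StringBuilder sql = new StringBuilder("select id from users where 1=1");
        List<Object> params = new ArrayList<>();
        if (userName != null) {
            sql.append(" and name=?");      // injection site 1, now a placeholder
            params.add(userName);
        }
        if (password != null) {
            sql.append(" and password=?");  // injection site 2, already a placeholder on the slide
            params.add(password);
        }
        return new UserQuery(sql.toString(), params);
    }
}
```

With JDBC, the collected values would then be bound in order, for example with `PreparedStatement.setObject(i + 1, params.get(i))`. String concatenation still happens here, but only with constant fragments.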
  • 9. ► When parameterized queries are used correctly there are no real opportunities for SQL injection ► The root of SQL injection therefore is string concatenation of a query string with tainted data ► Bright idea: If we could eliminate the habit of developers using string concatenation for queries, we could eliminate SQL injection ► Let's examine the "common" security advice given to developers: ► “Use an ORM” ► “Use parameterized queries” The root of all evil
  • 10. "Use an ORM" ► ORMs are not typically vulnerable to SQLi ► Most projects use both ORM and non-ORM Project ID SQL 2 4 6 7 9 10 11 13 17 18 20 22 23 24 27 JDBC API ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ Hibernate ORM ✓ ✓ ✓ Hibernate Non-ORM ✓ ✓ ✓ ✓ ✓ ✓ ✓ JPA ORM ✓ ✓ ✓ ✓ ✓ ✓ ✓ JPA Non-ORM ✓
  • 11. ► Overall combination of ORM and non-ORM SQL access ► Non-ORM: JDBC, Hibernate SQL / HQL, and JPA JPQL queries ► ORM: EntityManager.find(), EntityManager.persist(), Hibernate Criteria, etc. ► ORMs are common in our dataset (67%) ► Most use JPA, some use Hibernate exclusively ► Use of a query language (HQL, SQL, etc.) is common to all projects ► SQL accounts for 63% of actual queries ► HQL is second, with 31% queries ► Conclusion: string-based query construction is still very common in our dataset, even in applications using ORMs. ORM versus Non-ORM
  • 12. ► “Use an ORM” ► Projects use a mixture of ORM and query languages. There is a need for query languages! ► “Use parameterized queries” Eliminate String Concatenation
  • 13. ► Analysis of 985 injection sites, and 545 unique queries (including JDBC, JPA, and Hibernate) ► There is an average of 1.8 injection sites per query with a maximum of 22 injection sites in one query. ► We identified 10 different SQL contexts ► Good: 85% of the injection sites are associated with a SQL context that can be parameterized ► Problem: 15% of the injection sites cannot be parameterized SQL Contexts and Parameterization
  • 14. Parameterizable vs. Not parameterizable “Use parameterized queries”
  • 15. ► Out of the 833 parameterizable contexts, 94% are parameterized ► It’s still not perfect; here's the breakdown of the remaining 6% that could be parameterized Parameterizable Contexts ► Constants: 19% ► Constant identifiers ► Class names ► Strings: 34% ► Good ol' strings ► Safe types: 47% ► Numeric, Date, etc.
  • 16. ► SQL_FULL_STATEMENT (5.8%): the entire SQL query is not resolvable statically (DB, SQL file, configuration files, user, etc.) ► It’s not a “real” context and most likely a design decision ► For web requests, they should have anti-CSRF and access controls in place ► SQL_TABLE (4.1%): table name ► SQL_IDENTIFIER (1.4%): column name or a variable ► SQL_GENERIC (4.2%): keyword, etc. Non-Parameterizable Contexts
  • 17. ► Table name: stmt1= c.createStatement(); stmt1.execute("TRUNCATE TABLE `"+ table + "`"); ► Trigger name: Statement st = conn.createStatement(); schema = StringUtils.quoteIdentifier(schema); String trigger = schema + '.' + StringUtils.quoteIdentifier(PREFIX + table); st.execute("DROP TRIGGER IF EXISTS " + trigger); ► Conclusion: developers still need to concatenate in some queries. Parameterize This!
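Since a table name cannot be bound as a parameter, one common way to keep the concatenation safe is to accept only names from a fixed allowlist. The sketch below illustrates that idea for the TRUNCATE example; it is an assumption about how an application might do it, not something taken from the analyzed projects. The class name and the table names `audit_log` and `session_cache` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard for the SQL_TABLE context: only known table names
// may ever reach the concatenated statement.
final class TableNames {
    // Assumed list of tables the application is allowed to truncate.
    private static final Set<String> ALLOWED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("audit_log", "session_cache")));

    static String truncateStatement(String table) {
        if (!ALLOWED.contains(table)) {
            throw new IllegalArgumentException("Unexpected table name");
        }
        // Backtick quoting as in the slide's MySQL-style example.
        return "TRUNCATE TABLE `" + table + "`";
    }
}
```

The allowlist works here because the set of valid tables is known in advance. When names are truly dynamic, a dialect-specific identifier quoting routine is needed instead.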
  • 18. ► “Use an ORM” ► Projects use a mixture of ORM and non-ORM SQL technologies. There is a need for query languages. ► “Use parameterized queries” ► 15% of injection sites cannot be parameterized ► There's hope, most parameterizable ones are parameterized (94%)! ► 2% of parameterizable injection sites use dynamic strings that aren't parameterized; opportunity for SQL injection Eliminate String Concatenation
  • 19. ► String concatenation cannot easily be eliminated ► How can its effects be mitigated? ► Let's examine some other "common" security advice: ► “Do input validation” ► “Escape special characters” Now What?
  • 20. ► Input validation = “Security through serendipity” ► Does the domain of valid input values happen to preclude security problems for all contexts where the data is used? Not necessarily. ► The input validation of data should be dictated by the functional requirements of the application, not by the security obligations of SQL contexts. ► “Why can’t my password contain % ?” ► “Why can’t Miles O’Brien create a profile?” ► Conclusion: you happen to be safe sometimes, you just cannot guarantee it throughout the application. “Do Input Validation”
  • 21. ► “Do input validation” ► Different contexts have different security obligations ► Functional requirements should drive input validation ► “Sanitize special characters” Mitigate String Concatenation
  • 22. ► What's a special character? Different dialects have different requirements. ► Does escaping work for default PostgreSQL >= 9.1? ► What needs escaping in a quoted identifier? ► Escapers might be useful in a special case where the application is doing a lot of dynamic queries and parameterization would require large scale refactoring. ► Not all contexts can use an escaper. Some characters just need to be filtered out. Sanitize special characters
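For the quoted-identifier question, the editor's notes for this slide state that a double-quoted identifier can contain double quotes as long as they are doubled. A minimal sketch of that rule follows. The class and method names are hypothetical, and rejecting empty names and NUL characters is an added assumption rather than something the slides specify. Dialects that quote with backticks need a different routine.

```java
// Hypothetical escaper for a double-quoted SQL identifier.
final class SqlIdentifiers {
    static String quoteIdentifier(String name) {
        // Assumption: empty names and NUL characters are never legitimate.
        if (name == null || name.isEmpty() || name.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("Invalid identifier");
        }
        // Double every embedded double quote, then wrap the result in quotes.
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }
}
```

This handles a different security obligation than string-literal escaping: backslashes and single quotes are left alone here because they have no special meaning inside a standard quoted identifier.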
  • 23. ► “Do input validation” ► Different contexts have different security obligations ► Application requirements should drive input validation ► “Sanitize special characters” ► Different dialects have different requirements ► Can be applied incorrectly Mitigate String Concatenation
  • 24. ► String concatenation cannot be eradicated from SQL queries. ► Input validation may not be adequate, sanitizers can help but can also hinder. ► Security ought to provide developers helpful advice: ► Code, code, code… ► Specific to a technology (JDBC, Hibernate HQL, etc.) ► And specific to a context (table name, IN clause, etc.) ► Working with developers to build this kind of documentation for your organization’s specific use of technologies and query styles will help reduce SQL injection defects. SQL Conclusions
  • 26. ► From a code perspective, XSS isn’t a single vulnerability. It’s a group of vulnerabilities that mostly involve injection of tainted data into various HTML contexts. ► There is no one way to fix all XSS vulnerabilities with one magic “cleansing function”. Developers really need to understand this. ► Let's examine the "common" security advice given to developers: ► “Use HTML escaping” ► “Don’t insert user data in <random HTML location here>” ► “Use auto-escaping template engines” XSS is confusing
  • 27. ► HTML entity escaping can be used in several locations in an HTML document ► It can be used when the web browser will be able to process the HTML entities. For example, it is correct for the value of a <div> or an HTML attribute. ► However, HTML escaping cannot be used in all cases: you wouldn’t escape an attribute name, or the content of a <style> ► To better understand when we can use this HTML escaper and when we cannot, we need to talk about HTML contexts “Use HTML escaping”
  • 28. ► No documentation really describes what HTML contexts are or gives a good list of them ► Practically, many places in an HTML page (inc. CSS, JS, etc.) have the same security obligation. These are the HTML contexts. ► Simple example, double quoted HTML attribute value <div id="${inj_var}">… <pre class="${inj_var}">… ► Some of the contexts: ► JavaScript string, HTML tag name, HTML attribute name, CSS string, CSS code, JavaScript code, HTML RCDATA, HTML PCDATA, HTML script tag body, URL, etc. HTML Contexts?
  • 29. <a href="javascript:hello('${inj_var}')"> The joy of nested contexts HTML Contexts Stack Single quoted JavaScript string JavaScript code URI Double quoted HTML attribute value Single Quoted JavaScript String URI Double Quoted HTML Attribute Value JavaScript Code ► Simplified: Our data is inserted inside a JavaScript String inside a URI inside an HTML attribute ► We’ll note: HTML Attribute -> URI -> JS String
  • 30. <a href="javascript:hello('${inj_var}')"> stack := {HTML Attribute -> URI -> JS String} ► The web browser will: ► Take the content of the attribute href and HTML decode it ► Analyze the URI and recognize the javascript: scheme ► Take the content of the URI (i.e., after the scheme) and URL decode it ► Since it’s supposed to be JavaScript, the extracted content hello('${inj_var}') is sent to the JS engine for rendering ► The JS engine will parse the program and especially the string that contains our ${inj_var} by doing a JS string decoding ► Based on this flow we know what needs to be done to make a safe insertion for ${inj_var} in this context The joy of nested contexts
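Because the browser decodes from the outside in (HTML, then URI, then JS string), the data has to be escaped from the inside out: JS string escaping first, then URI encoding, then HTML escaping. The sketch below shows only that ordering. The three escapers are deliberately conservative stand-ins written for this example, not the ESAPI or CSL implementations, and all names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical escaper chain for the stack HTML Attribute -> URI -> JS String.
final class NestedContextEscaper {

    // Innermost context: every character outside [A-Za-z0-9] becomes a
    // four-digit JavaScript unicode escape.
    static String jsString(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    // Middle context: percent-encode for use inside a URI.
    static String uri(String s) {
        try {
            // URLEncoder does form encoding, so a space becomes '+'; turn it into %20.
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Outermost context: quoted HTML attribute value.
    static String html(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&#39;");
    }

    // Escapers are applied in the reverse of the browser's decoding order.
    static String forJsStringInUriInAttribute(String tainted) {
        return html(uri(jsString(tainted)));
    }
}
```

The result would be placed where `${inj_var}` appears in the slide's example. Reversing the order of the three calls would leave the innermost JavaScript string unprotected once the browser has undone the outer layers.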
  • 34. ► “Use HTML escaping” ► Projects are using a lot of different HTML contexts, and it’s not getting better ► “Don’t insert user data in <insert HTML location here>” ► “Use auto-escaping template engines” Fix XSS
  • 35. ► HTML escaping is important and can be correctly used in 50% of the injection sites ► In addition, it makes safe 70% of them when implemented correctly (but makes the data incorrect) ► However, let’s look at the different implementations of HTML escapers Focusing on the HTML escaping…
  • 36. Dive into HTML escapers… ► HTML escaping should be simple ► We found 76 different HTML escapers in our dataset ► Most of them belong to a framework/library used by the application ► On average, a project has 5.7 HTML escapers ► Not all HTML escapers are equivalent…
  • 37. ► “Use HTML escaping” ► Projects are using a lot of different HTML contexts, and it’s not getting better ► HTML escaping seems to mean a lot of different things… Only 41% of the implementations seem sufficient ► “Don’t insert user data in <insert HTML location here>” ► “Use auto-escaping template engines” Fix XSS
  • 38. ► We found that much security guidance just says “don’t do this” ► OWASP XSS cheat sheet [2] has the following rule: “Don’t insert data into <?>” Never put data HERE “Context” name <script>...HERE...</script> Directly in script <!--...HERE...--> HTML comment <div ...HERE...=test /> Attribute name <HERE... href="/test" /> Tag name <style>...HERE...</style> Directly in CSS
  • 39. Observed HTML contexts and nesting
  • 40. Observed HTML contexts and nesting
  • 41. ► “Use HTML escaping” ► Projects are using many different HTML contexts, and it’s not getting better ► HTML escaping seems to mean a lot of different things… Only 41% of the implementations seem sufficient ► “Don’t insert user data in <insert HTML location here>” ► Developers need to add data into many HTML contexts ► “Use auto-escaping template engines” Fix XSS
  • 42. ► A good trend with template engines is to provide auto-escaping. ► Auto-escaping means that the engine will take the dynamic data and escape it without any directive from the developer. ► Such reasoning is great and clearly makes a typical web application more secure. ► However, do they understand the contexts and provide the required escaping/filtering to be safe for XSS? Auto-escaping template libraries
  • 43. Auto-escaping template libraries
      Engine        | Contextual autoescaping | HTML autoescaping | On by default | Manual escaping | Escapes '
      GCT           | Yes | N/A | Yes | N/A | Yes
      .NET Razor    | No  | Yes | Yes | Yes | Yes
      Ruby on Rails | No  | Yes | Yes | Yes | No
      HAML          | No  | Yes | Yes | Yes | No
      NHAML         | No  | Yes | Yes | Yes | No
      Facelets      | No  | Yes | Yes | Yes | N/A
      Mustache      | No  | Yes | Yes | Yes | No
      Twig          | No  | Yes | No  | Yes | Yes
      Smarty        | No  | Yes | No  | Yes | Yes
      Spring        | No  | No  | N/A | Yes | N/A
      .NET WebForms | No  | No  | N/A | No  | N/A
      JSP           | No  | No  | N/A | Yes | N/A
    ► Some don’t even escape ' properly ► Only GCT provides a fairly good context-aware auto-escaping, others essentially perform HTML escaping all the time
  • 44. ► “Use HTML escaping” ► Projects are using a lot of different HTML contexts, and it’s not getting better ► HTML escaping seems to mean a lot of different things… Only 41% of the implementations we saw seem sufficient ► “Don’t insert user data in <insert HTML location here>” ► Developers need to add data into many HTML contexts ► “Use auto-escaping template engines” ► Sometimes a false sense of security Fix XSS
  • 45. ► I’m not saying that we have been all wrong until now. We just need to come up with a more complete solution for XSS: ► Sadly, there is no one silver bullet for XSS ► We cannot rely on one escaping function or filter ► We cannot rely yet only on most auto-escaping template engines ► For the time being, we need: ► Libraries that allow developers to escape the data properly for many contexts, and filter the data when an escaper cannot be used ► Give actionable guidance to the developers Sad truth about XSS
  • 46. ► List of required escapers based on our dataset: ► HTML escaping (^85%) ► URI encoding (^30%) ► JavaScript string escaping (^13%) ► CSS string escaping (^0.16%) ► JavaScript regex escaping (^0.01%) ► Libraries like OWASP ESAPI [3] or Coverity Security Library [4] already have these escapers; promote them! Escapers should be available
  • 47. ► HTML contexts that require a filter based on our dataset: ► HTML Attribute -> URI (^30%) ► HTML Attribute -> JS code (0.2%) ► HTML Attribute -> CSS (0.3%) ► Etc. ► How to handle URIs? ► Filter the scheme to make sure you allow it (http, https, etc.) ► Make sure the authority makes sense for your application ► URI encode each element of the path ► URI encode query string parameter names and values ► ESAPI has an isValidURL, which seems very restrictive, and CSL asUrl rewrites the URL to make it safe (but you need to manually encode the params & paths) Filters should be available
  • 48. ► XSS is complex because it can happen in many different locations in the HTML pages (HTML contexts). ► Input validation may not be adequate ► Sanitizers can help but it’s very difficult to find a complete library for XSS ► OWASP XSS cheat sheet is a good start, but I believe we should do more: ► We started blogging [5] about these kinds of issues and will continue doing it as we improve our technologies or just come across “interesting stuff” ► We will also continue to improve CSL [4] by adding filters, etc. ► We need to be more serious about raising awareness of HTML contexts XSS Conclusions
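The URI handling steps can be sketched with the standard library alone. This is an illustrative filter and encoder, not the behavior of ESAPI's isValidURL or CSL's asUrl. The class name, the method names, and the decision to reject relative URLs are assumptions made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical URI filter: allow only expected schemes, and encode each
// path element or query parameter individually.
final class UrlFilter {
    private static final Set<String> ALLOWED_SCHEMES =
            new HashSet<>(Arrays.asList("http", "https"));

    // Step 1: filter the scheme. Unparseable and relative URLs are rejected
    // here, which is an assumption; an application may want to allow relative ones.
    static boolean hasAllowedScheme(String url) {
        try {
            String scheme = new URI(url).getScheme();
            return scheme != null
                    && ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Steps 3 and 4: encode one path element, parameter name, or parameter value.
    static String encodeComponent(String value) {
        try {
            // URLEncoder does form encoding, so a space becomes '+'; turn it into %20.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Checking the authority against the hosts the application expects (step 2) is left out because it depends entirely on the deployment. The filtered URL still needs HTML escaping when it is written into an attribute value.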
  • 50. ► It’s getting better. More and more developers have a basic understanding of security issues like XSS and SQLi. ► For SQL, developers parameterized 94% of the time when possible ► For HTML, we see many places where HTML escaping is used ► But it’s just too complex and convoluted for them to understand everything ► Security obligations of each context ► Nested contexts and ordering of escapers ► There are gaps in the way security information is presented online; StackOverflow is a scary place. However, if you don't give them advice, who will? What’s up with developers?!
  • 51. ► Focus developer communication on the code point of view, as opposed to the attacker’s point of view: ► Attacks and threats are the "why" we need to fix ► Developers need the "how" and "where" to fix ► Provide helpful advice: ► Actionable (i.e. code, code, and more code) ► To a specific technology (JSTL, Hibernate HQL, jQuery, etc.) ► In a specific context or contexts (IN clause, URI, CSS string, etc.) ► What does helpful advice look like? ► Not this: "Parameterize all SQL statements." ► How about this?
Ways forward
  • 52. How to Parameterize Data Values Example: "SELECT * FROM table WHERE name = '" + userName + "'" JDBC String paramQuery = "SELECT * FROM table WHERE name = ?"; PreparedStatement prepStmt = connection.prepareStatement(paramQuery); prepStmt.setString(1, userName); Hibernate Native Query String paramQuery = "SELECT * FROM table WHERE name = :username"; SQLQuery query = session.createSQLQuery(paramQuery); query.setParameter("username", userName); JPA Native Query String paramQuery = "SELECT * FROM table WHERE name = :username"; Query query = entityManager.createNativeQuery(paramQuery); query.setParameter("username", userName); Helpful Advice
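The same style of advice can be written for other contexts the talk names, such as the IN clause. The sketch below is an illustration added here, not one of the original slides: it generates one placeholder per value so that the list can still be bound through a prepared statement. The class and method names are hypothetical.

```java
// Hypothetical helper for the IN clause context: the number of placeholders
// is dynamic, but the concatenated text never contains user data.
final class InClause {
    static String placeholders(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("IN clause needs at least one value");
        }
        StringBuilder sb = new StringBuilder("?");
        for (int i = 1; i < count; i++) {
            sb.append(", ?");
        }
        return sb.toString();
    }
}
```

A query would then be assembled as `"SELECT * FROM table WHERE name IN (" + InClause.placeholders(names.size()) + ")"`, where `names` is an assumed list of values, and each value bound with `prepStmt.setString(i + 1, names.get(i))`.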
  • 53. How to add data in a CSS string within a CSS block using CSL. Provide several examples that show the common usages for different HTML contexts, and technologies your developers are using Expression Language <style> a[href *= "${cov:cssStringEscape(param.foobar)}"] { background-color: pink!important; } </style> Java or JSP Scriptlet <style> <% String parm = request.getParameter("foobar"); String cssEscaped = Escape.cssString(parm); %> a[href *= "<%= cssEscaped %>"] { background-color: pink!important; } </style> Helpful Advice
  • 54. ► XSS and SQLi can be nuanced and complex. XSS is actually becoming more and more complex over time as the number of contexts and nesting increases. ► The advice we currently see is either: ► Generic remediation ► Precisely how to perform the attack ► Guidance needs to be both more specific and technical, yet also simpler and shorter. The only way to get that is to take tailored, detailed advice and crop it down for the development team’s needs. ► Remember: your developers need some <3 Conclusions
  • 56. ► [1] HTML 5 tokenizer specification ► [2] OWASP XSS Cheat Sheet ► [3] OWASP ESAPI Java ► [4] Coverity Security Library Java ► [5] Coverity Security Research Blog ► [6] OWASP SQLi Prevention Cheat Sheet References

Editor's Notes

  1. This talk has 3 sections: SQL injection, XSS, and interaction with developers
  2. Studied where developers add dynamic data in HTML or SQL
  3. Largest application is ~1MLOC, smallest ~10KLOC
  4. Computing contexts requires dataflow analysis, and tracking the strings inside the application (concats, mutations, etc.)
  5. Injection sites are essentially the entire space of eventual string concatenation with a nested language. The abstract string generated from this query has 4 possible instantiations (based on each conditional). Since we only care about the injection sites, we’d report 2 there, one being the ? (with special analysis around it). We decided to also consider ? as an injection site: the developers happen to use a correct API, but they still aimed to add some data to a query.
  6. This point excludes stored procedures.
  7. Define ORM here: Object-Relational Mapping. Practically, you use Java objects to shadow the DB tables and columns (JDBC API, etc.)
  8. Quickly define SQL contexts here
  9. Interestingly, there is no segmentation of strings and safe types. Looking at some projects specifically, we can clearly see that writing such queries is a habit. Developers sometimes don’t know about parameterized statements and keep using string concatenation.
  10. 9% are the queries that contain one injection site that could be parameterized but isn’t SQL_GENERIC: everything that does not belong to the previous contexts (which we did not find so interesting to group)
  11. SQL/HQL/SQL
  12. PostgreSQL 9.1 by default sets standard_conforming_strings to 'on'. When set, only the PostgreSQL special "escape string constants", denoted with an "E" in front of the string like E'foobar', honor a backslash escape. Otherwise, normal string constants treat a backslash as a normal character. A quoted identifier can contain anything, even double quotes. The double quotes just need to be doubled up to be escaped. Some dialects also support a backtick as a quote.
  13. Mostly inj in different HTML contexts because flash, etc. aren’t really about HTML contexts
  14. Usually -> in XHTML applications, it is possible to HTML escape most of the content
  15. Not one place to look for security obligations for HTML contexts. - HTML related: html5 tokenizer spec and html4 spec - JavaScript grammars - CSS grammars Then, need to integrate some quirks? How to capture the behaviors from browsers… it’s fairly tough; we cannot have an exhaustive list of behaviors What happens is that ${inj_var} has the same security constraints: - cannot contain a “ - the behavior is that the web browser will HTML decode the content of foo or bar before processing them (if it does)
  16. Look back at the example: <a href="javascript:hello('${inj_var}')"> What do we need to do to fix an issue in these nested contexts? -> JS string escaping -> URI encoding -> HTML escaping No need to worry about specific injections in a URI context (scheme issue) here…
  17. JavaScript string escaping URL encoding HTML escaping
  18. We analyzed more than 20K injection sites for HTML, which cover 26 different stacks of HTML contexts. We found context stacks with up to 3 elements in our dataset, and more than 45% of the context stacks have 2 contexts
  19. This is an example of a project we analyzed over time. For each revision, we looked at the distribution of HTML contexts, and plotted this area graph. We can clearly see the increasing number and diversity of contexts in the application.
  20. I doubt we can talk about a reference implementation, but we developed one that works for HTML normal elements and HTML attributes, available in the Coverity Security Library. Some escapers detected as HTML escapers are XML escapers and might have a very limited scope… this might bias the data. Example of implementations: - ESAPI has 2 (attr and normal elements)
  21. Border: terminal node for the generated strings (e.g., there is an output in that context) Egg shape: some kind of filtering is required Page shape: escaping is possible Flow := combination of nested contexts, each remediation requirement should be met in order This representation has several limitations; a “real” representation of the contexts would be a graph with cycles (e.g., JS STRING contains HTML that can contain JS …)
  22. Border: terminal node for the generated strings (e.g., there is an output in that context) Egg shape: some kind of filtering is required Page shape: escaping is possible Flow := combination of nested contexts, each remediation requirement should be met in order This representation has several limitations; a “real” representation of the contexts would be a graph with cycles (e.g., JS STRING contains HTML that can contain JS …)
  23. HTML escaping all the time gives a false sense of security. It’s not correct all the time. GCT (Google Closure Templates) caveat: bad handling of URI, injection in href=“javascript:<HERE>” won’t be dealt with, etc. It’s tough to write an HTML context parser! Note that OWASP seems to have one (owasp-jxt), but no update since 2011
  24. ^ : at most – escapers percentage are computed based on the distribution of context. We split the context stacks and resolve all necessary sanitizers based on the individual contexts
  25. Filters are to be used when an escaper cannot be.
  26. StackOverflow is becoming the most important resource for developers.
  27. Advice needs to be tailored based on what’s used in the application.
  28. In this context RAWTEXT -> CSS STRING, (nested context), we designed the escaper to behave that way: - Ensure that the literal </style> cannot be part of the string (otherwise, closes the <style>) - The string cannot be escaped from by either closing the quote, or generating a parse error from the CSS parser (which will try to recover from it and can create a CSS injection condition)
  29. Current advice is not really targeted at developers; they need more precise guidance based on the languages, frameworks, and libraries they’re using…