This document discusses monitoring software repositories to detect security issues. It introduces a tool called SANZARU that analyzes commits to repositories to identify potential bugs and vulnerabilities. SANZARU works by extracting vectors from commit data, training a classifier on past issues, and then classifying new commits. Its goals are to detect security fixes, new vulnerabilities, and interesting new features. The document provides examples of issues SANZARU has found and discusses challenges in commit classification.
2. About me
● Security consultant (C.T.O.) working for Securus Global in Melbourne
● PentesterLab (.com):
○ cool/awesome (web) *free* training/exercises
○ real life scenario
3. Disclaimer
● No code is going to be released today
● No repositories were harmed during the preparation of this talk
● I worked on Web and Open Source projects
● I worked on commits without using the entire project's source code
4. Why work on commits?
● Corporate development:
○ Cannot review all projects anymore
○ Nice to have a “what to check today”
○ Sort commits by criticality
○ Detect backdoors
● Agile development:
○ The code changes every day
○ Can’t rely on one time code review anymore
○ Current approach: daily scan
5. Why work on commits?
● You have vulnerabilities:
○ Detect patches affecting your bugs
○ Detect changes to sensitive functions
6. Why work on commits?
● You want vulnerabilities ($$):
○ Detect new features with dangerous functions
○ Detect changes to sensitive functions
7. Why work on commits?
● You want bugs (lulz):
○ Get bugs a few hours before the patch is available
○ Get a list of bad-practice examples
○ Detect silent patching
8. What's a repository?
● Developers
● Files
● Commits
● And all of these are constantly moving...
9. Developers
● Main developer(s):
○ Add features
○ Fix bugs
● Cosmetic committer(s):
○ Change comments (fix typo)
○ Change designs of the website
○ Change indentation
○ Add documentation
● External people
○ Do a bit of everything
14. Stats (on the last 5000 commits)
● Commits per week:
○ anywhere between 20 and 180 (phpMyAdmin) per week
○ 40 commits per week seems to be the average for "normal/interesting" projects
● Authors:
○ between 1 and 140
● Average commit: 200 lines (insertions+deletions)
21. Filtering files
● General approach:
○ images
○ css
○ README
● Framework based:
○ tests (interesting to keep for some projects)
○ database migration/creation script
● Project based files
○ deployment
○ installation files
22. Filtering developers
● For a given project, find the "cosmetic developers"
● Don't get me wrong: they are not useless, they just do things I don't care about
● Around 5-10% of commits have nothing to do with code...
● You can divide the size of most other commits by 2-3 if you ignore noise (files/comments/...):
○ new code with test cases
○ modification in comments
○ ...
25. Data mining
● Take your samples (commits)
○ Extract a vector from each sample
○ Classify each sample
● From a training set, learn to classify the data
● Apply what you learned:
○ to the same training set after splitting it (cross-validation)
○ to new samples
26. Data mining
● training set:
[1,2,3,0,10,220 ] -> bugfix
[2,4,3,0,1,0 ] -> boring
[2,5,3,3,1,1 ] -> boring
[20,1,0,100,0,10 ] -> new bug
● testing:
[23,0,1,90,0,15 ] -> ???
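To make the example concrete, here is a sketch that classifies the test vector with a 1-nearest-neighbour rule and runs a leave-one-out cross-validation over the training set. This is a stand-in technique chosen for brevity: the talk uses Weka and does not say which algorithm it runs, and the function names are hypothetical.

```python
import math

# Training set copied from the slide; the meaning of each column is not given there.
TRAINING = [
    ([1, 2, 3, 0, 10, 220], "bugfix"),
    ([2, 4, 3, 0, 1, 0], "boring"),
    ([2, 5, 3, 3, 1, 1], "boring"),
    ([20, 1, 0, 100, 0, 10], "new bug"),
]


def classify(vector, training):
    """Return the label of the training sample closest to `vector` (1-NN)."""
    _, label = min(training, key=lambda sample: math.dist(sample[0], vector))
    return label


def leave_one_out_accuracy(training):
    """Cross-validation: classify each sample using all the other samples."""
    hits = 0
    for i, (vector, label) in enumerate(training):
        rest = training[:i] + training[i + 1:]
        hits += classify(vector, rest) == label
    return hits / len(training)


print(classify([23, 0, 1, 90, 0, 15], TRAINING))  # prints "new bug"
```

With only four samples the cross-validation figure says nothing useful; it becomes meaningful on a training set of thousands of labelled commits. Plain Euclidean distance also lets large-valued columns dominate, so features would normally be scaled first.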
27. Extracting a vector
● You can't really say a commit is close to another commit
● You need to generate a vector from each commit to compare them
● Once you have done that, everything else is just magic^W Maths
28. Extracting a vector: getting data
● Number of lines changed:
○ insertion vs deletion
● Number of words changed (--word-diff):
○ insertion vs deletion
● Authors:
○ rating of authors based on the project's history
■ "fixing" score
■ "vulnerability creator" score
○ new developers
○ known security researchers
29. Extracting a vector: getting data
● Number of "dangerous" functions:
○ insertion
○ deletion
● Number of "filtering" functions:
○ insertion
○ deletion
● commit date vs author date
● Keywords in the message and in the code
30. Extracting a vector: getting data
● Files modified:
○ already implicated in a bug fix
○ already implicated in a vulnerability
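The sketch below turns a unified diff into a small vector covering a few of the features listed on these slides: lines inserted and deleted, and "dangerous" and "filtering" calls inserted and deleted. The function lists and the names `diff_vector` and `count_calls` are assumptions for illustration. Author ratings, file history, keywords and dates would be further columns, which the talk does not detail.

```python
import re

# Hypothetical PHP function lists; a real run would load graudit-style signatures.
DANGEROUS = ("system", "exec", "eval")
FILTERING = ("htmlspecialchars", "intval", "escapeshellarg")


def count_calls(lines, names):
    """Count calls such as `name(` to any of `names` in the given lines."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\s*\(")
    return sum(len(pattern.findall(line)) for line in lines)


def diff_vector(diff_text):
    """Return [lines added, lines removed, dangerous added, dangerous removed,
    filtering added, filtering removed] for a unified diff."""
    added, removed = [], []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers (this also skips rare content lines starting with ++ or --)
        if line.startswith("+"):
            added.append(line[1:])
        elif line.startswith("-"):
            removed.append(line[1:])
    return [
        len(added), len(removed),
        count_calls(added, DANGEROUS), count_calls(removed, DANGEROUS),
        count_calls(added, FILTERING), count_calls(removed, FILTERING),
    ]


SAMPLE_DIFF = '''--- a/ping.php
+++ b/ping.php
@@ -1,2 +1,2 @@
 <?php
-system("ping " . $_GET["host"]);
+system("ping " . escapeshellarg($_GET["host"]));
'''

print(diff_vector(SAMPLE_DIFF))  # prints [1, 1, 1, 1, 1, 0]
```

In the sample, the only change between the deleted and inserted line is one added filtering call, which is the typical shape of a small security fix.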
31. Filtering vs Dangerous
● Good list of "dangerous" signatures from graudit:
○ https://github.com/wireghoul/graudit/
● Weighting is *really* important:
○ echo -> potential XSS -> 1 point
○ system -> potential command execution -> 10 points
● Some functions are in both:
○ crypto functions for example
○ crypto can be dangerous but can filter as well
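A sketch of weighted scoring, using the two weights given on the slide (`echo` = 1 point, `system` = 10 points). Matching on whole words is a simplification assumed here, and `danger_score` is a hypothetical name; graudit signatures are regular expressions and would be matched as such.

```python
import re

# Weights from the slide; any further entries would be project-specific choices.
WEIGHTS = {"echo": 1, "system": 10}


def danger_score(code):
    """Sum the weights of every whole-word occurrence of a weighted function."""
    score = 0
    for name, weight in WEIGHTS.items():
        occurrences = re.findall(r"\b" + re.escape(name) + r"\b", code)
        score += weight * len(occurrences)
    return score


print(danger_score("echo $name; system($cmd);"))  # prints 11
```

Without weights, a commit adding ten `echo` statements would look as alarming as one adding ten `system` calls; with them the scores are 10 and 100.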
34. Classification
● Fixed bugs:
○ learn from dangerous keywords
● New bugs:
○ git blame
○ read the source code and classify manually
● Potentially interesting new feature:
○ read the source code
○ can be a new bug
35. Results
● Vector computation:
○ between 15 and 120 minutes for 5000 commits
● Classification:
○ less than a minute
● Scoring:
○ 90% success rate on bug fixes (without using the message as part of the vector)
○ errors on bug fixes split 50/50 between false positives (FP) and false negatives (FN)
○ 200 commits down to 5-10 bugs per day
36. My tool: SANZARU
● Japanese names for tools make you a Ninja ;)
● Ruby based (what else...)
● Data Mining done with Weka (thx Silvio)
37. SANZARU: virtuous circle
● Made in a way that the more you learn on a project, the more effective it gets :)
● Score authors through learning
● Score files through learning
● Add functions used by the project
38. SANZARU: "learning mode"
● Takes the last 5k commits and gives you the list of impacted files and authors with a weight
● Still working on finding the author who originally introduced each bug, but it doesn't really give you more information
41. SANZARU: "classification mode"
● Using ruby to create all the vectors
● Using weka to classify the data
● Then manual review of the results:
○ New features to find security bugs
○ FP for possible silent patching
42. SANZARU: "daily mode"
● Cron job (every day)
○ update all repositories (haven't been blacklisted by GitHub... yet); ruby-git is *shit*
○ find alerts in new commits
○ classify new commits
○ give me a nice report with what to read
47. General observations
● Most fixes are:
○ small code insertion (less than 10 lines)
○ basic line substitution
○ easy to detect
● Most new bugs are:
○ details...
○ really hard to detect statistically
○ general approach: read all potentially interesting commits
○ working on important projects makes the creation of bugs far less likely
○ it's not going to rain 0dayz...
48. Possible improvements
● Integrating syntactic analysis:
○ regular expressions are just not enough
○ False alerts are time consuming...
● Retrieve information from external sources:
○ bug report
○ CVE
● Support for more languages/platforms:
○ Objective C libraries and applications?
○ Linux kernel?
○ ...
49. Conclusion
● Easy to detect:
○ (Silent) Security Fixes
○ New features with "interesting" functions
● Not so easy to detect
○ New security bugs
● Still worth the time
○ if you want bugs
○ if you are doing code review to have examples to learn from or share: vulnerability patterns
○ most frustrating thing you can do?
50. Questions?
@snyff
● Have a great Ruxcon
● Play the CTF and Lock Picking
● Remember to check out:
○ PentesterLab.com
○ @PentesterLab
● Thx to everyone who helped me putting this talk together