Attacks Against Captcha Systems - DefCamp 2012

DefCamp

Attacking CAPTCHAs explained

Ioan – Carol Plangu

What's a CAPTCHA

Completely Automated Public Turing test to tell Computers and Humans Apart

Three attack methods

− Implementation attack
− Automated recognition
− Manual labor

The implementation attack
Scenario 1

the image session id can be reused

[Diagram: id, Captcha form, Restricted page]
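
A minimal sketch of Scenario 1, not the speaker's code: if the server keeps the expected answer under the captcha id and never invalidates it, one solved challenge can be replayed as often as needed. The class and function names here are hypothetical, and the single-use branch shows one common fix (the id is consumed on the first check).

```python
import secrets


class CaptchaStore:
    """Toy server-side store mapping a captcha id to its expected answer."""

    def __init__(self, single_use):
        self.single_use = single_use
        self.answers = {}

    def issue(self, answer):
        # Hand out a random id; the client sends it back with the form.
        captcha_id = secrets.token_hex(8)
        self.answers[captcha_id] = answer
        return captcha_id

    def check(self, captcha_id, guess):
        if self.single_use:
            # Fix: the id is consumed on the first attempt, right or wrong.
            expected = self.answers.pop(captcha_id, None)
        else:
            # Flaw: the id stays valid, so one solved captcha can be replayed.
            expected = self.answers.get(captcha_id)
        return expected is not None and secrets.compare_digest(expected, guess)


def replay(store, attempts=3):
    """Solve one captcha by hand, then resubmit the same id several times."""
    captcha_id = store.issue("482913")
    return [store.check(captcha_id, "482913") for _ in range(attempts)]
```

With the reusable store every replayed submission passes; with the single-use store only the first one does.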

The implementation attack
Scenario 2

the number of captcha tests is limited

we just need to solve them all and store them in a hash table
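
The hash table idea can be sketched as follows, assuming the server hands out byte-identical image files each time a test is repeated (if the images are re-rendered, an exact-bytes fingerprint would not match). `SolvedCache` and `image_key` are hypothetical names for illustration.

```python
import hashlib


def image_key(image_bytes):
    """Fingerprint the exact image bytes so identical images map to one key."""
    return hashlib.sha256(image_bytes).hexdigest()


class SolvedCache:
    def __init__(self):
        self.solutions = {}  # fingerprint -> solution text

    def remember(self, image_bytes, solution):
        self.solutions[image_key(image_bytes)] = solution

    def lookup(self, image_bytes):
        # Returns None for an image that has not been solved yet.
        return self.solutions.get(image_key(image_bytes))
```

Once every distinct image has been solved by hand and remembered, each later challenge is a single dictionary lookup.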

The implementation attack
Scenario 3

hash of solution sent to client

rainbow tables :)
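
A rough illustration of why a leaked hash is enough when the solution space is small. This is a plain precomputed lookup table, not a true rainbow table (which trades lookup time for memory using hash chains). The unsalted SHA-256 and the numeric alphabet are assumptions; the slides do not say which hash the vulnerable sites used.

```python
import hashlib
from itertools import product


def build_lookup(digits=6, algorithm="sha256"):
    """Precompute hash -> solution for every numeric solution of this length."""
    table = {}
    for combo in product("0123456789", repeat=digits):
        solution = "".join(combo)
        digest = hashlib.new(algorithm, solution.encode("ascii")).hexdigest()
        table[digest] = solution
    return table


def recover(leaked_hash, table):
    # None means the hash is salted, keyed, or outside the precomputed space.
    return table.get(leaked_hash)
```

The table is built once and reused for every challenge. Keeping the solution on the server, or sending only a keyed MAC computed with a server-side secret, defeats this kind of precomputation.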

Manual labor
There are two options:

Pay a bunch of monkeys

Or not...

XXX
Complete this captcha form to continue

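Automated recognition
We're going to actually reproduce a human response for the given question

Can you understand my voice?

The sound sample is usually generated

It's hard to add noise to the generated speech without also making it hard for a human to understand

But can you read?

Sort of.....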

The most common approach

− Greedy optimization – reverse engineer everything
− Character segmentation
− OCR

Possible security measures

Funky background image
− usually can be removed with basic preprocessing

Text distortions
− modern OCR techniques can beat them

Anti segmentation measures
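
The "basic preprocessing" that removes a background image might look like the sketch below. It assumes a grayscale image given as rows of 0-255 integers with text darker than the background; the fixed threshold of 128 and the speck filter are illustrative choices, not taken from the talk.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def remove_specks(binary):
    """Clear ink pixels that have no ink neighbour in the 4 adjacent cells."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if not any(0 <= ny < height and 0 <= nx < width and binary[ny][nx]
                       for ny, nx in neighbours):
                cleaned[y][x] = 0
    return cleaned
```

Thresholding drops light background texture, and the speck filter removes isolated dark pixels that survive it.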

Beating segmentation

If a character's signature can be extracted from the vertical signature alone, character segmentation becomes trivial

A Low-cost Attack on a Microsoft CAPTCHA - Jeff Yan, Ahmad Salah El Ahmad
School of Computing Science, Newcastle University, UK
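
One way to read "vertical signature" is as the count of ink pixels in each column; that reading is an assumption here, as are the function names. Under it, segmentation reduces to cutting wherever the profile drops to zero:

```python
def column_profile(binary):
    """Count ink pixels (1s) in each column of a binary image."""
    return [sum(column) for column in zip(*binary)]


def split_on_gaps(profile):
    """Return (start, end) column ranges, end exclusive, separated by empty columns."""
    segments, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x
        elif not count and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments
```

This only works while characters do not touch or overlap, which is exactly what anti-segmentation measures try to prevent.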

Beating segmentation

We can otherwise ignore it!

The following slides describe an experiment with this approach

A Monte-Carlo experiment

Note: for testing performance, the variance of the characters has been kept to a minimum

f(x) → y
x in binary( 0 - 2^3000 ), i.e. one of 2^3000 possible binary inputs
y in 10^6, i.e. one of 10^6 possible 6-digit strings

Training:

− Select one character image at random
− Select N black spots
− Sort the points for uniqueness
− Subtract the first point from all others for position independence
− Assign the resulting point set a 'weight' for each character using the following formula:
    matched characters count / sample size
− Assign it a 'score' (indicates classification quality):
    selected digit weight / (1 + other digit weights)
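
The training steps above can be sketched as follows. This is one reading of the slide, not the code from the demo repository. It assumes a feature "matches" an image when some ink pixel can anchor it with every offset landing on ink, and that the sample size is counted per character. All function names are hypothetical.

```python
import random


def black_points(image):
    return [(y, x) for y, row in enumerate(image) for x, v in enumerate(row) if v]


def make_feature(image, n, rng):
    """Pick n ink pixels, sort them, and express them relative to the first."""
    points = sorted(rng.sample(black_points(image), n))
    y0, x0 = points[0]
    return tuple((y - y0, x - x0) for y, x in points)


def matches(image, feature):
    """True if some ink pixel can anchor the feature with every offset on ink."""
    ink = set(black_points(image))
    return any(all((y + dy, x + dx) in ink for dy, dx in feature) for y, x in ink)


def weights(feature, samples):
    """weight[c] = images of c matching the feature / number of images of c."""
    return {c: sum(matches(img, feature) for img in imgs) / len(imgs)
            for c, imgs in samples.items()}


def score(feature_weights, selected):
    # selected digit weight / (1 + other digit weights)
    others = sum(w for c, w in feature_weights.items() if c != selected)
    return feature_weights[selected] / (1 + others)
```

Sorting gives each point set a canonical order, so the same set drawn twice yields the same feature, and subtracting the first point makes the feature independent of where the character sits in the image.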

Recognition:

− Make a score map for all points
− Select the most appropriate character for each column
− Process the resulting string into a 6-digit string
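
A sketch of the last two recognition steps, assuming the score map has already been reduced to one score per character per column. The slide does not say how the per-column string becomes the final answer; merging runs of the same character and dropping low-confidence columns is an assumption, as is the `min_score` cutoff.

```python
from itertools import groupby


def best_per_column(score_map, min_score=0.5):
    """score_map: {char: [score per column]}; returns the top char per column or None."""
    width = len(next(iter(score_map.values())))
    picks = []
    for x in range(width):
        char, value = max(((c, s[x]) for c, s in score_map.items()), key=lambda p: p[1])
        picks.append(char if value >= min_score else None)
    return picks


def collapse(picks):
    """Merge runs of the same pick and drop empty columns."""
    return "".join(char for char, _ in groupby(picks) if char is not None)
```

A repeated digit survives only if at least one low-confidence column separates its two occurrences, so this collapse rule would need refinement for tightly packed text.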

An equivalent model

input layer
linear hidden layer (feature layer)
threshold layers
softmax layer

Annotation: OCR without zero penalty == No biases for the first layer (avoids the 2*binary - 1 effect)
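
The layer stack can be read as an ordinary forward pass. The sketch below is an interpretation of the slide: the weight layout, the thresholds, and the function name are all assumptions. The hidden layer has no bias term, in line with the annotation.

```python
import math


def forward(pixels, feature_rows, thresholds, class_rows):
    # Linear hidden layer without biases: one dot product per feature.
    hidden = [sum(w * p for w, p in zip(row, pixels)) for row in feature_rows]
    # Threshold layer: a feature fires only when its sum reaches its threshold.
    fired = [1.0 if h >= t else 0.0 for h, t in zip(hidden, thresholds)]
    # Class scores, then softmax to turn them into probabilities.
    logits = [sum(w * f for w, f in zip(row, fired)) for row in class_rows]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Setting a feature's threshold to its number of points makes the unit fire only when every point lands on ink, which corresponds to the feature matching in the training description.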

Hacking the OCR:

To negate the effect of the biases, for each image we add random noise in the white areas

This will greatly improve recognition in a noisy image
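
Adding noise only to the white areas can be sketched like this, assuming binary images with 1 = ink and 0 = white. The noise rate and the function name are illustrative; the slides do not state the values used in the experiment.

```python
import random


def add_white_noise(binary, rate, rng):
    """Flip background pixels (0) to ink (1) with probability `rate`; ink is untouched."""
    return [[1 if v or rng.random() < rate else 0 for v in row] for row in binary]
```

Because existing ink is never removed, every feature that matched the clean image still matches the noisy copy, while features that rely on accidental extra ink become less reliable during training.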

A more powerful model

input layer
Hacked OCR layer
Score map
output layer

The demo source is hosted at
https://github.com/theshark08/howtobreakacaptcha01

Attacks Against Captcha Systems - DefCamp 2012

1. Attacking CAPTCHAs explained Ioan – Carol Plangu

2. What's a CAPTCHA Completely Automated Public Turing test to tell Computers and Humans Apart

4. Three attack methods • Implementation attack • Automated recognition • Manual labor

5. The implementation attack Scenario 1 the image session id can be reused

6. The implementation attack Scenario 1 the image session id can be reused id Restricted Captcha page form

7. The implementation attack Scenario 2 the number of captcha tests is limited

8. The implementation attack Scenario 2 the number of captcha tests is limited we just need to solve them all and store them in a hash table

9. The implementation attack Scenario 3 hash of solution sent to client

10. The implementation attack Scenario 3 hash of solution sent to client rainbow tables :)

11. Manual labor There are two options:

12. Pay a bunch of monkeys

13. Or not... XXX Complete this captcha form to continue

14. Automated recognition We're going to actually reproduce a human response for the given question

15. Can you understand my voice?

16. The sound sample is usually generated

17. It's hard to add noise to the generated speech without making it hard for the human

18. But can you read?

19. Sort of.....

20. The most common approach • Greedy optimization – reverse engineer everything • Character segmentation • OCR

21. Possible security measures

22. Possible security measures • Funky background image

23. Possible security measures • Funky background image − usually can be removed with basic preprocessing

24. Possible security measures • Funky background image − usually can be removed with basic preprocessing • Text distortions

25. Possible security measures • Funky background image − usually can be removed with basic preprocessing • Text distortions − modern OCR techniques can beat them

26. Possible security measures • Funky background image − usually can be removed with basic preprocessing • Text distortions − modern OCR techniques can beat them • Anti segmentation measures

27. Beating segmentation

28. Beating segmentation • If a character's signature can be extracted from the vertical signature alone, character segmentation becomes trivial. A Low-cost Attack on a Microsoft CAPTCHA - Jeff Yan, Ahmad Salah El Ahmad, School of Computing Science, Newcastle University, UK

29. Beating segmentation We can otherwise ignore it!

30. Beating segmentation We can otherwise ignore it! The following slides are about an experiment about this approach

31. A Monte-Carlo experiment • Note: for testing performance, the variance of the characters has been kept to a minimum f(x) → y x in binary( 0 - 2^3000 ) y in 10^6

32. Training: − Select one character image at random − Select N black spots − Sort the points for uniqueness − Subtract the first point from all others for position independence − Assign it a 'weight' for each character using the following formula: matched characters count / sample size − Assign it a 'score' (indicates classification quality) selected digit weight / (1 + other digit weights)

33. Recognition: − Make a score map for all points − Select the most appropriate character for each column − Process the resulting string into a 6 digit string

34.

35.

36. An equivalent model input layer linear hidden layer (feature layer) threshold layers softmax layer

37. An equivalent model input layer OCR linear hidden layer (feature layer) without zero penalty == threshold layers No biases for the first layer (avoids the 2*binary - 1 effect) softmax layer

38. Hacking the OCR: To negate the effect of the biases, for each image we add random noise in the white areas. This will greatly improve recognition in a noisy image

39. A more powerful model input layer Hacked OCR layer Score map output layer

40. Questions?

41. The demo source is hosted at https://github.com/theshark08/howtobreakacaptcha01
