Attacks Against Captcha Systems - DefCamp 2012
 

    Presentation Transcript

    • Attacking CAPTCHAs explained Ioan – Carol Plangu
    • What's a CAPTCHA? Completely Automated Public Turing test to tell Computers and Humans Apart
    • Three attack methods: implementation attack, automated recognition, manual labor
    • The implementation attack, scenario 1: the image session id can be reused (diagram labels: id, restricted page, captcha form)
    • The implementation attack, scenario 2: the number of captcha tests is limited, so we just need to solve them all and store them in a hash table
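A minimal sketch of the hash-table idea in scenario 2, assuming the server hands out byte-identical image files from a fixed pool. The class name `SolvedCaptchaTable` and the choice of SHA-256 as the lookup key are illustrative assumptions, not taken from the talk.

```python
import hashlib


def image_key(image_bytes: bytes) -> str:
    """Fingerprint an image by its raw bytes (assumes identical files are re-served)."""
    return hashlib.sha256(image_bytes).hexdigest()


class SolvedCaptchaTable:
    """Hypothetical store that maps an image fingerprint to its known solution."""

    def __init__(self):
        self._answers = {}

    def remember(self, image_bytes: bytes, solution: str) -> None:
        # Called once per image, after it has been solved by any means.
        self._answers[image_key(image_bytes)] = solution

    def lookup(self, image_bytes: bytes):
        # Returns the stored solution, or None if this image has not been seen yet.
        return self._answers.get(image_key(image_bytes))
```

If the server re-encodes images on every request, an exact byte hash no longer matches and a perceptual fingerprint would be needed instead.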
    • The implementation attack, scenario 3: a hash of the solution is sent to the client, so rainbow tables apply :)
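A rough sketch of why scenario 3 is weak, under the assumption that the solution is a short digit string and the hash is unsalted. It uses a plain precomputed lookup table rather than a true rainbow table (which trades lookup time for memory), and SHA-256 stands in for whatever hash a real site uses; `build_lookup` and `recover` are hypothetical names.

```python
import hashlib
from itertools import product


def build_lookup(digits=6, alphabet="0123456789"):
    """Precompute hash -> solution for every possible solution string."""
    table = {}
    for combo in product(alphabet, repeat=digits):
        solution = "".join(combo)
        table[hashlib.sha256(solution.encode()).hexdigest()] = solution
    return table


def recover(leaked_hash, table):
    """Return the solution whose hash was leaked to the client, or None."""
    return table.get(leaked_hash)
```

With six digits there are only 10^6 candidates, so the whole table is small enough to build in advance.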
    • Manual labor: there are two options
    • Pay a bunch of monkeys
    • Or not... XXX: "Complete this captcha form to continue"
    • Automated recognition: we're going to actually reproduce a human response for the given question
    • Can you understand my voice?
    • The sound sample is usually generated
    • It's hard to add noise to the generated speech without making it hard for the human
    • But can you read?
    • Sort of.....
    • The most common approach: greedy optimization (reverse engineer everything), character segmentation, OCR
    • Possible security measures:
      − Funky background image: usually can be removed with basic preprocessing
      − Text distortions: modern OCR techniques can beat them
      − Anti-segmentation measures
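As an illustration of "basic preprocessing", here is a minimal sketch that assumes the text is noticeably darker than the background and that the image is already a grayscale grid of 0-255 values. The function name `binarize` and the default threshold are assumptions for the example only.

```python
def binarize(gray, threshold=128):
    """Map a grayscale grid (rows of 0-255 ints) to 1 = ink, 0 = background.

    Pixels darker than the threshold are treated as text; everything lighter,
    including a pale background pattern, is discarded.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]
```

A fixed threshold only works when the background is lighter than the text; colored or dark backgrounds would need a per-channel or adaptive variant.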
    • Beating segmentation: if a character signature can be extracted from only the vertical signature, character segmentation becomes trivial ("A Low-cost Attack on a Microsoft CAPTCHA", Jeff Yan and Ahmad Salah El Ahmad, School of Computing Science, Newcastle University, UK)
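A generic sketch of segmentation by vertical projection, assuming a binarized image (1 = ink) in which characters are separated by at least one empty column. This illustrates the general idea only and is not the full method of the cited paper; `column_ink` and `segment_columns` are hypothetical helper names.

```python
def column_ink(binary):
    """Count the ink pixels in every column (the vertical projection)."""
    return [sum(column) for column in zip(*binary)]


def segment_columns(binary, min_ink=1):
    """Return (start, end) column spans that contain ink; end is exclusive."""
    profile = column_ink(binary)
    spans = []
    start = None
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x  # a character begins here
        elif ink < min_ink and start is not None:
            spans.append((start, x))  # an empty column closes the character
            start = None
    if start is not None:
        spans.append((start, len(profile)))  # character touches the right edge
    return spans
```

Touching or overlapping characters produce a single wide span, which is exactly what anti-segmentation measures aim for.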
    • Beating segmentation: we can otherwise ignore it! The following slides describe an experiment with this approach
    • A Monte-Carlo experiment. Note: for testing performance, the variance of the characters has been kept to a minimum. f(x) → y, with x in binary (0 - 2^3000) and y in 10^6
    • Training:
      − Select one character image at random
      − Select N black spots
      − Sort the points for uniqueness
      − Subtract the first point from all others for position independence
      − Assign it a weight for each character using the following formula: matched characters count / sample size
      − Assign it a score (indicates classification quality): selected digit weight / (1 + other digit weights)
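The training steps above can be sketched as follows. This is one possible reading, not the demo's actual code: it assumes binary images stored as nested lists (1 = black), assumes every image has at least one black pixel, and interprets "matched characters count" as the number of samples of a character in which the point pattern fits at some position. All function names are hypothetical.

```python
import random


def black_points(img):
    """Return (x, y) coordinates of all black pixels."""
    return [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]


def make_feature(img, n, rng):
    """Pick up to n black points, sort them, and make them relative to the first."""
    points = black_points(img)
    chosen = sorted(rng.sample(points, min(n, len(points))))
    x0, y0 = chosen[0]
    return tuple((x - x0, y - y0) for x, y in chosen)


def matches(feature, img):
    """True if the point pattern fits somewhere in the image (all points black)."""
    black = set(black_points(img))
    return any(all((ax + dx, ay + dy) in black for dx, dy in feature)
               for ax, ay in black)


def train_one_feature(samples, n, rng):
    """samples maps a character label to a list of binary images of it."""
    label = rng.choice(sorted(samples))
    feature = make_feature(rng.choice(samples[label]), n, rng)
    # Weight per character: matched samples / sample size.
    weights = {c: sum(matches(feature, im) for im in imgs) / len(imgs)
               for c, imgs in samples.items()}
    others = sum(w for c, w in weights.items() if c != label)
    score = weights[label] / (1 + others)  # high when only the chosen label matches
    return feature, label, weights, score
```

Repeating `train_one_feature` many times and keeping the highest-scoring features gives the Monte-Carlo flavour of the experiment.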
    • Recognition:
      − Make a score map for all points
      − Select the most appropriate character for each column
      − Process the resulting string into a 6-digit string
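The slides do not say how the per-column string is reduced to six digits, so the following is only a guess at one workable approach: pick the best-scoring character per column, then keep the longest runs of identical characters in left-to-right order. The score-map layout (label to list of per-column scores) and both function names are assumptions.

```python
from itertools import groupby


def best_per_column(score_map, min_score=0.0, blank="-"):
    """score_map maps each label to a list of scores, one score per image column."""
    width = len(next(iter(score_map.values())))
    out = []
    for x in range(width):
        label, score = max(((c, s[x]) for c, s in score_map.items()),
                           key=lambda pair: pair[1])
        out.append(label if score > min_score else blank)
    return "".join(out)


def collapse(column_string, length=6, blank="-"):
    """Keep the `length` longest runs of equal characters, in reading order."""
    runs = [(ch, len(list(group)), index)
            for index, (ch, group) in enumerate(groupby(column_string))
            if ch != blank]
    longest = sorted(runs, key=lambda run: -run[1])[:length]
    return "".join(ch for ch, _, _ in sorted(longest, key=lambda run: run[2]))
```

For example, a column string such as `-11-77-` collapses to `17`.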
    • An equivalent model: input layer; OCR linear hidden layer (feature layer) without zero penalty == threshold layers; no biases for the first layer (avoids the 2*binary - 1 effect); softmax layer
    • Hacking the OCR: to negate the effect of the biases, for each image we add random noise in the white areas. This will greatly improve the recognition in a noisy image
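A small sketch of the noise trick, assuming binary images as nested lists (1 = black, 0 = white). The function name and the probability parameter `p` are illustrative; the talk does not state how much noise was used.

```python
import random


def add_white_noise(binary, p, rng):
    """Turn each white pixel black with probability p; black pixels stay black."""
    return [[1 if pixel or rng.random() < p else 0 for pixel in row]
            for row in binary]
```

Applied to every training image, this keeps the classifier from relying on white areas being perfectly clean.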
    • A more powerful model: input layer, hacked OCR layer, score map, output layer
    • Questions?
    • The demo source is hosted at https://github.com/theshark08/howtobreakacaptcha01