25.
Possible security measures
Funky background image
− usually can be removed with basic preprocessing
Text distortions
− modern OCR techniques can beat it
Anti segmentation measures
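As an illustration of the "basic preprocessing" mentioned above, here is a minimal sketch of background removal by fixed thresholding. It assumes dark text on a lighter background and a grayscale image given as rows of 0-255 integers; the function name `binarize` and the threshold value 128 are hypothetical choices, not taken from the slides.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image to 1 = ink, 0 = background.

    Assumption: text pixels are darker than the background, so anything
    below the (hypothetical) threshold counts as ink.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# A tiny 3x4 "image": light background with a few dark text pixels.
gray = [
    [200, 190, 30, 210],
    [180, 20, 25, 170],
    [220, 40, 160, 230],
]
binary = binarize(gray)
```

A real background image may need an adaptive threshold or colour filtering instead; the fixed cut-off only shows the idea.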
27.
Beating segmentation
If a character's signature can be extracted from
the vertical signature alone, character
segmentation becomes trivial
A Low-cost Attack on a Microsoft CAPTCHA - Jeff Yan, Ahmad Salah El Ahmad
School of Computing Science, Newcastle University, UK
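One common reading of a "vertical signature" is the column-wise count of ink pixels (a vertical projection). The sketch below segments a binary image at empty columns under that assumption; it is not necessarily the exact procedure of the cited paper, and the helper names are hypothetical.

```python
def column_profile(binary):
    """Number of ink pixels (1s) in each column of a binary image."""
    return [sum(col) for col in zip(*binary)]


def segment_columns(binary):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    profile = column_profile(binary)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x                      # a run of inked columns begins
        elif not count and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:                  # run reaching the right edge
        segments.append((start, len(profile)))
    return segments


image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
]
segments = segment_columns(image)
```

This only works when characters are separated by at least one empty column, which is exactly what anti-segmentation measures try to prevent.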
29.
Beating segmentation
We can otherwise ignore it!
The following slides describe an experiment
with this approach
30.
A Monte-Carlo experiment
Note: for testing performance, the variance of
the characters has been kept to a minimum
f(x) → y
x: a binary image, i.e. a value in 0 .. 2^3000
y: one of 10^6 values (a 6-digit string)
31.
Training:
− Select one character image at random
− Select N black spots
− Sort the points for uniqueness
− Subtract the first point from all others for position independence
− Assign it a 'weight' for each character using the following formula:
  matched characters count / sample size
− Assign it a 'score' (indicates classification quality):
  selected digit weight / (1 + other digit weights)
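The training steps above can be sketched as follows. This is a minimal interpretation, not the author's code: it assumes binary images as lists of rows (1 = black), treats a feature as "matching" an image if its point pattern fits somewhere in that image, and reads "other digit weights" as the sum of the weights of all other characters. All function names are hypothetical.

```python
import random


def black_pixels(img):
    """All (row, col) positions of black (1) pixels."""
    return [(y, x) for y, row in enumerate(img) for x, v in enumerate(row) if v]


def make_feature(img, n, rng):
    """Pick n black spots, sort them, and make them relative to the first."""
    pts = sorted(rng.sample(black_pixels(img), n))
    y0, x0 = pts[0]
    return tuple((y - y0, x - x0) for y, x in pts)


def matches(img, feature):
    """True if the point pattern fits at some black pixel of img (assumption)."""
    h, w = len(img), len(img[0])
    return any(
        all(0 <= y0 + dy < h and 0 <= x0 + dx < w and img[y0 + dy][x0 + dx]
            for dy, dx in feature)
        for y0, x0 in black_pixels(img)
    )


def weights(feature, samples):
    """Per character: matched characters count / sample size."""
    return {label: sum(matches(i, feature) for i in imgs) / len(imgs)
            for label, imgs in samples.items()}


def score(w, label):
    """selected digit weight / (1 + other digit weights), others summed."""
    others = sum(v for k, v in w.items() if k != label)
    return w[label] / (1 + others)


def train_feature(samples, n, rng):
    """One training round: random character image -> feature, weights, score."""
    label = rng.choice(sorted(samples))
    img = rng.choice(samples[label])
    feature = make_feature(img, n, rng)
    w = weights(feature, samples)
    return feature, label, w, score(w, label)


# Toy 3x3 glyphs standing in for real character samples.
ONE = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
SEVEN = [[1, 1, 1], [0, 0, 1], [0, 1, 0]]
samples = {"1": [ONE], "7": [SEVEN]}

feature, label, w, s = train_feature(samples, 3, random.Random(0))
```

Repeating `train_feature` many times and keeping the high-scoring features gives the Monte-Carlo flavour of the experiment; how many rounds and which cut-off to use are not stated in the slides.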
32.
Recognition:
− Make a score map for all points
− Select the most appropriate character for each column
− Process the resulting string into a 6-digit string
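A minimal sketch of the recognition steps, under stated assumptions: features are given as (offsets, label, score) triples, a matching feature adds its score to the column of its anchor point, and the final post-processing simply merges runs of identical column labels and drops empty columns. The slides do not specify that last step, so `collapse` is a hypothetical stand-in.

```python
from itertools import groupby


def score_map(img, features):
    """Accumulate feature scores per label and column.

    features: list of (offsets, label, score); offsets are (dy, dx) pairs
    relative to the anchor point.
    """
    h, w = len(img), len(img[0])
    totals = {}
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0]:
                continue  # only black points can anchor a feature
            for offsets, label, s in features:
                if all(0 <= y0 + dy < h and 0 <= x0 + dx < w
                       and img[y0 + dy][x0 + dx] for dy, dx in offsets):
                    totals.setdefault(label, [0.0] * w)[x0] += s
    return totals


def best_per_column(totals, width):
    """Pick the highest-scoring label per column, '-' where nothing scored."""
    out = []
    for x in range(width):
        label, best = "-", 0.0
        for cand, col in totals.items():
            if col[x] > best:
                label, best = cand, col[x]
        out.append(label)
    return "".join(out)


def collapse(column_labels):
    """Merge runs of equal labels and drop empty columns (assumed step)."""
    return "".join(k for k, _ in groupby(column_labels) if k != "-")


img = [
    [0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 0],
]
features = [
    (((0, 0), (1, 0), (2, 0)), "1", 1.0),  # vertical stroke
    (((0, 0), (0, 1), (0, 2)), "7", 1.0),  # horizontal bar
]
columns = best_per_column(score_map(img, features), len(img[0]))
text = collapse(columns)
```

Note that this naive collapse would merge two identical digits whose columns touch; a real implementation would need spacing information to reach exactly six digits.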
33.
An equivalent model
input layer
linear hidden layer
(feature layer)
threshold layers
softmax layer
34.
An equivalent model
OCR without zero penalty == the network below
input layer
linear hidden layer (feature layer)
threshold layers
softmax layer
No biases for the first layer (avoids the 2*binary - 1 effect)
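One way to read the equivalent model is as a small feed-forward pass, sketched below. The assumptions are mine: each hidden unit sums the binary pixels of one feature with unit weights and no bias, a threshold fires only when all of the feature's pixels are black, and the fired features' class weights are fed to a softmax. White pixels (0) contribute nothing, which is one interpretation of "without zero penalty".

```python
import math


def forward(x, features, n_classes):
    """x: flat list of 0/1 pixels.

    features: list of (pixel_indices, class_weights), a hypothetical encoding.
    """
    logits = [0.0] * n_classes
    for pixel_indices, class_weights in features:
        # Linear hidden unit: unit weights, no bias, so zeros add nothing.
        activation = sum(x[i] for i in pixel_indices)
        # Threshold: fire only if every pixel of the feature is black.
        fired = 1 if activation >= len(pixel_indices) else 0
        for c, w in enumerate(class_weights):
            logits[c] += fired * w
    # Numerically stable softmax over the class logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


x = [1, 1, 0, 1]
features = [
    ((0, 1), [2.0, 0.0]),  # fires: both pixels are black
    ((2, 3), [0.0, 2.0]),  # does not fire: pixel 2 is white
]
probs = forward(x, features, n_classes=2)
```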
35.
Hacking the OCR:
To negate the effect of the biases, for each image we
add random noise in the white areas
This greatly improves recognition on noisy
images
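Adding noise to the white areas could look like the sketch below. It assumes binary images (1 = black, 0 = white) and flips each white pixel to black with a fixed probability; the name `add_white_noise` and the `rate` parameter are hypothetical, and the slides do not say how much noise was used.

```python
import random


def add_white_noise(binary, rate, rng):
    """Turn white pixels black with probability `rate`; black pixels are kept."""
    return [[1 if (px or rng.random() < rate) else 0 for px in row]
            for row in binary]


img = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
noisy = add_white_noise(img, rate=0.2, rng=random.Random(42))
```

Because the original strokes are never erased, features learned on the clean glyph still match the noisy copy, while features that only match by accident in the noise get lower weights.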
36.
A more powerful model
input layer
Hacked OCR layer
Score map
output layer