AGENDA
• DEFINITION
• BACKGROUND
• TYPES
• APPLICATIONS
• CONSTRUCTING CAPTCHA
• BREAKING CAPTCHA
• ISSUES WITH CAPTCHA
• CONCLUSION
INTRODUCTION
CAPTCHA – Completely Automated
Public Turing test to tell Computers &
Humans Apart.
• Invented at CMU by Luis von Ahn, Manuel Blum, et al.
• It is a challenge-response test, run by a program, that separates humans from computer programs.
Generic CAPTCHAs distort letters &
numbers -
• Distorted characters are presented to the user.
• The user has to recognize the distorted characters.
• If the typed characters are correct, the user is inferred to be human & allowed access.
Contd…
• Humans can read the distorted & noisy text.
• Current OCR (Optical Character Recognition) programs cannot read it reliably.
BACKGROUND
• Why was CAPTCHA needed?
  - Sabotage of online polls.
  - Spam e-mails.
  - Abuse of free online accounts.
  - Tampering with rankings on recommendation systems (like eBay, Amazon).
What is the TURING TEST?
• Proposed by Alan Turing.
• Tests a machine’s level of intelligence.
• A human judge asks questions to two participants, one a machine & the other a human.
• The judge doesn’t know which is which.
• If, after hearing the answers, the judge cannot tell which one is the machine, the machine passes the test.
Contd…
• CAPTCHA employs a Reverse Turing Test.
• Judge = CAPTCHA program, participant = user.
• If the user passes the CAPTCHA, the user is taken to be human; otherwise, a machine.
1. Text Based-
• Simple, everyday questions:
  - What is the sum of three & thirty-five?
  - If today is Saturday, what is the day after tomorrow?
  - Which of mango, table & water is a fruit?
• Very effective, but needs a large question bank.
• Cognitively challenged users find it hard.
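A text-based question of the first kind above can be sketched in a few lines of Python. This is only an illustration: the helper names (`make_sum_question`, `check_answer`) and the small number table are assumptions, not part of any real CAPTCHA library, and a real deployment would need the large question bank mentioned above.

```python
import random

# Hypothetical word table; a real question bank would be far larger.
_NUMBER_WORDS = {3: "three", 5: "five", 12: "twelve", 35: "thirty-five"}

def make_sum_question(rng):
    """Return (question_text, expected_answer) for a spelled-out addition."""
    a, b = rng.sample(sorted(_NUMBER_WORDS), 2)
    question = f"What is the sum of {_NUMBER_WORDS[a]} & {_NUMBER_WORDS[b]}?"
    return question, str(a + b)

def check_answer(expected, submitted):
    """Compare answers, ignoring surrounding whitespace."""
    return submitted.strip() == expected

question, expected = make_sum_question(random.Random())
print(question)
```

Spelling the numbers out as words, as the slide's example does, means a bot cannot simply pull digits from the question text.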
2. Gimpy-
• Designed by Yahoo & CMU (Carnegie Mellon University).
• Picks 10 random words from a dictionary, distorts them & fills the image with noise.
• The user has to recognize at least 3 of the words.
• If the user is correct, access is granted.
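The acceptance rule described above (at least 3 of the 10 displayed words) can be sketched as a set intersection. The function name and the case-insensitive matching are assumptions made for illustration; the slides do not say how Gimpy compares answers.

```python
def gimpy_accepts(shown_words, typed_words, required=3):
    """Admit the user if at least `required` distinct shown words were typed."""
    shown = {w.lower() for w in shown_words}
    typed = {w.strip().lower() for w in typed_words}
    # Using sets means typing the same word three times counts only once.
    return len(shown & typed) >= required
```

For example, with ten words on screen, typing three of them in any order passes, while two correct words plus one wrong guess does not.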
3. EZ-Gimpy-
• A modified version of Gimpy.
• Yahoo used this version in Messenger.
• Shows only 1 random string of characters.
• Not a dictionary word, so not prone to dictionary attack.
• Not a good implementation; already broken by OCR (Optical Character Recognition) programs.
4. MSN Passport service CAPTCHAs-
• Provided for Microsoft’s MSN services.
• Uses 8 characters.
• Warping is used to distort them.
• Very strong implementation, hasn’t been broken.
• It is segmentation-resistant.
5. Graphic based CAPTCHAs-
• 1. BONGO-
  - Named after M. M. Bongard, a pattern recognition expert.
  - The user has to solve a pattern recognition problem.
  - The user has to identify the characteristic that distinguishes two sets of figures.
  - The user then says which set a given figure belongs to.
Contd…
• 2. PIX-
  - Uses a large database of labelled images.
  - It shows a set of images; the user has to recognize the feature they have in common.
  - E.g. pick the common characteristic among the following 4 pictures = “aeroplane”.
6. Audio CAPTCHAs-
• Consists of a downloadable audio clip.
• The user listens & enters the spoken word.
• Helps visually disabled users.
• Below is Google’s audio-enabled CAPTCHA-
7. Applications-
• Protect online polls.
• Prevent web registration abuse; protect passwords from brute-force attacks.
• Prevent comment spam & spam e-mails.
• E-ticketing: prevent scalping.
Contd…
• Verify digitized books: “reCAPTCHA”
  - Used in the Google Books project.
  - Two words are shown; the program knows the first word.
  - If the user enters the first word correctly, it assumes that the second, unknown word has also been entered correctly.
  - The second word becomes “known”.
Constructing CAPTCHAs
• Things to keep in mind:
  - Don’t store the CAPTCHA solution in the web page’s metadata.
  - A CAPTCHA is no good if it doesn’t distort.
  - A large database of different CAPTCHA questions is needed.
  - Avoid repeating questions.
CAPTCHA logic
• Generate the question.
• Persist the correct answer.
• Present the question to the user.
• Evaluate the answer; if it is incorrect, start again & generate a different CAPTCHA.
• If it is correct, allow the user access.
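The generate / persist / present / evaluate loop above can be sketched as a small server-side class. Everything here is an assumption made for illustration: `CaptchaSession` is a hypothetical name, the question bank is passed in as plain (question, answer) pairs, and answers are compared case-insensitively. The answer stays on the server, in line with the advice not to store the solution in the page.

```python
import secrets

class CaptchaSession:
    """Server-side state for one visitor (hypothetical class, not from the slides)."""

    def __init__(self, question_bank):
        self._bank = list(question_bank)   # (question, answer) pairs
        self._expected = None
        self._last_index = None

    def new_challenge(self):
        """Generate a question different from the previous one and persist its answer."""
        indices = list(range(len(self._bank)))
        # Avoid repeating the previous question when the bank allows it.
        choices = [i for i in indices if i != self._last_index] or indices
        index = secrets.choice(choices)
        self._last_index = index
        question, answer = self._bank[index]
        self._expected = answer            # kept on the server, never sent to the page
        return question

    def evaluate(self, submitted):
        """Return True to allow access; after a failure, call new_challenge() again."""
        expected, self._expected = self._expected, None   # each answer is single-use
        return expected is not None and secrets.compare_digest(
            submitted.strip().lower().encode(), expected.lower().encode())
```

Clearing the stored answer on every evaluation means a failed attempt cannot be retried against the same question, which matches the "generate a different CAPTCHA" step.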
Breaking CAPTCHAs
• Cracking CAPTCHAs through programs:
  - Convert the CAPTCHA to grayscale.
  - Detect patterns in the image that correspond to characters.
• Greg Mori & Jitendra Malik have broken text CAPTCHAs.
Ex:- EZ-Gimpy
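The grayscale conversion step listed above can be sketched in pure Python. This is a generic preprocessing illustration, not Mori & Malik's method: the image is assumed to be a list of rows of (R, G, B) tuples, the weights are the common ITU-R BT.601 luminance weights, and the threshold of 128 is an arbitrary choice.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (R, G, B) pixels to 0-255 luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_rows]

def binarize(gray_rows, threshold=128):
    """Mark dark pixels (likely ink) as 1 and light pixels (background) as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]
```

Reducing the image to ink-versus-background pixels in this way gives the later pattern-detection step a simpler input to work on.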
Contd…
• To break this CAPTCHA:
  - Segmentation.
  - Locate possible letters in the image.
  - Construct a graph of consistent letters.
  - Find the possible words from the graph & use scores to rank them:
    Roll = 11.94
    Profit = 9.42 (better match)
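The ranking step can be sketched as a sort over candidate scores. Since the slide marks the lower number (9.42) as the better match, this sketch treats the score as a matching cost where lower is better; that reading, and the function name, are assumptions.

```python
def rank_candidates(scores):
    """Sort candidate words from best to worst, treating a lower score as a better match."""
    return sorted(scores.items(), key=lambda item: item[1])

ranked = rank_candidates({"roll": 11.94, "profit": 9.42})
best_word, best_score = ranked[0]
print(best_word, best_score)   # prints: profit 9.42
```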
Contd…
• Social engineering to break CAPTCHAs:
  - A spammer encounters a CAPTCHA.
  - That CAPTCHA is copied to another site.
  - Humans are baited, e.g. with free MP3s, free wallpapers, etc.
  - To get those MP3s or wallpapers, users are told to solve the copied CAPTCHA.
  - The solution is then routed back to the spammer.
Solution: fix a time-to-live period for each question.
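A time-to-live check can be sketched as follows. The class name, the 60-second default and the injectable clock are all assumptions for illustration. The idea is that relaying a CAPTCHA to a baited human takes time, so an answer that arrives after the deadline is rejected even if it is correct.

```python
import time

class ExpiringChallenge:
    """A CAPTCHA answer that is only valid for `ttl_seconds` after it is issued."""

    def __init__(self, answer, ttl_seconds=60.0, clock=time.monotonic):
        self._answer = answer
        self._clock = clock                      # injectable so tests can fake time
        self._expires_at = clock() + ttl_seconds

    def is_valid(self, submitted):
        """Reject late answers even when they are correct."""
        if self._clock() > self._expires_at:
            return False
        return submitted.strip() == self._answer
```

A monotonic clock is used so that changes to the system's wall-clock time cannot extend or shorten the window.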
Issues with CAPTCHAs
• Usability issue:
  - W3C guidelines call for the web to be accessible to all people.
  - Some CAPTCHAs are inaccessible to visually impaired or cognitively challenged people.
• Compatibility issue:
  - JavaScript may need to be enabled in the browser.
  - Some may need the Adobe Flash plugin.
SUMMARY
• CAPTCHAs are an effective way to counter bots & reduce spam.
• They help advance AI knowledge.
• Some issues with current implementations represent challenges for future improvements.