563.10.3 captcha


  1. 563.10.3 CAPTCHA. Presented by: Sari Louis. SPAM Group: Marc Gagnon, Sari Louis, Steve White. University of Illinois, Spring 2006
  2. Agenda <ul><li>Definition </li></ul><ul><li>Background </li></ul><ul><li>Applications </li></ul><ul><li>Types of CAPTCHAs </li></ul><ul><li>Breaking CAPTCHAs </li></ul><ul><li>Proposed Approach </li></ul><ul><li>Conclusion </li></ul>
  3. Definition <ul><li>CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart </li></ul><ul><li>Also known as a Reverse Turing Test or Human Interaction Proof (HIP) </li></ul><ul><li>The challenge: develop a software program that can create and grade challenges that most humans can pass but computers cannot </li></ul>
  4. Background <ul><li>First used by AltaVista in 1997 </li></ul><ul><ul><li>Reduced spam add-URL submissions by over 95% </li></ul></ul><ul><li>CMU/Yahoo! </li></ul><ul><ul><li>Automated the creation and grading of challenges </li></ul></ul><ul><li>PARC </li></ul><ul><ul><li>Relies on document image degradation to prevent successful OCR </li></ul></ul><ul><ul><li>Conducted user-focused studies to assess the effectiveness of CAPTCHAs </li></ul></ul>
  5. Background <ul><li>CAPTCHAs are based on open AI problems </li></ul><ul><li>Breaking CAPTCHAs helps advance AI by solving these open problems </li></ul><ul><li>Improving CAPTCHAs helps tell computers and humans apart </li></ul><ul><li>Win-win situation </li></ul>
  6. Background - Papers <ul><li>"Pessimal Print: A Reverse Turing Test", Allison L. Coates, Henry S. Baird, Richard J. Fateman </li></ul><ul><li>"Telling Humans and Computers Apart Automatically", Luis von Ahn, Manuel Blum, and John Langford </li></ul><ul><li>"CAPTCHA: Using Hard AI Problems for Security", Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford </li></ul><ul><li>"Using Machine Learning to Break Visual Human Interaction Proofs (HIPs)", Kumar Chellapilla, Patrice Y. Simard </li></ul>
  7. Applications <ul><li>Free email services </li></ul><ul><li>Online polls </li></ul><ul><li>Dictionary attacks </li></ul><ul><li>Newsgroups, blogs, etc. </li></ul><ul><li>Spam </li></ul>
  8. Types of CAPTCHAs <ul><li>Text based </li></ul><ul><ul><li>Gimpy, EZ-Gimpy </li></ul></ul><ul><ul><li>Gimpy-r, Google CAPTCHA </li></ul></ul><ul><ul><li>Simard’s HIP (MSN) </li></ul></ul><ul><li>Graphic based </li></ul><ul><ul><li>Bongo </li></ul></ul><ul><ul><li>PIX </li></ul></ul><ul><li>Audio based </li></ul>
  9. Text Based CAPTCHAs <ul><li>Gimpy, EZ-Gimpy </li></ul><ul><ul><li>Pick a word or words from a small dictionary </li></ul></ul><ul><ul><li>Distort them and add noise and background </li></ul></ul><ul><li>Gimpy-r, Google’s CAPTCHA </li></ul><ul><ul><li>Pick random letters </li></ul></ul><ul><ul><li>Distort them and add noise and background </li></ul></ul><ul><li>Simard’s HIP </li></ul><ul><ul><li>Pick random letters and numbers </li></ul></ul><ul><ul><li>Distort them and add arcs </li></ul></ul>
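The "pick random letters" step and the grading of the user's reply can be sketched as follows. This is a minimal illustration only: the alphabet, the challenge length, and the names `make_challenge` and `grade` are assumptions not taken from the slides, and the distortion, noise, and background steps are left out because they require an image library.

```python
import secrets
import string

# Assumption: uppercase letters only; the slides do not name an alphabet.
ALPHABET = string.ascii_uppercase


def make_challenge(length=6):
    """Pick `length` random letters, in the spirit of the Gimpy-r style above."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade(expected, response):
    """Accept the response if it matches, ignoring case and surrounding spaces."""
    return response.strip().upper() == expected.upper()
```

In a real deployment the challenge text would be rendered to a distorted image before being shown, and only the server would keep the expected string.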
  10. Text Based CAPTCHAs
  11. Graphic Based CAPTCHAs <ul><li>Bongo </li></ul><ul><ul><li>Display two series of blocks </li></ul></ul><ul><ul><li>User must find the characteristic that sets the two series apart </li></ul></ul><ul><ul><li>User is asked to determine which series each of four single blocks belongs to </li></ul></ul><ul><ul><li>Example difference: thick vs. thin lines </li></ul></ul>
  12. Graphic Based CAPTCHAs <ul><li>PIX </li></ul><ul><ul><li>Create a large database of labeled images </li></ul></ul><ul><ul><li>Pick a concrete object </li></ul></ul><ul><ul><li>Pick four images of the object from the image database </li></ul></ul><ul><ul><li>Distort the images </li></ul></ul><ul><ul><li>Ask the user to pick the object from a list of words </li></ul></ul>
  13. Graphic Based CAPTCHAs (example image labels: Dog, Pool)
  14. Audio Based CAPTCHAs <ul><li>Pick a word or a sequence of numbers at random </li></ul><ul><li>Render it into an audio clip using text-to-speech (TTS) software </li></ul><ul><li>Distort the audio clip </li></ul><ul><li>Ask the user to identify and type the word or numbers </li></ul>
  15. Breaking CAPTCHAs <ul><li>Most text based CAPTCHAs have been broken by software </li></ul><ul><ul><li>OCR </li></ul></ul><ul><ul><li>Segmentation </li></ul></ul><ul><li>Other CAPTCHAs were broken by relaying the tests to unsuspecting users to solve </li></ul>
  16. Proposed Approach <ul><li>Very similar to PIX </li></ul><ul><li>Pick a concrete object </li></ul><ul><li>Get 6 images at random from images.google.com that match the object </li></ul><ul><li>Distort the images </li></ul><ul><li>Build a list of 100 words: 90 from a full dictionary, 10 from the objects dictionary </li></ul><ul><li>Prompt the user to pick the object from the list of words </li></ul>
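The 100-word answer list described above might be assembled roughly like this. The sketch assumes that the correct object is one of the 10 words drawn from the objects dictionary, which the slides imply but do not state; the function name `build_word_list` and its parameters are illustrative, not part of the original proposal.

```python
import random


def build_word_list(answer, object_words, full_dictionary, rng=random):
    """Return 100 shuffled words: 10 object words (including `answer`)
    and 90 filler words from the full dictionary."""
    object_set = set(object_words)
    # Nine decoy objects plus the answer give the 10 object words.
    decoys = rng.sample([w for w in object_words if w != answer], 9)
    # Fillers exclude every object word so exactly 10 object words appear.
    fillers = rng.sample(
        [w for w in full_dictionary if w not in object_set and w != answer], 90
    )
    words = decoys + [answer] + fillers
    rng.shuffle(words)
    return words
```

Keeping object words out of the filler pool is a design choice made here so that the 90/10 split holds exactly; a looser split would also satisfy the slide's description.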
  17. Proposed Approach - Technical <ul><li>Make an HTTP call to images.google.com and search for the object </li></ul><ul><li>Screen-scrape the first 2-3 result pages to get the list of images </li></ul><ul><li>Pick 6 images at random </li></ul><ul><li>Randomly distort both the images and their URLs before displaying them </li></ul><ul><li>Expire the CAPTCHA in 30-45 seconds </li></ul>
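The selection and expiry steps can be sketched without touching the network. The example below assumes the scraped image URLs are already available as a plain list, and it deliberately omits the HTTP call, the scraping, and the distortion; the `Challenge` class, the function names, and the 45-second constant (the upper end of the range on the slide) are illustrative assumptions.

```python
import random
import time
from dataclasses import dataclass

TTL_SECONDS = 45  # assumption: upper end of the 30-45 second range


@dataclass
class Challenge:
    answer: str
    image_urls: list
    issued_at: float


def new_challenge(answer, candidate_urls, now=None, rng=random):
    """Pick 6 distinct images at random from an already-scraped URL list."""
    issued = time.time() if now is None else now
    return Challenge(answer, rng.sample(candidate_urls, 6), issued)


def verify(challenge, response, now=None):
    """Accept only a correct answer submitted before the challenge expires."""
    current = time.time() if now is None else now
    if current - challenge.issued_at > TTL_SECONDS:
        return False
    return response.strip().lower() == challenge.answer.lower()
```

The `now` parameter exists only so the expiry logic can be exercised with fixed timestamps; in normal use both functions fall back to the system clock.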
  18. Proposed Approach - Benefits <ul><li>The database already exists and is public </li></ul><ul><li>The database is constantly being updated and maintained </li></ul><ul><li>Adding “concrete objects” to the dictionary is virtually instantaneous </li></ul><ul><li>Distortion prevents caching hacks </li></ul><ul><li>Quick expiration limits streaming hacks </li></ul>
  19. Proposed Approach - Drawbacks <ul><li>Not accessible to people with disabilities (as is the case with most CAPTCHAs) </li></ul><ul><li>Relies on Google’s infrastructure </li></ul><ul><li>Unlike CAPTCHAs using random letters and numbers, the number of challenge words is limited </li></ul>