
Published in: Education

  1. 1. CAPTCHA: What humans can do, but computers cannot.
  2. 2. CAPTCHA, the Acronym <ul><li>Completely </li></ul><ul><li>Automated </li></ul><ul><li>Public </li></ul><ul><li>Turing Test to Tell </li></ul><ul><li>Computers and </li></ul><ul><li>Humans </li></ul><ul><li>Apart </li></ul>
  3. 3. CAPTCHA: literal meaning <ul><li>Completely: wholly </li></ul><ul><li>Automated: generated and graded by machine </li></ul><ul><li>Public: the method is publicly known, so attackers can also study it </li></ul><ul><li>Turing Test to Tell: a test in the spirit of the one proposed by Alan Turing </li></ul><ul><li>Computers and </li></ul><ul><li>Humans </li></ul><ul><li>Apart </li></ul>
  4. 4. CAPTCHA Origins <ul><li>1997: Andrei Broder at AltaVista wanted to prevent bots from automatically submitting sites for indexing </li></ul><ul><li>He decided to add a test to the submission page </li></ul><ul><li>He inverted the OCR optimization advice given for Brother scanners, producing text that OCR handles poorly </li></ul><ul><li>2000: Luis von Ahn, Manuel Blum and John Langford at CMU coined the term CAPTCHA </li></ul>
  5. 5. CAPTCHA: Deciding Human or Bot? <ul><li>A puzzle or problem that is easy for humans to solve and very difficult for computers </li></ul><ul><li>If the puzzle is solved correctly, you are considered human and can continue </li></ul>
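The pass/fail decision described on this slide can be sketched as a small challenge-and-verify flow. This is a minimal illustration, not how any particular CAPTCHA service works: the in-memory `_pending` store and the function names `new_challenge` and `verify` are assumptions made for the example.

```python
import hmac
import secrets
import string

# Hypothetical in-memory store mapping challenge IDs to expected answers.
_pending = {}

def new_challenge(length=6):
    """Create a random answer string and return (challenge_id, answer)."""
    alphabet = string.ascii_uppercase + string.digits
    answer = "".join(secrets.choice(alphabet) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = answer
    # In a real page the answer would be shown only as a distorted image.
    return challenge_id, answer

def verify(challenge_id, response):
    """Return True only if the response matches; each challenge is single-use."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False
    given = response.strip().upper()
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), given.encode())
```

Removing the challenge on the first attempt means a bot cannot retry the same puzzle repeatedly.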
  6. 6. Two basic types <ul><li>Printed CAPTCHA </li></ul><ul><li>Handwritten CAPTCHA (H-CAPTCHA) </li></ul>
  7. 7. Printed CAPTCHA <ul><li>Printed CAPTCHA is difficult to break </li></ul><ul><li>Many algorithms are available to generate them </li></ul><ul><li>Humans cannot identify these very easily </li></ul><ul><li>The two major types are BaffleText and Pessimal Print </li></ul>
  8. 8. BaffleText image <ul><li>Developed by Monica Chew and Henry Baird </li></ul><ul><li>Uses pronounceable character strings that do not appear in an English dictionary, rendered with masking </li></ul>
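The idea of a pronounceable string that is not a dictionary word can be sketched as follows. This is a simplification for illustration only: it alternates consonants and vowels, which is an assumption and not the published BaffleText generator, and `DICTIONARY` is a tiny stand-in word list.

```python
import random

CONSONANTS = "bcdfghjklmnprstvw"
VOWELS = "aeiou"
# Tiny stand-in word list; a real system would load a full dictionary.
DICTIONARY = {"banana", "tomato", "potato", "civil", "radar"}

def pronounceable_word(syllables=3, rng=random):
    """Build a word from consonant-vowel syllables."""
    return "".join(
        rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(syllables)
    )

def baffle_word(syllables=3, rng=random):
    """Keep drawing candidates until one is not a dictionary word."""
    while True:
        word = pronounceable_word(syllables, rng)
        if word not in DICTIONARY:
            return word
```

Rejecting dictionary words matters because an attacker could otherwise guess a partly recognized word by looking it up.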
  9. 9. Pessimal Print image <ul><li>Developed by Allison Coates, Henry Baird, and Richard Fateman </li></ul><ul><li>Uses a degradation model that simulates the physical defects caused by printing and scanning printed text </li></ul>
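To give a feel for what a degradation step does, the sketch below flips random pixels in a small text bitmap. It is only a stand-in: the random pixel flip and the `degrade` function are assumptions for illustration, and the degradation model used by Pessimal Print is more elaborate than this.

```python
import random

def degrade(bitmap, flip_prob=0.05, rng=random):
    """Flip each pixel ('#' ink, '.' paper) with probability flip_prob."""
    degraded = []
    for row in bitmap:
        new_row = ""
        for pixel in row:
            if rng.random() < flip_prob:
                new_row += "." if pixel == "#" else "#"
            else:
                new_row += pixel
        degraded.append(new_row)
    return degraded
```

Raising `flip_prob` makes the image harder for both OCR and people, which is the trade-off every distortion scheme has to balance.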
  10. 10. Handwritten CAPTCHA <ul><li>Less frequently used, because humans can identify handwriting more easily than text images </li></ul><ul><li>Transformations are applied by adding lines, arcs, circles, etc. </li></ul>
  11. 11. Example showing H-CAPTCHA
  12. 12. Types of Printed CAPTCHA <ul><li>GIMPY </li></ul><ul><li>BONGO </li></ul><ul><li>PIX </li></ul><ul><li>KittenAuth </li></ul><ul><li>Face Recognition </li></ul><ul><li>Audio </li></ul><ul><li>Logic Puzzles </li></ul>
  13. 13. GIMPY <ul><li>Randomly chooses 7 words from a dictionary </li></ul><ul><li>Distorts the words using a variety of techniques </li></ul><ul><li>The user must correctly type 3 of the words to pass the test </li></ul><ul><li>In the real world, most applications only test for a single word (EZ-Gimpy) </li></ul>
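The selection and grading rules on this slide (show 7 words, accept 3 correct) can be sketched like this. The word list and the function names are hypothetical, and the image distortion step is left out.

```python
import random

# Hypothetical word list standing in for a real dictionary.
WORDS = ["apple", "bridge", "candle", "dragon", "engine", "forest",
         "garden", "hammer", "island", "jacket", "kettle", "ladder"]

def gimpy_challenge(rng=random, count=7):
    """Pick `count` distinct words to be distorted and displayed."""
    return rng.sample(WORDS, count)

def gimpy_passes(shown, typed, required=3):
    """Pass if at least `required` distinct shown words were typed correctly."""
    shown_set = {w.lower() for w in shown}
    correct = {t.strip().lower() for t in typed} & shown_set
    return len(correct) >= required
```

Counting distinct matches prevents a user from passing by typing the same word three times.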
  14. 14. GIMPY Examples: EZ-GIMPY, R-GIMPY
  15. 15. BONGO <ul><li>A visual recognition problem </li></ul><ul><li>Two sets of shapes are shown, separated by a distinguishing characteristic </li></ul><ul><li>The user must choose which set a given shape belongs to </li></ul>
  16. 16. PIX <ul><li>A database of labeled images of recognizable objects </li></ul><ul><li>Randomly chooses an object and displays N pictures of it </li></ul><ul><li>The user must correctly identify the object </li></ul><ul><li>Pictures are distorted </li></ul>
  17. 17. KittenAuth <ul><li>“The Cutest Human Test” </li></ul><ul><li>A 3x3 matrix of cute animals </li></ul><ul><li>Choose the 3 kittens </li></ul><ul><li>Strategy is to use animals that look similar to kittens </li></ul>
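A 3x3 grid with exactly 3 kittens can be assembled and graded as sketched below. The image file names and the functions `kitten_grid` and `check_selection` are assumptions for illustration, not KittenAuth's actual code.

```python
import random

def kitten_grid(kittens, lookalikes, rng=random, size=9, n_kittens=3):
    """Return (images, answer): shuffled image names and the kitten positions."""
    chosen = [(name, True) for name in rng.sample(kittens, n_kittens)]
    chosen += [(name, False) for name in rng.sample(lookalikes, size - n_kittens)]
    rng.shuffle(chosen)
    images = [name for name, _ in chosen]
    answer = {i for i, (_, is_kitten) in enumerate(chosen) if is_kitten}
    return images, answer

def check_selection(answer, clicked):
    """The user passes only by clicking exactly the kitten positions."""
    return set(clicked) == answer
```

Requiring an exact match means clicking every cell in the grid does not pass the test.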
  18. 18. Face Recognition CAPTCHA
  19. 19. Audio CAPTCHA <ul><li>Pick a word or a sequence of numbers at random </li></ul><ul><li>Render it into an audio clip using text-to-speech (TTS) software </li></ul><ul><li>Distort the audio clip </li></ul><ul><li>Ask the user to identify and type the word or numbers </li></ul>
  20. 20. Logic Puzzles <ul><li>Easy trivia questions </li></ul><ul><li>Example: Which of the following is a bird? Elephant, Tiger, or Robin? </li></ul><ul><li>Cons: </li></ul><ul><ul><li>Difficult to create a big enough database of these questions </li></ul></ul><ul><ul><li>Difficult for ESL users / international users </li></ul></ul>
  21. 21. Breaking CAPTCHA <ul><li>Most text-based CAPTCHAs have been broken by software </li></ul><ul><ul><li>OCR </li></ul></ul><ul><ul><li>Segmentation </li></ul></ul><ul><li>Other CAPTCHAs were broken by relaying the tests to unsuspecting users to solve </li></ul>
  22. 22. Uses of CAPTCHA <ul><li>Online polls </li></ul><ul><li>Free e-mail services </li></ul><ul><li>Search engine bots </li></ul><ul><li>Preventing worms and spam </li></ul><ul><li>Preventing dictionary attacks </li></ul><ul><li>etc. </li></ul>
  23. 23. Properties <ul><li>A CAPTCHA should be automatically generated and graded </li></ul><ul><li>The test can be taken quickly and easily by human users </li></ul><ul><li>The test will accept virtually all human users and reject software agents </li></ul><ul><li>The test will resist automatic attack for many years despite advances in technology and prior knowledge of the algorithms </li></ul>
  24. 24. Free Email Registration: Hotmail Registration, Yahoo! Registration
  25. 25. Final Thoughts <ul><li>CAPTCHAs are crucial to preventing bot attacks </li></ul><ul><li>Hopefully, they will become more user-friendly to people with disabilities (visual, mental) </li></ul><ul><li>CAPTCHAs are mainly implemented with AJAX and PHP </li></ul><ul><li>Various generation algorithms exist </li></ul><ul><li>Use of XML </li></ul>
  26. 26. Different CAPTCHAs
  27. 27. Thank You!
  28. 28. PHP <ul><li>PHP: originally known as Personal Home Page </li></ul><ul><li>The name now stands for PHP: Hypertext Preprocessor </li></ul><ul><li>It is a scripting language used to create dynamic web pages </li></ul><ul><li>With syntax drawn from C, Java, Perl, etc., PHP code is embedded within HTML pages for server-side execution </li></ul>
  29. 29. OCR (Optical Character Recognition): the machine recognition of printed characters. OCR systems can recognize many different OCR fonts, as well as typewriter and computer-printed characters. Advanced OCR systems can recognize hand printing. When a text document is scanned into the computer, it is turned into a bitmap, which is a picture of the text. OCR software analyzes the light and dark areas of the bitmap in order to identify each alphabetic letter and numeric digit. When it recognizes a character, it converts it into ASCII text. Hand printing is much more difficult to analyze than machine-printed characters. Old, worn and smudged documents are also difficult. Scanning documents and processing them with OCR is sometimes as much an art as it is a science.
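The step of analyzing light and dark areas of a bitmap to identify a character can be illustrated with a toy template matcher. Everything here is an assumption for the sake of a small example: the 3x5 glyphs in `TEMPLATES`, the `score` and `recognize` functions, and the pixel-agreement rule. Real OCR systems are far more sophisticated.

```python
# Hypothetical 3x5 glyph templates: '#' is a dark pixel, '.' is a light one.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}

def score(glyph, template):
    """Count the pixels where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )

def recognize(glyph):
    """Return the character whose template agrees with the most pixels."""
    return max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))
```

A lightly smudged glyph still matches its template on most pixels, which is why CAPTCHA distortion has to go well beyond small amounts of noise.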
  30. 30. OCR
  31. 31. Segmentation: an image-processing step. Types: pixel-based segmentation, model-based segmentation, multi-scale segmentation, and semi-automatic segmentation.
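One simple pixel-based approach is to split a line of text wherever a column contains no ink. The sketch below assumes a bitmap given as equal-length strings with '#' for ink; the function name `segment_columns` is hypothetical and this is only one of many possible segmentation methods.

```python
def segment_columns(bitmap):
    """Return (start, end) column ranges of consecutive columns containing ink."""
    width = len(bitmap[0])
    has_ink = [any(row[x] == "#" for row in bitmap) for x in range(width)]
    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new character begins here
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))    # character runs to the right edge
    return segments
```

Characters that touch or overlap come out as a single segment, which is why many text CAPTCHAs crowd their letters together.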
  32. 32. Validators <ul><li>Types of validators: </li></ul><ul><li>Markup validator: checks web documents in formats such as HTML and XHTML </li></ul><ul><li>Link validator: checks hyperlinks, useful for finding broken links </li></ul><ul><li>CSS validator: checks stylesheets </li></ul><ul><li>RDF validator: checks RDF documents </li></ul><ul><li>Feed validator </li></ul><ul><li>P3P validator: checks P3P (Platform for Privacy Preferences) protocol documents </li></ul><ul><li>Etc. </li></ul>
  33. 33. Session Management <ul><li>The process of keeping track of a user's activity across sessions of interaction with a computer system. </li></ul><ul><li>When a user opens a web page and then does nothing on it, the session expires. </li></ul><ul><li>E.g.: a score watch on a web site </li></ul><ul><li>After a certain time, when the user logs in to the page again, the previously expired session is restored. </li></ul><ul><li>E.g.: if a user has opened a Yahoo account in two windows and later logs off from one window, the same account can no longer be used from the other window. The session has expired and the user has to log in again. </li></ul>
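The two behaviours described on this slide, expiry after inactivity and logging off in one window ending the session in the other, can be sketched with a small in-memory store. The `SessionStore` class, its method names, and the 15-minute default timeout are assumptions for illustration, not how Yahoo or any specific site implements sessions.

```python
import secrets
import time

class SessionStore:
    def __init__(self, timeout_seconds=900, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self._sessions = {}  # session_id -> (user, last_seen)

    def login(self, user):
        """Start a session and return its random identifier."""
        sid = secrets.token_hex(16)
        self._sessions[sid] = (user, self.clock())
        return sid

    def touch(self, sid):
        """Return the user if the session is still valid and refresh it, else None."""
        entry = self._sessions.get(sid)
        if entry is None:
            return None
        user, last_seen = entry
        if self.clock() - last_seen > self.timeout:
            del self._sessions[sid]  # idle for too long: the session expires
            return None
        self._sessions[sid] = (user, self.clock())
        return user

    def logout_user(self, user):
        """End every session of this user, so other windows must log in again."""
        for sid in [s for s, (u, _) in self._sessions.items() if u == user]:
            del self._sessions[sid]
```

Passing the clock in as a parameter keeps the expiry logic easy to test without waiting for real time to pass.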
  34. 34. Session Management <ul><li>There are two types: </li></ul><ul><li>Desktop session management </li></ul><ul><li>Browser session management </li></ul><ul><li>Mainly useful for web applications </li></ul>