CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart".
A CAPTCHA is a challenge-response test used in computing to determine whether the user is human.
The term was trademarked in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University, who developed the first CAPTCHA.
A common type of CAPTCHA requires the user to type the letters of a distorted image, sometimes with the addition of an obscured sequence of letters or digits, that appears on screen.
The user has to type this string to submit a form. This is a simple problem for humans but a very hard one for computers, which have to use character recognition, because the displayed string is distorted in a way that makes it very hard for a computer to decode.
Early CAPTCHAs, such as the distorted images generated by the EZ-Gimpy program, were used on Yahoo!.
A CAPTCHA is a program that can generate and grade tests that:
1. Most humans can pass
2. Current computer programs cannot pass
The concept of a CAPTCHA is motivated by real-world problems faced by internet companies such as Yahoo! and AltaVista.
These companies offer free email accounts, intended for use by humans.
However, they found that many online vendors were using "bots", computer programs that would sign up for thousands of email accounts, from which they could send out masses of junk email.
The first discussion of automated tests which distinguish humans from computers for the purpose of controlling access to web services appears in a 1996 manuscript of Moni Naor from the Weizmann Institute of Science, entitled "Verification of a human in the loop, or Identification via the Turing Test".
Primitive CAPTCHAs seem to have been developed later, in 1997, at AltaVista by Andrei Broder and his colleagues to prevent bots from adding URLs to their search engine.
In order to make the images resistant to OCR (Optical Character Recognition), the team simulated situations that scanner manuals claimed resulted in bad OCR.
In 2000, von Ahn and Blum developed and publicized the notion of a CAPTCHA, which included any program that can distinguish humans from computers.
A CAPTCHA system is an automated means of generating new challenges which current computers are unable to solve accurately, but which most humans can solve.
CAPTCHAs are by definition fully automated, requiring little human maintenance or intervention in administering the test.
This has obvious benefits in cost and reliability. By definition, the algorithm used to create the CAPTCHA must be made public, though it may be covered by a patent.
Because CAPTCHAs rely on perception, users unable to perceive a CAPTCHA due to a disability (such as blindness) will be unable to perform the task protected by a CAPTCHA. In certain cases, failing to provide a universally accessible means of bypassing the CAPTCHA could make site owners a target of litigation.
In order to combat this problem, many implementations of CAPTCHAs permit users to opt for an audio CAPTCHA in addition to a text-based one.
While the combination of an audio and a visual CAPTCHA cannot satisfy all users (for example, those with deafblindness), the choice of adding a CAPTCHA to an application is a balance between ease of use for legitimate users and creating enough of a challenge for abusers that abusing the application is not worthwhile.
The inconvenience caused by a CAPTCHA is sometimes higher for users with disabilities. For some applications, the potential for abuse is so high that the application author feels that a CAPTCHA is necessary. For other applications, the need for accessibility outweighs the abuse that a CAPTCHA would prevent.
EZ-Gimpy and Gimpy, the CAPTCHAs that we have broken, are examples of word-based CAPTCHAs.
In EZ-Gimpy, the CAPTCHA used by Yahoo!, the user is presented with an image of a single word.
This image has been distorted, and a cluttered, textured background has been added.
The distortion and clutter are sufficient to confuse current OCR (optical character recognition) software.
However, using our computer vision techniques we are able to correctly identify the word 92% of the time.
Gimpy is a more difficult variant of a word-based CAPTCHA. Ten words are presented in distortion and clutter similar to EZ-Gimpy.
The words are also overlapped, providing a CAPTCHA test that can be challenging for humans in some cases.
Generating CAPTCHAs: Bongo
Bongo displays two series of blocks, the Left and the Right.
The blocks in the Left series differ from those in the Right, and the user must find the characteristic that sets them apart.
Then the user is presented with a single block and is asked to determine whether this block belongs to the Right series or to the Left.
[Figure: example Bongo puzzle. Answer: left]
Sound Based CAPTCHAs: Eco
Eco picks a word or a sequence of numbers at random, renders the word or the numbers into a sound clip, and distorts the sound clip.
It then presents the distorted sound clip to the user and asks them to enter the contents of the sound clip.
Text Based CAPTCHAs
A number of research projects have attempted (often with success) to beat visual CAPTCHAs by creating programs that contain the following functionality:
1. Extraction of the image from the web page.
2. Removal of background clutter, for example with color filters and detection of thin lines.
3. Segmentation, i.e. splitting the image into segments containing a single letter.
4. Identifying the letter for each segment.
Steps 1, 2, and 4 are easy tasks for computers.
The only part where humans still outperform computers is segmentation.
If the background clutter consists of shapes similar to letter shapes, and the letters are connected by this clutter, the segmentation becomes nearly impossible with current software. Hence, an effective CAPTCHA should focus on the segmentation step.
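The role of segmentation can be illustrated with a toy example. The sketch below is not taken from any of the research projects mentioned; it assumes a tiny binary image given as a list of strings ('#' for ink, '.' for background) and uses a simple column-projection rule (the function name segment_columns is hypothetical). It shows that letters separated by blank columns are split easily, while a single line of clutter connecting them defeats this naive approach.

```python
def segment_columns(bitmap):
    """Split a binary image into (start, end) column ranges.

    bitmap is a list of equal-length strings where '#' marks ink.
    A segment is a maximal run of columns that contain ink; 'end' is exclusive.
    """
    width = len(bitmap[0])
    # A column "has ink" if any row has '#' at that position.
    has_ink = [any(row[x] == "#" for row in bitmap) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new segment begins
        elif not ink and start is not None:
            segments.append((start, x))    # a blank column closes the segment
            start = None
    if start is not None:
        segments.append((start, width))    # segment runs to the right edge
    return segments


# Three glyphs separated by blank columns (toy data, not a real CAPTCHA).
clean = [
    "###.#.#.###",
    "#...###..#.",
    "###.#.#..#.",
]

# The same glyphs with one horizontal line of clutter joining them.
cluttered = [
    "###.#.#.###",
    "###########",
    "###.#.#..#.",
]

print(segment_columns(clean))      # one range per glyph
print(segment_columns(cluttered))  # a single range covering everything
```

Real attacks use far more sophisticated segmentation than blank-column detection, but the same principle applies: once clutter connects the letters, there is no obvious place to cut.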
Graphic Based CAPTCHAs
Some researchers promote image recognition CAPTCHAs as a possible alternative to text-based CAPTCHAs. To date, no major website has made use of an image-based CAPTCHA. As such, the technology would be best described as in the stage of theoretical research. Image recognition CAPTCHAs face many potential problems which have not been fully studied:
It is difficult for a small site to acquire a large dictionary of images which an attacker does not have access to. Without a means of automatically acquiring new labelled images, an image-based challenge does not meet the definition of a CAPTCHA.
The principles behind CAPTCHA are as follows:
The user is presented with a garbled image on which some text is displayed. This image is generated by the server using random text.
The user must enter the letters shown in the image into a text field that is displayed on the form being protected.
When the form is submitted, the server checks if the text entered by the user matches the initial generated text. If it does, the transaction continues. Otherwise, an error message is displayed and the user has to enter a new code.
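The generate-and-check cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: the in-memory dictionary _pending, the function names new_challenge and check_answer, the six-character length, and the case-insensitive comparison are all choices made for this example. Rendering the text as a garbled image is left out.

```python
import hmac
import secrets
import string

# Characters used for challenge text (an assumption for this sketch).
ALPHABET = string.ascii_uppercase + string.digits

# Hypothetical in-memory store: challenge id -> expected text.
# A real server would keep this in the user's session or a database.
_pending = {}


def new_challenge(length=6):
    """Generate random text and remember it under a fresh challenge id."""
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = text
    # The server would render `text` as a garbled image and send only the
    # image and the challenge id to the browser, never the text itself.
    return challenge_id, text


def check_answer(challenge_id, answer):
    """Return True if the answer matches; each challenge is single use."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    given = answer.strip().upper()
    # Compare as bytes in constant time.
    return hmac.compare_digest(expected.encode(), given.encode())


challenge_id, text = new_challenge()
print(check_answer(challenge_id, text))   # correct answer is accepted
print(check_answer(challenge_id, text))   # reusing the challenge is rejected
```

Removing the challenge from the store on the first check means that a wrong answer forces a new code to be generated, which matches the behaviour described above.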
[Figure: example of how the CAPTCHA would look]
On the main registration form a regular CAPTCHA is presented just like before. Users who can see the image may use this test. A link informs users that there is an alternative test.
Clicking the link leads to the audio-based test form. This form provides access to an audio file and three input fields. The audio file contains three numbers that the user has to enter into the fields.
Protecting Website Registration
Preventing Comment Spam in Blogs
Search Engine Bots
Worms and Spam
Preventing Dictionary Attacks
In November 1999, http://slashdot.com released an online poll asking which was the best graduate school in computer science. As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots by using programs that voted for CMU thousands of times.
CMU's score started growing rapidly. The next day, students at MIT wrote their own voting program and the poll became a contest between voting "bots". MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000.
Protecting Website Registration
Several companies offer free email services. Until a few years ago, most of these services suffered from a specific type of attack: "bots" that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts.
Preventing Comment Spam in Blogs
Most bloggers are familiar with programs that submit bogus comments, usually for the purpose of raising the search engine ranks of some website. This is called comment spam. By using a CAPTCHA, only humans can enter comments on a blog. There is no need to make users sign up before they enter a comment, and no legitimate comments are ever lost!
Search Engine Bots
It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily. There is an HTML tag to prevent search engine bots from reading webpages.
Worms and Spam
CAPTCHA tests also offer a plausible solution against email worms and spam: only accept an email message if you know there is a human behind the other computer.
Preventing Dictionary Attacks
CAPTCHAs can also be used to prevent dictionary attacks in password systems. The idea is simple: prevent a computer from being able to iterate through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins.
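One way to picture this is a login routine that counts failures per account and starts demanding a solved CAPTCHA once a threshold is reached. The sketch below is illustrative only: the class name LoginGuard, the threshold of three attempts, and the captcha_passed flag are assumptions for this example, and passwords are held in plain text purely to keep the sketch short (a real system would store salted hashes).

```python
import hmac

# Assumed threshold: failed logins allowed before a CAPTCHA is demanded.
MAX_FREE_ATTEMPTS = 3


class LoginGuard:
    """Toy login check that requires a CAPTCHA after repeated failures."""

    def __init__(self, passwords):
        self._passwords = passwords   # user -> password (plain text, demo only)
        self._failures = {}           # user -> consecutive failed attempts

    def captcha_required(self, user):
        return self._failures.get(user, 0) >= MAX_FREE_ATTEMPTS

    def attempt(self, user, password, captcha_passed=False):
        # After too many failures, refuse to even check the password
        # unless the caller has solved a CAPTCHA for this attempt.
        if self.captcha_required(user) and not captcha_passed:
            return "captcha_required"

        expected = self._passwords.get(user)
        if expected is not None and hmac.compare_digest(
            expected.encode(), password.encode()
        ):
            self._failures.pop(user, None)   # success resets the counter
            return "ok"

        self._failures[user] = self._failures.get(user, 0) + 1
        return "failed"


guard = LoginGuard({"alice": "s3cret"})
for guess in ["aaaa", "bbbb", "cccc"]:
    print(guess, guard.attempt("alice", guess))
# A bot that keeps guessing is now blocked until a CAPTCHA is solved.
print(guard.attempt("alice", "s3cret"))
print(guard.attempt("alice", "s3cret", captcha_passed=True))
```

A legitimate user who mistypes a password a few times only has to solve one CAPTCHA, while a program iterating over a dictionary would have to solve one for every further guess.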