Story of reCAPTCHA

Story of reCAPTCHA : A Session By Naga Chokkanathan @ CRMIT (http://www.crmit.com/)

Video Recording Of This Session : http://youtu.be/K5XI60uc06c

Transcript

  • 1. Story of reCAPTCHA • Naga Chokkanathan
  • 2. Remember This? • CAPTCHA – Completely – Automated – Public – Turing test to tell – Computers and – Humans – Apart • Security for the website, Agreed • But for the real users? • BORING task • Waste of time Story of reCAPTCHA www.crmit.com© Copyright 2013 CRMIT. All rights reserved.
  • 3. CAPTCHA • Yahoo! popularized it first • Later, almost every website started using CAPTCHA to avoid automated attacks • Very effective: only people can solve those word / image puzzles • But it is a waste of time too – Assuming you spend 10 seconds on a CAPTCHA – Multiplied by 200 million CAPTCHAs every day – Roughly 555,000 hours are wasted every day • Can something be done about this? (1)
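The time estimate on slide 3 can be checked with a few lines of arithmetic. This is only a sketch: the 10 seconds and the 200 million CAPTCHAs per day are the slide's own figures, and the function name is made up for illustration.

```python
# Figures taken from slide 3; both are the speaker's assumptions.
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000


def hours_spent_per_day(seconds_each: int, count: int) -> float:
    """Total human time spent per day, converted from seconds to hours."""
    return seconds_each * count / 3600


total_hours = hours_spent_per_day(SECONDS_PER_CAPTCHA, CAPTCHAS_PER_DAY)
print(f"{total_hours:,.0f} hours per day")  # prints "555,556 hours per day"
```

That is about 63 years of human effort every single day, which is the waste the rest of the talk tries to put to use.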
  • 4. Another Problem • Digitizing Books • Process: – Stage 1 • Scan • Convert to image • Save – Stage 2 • Use OCR to convert images to text • Searchable Text
  • 5. OCR • Optical Character Recognition • Wonderful technology • But not always reliable • Especially with old text (due to old typefaces, damage, stains, etc.) • Can something be done about this? (2)
  • 6. Possible Solutions • Manual Corrections – Near Impossible – VERY Expensive • Using multiple OCR Programs – They will still make mistakes – But not the same mistakes – Hopefully! • Can something be done about this? (3)
  • 7. Crowd Sourcing • Assume each book contains 25,000 words – Can we split them among 25 people, each correcting 1,000 words? – Or 50 people, each 500 words? – Or 100 people, each 250 words? – Or 2,500 people, each 10 words? – Or 25,000 people, each 1 word? • Sounds Stupid? – Think again!
  • 8. Dr. Luis von Ahn • Associate Professor @ Carnegie Mellon University • Co-coined the term CAPTCHA • Pioneer in the field of Crowdsourcing • Founder of the company reCAPTCHA (Later acquired by Google)
  • 9. reCAPTCHA
  • 10. reCAPTCHA Process • Step 1: Using multiple OCR Programs – Accept Matching Words – Use Dictionary – Flag “Problematic” Words • Step 2: reCAPTCHA – Millions of users on various websites fill reCAPTCHA forms • Proving they are not robots • Proofreading text, one word at a time – Similar entries are compared before arriving at the final word
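Step 1 on slide 10 can be sketched roughly as follows. This is an illustration, not the actual reCAPTCHA pipeline: it assumes two OCR outputs that are already aligned word for word, and the function name, the sample words, and the tiny dictionary are all hypothetical.

```python
def flag_problem_words(ocr_a, ocr_b, dictionary):
    """Compare two aligned OCR transcriptions word by word.

    A word is accepted only when both programs agree AND the word is in the
    dictionary; every other position is flagged for human proofreading.
    Returns (accepted, flagged): a dict of position -> word, and a list of
    flagged positions.
    """
    accepted = {}
    flagged = []
    for position, (word_a, word_b) in enumerate(zip(ocr_a, ocr_b)):
        if word_a == word_b and word_a.lower() in dictionary:
            accepted[position] = word_a
        else:
            flagged.append(position)
    return accepted, flagged


# Hypothetical sample data: the first OCR program misreads "brown".
ocr_a = ["the", "quick", "brovvn", "fox"]
ocr_b = ["the", "quick", "brown", "fox"]
dictionary = {"the", "quick", "brown", "fox"}

accepted, flagged = flag_problem_words(ocr_a, ocr_b, dictionary)
print(accepted)  # {0: 'the', 1: 'quick', 3: 'fox'}
print(flagged)   # [2]
```

Only the flagged positions would need to go to human readers, which keeps the crowdsourced workload small.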
  • 11. How It Works • Flagged Word • Control Word (Real CAPTCHA) • Remember “25000 people, Proof Reading 1 Word at a time”? Not “Stupid” Anymore!
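The pairing of a control word with a flagged word on slide 11 might be modelled like this. It is a minimal sketch under stated assumptions: the slides do not say how many matching entries are required, so the `min_votes=3` threshold, the function names, and the sample submissions are all invented for illustration.

```python
from collections import Counter


def passes_control(control_answer: str, control_truth: str) -> bool:
    """The control word has a known answer, so it works as the real CAPTCHA."""
    return control_answer.strip().lower() == control_truth.strip().lower()


def record_guess(votes: Counter, guess: str, passed: bool) -> None:
    """Count a guess for the flagged word only if the user passed the control."""
    if passed:
        votes[guess.strip().lower()] += 1


def resolve_word(votes: Counter, min_votes: int = 3):
    """Return the most common guess once it has enough votes, else None.

    min_votes is a hypothetical threshold, not a documented reCAPTCHA value.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Hypothetical submissions: (guess for flagged word, answer for control word).
control_truth = "upon"
submissions = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("mourning", "upon"),
    ("morning", "xyz"),   # fails the control word, so this guess is ignored
    ("morning", "upon"),
]

votes = Counter()
for guess, control_answer in submissions:
    record_guess(votes, guess, passes_control(control_answer, control_truth))

resolved = resolve_word(votes)
print(resolved)  # morning
```

Because the user cannot tell which of the two words is the control, an honest attempt at both is the easiest way to pass, and the flagged word gets proofread as a side effect.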
  • 12. A Few Statistics • 100M+ reCAPTCHAs every day • 96,000+ Websites – Most major websites use it • Facebook, Twitter, CNN, etc. • Security concerns exist!
  • 13. What We Can Do • Use reCAPTCHA instead of CAPTCHA on your websites, wherever required – Registration Forms, Blogs, Forums, etc. – Easy-to-use Widgets • Be proud when filling a reCAPTCHA form – You are helping Google preserve books ☺
  • 14. Applying Crowd Sourcing • Can it solve some of your existing problems?
  • 15. References, Image Credits • https://www.youtube.com/watch?v=VoybhowC4LE • http://www.nytimes.com/2011/03/29/science/29recaptcha.html?_r=1& • http://techie-buzz.com/tech-news/recaptcha-crowdsourcing-ocr-google-books.html • http://www.google.com/recaptcha • http://drupal.org/project/captcha • http://www.captcha.net/ • http://www.brothersoft.com/cuneiform-ocr-4384.html • http://www.compzets.com/view-upload.php?id=166&action=view • http://en.wikipedia.org/
  • 16. Thank you