Story of reCAPTCHA

Story of reCAPTCHA : A Session By Naga Chokkanathan @ CRMIT (http://www.crmit.com/)

Video Recording Of This Session : http://youtu.be/K5XI60uc06c

Published in: Technology

Story of reCAPTCHA

  1. Story of reCAPTCHA, by Naga Chokkanathan
  2. Remember This? • CAPTCHA – Completely – Automated – Public – Turing test to tell – Computers and – Humans – Apart • Security for the website, agreed • But for the real users? • BORING task • Waste of time | Story of reCAPTCHA | www.crmit.com | © Copyright 2013 CRMIT. All rights reserved.
  3. CAPTCHA • Yahoo! popularized it first • Later, almost every website started using CAPTCHA to block automated attacks • Very effective: only people can solve those word / image puzzles • But it is a waste of time too – Assume you spend 10 seconds on a CAPTCHA – Multiply that by 200 million CAPTCHAs every day – That is more than 500,000 hours spent every day • Can something be done about this? (1)
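The time estimate on this slide can be checked with quick arithmetic. The sketch below uses only the slide's own figures (10 seconds per CAPTCHA, 200 million CAPTCHAs per day); the constant names are illustrative, and the person-year conversion assumes a 365-day year.

```python
# Back-of-the-envelope check using the figures quoted on the slide.
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000

total_seconds = SECONDS_PER_CAPTCHA * CAPTCHAS_PER_DAY
total_hours = total_seconds / 3600
total_years = total_hours / (24 * 365)  # assumes a 365-day year

print(f"{total_hours:,.0f} hours per day")                 # 555,556 hours per day
print(f"about {total_years:.0f} person-years per day")     # about 63 person-years per day
```

In other words, the daily total works out to roughly 555,000 hours of human attention.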
  4. Another Problem • Digitizing Books • Process: – Stage 1 • Scan • Convert to image • Save – Stage 2 • Use OCR to convert images to text • Searchable Text
  5. OCR • Optical Character Recognition • Wonderful technology • But not always reliable • Especially with old text (due to old typefaces, damage, stains, etc.) • Can something be done about this? (2)
  6. Possible Solutions • Manual Corrections – Nearly impossible – VERY expensive • Using multiple OCR programs – They will still make mistakes – But not the same mistakes – Hopefully! • Can something be done about this? (3)
  7. Crowd Sourcing • Assume each book contains 25,000 words – Can we split them among 25 people, each correcting 1,000 words? – Or 50 people, each 500 words? – Or 100 people, each 250 words? – Or 2,500 people, each 10 words? – Or 25,000 people, each 1 word? • Sounds stupid? – Think again!
  8. Dr. Luis von Ahn • Associate Professor @ Carnegie Mellon University • Coined the word CAPTCHA • Pioneer in the field of crowdsourcing • Founder of the company reCAPTCHA (later acquired by Google)
  9. reCAPTCHA
  10. reCAPTCHA Process • Step 1: Using multiple OCR programs – Accept matching words – Use a dictionary – Flag “problematic” words • Step 2: reCAPTCHA – Millions of users on various websites fill in reCAPTCHA forms • Proving they are not robots • Proofreading text, one word at a time – Similar entries are compared before arriving at the final word
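The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the actual reCAPTCHA implementation: the tiny `DICTIONARY`, the function names `flag_words` and `resolve`, the majority-vote rule, and the `min_votes` threshold are all assumptions made for the example, since the slides do not say how entries are compared.

```python
from collections import Counter

# Hypothetical mini-dictionary; a real system would use a full word list.
DICTIONARY = {"the", "quick", "brown", "fox", "morning"}

def flag_words(ocr_a, ocr_b, dictionary=DICTIONARY):
    """Step 1: accept a word when both OCR outputs agree and the word is in
    the dictionary; otherwise flag its position for human review."""
    accepted, flagged = {}, []
    for i, (a, b) in enumerate(zip(ocr_a, ocr_b)):
        if a == b and a.lower() in dictionary:
            accepted[i] = a
        else:
            flagged.append(i)
    return accepted, flagged

def resolve(entries, min_votes=3):
    """Step 2: compare human entries for one flagged word and return the most
    common entry once it has at least min_votes (an assumed threshold)."""
    if not entries:
        return None
    counts = Counter(e.strip().lower() for e in entries)
    word, votes = counts.most_common(1)[0]
    return word if votes >= min_votes else None

ocr_a = ["the", "qu1ck", "brown", "fox"]
ocr_b = ["the", "quick", "brovvn", "fox"]
accepted, flagged = flag_words(ocr_a, ocr_b)
print(accepted)  # {0: 'the', 3: 'fox'}
print(flagged)   # [1, 2]
print(resolve(["quick", "Quick", "quick", "quack"]))  # quick
```

Words on which the two OCR programs disagree are exactly the ones sent out to people, which keeps the human workload limited to the hard cases.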
  11. How It Works • Flagged Word • Control Word (Real CAPTCHA) • Remember “25,000 people, proofreading 1 word at a time”? Not “stupid” anymore!
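The pairing of a control word with a flagged word might work roughly as follows. This is a hedged sketch under stated assumptions: the function name `check_challenge`, the case-insensitive comparison, and the plain list of votes are hypothetical, and the sample words are invented for the example.

```python
def check_challenge(control_answer, control_input, flagged_input, votes):
    """Pass the user if the control word matches the known answer; only then
    record their reading of the flagged word as one vote."""
    passed = control_input.strip().lower() == control_answer.lower()
    if passed:
        votes.append(flagged_input.strip().lower())
    return passed

votes = []
print(check_challenge("morning", "Morning", "overlooks", votes))  # True
print(check_challenge("morning", "mourning", "xyz", votes))       # False
print(votes)  # ['overlooks']
```

Because the answer to the control word is already known, it does the job of an ordinary CAPTCHA, while the flagged word collects one proofreading vote from each user who passes.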
  12. A Few Statistics • 100M+ reCAPTCHAs every day • 96,000+ websites – Most major websites use it • Facebook, Twitter, CNN, etc. • Security concerns exist!
  13. What We Can Do • Use reCAPTCHA instead of CAPTCHA on your websites, wherever required – Registration forms, blogs, forums, etc. – Easy-to-use widgets • Be proud when filling in a reCAPTCHA form – You are helping Google preserve books ☺
  14. Applying Crowd Sourcing • Can it solve some of your existing problems?
  15. References, Image Credits • https://www.youtube.com/watch?v=VoybhowC4LE • http://www.nytimes.com/2011/03/29/science/29recaptcha.html?_r=1& • http://techie-buzz.com/tech-news/recaptcha-crowdsourcing-ocr-google-books.html • http://www.google.com/recaptcha • http://drupal.org/project/captcha • http://www.captcha.net/ • http://www.brothersoft.com/cuneiform-ocr-4384.html • http://www.compzets.com/view-upload.php?id=166&action=view • http://en.wikipedia.org/
  16. Thank you
