Introduction Evolution Method
CPS 205 : Introduction to Cybersecurity
Breaking CAPTCHAs using ML
Jishnu Jaykumar P
jishnujayakumar.github.io
Robert Bosch Centre for Cyber-Physical Systems
Indian Institute of Science Bangalore
March 5, 2018
Jishnu Jaykumar P CPS205 : Breaking CAPTCHAs using ML
Introduction
CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart.
The term CAPTCHA was coined in 2003 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University.
Find the paper here: [1]
[1] CAPTCHA: using hard AI problems for security
What is a CAPTCHA?
A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot.
For example, humans can read distorted text such as the one shown below, but current computer programs can't:
Figure: Source - https://fakecaptcha.com
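The "generating and grading" part of the definition above can be sketched in a few lines. This is only an illustration: the names `generate_challenge` and `grade_response` are hypothetical, the uppercase-only alphabet is an assumption, and the image distortion step (the part that actually keeps bots out) is left out entirely.

```python
import secrets
import string

# Assumption for this sketch: challenges use uppercase letters only.
ALPHABET = string.ascii_uppercase


def generate_challenge(length=4):
    """Return a random string that a real CAPTCHA would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade_response(expected, response):
    """Accept the response if it matches the challenge, ignoring case and outer spaces."""
    return response.strip().upper() == expected


challenge = generate_challenge()
print(grade_response(challenge, challenge.lower()))  # correct answer typed in lowercase
print(grade_response(challenge, challenge + "X"))    # answer with an extra character
```

The server keeps the expected string private and only ever sends the rendered image, so grading is a plain string comparison.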
CAPTCHAs have several applications for practical
security, including (but not limited to):
Preventing Comment Spam in Blogs.
Protecting Website Registration.
Protecting Email Addresses From Scrapers.
Online Polls (CMU-MIT bot race for best CS university
ranking, 1999).
Preventing Dictionary Attacks.
Search Engine Bots.
Worms and Spam.
First Generation CAPTCHA
Distorted pieces of text that would help stop spam on the internet.
They worked because humans could read the text but the computers/bots couldn't.
Figure: An example of First Gen CAPTCHA
reCAPTCHA - Stop Spam, Read Books
Fast forward, and millions of CAPTCHAs were being solved daily.
So Luis von Ahn started to think: can we use this brain power to do something useful?
The answer was yes, and that gave birth to reCAPTCHA.
The team decided to use this brain power to digitize every single physical book that we have.
Figure: First take real physical books and scan them.
Figure: Some errors while translating scanned copies to digital text.
The reCAPTCHA team added the words that the OCR found difficult to decipher to the reCAPTCHA database.
So now, instead of using distorted text, they started to show words from books that computers couldn't understand.
Figure: When enough people on the internet solving these CAPTCHAs
wrote the same word for a piece of text shown, it would be uploaded
to the E-Books database.
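The agreement rule described in this caption can be sketched as a simple vote count. The slides do not say how many matching answers counted as "enough", so the function name `consensus_word` and both thresholds (`min_votes`, `min_share`) are assumptions made purely for illustration.

```python
from collections import Counter


def consensus_word(answers, min_votes=3, min_share=0.75):
    """Return the agreed transcription, or None if agreement is too weak.

    min_votes and min_share are hypothetical thresholds, not reCAPTCHA's real ones.
    """
    # Normalize so that "Morning" and "morning " count as the same answer.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    if votes >= min_votes and votes / len(normalized) >= min_share:
        return word
    return None


print(consensus_word(["morning", "Morning", "morning ", "mourning"]))  # morning
print(consensus_word(["upon", "upan"]))                                # None
```

Only words that clear the agreement threshold would be written back to the digitized text; the rest stay in the pool and are shown to more people.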
100 million reCAPTCHAs were being solved every day.
That is equivalent to 2.5 million books/year.
In 2009, Google acquired reCAPTCHA.
Google used this brain power to digitize the New York Times article archives going back to 1851, as well as Google Books.
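As a quick sanity check on the two throughput figures quoted above, the ratio between them can be worked out directly. The only assumption here is a 365-day year; the other two numbers are taken from the slides as given.

```python
SOLVED_PER_DAY = 100_000_000  # reCAPTCHAs solved per day (from the slides)
BOOKS_PER_YEAR = 2_500_000    # books digitized per year (from the slides)
DAYS_PER_YEAR = 365           # assumption: a non-leap year

solved_per_year = SOLVED_PER_DAY * DAYS_PER_YEAR
solves_per_book = solved_per_year / BOOKS_PER_YEAR

print(solved_per_year)  # 36500000000
print(solves_per_book)  # 14600.0
```

So the quoted figures imply roughly 14,600 solved reCAPTCHAs per digitized book.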
Figure: When Google ran out of NYT articles and Google Books,
they started giving street numbers from street views that helped
label Google Maps.
Seems like a good solution, right?
NO!!!
What about blind people?
What about people with dyslexia, poor eyesight or poor hearing?
On the other hand, computer vision algorithms were becoming powerful and were outperforming humans at solving these problems (an example is shown towards the end).
NoCAPTCHA reCAPTCHA
Figure: When you click it, it sends a whole bunch of information
to Google.
Figure: If the Google reCAPTCHA risk analysis engine is still confused, then it pops up a task box. If you pass it, then chances are the next time you click it, it will automatically allow you to pass without the task box challenge.
Overview
Final year project by Stanford students Nathan Zhao, Yi Liu and Yijun Jiang, Autumn 2017. [2]
They proposed the following algorithms:
Single-letter CAPTCHA recognition.
Multi-CAPTCHA recognition algorithm.
[2] http://cs229.stanford.edu/proj2017/final-reports/5239112.pdf
Dataset
PyCaptcha, a Python package for CAPTCHA generation, was used to make a custom CAPTCHA image dataset.
This package offers several degrees of freedom such as font style, distortion and noise, which can be exploited to increase the diversity of the data and the difficulty of the recognition task.
Single-letter CAPTCHA images (40-by-60 pixels) were created by feeding PyCaptcha with uppercase letters ranging from A to Z from a restricted set of fonts.
The resulting images were labelled by the corresponding letters.
This gave a supervised classification problem with 26 classes.
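The shape of the resulting dataset can be sketched as follows. This does not call PyCaptcha: `fake_letter_image` is a hypothetical stand-in that returns random pixels where a rendered letter would go, and treating 40 as the width and 60 as the height is an assumption, since the slides only say "40-by-60 pixels".

```python
import random
import string

WIDTH, HEIGHT = 40, 60            # image size from the slides; orientation is assumed
CLASSES = string.ascii_uppercase  # 26 classes, A to Z


def fake_letter_image(rng):
    """Placeholder for a rendered CAPTCHA letter: HEIGHT rows of WIDTH grayscale pixels."""
    return [[rng.randint(0, 255) for _ in range(WIDTH)] for _ in range(HEIGHT)]


def make_dataset(samples_per_class, seed=0):
    """Return (images, labels), where each label is the class index of the letter."""
    rng = random.Random(seed)
    images, labels = [], []
    for index, _letter in enumerate(CLASSES):
        for _ in range(samples_per_class):
            images.append(fake_letter_image(rng))
            labels.append(index)
    return images, labels


images, labels = make_dataset(samples_per_class=2)
print(len(images), len(set(labels)))  # 52 26
print(CLASSES[labels[-1]])            # Z
```

Each image flattens to a 2400-dimensional feature vector, and the integer labels 0 to 25 are what a 26-way classifier would be trained to predict.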
Sample CAPTCHA generated by PyCaptcha
Figure: A typical CAPTCHA, which is an image distortion of the string
ADMD
Different methods used and their results
K-Means clustering results
Figure: Clustering after dimensionality reduction from 40x60 (2400) dimensions to 2D.
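For readers unfamiliar with the clustering step, here is a minimal sketch of K-Means (Lloyd's algorithm) on points that are already 2D. The slides do not say which dimensionality reduction method was used, so that step is not reproduced; the function name `kmeans_2d`, the toy points and the explicit starting centroids are all assumptions for illustration.

```python
def kmeans_2d(points, initial_centroids, iterations=10):
    """Lloyd's algorithm on 2D points; returns (centroids, assignments)."""
    centroids = list(initial_centroids)
    k = len(centroids)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its member points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Toy data: two well-separated groups standing in for reduced image features.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, assignments = kmeans_2d(points, initial_centroids=[(0.0, 0.1), (0.2, 0.0)])
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

Even though both starting centroids sit in the first group, the update step pulls one of them toward the second group, and the assignments settle after two iterations.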
Method: CNN
Figure: Proposed structure of convolutional neural network.
Method: VGG-19
Figure: Structure of VGG-19 and freezing of many last convolutional
layers.
Related Work
As CAPTCHAs are actively used by many websites to protect traffic, major corporations have already invested significant resources in breaking CAPTCHAs to assess the strengths and shortcomings of these techniques.
A noteworthy mention is Google's StreetView team.
They applied their algorithms for recognizing signs in images to the CAPTCHA problem, achieving 99.8% success on particular types of difficult-to-read CAPTCHAs. [3]
[3] Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V. (2013). Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
Google StreetView team's Dataset
Examples of incorrectly transcribed street numbers from the large internal dataset (transcription vs. ground truth). Note that for some of these, the "ground truth" is also incorrect. The ground truth labels in this dataset are quite noisy, as is common in real world settings. [4]
[4] Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
Hard CAPTCHA puzzles dataset
Examples of images from the hard CAPTCHA puzzles dataset. [5]
[5] Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V. (2013). Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
Any Questions?
The NSA and Israel wrote Stuxnet
together. - Edward Snowden
Thank You.
Jishnu Jaykumar P CPS205 : Breaking CAPTCHAs using ML

More Related Content

Similar to Breaking CAPTCHAs using ML

Captcha seminar report
Captcha seminar reportCaptcha seminar report
Captcha seminar reportRishabh Agarwal
Ā 
563.10.3 captcha
563.10.3 captcha563.10.3 captcha
563.10.3 captchasaishanker
Ā 
web application security using CAPTCHA
web application  security using CAPTCHAweb application  security using CAPTCHA
web application security using CAPTCHAkomal jadhav
Ā 
Machine Learning for Marketers by Mike King at The Inbounder New York
Machine Learning for Marketers by Mike King at The Inbounder New YorkMachine Learning for Marketers by Mike King at The Inbounder New York
Machine Learning for Marketers by Mike King at The Inbounder New YorkWe Are Marketing
Ā 
Captcha Seminar report 2014
Captcha Seminar report 2014Captcha Seminar report 2014
Captcha Seminar report 2014Ganesh Dhage
Ā 
Jean captcha-ppt
Jean captcha-pptJean captcha-ppt
Jean captcha-pptJean D'souza
Ā 
The Evolution of CAPTCHA
The Evolution of CAPTCHAThe Evolution of CAPTCHA
The Evolution of CAPTCHADanielleGhazi
Ā 
Metadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge ProductionMetadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge ProductionKevin Rundblad
Ā 
IRJET- Different Implemented Captchas and Breaking Methods
IRJET- Different Implemented Captchas and Breaking MethodsIRJET- Different Implemented Captchas and Breaking Methods
IRJET- Different Implemented Captchas and Breaking MethodsIRJET Journal
Ā 
Building your outreach machine
Building your outreach machineBuilding your outreach machine
Building your outreach machineMichael King
Ā 

Similar to Breaking CAPTCHAs using ML (20)

captcha
captcha captcha
captcha
Ā 
14A81A05A8
14A81A05A814A81A05A8
14A81A05A8
Ā 
Captcha report
Captcha reportCaptcha report
Captcha report
Ā 
Captcha
CaptchaCaptcha
Captcha
Ā 
Captcha seminar report
Captcha seminar reportCaptcha seminar report
Captcha seminar report
Ā 
Captcha
CaptchaCaptcha
Captcha
Ā 
563.10.3 captcha
563.10.3 captcha563.10.3 captcha
563.10.3 captcha
Ā 
web application security using CAPTCHA
web application  security using CAPTCHAweb application  security using CAPTCHA
web application security using CAPTCHA
Ā 
Captcha ppt
Captcha pptCaptcha ppt
Captcha ppt
Ā 
Machine Learning for Marketers by Mike King at The Inbounder New York
Machine Learning for Marketers by Mike King at The Inbounder New YorkMachine Learning for Marketers by Mike King at The Inbounder New York
Machine Learning for Marketers by Mike King at The Inbounder New York
Ā 
Captcha Seminar report 2014
Captcha Seminar report 2014Captcha Seminar report 2014
Captcha Seminar report 2014
Ā 
CAPTCHA
CAPTCHACAPTCHA
CAPTCHA
Ā 
Jean captcha-ppt
Jean captcha-pptJean captcha-ppt
Jean captcha-ppt
Ā 
The Evolution of CAPTCHA
The Evolution of CAPTCHAThe Evolution of CAPTCHA
The Evolution of CAPTCHA
Ā 
Metadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge ProductionMetadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge Production
Ā 
IRJET- Different Implemented Captchas and Breaking Methods
IRJET- Different Implemented Captchas and Breaking MethodsIRJET- Different Implemented Captchas and Breaking Methods
IRJET- Different Implemented Captchas and Breaking Methods
Ā 
Captcha
CaptchaCaptcha
Captcha
Ā 
Captcha
CaptchaCaptcha
Captcha
Ā 
Building your outreach machine
Building your outreach machineBuilding your outreach machine
Building your outreach machine
Ā 
CAPTCHA.pptx
CAPTCHA.pptxCAPTCHA.pptx
CAPTCHA.pptx
Ā 

More from Jishnu P

SinGAN - Learning a Generative Model from a Single Natural Image
SinGAN - Learning a Generative Model from a Single Natural ImageSinGAN - Learning a Generative Model from a Single Natural Image
SinGAN - Learning a Generative Model from a Single Natural ImageJishnu P
Ā 
Stencil computation research project presentation #1
Stencil computation research project presentation #1Stencil computation research project presentation #1
Stencil computation research project presentation #1Jishnu P
Ā 
Btp 2017 presentation
Btp 2017 presentationBtp 2017 presentation
Btp 2017 presentationJishnu P
Ā 
Ir mcq-answering-system
Ir mcq-answering-systemIr mcq-answering-system
Ir mcq-answering-systemJishnu P
Ā 
Cs403 Parellel Programming Travelling Salesman Problem
Cs403   Parellel Programming Travelling Salesman ProblemCs403   Parellel Programming Travelling Salesman Problem
Cs403 Parellel Programming Travelling Salesman ProblemJishnu P
Ā 
Ansible Overview - System Administration and Maintenance
Ansible Overview - System Administration and MaintenanceAnsible Overview - System Administration and Maintenance
Ansible Overview - System Administration and MaintenanceJishnu P
Ā 
CS404 Pattern Recognition - Locality Preserving Projections
CS404   Pattern Recognition - Locality Preserving ProjectionsCS404   Pattern Recognition - Locality Preserving Projections
CS404 Pattern Recognition - Locality Preserving ProjectionsJishnu P
Ā 

More from Jishnu P (7)

SinGAN - Learning a Generative Model from a Single Natural Image
SinGAN - Learning a Generative Model from a Single Natural ImageSinGAN - Learning a Generative Model from a Single Natural Image
SinGAN - Learning a Generative Model from a Single Natural Image
Ā 
Stencil computation research project presentation #1
Stencil computation research project presentation #1Stencil computation research project presentation #1
Stencil computation research project presentation #1
Ā 
Btp 2017 presentation
Btp 2017 presentationBtp 2017 presentation
Btp 2017 presentation
Ā 
Ir mcq-answering-system
Ir mcq-answering-systemIr mcq-answering-system
Ir mcq-answering-system
Ā 
Cs403 Parellel Programming Travelling Salesman Problem
Cs403   Parellel Programming Travelling Salesman ProblemCs403   Parellel Programming Travelling Salesman Problem
Cs403 Parellel Programming Travelling Salesman Problem
Ā 
Ansible Overview - System Administration and Maintenance
Ansible Overview - System Administration and MaintenanceAnsible Overview - System Administration and Maintenance
Ansible Overview - System Administration and Maintenance
Ā 
CS404 Pattern Recognition - Locality Preserving Projections
CS404   Pattern Recognition - Locality Preserving ProjectionsCS404   Pattern Recognition - Locality Preserving Projections
CS404 Pattern Recognition - Locality Preserving Projections
Ā 

Recently uploaded

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
Ā 
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļøcall girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø9953056974 Low Rate Call Girls In Saket, Delhi NCR
Ā 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
Ā 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
Ā 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
Ā 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
Ā 
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAŠ”Y_INDEX-DM_23-1-final-eng.pdfssuser54595a
Ā 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
Ā 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
Ā 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
Ā 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
Ā 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
Ā 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
Ā 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
Ā 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
Ā 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
Ā 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
Ā 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
Ā 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
Ā 

Recently uploaded (20)

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
Ā 
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļøcall girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
call girls in Kamla Market (DELHI) šŸ” >ą¼’9953330565šŸ” genuine Escort Service šŸ”āœ”ļøāœ”ļø
Ā 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Ā 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
Ā 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
Ā 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx

Breaking CAPTCHAs using ML

  • 1. Introduction Evolution Method CPS 205 : Introduction to Cybersecurity Breaking CAPTCHAs using ML Jishnu Jaykumar P jishnujayakumar.github.io Robert Bosch Centre for Cyber-Physical Systems Indian Institute of Science Bangalore March 5, 2018 Jishnu Jaykumar P CPS205 : Breaking CAPTCHAs using ML
  • 4. Introduction. CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. The term CAPTCHA was coined in 2003 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. Find the paper here [1]. [1] CAPTCHA: using hard AI problems for security
  • 6. What is a CAPTCHA? A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. For example, humans can read distorted text such as the one shown below, but current computer programs can't. Figure: Source - https://fakecaptcha.com
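The "generating and grading tests" idea can be sketched in a few lines of Python. This is only an illustration, not how any real CAPTCHA service works: the function names `generate_challenge` and `grade_response` are hypothetical, the uppercase-only alphabet is an assumption, and the step that renders the string as a distorted image is left out.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # assumption: uppercase-only challenges


def generate_challenge(length=4):
    """Return a random challenge string (a real CAPTCHA would render it as a distorted image)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade_response(challenge, response):
    """Accept the response if it matches the challenge, ignoring case and surrounding spaces."""
    return response.strip().upper() == challenge


challenge = generate_challenge()
print(grade_response(challenge, challenge.lower()))  # a correct answer passes
```

The security of the scheme rests entirely on the rendering step being hard for a program to read, which is what the rest of the talk puts under pressure.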
  • 13. CAPTCHAs have several applications for practical security, including (but not limited to): Preventing Comment Spam in Blogs. Protecting Website Registration. Protecting Email Addresses From Scrapers. Online Polls (CMU-MIT bot race for best CS university ranking, 1999). Preventing Dictionary Attacks. Search Engine Bots. Worms and Spam.
  • 15. First Generation CAPTCHA. Distorted pieces of text that would help stop spam on the internet. They worked because humans could read the text but the computers/bots couldn't. Figure: An example of First Gen CAPTCHA
  • 19. reCAPTCHA - Stop Spam, Read Books. Fast forward: millions of CAPTCHAs were being solved daily. So Luis von Ahn started to think: can we use this brain power to do something useful? The answer was yes, and that gave birth to reCAPTCHA. They decided to use this brain power to digitize every single physical book that we have.
  • 20. reCAPTCHA - Stop Spam, Read Books. Figure: First take real physical books and scan them.
  • 21. reCAPTCHA - Stop Spam, Read Books. Figure: Some errors while translating scanned copies to digital text.
  • 24. reCAPTCHA - Stop Spam, Read Books. The reCAPTCHA team added the words that the OCR found difficult to decipher to the reCAPTCHA database. So now, instead of using distorted text, they started to show words from books that computers couldn't understand.
  • 25. reCAPTCHA - Stop Spam, Read Books. Figure: When enough people on the internet solving these CAPTCHAs wrote the same word for a piece of text shown, it would be uploaded to the E-Books database.
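The agreement rule in that caption can be sketched as a simple vote count. The threshold `min_votes`, the normalization step and the function name `consensus_word` are assumptions made for illustration; the slides do not say how many matching answers reCAPTCHA actually required.

```python
from collections import Counter


def consensus_word(answers, min_votes=3):
    """Return the agreed transcription, or None if no answer has enough votes yet."""
    if not answers:
        return None
    # Normalize so that "Morning " and "morning" count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    word, votes = Counter(normalized).most_common(1)[0]
    return word if votes >= min_votes else None


print(consensus_word(["morning", "Morning ", "morning", "moming"]))
```

Only once a word clears the threshold would it be treated as the transcription of the scanned snippet.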
  • 29. reCAPTCHA - Stop Spam, Read Books. 100 million reCAPTCHAs were being solved every day, equivalent to 2.5 million books/year. Hence in 2009, Google acquired reCAPTCHA. Google used the brain power to digitize all of the New York Times article archives since 1851 and Google Books.
  • 30. reCAPTCHA - Stop Spam, Read Books. Figure: When Google ran out of NYT articles and Google Books, they started giving street numbers from street views that helped label Google Maps.
  • 36. reCAPTCHA - Stop Spam, Read Books. Seems like a good solution, right? NO!!! What about blind people? What about people with dyslexia, poor eyesight, poor hearing ability? On the other hand, computer vision algorithms were becoming powerful and were outperforming humans in solving problems (an example is shown towards the end).
  • 40. NoCAPTCHA reCAPTCHA. Figure: When you click it, it sends a whole bunch of information to Google.
  • 41. NoCAPTCHA reCAPTCHA. Figure: If the Google reCAPTCHA risk analysis engine is still confused, then it pops up a task box. If you pass it, then chances are the next time you click it, it will automatically allow you to pass without the task box challenge.
  • 42. Overview. Final year project by Stanford students Nathan Zhao, Yi Liu and Yijun Jiang, Autumn 2017 [2]. They proposed the following algorithms: single-letter CAPTCHA recognition and a multi-CAPTCHA recognition algorithm. [2] http://cs229.stanford.edu/proj2017/final-reports/5239112.pdf
  • 47. Dataset. PyCaptcha, a Python package for CAPTCHA generation, was used to make a custom CAPTCHA image dataset. This package offers several degrees of freedom such as font style, distortion and noise, which can be exploited to increase the diversity of the data and the difficulty of the recognition task. Single-letter CAPTCHA images (40-by-60 pixels) were created by feeding PyCaptcha with uppercase letters ranging from A to Z from a restricted set of fonts. The resulting images were labelled by the corresponding letters. This gave a supervised classification problem with 26 classes.
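To make the shape of that classification problem concrete, here is a minimal sketch of how a labelled single-letter example could be encoded. It does not call PyCaptcha; the helper names are hypothetical, and reading "40-by-60" as width by height is an assumption.

```python
import string

CLASSES = string.ascii_uppercase  # 26 classes, A..Z
WIDTH, HEIGHT = 40, 60            # assumption: "40-by-60" read as width x height


def label_to_index(letter):
    """Map a letter label to its class index (A -> 0, ..., Z -> 25)."""
    return CLASSES.index(letter)


def one_hot(letter):
    """Encode a letter label as a 26-element one-hot vector."""
    vec = [0] * len(CLASSES)
    vec[label_to_index(letter)] = 1
    return vec


def flatten(image):
    """Flatten a HEIGHT x WIDTH grid of pixel values into one feature vector."""
    return [pixel for row in image for pixel in row]


blank = [[0] * WIDTH for _ in range(HEIGHT)]  # stand-in for a generated image
print(len(flatten(blank)), label_to_index("D"), sum(one_hot("D")))  # 2400 3 1
```

Each training example is then a 2400-dimensional pixel vector paired with one of 26 labels.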
  • 48. Sample CAPTCHA generated by PyCaptcha. Figure: A typical CAPTCHA, which is an image distortion of the string ADMD
  • 49. Different methods used and their results
  • 50. K-Means clustering results. Figure: Clustering after dimensionality reduction from 40x60 dimensions to 2D.
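The clustering step can be illustrated with a small pure-Python version of Lloyd's k-means algorithm on 2D points. This is a sketch under assumptions: the points below are made up, the dimensionality reduction to 2D is taken as already done, and the initial centroids are picked by hand rather than by whatever initialization the project used.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2D points, starting from the given centroids."""
    centroids = list(centroids)
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignment


# Hypothetical 2D embeddings of six images forming two obvious groups.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centroids, assignment = kmeans(points, centroids=[points[0], points[3]])
print(assignment)
```

With 26 letter classes one would use 26 centroids; two are used here only to keep the example readable.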
  • 51. Method: CNN. Figure: Proposed structure of convolutional neural network.
  • 52. Method: VGG-19. Figure: Structure of VGG-19 and freezing of many last convolutional layers.
  • 55. Related Work. As CAPTCHAs are actively used by many websites to protect traffic, major corporations have already invested significant resources in breaking CAPTCHAs to assess the strengths and shortcomings of these techniques. A noteworthy mention is Google's StreetView team. They have applied their algorithms for recognizing signs in images to the CAPTCHA problem, achieving 99.8% success on particular types of difficult-to-read CAPTCHAs [3]. [3] Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V. (2013). Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
  • 56. Google StreetView team's DataSet. Examples of incorrectly transcribed street numbers from the large internal dataset (transcription vs. ground truth). Note that for some of these, the "ground truth" is also incorrect. The ground truth labels in this dataset are quite noisy, as is common in real world settings [4]. [4] Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
  • 57. Hard CAPTCHA puzzles dataset. Examples of images from the hard CAPTCHA puzzles dataset [5]. [5] Goodfellow, I.J., Bulatov, Y., Ibarz, J., Arnoud, S., Shet, V. (2013). Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks. arXiv preprint.
  • 58. Any Questions?
  • 59. "The NSA and Israel wrote Stuxnet together." - Edward Snowden. Thank You.