SlideShare a Scribd company logo
1 of 4
CAPTCHA CRACKING USING IMAGE PROCESSING
Chirag Patil
Software Developer
ABSTRACT
Today Internet is life line for most of individual person.The day start with looking for any updates and checking
online status on socialmedia and giving feedback in form of comment or likes. This feedback or likes can be
controlled and manipulated by bots orautomated programs which is provided by spammers. Here comes testing
method Known as CAPTCHA ( “ an acronym for "completely automated public Turing test to tell computers and
humans apart", trademarked by Carnegie Mellon University)[8]. It is type of test which is used to determine whether
the user is Human or not. It is set of random alphabets and numbers which is twisted so that it cannot be
recognized by bots but is readable by humans. There is two version of CAPTCHA . Version one which consist of
twisted alphabets and numbers and Version two which consist of multiple images to select from according to system.
Therefore it is very important to check vulnerability t o check CAPTCHA against attack. So in this paper we will
check the method to Crack CAPTCHA and check its effectiveness strength and security against spammers and
automated programs.
Key Words:
CAPTCHA, IMAGE PROCES S IG, OCR, TURING TES T, MATLAB, CAPTCHA CRACKING,
S ECURITY
1. INTRODUCTION
In last few years internet has changed drastically. No of
internet user has increased and so are attacks and its
exploitation inform of spammers and hackers. This hackers
create automated programs or bots to generate traffic and
load on servers which slow down websites and other portals.
SO we are going to crack this CAPTCHA using image
process and check strength and complexity of CAPTCHA.
1.1 LiteratureSurvey
Many organization have cracked CAPTCHA to check and
make better CHAPTCHA and improve security. There
are paper published by researchers on same one of them is
„Stanford Researchers crack CAPTCHA code‟ by Todd
Wasserman published by ‘Stanford University has
created DeCAPTCHA, It is a software which makes
computer read CAPTCHA by clanging up text and
rendering them in liable format of alphabets and numbers.
It was not fully accurate it was able to crack CAPTCHA
but not fully its accuracy was 60 to 70 percent. Another
paper titled as ‘Braking and image based CPATCHA by
MICHAELMerler, JACQUILENE Jacob it was published
by Stanford University is on Vidoop CAPTCHA. It is
another typeof verification process that uses images of the
objects, and animals, people, instead of distorted text to
distinguish a human from computer program and bots
.Authors underestimate that since a computer programs
and bots can try to access several times a day without any
change in effort.
2. Working
In this program we are going to crack CAPTCHA
using IMAGE PROCESSING for image based
CAPTCHA which are not highly secured. We are
going to create a program which will extract the image
from web page using developer mode or by taking the
snapshot of web page. After that it will search for the
image or part were CAPTCHA is. Then it will process
the image or rather say CAPTCHA file which is in
.bmp format. It will extract the text using algorithms
and give us the result and it will automatically submit
the result text in required field and submit the form.
This will tell us the level of security CAPTCH of that
web site has. Example of CAPTCHA from web
page is shown in fig.1.
Fig. 1: EXAMPE OF CAPTCH ON A
WEB PAGE
4. METHODLOGY ANG PROBLEM
APPROCH
4.1. SYSTEM ARCHITECTUR
System Architecture consist of following parts as proposed
by our team.
1. LAYOUT (USER INERFACE)
2. DATA EXTRACTION
3. PRE PROCESSING AND PROCESSING
4. DECESSION MAKING
5. RESULT (OUTPUT)
Fig. 2: SYSTEM ARCHITECTURE
AND FLOW CHART
4.2. DISCRIPTION
4.2.1. LAYOUT (USER INTERFACE)
User Interface it is physical appearance of software with
GUI (Graphical User Inter face) at user end with controls
and input output data.
4.2.2. DATA EXTRACTION
In this phase after getting command by user from
layout or user interface the program will extract
CHAPTCHA image from the web page for further
process.
4.2.3. PRE PROCESSING AND
PROCESSING
After the image is extracted the image is pre processed
because the original image is unfit for processing so is
character in it as well. In this there are chances that the data
or character in image is not of mono color so first
preference is to make one. Then it is further processed
which includes.
a) Bifurcating according to RGB value
b) rotation of image
c) finding original corner of character
d) resizing of image
4.2.4 DICESION MAKING
In this phase after image being processed it can be
recognized by using algorithms and Artificial
intelligence to recognize character. Such OCR and
MATLAB. After character benign recognized it will
submit in required field and if it was success full it will
enter the page and if not it will reload next CAPTCHA
image from web page and process continues [8].
4.2.5 RESULT (OUTPUT)
After successful attempt we will have full access or logged
in to web page then that means CHAPTCH is vulnerable.
5. ALGORITHM
5.1 DATA [PREPARATION
The CAPTCHA image is converted to gray scale image if it‟s
a RGB image. After converting RGB image to gray scale
image, the threshold method is performed. First it calculates
the thresholdvalue using predefined function graythresh().This
threshold value is used, where gray-levels below this threshold
is said to be black and levels above are said to be white [1].It
performs binarization for further segmentation [6].
Fig. 3: EXAMPE OF CAPTCH BEFORE
PROCESSING
Fig. 4: EXAMPE OF CAPTCH AFTER
PROCESSING
5.2 SEGMENTTION AND PIXELTION
Segmentation
The preprocessed CAPTCHA block is segmented into
individual characters as shown in figure 10. Detailed
explanation of this step is as follows. Firstly the size of
preprocessed image is obtained, then every individual
column is scanned till the data is found. Then the obtained
character is cropped. Twenty rows and columns are padded
with 0's as shown in fig. 5.[2]
.
Fig. 5: EXAMPE OF CAPTCH
SEGMENTATION
Pixelation
The pixel-to-pixel comparison function requires both
character images to be the same width/height. Simple
experiments determined a 16 pixel by 16 pixel image was
sufficient to retain important characteristics. It was
important to decaled the image, because for each additional
row/column k, it increased the time complexity. The actual
pixilation was done by dividing the original character
image into the 256 cells, and determining the average value
of each cell (1=black, 0=white). If the average value of
each cell was greater than 0.5, then the script set the
corresponding pixel in the pixilated image to black, and
vice versa.
Fig. 6: EXAMPE OF CHARACTER
PIXELATION
5.3 CLASSIFY
Firstly thetemplates are loaded then it can use four types of
classification techniques as follows:
5.3.1 PIXEL COUNT
First, the vertical segmentation divides the characters into 5
segments. Next each segment is scanned to get the number
of foreground pixels in it. Then, the pixel count obtained in
the previous step is used to look up the mapping table [11].
Finally it gives the output with success rate of 8%.
5.3.2 VERTICAL PROJECTION
Vertical projection is applied to a segmented image, each of
which contains one character. The process of vertical
projection starts by mapping the image histogram to that of
the template vertical histogram which finally has more
mapping involved to it is the output. Finally it gives the
output with success rate of 95% [3].
5.3.3. HORIZONTAL PROJECTION
Horizontal projection is applied to a segmented image, each
of which contains one character. The process of vertical
projection starts by mapping the image histogram to that of
the template horizontal histogram which finally has more
mapping involved to it is the output. Finally it gives the
output with success rate of 100%.
5.3.4. TEMPLATE CORRELATION
The matrix containing the image of the input character is
directly matched with a set of characters in templates
representing each possible class. The distance between the
pattern and each character in template is computed, and the
maximum correlation value obtained from segmented
character and the characters in template is the character
printed to output. Finally it gives the output with success
rate of 100%.
5.4. TEMPELATE MACHING AND
CORRELATION TECHNIQUE
These techniques are different from the others in that no
features are actually extracted. Instead the matrix
containing the image of the input character is directly
matched with a set of characters in templates representing
each possible class. The distance between the pattern and
each character in template is computed, and the maximum
correlation value obtained from segmented character and
the characters in template is the character printed to output
[2].
6. RESULT (OUTPUT)
After successful attempt we will have full access or logged
in to web page then that means CHAPTCH is vulnerable.
7. Goals of this application
The main goals of our application
are:
1. To show the vulnerability of CAPTCHA
2. Further improvement of CAPTCHA and secure
online system.
3. Automate login forperson who face problem with
CAPTCHA.
4. Testing of web page with security issues.
8. Conclusion:
Finally the conclusion by cracking the CAPTCHA is that it
is not completely secure. This paper shows the technique to
crack CAPTCHA so that one can implant CAPTCHA
cracker by giving CAPTCHA image as an input and
retrieving CAPTCHA text as output.
9. Future Scope:
In future automated system can be made to increase
security as well as check vulnerability of a web site. It will
also help those who face problem in logging in portal due
to CAPTCHA.
10. Reference
[1] Mr.Nithya.Eand Dr. Ramesh Babu D R June 2013
OCR Systemfor Complex Printed Kannada Characters.
[2] Sandeep Tiwari, Shivangi Mishra, Priyank Bhatia,
Praveen Km. Yadav May 2013 OpticalCharacter
Recognition using MATLAB.
[3] Jeff Yan and Ahmad Salah El Ahmad A Low-cost
Attack on a Microsoft CAPTCHA.
[4] Silky Azad & Kiran Jain, CAPTCHA:Attacks and
Weaknesses against OCR Technology
[5] Prof. (Mrs.) A.A. Chandavale and Prof. Dr.A.M. Sapkal
and Dr.R.M.Jalnekar ,Algorithm To Break Visual
CAPTCHA.
[6] Hina Parveen and Sudhir Singh, Captcha Recognition
and Robustness Measurement using Hybrid Approaches
[7] Elie Bursztein, Jonathan Aigrain and Angelika
Moscicki TheEnd is Nigh: Generic Solving of Text-based
CAPTCHAs
[8] Christoph Fritsch, Michael Netter, Andreas Reisser, and
Günther Pernul, Attacking Image Recognition Captchas A
Naive but Effective Approach
[9] Luis von Ahn1, Manuel Blum1, Nicholas J. Hopper1 ,
and John Langford CAPTCHA:Using Hard AI Problems
For Security
[10] Bin B. Zhu*1, Jeff Yan2, Qiujie Li3, Chao Yang4, Jia
Liu5, Ning Xu1, Meng Yi6, Kaiwei Cai7, Attacks and
Design of Image Recognition CAPTCHAs
[11] Jeff Yan, Ahmad Salah El Ahmad Breaking Visual
CAPTCHAs with Naïve Pattern Recognition Algorithm

More Related Content

Similar to Captcha cracking using image processing ieee paper

Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wp
Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wpAsia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wp
Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wpAndrey Apuhtin
 
A web based network worm simulator
A web based network worm simulatorA web based network worm simulator
A web based network worm simulatorUltraUploader
 
Improve Captcha's Security Using Gaussian Blur Filter
Improve Captcha's Security Using Gaussian Blur FilterImprove Captcha's Security Using Gaussian Blur Filter
Improve Captcha's Security Using Gaussian Blur Filtersipij
 
captcha formation with warping and random number generation
captcha formation with warping and random number generationcaptcha formation with warping and random number generation
captcha formation with warping and random number generationSaeed Ullah
 
Video Captcha as a Graphical Password
Video Captcha as a Graphical PasswordVideo Captcha as a Graphical Password
Video Captcha as a Graphical PasswordIRJET Journal
 
IRJET-PLC and SCADA based Distribution and Substation Automation
IRJET-PLC and SCADA based Distribution and Substation AutomationIRJET-PLC and SCADA based Distribution and Substation Automation
IRJET-PLC and SCADA based Distribution and Substation AutomationIRJET Journal
 
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA images
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA imagesDemonstrated Deep Learning Techniques for the Resolution of CAPTCHA images
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA imagesIRJET Journal
 
An Implementation of A Geometric and Arithmetic CAPTCHA without Database
An Implementation of A Geometric and Arithmetic CAPTCHA without DatabaseAn Implementation of A Geometric and Arithmetic CAPTCHA without Database
An Implementation of A Geometric and Arithmetic CAPTCHA without DatabaseShubham Saurav
 
A countermeasure for security intensification in cloud using CaPGP
A countermeasure for security intensification in cloud using CaPGPA countermeasure for security intensification in cloud using CaPGP
A countermeasure for security intensification in cloud using CaPGPIRJET Journal
 
An Optimized System to Solve Text-Based Captcha
An Optimized System to Solve Text-Based CaptchaAn Optimized System to Solve Text-Based Captcha
An Optimized System to Solve Text-Based Captchagerogepatton
 
Gender Classification using SVM With Flask
Gender Classification using SVM With FlaskGender Classification using SVM With Flask
Gender Classification using SVM With FlaskAI Publications
 
Evolution of captcha technologies
Evolution of captcha technologiesEvolution of captcha technologies
Evolution of captcha technologiesMonika Keerthi
 
web application security using CAPTCHA
web application  security using CAPTCHAweb application  security using CAPTCHA
web application security using CAPTCHAkomal jadhav
 
Captcha Recognition and Robustness Measurement using Image Processing Techniques
Captcha Recognition and Robustness Measurement using Image Processing TechniquesCaptcha Recognition and Robustness Measurement using Image Processing Techniques
Captcha Recognition and Robustness Measurement using Image Processing TechniquesIOSR Journals
 
Dynamic autoselection and autotuning of machine learning models forcloud netw...
Dynamic autoselection and autotuning of machine learning models forcloud netw...Dynamic autoselection and autotuning of machine learning models forcloud netw...
Dynamic autoselection and autotuning of machine learning models forcloud netw...Venkat Projects
 
Generic Solving Of Text Based Captcha
Generic Solving Of Text Based CaptchaGeneric Solving Of Text Based Captcha
Generic Solving Of Text Based Captchakaranwayne
 
A Survey of Current Research on CAPTCHA
A Survey of Current Research on CAPTCHAA Survey of Current Research on CAPTCHA
A Survey of Current Research on CAPTCHAIJCSES Journal
 

Similar to Captcha cracking using image processing ieee paper (20)

Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wp
Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wpAsia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wp
Asia 16-sivakorn-im-not-a-human-breaking-the-google-re captcha-wp
 
A SURVEY ON CAPTCHA TECHNIQUE BASED ON DRAG AND DROP MOUSE ACTION
A SURVEY ON CAPTCHA TECHNIQUE BASED ON DRAG AND DROP MOUSE ACTIONA SURVEY ON CAPTCHA TECHNIQUE BASED ON DRAG AND DROP MOUSE ACTION
A SURVEY ON CAPTCHA TECHNIQUE BASED ON DRAG AND DROP MOUSE ACTION
 
Choudhary2015
Choudhary2015Choudhary2015
Choudhary2015
 
A web based network worm simulator
A web based network worm simulatorA web based network worm simulator
A web based network worm simulator
 
Improve Captcha's Security Using Gaussian Blur Filter
Improve Captcha's Security Using Gaussian Blur FilterImprove Captcha's Security Using Gaussian Blur Filter
Improve Captcha's Security Using Gaussian Blur Filter
 
captcha formation with warping and random number generation
captcha formation with warping and random number generationcaptcha formation with warping and random number generation
captcha formation with warping and random number generation
 
Video Captcha as a Graphical Password
Video Captcha as a Graphical PasswordVideo Captcha as a Graphical Password
Video Captcha as a Graphical Password
 
IRJET-PLC and SCADA based Distribution and Substation Automation
IRJET-PLC and SCADA based Distribution and Substation AutomationIRJET-PLC and SCADA based Distribution and Substation Automation
IRJET-PLC and SCADA based Distribution and Substation Automation
 
Captcha
CaptchaCaptcha
Captcha
 
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA images
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA imagesDemonstrated Deep Learning Techniques for the Resolution of CAPTCHA images
Demonstrated Deep Learning Techniques for the Resolution of CAPTCHA images
 
An Implementation of A Geometric and Arithmetic CAPTCHA without Database
An Implementation of A Geometric and Arithmetic CAPTCHA without DatabaseAn Implementation of A Geometric and Arithmetic CAPTCHA without Database
An Implementation of A Geometric and Arithmetic CAPTCHA without Database
 
A countermeasure for security intensification in cloud using CaPGP
A countermeasure for security intensification in cloud using CaPGPA countermeasure for security intensification in cloud using CaPGP
A countermeasure for security intensification in cloud using CaPGP
 
An Optimized System to Solve Text-Based Captcha
An Optimized System to Solve Text-Based CaptchaAn Optimized System to Solve Text-Based Captcha
An Optimized System to Solve Text-Based Captcha
 
Gender Classification using SVM With Flask
Gender Classification using SVM With FlaskGender Classification using SVM With Flask
Gender Classification using SVM With Flask
 
Evolution of captcha technologies
Evolution of captcha technologiesEvolution of captcha technologies
Evolution of captcha technologies
 
web application security using CAPTCHA
web application  security using CAPTCHAweb application  security using CAPTCHA
web application security using CAPTCHA
 
Captcha Recognition and Robustness Measurement using Image Processing Techniques
Captcha Recognition and Robustness Measurement using Image Processing TechniquesCaptcha Recognition and Robustness Measurement using Image Processing Techniques
Captcha Recognition and Robustness Measurement using Image Processing Techniques
 
Dynamic autoselection and autotuning of machine learning models forcloud netw...
Dynamic autoselection and autotuning of machine learning models forcloud netw...Dynamic autoselection and autotuning of machine learning models forcloud netw...
Dynamic autoselection and autotuning of machine learning models forcloud netw...
 
Generic Solving Of Text Based Captcha
Generic Solving Of Text Based CaptchaGeneric Solving Of Text Based Captcha
Generic Solving Of Text Based Captcha
 
A Survey of Current Research on CAPTCHA
A Survey of Current Research on CAPTCHAA Survey of Current Research on CAPTCHA
A Survey of Current Research on CAPTCHA
 

More from chirag patil

Wh Yes-No questions.pptx
Wh Yes-No questions.pptxWh Yes-No questions.pptx
Wh Yes-No questions.pptxchirag patil
 
joining words not only but also.pptx
joining words not only but also.pptxjoining words not only but also.pptx
joining words not only but also.pptxchirag patil
 
Basic English Grammar 2.pptx
Basic English Grammar 2.pptxBasic English Grammar 2.pptx
Basic English Grammar 2.pptxchirag patil
 
Basic English Grammar.pptx
Basic English Grammar.pptxBasic English Grammar.pptx
Basic English Grammar.pptxchirag patil
 
Input output devices
Input output devicesInput output devices
Input output deviceschirag patil
 
Decimal and binary conversion
Decimal and binary conversionDecimal and binary conversion
Decimal and binary conversionchirag patil
 
Abbreviations and full forms
Abbreviations and full formsAbbreviations and full forms
Abbreviations and full formschirag patil
 
Web engineering and Technology
Web engineering and TechnologyWeb engineering and Technology
Web engineering and Technologychirag patil
 
Web data management
Web data managementWeb data management
Web data managementchirag patil
 
Web application development
Web application developmentWeb application development
Web application developmentchirag patil
 
Programming the web
Programming the webProgramming the web
Programming the webchirag patil
 
8051 microcontroller
8051 microcontroller8051 microcontroller
8051 microcontrollerchirag patil
 
Computer Graphics and Virtual Reality
Computer Graphics and Virtual RealityComputer Graphics and Virtual Reality
Computer Graphics and Virtual Realitychirag patil
 
Advanced Database Management Syatem
Advanced Database Management SyatemAdvanced Database Management Syatem
Advanced Database Management Syatemchirag patil
 

More from chirag patil (20)

Wh Yes-No questions.pptx
Wh Yes-No questions.pptxWh Yes-No questions.pptx
Wh Yes-No questions.pptx
 
joining words not only but also.pptx
joining words not only but also.pptxjoining words not only but also.pptx
joining words not only but also.pptx
 
Basic English Grammar 2.pptx
Basic English Grammar 2.pptxBasic English Grammar 2.pptx
Basic English Grammar 2.pptx
 
Basic English Grammar.pptx
Basic English Grammar.pptxBasic English Grammar.pptx
Basic English Grammar.pptx
 
Maths formulae
Maths formulaeMaths formulae
Maths formulae
 
Input output devices
Input output devicesInput output devices
Input output devices
 
Shortcut keys
Shortcut keysShortcut keys
Shortcut keys
 
Operating system
Operating systemOperating system
Operating system
 
Network topology
Network topologyNetwork topology
Network topology
 
Decimal and binary conversion
Decimal and binary conversionDecimal and binary conversion
Decimal and binary conversion
 
Abbreviations and full forms
Abbreviations and full formsAbbreviations and full forms
Abbreviations and full forms
 
ASCII Code
ASCII CodeASCII Code
ASCII Code
 
Web engineering and Technology
Web engineering and TechnologyWeb engineering and Technology
Web engineering and Technology
 
Web data management
Web data managementWeb data management
Web data management
 
Web application development
Web application developmentWeb application development
Web application development
 
Programming the web
Programming the webProgramming the web
Programming the web
 
Operating System
Operating SystemOperating System
Operating System
 
8051 microcontroller
8051 microcontroller8051 microcontroller
8051 microcontroller
 
Computer Graphics and Virtual Reality
Computer Graphics and Virtual RealityComputer Graphics and Virtual Reality
Computer Graphics and Virtual Reality
 
Advanced Database Management Syatem
Advanced Database Management SyatemAdvanced Database Management Syatem
Advanced Database Management Syatem
 

Recently uploaded

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 

Recently uploaded (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Captcha cracking using image processing ieee paper

  • 1. CAPTCHA CRACKING USING IMAGE PROCESSING Chirag Patil Software Developer ABSTRACT Today Internet is life line for most of individual person.The day start with looking for any updates and checking online status on socialmedia and giving feedback in form of comment or likes. This feedback or likes can be controlled and manipulated by bots orautomated programs which is provided by spammers. Here comes testing method Known as CAPTCHA ( “ an acronym for "completely automated public Turing test to tell computers and humans apart", trademarked by Carnegie Mellon University)[8]. It is type of test which is used to determine whether the user is Human or not. It is set of random alphabets and numbers which is twisted so that it cannot be recognized by bots but is readable by humans. There is two version of CAPTCHA . Version one which consist of twisted alphabets and numbers and Version two which consist of multiple images to select from according to system. Therefore it is very important to check vulnerability t o check CAPTCHA against attack. So in this paper we will check the method to Crack CAPTCHA and check its effectiveness strength and security against spammers and automated programs. Key Words: CAPTCHA, IMAGE PROCES S IG, OCR, TURING TES T, MATLAB, CAPTCHA CRACKING, S ECURITY 1. INTRODUCTION In last few years internet has changed drastically. No of internet user has increased and so are attacks and its exploitation inform of spammers and hackers. This hackers create automated programs or bots to generate traffic and load on servers which slow down websites and other portals. SO we are going to crack this CAPTCHA using image process and check strength and complexity of CAPTCHA. 1.1 LiteratureSurvey Many organization have cracked CAPTCHA to check and make better CHAPTCHA and improve security. There are paper published by researchers on same one of them is „Stanford Researchers crack CAPTCHA code‟ by Todd Wasserman published by ‘Stanford University has created DeCAPTCHA, It is a software which makes computer read CAPTCHA by clanging up text and rendering them in liable format of alphabets and numbers. It was not fully accurate it was able to crack CAPTCHA but not fully its accuracy was 60 to 70 percent. Another paper titled as ‘Braking and image based CPATCHA by MICHAELMerler, JACQUILENE Jacob it was published by Stanford University is on Vidoop CAPTCHA. It is another typeof verification process that uses images of the objects, and animals, people, instead of distorted text to distinguish a human from computer program and bots .Authors underestimate that since a computer programs and bots can try to access several times a day without any change in effort. 2. Working In this program we are going to crack CAPTCHA using IMAGE PROCESSING for image based CAPTCHA which are not highly secured. We are going to create a program which will extract the image from web page using developer mode or by taking the snapshot of web page. After that it will search for the image or part were CAPTCHA is. Then it will process the image or rather say CAPTCHA file which is in .bmp format. It will extract the text using algorithms and give us the result and it will automatically submit the result text in required field and submit the form. This will tell us the level of security CAPTCH of that web site has. Example of CAPTCHA from web page is shown in fig.1. Fig. 1: EXAMPE OF CAPTCH ON A WEB PAGE
  • 2. 4. METHODLOGY ANG PROBLEM APPROCH 4.1. SYSTEM ARCHITECTUR System Architecture consist of following parts as proposed by our team. 1. LAYOUT (USER INERFACE) 2. DATA EXTRACTION 3. PRE PROCESSING AND PROCESSING 4. DECESSION MAKING 5. RESULT (OUTPUT) Fig. 2: SYSTEM ARCHITECTURE AND FLOW CHART 4.2. DISCRIPTION 4.2.1. LAYOUT (USER INTERFACE) User Interface it is physical appearance of software with GUI (Graphical User Inter face) at user end with controls and input output data. 4.2.2. DATA EXTRACTION In this phase after getting command by user from layout or user interface the program will extract CHAPTCHA image from the web page for further process. 4.2.3. PRE PROCESSING AND PROCESSING After the image is extracted the image is pre processed because the original image is unfit for processing so is character in it as well. In this there are chances that the data or character in image is not of mono color so first preference is to make one. Then it is further processed which includes. a) Bifurcating according to RGB value b) rotation of image c) finding original corner of character d) resizing of image 4.2.4 DICESION MAKING In this phase after image being processed it can be recognized by using algorithms and Artificial intelligence to recognize character. Such OCR and MATLAB. After character benign recognized it will submit in required field and if it was success full it will enter the page and if not it will reload next CAPTCHA image from web page and process continues [8]. 4.2.5 RESULT (OUTPUT) After successful attempt we will have full access or logged in to web page then that means CHAPTCH is vulnerable. 5. ALGORITHM 5.1 DATA [PREPARATION The CAPTCHA image is converted to gray scale image if it‟s a RGB image. After converting RGB image to gray scale image, the threshold method is performed. First it calculates the thresholdvalue using predefined function graythresh().This threshold value is used, where gray-levels below this threshold
  • 3. is said to be black and levels above are said to be white [1].It performs binarization for further segmentation [6]. Fig. 3: EXAMPE OF CAPTCH BEFORE PROCESSING Fig. 4: EXAMPE OF CAPTCH AFTER PROCESSING 5.2 SEGMENTTION AND PIXELTION Segmentation The preprocessed CAPTCHA block is segmented into individual characters as shown in figure 10. Detailed explanation of this step is as follows. Firstly the size of preprocessed image is obtained, then every individual column is scanned till the data is found. Then the obtained character is cropped. Twenty rows and columns are padded with 0's as shown in fig. 5.[2] . Fig. 5: EXAMPE OF CAPTCH SEGMENTATION Pixelation The pixel-to-pixel comparison function requires both character images to be the same width/height. Simple experiments determined a 16 pixel by 16 pixel image was sufficient to retain important characteristics. It was important to decaled the image, because for each additional row/column k, it increased the time complexity. The actual pixilation was done by dividing the original character image into the 256 cells, and determining the average value of each cell (1=black, 0=white). If the average value of each cell was greater than 0.5, then the script set the corresponding pixel in the pixilated image to black, and vice versa. Fig. 6: EXAMPE OF CHARACTER PIXELATION 5.3 CLASSIFY Firstly thetemplates are loaded then it can use four types of classification techniques as follows: 5.3.1 PIXEL COUNT First, the vertical segmentation divides the characters into 5 segments. Next each segment is scanned to get the number of foreground pixels in it. Then, the pixel count obtained in the previous step is used to look up the mapping table [11]. Finally it gives the output with success rate of 8%. 5.3.2 VERTICAL PROJECTION Vertical projection is applied to a segmented image, each of which contains one character. The process of vertical projection starts by mapping the image histogram to that of the template vertical histogram which finally has more mapping involved to it is the output. Finally it gives the output with success rate of 95% [3]. 5.3.3. HORIZONTAL PROJECTION Horizontal projection is applied to a segmented image, each of which contains one character. The process of vertical projection starts by mapping the image histogram to that of the template horizontal histogram which finally has more mapping involved to it is the output. Finally it gives the output with success rate of 100%.
  • 4. 5.3.4. TEMPLATE CORRELATION The matrix containing the image of the input character is directly matched with a set of characters in templates representing each possible class. The distance between the pattern and each character in template is computed, and the maximum correlation value obtained from segmented character and the characters in template is the character printed to output. Finally it gives the output with success rate of 100%. 5.4. TEMPELATE MACHING AND CORRELATION TECHNIQUE These techniques are different from the others in that no features are actually extracted. Instead the matrix containing the image of the input character is directly matched with a set of characters in templates representing each possible class. The distance between the pattern and each character in template is computed, and the maximum correlation value obtained from segmented character and the characters in template is the character printed to output [2]. 6. RESULT (OUTPUT) After successful attempt we will have full access or logged in to web page then that means CHAPTCH is vulnerable. 7. Goals of this application The main goals of our application are: 1. To show the vulnerability of CAPTCHA 2. Further improvement of CAPTCHA and secure online system. 3. Automate login forperson who face problem with CAPTCHA. 4. Testing of web page with security issues. 8. Conclusion: Finally the conclusion by cracking the CAPTCHA is that it is not completely secure. This paper shows the technique to crack CAPTCHA so that one can implant CAPTCHA cracker by giving CAPTCHA image as an input and retrieving CAPTCHA text as output. 9. Future Scope: In future automated system can be made to increase security as well as check vulnerability of a web site. It will also help those who face problem in logging in portal due to CAPTCHA. 10. Reference [1] Mr.Nithya.Eand Dr. Ramesh Babu D R June 2013 OCR Systemfor Complex Printed Kannada Characters. [2] Sandeep Tiwari, Shivangi Mishra, Priyank Bhatia, Praveen Km. Yadav May 2013 OpticalCharacter Recognition using MATLAB. [3] Jeff Yan and Ahmad Salah El Ahmad A Low-cost Attack on a Microsoft CAPTCHA. [4] Silky Azad & Kiran Jain, CAPTCHA:Attacks and Weaknesses against OCR Technology [5] Prof. (Mrs.) A.A. Chandavale and Prof. Dr.A.M. Sapkal and Dr.R.M.Jalnekar ,Algorithm To Break Visual CAPTCHA. [6] Hina Parveen and Sudhir Singh, Captcha Recognition and Robustness Measurement using Hybrid Approaches [7] Elie Bursztein, Jonathan Aigrain and Angelika Moscicki TheEnd is Nigh: Generic Solving of Text-based CAPTCHAs [8] Christoph Fritsch, Michael Netter, Andreas Reisser, and Günther Pernul, Attacking Image Recognition Captchas A Naive but Effective Approach [9] Luis von Ahn1, Manuel Blum1, Nicholas J. Hopper1 , and John Langford CAPTCHA:Using Hard AI Problems For Security [10] Bin B. Zhu*1, Jeff Yan2, Qiujie Li3, Chao Yang4, Jia Liu5, Ning Xu1, Meng Yi6, Kaiwei Cai7, Attacks and Design of Image Recognition CAPTCHAs [11] Jeff Yan, Ahmad Salah El Ahmad Breaking Visual CAPTCHAs with Naïve Pattern Recognition Algorithm