SRI KRISHNADEVARAYA UNIVERSITY
Dept. of Computer Science & Technology
Anantapuramu – 515003
A project report on
INTELLIGENT PHISHING WEBSITE
DETECTION AND PREVENTION SYSTEM
By
G. NARESH
M.Sc.
Dept. of Computer Science & Technology
Under the esteemed guidance of
Dr. N. Geethanjali
M.Sc., M.Phil., M.Tech., Ph.D.
Professor
Abstract
Phishing is a new type of network attack in which the attacker creates
a replica of an existing web page to fool users into submitting
personal, financial, or password data to what they think is their
service provider’s website. This project presents an end-host-based
anti-phishing algorithm, called Link Guard, which utilizes the generic
characteristics of the hyperlinks used in phishing attacks. Link Guard is
based on a careful analysis of the characteristics of phishing
hyperlinks.
Link Guard is installed on each end user’s machine. With it in place,
the end user can recognize phishing e-mails and avoid
responding to them. Since Link Guard is characteristic-based, it
can detect and prevent not only known phishing attacks but also
unknown ones.
PHISHING BASICS
Pronounced "fishing"
The word originates from "Password Harvesting",
or fishing for passwords
Phishing is an online form of pretexting, a kind of deception in
which an attacker pretends to be someone else in order to obtain
sensitive information from the victim
Also known as "brand spoofing"
Phishers are phishing artists
OBJECTIVES:
Our analysis of phishing attacks identified that phishing
hyperlinks share one or more of the following characteristics:
• The visual link and the actual link are not the same;
• The attackers often use a dotted decimal IP address instead of a
DNS name;
• Special tricks are used to encode the hyperlinks maliciously;
• The attackers often use fake DNS names that are similar to that of the
target website.
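The first two characteristics lend themselves to simple string checks. The following is a minimal sketch in Java, not the project's actual implementation: the class `LinkChecks` and its method names are hypothetical, and visible link text is only treated as a URL when it carries a scheme such as `http://`.

```java
import java.net.URI;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper illustrating two of the characteristics listed above.
final class LinkChecks {
    // Matches a dotted decimal IPv4 address such as 192.168.0.1 (each part 0-255).
    private static final Pattern DOTTED_DECIMAL = Pattern.compile(
        "^(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}$");

    // Returns the lower-cased host of a URL, or null if the text is not a parsable URL with a host.
    static String hostOf(String url) {
        try {
            String host = URI.create(url.trim()).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (IllegalArgumentException e) {
            return null; // e.g. plain text such as "Click here"
        }
    }

    // The actual link uses a dotted decimal IP address instead of a DNS name.
    static boolean usesDottedDecimalIp(String actualUrl) {
        String host = hostOf(actualUrl);
        return host != null && DOTTED_DECIMAL.matcher(host).matches();
    }

    // The visible text looks like a URL but names a different host than the actual link.
    static boolean visualLinkDiffers(String visualText, String actualUrl) {
        String visualHost = hostOf(visualText);
        String actualHost = hostOf(actualUrl);
        if (visualHost == null || actualHost == null) {
            return false; // visible text is not a URL, so this check gives no verdict
        }
        return !visualHost.equals(actualHost);
    }
}
```

A caller would pass the anchor text and the `href` value of each hyperlink found in a mail; how those are extracted from the message is outside this sketch.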
SOFTWARE MODEL:
Software Development Life Cycle (SDLC) is a process used by the
software industry to design, develop and test high-quality
software. The SDLC aims to produce high-quality software that
meets or exceeds customer expectations and reaches completion
within time and cost estimates.
• SDLC is the acronym of Software Development Life Cycle.
• It is also called the Software Development Process.
• SDLC is a framework defining the tasks performed at each step
in the software development process.
• ISO/IEC 12207 is an international standard for software life-cycle
processes. It aims to be the standard that defines all the
tasks required for developing and
maintaining software.
SDLC stages: Planning, Defining, Designing, Building, Testing, Development
What is SDLC?
SDLC is a process followed for a software project, within a software
organization. It consists of a detailed plan describing how to
develop, maintain, replace and alter or enhance specific software.
The life cycle defines a methodology for improving the quality of
software and the overall development process.
PURPOSE OF THE PROJECT
The word ‘phishing’ first emerged in the 1990s. Early hackers often used ‘ph’ in place of
‘f’ to coin new words in the hacker community, since they usually hacked by
phone. ‘Phishing’ is a word derived from ‘fishing’; it refers to the act of an
attacker luring users to a fake Web site by sending them fake e-mails.
If the user enters an account number and password, the attacker collects
that information at the server side and can use it to carry out
further actions.
Our analysis identifies that phishing hyperlinks share one or more of the characteristics
listed below:
1) The visual link and the actual link are not the same;
2) The attackers often use a dotted decimal IP address instead of a DNS name;
3) Special tricks are used to encode the hyperlinks maliciously;
4) The attackers often use fake DNS names that are similar (but not identical) to that of the
target Web site.
Based on these characteristics of phishing hyperlinks, we propose an end-host-based
anti-phishing algorithm, which we call Link Guard.
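Characteristics 3 and 4 can be illustrated with a short Java sketch. It rests on assumptions that are not taken from the project: similarity between DNS names is measured here with Levenshtein edit distance, the threshold is a caller-chosen parameter, and "encoding tricks" are reduced to two simple signals (a `%` or `@` in the authority part of the URL). The class `DomainSimilarity` and its methods are hypothetical names.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: flags look-alike host names and simple encoding tricks.
final class DomainSimilarity {
    // Levenshtein edit distance computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when the host is close to, but not identical with, a trusted host name.
    static boolean looksLikeKnownDomain(String host, List<String> trusted, int maxDistance) {
        String h = host.toLowerCase(Locale.ROOT);
        boolean near = false;
        for (String t : trusted) {
            int d = editDistance(h, t.toLowerCase(Locale.ROOT));
            if (d == 0) {
                return false; // exact match with a trusted name
            }
            if (d <= maxDistance) {
                near = true;
            }
        }
        return near;
    }

    // True when the authority part (between "://" and the next '/') contains '%' or '@'.
    static boolean hasSuspiciousEncoding(String url) {
        int start = url.indexOf("://");
        if (start < 0) {
            return false;
        }
        int end = url.indexOf('/', start + 3);
        String authority = end < 0 ? url.substring(start + 3) : url.substring(start + 3, end);
        return authority.contains("%") || authority.contains("@");
    }
}
```

The edit-distance threshold trades false alarms against missed look-alikes; a small value such as 1 or 2 is a plausible starting point, but the right value would have to be tuned on real data.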
EXISTING SYSTEM:
Detect and block phishing Web sites in time.
Enhance the security of Web sites.
Block phishing e-mails with various spam filters.
Install online anti-phishing software on users’ computers.
PROPOSED SYSTEM:
Classification of the hyperlinks in phishing e-mails
Link Guard algorithm
Link Guard implemented client
Feasibility study
SOFTWARE REQUIREMENTS
Operating System : Windows 2000/XP
Documentation Tool : MS Word
Technologies : JDBC, Servlets & JSP
Database : MySQL
HARDWARE REQUIREMENTS
Hard disk : 20 GB and above
RAM : 256 MB and above
Processor speed : 1.6 GHz and above
HOW TO AVOID PHISHING
Never send sensitive account information by e-mail
◦ Account numbers, SSN, passwords
Never give any password out to anyone
Verify any person who contacts you (phone or email).
◦ If someone calls you on a sensitive topic, thank
them, hang up and call them back using a
number that you know is correct, such as the one on your
credit card or statement.
ARCHITECTURE DESIGN
This explains the overall architecture of the software being developed and shows
how control flows between the modules in the project.
The primary objective of architecture design is to develop a modular program
structure and represent the control relationships between modules.
Fig: Architecture diagram (components: User, Link Guard, Mail System, URL & Domain Identity, Compose, Send/Receive, Mail System Registration, Phishing Website)
The architecture above contains modules such as the
sender, the receiver, and the Link Guard technique.
• The user first registers, and the registration information is
available to the Admin.
• The user composes a mail and sends it to another
user.
• The mail received by a user is checked
internally with the Link Guard algorithm.
• If the mail is found to be phishing, it is moved to the phish
box.
• If the mail is not phishing, it stays in the normal inbox.
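The routing step at the end of this flow can be sketched as follows. This is an illustrative outline only: the class `MailRouter` is a hypothetical name, mails are reduced to plain strings, and the Link Guard check is passed in as a `Predicate` so that the sketch does not presume how the real check is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical router: places each received mail in the inbox or the phish box.
final class MailRouter {
    private final Predicate<String> linkGuard; // stand-in for the Link Guard check on a mail body
    private final List<String> inbox = new ArrayList<>();
    private final List<String> phishBox = new ArrayList<>();

    MailRouter(Predicate<String> linkGuard) {
        this.linkGuard = linkGuard;
    }

    // Checks the mail and stores it in the matching box.
    void deliver(String mailBody) {
        if (linkGuard.test(mailBody)) {
            phishBox.add(mailBody);
        } else {
            inbox.add(mailBody);
        }
    }

    List<String> inbox() {
        return inbox;
    }

    List<String> phishBox() {
        return phishBox;
    }
}
```

Keeping the check behind a `Predicate` means the mail system does not need to know which hyperlink characteristics are examined.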
UML DIAGRAMS
Fig: Use case diagram
Class Diagram
Fig: Class diagram
Activity Diagram for Mail System
Activity Diagram for compose, send and receive mail
Overview of JAVA
Java technology is both a programming language and a platform.
Java is a powerful but lean object-oriented programming language.
It has generated a lot of excitement because it makes it possible to
program for the Internet by creating applets, programs that can be
embedded in a web page.
Java is actually a platform consisting of three components:
1. Java programming language.
2. Java library of classes and interfaces.
3. Java Virtual Machine.
JAVA DATABASE CONNECTIVITY
JDBC is a Java API for executing SQL statements. (JDBC is often
thought of as “Java Database Connectivity”.) It consists of a set of
classes and interfaces written in the Java programming language.
Using JDBC, it is easy to send SQL statements to virtually
any relational database. In other words, with the JDBC API, it is
not necessary to write one program to access a Sybase
database, another program to access an Informix database, another
program to access an Oracle database, and so on. The combination
of Java and JDBC lets a programmer write it once and run it
anywhere.
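As an illustration of sending a SQL statement through JDBC, the sketch below looks up a host in a URL list. The table `url_list`, its columns, and the class `UrlListDao` are assumptions made for this example and are not taken from the project's schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

// Hypothetical data-access helper for a whitelist/blacklist table.
final class UrlListDao {
    // Assumed schema: url_list(host VARCHAR, list_type VARCHAR), list_type being 'WHITE' or 'BLACK'.
    static final String LOOKUP_SQL = "SELECT list_type FROM url_list WHERE host = ?";

    // Trims and lower-cases a host name so that lookups are case-insensitive.
    static String normalizeHost(String host) {
        return host.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the stored list type for the host, or null when the host is in neither list.
    static String lookup(Connection conn, String host) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(LOOKUP_SQL)) {
            ps.setString(1, normalizeHost(host)); // bound parameter, no string concatenation
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("list_type") : null;
            }
        }
    }
}
```

The `Connection` would typically be obtained from `DriverManager.getConnection` with the database's JDBC driver on the classpath; the same code then works against any relational database that holds such a table.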
JAVA SERVER PAGES (JSP)
Java Server Pages™ (JSP) is a new technology for web application
development that has received a great deal of attention since it was
first announced.
A JSP is similar in design and functionality to a Java servlet. It is
called by the client to provide a web service, the nature of which
depends on the J2EE application.
A Java servlet is written in the Java programming language, and its
responses are encoded as an output string object that is passed to
the println() method.
In contrast, a JSP is written in HTML, XML, or the client’s
format, interspersed with scripting elements, directives,
and actions composed of Java programming language and JSP
syntax.
Three methods are called automatically when
a JSP is requested and terminates normally.
These are
the jspInit() method,
the jspDestroy() method, and
the service() method.
Comm: Collects the information of the
input process and sends this related
information to the Analyzer.
Database: Stores the whitelist, the blacklist, and
the user-input URLs.
Analyzer: The key component of Link
Guard, which implements the Link Guard
algorithm; it uses data provided by Comm and
Database and sends the results to the Alerter
and Logger modules.
Alerter: On receiving a warning message
from the Analyzer, it shows the related information
to alert the user and sends the user’s reaction
back to the Analyzer.
Logger: Archives history information, such
as user events and alert information, for future
use.
Link guard algorithm
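The way these modules cooperate can be outlined in Java. The sketch below is a simplification under stated assumptions: all class names are hypothetical, the Database is kept in memory, the caller plays the role of Comm by handing a host name to the Analyzer, the Analyzer only consults the two lists rather than running the full Link Guard algorithm, and the feedback path from Alerter to Analyzer is left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical outline of the Database, Analyzer, Alerter and Logger modules.
final class LinkGuardModules {
    enum Verdict { SAFE, PHISHING, UNKNOWN }

    // Database: stores the whitelist and the blacklist (in memory for illustration).
    static final class Database {
        final Set<String> whitelist = new HashSet<>();
        final Set<String> blacklist = new HashSet<>();
    }

    // Logger: archives events for future use.
    static final class Logger {
        final List<String> history = new ArrayList<>();

        void log(String event) {
            history.add(event);
        }
    }

    // Alerter: collects the warnings that would be shown to the user.
    static final class Alerter {
        final List<String> warnings = new ArrayList<>();

        void warn(String host) {
            warnings.add("Possible phishing link: " + host);
        }
    }

    // Analyzer: consults the Database and reports to the Alerter and the Logger.
    static final class Analyzer {
        private final Database db;
        private final Alerter alerter;
        private final Logger logger;

        Analyzer(Database db, Alerter alerter, Logger logger) {
            this.db = db;
            this.alerter = alerter;
            this.logger = logger;
        }

        Verdict analyze(String host) {
            Verdict verdict;
            if (db.blacklist.contains(host)) {
                verdict = Verdict.PHISHING;
            } else if (db.whitelist.contains(host)) {
                verdict = Verdict.SAFE;
            } else {
                verdict = Verdict.UNKNOWN; // a fuller analyzer would inspect the hyperlink itself here
            }
            if (verdict == Verdict.PHISHING) {
                alerter.warn(host);
            }
            logger.log(host + " -> " + verdict);
            return verdict;
        }
    }
}
```

Every analysis is logged, while only phishing verdicts reach the Alerter, which mirrors the split between alerting and archiving described above.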
TESTING
Testing is the process of detecting errors. Software testing is a
critical element of software quality assurance and represents the
ultimate review of specification, design and coding.
TESTING METHODS
System Testing
Code Testing
TYPES OF TESTING
Unit Testing
Link Testing
TEST RESULTS
The project’s inputs and outputs are shown below
in diagrammatic form.
Home page:
New User Sign UP:
CONCLUSION
Phishing has become a serious network security
problem, causing financial losses of billions of dollars to both consumers
and e-commerce companies. In this project, we have studied the
characteristics of the hyperlinks embedded in phishing
e-mails. Since Link Guard is characteristic-based, it can not only
detect known attacks but is also effective against unknown ones.
We have implemented Link Guard for Windows XP.
Our experiment showed that Link Guard is lightweight and can
detect up to 96% of unknown phishing attacks in real time. We believe
that Link Guard is not only useful for detecting phishing attacks but
can also shield users from malicious or unsolicited links in Web pages
and instant messages.
website phishing by NR
