
Cantina content based approach to detect phishing websites


Published in: Technology, Design

  1. CANTINA: A Content-Based Approach to Detecting Phishing Web Sites
  2. ABSTRACT
     • CANTINA is a content-based approach.
     • It examines whether the content of a page is legitimate or not.
     • It detects phishing URLs and links.
  3. INTRODUCTION
     • Phishing: a kind of attack in which victims are tricked by spoofed emails and fraudulent web sites into giving up personal information.
     • How many phishing sites are there? 9,255 unique phishing sites were reported in June 2006 alone.
     • How much does phishing cost each year? $1 billion to $2.8 billion per year.
  4. EXISTING SYSTEM
     • NetCraft (surface characteristics)
     • SpoofGuard (surface characteristics and blacklist)
     • Cloudmark (blacklist)
  5. PROPOSED SYSTEM
     • Detects phishing websites.
     • Examines text-based content along with surface characteristics.
     • Text-based content includes:
       - Age of domain
       - Known images
       - Suspicious URL
       - Suspicious links
     • Detects phishing links in users' email.
  6. TF-IDF ALGORITHM
     • Term Frequency (TF)
       - The number of times a given term appears in a specific document
       - A measure of the importance of the term within that document
     • Inverse Document Frequency (IDF)
       - A measure of how common a term is across an entire collection of documents (the rarer the term, the higher its IDF)
     • A high TF-IDF weight means a high TF in the document and a low document frequency across the collection.
  7. REAL EBAY WEBPAGE
  8. FAKE EBAY WEBPAGE
  9. MODULES
     • Parsing the web pages
     • Generating the lexical signature
     • Testing process
     • Report generation
  10. Parsing the web pages
     • Links, anchor tags, form tags and attachments in the web pages are turned into the corresponding text links, HTML links, etc.
     • This is done by parsing each text.
     • Uses the HTML Parser API, which extracts information from HTML code.
  11. Generating the lexical signature
     • The TF-IDF algorithm is used to generate lexical signatures.
     • The TF-IDF value is calculated for each word in a document.
     • The words with the highest values are selected.
  12. Testing Process
     • Feed this lexical signature to a search engine.
     • Check whether the domain name of the current web page matches the domain name of any of the top N search results.
  13. Report Generation
     • If a page is legitimate, it returns "legitimate".
     • If a page is phishing, it returns "phishing".
  14. APPLICATIONS
     • Used to detect fraudulent websites and emails.
     • Protects users from giving up personal information such as credit card numbers, bank details and account passwords.
     • Used to detect suspicious links in email.
  15. CONCLUSION
     • Content-based approach for detecting phishing websites.
     • User-friendly interface.
     • Anti-phishing website that protects users from giving up their personal information.
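
The TF-IDF weighting (slide 6) and the lexical-signature step (slide 11) can be illustrated with a small sketch. This is not the presenters' implementation: the function names `tokenize` and `lexical_signature`, the tiny in-memory `corpus`, the plain `tf * log(N / df)` weighting, and the signature length `k` are all assumptions made for illustration. A real system would estimate document frequency from a much larger collection.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page_text, corpus, k=5):
    """Return the k terms of page_text with the highest TF-IDF weight.

    corpus is a list of other documents (plain strings) used only to
    estimate document frequency; it is a hypothetical stand-in for a
    real document collection.
    """
    tf = Counter(tokenize(page_text))
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets) + 1  # the corpus plus the page itself

    weights = {}
    for term, count in tf.items():
        # The page itself contains the term, so df starts at 1.
        df = 1 + sum(term in terms for terms in doc_sets)
        weights[term] = count * math.log(n_docs / df)

    # Highest weight first; ties are broken alphabetically for stable output.
    ranked = sorted(weights, key=lambda term: (-weights[term], term))
    return ranked[:k]


if __name__ == "__main__":
    page = "ebay sign in ebay user id password ebay"
    corpus = [
        "sign in to your account with user id and password",
        "the password for your account",
        "welcome to the site sign in",
    ]
    print(lexical_signature(page, corpus, k=3))
```

In this toy input, "ebay" is frequent on the page and absent from the other documents, so it receives the highest weight, which is the behaviour slide 6 describes.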

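
The testing and report steps (slides 12 and 13) amount to a domain comparison followed by a label. The sketch below assumes the search has already been run: `result_urls` is a hypothetical ranked list of URLs returned for the page's lexical signature, and no particular search engine API is implied. The helper names `domain_of` and `classify` and the default `n=10` are also illustrative choices, not values taken from the slides.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercase host name of url without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify(page_url, result_urls, n=10):
    """Label a page by comparing its domain with the top n search results.

    result_urls is assumed to be the ranked list of URLs that a search
    engine returned for the page's lexical signature.
    """
    page_domain = domain_of(page_url)
    top_domains = {domain_of(url) for url in result_urls[:n]}
    if page_domain and page_domain in top_domains:
        return "legitimate"
    return "phishing"


if __name__ == "__main__":
    results = [
        "https://www.ebay.com/signin",
        "https://en.wikipedia.org/wiki/EBay",
    ]
    print(classify("https://www.ebay.com/help", results))
    print(classify("http://ebay.example.net/login", results))
```

This sketch compares full host names, so a subdomain such as `signin.ebay.com` would not match `ebay.com`. Comparing registered domains instead would need a public-suffix lookup, which is left out here to keep the example self-contained.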