2. Who am I?
• Information security investigator @ Cisco.
• Completed M.Tech from IIT Jodhpur in 2014.
• Areas of interest include machine learning, computer vision and A.I.
• Email : satyamiitj89@gmail.com
6. Problem in a Nutshell
URL features to identify malicious Web sites
No context, no content
Different classes of URLs
Benign, spam, phishing, exploits, scams...
For now, distinguish benign vs. malicious
Example hostnames: facebook.com vs. fblight.com
8. State of the Practice
Current approaches
Blacklists [SORBS, URIBL, SURBL, Spamhaus]
Learning on hand-tuned features [Garera et al., 2007]
Limitations
Cannot predict unlisted sites
Cannot account for new features
Arms race: Fast feedback cycle is critical
More automated approach?
10. Data Sets
Malicious URLs
5,000 from PhishTank (phishing)
15,000 from Spamscatter (spam, phishing, etc)
Benign URLs
15,000 from Yahoo Web directory
15,000 from DMOZ directory
2 malicious sources × 2 benign sources → 4 data sets
30,000 – 55,000 features per data set
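The pairing of malicious and benign sources can be read as a cross product. A minimal sketch using only the counts listed on this slide; the variable names and the "source-source" naming of each data set are illustrative assumptions:

```python
from itertools import product

# URL counts per source, as listed on the slide.
malicious = {"PhishTank": 5_000, "Spamscatter": 15_000}
benign = {"Yahoo": 15_000, "DMOZ": 15_000}

# Each data set pairs one malicious source with one benign source.
data_sets = {
    f"{m}-{b}": malicious[m] + benign[b]
    for m, b in product(malicious, benign)
}

for name, size in data_sets.items():
    print(name, size)
```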
11. Algorithms
Logistic regression w/ L1-norm regularization
Implicit feature selection
Easier to interpret
Other models
Naive Bayes
Support vector machines (linear, RBF kernels)
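Why L1 regularization gives implicit feature selection can be seen on a toy problem. The sketch below is not the presenter's implementation; it is a small pure-Python proximal gradient solver written for illustration, with assumed values for the penalty `lam`, the step size `lr`, and the toy data:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(v, t):
    """Shrink v towards zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logreg(X, y, lam=0.1, lr=0.5, steps=200):
    """Proximal gradient descent for L1-regularised logistic regression."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Predicted probabilities under the current weights.
        p = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X]
        err = [pi - yi for pi, yi in zip(p, y)]
        grad_w = [sum(e * x[j] for e, x in zip(err, X)) / n for j in range(d)]
        grad_b = sum(err) / n
        # Gradient step on the log-loss, then the L1 shrinkage step.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Toy data: feature 0 matches the label, feature 1 is balanced noise.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [1, 1, 0, 0]
w, b = fit_l1_logreg(X, y)
print(w)  # the noise feature keeps a weight of exactly 0.0
```

The zero weight on the uninformative feature is what makes the fitted model easier to interpret: only features with non-zero weights take part in the decision.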
13. Features to consider?
1) Blacklists
2) Simple heuristics
3) Domain name registration
4) Host properties
5) Lexical
14. (1) Blacklist Queries
List of known malicious sites
Providers: SORBS, URIBL, SURBL, Spamhaus
Blacklist queries as features
http://www.bfuduuioo1fp.mobi → In blacklist? Yes
http://fblight.com → In blacklist? No
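Blacklist queries as features can be sketched as one binary value per provider. The snapshot sets below are made up for illustration (the slide does not say which provider lists which host), and a real system would query each provider rather than use a local dictionary:

```python
from urllib.parse import urlsplit

PROVIDERS = ["SORBS", "URIBL", "SURBL", "Spamhaus"]

# Hypothetical local snapshots of each list, invented for this example.
SNAPSHOTS = {
    "SORBS": {"www.bfuduuioo1fp.mobi"},
    "URIBL": {"www.bfuduuioo1fp.mobi"},
    "SURBL": set(),
    "Spamhaus": {"www.bfuduuioo1fp.mobi"},
}

def blacklist_features(url):
    """Return one 0/1 feature per provider: is the hostname listed?"""
    host = urlsplit(url).hostname
    return [int(host in SNAPSHOTS[p]) for p in PROVIDERS]

print(blacklist_features("http://www.bfuduuioo1fp.mobi"))
print(blacklist_features("http://fblight.com"))
```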
15. (2) Manually-Selected Features
Considered by previous studies
IP address in hostname?
http://72.23.5.122/www.bankofamerica.com/
Number of dots in URL
http://www.bankofamerica.com.qytrpbcw.stopgap.cn/
WHOIS (domain name) registration date
stopgap.cn registered 28 June 2009
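The first two heuristics can be computed from the URL string alone. A minimal sketch using the Python standard library; the function names are illustrative, not from the original study:

```python
import ipaddress
from urllib.parse import urlsplit

def hostname_is_ip(url):
    """True if the hostname part of the URL is a literal IP address."""
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False

def count_dots(url):
    """Number of '.' characters anywhere in the URL."""
    return url.count(".")

for url in (
    "http://72.23.5.122/www.bankofamerica.com/",
    "http://www.bankofamerica.com.qytrpbcw.stopgap.cn/",
):
    print(url, hostname_is_ip(url), count_dots(url))
```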
16. (3) WHOIS Features
Domain name registration
Date of registration, update, expiration
Registrant: Who registered domain?
Registrar: Who manages registration?
Example: registered on 29 June 2009 by SpamMedia
http://sleazysalmon.com
http://angryalbacore.com
http://mangymackerel.com
http://yammeringyellowtail.com
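Once a WHOIS record has been fetched and parsed, turning it into features is straightforward. In this sketch the record layout, the field names, and the reference date are all assumptions for illustration; no actual WHOIS lookup is performed:

```python
from datetime import date

def whois_features(record, today):
    """Turn a (hypothetical) parsed WHOIS record into simple features."""
    return {
        "age_days": (today - record["registered"]).days,
        "registrant": record["registrant"],
    }

# Registration details from the slide; the reference date is made up.
record = {"registered": date(2009, 6, 29), "registrant": "SpamMedia"}
feats = whois_features(record, today=date(2009, 7, 29))
print(feats)
```

Domains that share a registrant and a registration date, as in the slide's example, would end up with identical values for these features.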
17. (4) Host-Based Features
Blacklisted? (SORBS, URIBL, SURBL, Spamhaus)
WHOIS: registrar, registrant, dates
IP address: Which ASes/IP prefixes?
DNS: TTL? PTR record exists/resolves?
Geography-related: Locale? Connection speed?
Example hosts: facebook.com, fblight.com
Example IP prefixes: 75.102.60.0/22, 69.63.176.0/20
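The "which IP prefixes?" question can be encoded as one binary feature per known prefix. A minimal sketch with the standard `ipaddress` module; the two prefixes come from the slide, while the test addresses are arbitrary examples and resolving a hostname to an IP address is left out:

```python
import ipaddress

# Prefixes shown on the slide.
PREFIXES = [
    ipaddress.ip_network(p) for p in ("75.102.60.0/22", "69.63.176.0/20")
]

def prefix_features(ip):
    """Return one 0/1 feature per prefix: does the address fall inside it?"""
    addr = ipaddress.ip_address(ip)
    return [int(addr in net) for net in PREFIXES]

print(prefix_features("69.63.176.13"))
print(prefix_features("192.0.2.1"))
```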
18. (5) Lexical Features
Tokens in URL hostname + path
Length of URL
Entropy of the domain name
http://www.bfuduuioo1fp.mobi/ws/ebayisapi.dll
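The three lexical features can be sketched as follows. The delimiter set used to split the path and the choice to compute entropy over the full hostname are assumptions; the slide does not specify either:

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit

def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def lexical_features(url):
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        # Hostname tokens are the dot-separated labels.
        "host_tokens": [t for t in host.split(".") if t],
        # Path tokens are split on common URL delimiters (assumed set).
        "path_tokens": [t for t in re.split(r"[/.?=&_\-]+", parts.path) if t],
        "length": len(url),
        "host_entropy": shannon_entropy(host),
    }

url = "http://www.bfuduuioo1fp.mobi/ws/ebayisapi.dll"
feats = lexical_features(url)
print(feats)
```

In a bag-of-words encoding each distinct token becomes its own feature, which is one way the feature count per data set can reach the tens of thousands.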
21. Limitations
False positives
Sites hosted in disreputable ISP
Guilt by association
False negatives
Compromised sites
Free hosting sites
Hosted in reputable ISP
Future work: Web page content
22. Conclusion
Detect malicious URLs with high accuracy
Only using URL
Diverse feature set helps: 86.5% w/ 18,000+ features
Proof of concept working in lab
Future work
Scaling up for deployment
23. References
Ma, Justin, et al. "Beyond blacklists: learning to detect malicious web sites from suspicious URLs." Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2009.