Presented by: Nileshwari Desai
Under the guidance of: Prashasti Kanikar
• Introduction
• Main design criteria
• Background
• Content-based detection of terror-related activity
• Modes of operation
• Learning typical terrorist behavior
• Detecting typical terrorist behavior
• Case study
• Deployment issues
• Conclusion
• Ongoing research issues
• References
Terrorist cells are using the Internet infrastructure to exchange information and
recruit new members and supporters. For example, high-speed Internet connections
were used intensively by members of the infamous ‘Hamburg Cell’, which was largely
responsible for the preparation of the September 11 attacks against the United
States.
It is believed that the detection of terrorists on the Web might prevent further
terrorist attacks. One way to detect terrorist activity on the Web is to eavesdrop on
all traffic of Web sites associated with terrorist organizations in order to identify
the accessing users by their IP addresses. Unfortunately, it is difficult to monitor
terrorist sites, since they do not use fixed IP addresses and URLs. The geographical
locations of the Web servers hosting those sites also change frequently in order to
prevent successful eavesdropping. To overcome this problem, law enforcement
agencies are trying to detect terrorists by monitoring all ISP traffic, though
the privacy issues this raises still prevent the relevant laws from being enforced.
• Training the detection algorithm should be based on
the content of existing terrorist sites and known
terrorist traffic on the Web.
• Detection should be carried out in real time. This goal
can be achieved only if terrorist information interests
are represented in a compact manner for efficient
processing.
• The detection sensitivity should be controlled by
user-defined parameters to enable calibration of the desired
detection performance.
• This research integrates issues from the following
research fields:
1) Computer security (intrusion detection systems)
2) Information retrieval (the vector-space model)
3) Data mining (cluster analysis)
Detection environment: This study suggests a new type of knowledge-based
detection methodology that uses the content of Web pages browsed by terrorists
and their supporters as an input to a detection process.
In this study, ‘content’ refers only to the textual content of Web pages, excluding
images, music, video clips, and other complex data types. It is assumed that
terror-related content usually viewed by terrorists and their supporters can be used
as training data for a learning process to obtain a ‘Typical-Terrorist-Behavior’.
This typical behavior is then used to detect further terrorists and their supporters.
A ‘Typical-Terrorist-Behavior’ is defined as an access to information relevant to
terrorists and their supporters. A general description of a system based on the
suggested methodology is presented in Figure 1. Each user under surveillance is
identified as a ‘user's computer’ having a unique IP address rather than by his or
her name. In the case of a real-time alarm, the detected IP address can be used to
locate the computer and, hopefully, the suspected terrorist, who may still be logged
on to the same computer.
• Learning typical terrorist behavior: In this mode, a collection of Web pages
from terror-related sites is downloaded and represented as a set of vectors
using the vector-space model. The collected data is used to derive and represent
the typical behavior (‘profile’) of the terrorists and their supporters by applying
unsupervised clustering techniques. Since the IP addresses of the downloaded
pages are ignored, moving the same or similar content to a new address, as
terror-related sites frequently do, will not affect the detection
accuracy of the new method.
• Monitoring users: This mode is aimed at detecting terrorist users by
comparing the content of the information accessed by users to the
Typical-Terrorist-Behavior. The textual content of the information accessed by a user
on the Web is converted into a vector called the ‘access vector’. An alarm is issued
when the similarity between the ‘access vector’ and the
Typical-Terrorist-Behavior is above a predefined threshold. The privacy of regular users is
preserved, since the system does not need to store either the visited IP
addresses or the actual content of the viewed pages. Due to extensive
dimensionality reduction procedures, the access vectors do not hold sufficient
information to restore the actual content of a page. Other types of viewed
content (e.g., images) are ignored by the system.
The learning part of the methodology (learning the Typical-Terrorist-Behavior) defines and
represents the typical behavior of terrorist users based on the content of their Web
activities. Figure 2 describes the learning module. It is assumed that it is possible to
collect Web pages from terror-related sites. The content of the collected pages is the input
to the Vector Generator module, which converts the pages into vectors of weighted terms (each
page is converted to one vector). The vectors are stored for future processing in the Vector
of Terrorists Transactions DB. The Clustering module accesses the collected vectors and
performs unsupervised clustering, resulting in n clusters that represent the typical topics
viewed by terrorist users. For each cluster, the Terrorist-Representor module computes the
centroid vector (denoted by Cv_i), which represents a topic typically accessed by terrorists.
As a result, a set of centroid vectors represents a set of terrorists' interests, referred to
as the ‘Typical-Terrorist-Behavior’. The Typical-Terrorist-Behavior is based on a set of Web
pages that were downloaded from terrorist-related sites and is the main input of the detection
algorithm. To keep the detection algorithm accurate, the process of generating the
Typical-Terrorist-Behavior has to be repeated periodically, because the content of
terrorist-related sites changes. The Typical-Terrorist-Behavior depends on the number of
clusters. When the number of clusters is higher, the Typical-Terrorist-Behavior includes more
topics of interest to terrorists, where each topic is based on fewer pages. It is hard to
hypothesize what the optimal number of clusters is. In the case study presented in the next
section, detection performance is presented for two settings of the number of clusters.
Figure 2: Learning the Typical-Terrorist-Behavior
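The learning steps described above (vector generation, clustering, centroid computation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: TF-IDF is assumed as the term weighting (the slides only say "weighted terms"), a plain k-means with cosine similarity stands in for the clustering module, the toy pages use neutral topics in place of collected pages, and every function name is hypothetical.

```python
import math
from collections import Counter

def build_vocabulary(pages):
    """Sorted list of the unique terms found in all pages."""
    return sorted({term for page in pages for term in page.lower().split()})

def inverse_document_frequency(pages, vocabulary):
    """log(N / df) per term; TF-IDF is an assumed weighting scheme."""
    token_sets = [set(page.lower().split()) for page in pages]
    n = len(pages)
    return {t: math.log(n / sum(t in s for s in token_sets)) for t in vocabulary}

def to_vector(page, vocabulary, idf):
    """Vector Generator: one page becomes one vector of weighted terms."""
    counts = Counter(page.lower().split())
    return [counts[t] * idf[t] for t in vocabulary]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Centroid of a cluster: component-wise mean of its vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def k_means(vectors, k, iterations=10):
    """Plain k-means with cosine similarity, seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = max(range(k), key=lambda i: cosine(v, centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_vector(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Toy stand-ins for collected pages (two neutral topics).
pages = [
    "bake bread flour oven",
    "football goal match team",
    "flour oven bread recipe",
    "team match goal league",
]
vocabulary = build_vocabulary(pages)
idf = inverse_document_frequency(pages, vocabulary)
vectors = [to_vector(p, vocabulary, idf) for p in pages]

# The n centroids together play the role of the learned behavior profile.
centroids, clusters = k_means(vectors, k=2)
print([len(c) for c in clusters])
```

Seeding with the first k vectors keeps the sketch deterministic; a real clustering tool would normally use a more careful initialization.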
In the Monitoring module, the Vector Generator
converts the content of each page accessed by a user into
a vector representation (referred to as the ‘access
vector’).
The Detector uses the access vector and the
Typical-Terrorist-Behavior to determine whether the
access vector belongs to a terrorist. This is done by
computing the similarity between the access vector and all
centroid vectors of the Typical-Terrorist-Behavior. The
cosine measure is used to compute the similarity.
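The detector issues an alarm when the similarity between the access vector and the
nearest centroid is higher than the predefined threshold, denoted by tr:

$$\max_{i=1,\dots,n}\left\{\frac{\sum_{j=1}^{m} tCv_{ij}\cdot tAv_{j}}{\sqrt{\sum_{j=1}^{m} tCv_{ij}^{2}}\cdot\sqrt{\sum_{j=1}^{m} tAv_{j}^{2}}}\right\} > tr$$

where Cv_i is the i-th centroid vector (i = 1, ..., n), Av is the access vector,
tCv_{ij} is the j-th term in the vector Cv_i, tAv_j is the j-th term in the vector Av,
and m is the number of unique terms in each vector.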
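The alarm rule can be written out directly: compute the cosine similarity between the access vector and every centroid, take the largest value, and compare it with tr. The sketch below assumes vectors are plain lists of term weights over the same m terms; the function names and the toy numbers are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def detect(access_vector, centroids, tr):
    """Return (alarm, best_similarity, index_of_nearest_centroid)."""
    similarities = [cosine(cv, access_vector) for cv in centroids]
    nearest = max(range(len(similarities)), key=similarities.__getitem__)
    # Alarm only when the nearest centroid is more similar than tr.
    return similarities[nearest] > tr, similarities[nearest], nearest

# Two toy centroids over m = 4 terms.
centroids = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
close_access = [2.0, 2.0, 0.0, 0.0]   # same direction as the first centroid
mixed_access = [1.0, 0.0, 1.0, 0.0]   # halfway between both centroids

print(detect(close_access, centroids, tr=0.8))
print(detect(mixed_access, centroids, tr=0.8))
```

Because cosine similarity depends only on direction, the length of a page (the magnitude of its vector) does not by itself trigger or suppress an alarm.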
The threshold parameter tr controls the sensitivity of the detection. A higher value of
tr will decrease the sensitivity of the detection process, decrease the number of alarms,
increase the accuracy, and decrease the number of false alarms. A lower value of tr will
increase the sensitivity of the detection process, increase the number of alarms and false
alarms, and decrease the accuracy. The optimal value of tr depends on the preferences
of the system user. In the next section, the feasibility of the new methodology is
explored using a case study.
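The effect of tr can be illustrated by counting alarms over a fixed set of similarity scores at several thresholds. The scores and labels below are invented purely for illustration and are not results from the study; the function name is hypothetical.

```python
def alarm_counts(scored_accesses, tr):
    """Count alarms for (similarity, is_relevant) pairs at threshold tr.

    Returns (total_alarms, true_alarms, false_alarms).
    """
    alarms = [(s, relevant) for s, relevant in scored_accesses if s > tr]
    true_alarms = sum(1 for _, relevant in alarms if relevant)
    return len(alarms), true_alarms, len(alarms) - true_alarms

# Made-up similarity scores; True marks an access that should raise an alarm.
scores = [
    (0.92, True), (0.75, True), (0.40, True),
    (0.55, False), (0.30, False), (0.10, False),
]

for tr in (0.2, 0.5, 0.8):
    print(tr, alarm_counts(scores, tr))
```

In this toy data, raising tr from 0.2 to 0.8 removes both false alarms but also drops two of the three relevant accesses, which is the trade-off the system user has to calibrate.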
An initial evaluation of the proposed knowledge-based detection methodology was
conducted through a prototype system. The experimental environment included a
small network of nine computers, each having a constant IP address,
and a proxy server through which all the computers accessed the Web. In the
experiment, the proxy server represented an ISP. Eight students in information
systems engineering were instructed to access Web sites related to general topics
and generated about 800 transactions. Several other users were requested to
access terrorist-related information (mainly the ‘Azzam Publications’ sites visited
by one of the September 11 terrorists) and generated about 214 transactions. In
this experiment, the users accessed only pages in English, though the
methodology is readily applicable to other languages as well. The Vector
Generator, Clustering and Detector modules described above were implemented
and installed inside the server. The vcluster program from the Cluto Clustering Tool
(Karypis 2002) was used to implement the clustering module; the clustering
algorithm used was ‘k-way’, and the similarity between the objects was computed
using the cosine measure (Boger et al. 2001; Pierrea et al. 2000). The ‘k-way’
clustering algorithm is a variation of the popular K-Means clustering algorithm.
One problem with these algorithms is that it is hard to find the optimal k
(number of clusters) that will achieve the best clustering performance.
• The experiments were run with different values of k, and the results were compared.
A system implementing the new methodology can be deployed by law
enforcement agencies in two different ways, each with its own advantages and
disadvantages.
ISP-based system: The system implementing the new methodology is deployed
within the ISP infrastructure. The major advantage of such a deployment is that
the ISP is able to provide the exact identity of a suspicious user detected by the
system, since the IP address is allocated to the user by the ISP. The disadvantages
are that it requires ISP awareness and cooperation, and that the privacy of the ISP
subscribers is violated.
Network-based system: The system implementing the new methodology eavesdrops
on the communication lines connecting the ISPs to the Internet backbone. The
major advantage of such a deployment is that ISP cooperation is not required and
the privacy of the ISP subscribers is protected, since most ISPs allocate a
temporary IP address to every user. The major disadvantage is that the exact
identity of a subscriber using a given IP address is unknown.
It is difficult to deploy a system based on the new methodology effectively at sites
that provide Internet access to casual users, such as Internet cafés or hot spots,
since users are not required to identify themselves to the service operator.
In this paper, an innovative, knowledge-based
methodology for terrorist activity detection on the Web
is presented. The results of an initial case study suggest
that the methodology can be useful for detecting
terrorists and their supporters who use legitimate ways of
Internet access to view terror-related content at a series
of evasive Web sites.
1. Document representation
2. Similarity Measures
3. Detection methodology
4. Computational complexity
5. Optimal settings
6. Analyzing pictures and binaries
• Y. Elovici, A. Kandel, M. Last, B. Shapira, O. Zaafrany, “Using
Data Mining Techniques for Detecting Terror-Related
Activities on the Web”, Department of Information Systems
Engineering, Ben-Gurion University of the Negev,
Beer-Sheva 84105, Israel.
• Data mining techniques ebooks by Berry and Linoff
(http://www.gobookee.org/data-mining-techniques-berry-linoff/)
• Research notes of Laura Ruotsalainen.
data mining for terror attacks
ADRISYA: A FLOW BASED ANOMALY DETECTION SYSTEM FOR SLOW AND FAST SCAN
 
A web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tamA web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tam
 
Machine Learning Techniques Used for the Detection and Analysis of Modern Typ...
Machine Learning Techniques Used for the Detection and Analysis of Modern Typ...Machine Learning Techniques Used for the Detection and Analysis of Modern Typ...
Machine Learning Techniques Used for the Detection and Analysis of Modern Typ...
 
Clickstream ppt copy
Clickstream ppt   copyClickstream ppt   copy
Clickstream ppt copy
 
Intrusion detection and anomaly detection system using sequential pattern mining
Intrusion detection and anomaly detection system using sequential pattern miningIntrusion detection and anomaly detection system using sequential pattern mining
Intrusion detection and anomaly detection system using sequential pattern mining
 

More from Nilu Desai

Adversarial search
Adversarial searchAdversarial search
Adversarial searchNilu Desai
 
collaborative study on the cloud
collaborative study on the cloudcollaborative study on the cloud
collaborative study on the cloudNilu Desai
 
digital signature for SMS security
digital signature for SMS securitydigital signature for SMS security
digital signature for SMS securityNilu Desai
 
Cookie replay attack unit wise presentation
Cookie replay attack  unit wise presentationCookie replay attack  unit wise presentation
Cookie replay attack unit wise presentationNilu Desai
 
deadlock prevention
deadlock preventiondeadlock prevention
deadlock preventionNilu Desai
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactionsNilu Desai
 
Iris recognition system
Iris recognition systemIris recognition system
Iris recognition systemNilu Desai
 

More from Nilu Desai (7)

Adversarial search
Adversarial searchAdversarial search
Adversarial search
 
collaborative study on the cloud
collaborative study on the cloudcollaborative study on the cloud
collaborative study on the cloud
 
digital signature for SMS security
digital signature for SMS securitydigital signature for SMS security
digital signature for SMS security
 
Cookie replay attack unit wise presentation
Cookie replay attack  unit wise presentationCookie replay attack  unit wise presentation
Cookie replay attack unit wise presentation
 
deadlock prevention
deadlock preventiondeadlock prevention
deadlock prevention
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactions
 
Iris recognition system
Iris recognition systemIris recognition system
Iris recognition system
 

Recently uploaded

Manipur-Book-Final-2-compressed.pdfsal'rpk
Manipur-Book-Final-2-compressed.pdfsal'rpkManipur-Book-Final-2-compressed.pdfsal'rpk
Manipur-Book-Final-2-compressed.pdfsal'rpkbhavenpr
 
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...Axel Bruns
 
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victory
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep VictoryAP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victory
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victoryanjanibaddipudi1
 
Opportunities, challenges, and power of media and information
Opportunities, challenges, and power of media and informationOpportunities, challenges, and power of media and information
Opportunities, challenges, and power of media and informationReyMonsales
 
Identifying & Combating Misinformation w/ Fact Checking Tools
Identifying & Combating Misinformation w/ Fact Checking ToolsIdentifying & Combating Misinformation w/ Fact Checking Tools
Identifying & Combating Misinformation w/ Fact Checking ToolsUjjwal Acharya
 
26042024_First India Newspaper Jaipur.pdf
26042024_First India Newspaper Jaipur.pdf26042024_First India Newspaper Jaipur.pdf
26042024_First India Newspaper Jaipur.pdfFIRST INDIA
 
Different Frontiers of Social Media War in Indonesia Elections 2024
Different Frontiers of Social Media War in Indonesia Elections 2024Different Frontiers of Social Media War in Indonesia Elections 2024
Different Frontiers of Social Media War in Indonesia Elections 2024Ismail Fahmi
 
Referendum Party 2024 Election Manifesto
Referendum Party 2024 Election ManifestoReferendum Party 2024 Election Manifesto
Referendum Party 2024 Election ManifestoSABC News
 
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptx
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptxLorenzo D'Emidio_Lavoro sullaNorth Korea .pptx
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptxlorenzodemidio01
 
Top 10 Wealthiest People In The World.pdf
Top 10 Wealthiest People In The World.pdfTop 10 Wealthiest People In The World.pdf
Top 10 Wealthiest People In The World.pdfauroraaudrey4826
 
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...Ismail Fahmi
 
Brief biography of Julius Robert Oppenheimer
Brief biography of Julius Robert OppenheimerBrief biography of Julius Robert Oppenheimer
Brief biography of Julius Robert OppenheimerOmarCabrera39
 
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service Kolhapur
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service KolhapurCollege Call Girls Kolhapur Aanya 8617697112 Independent Escort Service Kolhapur
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service KolhapurCall girls in Ahmedabad High profile
 
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaign
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election CampaignN Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaign
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaignanjanibaddipudi1
 
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...Pooja Nehwal
 
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书如何办理(BU学位证书)美国贝翰文大学毕业证学位证书
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书Fi L
 
Vashi Escorts, {Pooja 09892124323}, Vashi Call Girls
Vashi Escorts, {Pooja 09892124323}, Vashi Call GirlsVashi Escorts, {Pooja 09892124323}, Vashi Call Girls
Vashi Escorts, {Pooja 09892124323}, Vashi Call GirlsPooja Nehwal
 
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docxkfjstone13
 
Quiz for Heritage Indian including all the rounds
Quiz for Heritage Indian including all the roundsQuiz for Heritage Indian including all the rounds
Quiz for Heritage Indian including all the roundsnaxymaxyy
 
23042024_First India Newspaper Jaipur.pdf
23042024_First India Newspaper Jaipur.pdf23042024_First India Newspaper Jaipur.pdf
23042024_First India Newspaper Jaipur.pdfFIRST INDIA
 

Recently uploaded (20)

Manipur-Book-Final-2-compressed.pdfsal'rpk
Manipur-Book-Final-2-compressed.pdfsal'rpkManipur-Book-Final-2-compressed.pdfsal'rpk
Manipur-Book-Final-2-compressed.pdfsal'rpk
 
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...
Dynamics of Destructive Polarisation in Mainstream and Social Media: The Case...
 
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victory
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep VictoryAP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victory
AP Election Survey 2024: TDP-Janasena-BJP Alliance Set To Sweep Victory
 
Opportunities, challenges, and power of media and information
Opportunities, challenges, and power of media and informationOpportunities, challenges, and power of media and information
Opportunities, challenges, and power of media and information
 
Identifying & Combating Misinformation w/ Fact Checking Tools
Identifying & Combating Misinformation w/ Fact Checking ToolsIdentifying & Combating Misinformation w/ Fact Checking Tools
Identifying & Combating Misinformation w/ Fact Checking Tools
 
26042024_First India Newspaper Jaipur.pdf
26042024_First India Newspaper Jaipur.pdf26042024_First India Newspaper Jaipur.pdf
26042024_First India Newspaper Jaipur.pdf
 
Different Frontiers of Social Media War in Indonesia Elections 2024
Different Frontiers of Social Media War in Indonesia Elections 2024Different Frontiers of Social Media War in Indonesia Elections 2024
Different Frontiers of Social Media War in Indonesia Elections 2024
 
Referendum Party 2024 Election Manifesto
Referendum Party 2024 Election ManifestoReferendum Party 2024 Election Manifesto
Referendum Party 2024 Election Manifesto
 
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptx
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptxLorenzo D'Emidio_Lavoro sullaNorth Korea .pptx
Lorenzo D'Emidio_Lavoro sullaNorth Korea .pptx
 
Top 10 Wealthiest People In The World.pdf
Top 10 Wealthiest People In The World.pdfTop 10 Wealthiest People In The World.pdf
Top 10 Wealthiest People In The World.pdf
 
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...
HARNESSING AI FOR ENHANCED MEDIA ANALYSIS A CASE STUDY ON CHATGPT AT DRONE EM...
 
Brief biography of Julius Robert Oppenheimer
Brief biography of Julius Robert OppenheimerBrief biography of Julius Robert Oppenheimer
Brief biography of Julius Robert Oppenheimer
 
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service Kolhapur
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service KolhapurCollege Call Girls Kolhapur Aanya 8617697112 Independent Escort Service Kolhapur
College Call Girls Kolhapur Aanya 8617697112 Independent Escort Service Kolhapur
 
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaign
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election CampaignN Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaign
N Chandrababu Naidu Launches 'Praja Galam' As Part of TDP’s Election Campaign
 
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...
Call Girls in Mira Road Mumbai ( Neha 09892124323 ) College Escorts Service i...
 
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书如何办理(BU学位证书)美国贝翰文大学毕业证学位证书
如何办理(BU学位证书)美国贝翰文大学毕业证学位证书
 
Vashi Escorts, {Pooja 09892124323}, Vashi Call Girls
Vashi Escorts, {Pooja 09892124323}, Vashi Call GirlsVashi Escorts, {Pooja 09892124323}, Vashi Call Girls
Vashi Escorts, {Pooja 09892124323}, Vashi Call Girls
 
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx
2024 03 13 AZ GOP LD4 Gen Meeting Minutes_FINAL.docx
 
Quiz for Heritage Indian including all the rounds
Quiz for Heritage Indian including all the roundsQuiz for Heritage Indian including all the rounds
Quiz for Heritage Indian including all the rounds
 
23042024_First India Newspaper Jaipur.pdf
23042024_First India Newspaper Jaipur.pdf23042024_First India Newspaper Jaipur.pdf
23042024_First India Newspaper Jaipur.pdf
 

data mining for terror attacks

  • 1. Presented by: Nileshwari Desai Under the guidance of: Prashasti Kanikar
  • 2. Introduction; Main design criteria; Background; Content-based detection of terror-related activity; Modes of operation; Learning typical terrorist behavior; Detecting typical terrorist behavior; Case study; Deployment issues; Conclusion; Ongoing research issues; References
  • 3. Terrorist cells are using the Internet infrastructure to exchange information and recruit new members and supporters. For example, high-speed Internet connections were used intensively by members of the infamous ‘Hamburg Cell’ that was largely responsible for the preparation of the September 11 attacks against the United States. It is believed that the detection of terrorists on the Web might prevent further terrorist attacks. One way to detect terrorist activity on the Web is to eavesdrop on all traffic of Web sites associated with terrorist organizations in order to detect the accessing users based on their IP addresses. Unfortunately, it is difficult to monitor terrorist sites, since they do not use fixed IP addresses and URLs. The geographical locations of the Web servers hosting those sites also change frequently in order to prevent successful eavesdropping. To overcome this problem, law enforcement agencies are trying to detect terrorists by monitoring all ISP traffic, though the privacy issues raised still prevent the relevant laws from being enforced.
  • 4. (1) Training of the detection algorithm should be based on the content of existing terrorist sites and known terrorist traffic on the Web. (2) Detection should be carried out in real time. This goal can be achieved only if terrorist information interests are represented in a compact manner for efficient processing. (3) The detection sensitivity should be controlled by user-defined parameters to enable calibration of the desired detection performance.
  • 5. This research integrates issues from three research fields: 1) computer security (intrusion detection systems); 2) information retrieval (the vector-space model); 3) data mining (cluster analysis).
  • 6. Detection environment: This study suggests a new type of knowledge-based detection methodology that uses the content of Web pages browsed by terrorists and their supporters as the input to a detection process. In this study, ‘content’ refers only to the textual content of Web pages, excluding images, music, video clips, and other complex data types. It is assumed that terror-related content usually viewed by terrorists and their supporters can be used as training data for a learning process to obtain a ‘Typical-Terrorist-Behavior’. This typical behavior is then used to detect further terrorists and their supporters. A ‘Typical-Terrorist-Behavior’ is defined as an access to information relevant to terrorists and their supporters. A general description of a system based on the suggested methodology is presented in Figure 1. Each user under surveillance is identified as a ‘user's computer’ having a unique IP address rather than by his or her name. In the case of a real-time alarm, the detected IP address can be used to locate the computer and, hopefully, the suspected terrorist, who may still be logged on to the same computer.
  • 7.
  • 8. The system has two modes of operation. Learning typical terrorist behavior: In this mode, a collection of Web pages from terror-related sites is downloaded and represented as a set of vectors using the vector-space model. The collected data is used to derive and represent the typical behavior (‘profile’) of the terrorists and their supporters by applying unsupervised clustering techniques. Since the IP addresses of the downloaded pages are ignored, moving the same or similar content to a new address, as terror-related sites frequently do, does not affect the detection accuracy of the new method. Monitoring users: This mode is aimed at detecting terrorist users by comparing the content of the information accessed by users to the Typical-Terrorist-Behavior. The textual content of the information accessed by a user on the Web is converted into a vector called the ‘access vector’. An alarm is issued when the similarity between the access vector and the Typical-Terrorist-Behavior is above a predefined threshold. The privacy of regular users is preserved, since the system does not need to store either the visited IP addresses or the actual content of the viewed pages. Because of extensive dimensionality reduction procedures, the access vectors do not hold sufficient information to restore the actual content of a page. Other types of viewed content (e.g., images) are ignored by the system.
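The conversion of page text into weighted term vectors described above can be sketched as follows. This is a minimal illustration, not the Vector Generator used in the study: the function names `tokenize` and `tfidf_vectors`, the regular-expression tokenizer, and the particular TF-IDF weighting are all assumptions chosen for brevity, and the slides do not state which weighting scheme was used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(pages):
    """Convert each page (a string) into a dict mapping term -> weight.

    Weight = relative term frequency * (log(N / df) + 1), where N is the
    number of pages and df is the number of pages containing the term.
    This weighting is an assumption made for illustration.
    """
    token_lists = [tokenize(page) for page in pages]
    n_pages = len(pages)
    # Document frequency: in how many pages does each term occur?
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty pages
        vectors.append({
            term: (count / total) * (math.log(n_pages / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    # Two toy "pages"; real input would be the text of downloaded Web pages.
    for vec in tfidf_vectors(["alpha beta beta", "alpha gamma"]):
        print(vec)
```

Only the term weights are kept; neither the page address nor the original text is stored in the resulting vector, which is in the spirit of the privacy argument on this slide.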
  • 9. The learning part of the methodology defines and represents the typical behavior of terrorist users based on the content of their Web activities. Figure 2 describes the learning module. It is assumed that it is possible to collect Web pages from terror-related sites. The content of the collected pages is the input to the Vector Generator module, which converts the pages into vectors of weighted terms (each page is converted to one vector). The vectors are stored for future processing in the Vector of Terrorists Transactions DB. The Clustering module accesses the collected vectors and performs unsupervised clustering, resulting in n clusters that represent the typical topics viewed by terrorist users. For each cluster, the Terrorist-Representor module computes the centroid vector (denoted by Cvi), which represents a topic typically accessed by terrorists. As a result, a set of centroid vectors represents a set of terrorists' interests, referred to as the ‘Typical-Terrorist-Behavior’. The Typical-Terrorist-Behavior is based on a set of Web pages that were downloaded from terrorist-related sites and is the main input of the detection algorithm. To keep the detection algorithm accurate, the process of generating the Typical-Terrorist-Behavior has to be repeated periodically, because the content of terrorist-related sites changes. The Typical-Terrorist-Behavior depends on the number of clusters. When the number of clusters is higher, the Typical-Terrorist-Behavior includes more topics of interest to terrorists, and each topic is based on fewer pages. It is hard to hypothesize what the optimal number of clusters is. In the case study presented in the next section, detection performance is reported for two settings of the number of clusters.
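The clustering and centroid steps of the learning module can be illustrated with a short sketch. It uses a plain k-means loop with cosine similarity as a stand-in for the clustering tool used in the study, and it assumes the page vectors are already dense lists of equal length. The names `cosine`, `centroid`, and `cluster_centroids`, and the choice of the first k vectors as starting centroids, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    size = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(size)]


def cluster_centroids(vectors, k, iterations=20):
    """Group vectors into k clusters and return the k centroid vectors.

    Assumes len(vectors) >= k. Initialisation with the first k vectors is
    a simplification; real tools use more careful seeding.
    """
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            # Assign each vector to the most similar centroid.
            best = max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            groups[best].append(v)
        # Recompute centroids; keep the old one if a cluster became empty.
        updated = [centroid(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    pages = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
    print(cluster_centroids(pages, k=2))
```

The returned list plays the role of the set of centroid vectors Cvi; rerunning the function on freshly collected page vectors corresponds to the periodic regeneration mentioned on the slide.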
  • 11. In the Monitoring module, the Vector Generator converts the content of each page accessed by a user into a vector representation (referred to as the ‘access vector’). The Detector uses the access vector and the Typical-Terrorist-Behavior to determine whether the access vector belongs to a terrorist. This is done by computing the similarity between the access vector and all centroid vectors of the Typical-Terrorist-Behavior. The cosine measure is used to compute the similarity.
  • 12. The detector issues an alarm when the similarity between the access vector and the nearest centroid is higher than the predefined threshold, denoted by tr:
    $$\max_{1 \le j \le n} \frac{\sum_{i=1}^{m} tCv_{ij} \cdot tAv_{i}}{\sqrt{\sum_{i=1}^{m} tCv_{ij}^{2}} \cdot \sqrt{\sum_{i=1}^{m} tAv_{i}^{2}}} > tr$$
    where $Cv_j$ is the j-th centroid vector, $Av$ is the access vector, $tCv_{ij}$ is the i-th term in the vector $Cv_j$, $tAv_i$ is the i-th term in the vector $Av$, n is the number of centroids, and m is the number of unique terms in each vector. The threshold parameter tr controls the sensitivity of the detection. A higher value of tr decreases the sensitivity of the detection process: it reduces the number of alarms, increases the accuracy, and reduces the number of false alarms. A lower value of tr increases the sensitivity: it increases the number of alarms and false alarms and decreases the accuracy. The optimal value of tr depends on the preferences of the system user. In the next section, the feasibility of the new methodology is explored using a case study.
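The detection rule on this slide (cosine similarity to the nearest centroid compared against tr) can be written down directly. The sketch below assumes dense vectors of equal length; the function names `cosine` and `detect` and the sample numbers are hypothetical and serve only to show the comparison.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect(access_vector, centroids, tr):
    """Return (alarm, best_similarity) for one access vector.

    alarm is True when the similarity to the nearest centroid exceeds
    the threshold tr.
    """
    best = max(cosine(access_vector, c) for c in centroids)
    return best > tr, best


if __name__ == "__main__":
    # Made-up centroids and access vectors over a three-term vocabulary.
    centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
    print(detect([0.0, 2.0, 2.0], centroids, tr=0.8))
    print(detect([1.0, 1.0, 0.0], centroids, tr=0.8))
```

Because cosine similarity ignores vector length, a long page and a short page on the same topic produce similar scores, which is one reason the measure suits page-level comparison.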
  • 13.
  • 14. An initial evaluation of the proposed knowledge-based detection methodology was conducted through a prototype system. The experimental environment included a small network of nine computers, each with a constant IP address, and a proxy server through which all the computers accessed the Web. In the experiment, the proxy server represented an ISP. Eight students in information systems engineering were instructed to access Web sites related to general topics and generated about 800 transactions. Several other users were asked to access terrorist-related information (mainly the ‘Azzam Publications’ sites visited by one of the September 11 terrorists) and generated about 214 transactions. In this experiment, the users accessed only pages in English, though the methodology is readily applicable to other languages as well. The Vector Generator, Clustering, and Detector modules described above were implemented and installed on the server. The vcluster program from the Cluto Clustering Tool (Karypis 2002) was used to implement the clustering module; the clustering algorithm used was ‘k-way’, and the similarity between objects was computed using the cosine measure (Boger et al. 2001; Pierrea et al. 2000). The ‘k-way’ clustering algorithm is a variation of the popular K-Means clustering algorithm. One problem with these algorithms is that it is hard to find the optimal k (number of clusters) that achieves the best clustering performance. The experiments were therefore run with different values of k, and the results were compared.
  • 15. A system implementing the new methodology can be deployed by law enforcement agencies in two different ways, each with its own advantages and disadvantages. ISP-based system: The system may be deployed within the ISP infrastructure. The major advantage of such a deployment is that the ISP can provide the exact identity of a suspicious user detected by the system, since the IP address is allocated to the user by the ISP. The disadvantages are that it requires ISP awareness and cooperation, and that the privacy of the ISP subscribers is violated. Network-based system: The system eavesdrops on the communication lines connecting the ISPs to the Internet backbone. The major advantage of such a deployment is that ISP cooperation is not required and the privacy of the ISP subscribers is protected, since most ISPs allocate a temporary IP address to every user. The major disadvantage is that the exact identity of a subscriber using a given IP address is unknown. In either case, it is difficult to deploy the system effectively at sites that provide Internet access to casual users, such as Internet cafés or hot spots, since users are not required to identify themselves to the service operator.
  • 16. In this paper, an innovative, knowledge-based methodology for terrorist activity detection on the Web is presented. The results of an initial case study suggest that the methodology can be useful for detecting terrorists and their supporters who use legitimate means of Internet access to view terror-related content at a series of evasive Web sites.
  • 17. Ongoing research issues: 1. Document representation 2. Similarity measures 3. Detection methodology 4. Computational complexity 5. Optimal settings 6. Analyzing pictures and binaries
  • 18. References: 1. Y. Elovici, A. Kandel, M. Last, B. Shapira, O. Zaafrany, “Using Data Mining Techniques for Detecting Terror-Related Activities on the Web”, Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel. 2. Data mining techniques ebooks by Berry and Linoff (http://www.gobookee.org/data-mining-techniques-berry-linoff/). 3. Research notes of Laura Ruotsalainen.
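An experiment of this kind is typically summarised by how many terror-related transactions raise an alarm and how many ordinary transactions do so by mistake, for each setting that is tried. The sketch below shows one way to tabulate this for several values of the threshold tr. The function name `alarm_rates` and the similarity scores are made up for illustration and are not results from the case study.

```python
def alarm_rates(scores, is_terror, thresholds):
    """For each threshold, return (tr, detection_rate, false_alarm_rate).

    scores[i] is the best centroid similarity of transaction i, and
    is_terror[i] says whether that transaction was terror-related.
    """
    positives = sum(1 for flag in is_terror if flag)
    negatives = len(is_terror) - positives
    results = []
    for tr in thresholds:
        # An alarm is raised when the score exceeds the threshold.
        hits = sum(1 for s, flag in zip(scores, is_terror) if flag and s > tr)
        false_alarms = sum(1 for s, flag in zip(scores, is_terror) if not flag and s > tr)
        results.append((
            tr,
            hits / positives if positives else 0.0,
            false_alarms / negatives if negatives else 0.0,
        ))
    return results


if __name__ == "__main__":
    # Hypothetical scores: two terror-related and two ordinary transactions.
    scores = [0.9, 0.6, 0.3, 0.1]
    labels = [True, True, False, False]
    for row in alarm_rates(scores, labels, thresholds=[0.2, 0.5, 0.8]):
        print(row)
```

The same table can be produced once per value of k, so that the settings are compared on equal terms.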