ISSN: 2277 – 9043
International Journal of Advanced Research in Computer Science and Electronics Engineering (IJARCSEE)
Volume 1, Issue 6, August 2012
Security threats to data mining and analysis
tools of TIA program
Swati Vashisht, Lecturer at CSE deptt., DIT SE Gr. Noida
Divya Singh, Lecturer at CSE deptt., DIT SE Gr. Noida
Bhanu Prakash Lohani, Lecturer at CSE deptt., DIT SE Gr. Noida
Abstract: Data mining is the process of discovering patterns in large data sets. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection) and dependencies (association rule mining). This usually involves database techniques such as spatial indexes. These patterns can then be seen as a kind of summary of the input data and may be used in further analysis, for example in machine learning and predictive analytics. As the internet has become involved in all areas of human activity, there are increasing concerns that data mining may pose a threat to our privacy and security, so security is one of the major issues to monitor. In this paper we present recent research on data mining and its security, and we prepare a survey report on data mining for crime detection.

Index Terms: data mining and security, intrusion detection, terrorist attack.

I. INTRODUCTION

Data mining is the process of discovering new patterns from large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems. It is the process of analyzing data from different perspectives and summarizing it into useful information, for example information that can predict the success of a marketing campaign, reveal patterns in financial transactions that point to illegal activities, or support the analysis of genome sequences.[1]

For mining decisions, data can be grouped according to the following categories:

• Data classes: Stored data is used to locate data in predetermined groups.
• Data clusters: Data items are grouped according to logical relationships or consumer preferences.
• Data associations: Data can be mined to identify associations.
• Sequential patterns: Data is mined to anticipate behavior patterns and trends.
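
As a minimal sketch of the "data associations" category above, the following Python snippet counts how often pairs of items occur together in a few hypothetical transactions and derives the support and confidence of simple one-to-one rules. The transaction data, the function name `pair_rules` and the `min_support` threshold are assumptions made for illustration; they are not taken from any of the systems surveyed in this paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions; each set is one basket of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(transactions, min_support=0.5):
    """Return {(a, b): (support, confidence)} for one-to-one rules a -> b.

    support    = fraction of transactions containing both a and b
    confidence = fraction of transactions containing a that also contain b
    """
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to report
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules


rules = pair_rules(transactions)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data, the rule butter -> bread has confidence 1.0, because every basket that contains butter also contains bread, while the pair butter and milk is dropped for falling below the support threshold. Practical association mining handles larger item sets and uses more efficient algorithms; this sketch only shows the two basic measures.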
All Rights Reserved © 2012 IJARCSEE
II. POSSIBLE THREATS TO SECURITY

A. Predict information about classified work from correlation with unclassified work:

Classification is a data mining technique used to predict group membership for data instances, in which data instances are classified based on their feature values. Predictive analysis could be applied to predict future patterns by providing a record of the past that can be analyzed more effectively on classified data. Unclassified work may involve duplicate and redundant data, which is difficult to manage.[2]

A correlation is an index of the strength of the relationship between two variables.

B. Detect “hidden” information based on “conspicuous” lack of information:

Data mining techniques are basically used to detect hidden information in large databases. Query generators and data interpretation components combine with discovery-driven systems to reveal hidden data.

C. Mining “Open Source” data to determine predictive events:

Predictive analysis is a way to use data to predict future patterns. It is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns. The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting them to predict future outcomes.

D. Intrusion Detection

An intrusion can be defined as "any set of actions that attempt to compromise the integrity, confidentiality or availability of a resource". Intrusion prevention techniques, such as user authentication (e.g. using passwords or biometrics), avoiding programming errors, and information protection (e.g., encryption), have been used to protect computer systems as a first line of defense.[5] An intrusion detection system produces reports, whereas an intrusion prevention system is placed in-line and is able to actively prevent or block intrusions that are detected. Intrusion detection systems identify malicious activity, log information about that activity and report it.[2]

III. TO IMPROVE SECURITY

• For privacy concerns, only authorized users should have access to privacy-sensitive information such as credit card transaction records, health care records, biological traits, criminal investigations and ethnicity. So various data mining enhancing techniques have been developed to help protect data. Databases can employ a multilevel security model to classify and restrict data according to various security levels, with users permitted access only to their authorized levels.[2]

• For security concerns, data mining can be used for crime detection and prevention through programs such as the TIA (Terrorism Information Awareness) program. This project was to focus on three specific areas of research: language translation, data search with pattern recognition and privacy protection, and advanced collaborative and
decision support tools.[9] In CAPPS-II (Computer-Assisted Passenger Prescreening System), when a person books a plane ticket, certain identifying information is collected by the airline: full name, address, etc. This information is checked against some data store (e.g., a TSA No-Fly list, the FBI ten most wanted fugitive list, etc.) and used to assign a terrorism "risk score" to that person. High risk scores require the airline to subject the person to extended baggage and/or personal screening, and to contact law enforcement if necessary. MATRIX (Multistate Anti-terrorism Information Exchange) leverages advanced computer management capabilities to more quickly access, share and analyze public records to help law enforcement generate leads, expedite investigations, and possibly prevent terrorist attacks.[3]

IV. CONCLUSION

Data mining involves data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. In the TIA (Terrorism Information Awareness) program, a data mining application is designed to identify potential terrorist suspects in a large pool of individuals using a statistical approach in which the user is tested against a predesigned model that includes information about known terrorists. However, while possibly re-affirming a particular profile, this does not necessarily mean that the application will identify an individual whose behavior significantly deviates from the original model; conversely, an individual may be considered a suspect if some information matches the original model.

V. FUTURE WORK

We are in the initial stage of our research, and much remains to be done, including the following task:

In the TIA program, person identification must not be based on a statistical approach, i.e. comparing with a standard model and known behavioral patterns. We are trying to design a technology-based analysis tool for the Terrorism Information Awareness program.

REFERENCES

[1] www.anderson.ucla.edu/faculty/jason.frand/teacher/.../datamining.htm
[2] Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques
[3] William J. Krouse, The Multi-State Anti-Terrorism Information Exchange (MATRIX) Pilot Project
[4] Gerhard Paaß, Wolf Reinhardt, Stefan Rüping, and Stefan Wrobel, Data Mining for Security and Crime Detection
[5] Wenke Lee and Salvatore J. Stolfo, Data Mining Approaches for Intrusion Detection
[6] Sushmita Mitra, Sankar K. Pal, Pabitra Mitra, Data Mining in Soft Computing Framework: A Survey
[7] Varun Chandola, Eric Eilertson, Levent Ertöz, György Simon and Vipin Kumar, Data Mining for Cyber Security
[8] Bhavani Thuraisingham, Latifur Khan, Mohammad M. Masud, Kevin W. Hamlen, Data Mining for Security Applications
[9] Jeffrey W. Seifert, Data Mining and Homeland Security: An Overview
[10] Anshu Veda, Prajakta Kalekar, Anirudha Bodhankar, Intrusion Detection Using Datamining Techniques
Authors’ profiles

Swati Vashisht holds a bachelor's degree in Information Technology and is pursuing a Master's in Computer Science & Engineering. Her areas of interest are data mining and warehousing, and operating systems.

Divya Singh holds a bachelor's degree in Computer Science & Engg. and is pursuing a Master's in CSE from Amity University. Her areas of interest are computer networks and data mining.

Bhanu Prakash Lohani holds a bachelor's degree in Computer Science & Engg. and is pursuing a Master's in CSE from Amity University. His areas of interest are computer networks and data mining.