ISSN: 2277 – 9043
International Journal of Advanced Research in Computer Science and Electronics Engineering (IJARCSEE), Volume 1, Issue 6, August 2012
All Rights Reserved © 2012 IJARCSEE

Security threats to data mining and analysis tools of TIA program

Swati Vashisht, Divya Singh, Bhanu Prakash Lohani
Lecturers at CSE deptt., DIT SE, Gr. Noida

Abstract: Data mining is the process that attempts to discover patterns in large data sets. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection) and dependencies (association rule mining). This usually involves using database techniques such as spatial indexes. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. As the internet has become involved in all areas of human activity, there are increasing concerns that data mining may pose a threat to our privacy and security, so security is one of the major issues to monitor. In this paper we present recent research on data mining and its security. We prepare a survey report on data mining for crime detection.

Index Terms: data mining and security, intrusion detection, terrorist attack.

I. INTRODUCTION

Data mining is the process of discovering new patterns from large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems. It is the process of analyzing data from different perspectives and summarizing it into useful information, for example information that can predict the success of a marketing campaign, reveal patterns in financial transactions that point to illegal activities, or support the analysis of genome sequences.[1]

For mining decisions, data can be grouped according to the following categories:

• Data classes: Stored data is used to locate data in predetermined groups.
• Data clusters: Data items are grouped according to logical relationships or consumer preferences.
• Data associations: Data can be mined to identify associations.
• Sequential patterns: Data is mined to anticipate behavior patterns and trends.
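To make the "data associations" category more concrete, the following is a minimal sketch of how support and confidence might be computed for pairwise association rules. It is an illustration only, not a method taken from this paper: the transaction data, the function name pair_rules and the min_support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions (assumption: not data from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_rules(transactions, min_support=0.5):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    support    = fraction of transactions containing both items
    confidence = support of the pair / support of the antecedent
    """
    n = len(transactions)
    # Count how often each single item and each unordered pair occurs.
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # skip infrequent pairs
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


for antecedent, consequent, support, confidence in pair_rules(transactions):
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Only item pairs are considered here to keep the sketch short; practical association rule mining (for example, Apriori-style algorithms) extends the same support and confidence measures to larger item sets.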
II. POSSIBLE THREATS TO SECURITY

A. Predict information about classified work from correlation with unclassified work:

Classification is a data mining technique used to predict group membership for data instances, in which the instances are classified based on their feature values. Predictive analysis could be applied to predict future patterns by providing a record of the past that can be analyzed more effectively on classified data. Unclassified work may involve duplicate and redundant data, which is difficult to manage.[2]

A correlation is an index of the strength of the relationship between two variables.

B. Detect “hidden” information based on “conspicuous” lack of information:

Data mining techniques are basically used to detect hidden information in large databases. Query generators and data interpretation components combine with discovery-driven systems to reveal hidden data.

C. Mining “Open Source” data to determine predictive events:

Predictive analysis is a way to use data to predict future patterns. It is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns. The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting them to predict future outcomes.

D. Intrusion Detection

An intrusion can be defined as "any set of actions that attempt to compromise the integrity, confidentiality or availability of a resource". Intrusion prevention techniques, such as user authentication (e.g., using passwords or biometrics), avoiding programming errors, and information protection (e.g., encryption), have been used to protect computer systems as a first line of defense.[5] An intrusion detection system produces reports, whereas an intrusion prevention system is placed in-line and is able to actively prevent or block intrusions that are detected. Intrusion detection systems identify malicious activity, log information about that activity and report it.[2]

III. TO IMPROVE SECURITY

• For privacy concerns, only authorized users should have access to privacy-sensitive information such as credit card transaction records, health care records, biological traits, criminal investigations and ethnicity. Various techniques that enhance data mining have therefore been developed to help protect data. Databases can employ a multilevel security model to classify and restrict data according to various security levels, with users permitted access only to their authorized levels.[2]

• For security concerns, data mining can be used for crime detection and prevention through various programs such as the TIA (Terrorism Information Awareness) program. This project was to focus on three specific areas of research: language translation; data search with pattern recognition and privacy protection; and advanced collaborative and
decision supportive tools.[9] In CAPPS-II (Computer-Assisted Passenger Prescreening System), when a person books a plane ticket, certain identifying information is collected by the airline: full name, address, etc. This information is checked against some data store (e.g., a TSA No-Fly list, the FBI ten most wanted fugitives list, etc.) and used to assign a terrorism "risk score" to that person. High risk scores require the airline to subject the person to extended baggage and/or personal screening, and to contact law enforcement if necessary. MATRIX (Multistate Anti-Terrorism Information Exchange) leverages advanced computer management capabilities to more quickly access, share and analyze public records to help law enforcement generate leads, expedite investigations, and possibly prevent terrorist attacks.[3]

IV. CONCLUSION

Data mining involves data analysis tools that discover previously unknown, valid patterns and relationships in large data sets. In the TIA (Terrorism Information Awareness) program, a data mining application is designed to identify potential terrorist suspects in a large pool of individuals using a statistical approach, in which the user is tested against a predesigned model that includes information about known terrorists. However, while possibly re-affirming a particular profile, this does not necessarily mean that the application will identify an individual whose behavior significantly deviates from the original model. Conversely, an individual may be considered a suspect merely because some information matches the original model.

V. FUTURE WORK

We are at an initial stage of our research, and much remains to be done, including the following task: in the TIA program, person identification must not be based on a statistical approach, i.e., comparison with a standard model and known behavioral patterns. We are trying to design a technology-based analysis tool for the Terrorism Information Awareness program.

REFERENCES

[1] www.anderson.ucla.edu/faculty/jason.frand/teacher/.../datamining.htm
[2] Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques
[3] William J. Krouse, The Multi-State Anti-Terrorism Information Exchange (MATRIX) Pilot Project
[4] Gerhard Paaß, Wolf Reinhardt, Stefan Rüping, and Stefan Wrobel, Data Mining for Security and Crime Detection
[5] Wenke Lee and Salvatore J. Stolfo, Data Mining Approaches for Intrusion Detection
[6] Sushmita Mitra, Sankar K. Pal, Pabitra Mitra, Data Mining in Soft Computing Framework: A Survey
[7] Varun Chandola, Eric Eilertson, Levent Ertöz, György Simon and Vipin Kumar, Data Mining for Cyber Security
[8] Bhavani Thuraisingham, Latifur Khan, Mohammad M. Masud, Kevin W. Hamlen, Data Mining for Security Applications
[9] Jeffrey W. Seifert, Data Mining and Homeland Security: An Overview
[10] Anshu Veda, Prajakta Kalekar, Anirudha Bodhankar, Intrusion Detection Using Datamining Techniques

Author’s profile

Swati Vashisht holds a bachelor's degree in Information Technology and is pursuing a Master's in Computer Science & Engineering. Her areas of interest are data mining and warehousing, and operating systems.

Divya Singh holds a bachelor's degree in Computer Science & Engg. and is pursuing a Master's in CSE from Amity University. Her areas of interest are computer networks and data mining.

Bhanu Prakash Lohani holds a bachelor's degree in Computer Science & Engg. and is pursuing a Master's in CSE from Amity University. His areas of interest are computer networks and data mining.