Prediction Methods for Mitigating Computer Security Threats

Transcript

  • 1. Prediction Methods for Mitigating Computer Security Threats. Errin W. Fulp, Department of Computer Science
  • 2. Outline
      • Overview of data mining methods: machine learning tools, techniques, and tasks; preprocessing, data mining, and interpretation; prediction or knowledge discovery.
      • When applied to computer security: large data sets and rare events (at least we hope...); methods for addressing each concern.
      • Example application, function discovery in computer networks: Who is doing what in a computer network? Identify the application based on the pattern of interactions.
  • 3. What is Data Mining?
      • Extracting hidden patterns from data. It can be used to uncover existing hidden patterns, but it cannot uncover patterns that are not already in the data.
      • Typically two major objectives: knowledge discovery (determine facts about the data) and forecasting or prediction (predict future events).
      • Both are relevant to computer security.
  • 4. Steps in the Process
      • Standard data-oriented view of Knowledge Discovery in Databases: Data → (selection) → Target Data → (preprocessing) → Preprocessed Data → (transformation) → Transformed Data → (data mining) → Patterns → (interpretation) → Knowledge.
      • Let's divide this into a process-oriented view: Preprocessing → (transformed data) → Data Mining → (patterns) → Interpretation.
  • 5. Preprocessing Data
      • Once the objective is determined, assemble the data. Again, mining can only uncover existing patterns.
      • Clean the data, removing noise and accounting for missing data. Remove unwanted data that hinders analysis... but what is noise with regard to security? Do we really want to remove outliers?
      • Reduce and transform the data into important feature vectors.
      • [Figure: syslog messages with fields Host, Facility, Level, Tag, Time, and Message (for example local7 notice → tag 189, kern info → tag 6, cron info → tag 78) are preprocessed and transformed into encoded tag sequences; the accompanying plot shows tag number against time (seconds × 10^9).]
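
The tag numbers in the slide's syslog table are consistent with the standard syslog priority value, facility × 8 + severity (local7 notice = 23 × 8 + 5 = 189). A minimal sketch of that encoding step follows, assuming the standard syslog facility and severity codes. The sliding-window helper `windows` is a hypothetical way to turn the tag stream into fixed-length feature vectors and is not taken from the slides.

```python
# Standard syslog facility and severity codes (subset sufficient for the slide).
FACILITY = {"kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4,
            "syslog": 5, "cron": 9, "local7": 23}
SEVERITY = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
            "warning": 4, "notice": 5, "info": 6, "debug": 7}


def tag_number(facility, level):
    """Encode a (facility, level) pair as the syslog priority value."""
    return FACILITY[facility] * 8 + SEVERITY[level]


def windows(tags, size):
    """Hypothetical feature construction: every run of `size` consecutive tags."""
    return [tuple(tags[i:i + size]) for i in range(len(tags) - size + 1)]


# (facility, level) pairs as they appear in the slide's table.
messages = [("local7", "notice"), ("kern", "info"), ("cron", "info"),
            ("auth", "info"), ("kern", "alert")]
tags = [tag_number(facility, level) for facility, level in messages]
features = windows(tags, 3)
```

Each tuple in `features` can then be handed to a classifier as one training or test instance.
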
  • 6. Types of Data Mining
      • Within the process (Preprocessing → Data Mining → Interpretation), the data mining step takes transformed data and produces patterns.
      • Four types: classification, clustering, regression, and rule learning.
  • 7. Classification
      • Arrange data into predefined groups, developed from training. Learn a model (classifier) from labeled training data.
      • Examples include k-nearest neighbor and support vector machines.
      • Typically training is slow, but classification is fast.
      • When applied to security (specifically IDS) [CBK]: (1) cluster the training data using an algorithm; (2) for new data, the distance to the closest cluster is the anomaly score.
      • Assumption: normal data instances belong to specific cluster(s) in the data, while anomalous instances do not. Normal data is closest to the centroid.
      • Can also perform semi-supervised training.
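
The two-step scoring described above can be sketched as follows. This is only an illustration: the clusters are supplied by hand rather than produced by a clustering algorithm, and the Euclidean distance, the function names, and the sample points are all assumptions.

```python
import math


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def anomaly_score(x, centroids):
    """Distance from x to the closest centroid; larger means more anomalous."""
    return min(math.dist(x, c) for c in centroids)


# Step 1: clusters of "normal" training data (made-up points).
clusters = [
    [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)],
    [(10.0, 10.0), (12.0, 10.0)],
]
centroids = [centroid(c) for c in clusters]

# Step 2: score new instances against the closest cluster.
normal_score = anomaly_score((1.0, 1.0), centroids)
odd_score = anomaly_score((1.0, 6.0), centroids)
```

A threshold on the score, chosen from validation data, would turn it into a normal/anomalous decision.
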
  • 8. Clustering
      • Arrange data into groups, but the groups are not predefined. No training data is required, therefore no training time...
      • [Figure: an attack graph (facts such as execCode, netAccess, hacl, and vulExists joined by weighted rules such as "remote exploit of a server program" and "multi-hop access") shown next to its cluster representation.]
      • Examples include k-means clustering and fuzzy clustering.
      • These methods have difficulty with higher-dimensional data [CBK].
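
As a concrete instance of the k-means clustering named on this slide, here is a minimal sketch of Lloyd's algorithm in plain Python. The naive initialisation (first k points), the fixed iteration count, and the sample points are assumptions made for brevity.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if a group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids, groups


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, groups = kmeans(points, k=2)
```

No labels are needed, which matches the slide's point that the groups are not predefined.
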
  • 9. Regression
      • Model the data with the least error. Useful for forecasting and prediction.
      • As applied to security, regression typically has two steps: (1) fit a regression model to the data; (2) for each test instance, the residual determines the anomaly score.
      • The presence of anomalies can affect the robustness of the model.
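
The two regression steps can be sketched with an ordinary least-squares line. The choice of a straight-line model, the function names, and the data are assumptions for illustration; the slide does not name a model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def residual_score(x, y, slope, intercept):
    """Anomaly score: absolute difference between observed and predicted y."""
    return abs(y - (slope * x + intercept))


# Step 1: fit to (made-up) training data that follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Step 2: score test instances by their residual.
expected = residual_score(5.0, 11.0, slope, intercept)
surprising = residual_score(5.0, 20.0, slope, intercept)
```

If the training data itself contained anomalies, the fitted line would shift toward them and their residuals would shrink, which is the robustness concern the slide raises.
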
  • 10. Association Rule Learning
      • Searches for relationships between variables.
      • Learn rules that capture normal behavior; any test instance that is not covered is an anomaly (one-class) [EEGPP06, LSM98].
      • For multi-class: learn rules from training data using an algorithm, where each rule has a confidence value. For each test instance find the best rule; the inverse of its confidence is the anomaly score.
      • Example rules:
          • if UDP is AVERAGE ∧ TCP is AVERAGE then ICMP is AVERAGE
          • if SYN is AVERAGE ∧ FIN is AVERAGE then ICMP is AVERAGE
          • if ICMP is AVERAGE ∧ UDP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
          • if UDP is AVERAGE ∧ FIN is AVERAGE then SYN is AVERAGE
          • if UDP is AVERAGE ∧ SYN is AVERAGE then ICMP is AVERAGE
          • if SYN is AVERAGE then ICMP is AVERAGE
          • if ICMP is AVERAGE ∧ FIN is AVERAGE then SYN is AVERAGE
          • if UDP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE ∧ FIN is AVERAGE then ICMP is AVERAGE
          • if UDP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
          • if ICMP is AVERAGE ∧ TCP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
          • if ICMP is AVERAGE ∧ SYN is AVERAGE then FIN is AVERAGE
          • ...
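
A small sketch of rule confidence and the resulting anomaly score follows. It assumes confidence means the fraction of training records matching the rule's antecedent that also match its consequent, and it takes the "best rule" to be the highest-confidence rule the instance satisfies. The level `HIGH` and the training records are invented for illustration.

```python
def matches(record, conditions):
    """True if the record has every feature/level pair in conditions."""
    return all(record.get(k) == v for k, v in conditions.items())


def confidence(rule, records):
    """Share of records matching the antecedent that also match the consequent."""
    antecedent, consequent = rule
    covered = [r for r in records if matches(r, antecedent)]
    if not covered:
        return 0.0
    return sum(matches(r, consequent) for r in covered) / len(covered)


def anomaly_score(record, rules, records):
    """Inverse confidence of the best satisfied rule; infinite if none applies."""
    best = max((confidence(rule, records) for rule in rules
                if matches(record, rule[0]) and matches(record, rule[1])),
               default=0.0)
    return 1.0 / best if best > 0 else float("inf")


# Made-up training records; the rule is the first one listed on the slide.
train = [
    {"UDP": "AVERAGE", "TCP": "AVERAGE", "ICMP": "AVERAGE"},
    {"UDP": "AVERAGE", "TCP": "AVERAGE", "ICMP": "AVERAGE"},
    {"UDP": "AVERAGE", "TCP": "AVERAGE", "ICMP": "AVERAGE"},
    {"UDP": "AVERAGE", "TCP": "AVERAGE", "ICMP": "HIGH"},
]
rules = [({"UDP": "AVERAGE", "TCP": "AVERAGE"}, {"ICMP": "AVERAGE"})]

rule_confidence = confidence(rules[0], train)
covered_score = anomaly_score(train[0], rules, train)
uncovered_score = anomaly_score(train[3], rules, train)
```

An instance that no rule covers gets an infinite score here, which mirrors the one-class view that uncovered instances are anomalies.
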
  • 11. Interpreting the Results
      • Final step of the process: evaluate the patterns discovered. Not all are valid, and some may be valid only for a limited time period.
      • Standard measures: accuracy, precision, recall, and F-score. Unbalanced test sets are a concern.
      • Overfitting: the model does an excellent job of fitting the data, but not of predicting. It finds patterns in the training set that are not present in the test set.
      • [Figure: data points plotted with an overfit model and the correct model.]
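
The standard measures, and why an unbalanced test set is a concern, can be illustrated with a short sketch. The test set below is invented: 2 attacks among 100 instances, scored against a detector that always answers "normal".

```python
def evaluate(actual, predicted, positive=1):
    """Accuracy, precision, recall, and F-score for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f_score": f_score}


actual = [1] * 2 + [0] * 98      # 1 = attack, 0 = normal
always_normal = [0] * 100        # a detector that never raises an alarm
report = evaluate(actual, always_normal)
```

The detector scores 98% accuracy while its recall is zero, so accuracy alone says little when the positive class is rare.
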
  • 12. When Applied to Computer Security
      • Two major issues: large data sets and rare events.
  • 13. Security and Large Data Sets
      • Security typically involves large data sets: Sendmail, “11,500 system calls per message” [WGZ08]; 1998 MIT network data, where 7 weeks is about 5 million connections.
      • The data must be processed quickly and accurately.
      • Data-oriented solutions: discretization, feature selection [FFH08], feature construction (principal component analysis) [WGZ04], and sampling [PP07].
      • Method-oriented solutions: parallel data mining (high-performance data mining).
  • 14. Security and Rare Events
      • Rare event processing is often required. We hope security events are infrequent... Are there enough examples for supervised learning?
      • Black swan theory: hard to predict, high consequence, and easy to see afterwards. Bulk anomalies (worms) are the opposite... [CBK]
      • Standard approaches do not work well with rare events [JAK01]. Normal events may be similar, but rare events are often different.
      • Many techniques attempt to model normal behavior and look for variations.
      • Other remedies: over-sample the rare class, down-size the large class, or create artificial cases.
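
Over-sampling the rare class can be sketched as random duplication of rare examples until both classes are the same size. The function name, the two-class setup, the labels, and the fixed seed are assumptions; the slide does not prescribe a particular resampling scheme.

```python
import random


def oversample(records, labels, rare_label, seed=0):
    """Duplicate randomly chosen rare-class records until the classes balance.

    Assumes two classes and at least one rare-class record.
    """
    rng = random.Random(seed)
    rare = [r for r, l in zip(records, labels) if l == rare_label]
    common_count = len(labels) - len(rare)
    extra = [rng.choice(rare) for _ in range(common_count - len(rare))]
    return records + extra, labels + [rare_label] * len(extra)


# Made-up data: 8 normal events and 2 attacks.
records = list(range(10))
labels = ["normal"] * 8 + ["attack"] * 2
balanced_records, balanced_labels = oversample(records, labels, "attack")
```

Down-sizing the large class is the mirror image: sample the common class down to the size of the rare one instead of duplicating rare records.
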
  • 15. Rare Events in Other Areas
      • Insurance risk modeling [PRA00]
      • E-commerce and web mining: “Online merchants convert an average of 2%-3% of their site visitors into buyers”
      • Churn analysis: “number of customers that end relationship with a company in a given period” [NGK+ 06]
      • Hardware faults, for example new disk failures [AWG+ 93]
      • Airline no-show predictions [LHC03]
  • 16. Example Security Application: Who is Doing What?
      • Given a computer network, discover what the computers are doing, specifically what applications or types of applications.
      • Identifying an application is important for two reasons: management of network resources and compliance with security policies.
      • However, current methods do not always work: port numbers are unreliable, payloads can be encrypted, and current in-the-dark methods can be defeated.
  • 17. A New Approach
      • Given a set of computer network trace data, is it possible to identify the application protocols (e.g. HTTP, AIM, DNS) that hosts are using, based on interaction patterns?
      • Three different views of the same network: physical, logical, and application.
  • 18. Motifs
      • A motif is a pattern of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks.
      • Motifs have been applied to several complex networks: gene regulation, neural networks, ecosystem food webs, electronic circuits (forward logic chips, digital fractional multipliers), and the World Wide Web.
      • Certain motifs can be linked to specific functions.
  • 19. Applying this Idea to Application Identification
      • [Figure: pipeline of Parse data → Construct application graphs → Create motif profiles → Nearest neighbor classification → Interpret results, plus evolutionary attribute weighting; the stages carry informal effort annotations ranging from "easy" to "time consuming".]
      • Preprocessing: collect data and parse it into connection information.
      • Find all order-3 and order-4 motifs and build motif profiles.
      • k-nearest-neighbor classification (for training and testing).
      • Interpret the results, possibly weighting features to improve performance.
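
To make the motif-finding step concrete, the sketch below counts the two connected order-3 subgraphs (open paths and triangles) in a small undirected graph. Treating the application graph as undirected, the host names, and the brute-force enumeration are all assumptions. Comparing counts against randomized networks, which the definition of a motif requires, is left out.

```python
from itertools import combinations


def order3_counts(edges):
    """Count connected 3-node subgraphs in an undirected graph."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    paths = triangles = 0
    for a, b, c in combinations(sorted(adjacency), 3):
        links = (b in adjacency[a]) + (c in adjacency[a]) + (c in adjacency[b])
        if links == 2:
            paths += 1       # two edges among three nodes: an open path
        elif links == 3:
            triangles += 1   # all three edges present
    return {"path": paths, "triangle": triangles}


# Hypothetical connection graph: three clients talk to one server,
# and two of the clients also talk to each other.
edges = [("server", "c1"), ("server", "c2"), ("server", "c3"), ("c1", "c2")]
counts = order3_counts(edges)
```

Enumerating every triple is cubic in the number of hosts, which fits the remark that this stage is time consuming. A profile could be built by normalising such counts, although the slides do not say how the profiles were scaled.
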
  • 20. Initial Experiments
      • Sources of data: Dartmouth University campus wireless network, Fall 2003; OSDI Conference 2006; Lawrence Berkeley National Lab 2004/2005.
      • Create a profile per application: Application x profile = (1.000, 0.662, 0.650, 0.632, 0.585); Application y profile = (0.900, 0.672, 0.50, 0.772, 0.85).
      • Given a new application, find the best matching profile.
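
Finding the best matching profile can be sketched as a 1-nearest-neighbor lookup over the two profiles given on the slide. The Euclidean distance and the `unknown` profile are assumptions; the slides do not state which distance measure was used.

```python
import math

# Profiles copied from the slide.
profiles = {
    "x": (1.000, 0.662, 0.650, 0.632, 0.585),
    "y": (0.900, 0.672, 0.50, 0.772, 0.85),
}


def best_match(profile, known):
    """Name of the known profile closest to `profile` (Euclidean distance)."""
    return min(known, key=lambda name: math.dist(profile, known[name]))


# Hypothetical motif profile of an unidentified application.
unknown = (0.95, 0.66, 0.63, 0.65, 0.60)
match = best_match(unknown, profiles)
```

With k greater than 1 and several stored profiles per application, the same lookup becomes the k-nearest-neighbor vote used in the pipeline.
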
  • 21. Motif Profile Results
      • [Figure: two result panels covering the applications AIM, DNS, HTTP, Kazaa, MSDS, Netbios, and SSH.]
      • Results are very good compared to traditional graph statistics, although there is a problem with AIM and SSH...
      • So what is the problem...?
  • 22. So What is the Problem?
  • 23. For Further Reading I
      • [AWG+ 93] Chidanand Apte, Sholom M. Weiss, and Gordon Grout. Predicting defects in disk drive manufacturing: A case study. In Proceedings of the IEEE CAIA93, pages 212–218, 1993.
      • [CBK] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey. To appear in ACM Computing Surveys, September 2009.
      • [EEGPP06] Aly ElSemary, Janica Edmonds, Jesús González-Pino, and Mauricio Papa. Applying data mining of fuzzy association rules to network intrusion detection. In Proceedings of the IEEE Workshop on Information Assurance, 2006.
      • [FFH08] Errin W. Fulp, Glenn A. Fink, and Jereme N. Haack. Predicting computer system failures using support vector machines. In Proceedings of the Workshop on Analysis of System Logfiles, 2008.
      • [JAK01] Mahesh V. Joshi, Ramesh C. Agarwal, and Vipin Kumar. Mining needle in a haystack: classifying rare classes via two-phase rule induction. In SIGMOD ’01: Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, pages 91–102, 2001.
      • [LHC03] Richard D. Lawrence, Se June Hong, and Jacques Cherrier. Passenger-based predictive modeling of airline no-show rates. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 397–406, 2003.
  • 24. For Further Reading II
      • [LSM98] Wenke Lee, Salvatore J. Stolfo, and Kui W. Mok. Mining audit data to build intrusion detection models. In Proceedings of the International Conference on Knowledge Discovery and Data Mining, 1998.
      • [NGK+ 06] Scott A. Neslin, Sunil Gupta, Wagner Kamakura, Junxiang Lu, and Charlotte H. Mason. Defection detection: Measuring and understanding the predictive accuracy of customer churn models. Journal of Marketing Research, 43:204–211, 2006.
      • [PP07] Animesh Patcha and Jung-Min Park. An adaptive sampling algorithm with applications to denial-of-service attack detection. In Proceedings of the IEEE International Conference on Computer Communications and Networks, pages 11–16, 2007.
      • [PRA00] Edwin P. D. Pednault, Barry K. Rosen, and Chidanand Apte. Handling imbalanced data sets in insurance risk modeling. Technical Report RC-21731, IBM, 2000.
      • [WGZ04] Wei Wang, Xiaohong Guan, and Xiangliang Zhang. A novel intrusion detection method based on principle component analysis in computer security. In Proceedings of the International Symposium on Neural Networks, pages 657–662, 2004.
      • [WGZ08] Wei Wang, Xiaohong Guan, and Xiangliang Zhang. Processing of massive audit data streams for real-time anomaly intrusion detection. Computer Communications, 31(1):58–72, 2008.