Ariu - Workshop on Artificial Intelligence and Security - 2011
1. Machine Learning in Computer Forensics
(and the Lessons Learned from Machine Learning in Computer Security)
D. Ariu G. Giacinto F. Roli
AISEC
4th Workshop on Artificial Intelligence and Security
Chicago – October 21, 2011
Pattern Recognition and Applications Group (PRA)
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
2. What can be analyzed…
(during an investigation)
October 21 - 2011 Davide Ariu - AISEC 2011 2
3. Role of Computer Forensics
(with respect to Computer Security)
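[Diagram: a timeline of Cyber Attack (or Crime) Progress with three phases: Prevention, Detection, and Truth Assessment. The phases are labeled with the disciplines involved: Security, (live) Forensics, and Forensics.]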
4. Goals
• To provide a small snapshot of ML research
applied to Computer Forensics
• To clarify the ML approach to Computer
Forensics
5. Historical Perspective
Computer Security
• Early ’70s – First Computer Security research papers appear
• 1988 – The first known internet-wide attack occurs (the “Morris Worm”)
• Early 2000s – Slammer and its friends in the wild: the resulting security issues reach TV channels and newspapers
Computer Forensics
• 1984 – The FBI Laboratory began developing programs to examine computer evidence
• 1993 – International Law Enforcement Conference on Computer Evidence
• 1999-2007 – Computer Forensics “Golden Age” [Garfinkel, 2010]
6. Computer Security Research
• Strong Research Community
– Research groups and centers exist (almost) worldwide
• Well defined main research directions
– Malware and Botnet analysis and detection
– Web Applications Security
– Intrusion Detection
– Cloud Computing
• Well defined methodologies
– Research results can have an immediate practical
impact
7. Computer Forensics Research
• Not a particularly strong research community (at least in terms of results achieved)
– Mostly people with a computer security background (like me)
• Research directions are not well defined
• Approaches and methods are not well defined
– Digital forensics research results are difficult to reproduce [Garfinkel, 2009]
8. How can machine learning be
useful in Computer Forensics?
• “Machine Learning methods are the best methods in applications that are too complex for people to manually design the algorithm” [Mitchell, 2006]
• “Reasoning” is a fundamental step of the investigation
– Computer forensics is conceptually different from Intrusion Detection
• The huge mass of data to be analyzed (TB scale) makes intelligent analysis methods necessary
– There are also situations where there is no time for an in-depth analysis (e.g. Battlefield Forensics)
9. ML applications to CF
• Machine Learning techniques have been proposed for several Computer Forensics applications
– Textual Documents and E-mail forensics
– Network Forensics
– Events and System Data Analysis
– Automatic file (fragment) classification
10. Computer Forensics Research Drawbacks
• The experimental results reported are not completely convincing…
– Network forensics solutions evaluated on the DARPA dataset only
– Email forensics algorithms evaluated on a corpus of 156 emails (and 3 different authors)
– Automatic file classification algorithms evaluated on a 500 MB dataset (best case…)
• In addition, the approach adopted was the same as the one adopted in Computer Security…
11. How to improve existing tools?
• Useful solutions can be developed only if the focus is:
– On the investigator and on his or her knowledge of the case
– On the organization and categorization of the information provided to the investigator
• Data sorting and categorization
• Prioritisation of results [Garfinkel, 2010; Beebe, 2009]
12. Putting knowledge into the tool…
• Computer Security tools (e.g. IDS) are based on well-defined criteria that are used to detect attacks
• In other contexts, where it is difficult to explicitly define a search criterion, the feedback provided by the user is exploited to achieve more accurate results
– E.g. Content-based Image Retrieval with relevance feedback [Zhouand,2003]
• This is definitely the case for Computer Forensics applications
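The relevance-feedback idea above can be sketched with the classic Rocchio update from information retrieval. This is an illustration only, not a method described in the talk: the function names, the feature vectors, and the weights `alpha`, `beta`, and `gamma` are all assumptions chosen for the example.

```python
def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards items the investigator marked relevant
    and away from items marked non-relevant (Rocchio update).
    The weights are illustrative defaults, not values from the slides."""
    dim = len(query)
    rel = mean_vector(relevant, dim)
    non = mean_vector(non_relevant, dim)
    return [alpha * q + beta * r - gamma * n for q, r, n in zip(query, rel, non)]
```

Under these assumptions, each round of feedback from the investigator produces a refined query vector, which could then be used to re-score the remaining evidence items.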
13. Organizing data and results
• Discerning among the huge mass of data represents a dramatically time-consuming task for investigators
– E.g. Filtering the results obtained after file carving
– E.g. Inspecting all the pictures found in a laptop
• A tool can be definitely useful even if it is only able to sort results and contents according to a relevance criterion (most relevant first)
– The tool only assigns “scores”; the analyst then inspects the results
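A minimal sketch of the "score, then sort" idea above. The scoring rule is a deliberately simple stand-in (the fraction of case keywords found in an item's text), and `Item`, `score`, and `rank` are hypothetical names introduced for illustration; a real tool would plug in a learned model in place of `score`.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str  # e.g. the name of a carved file
    text: str  # text extracted from the item


def score(item, keywords):
    """Hypothetical relevance score: fraction of case keywords present in the item."""
    words = set(item.text.lower().split())
    if not keywords:
        return 0.0
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits / len(keywords)


def rank(items, keywords):
    """Return (score, name) pairs, most relevant first.
    The tool only orders the items; the analyst still inspects them."""
    scored = [(score(it, keywords), it.name) for it in items]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

The tool does not decide anything here: it only reorders the evidence so that the analyst can start from the items most likely to matter.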
14. To summarize
• We investigated the problem of applying ML to
Computer Forensics
• We provided a short overview of the literature
related to ML applications in Computer Forensics
• We proposed several guidelines to profitably
apply machine learning to Computer Forensics
15. Questions or Comments?
Thank you for your attention!
davide.ariu@diee.unica.it