Digital Evidence Analytics:
                   What does the evidence
                          really mean?

                         The 2010 ADFSL Conference on
                         Digital Forensics, Security and Law
                                   May 19-21, 2010
                              St. Paul, Minnesota, USA




Tuesday, May 25, 2010                                          1
Dr. Marcus K. Rogers
                            University Faculty Scholar
                                Fellow of CERIAS
                        Director - Cyber Forensics Program
                               College of Technology
                                Purdue University
                                     CERIAS

DE evolution
                  Acquisition Focused
                  All about the data!




                                        Examination and Analysis
                                          Information is King!



                                                                   Interpretation
                                                                    Knowledge??




context

                    • How do we get there from here?
                    • Content is not the be-all and end-all!
                    • What meaning can we ascribe to
                        what we are seeing?


context v. content

                    •    allows for attributions to be attached to
                        the data.
                    •    gives relational structure and/or meaning to
                        the data.
                    • determines the value or weight of the raw
                        data.


context v. content
                    •    totality of the physical and electronic/virtual
                        environment.

                    •   what is missing or absent can be as important as
                        what is there (e.g., missing log files, wiped data
                        areas).

                    •   personal narrative is the key to connecting the data
                        points and more importantly, predicting future
                        behavior (of either the system or the user).
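
The point about missing data can be illustrated with a minimal sketch that flags unusually long silences in a sequence of log timestamps. The function name `find_gaps`, the 30-minute threshold, and the sample timestamps are all assumptions made for illustration; they do not come from the talk, and a real gap still needs context to interpret (a reboot, a wiped log, or simply a quiet period).

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_gap):
    """Return (before, after) pairs of consecutive events more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]

# Hypothetical log entry times (illustrative only)
events = [
    datetime(2010, 5, 19, 9, 0),
    datetime(2010, 5, 19, 9, 5),
    datetime(2010, 5, 19, 9, 10),
    datetime(2010, 5, 19, 13, 40),
    datetime(2010, 5, 19, 13, 45),
]

# Threshold is an assumption; pick one that fits the normal logging cadence
gaps = find_gaps(events, timedelta(minutes=30))
for before, after in gaps:
    print(f"no entries between {before} and {after}")
```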


what can the data tell us?

                    • Context
                    • Meaning
                    • Personal Narrative
                    • Linkages

what can the data tell us?
                    •   Intentions of individual or group (past & future)

                    •   Social networks

                    •   Technical capacity

                    •   Resources

                    •   Organizational structure

                    •   Organizational activities

                    •   Environment

                    •   Pattern of life


connecting the dots
                    •   Pattern analysis
                    •   Chronologies (e.g., timelines)
                    •   Frequency analyses
                    •   Hierarchical connections or nodes
                        •   small world networks - (degrees of
                            separation in social networks), dense
                            connection nodes
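
As a rough sketch of the techniques listed above, the following builds a chronology, a frequency count, and a degrees-of-separation measure from a handful of communication records. The records, the names in them, and the helper `degrees_of_separation` are hypothetical and chosen only for illustration; breadth-first search is one simple way to count hops in a small network, not a method attributed to the talk.

```python
from collections import Counter, deque

# Hypothetical communication records: (ISO timestamp, sender, recipient)
records = [
    ("2010-05-19T10:15", "alice", "bob"),
    ("2010-05-19T09:02", "bob", "carol"),
    ("2010-05-20T14:30", "alice", "bob"),
    ("2010-05-21T08:45", "carol", "dave"),
]

# Chronology: ISO 8601 strings in one format sort in time order
timeline = sorted(records)

# Frequency analysis: how often each pair communicates, in either direction
pair_counts = Counter(frozenset((s, r)) for _, s, r in records)

# Linkages: undirected adjacency map of who has contacted whom
graph = {}
for _, s, r in records:
    graph.setdefault(s, set()).add(r)
    graph.setdefault(r, set()).add(s)

def degrees_of_separation(graph, start, goal):
    """Breadth-first search; returns the hop count, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Dense connection nodes: contacts ranked by number of distinct neighbours
by_degree = sorted(graph, key=lambda n: len(graph[n]), reverse=True)
```

Here `pair_counts` supports the frequency view, `timeline` the chronological view, and `by_degree` gives a first hint at which nodes are most densely connected.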

visualization
                    •   Graphical representations allow for better initial analysis
                        by humans (non machine learning systems)

                        •   Heatmaps

                            •   color coded to indicate relationships and
                                importance

                        •   Dashboard or console UIs.

                            •   Allow quick summary with the ability to drill down
                                to various levels of granularity
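
To make the heatmap idea concrete, here is a minimal text-only sketch that counts activity by day of week and hour of day, then shades each cell by how busy it is. The function names, the shading characters, and the sample timestamps are assumptions for illustration; a real tool would more likely use a plotting library and color rather than characters.

```python
import math
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
SHADES = " .:*#"  # blank = no activity, '#' = busiest cell

def activity_grid(timestamps):
    """Count events in a 7 x 24 grid indexed by [weekday][hour]."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        grid[ts.weekday()][ts.hour] += 1
    return grid

def render(grid):
    """Return one text line per day, each cell shaded relative to the peak count."""
    peak = max(max(row) for row in grid) or 1
    top = len(SHADES) - 1
    lines = []
    for day, row in zip(DAYS, grid):
        # ceil keeps any non-zero count visible
        cells = "".join(SHADES[math.ceil(count * top / peak)] for count in row)
        lines.append(f"{day} {cells}")
    return lines

# Hypothetical file-access times (illustrative only)
events = [
    datetime(2010, 5, 19, 23, 5),
    datetime(2010, 5, 19, 23, 40),
    datetime(2010, 5, 20, 2, 15),
    datetime(2010, 5, 21, 9, 0),
]

grid = activity_grid(events)
heat_lines = render(grid)
print("\n".join(heat_lines))
```

The same grid could feed a dashboard: the rendered rows give the quick summary, and the underlying counts allow drilling down to a specific day and hour.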


visualization
                    • Timelines
                     • using drill down charts that can be
                        superimposed over other interfaces
                    • Mind maps
                     • dynamic fluid relationships and
                        interconnections at different levels of
                        granularity

points of view

                    • investigators v. analysts
                    • technical v. analytical
                     • our frame of reference is vital
                     • communication is vital
                     • asking better questions of the data!
analysis
                        Scientific Method      Investigative    Analytics

                        Theory development     who              Data driven (data mining)
                        Hypothesis testing     what             Decision making
                        Probabilities          when             Statistical analysis
                        Error rates            where            Pattern identification
                        Accuracy               why
                                               how
Summary

                    • It is not all about the data... it's not all about
                        the information.
                        •   Information consists of facts and data organized to
                            describe a particular situation or condition.

                    • It is really about the knowledge!
                        •   Knowledge is applied to interpret information about the
                            situation and to decide how to handle it.



“There is nothing more
            deceptive than an obvious
                       fact”
                          Sir Arthur Conan Doyle
                              Sherlock Holmes
                        The Boscombe Valley Mystery




contact information
                                Dr. Marcus Rogers
                                  765-494-2561
                            cyberforensics@mac.com
                           http://cyberforensics.purdue.edu




