Social Media Investigations

Data Mining and social media investigations and profiling.

  • 1. Data Mining And Internet Profiling:
    Approaches to Successful Online Social Media Investigations
    Shellee Hale
  • 2. Shellee Hale - President of Camandago, Inc.
    WA Licensed Private Investigator
    CAS – Certified Anti-Terrorism Specialist
    CEMS – Certified Emergency Management Specialist
    Specializes in:
    Cyber Tracing
    Cyber Warfare Threat Profiling  
    Constituent of the Overseas Security Advisory Council (OSAC), a Federal Advisory Committee with a U.S. Government charter to promote security cooperation between U.S. private-sector interests worldwide and the U.S. Department of State
    Infragard Member
    Seattle FBI Citizens Academy Alumni Association
  • 3. Dataveillance
    • Dataveillance is the systematic use of digital personal data in the investigation or monitoring of the actions or communications of one or more persons.
    • Web-based search
    • Social media
  • Search Engines
    Search engines are algorithmic information retrieval systems that allow searching of massive web-based databases. A web search engine is designed to search for information on the World Wide Web and FTP servers. The results are generally presented as a list and are often called hits. They may consist of web pages, images, and other types of files.
  • Search Engine Tutorials
    Google Hacks, by Tara Calishain and Rael Dornfest. Publisher: O'Reilly Media
  • 21. Search Techniques
    • Plus (+) and minus (-) operators
    • Quotations
    • Keyword order and lowercase
    • Truncation (*)
    • Allinbody [allinbody:keyword] (also Allintitle, Allinurl)
    Boolean logic
    • Enclose OR statements in parentheses.
    • Always use CAPS. Most engines require that the operators (AND, OR, AND NOT/NOT) be capitalized.
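    The techniques listed above can be combined mechanically. The following is a minimal Python sketch, not any search engine's API: `quote_phrase` and `build_query` are hypothetical helper names, and the exact operator syntax varies between engines, so treat the output as a starting point.

```python
def quote_phrase(term: str) -> str:
    """Wrap a multi-word phrase in quotation marks so it is searched as a unit."""
    return f'"{term}"' if " " in term else term


def build_query(required, any_of=(), excluded=()):
    """Assemble a query string from the techniques on the slide.

    required -> terms joined with a capitalized AND
    any_of   -> alternatives joined with OR and enclosed in parentheses
    excluded -> terms prefixed with the minus operator
    """
    parts = []
    if required:
        parts.append(" AND ".join(quote_phrase(t) for t in required))
    if any_of:
        parts.append("(" + " OR ".join(quote_phrase(t) for t in any_of) + ")")
    parts.extend("-" + quote_phrase(t) for t in excluded)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(["jane doe", "seattle"],
                      any_of=["facebook", "myspace"],
                      excluded=["obituary"]))
```

    Keeping the OR group in parentheses and the operators capitalized follows the slide's advice; the sample name and keywords are placeholders.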
  • Always copy URLs, because sometimes you can't backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same information again.
    Take screenshots of content, or consider making use of CAMTASIA, a screen recorder and editing software program.
    History Of A Website
    Doesn't include adult content
    Not a complete archive
    You can exclude your own site from the machine (the Internet Archive's Wayback Machine) with robots.txt
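    To see how a robots.txt exclusion is evaluated, here is a small sketch using Python's standard `urllib.robotparser`. The sample rules, the example.com URLs, and the crawler name `ia_archiver` are assumptions made for illustration; check the archive's current documentation for the user agent it actually honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one archive crawler entirely,
# and block everyone else only from /private/.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/blog/post"))
```

    For an investigator, the practical point is that a missing archive copy may reflect an exclusion rule rather than the page never having existed.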
  • 28. Meta Search Engines
    Search engines that search other search engines and directories. They extract the best of the searches from various popular search engines and directories and include the information in their own search results.
  • Directories
    Collections of websites compiled and organized into subject categories, typically by human editors rather than by automated spiders.
  • Gateways
    Collections of databases and informational sites that specialists have assembled, reviewed, and recommended as a way to access the material
    Invisible Web
    • Large portion of the Web that search engine spiders cannot index: 60-80% of web material
    • Password-protected sites
    • Documents behind firewalls, archived material
    • The contents of certain databases
    • Information that isn't static but is assembled dynamically in response to specific queries
    Subject-Specific Databases (Vortals) are devoted to a single subject, e.g., WebMD
  • 40. Verifying Sources
    • Unlike scholarly books and journal articles, web sites are seldom reviewed or refereed. It's up to you to check for bias and to determine objectivity. Try to assess the stability of the pages you reference.
    • Understand the legitimacy of a web address: .edu, .gov, and .mil are the most reliable sources; .com, .net, and .org are less so. Countries have specific codes (.ca, .uk, etc.).
    • Look closely at the page sponsor, the date last updated, and the authority of the author(s) if possible.
    • Research information on domain ownership
    • Verify inbound links
    • Check web traffic
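    The address check in the list above can be done as a first-pass triage step. This is a rough Python sketch: `classify_domain` and its category labels are hypothetical, and a top-level domain is only one signal, so it does not replace checking the sponsor, author, and domain ownership.

```python
from urllib.parse import urlparse

RESTRICTED_TLDS = {"edu", "gov", "mil"}  # the slide's "most reliable" group
GENERIC_TLDS = {"com", "net", "org"}     # anyone can register these


def classify_domain(url: str) -> str:
    """Label a URL by its top-level domain.

    Expects a full URL including the scheme (http:// or https://);
    without one, urlparse finds no hostname and the result is "unknown".
    """
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    if tld in RESTRICTED_TLDS:
        return "restricted"
    if tld in GENERIC_TLDS:
        return "generic"
    if len(tld) == 2 and tld.isalpha():
        return "country"  # two-letter country codes such as .ca or .uk
    return "unknown"


if __name__ == "__main__":
    for u in ("https://www.cdc.gov/flu", "http://example.co.uk/page"):
        print(u, "->", classify_domain(u))
```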
  • Information Aggregators
    These are tools that pull in information from multiple sources and consolidate it into a smaller, more easily digested number of streams.
    RSS Feeds (Google Reader, Bloglines): pull blogs into a single stream of information
    Spokeo: the "Big Brother" of social networking and a gateway to paid databases. Shows available websites around a specific name.
    Pipl: the most comprehensive people search on the web
    yoName: searches social networks
    Slandr, etc.
    Real-time news
    Email alerts
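    What an RSS aggregator does, pulling several feeds into a single stream, can be sketched with the Python standard library alone. `parse_items` and `merge_feeds` are hypothetical names, the feed XML is passed in as text rather than downloaded, and the sketch assumes every item carries an RFC 822 `pubDate` with a numeric UTC offset.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(rss_xml: str, source: str):
    """Extract title, link, and publication time from each <item> of one feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # Assumption: pubDate is present and well formed.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(feeds: dict):
    """Combine items from several feeds into one stream, newest first.

    `feeds` maps a source label to that feed's RSS XML text.
    """
    merged = []
    for source, xml_text in feeds.items():
        merged.extend(parse_items(xml_text, source))
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)
```

    Keeping the source label on every item preserves where each post came from once the feeds are interleaved.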
  • 47. Social Media Networks
    Analyzing Social Media Networks with NodeXL: Insights from a Connected World, by Derek Hansen, Ben Shneiderman, and Marc A. Smith
  • 49. Website TOS, Privacy Laws And Proposed Regulations
    Social media is a key component of profiling a subject of investigation. The pool of information about each individual can form a distinctive “social signature,” but privacy settings and anonymity limit the information you can access on a social network.
  • 50. Issues With Anonymity
    We have a right to anonymity, but websites do not allow it under their TOS. How can you be anonymous online when sites ask for real information?
    If you set up a profile on Facebook, its TOS say that the profile is you. You must supply a valid email address, but how does the site know that a user has not supplied a random email address and name?
    It is not illegal for internet users to impersonate or create a false identity online.
    Popularity of a site comes with vulnerability to attack.
    We are seeing an increase in SPOOFING, e.g., password-reset emails that give someone else ownership of your account.
    Be advised that accounts under a person's name can be the result of spoofing and were not necessarily created by that person.
    In the context of network security, a spoofing attack is a situation in which one person or program successfully masquerades as another by falsifying data and thereby gaining an illegitimate advantage.
  • 51. The Privacy Debate
    We want privacy, yet we expose private details of our lives online.
    Once you post something, you are leaving a digital footprint that is owned by the site.
    Facebook has been receiving a lot of bad press. Users fear how their data might be used. Privacy policies and TOS are constantly being changed.
    We are seeing two different agendas among online privacy advocates:
    We put pressure on websites to protect our information, and we do reserve that right.
    But at the same time, because of the vast scope of information on social media, the government wants a backdoor to get information for investigations and terrorism research.
    This would leave personal information vulnerable to hackers.
    Consider This…
    There are different privacy laws in every country.
    Check the TOS and the privacy laws that apply to each website. They may allow backdoors.
  • 52. Privacy Settings
    It's important to understand the privacy laws and settings of the major social networks in order to understand their limitations and how to potentially work around them.
    Users can select their own privacy settings, and there are few ways to get around them.
    Facebook Profiles Offer
    • Phone numbers
    • Email addresses
    • Photos, which provide a history and timeline
    • Status updates, which offer current whereabouts, etc.
    Privacy Settings: Profile can be viewable by
  • Public Tweets:
    Your updates appear in Twitter’s public timeline — a flowing river of every member’s status.
    Anyone can see your Twitter updates.
    Your Twitter updates can be indexed by search engines.
    Protected Tweets:
    People will have to request to follow you and each follow request will need approval
    Your Profile and Tweets will only be visible to users you've approved
    Protected Profiles' Tweets will not appear in Twitter search
    @replies sent to people who aren't following you will not be seen
    You cannot share static page URLs with non-followers
  • 60. Default Settings:
    By default, people on MySpace can see when you’re online. Your profile and photo are also set to be viewable by everyone.
    Privacy Options:
    MySpace’s privacy options are very limited, but changing three key settings can provide you with some important privacy protection:
  • Tips & Tricks
    If you have an email address you want to put a face to, you can also find who owns an email address by searching the email address in the Facebook search window.
    Anyone can create a fake profile so use this to your advantage. Some users will allow friends of friends to access part if not all of a profile. Befriend a friend of someone you are investigating.
    How To Protect Your Privacy on Facebook, Myspace, And LinkedIn
    How do you get in and see information if it's been deleted?
    Tweleted allowed you to recover deleted Twitter messages.
    If user A quotes user B, who then removes the tweet, it will still show up in user A's quotes.
  • 63. Properly Documenting Social Media Investigations
    Always copy URLs, because sometimes you can't backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same information again.
    Take screenshots of content (e.g., craigslist ads).
    Consider making use of CAMTASIA, a screen recorder and editing software program.
    • Take screen captures on the fly
    • Draw attention with arrows, add text
    • Organizational tools: search for your captures by date, website, or a custom flag that you create and assign.
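    The same habits (record the URL, record when it was captured, and keep captures searchable by date, website, or flag) can be kept in a simple log. The sketch below is an illustration only: `Capture` and `find` are hypothetical names, and the SHA-256 fingerprint is an added assumption, not something the slides or Camtasia prescribe.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from urllib.parse import urlparse


@dataclass
class Capture:
    """One captured page or screenshot, with the details needed to find it again."""
    url: str
    content: bytes      # raw bytes of the saved page or image
    flag: str = ""      # custom label, e.g. a case reference
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    @property
    def website(self) -> str:
        return urlparse(self.url).hostname or ""

    @property
    def sha256(self) -> str:
        # Fingerprint of the content, to show later that it was not altered.
        return hashlib.sha256(self.content).hexdigest()


def find(captures, website=None, flag=None, date=None):
    """Filter captures by website, custom flag, and/or date (YYYY-MM-DD prefix)."""
    results = []
    for c in captures:
        if website is not None and c.website != website:
            continue
        if flag is not None and c.flag != flag:
            continue
        if date is not None and not c.captured_at.startswith(date):
            continue
        results.append(c)
    return results
```

    Storing the timestamp as an ISO 8601 string in UTC keeps the date filter a simple prefix match.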
  • 66. Centrifuge Systems
    Centrifuge has created a powerful approach to analysis called “Interactive Analytics”. Our next generation approach provides groundbreaking visualizations accessible from any browser and any operating system.
    “Interactive Analytics” (IA) is based on extensive work with the US Intelligence Community and brings together three innovations in analytics today: Interactive Data Visualization, Unified Data Views, and Collaborative Analysis.
  • 67.
    A free-form database designed for users rather than programmers (like a CMS)
  • 76. Factors in Predicting Online Deception
    Any intentional control of information in a message to create a false belief in the receiver of the message
  • 77. Frequency Of Lying
    How do different media affect lying and honesty?
    • 1.75 lies identified in a 10-minute exchange
    • Range from 0 lies to 14 lies
    • Self-presentation goal (‘likeable’) increases deception
    “Electronic mail is a godsend. With e-mail we needn’t worry about so much as a quiver in our voice or a tremor in our pinkie when telling a lie. Email is a first rate deception-enabler.”
    ~Keyes (2004) The Post-Truth Era
  • 80. True Personality vs. Embellished Identity
    Changing pronouns, as benign as it seems, is the queen mother of linguistic violations and a very strong indication that deception might be present!
    For instance: "our house" vs. "my house".
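    A pronoun change of this kind can be flagged automatically for a human to review. The sketch below is a crude heuristic, not a validated deception detector: `pronoun_profile` and `pronoun_shift` are hypothetical names, and the word lists cover only English first-person forms.

```python
import re
from collections import Counter

SINGULAR = {"i", "me", "my", "mine", "myself"}
PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_profile(text: str) -> Counter:
    """Count first-person singular and plural pronouns in a statement."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in SINGULAR:
            counts["singular"] += 1
        elif word in PLURAL:
            counts["plural"] += 1
    return counts


def dominant_form(text: str):
    """Return "singular", "plural", or None when neither form dominates."""
    counts = pronoun_profile(text)
    if counts["singular"] == counts["plural"]:
        return None
    return "singular" if counts["singular"] > counts["plural"] else "plural"


def pronoun_shift(earlier: str, later: str) -> bool:
    """True when the dominant first-person form changes between two statements."""
    a, b = dominant_form(earlier), dominant_form(later)
    return a is not None and b is not None and a != b


if __name__ == "__main__":
    print(pronoun_shift("We painted our house last year.",
                        "I sold my house in March."))
```

    A flagged shift is a prompt to look closer; there can be ordinary reasons for it, such as a change in living arrangements.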
  • 115. Online Deception
    The ambiguity of the Internet allows complete anonymity, giving users the ability to create false and misleading profiles and identities online and so hide their true identity.
    • Gender swapping online, with men playing women
    • Adults posing as children, etc.
    • Lies or exaggerations about one’s physical appearance, personality, or characteristics
    • Even slight misrepresentations of a genuine characteristic, such as denying being a smoker, drinker, etc.
    One can have ‘as many electronic personas as one has time and energy to create’ (Donath, 1999).
    Findings from the University of Texas at Austin, however, suggest that users express their true personality, not an embellished identity, over online social networks such as Facebook.
    The Texas researchers collected 236 profiles of college-aged users of Facebook in the United States and StudiVZ, the equivalent in Germany. The users filled out questionnaires about their personality and also about who they'd like to be. Strangers browsed and rated the online profiles, and the study authors compared the ratings with the users' questionnaires.
    Networks such as Facebook are more “genuine mediums for social interactions than vehicles for self-promotion.”
    But whether honesty on Facebook comes naturally or is necessitated by your audience is up for debate. “You don't have full control over it. Other people can write things on your wall and tag you in unflattering photos,” stated Professor Hancock.
  • 119. Detecting Deception
    Inconsistencies in actions or words do not necessarily indicate a lie, just as consistency is not necessarily a guarantee of the truth.
    However, a pattern of inconsistencies or unexplainable behavior normally indicates deceit.
  • 120. Techniques For Identifying Deceit
    Repeat Questions
    • Should not be exact repetitions of an earlier question.
    • The investigator must rephrase or otherwise disguise the previous question.
    • Repeat questions also need to be separated in time from the original question so the information cannot easily be remembered.
    Control Questions
    • Developed from recently confirmed or known information that is not likely to have changed.
    • If the answer to a control question is not given as expected, it may be an indicator of deceit.
    Q1 – What was the score of the baseball game?
    A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money....
    Topical Examples:
    • Last day of school, vacation dates
    • School events, pop culture trivia
    • Video game trivia
  • Internal Inconsistencies
    Frequently when someone is lying, an investigator will be able to identify inconsistencies in the timeline, the circumstances surrounding key events, or other areas within the questioning.
    For example, someone spends a long time explaining something that took a short time to happen, or a short time telling of an event that took a relatively long time to happen.
    Q1 – What was the score of the baseball game?
    A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money....
  • 125. “Placement” and “Access”
    Based on a person’s job, geographical location, age, etc., investigators should have a basic idea of the breadth and depth of information that such a person should know.
    When answers show that someone does not have the expected level of information (too much or too little or different information than expected), this may be an indicator of deceit.
    In an extreme case, if someone is interrupted in the middle of a statement on a given topic, they will have to start again at the beginning in order to “get the story straight.”
    Repeated Information
    • Often, if someone plans on lying about a topic, they will memorize or practice exactly what they are going to say.
    • If they always relate an incident using exactly the same wording, or answer ‘repeat’ questions identically (word for word) to the original question, it may be an indicator of deceit.
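    Word-for-word repetition across written answers can be measured rather than eyeballed. This is a minimal sketch using Python's standard `difflib`; `looks_rehearsed` is a hypothetical helper, and the 0.95 threshold is an arbitrary assumption that would need tuning against real transcripts.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so only the wording is compared."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Similarity ratio between two answers, from 0.0 to 1.0 (identical)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def looks_rehearsed(answers, threshold: float = 0.95) -> bool:
    """True when every pair of answers is nearly identical in wording."""
    if len(answers) < 2:
        return False  # nothing to compare against
    return all(similarity(a, b) >= threshold
               for a, b in combinations(answers, 2))


if __name__ == "__main__":
    answers = [
        "I was at home watching TV all night.",
        "i was at home, watching TV all night",
    ]
    print(looks_rehearsed(answers))
```

    As with the other indicators, a high score only marks answers worth a second look.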
  • Incongruent Appearance and Incongruent Language
    If someone’s online appearance does not match their story, it may be an indication of deceit.
    If the type of language, including sentence structure and vocabulary, does not match the story, this may also be an indicator of deceit.
    If the suspected liar does not use the proper technical vocabulary to match an otherwise familiar story, this may be an indicator of deceit.
  • 127. Questions?