Data Mining And Internet Profiling: Approaches to Successful Online Social Media Investigations Shellee Hale
Shellee Hale - President of Camandago, Inc.
WA Licensed Private Investigator
CAS – Certified Anti-Terrorism Specialist
CEMS – Certified Emergency Management Specialist
Specializes in: Cyber Tracing, Dataveillance, Cyber Warfare, Threat Profiling
Constituent for the Overseas Security Advisory Council (OSAC), a Federal Advisory Committee with a U.S. Government charter to promote security cooperation between U.S. private sector interests worldwide and the U.S. Dept. of State
InfraGard Member
Seattle FBI Citizens Academy Alumni Association
Search Engines Search engines are algorithmic information retrieval systems that allow searching of massive web-based databases. A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented as a list, and the individual results are often called hits. The results may consist of web pages, images, and other types of files.
Always copy URLs, because sometimes you can’t backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same information again. Take screenshots of content, or consider using Camtasia Studio, a screen recording and editing program.
Archive.org: History Of A Website
Doesn’t include adult content
Not a complete archive
You can remove your own site from the machine with robots.txt
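The robots.txt point can be checked before relying on the archive for a given site. Below is a minimal sketch, using Python's standard `urllib.robotparser`, of testing whether a robots.txt file blocks a particular crawler. The user-agent name `ia_archiver` (historically associated with the Internet Archive), the sample rules, and `example.com` are assumptions for illustration; whether the archive honors such rules today is not something this sketch can tell you.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration; a real file would be
# downloaded from https://<site>/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page.html"
    print("ia_archiver blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "ia_archiver", page))
    print("other crawler blocked:", is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot", page))
```

If the archive's crawler is blocked, expect gaps in the site's history and rely on your own screenshots instead.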
Meta Search Engines Search engines that search other search engines and directories. They extract the best of the searches from various popular search engines and directories and include the information in their own search results.
Directories Catalogs of websites organized by category, typically compiled and reviewed by human editors rather than gathered by automated crawlers.
Unlike scholarly books and journal articles, web sites are seldom reviewed or refereed. It's up to you to check for bias and to determine objectivity. Try to assess the stability of the pages you reference.
Understand the legitimacy of a web address: .edu, .gov, and .mil are the most reliable sources; .com, .net, and .org can be registered by anyone. Countries have specific codes: .ca, .uk, etc.
Look closely at the page sponsor, the date last updated, and the authority of the author(s), if possible.
Research information on domain ownership: whois.net/
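As a first-pass triage of a list of URLs, the domain ending can be pulled out automatically. This is a minimal sketch using Python's standard `urllib.parse`; the function name `classify_domain` and the labels it returns are hypothetical, and the suffix groups simply mirror the list above. A reliable-looking ending is a starting point, not proof of credibility.

```python
from urllib.parse import urlparse

# Suffix groups taken from the guidance above; the labels are illustrative only.
RESTRICTED_SUFFIXES = {"edu", "gov", "mil"}
OPEN_SUFFIXES = {"com", "net", "org"}


def classify_domain(url: str) -> str:
    """Label a URL by the last part of its host name."""
    host = urlparse(url).hostname or ""
    suffix = host.rsplit(".", 1)[-1].lower()
    if suffix in RESTRICTED_SUFFIXES:
        return "restricted"
    if suffix in OPEN_SUFFIXES:
        return "open"
    # Two-letter endings such as .ca or .uk are country codes.
    if len(suffix) == 2 and suffix.isalpha():
        return "country-code"
    return "other"


if __name__ == "__main__":
    for u in ("https://www.state.gov/", "http://example.com/a", "http://www.bbc.co.uk/"):
        print(u, "->", classify_domain(u))
```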
Information Aggregators
These are tools that pull in information from multiple sources and consolidate it into a smaller, more easily digested number of streams.
RSS Feeds (Google Reader, Bloglines): Pull blogs into a single stream of information
Spokeo: Big Brother Of Social Networking http://www.pandia.com/sew/620-spokeo.html
123people.com: Gateway to paid databases. Shows available websites around a specific name.
Pipl: The most comprehensive people search on the web
yoName: Searches social networks
Brizzly, SeesmicWeb, HootSuite, Dabr, Slandr, etc.: Real-time news
interceder.net: Email alerts, real-time news
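To make the "single stream" idea concrete, here is a minimal sketch of what an RSS aggregator does: parse several RSS 2.0 feeds and merge their items into one list, newest first. It uses only the Python standard library. The feed contents, titles, and `example.com` links are made up for illustration, and the code assumes every item carries a `pubDate`; real feeds vary and would need more defensive handling.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up RSS 2.0 feeds standing in for blogs being followed.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>A1</title><link>https://example.com/a1</link>
<pubDate>Mon, 01 Mar 2010 10:00:00 GMT</pubDate></item>
<item><title>A2</title><link>https://example.com/a2</link>
<pubDate>Wed, 03 Mar 2010 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>B1</title><link>https://example.com/b1</link>
<pubDate>Tue, 02 Mar 2010 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_feed(rss_xml):
    """Extract source, title, link, and publication time from one feed."""
    root = ET.fromstring(rss_xml)
    source = root.findtext("channel/title", default="")
    entries = []
    for item in root.iterfind("channel/item"):
        entries.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # Assumes pubDate is present and in the RFC 822 style RSS uses.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return entries


def merge_feeds(feeds):
    """Combine all feeds into one stream sorted newest first."""
    merged = [entry for xml_text in feeds for entry in parse_feed(xml_text)]
    return sorted(merged, key=lambda e: e["published"], reverse=True)


if __name__ == "__main__":
    for entry in merge_feeds([FEED_A, FEED_B]):
        print(entry["published"].date(), entry["source"], entry["title"])
```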
Website TOS, Privacy Laws And Proposed Regulations Social media is a key component of profiling a subject of investigation. The pool of information about each individual can form a distinctive “social signature.” However, privacy settings and anonymity limit the information you can access on a social network.
Issues With Anonymity We have a right to anonymity, but websites do not allow it under their TOS. How can you be anonymous online when sites ask for real information? If you set up a profile on Facebook, their TOS say that the profile is you. You must have a valid email address, but how do you know that a user is not using a random email address and name? It is not illegal for internet users to impersonate or create a false identity online. A site's popularity comes with vulnerability to attack. We are seeing an increase in SPOOFING, e.g. password-reset emails that give someone else ownership of your account. Be advised that an account under a person's name can be the result of spoofing and was not necessarily created by that person. In the context of network security, a spoofing attack is a situation in which one person or program successfully masquerades as another by falsifying data, thereby gaining an illegitimate advantage.
The Privacy Debate
We want privacy, yet we expose private details of our lives online.
Once you post something, you leave a digital footprint that is owned by the site.
Facebook has been receiving a lot of bad press. Users fear how their data might be used.
Privacy policies and TOS are constantly being changed.
We are seeing two different agendas among advocates in online privacy:
We put pressure on websites to protect our information, and we do reserve that right.
At the same time, because of the vast scope of information on social media, the government wants a backdoor to get information for investigations and terrorism research. This will leave personal information vulnerable to hackers...
Consider This…
There are different privacy laws in every country.
Check the TOS and privacy policy of each website. They may allow backdoors.
Privacy Settings It's important to understand the privacy laws and settings of the major social networks in order to understand their limitations and how to potentially work around them... Users can select their own privacy settings, and there are few ways to get around them.
Facebook Profiles Offer
Public Tweets:
Your updates appear in Twitter’s public timeline, a flowing river of every member’s status.
Anyone can see your Twitter updates.
Your Twitter updates can be indexed by search engines.
Protected Tweets:
People will have to request to follow you, and each follow request will need approval.
Your profile and Tweets will only be visible to users you've approved.
Protected profiles' Tweets will not appear in Twitter search.
@replies sent to people who aren't following you will not be seen.
You cannot share static page URLs with non-followers.
Default Settings: By default, people on MySpace can see when you’re online. Your profile and photo are also set to be viewable by everyone. Privacy Options: MySpace’s privacy options are very limited, but changing three key settings can provide you with some important privacy protection:
Tips & Tricks
If you have an email address you want to put a face to, you can find who owns it by searching for the address in the Facebook search window.
Anyone can create a fake profile, so use this to your advantage.
Some users allow friends of friends to access part, if not all, of a profile. Befriend a friend of someone you are investigating.
RESOURCE: How To Protect Your Privacy on Facebook, Myspace, And Linked In http://www.mint.com/blog/moneyhack/howto-protect-your-privacy-on-facebook-myspace-and-linkedin/
How do you get in and see info if it's been deleted?
Tweleted allowed you to recover deleted Twitter messages.
If user A quotes user B, who then removes the tweet, it will still show up in user A's quotes.
Properly Documenting Social Media Investigations Always copy URLs, because sometimes you can't backtrack. Google updates its results constantly, and with more than 20 billion websites out there, you may never find the same information again. Take screenshots of content (e.g. craigslist ads). Consider using Camtasia, a screen recording and editing program.
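One way to make a saved URL and screenshot easier to stand behind later is to log each capture with a timestamp and a fingerprint of the saved file. The sketch below is one possible approach, not a prescribed procedure; the function name `capture_record`, the field names, and the sample URL are all hypothetical. It uses Python's standard `hashlib` to compute a SHA-256 hash, which changes if the saved content is altered in any way.

```python
import hashlib
import json
from datetime import datetime, timezone


def capture_record(url: str, content: bytes, note: str = "") -> dict:
    """Build a log entry for one captured page or screenshot."""
    return {
        "url": url,
        # UTC timestamp of when the record was created.
        "captured_at": datetime.now(timezone.utc).isoformat(),
        # Fingerprint of the saved bytes (screenshot file, saved HTML, etc.).
        "sha256": hashlib.sha256(content).hexdigest(),
        "note": note,
    }


if __name__ == "__main__":
    # In practice `content` would be the bytes of a saved screenshot,
    # e.g. open("capture.png", "rb").read(); a literal is used here.
    record = capture_record(
        "https://example.com/listing/123",
        b"placeholder screenshot bytes",
        note="ad text and contact number visible",
    )
    print(json.dumps(record, indent=2))
```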
Organizational tools - Search for your captures by date, website, or a custom flag that you create and assign.
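The kind of search described here can be illustrated with a small in-memory example. This is only a sketch of the idea, not how any particular capture tool works; the `Capture` class, the `search` function, the flag names, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from urllib.parse import urlparse


@dataclass
class Capture:
    url: str
    captured_on: date
    flags: set = field(default_factory=set)  # custom labels you assign


def search(captures, on=None, website=None, flag=None):
    """Return captures matching every filter that was supplied."""
    results = []
    for c in captures:
        if on is not None and c.captured_on != on:
            continue
        if website is not None and urlparse(c.url).hostname != website:
            continue
        if flag is not None and flag not in c.flags:
            continue
        results.append(c)
    return results


# Made-up captures for illustration.
CAPTURES = [
    Capture("https://example.com/profile/1", date(2010, 3, 1), {"subject-a"}),
    Capture("https://example.org/ad/42", date(2010, 3, 1), {"subject-a", "follow-up"}),
    Capture("https://example.com/profile/2", date(2010, 3, 5), {"subject-b"}),
]

if __name__ == "__main__":
    for c in search(CAPTURES, website="example.com", flag="subject-a"):
        print(c.captured_on, c.url)
```

Filters combine, so a search by website and flag together narrows the list to captures that satisfy both.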
Centrifuge Systems Centrifuge has created a powerful approach to analysis called “Interactive Analytics”. Our next generation approach provides groundbreaking visualizations accessible from any browser and any operating system. “Interactive Analytics” (IA) is based on extensive work with the US Intelligence Community and brings together three innovations in analytics today: Interactive Data Visualization, Unified Data Views, and Collaborative Analysis. http://www.centrifugesystems.com/
AskSam.com A free-form database designed for users rather than programmers (like a CMS)
“Electronic mail is a godsend. With e-mail we needn’t worry about so much as a quiver in our voice or a tremor in our pinkie when telling a lie. Email is a first rate deception-enabler.” ~Keyes (2004) The Post-Truth Era
Online Deception The ambiguity of the Internet allows complete anonymity, providing the user with the ability to create false and misleading profiles and identities online, thus hiding their true identity.
This includes lies or exaggerations about one’s physical appearance, personality, or characteristics, or even small misrepresentations of a genuine characteristic, such as denying being a smoker or drinker. One can have ‘as many electronic personas as one has time and energy to create’ (Donath, 1999).
CASE STUDY ON DECEPTION ON FACEBOOK
STUDY: Research from the University of Texas at Austin suggests that users express their true personality – not an embellished identity – over online social networks such as Facebook. The Texas researchers collected 236 profiles of college-aged users of Facebook in the United States and StudiVZ, the equivalent in Germany. The users filled out questionnaires about their personality and also about who they'd like to be. Strangers browsed and rated the online profiles, and the study authors compared the ratings with the users' questionnaires.
FINDINGS: Networks such as Facebook are more “genuine mediums for social interactions than vehicles for self-promotion.” But whether honesty on Facebook comes naturally or is necessitated by your audience is up for debate. “You don't have full control over it. Other people can write things on your wall and tag you in unflattering photos, etc.,” stated Professor Hancock.
Detecting Deception Inconsistencies in actions or words do not necessarily indicate a lie, just as consistency is not necessarily a guarantee of the truth. However, a pattern of inconsistencies or unexplainable behavior normally indicates deceit.
Techniques For Identifying Deceit
Repeat Questions
Should not be exact repetitions of an earlier question.
The investigator must rephrase or otherwise disguise the previous question.
Repeat questions also need to be separated in time from the original question so the information cannot easily be remembered.
Control Questions
Developed from recently confirmed or known information that is not likely to have changed. If the answer to a control question is not given as expected, it may be an indicator of deceit. Example: Q1 – What was the score of the baseball game? A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money.... Topical Examples:
Internal Inconsistencies Frequently when someone is lying, an investigator will be able to identify inconsistencies in the timeline, the circumstances surrounding key events, or other areas within the questioning. For example, someone spends a long time explaining something that took a short time to happen, or a short time telling of an event that took a relatively long time to happen. Example: Q1 – What was the score of the baseball game? A1 – Well, first of all, you wouldn’t believe how much the tickets cost; then I had to get something to eat, which is a total waste of money....
“Placement” and “Access” Based on a person’s job, geographical location, age, etc., investigators should have a basic idea of the breadth and depth of information that such a person should know. When answers show that someone does not have the expected level of information (too much or too little or different information than expected), this may be an indicator of deceit.
Repeated Information
Often, if someone plans on lying about a topic, they will memorize or practice exactly what they are going to say.
If they always relate an incident using exactly the same wording, or answer ‘repeat’ questions word for word as they answered the original question, it may be an indicator of deceit.
Example: In an extreme case, if someone is interrupted in the middle of a statement on a given topic, they will have to start again at the beginning in order to “get the story straight.”
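When the statements are written (messages, posts, emails), word-for-word repetition can be checked mechanically. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to score how closely two answers share the same wording; the function names and sample statements are invented for illustration. A score near 1.0 only flags a pair for a closer look and is not evidence of deceit on its own.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list:
    """Lowercase, strip punctuation, and split into words."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).split()


def wording_similarity(first: str, second: str) -> float:
    """Score from 0.0 to 1.0 of how much two statements share word order."""
    return SequenceMatcher(None, normalize(first), normalize(second)).ratio()


if __name__ == "__main__":
    # Invented statements for illustration.
    original = "I left the office at six and drove straight home."
    repeated = "I left the office at six, and drove straight home!"
    rephrased = "After work I went home, I think around six."

    print(wording_similarity(original, repeated))
    print(wording_similarity(original, rephrased))
```

Comparing word sequences rather than raw characters keeps punctuation and capitalization from hiding an otherwise identical answer.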
Incongruent Appearance and Incongruent Language If someone’s online appearance does not match their story, it may be an indication of deceit. If the type of language, including sentence structure and vocabulary, does not match the story, this may also be an indicator of deceit. Example: If the suspected liar does not use the proper technical vocabulary to match an otherwise familiar story, this may be an indicator of deceit.