Rediscover the hidden Facebook semantic search engine, reimagine open-source intelligence


unchained Facebook Graph Search, Akos Bardoczi, open-source intelligence professional, research OPSEC advisor

Published in: Internet

  1. Rediscovering the hidden Facebook semantic search engine and reimagining open-source intelligence: unchained Graph Search
  2. Some important things about OSINT • In most cases, the most efficient technique is little known and hidden, but not (too) difficult to use! • The misunderstood deep web: the popular story about the size of the deep web is based on 16-year-old research… • The misunderstood deep web #2: it usually cannot provide up-to-date, relevant information
  3. About Facebook Graph Search • Announced by Facebook in January 2013 • Facebook almost immediately killed the advanced search box due to privacy concerns – but the search options remain available • The decline of semantic search: huge machine-learning failures, for example because users entered too few complex search queries • After 4 years, still available only in US English
  4. Example • “Budapest University of Technology and Economics students who are Budapest, Hungary residents and like Shakira” (sic!) – as a typed query, this generates only a simple keyword search without relevant results • the dents/106502519386806/residents/5027904559/likers/intersect URL generates a smart, semantic search query
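To make the structure of the example URL visible, here is a minimal Python sketch that splits a Graph Search style path into (entity_id, keyword) pairs. It only manipulates strings and sends no requests. The function name `parse_query_path` is hypothetical, and because the example URL is truncated in the transcript, only its intact tail is parsed; the assumption that the path strictly alternates ID and keyword segments comes from the URI scheme on the next slide.

```python
from typing import List, Tuple


def parse_query_path(path: str) -> Tuple[List[Tuple[str, str]], bool]:
    """Split a path into (entity_id, keyword) pairs.

    Returns the pairs plus a flag telling whether the path ended in
    the 'intersect' segment. Hypothetical helper, not a Facebook API.
    """
    segments = [s for s in path.strip("/").split("/") if s]
    has_intersect = bool(segments) and segments[-1] == "intersect"
    if has_intersect:
        segments = segments[:-1]
    if len(segments) % 2 != 0:
        # Assumption: segments alternate entity_id, keyword, entity_id, ...
        raise ValueError("expected alternating entity_id/keyword segments")
    pairs = list(zip(segments[0::2], segments[1::2]))
    return pairs, has_intersect


# Only the intact tail of the slide's truncated example is used here.
pairs, intersect = parse_query_path(
    "106502519386806/residents/5027904559/likers/intersect"
)
print(pairs)      # [('106502519386806', 'residents'), ('5027904559', 'likers')]
print(intersect)  # True
```

Read against the typed query, the two pairs presumably correspond to "Budapest, Hungary residents" and "like Shakira"; the transcript does not state which ID maps to which entity.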
  5. The universal scheme of GS’s URI structure • [n]/string_term[n]/entity_id[m]/keyword[m]/(intersect) • the “search” part is optional in some cases • the “str” part (optional) marks simple keyword terms and must be placed before the other terms • the “intersect” part is mandatory in complex queries
  6. More about the scheme of GS’s URI • the “entity_id” identifies an entity, e.g. a name, place, religious view, or spoken language – see below – you can find it in the client-side code • There is no limit on query complexity or length • The “keyword” indicates the type of the entity, which matters: e.g. a university can be treated as a physical location, as a school, or as a workplace
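The scheme described on the last two slides can be sketched as a small path builder. This is an illustration under stated assumptions, not the author's tool: the function name `build_query_path` is hypothetical, the host and any leading "search" part are left out because the transcript truncates them, each keyword term is assumed to be preceded by its own "str" segment, and a query is treated as "complex" (and therefore gets "intersect") as soon as it has more than one term.

```python
from typing import List, Sequence, Tuple


def build_query_path(
    entities: Sequence[Tuple[str, str]],
    string_terms: Sequence[str] = (),
) -> str:
    """Assemble the path part of a query from the slide's scheme.

    entities:     (entity_id, keyword) pairs, in "sentence" order.
    string_terms: plain keyword terms; they are placed first, as the
                  slide requires for the "str" part.
    """
    segments: List[str] = []
    for term in string_terms:
        # Assumption: one "str" marker per plain keyword term.
        segments += ["str", term]
    for entity_id, keyword in entities:
        segments += [str(entity_id), keyword]
    # Assumption: more than one term makes the query "complex".
    if len(string_terms) + len(entities) > 1:
        segments.append("intersect")
    return "/".join(segments)


path = build_query_path(
    [("106502519386806", "residents"), ("5027904559", "likers")]
)
print(path)  # 106502519386806/residents/5027904559/likers/intersect
```

Keeping the entities as an ordered sequence rather than a set is deliberate, since the slides note that the order of terms usually matters.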
  7. Some important things • The queries work only with US English Facebook, but: • the URI may contain any characters after the “str” part, e.g. Москва or مطار دبي الدولي • The “word order of the sentence” matters in most cases
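Non-ASCII terms such as Москва can be typed into the address bar as they are; browsers percent-encode them as UTF-8 before sending the request. The sketch below shows that encoding with Python's standard `urllib.parse.quote`; the helper name `encode_term` is hypothetical.

```python
from urllib.parse import quote, unquote


def encode_term(term: str) -> str:
    """Percent-encode one path segment as UTF-8 (no characters kept safe)."""
    return quote(term, safe="")


encoded = encode_term("Москва")
print(encoded)           # %D0%9C%D0%BE%D1%81%D0%BA%D0%B2%D0%B0
print(unquote(encoded))  # Москва
```

Passing `safe=""` also encodes "/", so a term containing a slash cannot be mistaken for two path segments.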
  8. What can you search with Graph Search? • Basically almost anything! • In theory, you can find any content, and any relations between entities and content, that your permissions allow you to view
  9. Most frequently used Graph Search keywords (listed here without explanation) • pages-liked, photos, photos-by, photos-liked, photos-of, photos-tagged, photos-commented, videos, videos-by, videos-of, videos-liked, videos-commented, apps-used, stories-by, stories-commented, stories-tagged, friends, events, events-joined, events-interested, places-visited, places-liked, groups, users-named, home-residents, residents (/present, /past), likers, users-age, users-born, users-political-view, visitors, employees (/present, /past), speakers, users-checked-in, photos-in, videos-in, stories-keyword, date (YYYY), date-2 (MM/YYYY), date-3 (DD/MM/YYYY), react, studied (/present, /past)
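When assembling queries by hand, a small lookup against this list helps catch typos in keywords. The following is a minimal sketch: the names `KEYWORDS`, `QUALIFIABLE`, `QUALIFIERS`, and `is_known_keyword` are hypothetical, the set contains only the keywords from this slide (which lists the most frequent ones, not all of them), and the "/present" and "/past" qualifiers are accepted only on the three keywords the slide marks with them.

```python
# Keywords exactly as listed on the slide; not an exhaustive vocabulary.
KEYWORDS = frozenset({
    "pages-liked", "photos", "photos-by", "photos-liked", "photos-of",
    "photos-tagged", "photos-commented", "videos", "videos-by", "videos-of",
    "videos-liked", "videos-commented", "apps-used", "stories-by",
    "stories-commented", "stories-tagged", "friends", "events",
    "events-joined", "events-interested", "places-visited", "places-liked",
    "groups", "users-named", "home-residents", "residents", "likers",
    "users-age", "users-born", "users-political-view", "visitors",
    "employees", "speakers", "users-checked-in", "photos-in", "videos-in",
    "stories-keyword", "date", "date-2", "date-3", "react", "studied",
})
# Keywords the slide shows with a (/present, /past) qualifier.
QUALIFIABLE = frozenset({"residents", "employees", "studied"})
QUALIFIERS = frozenset({"present", "past"})


def is_known_keyword(keyword: str) -> bool:
    """Return True if the keyword (with optional qualifier) is on the slide."""
    base, sep, qualifier = keyword.partition("/")
    if base not in KEYWORDS:
        return False
    if not sep:
        return True
    return base in QUALIFIABLE and qualifier in QUALIFIERS


print(is_known_keyword("residents/present"))  # True
print(is_known_keyword("likers/past"))        # False
```

A `False` result only means the keyword is not on this slide; it does not prove that Graph Search rejects it.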
  10. The scope of search in practice • As mentioned, any content: text in status updates, comments, image descriptions, images, geotags, likes and other reactions on public pages, on event pages, and on users’ timelines (even items hidden from the timeline!) • The full content of open groups, and the full content of closed and secret groups if you are a member • Basically anything except items specifically deleted by the user
  11. The scope of search in practice 0x200. • Keep in mind the audience selectors – and bypass them • Your scope will grow exponentially with more friends and after joining more groups – note: the average distance between two randomly chosen users is 3.5 and users have 300 contacts on average, but the maximum number of friends is 5000 and you can join 5000 different groups
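The growth claim can be illustrated with back-of-the-envelope arithmetic using the slide's figures. This is a deliberately naive model and an assumption of this sketch, not something the slides state: it treats every contact list as disjoint, so it gives an upper bound, and the function name `naive_reach` is hypothetical.

```python
def naive_reach(own_contacts: int, avg_contacts: int, hops: int) -> int:
    """Upper bound on accounts within `hops` steps, ignoring any overlap.

    own_contacts: number of friends of the searching account.
    avg_contacts: average number of contacts of everyone else.
    hops:         1 = friends, 2 = friends of friends, and so on.
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return own_contacts * avg_contacts ** (hops - 1)


# An average account (300 contacts) versus one at the 5000-friend limit.
print(naive_reach(300, 300, 2))   # 90000
print(naive_reach(5000, 300, 2))  # 1500000
```

Real contact lists overlap heavily, so actual reach is smaller, but the comparison shows why a well-connected account sees far more than an average one.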
  12. The scope of search in practice 0x300. • You will need a professionally crafted, realistic character for a Facebook user, depending on your research interest • A professionally crafted character [actor] is not a simple fake profile – and I think this is the most difficult part – see also OPSEC
  13. OPSEC considerations @ sophisticated research • In practice, you cannot run completely clean searches – e.g. the order of results depends on everything, including previous searches • Don’t try to use widely used anonymizer techniques such as Tor – Facebook will detect it! • The best practices are similar to those of a forensics lab and of HUMINT
  14. OPSEC considerations @ sophisticated research 0x200. • You will need a spare, non-virtual SIM card that has never been used before • Depending on the sensitivity of the research, you may need a photoshopped government-issued ID – don’t worry, nowadays researchers can generate realistic faces [difficult && not my business] • Facebook reserves the right to deactivate or temporarily suspend accounts; let’s minimize this risk
  15. OPSEC considerations @ sophisticated research 0x300. • It is recommended to use a virtual machine with default browser settings – see also: browser fingerprinting • Once again – do not use Tor! – use a reliable VPN provider instead, and keep in mind that your IP address is associated with an approximate location, which affects the order of the search results you receive
  16. OPSEC considerations @ sophisticated research 0x400. • Facebook tracks user behavior – e.g. statistics on keystroke speed, including what you deleted from a text field, and the distribution of different operations; in short, your every click • Of course, never mix your actor’s behavior with your own – e.g. don’t send a friend request to someone you know personally
  17. Tailor your actor’s character and behavior to the specific research field • It is more complicated than you think • An ideal actor is similar to a secret agent who is familiar with the language, the culture, and the language-culture (!!) of the given context – e.g. counterterrorism, social psychology research, or cyber-threat intelligence • In some cases you simply don’t need an actor; you can search via your own account