
Building and Using a Knowledge Graph to Combat Human Trafficking

Presented at the 14th International Semantic Web Conference (ISWC 2015). Paper was selected Best Application Paper.


  1. 1. Building and Using a Knowledge Graph to Combat Human Trafficking. Pedro Szekely, Craig Knoblock, Jason Slepicka, Andrew Philpot, Amandeep Singh, Chengye Yin, Dipsy Kapoor, Prem Natarajan, Daniel Marcu, Kevin Knight, David Stallard, Subessware S. Karunamoorthy, Rajagopal Bojanapalli, Steven Minton, Brian Amanatullah, Todd Hughes, Mike Tamayo, David Flynt, Rachel Artiss, Shih-Fu Chang, Tao Chen, Gerald Hiebel and Lidia Ferreira. Information Sciences Institute, University of Southern California; Columbia University; Inferlink; Next Century; NASA JPL
  2. 2. Profits per Year: $32 Billion. Average Age of Entry to Prostitution in the US: 14. Pimp's Profit per Victim per Year: $150,000. Advertising Budget on the Web: $45 Million.
  3. 3. Find the locations where a potential victim of human trafficking was advertised
  4. 4. San Diego, where else?
  5. 5. Example: Find the locations where a potential victim of human trafficking was advertised. More than 100 million pages advertising adult services.
  6. 6. “… showing how the Semantic Web can solve problems that end users have right now”; “A Semantic Web application is one whose schema is expected to change” (David Karger, keynote, ESWC 2013)
  7. 7. Reusable technology for building domain-specific search. Pipeline: Data Acquisition (Crawling, Extraction); Mapping to Ontology; Entity Linking & Similarity; Knowledge Graph Deployment; Query & Visualization. Components shown: Elastic Search, Graph DB, schema.org, geonames.
  8. 8. Web Crawling: 24/7, 2,000 pages/hour, 68,000,000 pages total.
  9. 9. Reusable technology for building domain-specific search (pipeline overview, repeated from slide 7)
  10. 10. Semi-Structured Extraction: HTML to JSON
  11. 11. Text Extraction. Ad text: “YOU don't wanna miss out on ME :) Perfect lil booty Green eyes Long curly black hair Im a Irish,Armenian and Filipino mixed princess :) ❤ Kim ❤ 7○7~7two7~7four77 ❤ HH 80 roses ❤ Hour 120 roses ❤ 15 mins 60 roses”. Extracted fields: name: Kim; eye-color: green; hair-color: black; phone: 707-727-7477; rate: $60/15min, $80/30min, $120/60min.
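The slide shows only the input ad and the extracted fields, not the extractor itself. As a minimal sketch of how the phone number and rates in this example could be recovered, the code below uses plain regular expressions. The helper names, the digit-word table, the treatment of "○" as zero, and the reading of "HH" as a half hour and "roses" as dollars are assumptions made for illustration; they are not the authors' method.

```python
import re

# Assumption: spelled-out digits and zero look-alikes are the only obfuscations handled.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def normalize_phone(token):
    """Turn an obfuscated phone token into NNN-NNN-NNNN, or None if it is not 10 digits."""
    text = token.lower()
    # Replace digit words first, so the "o" in "two" or "four" is not turned into a zero.
    for word, digit in NUMBER_WORDS.items():
        text = text.replace(word, digit)
    text = re.sub(r"[o○]", "0", text)   # letter o and white circle used as zero
    digits = re.sub(r"\D", "", text)    # drop separators such as "~"
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

# Assumption: "HH" means half hour and prices are quoted in "roses".
RATE_PATTERN = re.compile(r"(HH|Hour|\d+\s*mins?)\s+(\d+)\s+roses", re.IGNORECASE)
DURATION_MINUTES = {"hh": 30, "hour": 60}

def extract_rates(text):
    """Return a mapping from duration in minutes to quoted price."""
    rates = {}
    for duration, price in RATE_PATTERN.findall(text):
        key = duration.lower()
        minutes = DURATION_MINUTES.get(key) or int(re.match(r"\d+", key).group())
        rates[minutes] = int(price)
    return rates

if __name__ == "__main__":
    print(normalize_phone("7○7~7two7~7four77"))
    print(extract_rates("❤ HH 80 roses ❤ Hour 120 roses ❤ 15 mins 60 roses"))
```

A production extractor would need to locate the phone token inside free text first; this sketch assumes the token has already been isolated.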
  12. 12. Reusable technology for building domain-specific search (pipeline overview, repeated from slide 7)
  13. 13. Mapping to Ontology: one ontology, many schemas
  14. 14. Ontology plus some extensions
  15. 15. Karma: Mapping Data to Ontologies. Inputs: services, relational sources, hierarchical sources. Output: JSON-LD.
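To make the mapping step concrete, here is a small sketch that turns one flat extracted record into a nested JSON-LD style document. The property names (`availableAt`, `seller`, `itemProvided`, `eyeColor`, and so on) are the ones visible in the example graph on the later slides. Everything else is an assumption for illustration: the `map_to_ontology` helper, the input field names, the `@id` scheme, the `@context` value, and the choice to attach the phone to the seller. This is not Karma's API or its actual output.

```python
import json

def map_to_ontology(record, doc_id):
    """Map a flat extracted record to a nested JSON-LD style Offer (hypothetical layout)."""
    return {
        "@context": "http://schema.org",   # assumption: schema.org plus project extensions
        "@id": f"offer/{doc_id}",          # assumption: illustrative identifier scheme
        "@type": "Offer",
        "availableAt": record.get("city"),
        "price": record.get("rate"),
        "startDate": record.get("posted"),
        "seller": {
            "@type": "Person",
            "phone": record.get("phone"),  # assumption: phone attached to the seller
        },
        "itemProvided": {
            "@type": "AdultService",       # extension class named on the slides
            "name": record.get("name"),
            "eyeColor": record.get("eye_color"),
            "hairColor": record.get("hair_color"),
        },
    }

if __name__ == "__main__":
    extracted = {
        "city": "Santa Barbara", "rate": "250/hour", "posted": "2014-12-07",
        "phone": "619-319-7315", "name": "Jessica",
        "eye_color": "blue", "hair_color": "red",
    }
    print(json.dumps(map_to_ontology(extracted, "1"), indent=2))
```

The point of the sketch is the "one ontology, many schemas" idea: each source would get its own small mapping like this one, and all of them would emit the same target vocabulary.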
  16. 16. Karma Demo
  17. 17. 68 Million Documents Mapped to the Ontology
  18. 18. Karma Connects Graphs on Strong Attributes. Example graph 1 (Offer-1, Person-1, AdultService-1; edges seller and itemProvided): availableAt Santa Barbara, phone 619-319-7315, price 250/hour, startDate 2014-12-07, hairColor red, eyeColor blue, name Jessica. Example graph 2 (Offer-2, Person-2, AdultService-2; edges seller and itemProvided): availableAt Washington DC, phone and email (values not shown), price 250/hour, startDate 2014-05-28, eyeColor blue, name Jessica.
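One way to read "connects graphs on strong attributes" is that records sharing a highly identifying value, such as a phone number or email address, are merged into one connected component. The sketch below illustrates that reading with a small union-find. The function name, the record layout, and the choice of `phone` and `email` as the strong attributes are assumptions, and `Offer-3` with its phone number is a made-up record added only to show a non-match.

```python
from collections import defaultdict

def link_on_strong_attributes(records, strong_keys=("phone", "email")):
    """Group record ids that share any value of a strong attribute (union-find sketch)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # (attribute, value) -> id of the first record that carried it
    for r in records:
        for key in strong_keys:
            value = r.get(key)
            if not value:
                continue
            owner = first_seen.setdefault((key, value), r["id"])
            union(r["id"], owner)

    clusters = defaultdict(set)
    for r in records:
        clusters[find(r["id"])].add(r["id"])
    return sorted(sorted(c) for c in clusters.values())

if __name__ == "__main__":
    offers = [
        {"id": "Offer-1", "phone": "619-319-7315", "availableAt": "Santa Barbara"},
        {"id": "Offer-2", "phone": "619-319-7315", "availableAt": "Washington DC"},
        {"id": "Offer-3", "phone": "555-010-0000", "availableAt": "San Diego"},  # hypothetical
    ]
    print(link_on_strong_attributes(offers))
```

Weak attributes such as name or eye color are deliberately ignored here, since many unrelated ads share them.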
  19. 19. Reusable technology for building domain-specific search (pipeline overview, repeated from slide 7)
  20. 20. Using Text Similarity to Connect the Dots. Ad 1: “E M I LY SEXY.** wHiTe/lATin girl **bUsTy SWEET.LoTs Of fUn. Call Me. O_U_T_C___A___L_L_S”. Ad 2: “LAYLA SEXY.** wHiTe girl ** bUsTy SWEET.LoTs Of fUn.Call Me. O____U____T____C___A___L____L____S”. Ad 3: “LI LA SEXY.** WhiTe girl ** bUsTy SWEET.LoTs Of fUn.Call Me. O_U_T_C___A___L_L_S”.
  21. 21. Using Image Similarity to Connect the Dots. 80 million images. Technology: deep learning.
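The slide does not name the similarity measure that was used. As one common way to catch near-duplicate ads like the three above, the sketch below strips case, spacing, and punctuation, and then compares sets of character 4-grams with Jaccard similarity. The function names and the 4-gram size are assumptions chosen for illustration.

```python
import re

def shingles(text, n=4):
    """Return the set of character n-grams after removing case, spacing, and punctuation."""
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}

def jaccard(a, b, n=4):
    """Jaccard similarity of the two texts' shingle sets, in the range 0.0 to 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

AD_EMILY = "E M I LY SEXY.** wHiTe/lATin girl **bUsTy SWEET.LoTs Of fUn. Call Me. O_U_T_C___A___L_L_S"
AD_LAYLA = "LAYLA SEXY.** wHiTe girl ** bUsTy SWEET.LoTs Of fUn.Call Me. O____U____T____C___A___L____L____S"
AD_LILA = "LI LA SEXY.** WhiTe girl ** bUsTy SWEET.LoTs Of fUn.Call Me. O_U_T_C___A___L_L_S"

if __name__ == "__main__":
    print(round(jaccard(AD_EMILY, AD_LAYLA), 2))
    print(round(jaccard(AD_LAYLA, AD_LILA), 2))
```

Normalizing before shingling is what makes the varying underscore runs and mixed capitalization irrelevant; at 68 million documents a real system would likely need an approximate scheme such as MinHash rather than pairwise comparison.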
  22. 22. Connecting Nodes Using All Attributes. [Figure: the two example graphs from slide 18 (Offer-1, Person-1, AdultService-1 and Offer-2, Person-2, AdultService-2).]
  23. 23. Connecting Nodes Using All Attributes. [Figure: the two example graphs from slide 18, annotated with "same victim" and "same trafficker" links.]
  24. 24. Reusable technology for building domain-specific search (pipeline overview, repeated from slide 7)
  25. 25. SPARQL vs. ElasticSearch. Scale (> 100 million docs, > 1 billion triples): SPARQL challenging, ElasticSearch easy. Text + structured query: SPARQL restricted, ElasticSearch native support. Faceted browsing: SPARQL hard, ElasticSearch easy. Familiar to developers: SPARQL no, ElasticSearch yes.
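To illustrate what "text + structured query" and "faceted browsing" look like in practice, the sketch below builds an ElasticSearch query body that combines a full-text `match`, a structured `term` filter, and a `terms` aggregation for facets. The query DSL constructs are standard ElasticSearch; the field names (`text`, `itemProvided.eyeColor`, `availableAt`) and the `build_search_body` helper are assumptions, since the deck does not show the actual index mapping. The body is only constructed and printed, not sent to a server.

```python
import json

def build_search_body(text, eye_color, facet_field="availableAt", size=10):
    """Build a query body mixing free text, a structured filter, and a facet."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text part: scored match on the ad text (hypothetical field name).
                "must": [{"match": {"text": text}}],
                # Structured part: exact, unscored filter on an extracted attribute.
                "filter": [{"term": {"itemProvided.eyeColor": eye_color}}],
            }
        },
        # Facet part: count matching documents per location.
        "aggs": {"by_location": {"terms": {"field": facet_field}}},
    }

if __name__ == "__main__":
    print(json.dumps(build_search_body("busty sweet lots of fun", "blue"), indent=2))
```

The same request returns both the ranked hits and the per-location counts, which is the combination the comparison above rates as restricted or hard in SPARQL.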
  26. 26. Create Unified Database. [Figure: the two example graphs from slide 18.]
  27. 27. One Index Per Main Class. [Figure: the two example graphs from slide 18.]
  28. 28. Offers As Roots. [Figure: the two example graphs from slide 18, with the phone number 619-319-7315 shown a second time.]
  29. 29. Adult Service As Roots. [Figure: the two example graphs from slide 18 with AdultService-1 and AdultService-2 as roots, an offers edge in place of itemProvided, the phone number 619-319-7315 shown twice, and the email swedebeauty@gmail.com on the second graph.]
  30. 30. ElasticSearch Data Model: Adult Service, Offer, Person, Phone, Web Page. Efficient indexing and query.
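The preceding slides suggest that the graph is stored as tree-shaped documents, with one index per main class and each document rooted at a node of that class. A minimal sketch of that denormalization is shown below: starting from a root node, every reference to another node is replaced by a nested copy of that node. The node table layout, the `type` key, and both helper names are assumptions, and attaching `phone` to the person is a guess about the example graph.

```python
def build_document(root_id, nodes, max_depth=3):
    """Expand a node into a nested document by inlining the nodes it references."""
    doc = {"id": root_id}
    for prop, value in nodes[root_id].items():
        if isinstance(value, str) and value in nodes and max_depth > 0:
            # The value names another node: embed a copy so the document is self-contained.
            doc[prop] = build_document(value, nodes, max_depth - 1)
        else:
            doc[prop] = value
    return doc

def build_index(nodes, root_type):
    """Return one nested document per node of the given class (one index per main class)."""
    return [build_document(node_id, nodes)
            for node_id, node in nodes.items()
            if node.get("type") == root_type]

if __name__ == "__main__":
    # Hypothetical node table based on the example graph in the slides.
    graph = {
        "Offer-1": {"type": "Offer", "availableAt": "Santa Barbara", "price": "250/hour",
                    "startDate": "2014-12-07", "seller": "Person-1",
                    "itemProvided": "AdultService-1"},
        "Person-1": {"type": "Person", "phone": "619-319-7315"},
        "AdultService-1": {"type": "AdultService", "name": "Jessica",
                           "eyeColor": "blue", "hairColor": "red"},
    }
    for document in build_index(graph, "Offer"):
        print(document)
```

Shared nodes such as a phone number end up copied into every document that reaches them, which trades storage for queries that never need a join. Rooting documents at AdultService would additionally require inverse edges (the `offers` edge on slide 29), which this sketch does not generate.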
  31. 31. Reusable technology for building domain-specific search (pipeline overview, repeated from slide 7)
  32. 32. Find the locations where a potential victim of human trafficking was advertised
  33. 33. Impact
  34. 34. Deployed to Law Enforcement and NGOs
  35. 35. Conclusions: • Using an ontology to integrate data • Continuous schema evolution • ElasticSearch as an RDF store • Using a JSON-based tool chain • Deployment of a large Semantic Web app
  36. 36. We Are Hiring: isi.edu/integration
