Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

The Relevance of the Apache Solr Semantic Knowledge Graph

576 views

Published on

The Semantic Knowledge Graph is an Apache Solr plugin that can be used to discover and rank the relationships between any arbitrary queries or terms within the search index. It is a relevancy swiss army knife, able to discover related terms and concepts, disambiguate different meanings of terms given their context, cleanup noise in datasets, discover previously unknown relationships between entities across documents and fields, rank lists of keywords based upon conceptual cohesion to reduce noise, summarize documents by extracting their most significant terms, generate recommendations and personalized search, and power numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. This talk will walk you through how to setup and use this plugin in concert with other open source tools (probabilistic query parser, SolrTextTagger for entity extraction) to parse, interpret, and much more correctly model the true intent of user searches than traditional keyword-based search approaches.

Published in: Software
  • Be the first to comment

  • Be the first to like this

The Relevance of the Apache Solr Semantic Knowledge Graph

  1. 1. The Relevance of the Apache Solr Semantic Knowledge Graph Trey Grainger SVP of Engineering, Lucidworks 2017.04.11
  2. 2. Trey Grainger SVP of Engineering • Previously Director of Engineering @ CareerBuilder • MBA, Management of Technology – Georgia Tech • BA, Computer Science, Business, & Philosophy – Furman University • Information Retrieval & Web Search - Stanford University Other fun projects: • Co-author of Solr in Action, plus numerous research papers • Frequent conference speaker • Founder of Celiaccess.com, the gluten-free search engine • Lucene/Solr contributor • Startup Investor / Advisor About Me
  3. 3. • “A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain” • A multi-dimensional term-to-term (vs. term-to-document) search engine • A tool which enables knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems • It’s kind of like Word2Vec, but better suited for interpreting the nuanced intent of typical search queries What is the Semantic Knowledge Graph?
  4. 4. • Introduction • Overview of Semantic Knowledge Graph - Index Structure - Graph Traversal Mechanics - Edge Weights/Relevancy Scoring • Use Cases - Document Summarization / Content-based Recommendations - Predictive Analytics - Data Cleansing - Document Classification & Enrichment - Query Disambiguation - Semantic Search / Concept Expansion • Installation • Live Demos • Q&A Agenda
  5. 5. Based in San Francisco, offices and employees worldwide Fusion, the platform for building data-driven, smart apps Consulting and support for organizations using Solr Produces the world’s largest open source user conference dedicated to Lucene/Solr Lucidworks is the primary commercial contributor to the Apache Solr project Employs over 40% of the active committers on the Solr project Contributes over 70% of Solr's open source codebase 40% 70%
  6. 6. We are a global company Our 8 offices serve over 400 customers.
  7. 7. We solve big problems for the world’s brightest businesses.
  8. 8. Fusion powers search for the brightest companies in the world.
  9. 9. Knowledge Graph
  10. 10. Knowledge Graph
  11. 11. Why do we care? User’s Query: machine learning research and development Portland, OR software engineer AND hadoop, java Traditional Query Parsing: (machine AND learning AND research AND development AND portland) OR (software AND engineer AND hadoop AND java) Semantic Query Parsing: "machine learning" AND "research and development" AND "Portland, OR" AND "software engineer" AND hadoop AND java Semantically Expanded Query: ("machine learning"^10 OR "data scientist" OR "data mining" OR "artificial intelligence") AND ("research and development"^10 OR "r&d") AND AND ("Portland, OR"^10 OR "Portland, Oregon" OR {!geofilt pt=45.512,-122.676 d=50 sfield=geo}) AND ("software engineer"^10 OR "software developer") AND (hadoop^10 OR "big data" OR hbase OR hive) AND (java^10 OR j2ee)
  12. 12. Terminology / Background
  13. 13. A Graph DSAA 2016 Montreal Quebec Canada Semantic Knowledge Graph Paper Trey Grainger Mohammed Korayem Andries Smith Khalifeh AlJadda in_country Node / Vertex Edge
  14. 14. Term Documents a doc1 [2x] brown doc3 [1x] , doc5 [1x] cat doc4 [1x] cow doc2 [1x] , doc5 [1x] … ... once doc1 [1x], doc5 [1x] over doc2 [1x], doc3 [1x] the doc2 [2x], doc3 [2x], doc4[2x], doc5 [1x] … … Document Content Field doc1 once upon a time, in a land far, far away doc2 the cow jumped over the moon. doc3 the quick brown fox jumped over the lazy dog. doc4 the cat in the hat doc5 The brown cow said “moo” once. … … What you SEND to Lucene/Solr: How the content is INDEXED into Lucene/Solr (conceptually): The inverted index
  15. 15. /solr/collection/select/?q=apache solr Term Documents … … apache doc1, doc3, doc4, doc5 … hadoop doc2, doc4, doc6 … … solr doc1, doc3, doc4, doc7, doc8 … … doc5 doc7 doc8 doc1 doc3 doc4 solr apache apache solr Matching queries to documents
  16. 16. BM25 (Okapi “Best Match” 25th Iteration) Score(q, d) = ∑ idf(t) · ( tf(t in d) · (k + 1) ) / ( tf(t in d) + k · (1 – b + b · |d| / avgdl ) t in q Where: t = term; d = document; q = query; i = index tf(t in d) = numTermOccurrencesInDocument ½ idf(t) = 1 + log (numDocs / (docFreq + 1)) |d| = ∑ 1 t in d avgdl = = ( ∑ |d| ) / ( ∑ 1 ) ) d in i d in i k = Free parameter. Usually ~1.2 to 2.0. Increases term frequency saturation point. b = Free parameter. Usually ~0.75. Increases impact of document normalization.
  17. 17. The Idea
  18. 18. Knowledge Graph Challenges of building a traditional knowledge graph Because current knowledge bases / ontology learning systems typically requires explicitly modeling nodes and edges into a graph ahead of time, this unfortunately presents several limitations to the use of such a knowledge graph: • Entities not modeled explicitly as nodes have no known relationships to any other entities. • Edges exist between nodes, but not between arbitrary combinations of nodes, and therefore such a graph is not ideal for representing nuanced meanings of an entity when appearing within different contexts, as is common within natural language. • Substantial meaning is encoded in the linguistic representation of the domain that is lost when the underlying textual representation is not preserved: phrases, interaction of concepts through actions (i.e. verbs), positional ordering of entities and the phrases containing those entities, variations in spelling and other representations of entities, the use of adjectives to modify entities to represent more complex concepts, and aggregate frequencies of occurrence for different representations of entities relative to other representations. • It can be an arduous process to create robust ontologies, map a domain into a graph representing those ontologies, and ensure the generated graph is compact, accurate, comprehensive, and kept up to date.
  19. 19. 01 Knowledge Graph Semantic Data Encoded into Free Text Content e en eng engi engineer engineers engineer engineersNode Type: Term software engineer software engineers electrical engineering engineer engineering software … … … Node Type: Character Sequence Node Type: Term Sequence Node Type: Document id: 1 text: looking for a software engineerwith degree in computer science or electrical engineering id: 2 text: apply to be a software engineer and work with other great software engineers id: 3 text: start a great careerin electrical engineering … …
  20. 20. Serves as a “data science toolkit” API that allows dynamically navigating and pivoting through multiple levels of relationships between items in a domain. Semantic Knowledge Graph API Core similarity engine, exposed via API Any product can leverage the core relationship scoring engine to score any list of entities against any other list Full domain support Keywords, categories, tags, based upon any field on your documents. Graph is build automatically from the content representing your domain. Intersections, overlaps, & relationship scoring, many levels deep Users can either provide a list of items to score, or else have the system dynamically discover the most related items (or both). Knowledge Graph
  21. 21. Graph Structure
  22. 22. id: 1 job_title: Software Engineer desc: software engineer at a great company skills: .Net, C#, java id: 2 job_title: Registered Nurse desc: a registered nurse at hospital doing hard work skills: oncology, phlebotemy id: 3 job_title: Java Developer desc: a software engineer or a java engineer doing work skills: java, scala, hibernate field doc term desc 1 a at company engineer great software 2 a at doing hard hospital nurse registered work 3 a doing engineer java or software work job_title 1 Software Engineer … … … Terms-Docs Inverted IndexDocs-Terms Forward IndexDocuments Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith.“The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016. Knowledge Graph field term postings list doc pos desc a 1 4 2 1 3 1, 5 at 1 3 2 4 company 1 6 doing 2 6 3 8 engineer 1 2 3 3, 7 great 1 5 hard 2 7 hospital 2 5 java 3 6 nurse 2 3 or 3 4 registered 2 2 software 1 1 3 2 work 2 10 3 9 job_title java developer 3 1 … … … …
  23. 23. Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith.“The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016. Knowledge Graph Set-theory View Graph View How the Graph Traversal Works skill: Java skill: Scala skill: Hibernate skill: Oncology doc 1 doc 2 doc 3 doc 4 doc 5 doc 6 skill: Java skill: Java skill: Scala skill: Hibernate skill: Oncology Data Structure View Java Scala Hibernate docs 1, 2, 6 docs 3, 4 Oncology doc 5
  24. 24. Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith.“The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016. Knowledge Graph Multi-level Traversal Data Structure View Graph View doc 1 doc 2 doc 3 doc 4 doc 5 doc 6 skill: Java skill: Java skill: Scala skill: Hibernate skill: Oncology doc 1 doc 2 doc 3 doc 4 doc 5 doc 6 job_title: Software Engineer job_title: Data Scientist job_title: Java Developer …… Inverted Index Lookup Forward Index Lookup Forward Index Lookup Inverted Index Lookup Java Java Developer Hibernate Scala Software Engineer Data Scientist has_related_job_title has_related_job_title
  25. 25. Knowledge Graph Materialization of new nodes through shared documents engineer engineers software engineer* (materialized node) engineer* (materialized node) Software doc 1 doc 2 doc 3 doc 4 doc 5 doc 6 links_to links_to
  26. 26. Traversal & Scoring
  27. 27. Knowledge Graph From the academic paper: Structure: Single-level Traversal / Scoring: Multi-level Traversal / Scoring:
  28. 28. Scoring of Node Relationships (Edge Weights) Foreground vs. Background Analysis Every term scored against it’s context. The more commonly the term appears within it’s foreground context versus its background context, the more relevant it is to the specified foreground context. countFG(x) - totalDocsFG * probBG(x) z = -------------------------------------------------------- sqrt(totalDocsFG * probBG(x) * (1 - probBG(x))) { "type":"keywords”, "values":[ { "value":"hive", "relatedness":0.9773, "popularity":369 }, { "value":"java", "relatedness":0.9236, "popularity":15653 }, { "value":".net", "relatedness":0.5294, "popularity":17683 }, { "value":"bee", "relatedness":0.0, "popularity":0 }, { "value":"teacher", "relatedness":-0.2380, "popularity":9923 }, { "value":"registered nurse", "relatedness": -0.3802 "popularity":27089 } ] } We are essentially boosting terms which are more related to some known feature (and ignoring terms which are equally likely to appear in the background corpus) + - Foreground Query: "Hadoop" Knowledge Graph
  29. 29. Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith.“The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016. Knowledge Graph Multi-level Graph Traversal with Scores software engineer* (materialized node) Java C# .NET .NET Developer Java Developer Hibernate ScalaVB.NET Software Engineer Data Scientist Skill Nodes has_related_skillStarting Node Skill Nodes has_related_skill Job Title Nodes has_related_job_title 0.90 0.88 0.93 0.93 0.34 0.74 0.91 0.89 0.74 0.89 0.780.72 0.48 0.93 0.76 0.83 0.80 0.64 0.61 0.780.55
  30. 30. Use Cases
  31. 31. Knowledge Graph Use Case: Document Summarization / Content-based Recommendations Experiment: Pass in raw text (extracting phrases as needed), and rank their similarity to the documents using the SKG. Additionally, can traverse the graph to “related” entities/keyword phrases NOT found in the original document Applications: Content-based and multi-modal recommendations (no cold-start problem), data cleansing prior to clustering or other ML methods, semantic search / similarity scoring
  32. 32. Knowledge Graph Use Case: Predictive Analytics
  33. 33. Knowledge Graph Use Case: Data Cleansing { "type":"keywords”, "values":[ { "value":"hive", "relatedness": 0.9765, "popularity":369 }, { "value":”spark", "relatedness": 0.9634, "popularity":15653 }, { "value":".net", "relatedness": 0.5417, "popularity":17683 }, { "value":"bogus_word", "relatedness": 0.0, "popularity":0 }, { "value":"teaching", "relatedness": -0.1510, "popularity":9923 }, { "value":"CPR", "relatedness": -0.4012, "popularity":27089 } ] } Foreground Query: "Hadoop" Experiment: Data analyst manually annotated 500 pairs of terms found together in real query logs as “relevant” or “not relevant” Results: SKG removed 78% of the terms while maintaining a 95% accuracy at removing the correct noisy pairs from the input data.
  34. 34. Use Case: Document Classification & Enrichment Knowledge Graph
  35. 35. Use Case: Query Disambiguation Example Related Keywords (representing multiple meanings) driver truck driver, linux, windows, courier, embedded, cdl, delivery architect autocad drafter, designer, enterprise architect, java architect, designer, architectural designer, data architect, oracle, java, architectural drafter, autocad, drafter, cad, engineer … … Source: M. Korayem, C. Ortiz, K. AlJadda, T. Grainger. "Query Sense Disambiguation Leveraging Large Scale User Behavioral Data". IEEE Big Data 2015.
  36. 36. A few methodologies: 1) Query Log Mining 2) Semantic Knowledge Graph Knowledge Graph
  37. 37. Query Log Mining: Discovering ambiguous phrases 1) Classify users who ran each search in the search logs (i.e. by the job title classifications of the jobs to which they applied) 3) Segment the search term => related search terms list by classification, to return a separate related terms list per classification 2) Create a probabilistic graphical model of those classifications mapped to each keyword phrase. Source: M. Korayem, C. Ortiz, K. AlJadda, T. Grainger. "Query Sense Disambiguation Leveraging Large Scale User Behavioral Data". IEEE Big Data 2015.
  38. 38. Semantic Knowledge Graph: Discovering ambiguous phrases 1) Exact same concept, but use a document classification field (i.e. category) as the first level of your graph, and the related terms as the second level to which you traverse. 2) Has the benefit that you don’t need query logs to mine, but it will be representative of your data, as opposed to your user’s intent, so the quality depends on how clean and representative your documents are. Additional Benefit: Multi-dimensional disambiguation and dynamic materialization of categories. Effectively an dynamically-materialized probabilistic graphical model
  39. 39. Disambiguated meanings (represented as term vectors) Example Related Keywords (Disambiguated Meanings) architect 1: enterprise architect, java architect, data architect, oracle, java, .net 2: architectural designer, architectural drafter, autocad, autocad drafter, designer, drafter, cad, engineer driver 1: linux, windows, embedded 2: truck driver, cdl driver, delivery driver, class b driver, cdl, courier designer 1: design, print, animation, artist, illustrator, creative, graphic artist, graphic, photoshop, video 2: graphic, web designer, design, web design, graphic design, graphic designer 3: design, drafter, cad designer, draftsman, autocad, mechanical designer, proe, structural designer, revit … … Source: M. Korayem, C. Ortiz, K. AlJadda, T. Grainger. "Query Sense Disambiguation Leveraging Large Scale User Behavioral Data". IEEE Big Data 2015.
  40. 40. Using the disambiguated meanings In a situation where a user searches for an ambiguous phrase, what information can we use to pick the correct underlying meaning? 1. Any pre-existing knowledge about the user: • User is a software engineer • User has previously run searches for “c++” and “linux” 2. Context within the query: User searched for windows AND driver vs. courier OR driver 3. If all else fails (and there is no context), use the most commonly occurring meaning. driver 1: linux, windows, embedded 2: truck driver, cdl driver, delivery driver, class b driver, cdl, courier Source: M. Korayem, C. Ortiz, K. AlJadda, T. Grainger. "Query Sense Disambiguation Leveraging Large Scale User Behavioral Data". IEEE Big Data 2015.
  41. 41. Knowledge Graph Use Case: Query Expansion Experiment: Take an initial query, and expand keyword phrases to include the most related entities to that query Example:
  42. 42. The Semantic Search Problem User’s Query: machine learning research and development Portland, OR software engineer AND hadoop, java Traditional Query Parsing: (machine AND learning AND research AND development AND portland) OR (software AND engineer AND hadoop AND java) Semantic Query Parsing: "machine learning" AND "research and development" AND "Portland, OR" AND "software engineer" AND hadoop AND java Semantically Expanded Query: ("machine learning"^10 OR "data scientist" OR "data mining" OR "artificial intelligence") AND ("research and development"^10 OR "r&d") AND AND ("Portland, OR"^10 OR "Portland, Oregon" OR {!geofilt pt=45.512,-122.676 d=50 sfield=geo}) AND ("software engineer"^10 OR "software developer") AND (hadoop^10 OR "big data" OR hbase OR hive) AND (java^10 OR j2ee)
  43. 43. Getting Started
  44. 44. Open Sourced!
  45. 45. #1: Pull, Build, Start Solr cd ~/demo git clone https://github.com/apache/lucene-solr.git git checkout releases/lucene-solr/7.2.1 cd lucene-solr/solr ant server bin/solr -c –Denable.runtime.lib=true #2: Pull and Build Semantic Knowledge Graph cd ~/demo git clone https://github.com/treygrainger/semantic-knowledge-graph.git git checkout solr_7.2.1 mvn package #3: Install Plugin curl -X POST -H 'Content-Type: application/octet-stream' --data-binary @semantic- knowledge-graph-1.0-SNAPSHOT.jar http://localhost:8983/solr/.system/blob/skg.jar #4: Create Collection curl http://localhost:8983/solr/admin/collections?action=CREATE&name=demo-collection
  46. 46. #5: Register Plugin with Collection curl http://localhost:8983/solr/demo-collection/config -H 'Content-type:application/json' -d '{ "add-runtimelib": { "name":"skg.jar", "version":1 } }' #6: Register Response Writer curl http://localhost:8983/solr/demo-collection/config -H 'Content-type:application/json' -d '{ "add-queryresponsewriter": { "name": "skg", "runtimeLib": true, "class": ”org.apache.solr.search.skg.responsewriter.KnowledgeGraphResponseWriter"} }’ #7: Register Request Handler curl http://localhost:8983/solr/demo-collection/config -H 'Content-type:application/json' -d '{ "add-requesthandler" : { "name": "/skg", "class":”org.apache.solr.search.skg.KnowledgeGraphHandler", "defaults":{ "defType":"edismax", "wt":"json"}, "invariants":{"wt":"skg"}, "runtimeLib": true } }'
  47. 47. Per-field options: {field_name}.fallback string An alternative (usually free-text) field to query in case foreground returns less than min_popularity. Example: {field_name}.fallback=content {field_name}.facet-field string A suffix to find a copy-field version of the value in {field_name} to use for the node during traversal. Example: {field_name}.facet-field=id-name {field_name}.key string A parallel key / id field to use to represent a particular value. Typically used to disambiguate multiple entities (each with a unique id) sharing the same textual name Example:{field.v1.key}=id
  48. 48. Demos
  49. 49. Knowledge Graph
  50. 50. Knowledge Graph
  51. 51. Who’s in Love with Jean Grey?
  52. 52. Build a Co-occurrence Matrix http://localhost:8983/solr/job-postings/skg
  53. 53. Related term vector (for query concept expansion) http://localhost:8983/solr/stack-exchange-health/skg
  54. 54. Score keywords from within a document http://localhost:8983/solr/job-postings/skg
  55. 55. • Lucidworks is working on integrating the semantic knowledge graph into Solr’s JSON faceting API (requires some new capabilities and structural changes, but greatly enhances it) • WIP, but this new approach looks promising. We’ll post a patch soon. Hoping to get committed for Solr 7.4 • Being integrated into Lucidworks Fusion as one of several key elements of our semantic search pipeline, content-based recommendations, and to power discovery in our analytics suite. Next Steps
  56. 56. Contact Info Trey Grainger trey.grainger@lucidworks.com @treygrainger http://solrinaction.com Other presentations: http://www.treygrainger.com Discount code: ctwhaystack18
  57. 57. Knowledge Graph Populating the Graph
  58. 58. Other Solr Graph Capabilities
  59. 59. https://lucidworks.com/webinar/solr-6-deep-dive-sql-graph/ top(n="5", sort="count(*) desc", gatherNodes(movielens, top(n="30", sort="count(*) desc", gatherNodes(movielens, search(movielens, q="user_id_i:305", fl="movie_id_i", sort="movie_id_i asc", qt=“/export"), walk="movie_id_i->movie_id_i", gather="user_id_i", maxDocFreq="10000", count(*) ) ), walk="node->user_id_i", gather="movie_id_i", count(*) ) ) Distributed Graph Traversal Streaming Expressions Covered in more detail in deep-dive SQL & Graph webinar:
  60. 60. Graph Query Parser • Query-time, cyclic aware graph traversal is able to rank documents based on relationships • Provides controls for depth, filtering of results and inclusion of root and/or leaves • Limitations: single node/shard only Examples: • http://localhost:8983/solr/graph/query?fl=id,score& q={!graph from=in_edge to=out_edge}id:A • http://localhost:8983/solr/my_graph/query?fl=id& q={!graph from=in_edge to=out_edge traversalFilter='foo:[* TO 15]'}id:A • http://localhost:8983/solr/my_graph/query?fl=id& q={!graph from=in_edge to=out_edge maxDepth=1}foo:[* TO 10]
  61. 61. • Term Frequency: “How well a term describes a document?” – Measure: how often a term occurs per document • Inverse Document Frequency: “How important is a term overall?” – Measure: how rare the term is across all documents TF * IDF *Source: Solr in Action, chapter 3
  62. 62. Lucidworks Fusion Architecture
  63. 63. Lucidworks Fusion
  64. 64. Knowledge Graph – Potential Use Cases Cross-walk between Types • Have an ID field, but want to enable free text search on the most associated entity with that ID? • Have a “state” (geo) search box, but want to accept any free-text location and map it to the right state? • Have an old classification taxonomy and want to know how the values from the old system now map into the new values? Build User Profiles from Search Logs • If someone searches for “Java”, and then “JQuery”, and then “CSS”, and then “JSP”, what do those have in common? • What if they search for “Java”, and then “C++”, and then “Assembly”? Discover Relationships Between Anything • If I want to become a data scientist and know Python, what libraries should I learn? • If my last job was mid-level software engineer and my current job is Engineering Lead, what are my most likely next roles? Traverse arbitrarily deep, Sort on anything • Build an instant co-occurrence matrix, sort the top values by their relatedness, and then add in any number of additional dimensions (RAM permitting). Data Cleansing • Have dirty taxonomies and need to figure out which items don’t belong? • Need to understand the conceptual cohesion of a document (vs spammy or off-topic content)? Knowledge Graph
  65. 65. Knowledge Graph Future Work • Semantic Search (more experiments) • Search Engine Relevancy Algorithms • Trending Topics • Recommendation Systems • Root Cause Analysis • Abuse Detection
  66. 66. Knowledge Graph Conclusion Applications: The Semantic Knowledge Graph has numerous applications, including automatically building ontologies, identification of trending topics over time, predictive analytics on timeseries data, root-cause analysis surfacing concepts related to failure scenarios from free text, data cleansing, document summarization, semantic search interpretation and expansion of queries, recommendation systems, and numerous other forms of anomaly detection. Main contribution of this paper: The introduction (and open sourcing) of the the Semantic Knowledge Graph, a novel and compact new graph model that can dynamically materialize and score the relationships between any arbitrary combination of entities represented within a corpus of documents.
  67. 67. Knowledge Graph References
  68. 68. Serves as a “data science toolkit” API that allows dynamically navigating and pivoting through multiple levels of relationships between items in a domain. Knowledge Graph API Core similarity engine, exposed via API Any product can leverage the core relationship scoring engine to score any list of entities against any other list Full domain support Keywords, categories, tags, based upon any field on your documents. Graph is build automatically from the content representing your domain. Intersections, overlaps, & relationship scoring, many levels deep Users can either provide a list of items to score, or else have the system dynamically discover the most related items (or both). Knowledge Graph

×