
Connect. Enrich. Evolve. Convert unstructured data silos to knowledge graphs

Organisations of any size, public or private, hold large amounts of data locked in isolated silos. These silos are created independently for the specific needs of an organisational unit and mainly contain textual data in multiple formats. To unleash the power of the relevant information in such data sources, it is necessary to collect and organise it in a homogeneous data structure that is easy to access and extend.

The presentation starts by identifying the business needs, then takes the audience through the journey of (i) creating a knowledge graph that represents a single, highly connected source of truth for the entire organisation, (ii) enriching it using multiple external sources of knowledge and machine learning algorithms, and (iii) evolving it according to the changing needs of the company.

Furthermore, this session highlights the role of graphs as a new "access pattern" for textual data, compared with the more classical inverted index approach. It concludes by presenting a complete end-to-end infrastructure for an unstructured data processing workflow in which Neo4j is the core of a complex ecosystem integrated with other tools such as Elasticsearch, Apache Kafka, Stanford NLP, OpenNLP, Apache Spark, and TensorFlow.
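
As a rough illustration of the two access patterns (it is not taken from the talk), the Python sketch below builds a tiny inverted index and then derives a document graph from it. The sample documents and the function names `build_inverted_index` and `build_document_graph` are hypothetical, and linking documents that share a term is only one simple way to connect them; a real pipeline would more likely link documents through extracted entities.

```python
from collections import defaultdict

# Hypothetical sample documents, used only for illustration.
docs = {
    "d1": "neo4j stores connected data",
    "d2": "elasticsearch builds an inverted index",
    "d3": "neo4j integrates with elasticsearch",
}

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def build_document_graph(index):
    """Link documents that share at least one term (undirected adjacency)."""
    graph = defaultdict(set)
    for doc_ids in index.values():
        for a in doc_ids:
            for b in doc_ids:
                if a != b:
                    graph[a].add(b)
    return graph

index = build_inverted_index(docs)
graph = build_document_graph(index)

# Inverted-index access pattern: term -> documents.
print(sorted(index["neo4j"]))  # ['d1', 'd3']

# Graph access pattern: document -> related documents, no query term needed.
print(sorted(graph["d3"]))     # ['d1', 'd2']
```

In this toy data, d1 and d2 share no term, so a term lookup never returns them together, yet the graph connects them in two hops through d3. That kind of navigation between documents is what the inverted index alone does not offer.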

***
Talk at GraphTour Washington D.C., April 14, 2018


  1. GraphAware® CONNECT. ENRICH. EVOLVE. CONVERT UNSTRUCTURED DATA SILOS TO KNOWLEDGE GRAPHS
     Alessandro Negro, Chief Scientist @ GraphAware
     graphaware.com, @graph_aware, @AlessandroNegro
  2. BUSINESS NEEDS → Convert Data into Actionable Knowledge
     Data
     ‣ Organisations store vast amounts of content
     ‣ Collect and organise past experience or mistakes
     ‣ Multiple distributed data silos or data sources
     Goals
     ‣ Do you know what your customers are going to need in 12 months’ time?
     ‣ How are you going to provide it?
     ‣ Are you making the best use of information you already have in building the next generation of solutions?
  3. BUSINESS NEEDS
  4. CHALLENGES
     The challenges with knowledge are:
     ‣ Data and information are not consistent
     ‣ The amount of data
     ‣ Sources spread across many systems
     ‣ Data are generated at high speed
     Organisational leadership wants a solution that:
     ‣ Integrates data at speed
     ‣ Enables knowledge workers to be more efficient, effective, and consistent
  5. WHERE IS THE KNOWLEDGE?
  6. SEARCH: 101
  7. SEARCH: 101
  8. SEARCH: INVERTED INDEX
  9. SEARCH: INVERTED INDEX ← Vocabulary | Inverted index →
  10. WHERE IS THE KNOWLEDGE?
  11. SEARCH: INVERTED INDEX
  12. SEARCH: INVERTED INDEX
     Pros:
     ‣ Easy to implement, deploy and maintain
     ‣ Highly scalable, thanks to its sharding capabilities
     ‣ Incredibly fast
     Cons:
     ‣ Tuning results is a hard task
     ‣ Documents are isolated (no explicit connection between them)
     ‣ No navigation through documents
     ‣ Difficult to extend
     ‣ Changing the list of synonyms is problematic
     ‣ Only textual search available
  13. WHERE IS THE KNOWLEDGE?
  14. GRAPH APPROACH
  15. GRAPH APPROACH
  16. GRAPH APPROACH
  17. GRAPH APPROACH
  18. GRAPH APPROACH
  19. GRAPH APPROACH
  20. GRAPH APPROACH
  21. GRAPH APPROACH
  22. GRAPH APPROACH
  23. GRAPH APPROACH
  24. GRAPH APPROACH
  25. GRAPH APPROACH
  26. GRAPH APPROACH
  27. GRAPH APPROACH
  28. GRAPH APPROACH
     Pros:
     ‣ Documents are not treated as isolated
     ‣ Multiple, flexible and unpredictable access patterns
     ‣ Can be integrated with other ML approaches (e.g. recommendation)
     ‣ Easy to integrate with other tools
     ‣ Can create a Knowledge Graph
     ‣ Enables AI
     Cons:
     ‣ Textual search performance
     ‣ No sharding
     ‣ Difficult to implement
  29. KNOWLEDGE GRAPH?
     What is it?
     ‣ Mainly describes real-world entities and their interrelations, organised in a graph
     ‣ Defines possible classes and relations of entities in a schema
     ‣ Allows for potentially interrelating arbitrary entities with each other
     ‣ Covers various topical domains
     Some famous Knowledge Graphs:
     ‣ Google
     ‣ NASA
     ‣ eBay
     ‣ Facebook
     ‣ Yahoo
     ‣ Microsoft
     A Knowledge Graph is the only way to manage the whole of enterprise data in full generality
  30. OK, YOU CONVINCED ME. WHAT SHOULD I DO TO GET STARTED?
  31. “The GraphAware Knowledge Platform converts unstructured data silos to Knowledge Graph”
  32. THE KNOWLEDGE ARCHITECTURE
  33. THE GRAPHAWARE KNOWLEDGE PLATFORM
     Features:
     ‣ Import information from your internal sources into one centralised location
     ‣ Enrich your data with external or internal sources of knowledge
     ‣ Analyse information and discover business insights using deep analysis
     How it works:
     ‣ Data Ingestion
     ‣ Smart Entity Extraction
     ‣ Augmented Knowledge
     ‣ Deep Text Analysis
     ‣ Distributed Processing
     ‣ Multiple Integrations
     → A platform specifically designed for managing textual data
  34. THE GRAPHAWARE KNOWLEDGE PLATFORM
  35. THE ROLE OF NEO4J
     ‣ Knowledge Graph store
     ‣ Single source of truth
     ‣ Fast access to connected data
     ‣ Query
     ‣ Merging External Data
     ‣ Existing Data Augmentation
     ‣ Scalability
  36. THE GRAPHAWARE KNOWLEDGE PLATFORM
  37. CONCLUSION
     ‣ Converting data into actionable knowledge is a complex task
     ‣ It’s worth it
     ‣ A knowledge graph approach gives you a lot of advantages
     ‣ The GraphAware Knowledge Platform simplifies the entire process
  38. www.graphaware.com @graph_aware
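
The description of a knowledge graph on slide 29 (entities, their interrelations, and a schema of allowed classes and relations) can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the GraphAware platform or the Neo4j API: the `SCHEMA` triples, the `KnowledgeGraph` class, and the sample entities ("report.pdf", "Alice", "Acme") are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical schema: allowed (source class, relation, target class) triples.
SCHEMA = {
    ("Person", "WORKS_FOR", "Organisation"),
    ("Organisation", "LOCATED_IN", "Place"),
    ("Document", "MENTIONS", "Person"),
    ("Document", "MENTIONS", "Organisation"),
}

@dataclass
class KnowledgeGraph:
    classes: dict = field(default_factory=dict)  # entity name -> class
    edges: set = field(default_factory=set)      # (source, relation, target)

    def add_entity(self, name, cls):
        self.classes[name] = cls

    def relate(self, source, relation, target):
        """Add an edge only if the schema allows it for these entity classes."""
        triple = (self.classes[source], relation, self.classes[target])
        if triple not in SCHEMA:
            raise ValueError(f"schema does not allow {triple}")
        self.edges.add((source, relation, target))

    def neighbours(self, name):
        """Entities one hop away from `name`, following edges in either direction."""
        outgoing = {t for s, _, t in self.edges if s == name}
        incoming = {s for s, _, t in self.edges if t == name}
        return outgoing | incoming

kg = KnowledgeGraph()
kg.add_entity("report.pdf", "Document")
kg.add_entity("Alice", "Person")
kg.add_entity("Acme", "Organisation")

# A document mentions a person, and the person works for an organisation.
kg.relate("report.pdf", "MENTIONS", "Alice")
kg.relate("Alice", "WORKS_FOR", "Acme")

print(sorted(kg.neighbours("Alice")))  # ['Acme', 'report.pdf']
```

Evolving the graph in this sketch means adding a triple to `SCHEMA` and relating new entities. Existing nodes and edges stay untouched, which loosely mirrors the "evolve" step in the talk's title.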
