
Sumeet vij enterprise_knowledge_graph



BAH's OA DC Summit Prese

Published in: Technology


  1. Enterprise Knowledge Graph (EKG): Mining an Enterprise’s Systems of Engagement
     Sumeet Vij, Senior Associate, Booz Allen Hamilton
  2. Can you make your decisions on just 20% of your data?
     ◦ According to IDC Research, less than 20% of an enterprise’s information is structured data that fits neatly into traditional columnar relational databases
     ◦ The other 80% is unstructured or semi-structured: documents, web pages, emails, images and videos, all growing at a tremendous rate
     ◦ Current Enterprise Systems of Record (ERPs, CRMs) capture a minuscule share of the information generated within an Enterprise
     ◦ Even so, the Systems of Record remain the main focus of the IT team and the main source of information for enterprise leadership
  3. Unstructured Data creates Enterprise Information Management Challenges
     • Information is scattered and inaccessible
       • Spread across documents, spreadsheets, emails
     • Data is stored in multiple, often incompatible formats
     • Data sources are not linked
       • No documented relationships between pieces of information
       • No easy way to harness data from external sources, including social networks
     • Information is hard to understand
       • Different terminology and vocabularies
  4. How do employees create and share information? Through Systems of Engagement
     These systems are the primary way employees in an Enterprise communicate and share information, namely
     ◦ Email
     ◦ IM (Lync)
     ◦ Social collaboration tools like Yammer, Tibbr, Jive
     Not surprisingly, these systems generate unstructured data at high
  5. Systems of Engagement
     ◦ Loosely structured knowledge flows
     ◦ Conversational
     ◦ Dynamic and in flux
  6. How does the industry extract information from unstructured text? Google Knowledge Graph
     The Google Knowledge Graph provides “Things, not just Strings”; that is, it enhances its search results with semantic information gathered from multiple sources. It provides structured information about entities and links to other related entities. Its goal is to help people
     • Find the right thing: Find the right entity, and understand the difference between Taj Mahal the monument and Taj Mahal the musician
     • Get the best summary: Summarize relevant content related to the entity, key facts and other related entities
     • Go deeper and broader: Help make unexpected discoveries and find unexpected relationships
  7. How does an Enterprise extract information from these Systems of Engagement? Enter the Enterprise Knowledge Graph (EKG)
     Along the lines of the Google Knowledge Graph, the EKG aims to help enterprises extract and explore information created by systems of engagement. Core EKG concepts are:
     • Knowledge Capture: Extract key concepts and relationships from unstructured documents using an Enterprise Ontology. This allows concept-based indexing of content
       • Example: An employee submits a trip report in the form of an email. The EKG automatically extracts the Who, What, When and Where information and links it to other relevant resources.
     • Knowledge Discovery: Search multiple data sources for information using a relevant Enterprise Ontology
       • Example: A proposal manager can ask, “Who has background information about the Army CIO/G6?”
     • Knowledge Exploration: Expose information to a host of graphical tools to visualize and further analyze relationships between data
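The Knowledge Capture example on slide 7 (pulling Who, What, When and Where out of a trip-report email) could be sketched roughly as follows. This is a minimal illustration, not the EKG implementation: the `ONTOLOGY` dictionary, the `extract_facts` helper, the employee name, the date, and the sample report are all made up for the example, and simple string and regex matching stands in for real entity extraction.

```python
import re

# Hypothetical mini-ontology: fact type -> known surface forms.
# A real Enterprise Ontology would be far richer; these terms are illustrative.
ONTOLOGY = {
    "who": ["Jane Doe", "John Roe"],
    "what": ["Symposium", "Demonstration", "Follow on Meeting"],
    "where": ["Army CIO/G6"],
}

# Assumption: dates appear in ISO form (YYYY-MM-DD) in the report text.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_facts(text: str) -> dict[str, list[str]]:
    """Return the ontology terms and dates mentioned in the text."""
    lowered = text.lower()
    facts = {
        kind: [term for term in terms if term.lower() in lowered]
        for kind, terms in ONTOLOGY.items()
    }
    facts["when"] = DATE_PATTERN.findall(text)
    return facts


# Made-up trip report used only to exercise the sketch.
report = "Trip report: Jane Doe attended a demonstration at Army CIO/G6 on 2013-04-02."
facts = extract_facts(report)
```

Because the extracted terms come from the ontology rather than from the literal text, the resulting facts can serve as a concept-based index over the content.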
  8. How is the EKG seeded? Crowd-source the creation
     The major source of information generation in an enterprise is email. The process to seed the EKG with email would be:
     ◦ The sender copies their email to a monitored EKG mailbox
     ◦ The EKG parses and analyzes the email, then adds the extracted facts to the Knowledge Graph
     ◦ The EKG sends an automated email back to the sender, listing the extracted facts with a link for correcting them
     Start with a specific Ontology geared towards a high-value use case, and then build out the entities and their relationships
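The three seeding steps on slide 8 could be sketched along these lines, using only Python's standard `email` package. Everything specific here is an assumption for illustration: the mailbox address, the correction URL, the `KNOWN_TOPICS` list, the `has_information_about` relation, and the sample message. The graph is modeled as a plain set of (subject, predicate, object) triples rather than a real graph store.

```python
from email import message_from_string
from email.message import EmailMessage

EKG_MAILBOX = "ekg@example.com"                     # hypothetical monitored mailbox
CORRECTION_URL = "https://ekg.example.com/correct"  # hypothetical correction page
KNOWN_TOPICS = ["Army CIO/G6", "Semantic Technologies"]  # illustrative ontology terms

Triple = tuple[str, str, str]


def seed_from_email(raw: str, graph: set[Triple]) -> EmailMessage:
    """Parse one copied email, add its facts to the graph, and build the reply."""
    msg = message_from_string(raw)
    sender = msg["From"]
    body = msg.get_payload()  # plain string for a simple, non-multipart message

    # Step 2: naive keyword matching stands in for real analysis.
    new_facts = [
        (sender, "has_information_about", topic)
        for topic in KNOWN_TOPICS
        if topic.lower() in body.lower()
    ]
    graph.update(new_facts)

    # Step 3: automated reply listing the facts plus a correction link.
    reply = EmailMessage()
    reply["From"] = EKG_MAILBOX
    reply["To"] = sender
    reply["Subject"] = "EKG extracted facts: " + (msg["Subject"] or "")
    lines = [f"{s} {p} {o}" for s, p, o in new_facts]
    lines.append(f"Correct these facts: {CORRECTION_URL}")
    reply.set_content("\n".join(lines))
    return reply


# Step 1: the sender copies the EKG mailbox on an ordinary email (made-up message).
raw_email = (
    "From: employee@example.com\n"
    "To: team@example.com\n"
    "Cc: ekg@example.com\n"
    "Subject: Trip report\n"
    "\n"
    "Attended a demonstration at Army CIO/G6.\n"
)
graph: set[Triple] = set()
reply = seed_from_email(raw_email, graph)
```

Sending the reply (for example over SMTP) and handling corrections are left out; the point is only that each copied email both grows the graph and gives the sender a chance to fix what was extracted.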
  9. Benefits of adding email to the EKG
     ◦ Bigger insights: the EKG leverages the collective interactions of all employees (not just the respondent), and subsequent interactions enrich it further, allowing even more questions to be answered
     ◦ Liberate employee knowledge, expertise and interactions from the mailbox and make them available for the enterprise to leverage
  10. EKG Benefits
      • Utilize all available knowledge sources
        • Allows documents, spreadsheets and emails to serve as “top level” information sources
      • Integration
        • Ties disconnected pieces of data together into meaningful wholes that provide a basis for planning and decision making
      • Meaning-Centric
        • Facts around an object or an entity can be easily explored
        • Search phrases are better “understood” as they are based upon concepts and not literals
      • Serendipity
        • Related searches allow the formerly “unknown” to be discovered
  11. How we discover information within an Enterprise today
      [Diagram: a Proposal Manager asks “Who has information about the Army CIO/G6?” and runs a separate search in each system: Resume System, Trip Reports, Opportunity Management System, Social Network, CRM and Web (marked A to D), grouped into Systems of Record and Systems of Engagement. Fact labels in the diagram: Sumeet Vij, Cliff Daus, DoD SOA & Semantic Technology Symposium, Follow on Meeting, Demonstration at CIO/G6, CIO/G6 Customer, Topic: Semantic Technologies; relation labels: Presented at, Attended, Employee of.]
  12. Knowledge Discovery using EKG
      [Diagram with two flows. Knowledge Capture: Web Submission and Email Submission of Trip Reports, Meeting Minutes, etc.; Parse; Entity Extraction; Update Knowledgebase. Knowledge Discovery: the Proposal Manager submits “Who has information about the Army CIO/G6?”; Determine Sources for Information; Query the Knowledgebase, Resume System, Opportunity Management System and CRM. Result label: Sumeet Vij.]
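The Knowledge Discovery flow on slide 12 (determine the sources, query each, combine the answers) might look something like the sketch below. The source names come from the slide, but the records, the employee names, and the `make_source` and `discover` helpers are invented for illustration; real sources would sit behind connectors rather than in-memory dictionaries.

```python
from typing import Callable

# A source is modeled as a function from a concept to the people linked to it.
Source = Callable[[str], list[str]]


def make_source(records: dict[str, list[str]]) -> Source:
    """Wrap an in-memory dict so it can be queried like a data source."""
    def lookup(concept: str) -> list[str]:
        return records.get(concept, [])
    return lookup


# Illustrative records only; names and contents are made up.
SOURCES: dict[str, Source] = {
    "Knowledgebase": make_source({"Army CIO/G6": ["Employee A"]}),
    "Resume System": make_source({"Army CIO/G6": ["Employee A", "Employee B"]}),
    "Opportunity Management System": make_source({}),
    "CRM": make_source({"Army CIO/G6": ["Employee B"]}),
}


def discover(concept: str, sources: dict[str, Source]) -> dict[str, list[str]]:
    """Ask every source about a concept and group the hits by person."""
    results: dict[str, list[str]] = {}
    for name, lookup in sources.items():
        for person in lookup(concept):
            results.setdefault(person, []).append(name)
    return results


# "Who has information about the Army CIO/G6?"
answer = discover("Army CIO/G6", SOURCES)
```

Grouping by person rather than by source gives the proposal manager one consolidated answer, along with the systems that support each name.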
  13. Conceptual EKG Architecture
      • An open architecture composed of reusable open source components
      • User Interface Layer: Document Upload, Knowledge Browser, Query UI
      • Semantic Processing Layer: Data Source Catalog, Entity Extraction, Concept Catalog
      • Integration Layer: E-Mail Connector, Database Connector, Web Services Client
      • Persistence Layer: NoSQL Store
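One way to read the layering on slide 13 is as a set of small, swappable components: connectors in the Integration Layer feed documents to entity extraction in the Semantic Processing Layer, which writes to a store in the Persistence Layer. The sketch below is an assumption about how such pieces could fit together, not the architecture's actual interfaces; every class and method name is hypothetical, and a dictionary stands in for the NoSQL store.

```python
from typing import Protocol


class Connector(Protocol):
    """Integration Layer: anything that can hand over raw documents."""
    def fetch(self) -> list[str]: ...


class InMemoryConnector:
    """Stand-in for an e-mail or database connector."""
    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def fetch(self) -> list[str]:
        return list(self.documents)


class KeywordExtractor:
    """Semantic Processing Layer: naive concept matching against a catalog."""
    def __init__(self, concept_catalog: list[str]) -> None:
        self.concept_catalog = concept_catalog

    def extract(self, text: str) -> list[str]:
        return [c for c in self.concept_catalog if c.lower() in text.lower()]


class DictStore:
    """Persistence Layer: a dict standing in for a NoSQL store."""
    def __init__(self) -> None:
        self.index: dict[str, list[str]] = {}

    def save(self, concept: str, document: str) -> None:
        self.index.setdefault(concept, []).append(document)


def ingest(connectors: list[Connector], extractor: KeywordExtractor, store: DictStore) -> int:
    """Pull documents through the layers; return the number of concept hits stored."""
    stored = 0
    for connector in connectors:
        for document in connector.fetch():
            for concept in extractor.extract(document):
                store.save(concept, document)
                stored += 1
    return stored


# Made-up documents used only to exercise the sketch.
emails = InMemoryConnector(["Trip report: demonstration at Army CIO/G6"])
database = InMemoryConnector(["Opportunity notes on semantic technologies"])
store = DictStore()
count = ingest(
    [emails, database],
    KeywordExtractor(["Army CIO/G6", "Semantic Technologies"]),
    store,
)
```

Keeping each layer behind a narrow interface is what would let open source components be swapped in without touching the others; a User Interface Layer would then query `store.index` or its real equivalent.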
  14. Questions?