Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Accelerating Insight - Smart Data Lake Customer Success Stories

1,026 views

Published on

At Gartner Data & Analytics Summit 2017 Alok Prasad, President, was joined by Peter Horowitz of PricewaterhouseCoopers in presenting a session on how Cambridge Semantics' in-memory, massively parallel, semantic graph-based platform delivers an accelerating edge to data-driven organizations, while maintaining trust with security and governance.

Published in: Data & Analytics
  • Be the first to comment

Accelerating Insight - Smart Data Lake Customer Success Stories

  1. 1. Accelerating Insight Smart Data Lake Customer Success Stories Peter Horowitz Principal - Advisory PwC Alok Prasad, President Ben Szekely, VP – Solution Engineering Cambridge Semantics
  2. 2. ©2017 Cambridge Semantics Inc. All rights reserved. Anzo Smart Data Lake: Accelerating Insight A Graph-based Platform to Solve Today’s Data and Analytics Challenges Rapid Ingestion & Mapping Load and Map any data……. from anywhere… at any time! Active Metadata Management Organize, Integrate, Share and Govern any data assets.. more efficiently greater security HiRes Anaytics™ Ask & Answer any questions… any combination any user….. Graphmarts On-Demand Analytic Exploration In-Memory MPP Query Engine Interconnected … trillions of facts… 100x the speed… Enterprise Data Sources Cloud Storage “Last Mile” Analytics Exploratory Analytics HiRes Analytics Search and Discovery Anzo Smart Data Lake™
  3. 3. ©2017 Cambridge Semantics Inc. All rights reserved. Anzo Smart Data Lake: Accelerating Insight A Graph-based Platform to Solve Today’s Data and Analytics Challenges Accelerated Insights – 100x faster than the nearest competitor Automated ETL Generation Collaborative Mapping Text Processing Enterprise Knowledge Graph
  4. 4. ©2017 Cambridge Semantics Inc. All rights reserved. Based in Boston, MA – Privately Held – 100% annual growth Leading customers: Strategic partnerships: Industry recognition: Cambridge Semantics MIT Innovation Showcase Business Intelligence / Analytics Solutions
  5. 5. ©2017 Cambridge Semantics Inc. All rights reserved. Peter Horowitz Principal – Advisory Banking and Capital Management PwC Ben Szekely VP Solution Engineering Cambridge Semantics Accelerating Insight – Customer Success Story
  6. 6. Accelerating Insights: Using Smart Data Lakes Successfully March 2017
  7. 7. PwC Overview $53 bn 1000%The Dilemma – Given current landscape within many financial service institutions, the standard data journey takes TOO LONG Typical data journey Winning Strategy - Attack every step in the process to reduce time to market. 7 Requirements Sourcing Quality Evaluation Quality Remediation Analysis and Reporting 7
  8. 8. PwC Why is this so important? Sources: Average OM% from Morningstar, Inc.; IAIG Index For global top banks, comparing average operating margin percent for previous 5 years against index of information architecture, investment and governance demonstrates strong correlation. Proper investment in information strategy, architecture and governance PAYS OFF! Proper data architecture improving or in place. Insufficient or lagging data architecture investment, execution, governance.
  9. 9. PwC Old approach is Challenged The existing landscape presents great challenges since it is based on relational databases and point-to-point mappings. Operational Data Source Operational Data Source Staging DB Metadata Aggregated Data Detailed Data Data Mart Reporting Analytics Predictive modelingFiles ETL Data Mart Data Mart EDW Integration Barriers Restricted entry Narrow payload Data source mapping Multi-stage, brittle ETL required to integrate diverse data sources Standard model required Enforces a standard model, no exceptions, leading to loss of valuable information context Limited insights Rigidity undermines the agility required in data analytics Challenges
  10. 10. PwC Original data and models maintained Data is loaded seamlessly without transformation or mapping, irrespective of format New approach Unified model Consolidated model unifies disparate source models into a comprehensive and shared canonical model The smart data lake provides a centralized data store with a flexible schema where data is first loaded and then transformed; a process known as schema-on-read. Operational Data Source Operational Data Source Files Smart Data Lake Model 1 Model 2 Model 3 Operational Data Source Operational Data Source Files Richer insights Cross region and domain view supports population-based analytics, hidden relationship discovery and reduces risk of siloed or inaccessible Reporting Analytics Predictive modeling Benefits Graph Data Model (i.e. Ontology) Graph store designs capture the nature of data relationships and supports a variety of data representations, known as “Shared Semantics”
  11. 11. PwC Case Study: Next generation insider trading and fraud surveillance driven by structured and unstructured data Requirements Sourcing Quality Evaluation Quality Remediation Analysis and Reporting Accelerators: Firm Wide Risk Dashboard Employee Risk Dashboard Analysis Overview • Ingested emails, IMs, web browser logs, cookies, phone records, contacts, news feeds • Ingested futures, options trades, market price data Impact • Data was ingested and linked in hours. • Profiles developed for employee roles • Outlier behavior effortlessly identified • Employees given risk ratings allowing for prioritization of detailed compliance investigation.
  12. 12. PwC Cast Study: Terabytes of data ingested and analyzed in a fraction of the time revealing complex relationships between clients and transactions 12 Analysis Overview • 5.5 billion transactions on behalf of 25 million customers for 35 million accounts. • Based on reference data and public information, prepared “social” network of clients analyzing edges (i.e., connections) between them. Impact • Data was ingested and linked in hours. • Edges developed and weighted based on Pointwise Mutual Information algorithms. • Resulting graph database supported recursive analysis, which would have been dramatically longer using RDBS, allowing for identification of suspicious networks Suspicious Account Bridge Customer Network Requirements Sourcing Quality Evaluation Quality Remediation Analysis and Reporting Accelerators:
  13. 13. ©2017 Cambridge Semantics Inc. All rights reserved. INDEFINITE Drug Discovery Preclinical Product Development FDA Review Scale-Up to Mfg. Post-Marketing Surveillance ONE FDA- APPROVED DRUG 0.5 – 2 YEARS6 – 7 YEARS3 – 6 YEARS NUMBER OF VOLUNTEERS PHASE 1 PHASE 2 PHASE 3 5250~ 5,000 – 10,000 COMPOUNDS PRE-DISCOVERY 20–100 100–500 1,000–5,000 INDSUBMITTED NDA/BLASUBMITTED The Information Fabric – A Semantic Layer for the Enterprise R & D Intelligence (CI) Product Development & Regulatory PV & Safety Case Management Source of Influence Commercial Analytics Clinical Trial Operations Medical Advisory Board Analytics Real World Research Clinical Data Standards Management Voice of the Customer Analytics Clinical Trial Exploratory Analytics
  14. 14. Enterprise Knowledge Graph Enabling on-demand access to data by those seeking answers and insight Scalability Security Governance Lineage Automated Structured Data Ingestion Natural Language Processing and Text Analytics Rich Models Anzo Smart Data Lake™ Data On Demand
  15. 15. Enterprise Knowledge Graph Data On Demand Automated Ingestion of Patient Data Patient Safety Clinical Trial Ops R & D Health Economics Lineage The Smart Data Lake for Digital Patient Health Insight for Decision Makers Improving patient outcomes, safety, and comfort Reducing the time bring medicines to patients Lowering the cost of healthcare Insurance Claims Clinical Trials Rx Data Health Records Genetic Data Wearables +
  16. 16. The Semantic Layer disrupts business inhibitors Question or Idea Insights Delivered Data Prep and Clean IT Data Extraction Anzo Smart Data Lake™ Insights Delivered Load or Locate Data Load Data Semantic Layer TIME TIMETIME Question or Idea
  17. 17. ©2017 Cambridge Semantics Inc. All rights reserved. Anzo Smart Data Lake A Graph-based Platform to Disrupt Today’s Data and Analytics Challenges Connectors Models Rules Analytics & Tools ASDL Customer Fingerprint - Intellectual Property Data Ingestion & Mapping Automated ETL Generation Collaborative Mapping Text Processing Data Cataloging Data & Model Governance Active Metadata Management Role-Based Security Discovery & Analytics Automated Query Generation User Dashboards and Custom UI/UX Self-Serve Live Extracts In-Memory MPP Query Graphmarts on Demand ELT, Model Based Data Integration Document Search Enterprise Data Sources Enterprise Data Lakes “Last Mile” Analytics Exploratory Analytics Search and Discovery HiRes Analytics
  18. 18. ©2017 Cambridge Semantics Inc. All rights reserved. An Open Platform for the Semantic Layer based on Knowledge Graphs Rapid Ingestion & Mapping Automated ETL Generation Collaborative Mapping Unstructured Data and Text Processing Active Metadata Management Data Cataloging Model Governance Role-Based Security HiRes Anaytics™ Automated Query Generation Dashboards Custom UI/UX Analytics Endpoints Graphmarts On-Demand Data Layers for Prep, Cleaning, Transformation Analytic Query In-Memory MPP Query Engine Enterprise Data Sources Cloud Storage “Last Mile” Analytics Exploratory Analytics HiRes Analytics Search and Discovery Anzo Smart Data Lake™
  19. 19. ©2017 Cambridge Semantics Inc. All rights reserved. Click here to request a demo

×