Mainstreaming BI and Analytics with Enterprise Data Unification
Shobhit Chugh | Tamr
Data Heterogeneity is Inherent in Large Companies
Data sources are bound to applications with idiosyncratic bias
Sales
Marketing
Manufacturing
HR
Support
Finance
Apps Store
Sales
Marketing
Manufacturing
HR
Support
Finance
Aggregation of Data Creates Ambiguity/Complexity
Broad analytics create need to bring data together from many sources
Outside Forces = More Confusion + Complexity
Leadership Changes
Mergers & Acquisitions
Reorganizations
Result: Just 10% of Data is Consumable by Any One Person
And 80% of data scientist time is spent preparing it
90%
Dark Data
Expectations for Global Corporate IT as Data Broker
Increasing quickly -- along with the hype about Big Data/Analytics 3.0
HR
Sales
Finance
Divisions
Marketing MFG
ENG
Some Options
Option #1 - Deny Variety - use information that is easiest/closest
Option #2 - Manage Variety incrementally - using traditional approaches:
● Standardization
● Aggregation
● Master Data Management
● Rationalize Systems
● Throw Bodies at it
● Improve Individual Productivity
Option #3 - Embrace Variety using probabilistic/model-based approach - Tamr
Logical Evolution to Probabilistic/Model-Based Approach
Probabilistic
Deterministic
Probabilistic
Deterministic
Today Future
Probabilistic (Tamr) complements, NOT Replaces, Deterministic (MDM)
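To make the deterministic/probabilistic contrast concrete, here is a minimal sketch. It is not Tamr's method: the function names, the 0.7 threshold, and the use of plain string similarity as a stand-in for a trained model score are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def deterministic_match(a: str, b: str) -> bool:
    """Rule-based match: true only when the normalized strings are identical."""
    return a.strip().lower() == b.strip().lower()


def match_probability(a: str, b: str) -> float:
    """Stand-in for a model score: string similarity in [0, 1].

    A real system would use a trained model; difflib is only a placeholder.
    """
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def probabilistic_match(a: str, b: str, threshold: float = 0.7) -> bool:
    """Score-based match: true when the similarity reaches the threshold."""
    return match_probability(a, b) >= threshold


# Hypothetical supplier names, for illustration only.
pair = ("Acme Corp", "Acme Corporation")
print(deterministic_match(*pair))   # the exact rule rejects the variant spelling
print(probabilistic_match(*pair))   # the scored approach can accept it
```

The two approaches can coexist: exact rules handle the cases they cover, and the scored approach picks up the variants the rules miss.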
INTRODUCING TAMR
▪ Founded in 2013 by enterprise database software veterans
▪ World-class engineering team
▪ Top tier venture backing (Google Ventures, NEA)
Jerry Held, PhD
Andy Palmer
Mike Stonebraker, PhD
Ihab Ilyas, PhD
Kevin Burke
Nidhi Aggarwal, PhD
Min Xiao
Nik Bates-Haus
Kevin Willis
Managing enterprise information as an asset requires a new,
bottom-up design pattern
Catalog: ALL your metadata and map it to logical entities
Connect: Entities and attributes to remove information silos
Consume: Unified data in the application of your choice via APIs
“Embrace” Variety -- Tamr’s NextGen Approach
Tamr’s Design Pattern: “Back to the Future”
1990’s Web: Yahoo’s top-down organization
2020’s Enterprise: Probabilistic data source cataloging, connection and consumption
ARCHITECTURE
DATA & METADATA SOURCES
Analytics,
visualization,
Data Warehouse
Expert Sourcing
Data
Profiling
Schema
Matching
Record
Deduplication
Data Connection Activities
Data
Security
Data
Governance
Machine Learning
DB, ERP,
CRM, CSV
+ DATA
USES
Data
Security
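The "Schema Matching" activity in the architecture above can be illustrated with a small sketch. It assumes a simple heuristic (attribute-name similarity blended with value overlap), which is a stand-in rather than the product's algorithm; the source names `crm` and `erp`, the attribute names, and the 50/50 weighting are hypothetical.

```python
from difflib import SequenceMatcher


def jaccard(values_a, values_b) -> float:
    """Share of distinct values that the two attributes have in common."""
    a, b = set(values_a), set(values_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute_score(name_a, values_a, name_b, values_b, name_weight=0.5) -> float:
    """Blend name similarity and value overlap into a single score in [0, 1]."""
    name_sim = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    value_sim = jaccard(values_a, values_b)
    return name_weight * name_sim + (1 - name_weight) * value_sim


def match_schemas(source_a: dict, source_b: dict) -> dict:
    """Map each attribute of source_a to its best-scoring attribute in source_b."""
    mapping = {}
    for name_a, values_a in source_a.items():
        mapping[name_a] = max(
            source_b,
            key=lambda name_b: attribute_score(
                name_a, values_a, name_b, source_b[name_b]
            ),
        )
    return mapping


# Hypothetical sources: attribute name -> sample of values.
crm = {"supplier_name": ["Acme", "Globex"], "country": ["US", "DE"]}
erp = {"vendor": ["Acme", "Globex", "Initech"], "ctry_code": ["US", "DE", "FR"]}
print(match_schemas(crm, erp))
```

Value overlap matters here because the attribute names alone ("supplier_name" versus "vendor") share almost nothing.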
Fortune 50 company -- Optimized Sourcing Analysis
Benefits
● Massive reductions in supplier list size & number of distinct suppliers
● Automated data maintenance; lower cost of ownership
● Powering strategic sourcing analytics and governance
● Empowering individual procurement team with global view of payment terms
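How deduplication shrinks a supplier list can be sketched as follows. This is an illustrative assumption, not the approach used in the engagement above: names are normalized, compared pairwise with a string-similarity score, and grouped with union-find. The supplier names and the 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, drop commas and periods, and collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """True when the normalized names are at least `threshold` similar."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def cluster_suppliers(names, threshold: float = 0.85):
    """Group names into clusters of likely duplicates using union-find."""
    parent = list(range(len(names)))

    def find(i: int) -> int:
        # Walk to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; fine for a sketch, not for large lists.
    for i, j in combinations(range(len(names)), 2):
        if similar(names[i], names[j], threshold):
            parent[find(i)] = find(j)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


# Hypothetical supplier records from different systems.
suppliers = ["Acme Corp.", "ACME Corp", "Acme Corp", "Globex Inc", "Globex, Inc."]
clusters = cluster_suppliers(suppliers)
print(len(suppliers), "records ->", len(clusters), "distinct suppliers")
```

Each resulting cluster stands for one distinct supplier, which is what allows payment terms to be compared across business units.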
Catalog
Tamr helps you catalog
metadata across the entire
enterprise, providing a logical
map of all of your information
Find us at Booth #613
Connect
Tamr helps match entities
and attributes across the
full variety of your sources,
leveraging entity relationships
for high accuracy
Consume
Tamr provides a consolidated
view of entities and records for
downstream applications via
a set of RESTful APIs
learn more at tamr.com
ABSTRACT (FOR REFERENCE)
Organizations want to use all the data available to them for analytics. But they’ve been thwarted by
data silos and top-down, mostly manual approaches to unifying data for analytics. A new approach,
based on machine learning combined with human expert sourcing, dramatically speeds analytics’
time-to-value. It automates data unification end-to-end: from finding and connecting diverse data to
interactive consumption by virtually anyone using any analytic tool.
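One way to picture "machine learning combined with human expert sourcing" is a confidence triage: the model decides the clear cases and only the uncertain ones go to a person. The sketch below is an assumption about how such routing could look, not a description of the product; the function name, thresholds, and sample scores are hypothetical.

```python
def triage(scored_pairs, accept: float = 0.9, reject: float = 0.3):
    """Split (pair, score) items into auto-match, expert-review, and auto-reject."""
    auto_match, ask_expert, auto_reject = [], [], []
    for pair, score in scored_pairs:
        if score >= accept:
            auto_match.append(pair)       # confident enough to merge automatically
        elif score <= reject:
            auto_reject.append(pair)      # confident enough to keep separate
        else:
            ask_expert.append(pair)       # uncertain: route to a human expert
    return auto_match, ask_expert, auto_reject


# Hypothetical model scores for candidate record pairs.
scored = [
    (("Acme Corp", "ACME Corp."), 0.97),
    (("Acme Corp", "Acme Holdings"), 0.62),
    (("Acme Corp", "Globex Inc"), 0.08),
]
matched, review, rejected = triage(scored)
print(len(review), "pair(s) sent to experts")
```

Expert answers on the middle band could then be fed back as training labels, so the band narrows over time.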

Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 

Tamr Gartner BI and Analytics Summit

  • 1. Mainstreaming BI and Analytics with Enterprise Data Unification Shobhit Chugh | Tamr
  • 2. Data Heterogeneity is Inherent in Large Companies Data sources are bound to applications with idiosyncratic bias Sales Marketing Manufacturing HR Support Finance Apps Store
  • 3. Sales Marketing Manufacturing HR Support Finance Aggregation of Data Creates Ambiguity/Complexity Broad analytics create need to bring data together from many sources
  • 4. Outside Forces = More Confusion + Complexity Leadership Changes Mergers & Acquisitions Reorganizations
  • 5. Result: Just 10% of Data is Consumable by Any One Person And 80% of data scientist time is spent preparing it 90% Dark Data
  • 6. Expectations for Global Corporate IT as Data Broker Increasing quickly -- along with the hype about Big Data/Analytics 3.0 HR Sales Finance Divisions Marketing MFG ENG
  • 7. Some Options Option #1 - Deny Variety - use information that is easiest/closest Option #2 - Manage Variety incrementally - using traditional approaches: ● Standardization ● Aggregation ● Master Data Management ● Rationalize Systems ● Throw Bodies at it ● Improve Individual Productivity Option #3 - Embrace Variety using probabilistic/model-based approach - Tamr
  • 8. Logical Evolution to Probabilistic/Model-Based Approach Probabilistic Deterministic Probabilistic Deterministic Today Future Probabilistic (Tamr) complements, NOT Replaces, Deterministic (MDM)
  • 9. INTRODUCING TAMR ▪ Founded in 2013 by enterprise database software veterans ▪ World-class engineering team ▪ Top tier venture backing (Google Ventures, NEA) Jerry Held, PhD; Andy Palmer; Mike Stonebraker, PhD; Ihab Ilyas, PhD; Kevin Burke; Nidhi Aggarwal, PhD; Min Xiao; Nik Bates-Haus; Kevin Willis
  • 10. Managing enterprise information as an asset requires a new, bottom-up design pattern Catalog Connect Consume ALL your metadata and map it to logical entities Entities and attributes to remove information silos Unified data in the application of your choice via APIs “Embrace” Variety -- Tamr’s NextGen Approach
  • 11. Tamr’s Design Pattern: “Back to the Future” 1990’s Web: Yahoo’s top-down organization 2020’s Enterprise: Probabilistic data source cataloging, connection and consumption
  • 12. ARCHITECTURE (diagram labels): Data & Metadata Sources (DB, ERP, CRM, CSV +); Data Uses (Analytics, visualization, Data Warehouse); Data Connection Activities; Expert Sourcing; Data Profiling; Schema Matching; Record Deduplication; Machine Learning; Data Security; Data Governance
  • 13. Fortune 50 company -- Optimized Sourcing Analysis Benefits ● Massive reductions in supplier list size & number of distinct suppliers ● Automated data maintenance; lower cost of ownership ● Powering strategic sourcing analytics and governance ● Empowering individual procurement team with global view of payment terms
  • 14. Catalog: Tamr helps you catalog metadata across the entire enterprise, providing a logical map of all of your information. Connect: Tamr helps match entities and attributes across the full variety of your sources, leveraging entity relationships for high accuracy. Consume: Tamr provides a consolidated view of entities and records for downstream applications via a set of RESTful APIs. Learn more at tamr.com. Find us at Booth #613.
  • 15. ABSTRACT (FOR REFERENCE) Organizations want to use all the data available to them for analytics. But they’ve been thwarted by data silos and top-down, mostly manual approaches to unifying data for analytics. A new approach, based on machine learning combined with human expert sourcing, dramatically speeds analytics’ time-to-value. It automates data unification end-to-end: from finding and connecting diverse data to interactive consumption by virtually anyone using any analytic tool.

Editor's Notes

  1. Heterogeneity of information sources is natural in large companies. Much of the roughly $3-4 trillion invested in enterprise software over the last 20 years has gone toward building and deploying systems and applications that automate and optimize key business processes in the context of specific functions (sales, marketing, manufacturing) and/or geographies (countries, regions, states, etc.). Essentially, these are systems that produce data, and they do so in a very idiosyncratic manner. As each of these idiosyncratic applications is deployed, an equally idiosyncratic data source is created. The result: the data tied to enterprise investments in software is extremely heterogeneous and siloed. Broad use of the data has been secondary to the primary activity of automating the business processes that produce it. The data is almost like an idiosyncratic exhaust of all of these various applications. It’s not surprising (actually, it’s natural) that information across a large enterprise is disconnected and is managed more as the exhaust of 30+ years of business process automation. I think of this as a form of enterprise information entropy. The effort to standardize on single-vendor platforms and to create enterprise-wide data warehouses has largely been an attempt to compensate for this natural data variety/entropy. Ironically, the top-down approaches used to rationalize to a single platform or implement most warehouses (deterministic ETL, Master Data Management and waterfall data management methods) created not fewer silos but additional, larger silos that increased the overall variety of data sources within an organization.
  3. On top of the historical pull toward application- and organization-specific data sources, these systems get even more complicated and disconnected when you add the confusion and complexity that result from M&A events every quarter, reorganizations every 6-12 months, and changes in leadership every few years.
  4. Objective estimates of the scale of this problem are surprising. Specifically, industry analysts estimate that 90% of big data is dark (not used or cataloged within the enterprise), 90% of collected data isn’t consumable (it requires significant work to be useful), and 80% of data scientist time is spent preparing the data for consumption. In short, data is not being managed as an asset.
  5. This challenge is only going to become more critical, especially as expectations of Global Corporate IT as data broker are increasing quickly along with the hype around Big Data/Analytics 3.0. As we look forward to the next 20 years, most companies have begun investing heavily in Big Data Analytics: $44 billion in 2014 alone, according to Gartner << insert reference to Data/Analytics being the top priority for CIOs >>. In this context, merely managing all of a company’s data as an asset presents a significant challenge for a globally missioned IT organization. But now enter the trend toward the proverbial Big Data and Analytics 3.0, and the already impossible problem of managing data variety becomes a strategic imperative for the IT organization, which is now expected to integrate analytics and data seamlessly and quickly across all of these idiosyncratic silos so that all these users, armed with great new democratized viz tools, can actually put them to work. We’d like to think that our data integration and preparation capabilities are advanced enough to service this great democratization, and that our “plumbing” is capable of treating the massive reserves of siloed, heterogeneous data. However, these aspirations, and the cool new viz tools that are available to everyone in the enterprise, require clean, unified data that spans all the various silos. Most companies are finding that this heterogeneity is a massive, fundamental roadblock to effectively using state-of-the-art analytics and visualization tools. Basically, Big Data variety and heterogeneity is the dirty little secret of most enterprises. While it’s not sexy to spend time cleaning and preparing data, unified data is as important to enterprise analytics as reliable water treatment is to providing clean drinking water to the population. All of this leaves Corporate IT organizations with several options for addressing the data variety problem as data brokers for their enterprise.
  6. Some orgs are simply ignoring the opportunity to convert variety into value – overwhelmed by the sheer volume of heterogeneous sources and data. So they go ahead and carve out their pile, go to their corner, and work with what they have.
  7. Multiple approaches have emerged to deal with the Data Variety problem, with the current state dominated by extreme top-down management (95% deterministic to 5% probabilistic). I predict that the sheer number of data sources and the complexity of change are going to drive us toward a bottom-up approach (80% probabilistic to 20% deterministic). The only viable way to tame enterprise data variety is through bottom-up, collaborative data curation that complements traditional MDM, ETL, data profiling and data quality methods.
  8. A Next-Gen Approach: We believe that big companies should start by deploying a fundamentally new design pattern for data management, one that enables their organization to dynamically catalog, connect and curate ALL of their enterprise information sources from the bottom up using a scalable and agile approach. NOTE that Tamr operationalizes this approach at scale, across the enterprise, NOT as another idiosyncratic solution, AND works with existing data management and analytics tools. Connect: Our emphasis has been on connecting diverse data sources across the enterprise, at scale. We are now expanding the platform to bring this level of scalable data unification and use across the enterprise. Catalog: At the front end, Tamr now solves a very common problem: What data do I use to solve this problem? Consume/Curate: Unified data doesn’t live in Tamr. We make it available to any downstream application or analytic tool, including something as simple as a spreadsheet, via a set of RESTful APIs.
  9. This design pattern is not new. It mimics the design patterns of the modern World Wide Web, but is designed to connect the primary information asset of the enterprise: tabular data. In the mid-1990s, the early days of Yahoo!, the company used library sciences professionals and top-down information management practices and tools to organize websites and web content for search. Over time, it became clear that Google’s bottom-up, probabilistic approach to matching web content with search terms was going to be a much more scalable and effective approach, so much so that, as most of you know, Yahoo! decided to license Google’s tech. Inside the enterprise, tabular data sources, not websites, are the primary assets to be connected, and companies need a new set of tools to register/catalog, connect and curate tabular data that is matched to the data/attributes that analytic users want/need. We believe that our technology at Tamr will be incorporated into existing legacy MDM, ETL and Data Management tools, much in the way that Yahoo! licensed Google.
  10. Challenge: With thousands of suppliers spanning many P&Ls and ERP systems, the company has been challenged to maintain an accurate supplier master file (SMF) to drive strategic sourcing analysis. Solution: Create a unified data model that leverages all relevant sources, including address, tax and government data; machine learning algorithms continuously evaluate and remove potential SMF duplicates; automated processing incrementally improves as validation is received from SMEs. Benefits: Massive reductions in supplier list size and number of distinct suppliers; automated data maintenance and lower cost of ownership in production; powering strategic sourcing analytics and governance at a corporate level; empowering individual procurement teams with a global view of payment terms. Here’s the link for the long-form write-up the team did, for background: https://docs.google.com/a/tamr.com/document/d/12JvLG4wr_PjpKOGlUyoDx6iVULCAkwm5bhHKMYP7vwU/edit?usp=sharing
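
The supplier case in the notes above depends on scoring likely duplicates instead of relying on exact-match rules. The Python sketch below illustrates that general idea only; it is not Tamr's algorithm. It assumes a hand-weighted similarity score built on the standard library's `difflib` in place of a trained model, and the sample records, the `LEGAL_SUFFIXES` set, the 0.8/0.2 weights and the 0.85 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical supplier records, as if pulled from different ERP systems.
SUPPLIERS = [
    {"id": "A1", "name": "Acme Industrial Supply Inc.", "city": "Boston"},
    {"id": "B7", "name": "ACME Industrial Supply", "city": "Boston"},
    {"id": "C3", "name": "Globex Logistics", "city": "Chicago"},
    {"id": "D9", "name": "Globex Logistics LLC", "city": "Chicago"},
    {"id": "E2", "name": "Initech", "city": "Austin"},
]

# Illustrative list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize(name):
    """Lowercase, strip punctuation, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def match_score(a, b):
    """Return a similarity in [0, 1]; the weights are assumed, not learned."""
    name_sim = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    city_sim = 1.0 if a["city"].lower() == b["city"].lower() else 0.0
    return 0.8 * name_sim + 0.2 * city_sim


def cluster_duplicates(records, threshold=0.85):
    """Group record ids whose pairwise score meets the threshold (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if match_score(a, b) >= threshold:
            parent[find(b["id"])] = find(a["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values())


if __name__ == "__main__":
    for cluster in cluster_duplicates(SUPPLIERS):
        print(cluster)
```

In a setup like the one described, pairs that score just below the threshold would presumably be routed to subject-matter experts, and their answers used to retune the weights or the threshold over time. Comparing every pair is quadratic in the number of records, so a real supplier file would need some form of blocking first.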