Grab some coffee and enjoy the pre-show banter before the top of the hour!
Has Traditional MDM Finally Met Its Match? 
The Briefing Room
Twitter Tag: #briefr 
Welcome
Host: Eric Kavanagh
eric.kavanagh@bloorgroup.com 
@eric_kavanagh
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Topics
This Month: INTEGRATION & DATA FLOW
October: ANALYTIC PLATFORMS
November: DISCOVERY & VISUALIZATION
2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
There’s a New Sheriff in Town! 
Executive Summary 
• Speed and power trump the old way 
• Traditional MDM is officially archaic 
• YARN is the new fabric of MDM
Analyst: Robin Bloor 
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com 
@robinbloor
RedPoint Global 
• RedPoint Global is a data management and integrated marketing technology company
• RedPoint Data Management offers solutions designed for master data management (MDM), collaboration and architecture integration
• RedPoint Data Management for Hadoop is YARN-compliant and enables analysts to access and manipulate data directly within the Hadoop cluster
Guest: George Corugedo 
George Corugedo is Chief Technology Officer & Co-Founder at RedPoint Global Inc. A mathematician and seasoned technology executive, George has over 20 years of business and technical expertise. As co-founder and CTO of RedPoint Global, George is responsible for leading the development of the RedPoint Convergent Marketing Platform™. A former math professor, George left academia to co-found Accenture’s Customer Insight Practice, which specialized in strategic data utilization, analytics and customer strategy. Previous positions include director of client delivery at ClarityBlue, Inc., a provider of hosted customer intelligence solutions to enterprise commercial entities, and COO/CIO of Riscuity, a receivables management company specializing in the utilization of analytics to drive collections.
MDM for the Modern Data Architecture 
September 2014
Purpose of MDM 
Create correct and consistent data across the enterprise that earns trust in information and accelerates growth.
11 © RedPoint Global Inc. 2014 Confidential
Vicious Cycle of Unmanaged Data
1. Master data issues remain unaddressed or unresolved
2. Garbage in/garbage out creates process confusion
3. Lack of process trust slows business momentum
4. Data conflicts reinforce siloed operations
A Data Architecture Under Pressure (© Hortonworks Inc. 2014)
Broad Spectrum of Benefits Across Industries
Gartner’s Nexus of Forces Making Things Worse
Business Benefits of MDM
Types of Data in a Typical Organization 
Challenges to Data Lake Approach
• Severe shortage of MapReduce skilled resources
• Inconsistent skills lead to inconsistent results of code-based solutions
• Nascent technologies require multiple point solutions
• Technologies are not enterprise grade
• Some functionality may not be possible within these frameworks
Benefits of a Hadoop Data Lake
• Data is ingested in its raw state regardless of format, structure or lack of structure
• Raw data can be used and reused for differing purposes across the enterprise
• Beyond inexpensive storage, Hadoop is an extremely powerful, scalable and segmentable computational platform
• Master data can be fed across the enterprise and deep analytics on clean data is immediately enabled
Big Data Can Become Big Information
• Ingestion of all data available from any source, format, cadence, structure or non-structure
• ELT and data transformation, refinement, cleansing, completion, validation and standardization
• Geospatial processing and geocoding
• Data profiling, lineage and metadata management
• Identity resolution and persistent keying and entity profile management
• Attribute source and consumer mapping
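To make the cleansing, validation and standardization step concrete, here is a minimal Java sketch. It is not RedPoint's implementation: the class `RecordStandardizer`, its method names and the simplified email pattern are all hypothetical, chosen only to illustrate the kind of field-level rules such a step applies before records are matched.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of any RedPoint or Hadoop API.
public class RecordStandardizer {
    // Deliberately simple pattern (assumption); production email validation is more involved.
    private static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    /** Trim, collapse internal whitespace, and upper-case a free-text name. */
    public static String standardizeName(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
    }

    /** Keep digits only, so differently punctuated phone numbers compare equal. */
    public static String standardizePhone(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("\\D", "");
    }

    /** Validate that a value looks like an email address. */
    public static boolean isValidEmail(String raw) {
        return raw != null && EMAIL.matcher(raw.trim()).matches();
    }
}
```

Standardizing before matching matters because identity resolution compares field values: two records for the same person only line up once spacing, case and punctuation differences are removed.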
Data Lake Architecture for MDM
Data Sources: CRM, ERP, Billing, Subscriber, Product, Network, Weather, Compete, Manuf., Clickstream, Online Chat, Sensor Data, Social Media, Call Detail Records, Fabrication Logs, Sales Feedback, Field Feedback
Key Functions for Master Data Management
ETL & ELT
• Profiling, reads/writes, transformations
• Single project for all jobs
Data Quality
• Cleanse data
• Parsing, correction
• Geo-spatial analysis
Integration & Matching
• Grouping
• Fuzzy match
Master Key Management
• Create keys
• Track changes
• Maintain matches over time
Web Services Integration
• Consume and publish
• HTTP/HTTPS protocols
• XML/JSON/SOAP formats
Process Automation & Operations
• Job scheduling, monitoring, notifications
• Central point of control
• Meta Data Management
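The "fuzzy match" and "create keys / maintain matches over time" functions can be sketched in a few lines of Java. This is an illustrative assumption, not the product's algorithm: the class `MasterKeyRegistry`, the use of Levenshtein edit distance, and the similarity threshold are all hypothetical choices. The sketch assigns a persistent integer key to each distinct entity and returns the existing key when a new name is close enough to one already seen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical registry, for illustration only; real MDM matching uses richer rules.
public class MasterKeyRegistry {
    private final List<String> names = new ArrayList<>(); // key = index + 1
    private final double threshold;

    public MasterKeyRegistry(double threshold) {
        this.threshold = threshold;
    }

    /** Classic Levenshtein edit distance using two rolling rows. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 1.0 means identical strings. */
    public static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) editDistance(a, b) / longest;
    }

    /** Return the key of the first sufficiently similar name, or create a new key. */
    public int resolve(String rawName) {
        String name = rawName.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
        for (int i = 0; i < names.size(); i++) {
            if (similarity(name, names.get(i)) >= threshold) {
                return i + 1;
            }
        }
        names.add(name);
        return names.size();
    }
}
```

With a threshold of 0.85, "Jon Smith" and "John Smith" resolve to the same key, while "Mary Jones" receives a new one. The linear scan and first-match rule keep the sketch short; at scale, grouping (blocking) records first avoids comparing every pair.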
So How to Proceed? 
Overview - What is Hadoop/Hadoop 2.0
Hadoop 1.0
• All operations based on MapReduce
• Intrinsic inconsistency of code-based solutions
• Highly skilled and expensive resources needed
• 3rd party applications constrained by the need to generate code
Hadoop 2.0
• Introduction of YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.”
• Mature applications can now operate directly on Hadoop
• Reduced skill requirements and increased consistency
RedPoint Data Management on Hadoop
[Diagram. Labels: Partitioning AM / Tasks; Parallel Section (UI); Execution AM / Tasks; Data I/O; Key / Split Analysis; YARN; MapReduce]
Reference Hadoop Architecture
[Diagram. Labels: Monitoring and Management Tools (AMBARI); DATA REFINEMENT (PIG, HIVE, MAPREDUCE); STRUCTURE: HCATALOG (metadata services); INTERACTIVE: HIVE Server2; LOAD: SQOOP, WebHDFS, Flume, NFS; REST, HTTP, STREAM; YARN; HDFS nodes 1 to n; SOURCE DATA: Sensor Logs, Clickstream, Flat Files, Unstructured, Sentiment, Customer, Inventory; DBs, Files, JMS Queues; Data Sources: RDBMS, EDW; Query/Visualization/Reporting/Analytical Tools and Apps]
RedPoint Functional Footprint
[Diagram. Same labels as the Reference Hadoop Architecture slide.]
RedPoint Benchmarks – Project Gutenberg

Sample MapReduce (small subset of the entire code, which totals nearly 150 lines):

    public static class MapClass extends Mapper<WordOffset, Text, Text, IntWritable> {
        // Characters treated as token separators
        private final static String delimiters =
            "',./<>?;:\"[]{}-=_+()&*%^#$!@`~ |«»¡¢£¤¥¦©¬®¯±¶·¿";
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (token, 1) for every token found in the input line
        public void map(WordOffset key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line, delimiters);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

Sample Pig script without the UDF:

    SET pig.maxCombinedSplitSize 67108864
    SET pig.splitCombination true
    A = LOAD '/testdata/pg/*/*/*';
    B = FOREACH A GENERATE FLATTEN(TOKENIZE((chararray)$0)) AS word;
    C = FOREACH B GENERATE UPPER(word) AS word;
    D = GROUP C BY word;
    E = FOREACH D GENERATE COUNT(C) AS occurrences, group;
    F = ORDER E BY occurrences DESC;
    STORE F INTO '/user/cleonardi/pg/pig-count';

                MapReduce                       Pig                                 RedPoint
Code            >150 lines of MR code           ~50 lines of script code            0 lines of code
Development     6 hours                         3 hours                             15 min.
Runtime         6 minutes                       15 minutes                          3 minutes
Notes           Extensive optimization needed   User Defined Functions required     No tuning or optimization required
                                                prior to running script
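For readers who want to see what the benchmark task computes without a cluster, the following is a small single-machine Java sketch of the same word count: tokenize, upper-case, count, and order by occurrences descending. It is an illustration only, not the benchmarked code; the class `LocalWordCount` is hypothetical and its delimiter set is a shortened assumption rather than the one used on the slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical single-JVM equivalent of the word-count job, for illustration only.
public class LocalWordCount {
    // Shortened delimiter set (assumption); the slide's mapper uses a longer list.
    private static final String DELIMITERS = " \t\n\r',./<>?;:\"[]{}-=_+()&*%^#$!@`~|";

    /** Count upper-cased tokens across all lines (the map and reduce steps combined). */
    public static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line, DELIMITERS);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken().toUpperCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Order entries by occurrences descending, breaking ties alphabetically. */
    public static List<Map.Entry<String, Integer>> byOccurrencesDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }
}
```

The logic itself is short; most of the development and tuning effort reported in the comparison comes from distributing this work across a cluster, not from the counting.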
Data Lake Architecture for MDM
Data Sources: CRM, ERP, Billing, Subscriber, Product, Network, Weather, Compete, Manuf., Clickstream, Online Chat, Sensor Data, Social Media, Call Detail Records, Fabrication Logs, Sales Feedback, Field Feedback
Perceptions & Questions
Analyst: Robin Bloor
What Can You Do With a Data Lake?
Robin Bloor, Ph.D.
The Story So Far…
The old Data Warehouse World (environment) is fast dying – giving way to a dystopian future dominated by alien and mutant data, carried by vast unruly data streams that flow rapidly into dank and murky data lakes.
This is Hadoop World.
HOW DO WE MAKE SENSE OF THIS?
The Big Data Architecture
[Diagram. The Data Layer (Data Refinery and Processing Hub): Data Reservoir (Hadoop); General Data Server(s); Specialist Data Server(s); Data Preparation; Filtering, Replicating & Routing; ETL & Data Virt'n; Data Flow (Optimize); Local Workloads; Local Data. The Application Layer: Data Streaming Apps, Trans Apps, BI Apps, Office Apps, Data Marts; Events; Data Flow; Data Export. Inputs: Streams, IOT, Log files, DaaS, Mobile Devices, Desktops, Servers, The Cloud, Social media, etc. Note on slide: Applications may use the Data Hub directly.]
The Main Point to Note
This is WAY more complicated than the old Data Warehouse world
The Governance of Data
It’s all GOVERNANCE!!
[Diagram. Corporate Data Hub: Data Reservoir (Hadoop); General Data Server(s); Specialist Data Server(s); ETL & Data Virt'n; Local Workloads; Data Extracts. Governance labels: Data Security; Data Life Cycle Mgt; MDM & Business Glossary; Data Cleansing; System Management; MetaData Management; Performance Monitoring & Mgt; Data Lineage; Data Mapping; MetaData Discovery; Service Level Mgt]
The Evolution of Hadoop
• There were many components before YARN and Tez
• But YARN and Tez have changed the picture
• Hadoop is becoming the default scale-out file system and the OS for data flow
The Prognosis
The foundation is in place for a comprehensive Big Data Information Architecture…
But BUILDING such integrated systems will not be easy
u How does RedPoint see the role of Hadoop 
(ingest-point, ETL engines, MDM work area, 
analytical sandbox, database, etc.); some of 
these? All of these? 
u Often in the past, MDM implementations have 
proved to be disappointing. What makes RedPoint 
different given that the data environment is 
more challenging than ever? 
u Which companies/technologies do you see as 
competitive with RedPoint
u Which verticals have shown the greatest 
interest in RedPoint? 
u How does a RedPoint engagement normally pan 
out? 
u If you are intent upon doing MDM, where is it 
best to start?
Upcoming Topics
This Month: INTEGRATION & DATA FLOW
October: ANALYTIC PLATFORMS
November: DISCOVERY & VISUALIZATION
www.insideanalysis.com/webcasts/the-briefing-room
2014 Editorial Calendar at www.insideanalysis.com
THANK YOU for your ATTENTION!
Opening slide image courtesy of Wikimedia Commons

The Hadoop Guarantee: Keeping Analytics Running On Time
 
Introducing: A Complete Algebra of Data
Introducing: A Complete Algebra of DataIntroducing: A Complete Algebra of Data
Introducing: A Complete Algebra of Data
 
The Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionThe Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop Adoption
 
Ahead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsAhead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time Analytics
 
All Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingAll Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of Everything
 
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLGoodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
 
The Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelThe Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global Level
 
Structurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureStructurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your Architecture
 
SQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskSQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the Risk
 
The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big Data
 
A Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseA Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data Warehouse
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
 
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldRethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile World
 
DisrupTech - Dave Duggal
DisrupTech - Dave DuggalDisrupTech - Dave Duggal
DisrupTech - Dave Duggal
 
Modus Operandi
Modus OperandiModus Operandi
Modus Operandi
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 

Has Traditional MDM Finally Met its Match?

  • 1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
  • 2. Has Traditional MDM Finally Met Its Match? The Briefing Room
  • 3. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com @eric_kavanagh
  • 4. Mission: • Reveal the essential characteristics of enterprise software, good and bad • Provide a forum for detailed analysis of today’s innovative technologies • Give vendors a chance to explain their product to savvy analysts • Allow audience members to pose serious questions... and get answers!
  • 5. Topics: This Month: INTEGRATION & DATA FLOW. October: ANALYTIC PLATFORMS. November: DISCOVERY & VISUALIZATION. 2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
  • 6. Executive Summary: There’s a New Sheriff in Town! • Speed and power trump the old way • Traditional MDM is officially archaic • YARN is the new fabric of MDM
  • 7. Analyst: Robin Bloor. Robin Bloor is Chief Analyst at The Bloor Group. robin.bloor@bloorgroup.com @robinbloor
  • 8. RedPoint Global: • RedPoint Global is a data management and integrated marketing technology company • RedPoint Data Management offers solutions designed for master data management (MDM), collaboration and architecture integration • RedPoint Data Management for Hadoop is YARN-compliant and enables analysts to access and manipulate data directly within the Hadoop cluster
  • 9. Guest: George Corugedo. George Corugedo is Chief Technology Officer & Co-Founder at RedPoint Global Inc. A mathematician and seasoned technology executive, George has over 20 years of business and technical expertise. As co-founder and CTO of RedPoint Global, George is responsible for leading the development of the RedPoint Convergent Marketing Platform™. A former math professor, George left academia to co-found Accenture’s Customer Insight Practice, which specialized in strategic data utilization, analytics and customer strategy. Previous positions include director of client delivery at ClarityBlue, Inc., a provider of hosted customer intelligence solutions to enterprise commercial entities, and COO/CIO of Riscuity, a receivables management company specializing in the utilization of analytics to drive collections.
  • 10. MDM for the Modern Data Architecture September 2014
  • 11. Purpose of MDM Create correct and consistent data across the enterprise that earns trust in information and acceleration of growth. 11 © RedPoint Global Inc. 2014 Confidential
  • 12. Vicious Cycle of Unmanaged Data: 1. Master data issues remain unaddressed or unresolved 2. Garbage in/garbage out creates process confusion 3. Lack of process trust slows business momentum 4. Data conflicts reinforce siloed operations
  • 13. A Data Architecture Under Pressure (© Hortonworks Inc. 2014)
  • 14. Broad Spectrum of Benefits Across Industries
  • 15. Gartner’s Nexus of Forces Making Things Worse
  • 16. Business Benefits of MDM
  • 17. Types of Data in a Typical Organization
      Challenges to Data Lake Approach
        • Severe shortage of MapReduce-skilled resources
        • Inconsistent skills lead to inconsistent results of code-based solutions
        • Nascent technologies require multiple point solutions
        • Technologies are not enterprise grade
        • Some functionality may not be possible within these frameworks
      Benefits of a Hadoop Data Lake
        • Data is ingested in its raw state regardless of format, structure or lack of structure
        • Raw data can be used and reused for differing purposes across the enterprise
        • Beyond inexpensive storage, Hadoop is an extremely powerful, scalable and segmentable computational platform
        • Master data can be fed across the enterprise, and deep analytics on clean data is immediately enabled
  • 18. Big Data Can Become Big Information
        • Ingestion of all data available from any source, format, cadence, structure or non-structure
        • ELT and data transformation, refinement, cleansing, completion, validation and standardization
        • Geospatial processing and geocoding
        • Data profiling, lineage and metadata management
        • Identity resolution, persistent keying and entity profile management
        • Attribute source and consumer mapping
  • 19. Data Lake Architecture for MDM. Data Sources: CRM, ERP, Billing, Subscriber, Product, Network, Weather, Compete, Manuf., Clickstream, Online Chat, Sensor Data, Social Media, Call Detail Records, Fabrication Logs, Sales Feedback, Field Feedback
  • 20. Key Functions for Master Data Management
      ETL & ELT
        • Profiling, reads/writes, transformations
        • Single project for all jobs
      Data Quality
        • Cleanse data
        • Parsing, correction
        • Geo-spatial analysis
      Integration & Matching
        • Grouping
        • Fuzzy match
      Master Key Management
        • Create keys
        • Track changes
        • Maintain matches over time
      Web Services Integration
        • Consume and publish
        • HTTP/HTTPS protocols
        • XML/JSON/SOAP formats
      Process Automation & Operations
        • Job scheduling, monitoring, notifications
        • Central point of control
        • Meta Data Management
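To make the "Fuzzy match", "Create keys" and "Maintain matches over time" functions on slide 20 more concrete, here is a minimal Java sketch. It is not RedPoint's algorithm, which the deck does not describe. The class name `MasterKeySketch`, the edit-distance threshold and the sample company names are all hypothetical, and Levenshtein distance is swapped in as one common fuzzy-matching technique.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: cleanse, fuzzy-match, assign a persistent key.
class MasterKeySketch {
    // Cleanse step: lower-case, replace punctuation with spaces, collapse whitespace.
    static String normalize(String raw) {
        return raw.toLowerCase(Locale.ROOT)
                  .replaceAll("[^a-z0-9 ]", " ")
                  .trim()
                  .replaceAll("\\s+", " ");
    }

    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    private final List<String> masters = new ArrayList<>(); // list index doubles as the master key
    private final int maxDistance;                          // assumed match threshold

    MasterKeySketch(int maxDistance) { this.maxDistance = maxDistance; }

    // Returns the key of the first master record within maxDistance, else creates a new key.
    int assignKey(String rawName) {
        String name = normalize(rawName);
        for (int key = 0; key < masters.size(); key++) {
            if (editDistance(masters.get(key), name) <= maxDistance) return key;
        }
        masters.add(name);
        return masters.size() - 1;
    }

    public static void main(String[] args) {
        MasterKeySketch mdm = new MasterKeySketch(2);
        for (String name : List.of("Acme Corp.", "ACME Corp", "Acme Corpp", "Globex Inc")) {
            System.out.println(name + " -> key " + mdm.assignKey(name));
        }
    }
}
```

Because keys are only ever appended, a record that matched key 0 yesterday still resolves to key 0 today, which is the simplest form of maintaining matches over time. A production system would also have to handle merges, splits and change tracking.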
  • 21. So How to Proceed?
  • 22. Overview - What is Hadoop/Hadoop 2.0
      Hadoop 1.0
        • All operations based on MapReduce
        • Intrinsic inconsistency of code-based solutions
        • Highly skilled and expensive resources needed
        • 3rd party applications constrained by the need to generate code
      Hadoop 2.0
        • Introduction of YARN: “a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters.”
        • Mature applications can now operate directly on Hadoop
        • Reduced skill requirements and increased consistency
  • 23. RedPoint Data Management on Hadoop [Diagram labels: Parallel Section (UI); Partitioning AM / Tasks; Execution AM / Tasks; Data I/O; Key / Split Analysis; YARN; MapReduce]
  • 24. Reference Hadoop Architecture [Diagram labels: Monitoring and Management Tools (Ambari); Data Refinement (Pig, Hive, MapReduce); REST, HTTP, Stream; Structure: HCatalog (metadata services); Interactive: Hive Server2; Load: Sqoop, WebHDFS, Flume, Sqoop/Hive; YARN; HDFS nodes 1 to n; Query/Visualization/Reporting/Analytical Tools and Apps; Source Data: sensor logs, clickstream, flat files, unstructured, sentiment, customer, inventory; Data Sources: RDBMS, EDW, DBs, files, JMS queues, NFS]
  • 25. RedPoint Functional Footprint [Diagram: same component labels as slide 24]
  • 26. RedPoint Benchmarks – Project Gutenberg
      MapReduce: >150 lines of MR code; 6 hours of development; 6 minutes runtime; extensive optimization needed.
      Pig: ~50 lines of script code; 3 hours of development; 15 minutes runtime; User Defined Functions required prior to running script.
      RedPoint: 0 lines of code; 15 min. of development; 3 minutes runtime; no tuning or optimization required.
      Sample MapReduce (small subset of the entire code, which totals nearly 150 lines):
          public static class MapClass extends Mapper<WordOffset, Text, Text, IntWritable> {
              // Characters that separate words in the input text
              private final static String delimiters = "',./<>?;:\"[]{}-=_+()&*%^#$!@`~ |«»¡¢£¤¥¦©¬®¯±¶·¿";
              private final static IntWritable one = new IntWritable(1);
              private Text word = new Text();

              // Emit (word, 1) for every token in the line
              public void map(WordOffset key, Text value, Context context)
                      throws IOException, InterruptedException {
                  String line = value.toString();
                  StringTokenizer itr = new StringTokenizer(line, delimiters);
                  while (itr.hasMoreTokens()) {
                      word.set(itr.nextToken());
                      context.write(word, one);
                  }
              }
          }
      Sample Pig script without the UDF:
          SET pig.maxCombinedSplitSize 67108864;
          SET pig.splitCombination true;
          A = LOAD '/testdata/pg/*/*/*';
          B = FOREACH A GENERATE FLATTEN(TOKENIZE((chararray)$0)) AS word;
          C = FOREACH B GENERATE UPPER(word) AS word;
          D = GROUP C BY word;
          E = FOREACH D GENERATE COUNT(C) AS occurrences, group;
          F = ORDER E BY occurrences DESC;
          STORE F INTO '/user/cleonardi/pg/pig-count';
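For readers without a cluster at hand, the benchmark task on slide 26 (tokenize, upper-case, count, order by occurrences descending) can be sketched in plain Java with no Hadoop dependency. This is only an in-memory illustration of the logic, not the benchmarked code. The class name `WordCountSketch`, the shortened delimiter set and the sample sentences are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical single-process stand-in for the MapReduce/Pig word count.
class WordCountSketch {
    // Assumed, shortened delimiter set; the slide's list is longer.
    static final String DELIMITERS = " \t\n',./<>?;:\"[]{}-=_+()&*%^#$!@`~|";

    // Map and reduce collapsed into one pass: tokenize, upper-case, count.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line, DELIMITERS);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken().toUpperCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Counterpart of "ORDER E BY occurrences DESC"; the stable sort keeps ties alphabetical.
    static List<Map.Entry<String, Integer>> byOccurrencesDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("The quick fox; the lazy dog.", "The end.");
        for (Map.Entry<String, Integer> row : byOccurrencesDesc(countWords(lines))) {
            System.out.println(row.getValue() + "\t" + row.getKey());
        }
    }
}
```

The distributed versions on the slide differ mainly in where the counting happens: the mapper emits (word, 1) pairs and the framework groups them by key, whereas this sketch keeps a single in-memory map.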
  • 27. Data Lake Architecture for MDM. Data Sources: CRM, ERP, Billing, Subscriber, Product, Network, Weather, Compete, Manuf., Clickstream, Online Chat, Sensor Data, Social Media, Call Detail Records, Fabrication Logs, Sales Feedback, Field Feedback
  • 28. Perceptions & Questions. Analyst: Robin Bloor
  • 29. What Can You Do With a Data Lake? Robin Bloor, Ph.D.
  • 30. The Story So Far… The old Data Warehouse World (environment) is fast dying – giving way to a dystopian future dominated by alien and mutant data, carried by vast unruly data streams that flow rapidly into dank and murky data lakes. This is Hadoop World. HOW DO WE MAKE SENSE OF THIS?
  • 31. The Big Data Architecture [Diagram labels. Sources: Streams, IoT, Log files, DaaS, Mobile Devices, Desktops, Servers, The Cloud, Social media, etc. The Data Layer: Filtering, Replicating & Routing; Data Reservoir (Hadoop); General Data Server(s); Specialist Data Server(s); Local Data; Data Preparation; Data Flow (Optimize); Local Workloads; ETL & Data Virt'n; Data Refinery and Processing Hub; Events; Data Flow; Data Export. The Application Layer: Data Streaming Apps, Trans Apps, BI Apps, Office Apps, each with a Data Mart. Note: Applications may use the Data Hub directly.]
  • 32. The Main Point to Note This is WAY more complicated than the old Data Warehouse world
  • 33. The Governance of Data: It’s all GOVERNANCE!! [Diagram labels: Corporate Data Hub; Data Reservoir (Hadoop); General Data Server(s); Specialist Data Server(s); ETL & Data Virt'n; Data Security; Data Life Cycle Mgt; MDM & Business Glossary; Data Cleansing; System Management; Local Workloads; MetaData Management; Performance Monitoring & Mgt; Data Lineage; Data Mapping; Data Extracts; MetaData Discovery; Service Level Mgt]
  • 34. The Evolution of Hadoop • There were many components before YARN and Tez • But YARN and Tez have changed the picture • Hadoop is becoming the default scale-out file system and the OS for data flow
  • 35. The Prognosis The foundation is in place for a comprehensive Big Data Information Architecture… But BUILDING such integrated systems will not be easy
  • 36. u How does RedPoint see the role of Hadoop (ingest-point, ETL engines, MDM work area, analytical sandbox, database, etc.); some of these? All of these? u Often in the past, MDM implementations have proved to be disappointing. What makes RedPoint different given that the data environment is more challenging than ever? u Which companies/technologies do you see as competitive with RedPoint
  • 37. u Which verticals have shown the greatest interest in RedPoint? u How does a RedPoint engagement normally pan out? u If you are intent upon doing MDM, where is it best to start?
  • 38. Twitter Tag: #briefr The Briefing Room
  • 39. This Month: INTEGRATION & DATA FLOW October: ANALYTIC PLATFORMS November: DISCOVERY & VISUALIZATION www.insideanalysis.com/webcasts/the-briefing-room Twitter Tag: #briefr The Briefing Room Upcoming Topics 2014 Editorial Calendar at www.insideanalysis.com
  • 40. Twitter Tag: #briefr THANK YOU for your ATTENTION! Opening slide image courtesy of Wikimedia Commons The Briefing Room