1 ©	Hortonworks	Inc.	2011	–2017.	All	Rights	Reserved
Enterprise	Data	Warehouse	Optimization
Piet	Loubser
VP	Product	and	Solutions	Marketing
Hortonworks
Dr Barry	Devlin
Founder	&	Principal
9sight	Consulting
Copyright	© 2017	9sight	Consulting,		All	Rights	Reserved	
The	EDW	Lives	On
The	Beating	Heart	of	the	Data	Lake
10	August	2017
Hortonworks	Webinar
Dr.	Barry	Devlin
Founder	and	Principal
9sight	Consulting,	www.9sight.com
Dr. Barry	Devlin	is	a	founder	of	the	data	warehousing	industry,	defining	its	
first	architecture	in	1985.	A	foremost	authority	on	business	intelligence	(BI),	
big	data	and	beyond,	he	is	respected	worldwide	as	a	visionary	and	thought-
leader	in	the	evolving	industry.	Barry	has	authored	two	ground-breaking	
books:	the	classic	"Data	Warehouse--from	Architecture	to	Implementation"	
and	“Business	unIntelligence--Insight	and	Innovation	Beyond	Analytics	and	
Big	Data”	(http://bit.ly/BunI_Book)	in	2013.
Barry	has	over	30	years	of	experience	in	the	IT	industry,	previously	with	IBM,	
as	a	consultant,	manager	and	distinguished	engineer.	As	founder	and	
principal	of	9sight	in	2008,	Barry	provides	strategic	consulting	and	thought-
leadership	to	buyers	and	vendors	of	BI	and	Big	Data	solutions.	He	is	an	
associate	editor	of	TDWI's	Journal	of	Business	Intelligence,	and	a	regular	
keynote	speaker,	teacher	and	writer	on	all	aspects	of	information	creation	
and	use.	
Barry	operates	worldwide	from	Cape	Town,	South	Africa.
Email:	
barry@9sight.com
Twitter:
@BarryDevlin
Agenda
1. Past	– from	a	warehouse	to	a	lake
2. Present	– a	warehouse	and a	lake
3. Emerging	– a	warehouse	by	a	lake
4. Conclusions
The	data	architecture	since	the	mid-’80s
§ Two	layers	within	the	Data	Warehouse…
– Enterprise	data	warehouse
– Reconciled	data
– Data	marts
– What	the	users	need
§ … fed from, and separate from, operational systems
– Data	to	run	the	business
– Created	by	the	processes	
of	the	business
§ All	data	created	within	the	enterprise	(or	
within	partner	ecosystem)
[Figure: data warehouse architecture. Data marts and the enterprise data warehouse, together with metadata, form the data warehouse, shown with the operational systems.]
Source: “An architecture for a business and information system”, B. A. Devlin, P. T. Murphy, IBM Systems Journal (1988)
The	drive	toward	the	data	lake	since	2010
§ Data	warehouse	architecture	“old-fashioned”
– Linked	to	(traditional)	relational	databases
– Too	structured,	schema-on-write
– Too	slow	/	complex	to	build
– Lacking	support	for	big	data
– No	link	to	Hadoop
§ Data	lake	proposed	as	alternative
– Cheaper,	bigger	and	more	flexible
– Structure-agnostic,	schema-on-read
(late	binding)
– Supports	all	data	types
– Agile,	flexible,	rapid	implementation
– Driven	by	Hadoop	ecosystem
– Data	reservoir	– a	better(?)	architected	data	lake
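The contrast between schema-on-write and schema-on-read (late binding) can be illustrated with a minimal Python sketch. The sample records, field names and both helper functions are assumptions made for illustration; they do not come from the presentation or from any specific product.

```python
import json

# Hypothetical raw events as they might land in a data lake: nothing is enforced on write.
RAW_EVENTS = [
    '{"customer": "C1", "amount": "19.90", "channel": "web"}',
    '{"customer": "C2", "amount": 5, "note": "free-text field"}',
    '{"customer": "C3"}',
]


def write_with_schema(record: dict) -> dict:
    """Schema-on-write: convert to the fixed structure before storing, or fail."""
    return {"customer": str(record["customer"]), "amount": float(record["amount"])}


def read_with_schema(raw_line: str) -> dict:
    """Schema-on-read: the raw line is kept as-is; structure is applied only when queried."""
    record = json.loads(raw_line)
    amount = record.get("amount")
    return {
        "customer": record.get("customer"),
        "amount": float(amount) if amount is not None else None,
    }


def load_warehouse(raw_lines):
    """Warehouse-style load: records that do not fit the schema are rejected up front."""
    accepted, rejected = [], []
    for line in raw_lines:
        try:
            accepted.append(write_with_schema(json.loads(line)))
        except (KeyError, ValueError, TypeError):
            rejected.append(line)
    return accepted, rejected


accepted, rejected = load_warehouse(RAW_EVENTS)
lake_view = [read_with_schema(line) for line in RAW_EVENTS]
print(len(accepted), "accepted,", len(rejected), "rejected on write")
print(len(lake_view), "records readable from the raw store")
```

In this sketch the warehouse-style load rejects the incomplete record, while the lake-style read keeps every raw line and leaves missing values visible to the reader of the data.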
Image: Gartner via Bill Schmarzo, infocus.emc.com/william_schmarzo/data-lake-data-reservoir-data-dumpblah-blah-blah/ (2014)
Data lake architecture
[Figure: data lake architecture diagram]
Source: www.capgemini.com/blog/capping-it-off/2014/08/you-have-to-manage-your-data-lake-the-fallacy-of-technology-being-magic
From	BI	to	Business	unIntelligence
§ People	process	information
§ People:	Rational	thought	and	far	beyond
– People	make	all	decisions!
§ Process:	Logic	– predefined,	emergent
– Decision	making	is	a	process	
§ Information:	Data,	knowledge,	meaning
– Data/information	is	only	the	foundation
§ Not	business	intelligence…	Business	unIntelligence
§ Amazon:	http://bit.ly/BunI_Book
§ Or	http://bit.ly/BunI-TP2:	25%	discount	with	code	“BIInsights25”
[Figure: Information, Process, People]
Business	unIntelligence	– Information	pillars
§ One	architecture	for	all	types	of	information
– Mix/match	technology	as	needed
– Relational,	NoSQL,	Hadoop,	etc.
§ Integration	of	sources	and	stores
– Instantiation	gathers	inputs
– Assimilation	integrates	stored	info.
§ Data	flows	as	fast	as	needed	and	reconciled	
when	necessary
– No	unnecessary	storage	or	transformations
§ Distinct	data	management	/	governance	
approaches	as	required
[Figure: information pillars. Process-mediated (data), machine-generated (data) and human-sourced (information) sit under context-setting (information) and are linked by assimilation. Instantiation gathers the inputs: transactions, events, measures and messages. The figure also labels transactional (data).]
Positioning	of	data	lake	and	warehouse	today
§ Serve	different	purposes
– Functional	– run	/	manage	the	business
– Illustrative	– predict	/	influence	the	future
§ Both	required
– Optimized	for	different	strengths
– Warehouse	=	accuracy	and	consistency
– Lake	=		timeliness	and	rawness
§ Links	between	environments
– Better	than	copying	everything	
into	one	(or	both)
§ Together	– foundation	for	pervasive	analytics
[Figure: data warehouse and data lake side by side. Inputs: transactions from operational systems, plus events, measures and messages. User access to all data.]
Data warehouse: Functional. Accurate, consistent data. Discarded if outdated. Legally binding, traceable process.
Data lake: Illustrative. Timely, raw data. Stored forever. Creative, free-flowing process.
A	warehouse	by	a	lake	(1)	Preparation	and	enrichment
§ Challenge:	ETL	(extract,	transform and	load)	
to	data	warehouse	complex	and	
computationally	expensive
§ Transform	in:
– Proprietary	ETL	server	– with	
high	licensing	cost
– Data	warehouse	server	– with	
impact	on	analytic	tasks
§ Solution:	Pump	some	or	all	data
through	the	data	lake
– Reduced	processing	cost	and/or	
impact	on	DW	work
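A minimal sketch of the offload idea in Python: cleansing and aggregation happen outside the warehouse, and the warehouse only receives finished rows. Plain Python stands in for the lake's processing engine and an in-memory SQLite database stands in for the data warehouse; both stand-ins, the table name and the sample data are assumptions for illustration only.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw operational records landed in the lake (illustrative values).
RAW_SALES = [
    {"store": " s01 ", "day": "2017-08-01", "amount": "10.50"},
    {"store": "S01", "day": "2017-08-01", "amount": "4.50"},
    {"store": "s02", "day": "2017-08-01", "amount": "bad-value"},
    {"store": "S02", "day": "2017-08-01", "amount": "7.00"},
]


def transform_in_lake(rows):
    """Cleanse and aggregate outside the warehouse; unparseable amounts are skipped."""
    totals = defaultdict(float)
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        totals[(row["store"].strip().upper(), row["day"])] += amount
    return [(store, day, total) for (store, day), total in sorted(totals.items())]


def load_warehouse(conn, reconciled):
    """The warehouse runs no transformation here; it only stores the reconciled rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS daily_sales (store TEXT, day TEXT, total REAL)")
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", reconciled)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load_warehouse(warehouse, transform_in_lake(RAW_SALES))
rows = warehouse.execute(
    "SELECT store, day, total FROM daily_sales ORDER BY store"
).fetchall()
print(rows)
```

The design point is the split of responsibilities: the expensive cleansing step never touches the warehouse server, so it cannot compete with analytic queries there.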
[Figure: data warehouse and data lake, fed by transactions from operational systems and by events, measures and messages. User access to all data.]
A	warehouse	by	a	lake	(2)	Archival
§ Challenge:	Storing	seldom-used	(cold)	data	
in	a	data	warehouse	is	an	expensive	waste	of	
high-performance	hardware
§ Archiving	to	magnetic	tape
delays	and	complicates	access	
to	off-line	data	when	needed
§ Solution:	archive	to	commodity
servers	and	disks	in	data	lake
– Hadoop	– no	licensing	costs
– Faster	access	when	needed	–
almost	equal	to	DW	
– Same	tools	(SQL-based)	for	access	as	DW
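One way to picture the archival decision is a simple policy that marks partitions as cold once they have not been queried for some time. The sketch below is an assumption-laden illustration: the partition catalog, the 90-day threshold and the function name are hypothetical, and a real archive job would also move the data and update the catalog.

```python
from datetime import date, timedelta

# Hypothetical catalog of warehouse partitions with size and last-query date.
PARTITIONS = [
    {"name": "sales_2017_07", "size_gb": 120, "last_queried": date(2017, 8, 9)},
    {"name": "sales_2016_07", "size_gb": 110, "last_queried": date(2017, 1, 15)},
    {"name": "sales_2015_07", "size_gb": 100, "last_queried": date(2015, 12, 1)},
]


def plan_archive(partitions, today, cold_after_days=90):
    """Split partitions into those kept in the warehouse and those moved to the lake."""
    cutoff = today - timedelta(days=cold_after_days)
    keep = [p for p in partitions if p["last_queried"] >= cutoff]
    move = [p for p in partitions if p["last_queried"] < cutoff]
    return keep, move


keep, move = plan_archive(PARTITIONS, today=date(2017, 8, 10))
freed_gb = sum(p["size_gb"] for p in move)
print("keep in DW:", [p["name"] for p in keep])
print("archive to lake:", [p["name"] for p in move], "freeing", freed_gb, "GB")
```

Because the archived data stays on disk in the lake and remains reachable with SQL-based tools, a policy like this can be more aggressive than one that archives to tape.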
A	warehouse	by	a	lake	(3)	Access
§ Challenge:	Data	increasingly	resides	on	
disparate	platforms
– Traditional	business	info	in	relational
– Business	people	familiar	with	SQL
– Social	media,	IoT	on	Hadoop	/	
NoSQL	/	etc.
– Copying	back	and	forth	is
expensive
§ Solution:	Virtualize	access	to
data	on	all	platforms
– SQL-based	queries
– Join	data	across	platforms
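The idea of one SQL query spanning several platforms can be sketched with SQLite's `ATTACH DATABASE`, which lets a single connection join tables held in separate databases. This is only an analogy under stated assumptions: the `main` database plays the relational warehouse, the attached `lake` database plays the Hadoop / NoSQL side, and all table names and values are hypothetical. A real virtualization layer would push work down to each platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                  # "main" plays the warehouse
conn.execute("ATTACH DATABASE ':memory:' AS lake")  # separate store plays the lake

# Traditional business information in the relational warehouse.
conn.execute("CREATE TABLE main.customers (id TEXT PRIMARY KEY, segment TEXT)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [("C1", "retail"), ("C2", "business")],
)

# Social media data held on the other platform.
conn.execute("CREATE TABLE lake.social_mentions (customer_id TEXT, sentiment REAL)")
conn.executemany(
    "INSERT INTO lake.social_mentions VALUES (?, ?)",
    [("C1", 0.5), ("C1", 1.0), ("C2", -0.5)],
)

# One SQL statement joins data across both stores without copying either side.
QUERY = """
SELECT c.segment, COUNT(*) AS mentions, AVG(m.sentiment) AS avg_sentiment
FROM main.customers AS c
JOIN lake.social_mentions AS m ON m.customer_id = c.id
GROUP BY c.segment
ORDER BY c.segment
"""
result = conn.execute(QUERY).fetchall()
print(result)
```

The business user writes familiar SQL against both sources; where the data physically lives is hidden behind the schema prefix.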
Conclusions
1. Enterprise	data	warehouse	lives	on
– Focused	on	core	business	information
– Traditional	relational	platforms	still	preferred
2. Data	lake	complements	data	warehouse
– Focused	on	externally	sourced	data
– Linked	to	data	warehouse	in	multiple	ways
3. Data	lake	can	assist	/	offload	data	warehouse
– Use	commodity	storage	and	processing	power
– Reduce	costs	and	improve	performance
Thank	You
Piet	Loubser
VP	Product	and	Solutions	Marketing
Hortonworks
The	New	Way	of	Business	Is	Fueled	By	Connected	Data
Development
• Connected Customers, Vehicles, Devices
• Socially crowd-sourced requirements
• Digital design and analysis
• Digital prototypes and tests (simulations)
Manufacturing
• Connected Factories, Sensors, Devices
• Human-robotic interaction
• 3D-printing on demand
Distribution
• Connected Trucks, Inventory
• Location, traffic, weather-aware distribution
• Real-time inventory visibility
• Dynamic rerouting
Marketing/Sales
• Connected Customers, Devices
• Omni-channel demand sensing
• Real-Time Recommendations
Service
• Connected Assets
• Remote service monitoring & delivery
• Predictive maintenance
• OTA Updates
A Connected Data Strategy Connects Data Center and Cloud
[Figure: data center (enterprise data lake, data flow & stream processing, big data cloud service) and cloud (big data cloud service, data lake, AWS IaaS, Azure IaaS), with security.]
Typical	EDW	Architecture
Used inefficiently, from $7,500 to $35,000 per TB of data stored and processed (note 1)
In a typical EDW:
• 50-70% of data is unused and/or cold
• 45-65% of CPU capacity is ETL/ELT
• 25-35% of CPU consumed by ETL is to load unused data
• 30-40% of CPU is consumed by only 5% of ETL workloads
• As little as 2.8% of the data is hot (note 2)
[Figure: typical EDW architecture. Data systems (systems of record: RDBMS, ERP, CRM, other) and analytics (data marts, business analytics, visualization & dashboards).]
Source:	Hortonworks	Innovation	and	Strategy	Team	and	Appfluent Analysis	
1.	EY	Analysis	shows	typical	range	from	$10-15k	/	TB.	Hortonworks	experience	shows	a	wide	range	observed	in	the	field,	from	$35k/TB	for	massive,	in-memory	EDW	
appliances	to	$7.5k/TB	for	RDBMS	based,	home-grown	EDW	solutions
2.	For	example,	for	a	client	keeping	a	rolling	36-month	window	of	data	for	reporting	in	an	EDW,	only	1	month	of	the	36	(2.8%)	is new/hot.
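The figures quoted on this slide lend themselves to a short worked calculation. The sketch below reproduces the 2.8% hot-data share from the 36-month example and estimates how much of the per-TB cost is tied up in cold data. The 100 TB warehouse size is a hypothetical assumption, and the result is the cost attributable to cold data, not a saving, because the slides give no comparable per-TB figure for Hadoop.

```python
def hot_share_pct(window_months, hot_months=1):
    """Share of a rolling window that is new/hot, as a percentage."""
    return 100 * hot_months / window_months


def cold_data_cost(total_tb, cold_pct, cost_per_tb):
    """Cost attributable to cold data at a given per-TB cost."""
    cold_tb = total_tb * cold_pct / 100
    return cold_tb * cost_per_tb


WAREHOUSE_TB = 100  # hypothetical warehouse size, not from the slides

# 1 month of a rolling 36-month window is hot.
print(f"hot share: {hot_share_pct(36):.1f}%")

# Low end: 50% cold at $7,500/TB. High end: 70% cold at $35,000/TB.
low = cold_data_cost(WAREHOUSE_TB, 50, 7_500)
high = cold_data_cost(WAREHOUSE_TB, 70, 35_000)
print(f"cost tied up in cold data: ${low:,.0f} to ${high:,.0f}")
```

For the assumed 100 TB warehouse, the quoted ranges put between $375,000 and $2,450,000 of the stored-data cost on data that is unused or cold.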
Hortonworks	Connection:	Services	and	Solutions	for	Your	Success
Hortonworks Solutions
• Enterprise Data Warehouse Optimization
• Cyber Security and Threat Management
• Internet of Things and Streaming Analytics
• Data Science Experience
• Advanced SQL
Data Services
• Data Center: Hortonworks Data Suite (HDF, HDP)
• Cloud: Hortonworks Data Cloud; AWS, HDInsight
Hortonworks Connection
• Enablement Subscription
• SmartSense™
• Premier Operational Support
• Educational Services
• Professional Services
• Community Connection
Enterprise	Data	Warehouse	Optimization
Dramatic	Cost	Reductions
Reduce	cost	of	your	EDW	Implementation	by	
offloading	ETL	processes	and	archiving	cold	data
Deploy	Business	Intelligence	on	Hadoop
Empower	Business	users	with	powerful	reporting,	
new	applications,	visualization	tools,	and	artificial	
intelligence
Support	More	Types	of	Unstructured	Data
Index	and	search	images,	videos,	text	&	sound	files
EDW Plus Hadoop helps you optimize and reduce costs associated with your EDW
Archive Cold Data away from EDW
• Move	cold	or	rarely	used	data	to	Hadoop	
as	active	archive	
• Store	more	of	your	data	longer,	cheaper
Offload costly ETL process
• Free	your	EDW	to	perform	high-value	functions	like	
analytics	&	reporting,	not	ETL
• Use	Hadoop	for	advanced	or	massive-scale	ETL/ELT
[Figure: EDW plus Hadoop. Data systems (systems of record: RDBMS, ERP, CRM, other), ELT, Hadoop holding cold data, deeper archive & new sources with data science and OLAP on Hadoop, the enterprise data warehouse holding hot data, and analytics (data marts, business analytics, visualization & dashboards).]
EDW	Optimization:	ETL	Offload
→ The Problem:
– EDWs consume between 50% and 90% of CPU just on ETL/ELT tasks.
– These jobs interfere with more business-critical tasks like BI and advanced analytics.
→ The Solution:
– Hive and HDP deliver ETL that scales to petabytes.
– Syncsort DMX-h for simple drag-and-drop ETL workflows.
– Economical scale-out processing on commodity servers.
→ The Result:
– Better SLAs for mission-critical analytics.
– Limit EDW expansion or retire old systems.
[Figure: EDW Optimization Solution. ETL/ELT, data landing & deep archive, data mart, cube mart, end user applications, end users and apps.]
EDW	Optimization:	Active	Archive
→ The Problem:
– Increasing data volumes and cost pressure force data to be archived to tape.
– Archived data not available for analytics, or must be retrieved at great expense.
→ The Solution:
– Adopting Hadoop delivers cost per terabyte on par with tape backup solutions.
– Data in Hadoop can be analyzed by all major BI tools, allowing analytics on archive data.
→ The Result:
– Data always available for analytics.
– Store years of data rather than months.
EDW	Optimization:	Fast	BI	on	Hadoop
→ The Problem:
– Proprietary EDW systems were adopted for Fast BI and deep slice-and-dice analytics, but EDW prices are unsustainably high.
→ The Solution:
– Interactive SQL is a reality on Hadoop today.
– Partner Solutions (IBM BigSQL, Kyvos, Jethro) add powerful SQL and OLAP capabilities for deep drilldown at scale.
→ The Result:
– Query terabytes of data in seconds.
– Connect your favorite BI tools like Tableau and Excel through SQL and MDX interfaces.
– The EDW Optimization Solution is tailor-made to deliver Fast BI on Hadoop.
Centrica	Transforms	Service	For	Utility	Customers
3 Million Customers can access “smart energy reports”
ETL efficiency gains: from 11 hours to 45 minutes/job
300 GB/Day Ingest rationalizes work of field engineers
Decommissioned some EDWs, saving millions annually
SITUATION
Data	fragmentation	hid	
business-wide	patterns	
from	analysts
Existing	infrastructure	made	
loading	data	difficult	&	
caused	analytic	bottlenecks
Goal:	reduce	costs,	
streamline	processes	for	a	
single	view	of	customers
DATA DISCOVERY: Smart Meter Data
PREDICTIVE ANALYTICS: Engineer Schedule Optimization
SINGLE VIEW: Customer Segment Analysis
SINGLE VIEW: Product Cross-Sell
PREDICTIVE ANALYTICS: Tailored Services
SINGLE VIEW: Smart Meter Mobile App
DATA ENRICHMENT: On-Site Data Capture
ACTIVE ARCHIVE: EDW Offload
ETL OFFLOAD: Streaming Ingest
“Focusing	on	innovation,	learning	to	forget	traditional	legacy	ways	of	working	and	approaching	it	in	
new	ways	creates	unexpected	behavioural changes,	because	people	feel	freer	and	they	also	feel	valued.”	
Dajit Rehal,	Senior	Systems	Director
EDW	Plus	Hadoop	helps	you	land	and	enrich	more	data	to	respond	
faster	to	new	business	requests
Archive Cold Data away from EDW
• Move	cold	or	rarely	used	data	to	Hadoop	
as	active	archive	
• Store	more	of	your	data	longer,	cheaper
Offload costly ETL process
• Free	your	EDW	to	perform	high-value	functions	like	
analytics	&	reporting,	not	ETL
• Use	Hadoop	for	advanced	or	massive-scale	ETL/ELT
Land & Enrich more data to create
more value-add analytics
• Use	Hadoop	to	ingest	new	data	sources,	such	as	web	
and	machine	data	for	new	analytical	context	from	
unstructured	and	semi-structured	sources
• Create	an	analytical	sandbox for	advanced	data	
science
[Figure: EDW plus Hadoop with new sources. Data systems (systems of record: RDBMS, ERP, CRM, other), ELT, Hadoop holding cold data, deeper archive & new sources with data science and OLAP on Hadoop, the enterprise data warehouse holding hot data, and analytics (data marts, business analytics, visualization & dashboards). New sources (clickstream, web & social, geolocation, sensor & machine, server logs, unstructured) arrive via ingest, stream and events.]
Prescient	Harnesses	Machine	Learning	for	Traveler	Safety	Warnings
SITUATION
Could	only	produce	one	
assessment	every	3-4	days
Performs	risk	
management	
Uses	humans	to	
identify	false	positives
Needed	efficient	way	to	
store	raw	data	for	analytics
49,500 Data Sources ingested by HDF into HDP
700% Productivity Improvement for geospatial analysts
5 Petabytes of Data stored in HDP connected EMC
Hybrid Architecture: HDF connects data center to cloud
ETL OFFLOAD: Sensor Data Ingest
DATA DISCOVERY: Threat Assessments
SINGLE VIEW: Global Threat Map
PREDICTIVE ANALYTICS: Threat-Proximity Mobile Alerts
ACTIVE ARCHIVE: Streaming Threat Archive
DATA ENRICHMENT: Provenance Metadata
“We	know	that	when	we	define	a	high-threat	area	in	a	given	area	of	the	world,	that	it	is	underpinned	by	
very	specific	data	sources.	It’s	data-driven,	and	we	can	point	to	those	sources—if	ever	asked—and	say,	
‘Here’s	why.’”	Mike	Bishop,	Chief	Systems	Architect
Why	Hortonworks?
Powering	All	Data
Data-at-Rest,	Data-in-Motion
Cloud,	On-Premises
Structured,	unstructured
Powered	By	100%	Open	
Source
Rapid	innovation
Dramatic	cost	reduction
Enterprise	Ready
Governance
Fine	grained	security
Lineage	and	data	provenance
hortonworks.com/get-started/big-data-scorecard/
Forrester	Wave:	Big	Data	Warehouse,	Q2	2017
Thank	You

More Related Content

What's hot

The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
CityAge
 
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is InvaluableDataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
DATAVERSITY
 
Data science capabilities
Data science capabilitiesData science capabilities
Data science capabilities
Yann Lecourt
 

What's hot (19)

Predictive and Prescriptive Analytics Expert Session Webinar
Predictive  and Prescriptive Analytics Expert Session Webinar Predictive  and Prescriptive Analytics Expert Session Webinar
Predictive and Prescriptive Analytics Expert Session Webinar
 
The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
The Data Effect: Canadian Big Data & Analytics Update - Dr. Alison Brooks Dir...
 
7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome Them7 Big Data Challenges and How to Overcome Them
7 Big Data Challenges and How to Overcome Them
 
How to Avoid Pitfalls in Big Data Analytics Webinar
How to Avoid Pitfalls in Big Data Analytics WebinarHow to Avoid Pitfalls in Big Data Analytics Webinar
How to Avoid Pitfalls in Big Data Analytics Webinar
 
Speed Matters - Intelligent Strategies to Accelerate Data-Driven Decisions
Speed Matters - Intelligent Strategies to Accelerate Data-Driven DecisionsSpeed Matters - Intelligent Strategies to Accelerate Data-Driven Decisions
Speed Matters - Intelligent Strategies to Accelerate Data-Driven Decisions
 
Data Centers in the Digital Economy
Data Centers in the Digital EconomyData Centers in the Digital Economy
Data Centers in the Digital Economy
 
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
 
Big Data LDN 2017: Disruption in Data
Big Data LDN 2017: Disruption in DataBig Data LDN 2017: Disruption in Data
Big Data LDN 2017: Disruption in Data
 
GDPR Benhmark: 70% of companies failing on their own GDPR compliance claims
GDPR Benhmark: 70%  of companies failing on their own GDPR compliance claimsGDPR Benhmark: 70%  of companies failing on their own GDPR compliance claims
GDPR Benhmark: 70% of companies failing on their own GDPR compliance claims
 
Data strategy demistifying data
Data strategy demistifying dataData strategy demistifying data
Data strategy demistifying data
 
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is InvaluableDataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
DataEd Slides: Data Strategy — Plans Are Useless, but Planning Is Invaluable
 
General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017General Data Protection Regulation - BDW Meetup, October 11th, 2017
General Data Protection Regulation - BDW Meetup, October 11th, 2017
 
Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar
 
Data science capabilities
Data science capabilitiesData science capabilities
Data science capabilities
 
Embedded Analytics Expert Session Webinar
Embedded Analytics Expert Session Webinar Embedded Analytics Expert Session Webinar
Embedded Analytics Expert Session Webinar
 
DataEd Slides: Data Management + Data Strategy = Interoperability
DataEd Slides: Data Management + Data Strategy = InteroperabilityDataEd Slides: Data Management + Data Strategy = Interoperability
DataEd Slides: Data Management + Data Strategy = Interoperability
 
Data Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data StrategyData Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data Strategy
 
Data-Ed: A Framework for no sql and Hadoop
Data-Ed: A Framework for no sql and HadoopData-Ed: A Framework for no sql and Hadoop
Data-Ed: A Framework for no sql and Hadoop
 
Neil Sholay - Data Driven Business - #OracleCloudDay London
Neil Sholay - Data Driven Business - #OracleCloudDay LondonNeil Sholay - Data Driven Business - #OracleCloudDay London
Neil Sholay - Data Driven Business - #OracleCloudDay London
 

Similar to Exploring the Heated-and Completely Unnecessary- Data Lake Debate

Data Warehousing and Business Intelligence 2012
Data Warehousing and Business Intelligence 2012Data Warehousing and Business Intelligence 2012
Data Warehousing and Business Intelligence 2012
Ola Odejayi
 
Sept 27, 2016 IBM DataFirst Launch Event2
Sept 27, 2016 IBM DataFirst Launch Event2Sept 27, 2016 IBM DataFirst Launch Event2
Sept 27, 2016 IBM DataFirst Launch Event2
Mary Ann Alberry
 
Dambaru jena resume 2010
Dambaru jena resume 2010Dambaru jena resume 2010
Dambaru jena resume 2010
Dambaru Jena
 
Dambaru jena resume 2010
Dambaru jena resume 2010Dambaru jena resume 2010
Dambaru jena resume 2010
Dambaru Jena
 

Similar to Exploring the Heated-and Completely Unnecessary- Data Lake Debate (20)

Business unIntelligence - a Whistle Stop Tour
Business unIntelligence - a Whistle Stop TourBusiness unIntelligence - a Whistle Stop Tour
Business unIntelligence - a Whistle Stop Tour
 
20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data
 
Why Big Data Analytics Needs Business Intelligence Too
Why Big Data Analytics Needs Business Intelligence Too Why Big Data Analytics Needs Business Intelligence Too
Why Big Data Analytics Needs Business Intelligence Too
 
Data Warehousing and Business Intelligence 2012
Data Warehousing and Business Intelligence 2012Data Warehousing and Business Intelligence 2012
Data Warehousing and Business Intelligence 2012
 
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinData Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry Devlin
 
How to Achieve Agility with Analytics
How to Achieve Agility with AnalyticsHow to Achieve Agility with Analytics
How to Achieve Agility with Analytics
 
Sept 27, 2016 IBM DataFirst Launch Event2
Sept 27, 2016 IBM DataFirst Launch Event2Sept 27, 2016 IBM DataFirst Launch Event2
Sept 27, 2016 IBM DataFirst Launch Event2
 
Building Effective Data Governance
Building Effective Data GovernanceBuilding Effective Data Governance
Building Effective Data Governance
 
The Business Of Big Data (Ga Preso) Final
The Business Of Big Data (Ga Preso) FinalThe Business Of Big Data (Ga Preso) Final
The Business Of Big Data (Ga Preso) Final
 
Dambaru jena resume 2010
Dambaru jena resume 2010Dambaru jena resume 2010
Dambaru jena resume 2010
 
Dambaru jena resume 2010
Dambaru jena resume 2010Dambaru jena resume 2010
Dambaru jena resume 2010
 
Top 10 BI Trends for 2013
Top 10 BI Trends for 2013Top 10 BI Trends for 2013
Top 10 BI Trends for 2013
 
Business Intelligence Trends (based on 2012 experience)
Business Intelligence Trends (based on 2012 experience)Business Intelligence Trends (based on 2012 experience)
Business Intelligence Trends (based on 2012 experience)
 
Big Data Use Cases for Different Verticals and Adoption Patterns
Big Data Use Cases for Different Verticals and Adoption PatternsBig Data Use Cases for Different Verticals and Adoption Patterns
Big Data Use Cases for Different Verticals and Adoption Patterns
 
A Strategic View of Enterprise Reporting and Analytics: The Data Funnel
A Strategic View of Enterprise Reporting and Analytics: The Data FunnelA Strategic View of Enterprise Reporting and Analytics: The Data Funnel
A Strategic View of Enterprise Reporting and Analytics: The Data Funnel
 
Harvey Nash USA Webinar: The Big Opportunity of Big Data
Harvey Nash USA Webinar: The Big Opportunity of Big DataHarvey Nash USA Webinar: The Big Opportunity of Big Data
Harvey Nash USA Webinar: The Big Opportunity of Big Data
 
What's Hyperion?
What's Hyperion?What's Hyperion?
What's Hyperion?
 
What's Hyperion?
What's Hyperion?What's Hyperion?
What's Hyperion?
 
What's Hyperion
What's HyperionWhat's Hyperion
What's Hyperion
 
What's hyperion
What's hyperionWhat's hyperion
What's hyperion
 

More from Hortonworks

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Recently uploaded

Breaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdfBreaking Down the Flutterwave Scandal What You Need to Know.pdf
Breaking Down the Flutterwave Scandal What You Need to Know.pdf
UK Journal
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
UXDXConf
 

Exploring the Heated-and Completely Unnecessary- Data Lake Debate