SlideShare a Scribd company logo
DEBUNKING THE MYTHS 
Speaker 10 of 17 
Martin Willcox 
@Willcoxmnk 
What is Data Lake, Anyway? 
Followed by 
Anthony Miller
One of the Big Data labels that we risk over-loading to 
complete abstraction is the idea of a "Data Lake”… 
2 © 2014 Teradata 
“…store all data present 
and future and create a 
centralised data archive 
location.” 
“A large 
object-based 
repository that 
holds data in 
its native 
format” 
“Sometimes 
called the bit 
bucket or the 
landing zone” 
“All Water 
and Little 
Substance” 
“As more and more applications 
are created that derive value 
from… new types of data… the 
Data Lake forms”
“Data lakes can 
help resolve the 
nagging problem of 
accessibility and 
data integration” 
…and some of the discussions sound eerily familiar 
3 © 2014 Teradata 
Data accessibility 
and integration? 
Isn’t that what the 
Data Warehouse is 
for?
So is the Data Lake a new architectural construct? 
4 © 2014 Teradata 
Or are we just re-platforming Data Marts? 
Simple, single subject area Dimensional 
Data Marts – with all of the dimensions 
pre-joined to the fact table? One-per-workload 
/ application? 
Is this really the future of Enterprise 
Analytics? Or circa 1995 silo, 
departmental Decision Support Systems 
warmed-over?
Take the merits of the different technologies out of the 
equation – and this is what some of us are thinking… 
5 © 2014 Teradata
…but there are no free lunches in Information 
Management – merely more and different options 
Explicit, or implicit, there 
is always, always, always 
(at least one) schema 
6 © 2014 Teradata 
Agile application 
development, versus 
agile data acquisition 
None of the information 
management 
strategies / technologies 
are magic - “pay me 
now, or pay me later”
7 © 2014 Teradata 
Big Data Are Plural 
For the foreseeable future, we will need multiple Information 
Management strategies - and multiple Information 
Management technologies 
DATA WAREHOUSE 
DISCOVERY PLATFORM 
Integration 
becomes a 
critical concern 
DATA 
PLATFORM 
– Gartner – 
Logical Data Warehouse 
– Forrester – 
Enterprise Data Hub 
– Teradata – 
Unified Data Architecture
8 © 2014 Teradata 
A definition of the Data Lake (Data Reservoir) 
A centralised, consolidated, persistent store of raw, un-modelled and un-transformed data from 
multiple sources / silos (without an explicit, pre-defined schema, without externally defined metadata – 
and without guarantees about the quality, provenance and security of the data) 
Agile data acquisition – 
a haystack to go looking 
for needles… 
…with a natural storage 
model for complex, 
multi-structured data… 
…support for efficient 
non-relational 
computation… 
Now that is new, interesting and (potentially) very, very useful… 
…and provision for cost-effective 
storage of large 
and noisy data-sets.
9 © 2014 Teradata 
Data. Science
does nature tend to give us a single, beautiful lake? Or a messy patchwork of lakes, plural? 
10 © 2014 Teradata 
Left to its own devices, 
STOP PRESS: Laws of Physics* Unchanged! 
(* More specifically, the 2nd Law of Thermodynamics) 
None of the new information management strategies and technologies is by itself a cure 
for information entropy – data silos form naturally, just like lakes form naturally
11 © 2014 Teradata 
Summary and conclusions

More Related Content

What's hot

Info qiy foundation digital me - dappre-eng-aug17
Info qiy foundation   digital me - dappre-eng-aug17Info qiy foundation   digital me - dappre-eng-aug17
Info qiy foundation digital me - dappre-eng-aug17
BigDataExpo
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data Architecture
Ed Thewlis
 
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
Dublinked .
 
Education Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
Education Seminar: Self-service BI, Logical Data Warehouse and Data LakesEducation Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
Education Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
Denodo
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)
Denodo
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Denodo
 
Dell hans timmerman v1.1
Dell hans timmerman v1.1Dell hans timmerman v1.1
Dell hans timmerman v1.1
BigDataExpo
 
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
Patrick Van Renterghem
 
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
Denodo
 
Accelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data VirtualizationAccelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data Virtualization
Denodo
 
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data Environment
Denodo
 
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
Dublinked .
 
Atlantis company overview
Atlantis company overviewAtlantis company overview
Atlantis company overviewAriel Schwieg
 
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
Trivadis
 
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
Denodo
 
Multi-Cloud-Datenintegration mit Datenvirtualisierung
Multi-Cloud-Datenintegration mit DatenvirtualisierungMulti-Cloud-Datenintegration mit Datenvirtualisierung
Multi-Cloud-Datenintegration mit Datenvirtualisierung
Denodo
 
Study: #Big Data in #Austria
Study: #Big Data in #AustriaStudy: #Big Data in #Austria
Study: #Big Data in #Austria
Semantic Web Company
 
Data encryption-cloud
Data encryption-cloudData encryption-cloud
Data encryption-cloud
Secure Channels Inc.
 
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
Denodo
 

What's hot (20)

Info qiy foundation digital me - dappre-eng-aug17
Info qiy foundation   digital me - dappre-eng-aug17Info qiy foundation   digital me - dappre-eng-aug17
Info qiy foundation digital me - dappre-eng-aug17
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data Architecture
 
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
Agile v Warehouse? Maurice Lynch CEO of Nathaen Technologies - Dublinked Data...
 
Education Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
Education Seminar: Self-service BI, Logical Data Warehouse and Data LakesEducation Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
Education Seminar: Self-service BI, Logical Data Warehouse and Data Lakes
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
 
Dell hans timmerman v1.1
Dell hans timmerman v1.1Dell hans timmerman v1.1
Dell hans timmerman v1.1
 
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
A "First Time Right" Start with Data Virtualization by Bart De Groeve, Practi...
 
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
Logical Data Warehouse: The Foundation of Modern Data and Analytics (APAC)
 
Accelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data VirtualizationAccelerate Cloud Modernization using Data Virtualization
Accelerate Cloud Modernization using Data Virtualization
 
Data Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data EnvironmentData Virtualization for Compliance – Creating a Controlled Data Environment
Data Virtualization for Compliance – Creating a Controlled Data Environment
 
Vendor-Checklist
Vendor-ChecklistVendor-Checklist
Vendor-Checklist
 
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
Advanced Data Analytics and Open Data - Dr Ingo Keck of CeADAR - Dublinked Da...
 
Atlantis company overview
Atlantis company overviewAtlantis company overview
Atlantis company overview
 
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
TechEvent 2019: Provisioning of Data Platforms - Why, how, what; Martin Wunde...
 
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
 
Multi-Cloud-Datenintegration mit Datenvirtualisierung
Multi-Cloud-Datenintegration mit DatenvirtualisierungMulti-Cloud-Datenintegration mit Datenvirtualisierung
Multi-Cloud-Datenintegration mit Datenvirtualisierung
 
Study: #Big Data in #Austria
Study: #Big Data in #AustriaStudy: #Big Data in #Austria
Study: #Big Data in #Austria
 
Data encryption-cloud
Data encryption-cloudData encryption-cloud
Data encryption-cloud
 
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
Empowering your Enterprise with a Self-Service Data Marketplace (EMEA)
 

Similar to Martin Willcox - What is a Data Lake, Anyway?

Data Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinData Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry Devlin
Denodo
 
Data lakes
Data lakesData lakes
Data lakes
Şaban Dalaman
 
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
SwarnaLatha177
 
Gerenral insurance Accounts IT and Investment
Gerenral insurance Accounts IT and InvestmentGerenral insurance Accounts IT and Investment
Gerenral insurance Accounts IT and Investment
vijayk23x
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond
Rajesh Kumar
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
Denodo
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
Denodo
 
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End UsersFrom Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
Denodo
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Denodo
 
Enterprise Data Lake
Enterprise Data LakeEnterprise Data Lake
Enterprise Data Lake
sambiswal
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
sambiswal
 
An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018
Denodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
Denodo
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Denodo
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Denodo
 
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data VirtualizationMyth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
Denodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
Denodo
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Scott Mitchell
 
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need BothThe Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
Adaryl "Bob" Wakefield, MBA
 

Similar to Martin Willcox - What is a Data Lake, Anyway? (20)

Data Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry DevlinData Virtualization – Gateway to a Digital Business - Barry Devlin
Data Virtualization – Gateway to a Digital Business - Barry Devlin
 
Data lakes
Data lakesData lakes
Data lakes
 
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
 
Gerenral insurance Accounts IT and Investment
Gerenral insurance Accounts IT and InvestmentGerenral insurance Accounts IT and Investment
Gerenral insurance Accounts IT and Investment
 
Big data data lake and beyond
Big data data lake and beyond Big data data lake and beyond
Big data data lake and beyond
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data Virtualization
 
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End UsersFrom Single Purpose to Multi Purpose Data Lakes - Broadening End Users
From Single Purpose to Multi Purpose Data Lakes - Broadening End Users
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Enterprise Data Lake
Enterprise Data LakeEnterprise Data Lake
Enterprise Data Lake
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018An Introduction to Data Virtualization in 2018
An Introduction to Data Virtualization in 2018
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Teradata
TeradataTeradata
Teradata
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
 
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data VirtualizationMyth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
Myth Busters III: I’m Building a Data Lake, So I Don’t Need Data Virtualization
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
 
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need BothThe Marriage of the Data Lake and the Data Warehouse and Why You Need Both
The Marriage of the Data Lake and the Data Warehouse and Why You Need Both
 

More from Saratoga

Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
Saratoga
 
David Shorten - Artificial intelligence
David Shorten - Artificial intelligenceDavid Shorten - Artificial intelligence
David Shorten - Artificial intelligence
Saratoga
 
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk RealitiesTheo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
Saratoga
 
Jasper Horrell - SKA and Big Data: Up in Space and on the Ground
Jasper Horrell - SKA and Big Data: Up in Space and on the GroundJasper Horrell - SKA and Big Data: Up in Space and on the Ground
Jasper Horrell - SKA and Big Data: Up in Space and on the Ground
Saratoga
 
Barry Devlin - The Myth of Data-Driven Business
Barry Devlin - The Myth of Data-Driven BusinessBarry Devlin - The Myth of Data-Driven Business
Barry Devlin - The Myth of Data-Driven Business
Saratoga
 
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
Saratoga
 
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
Saratoga
 
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
Saratoga
 
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
Saratoga
 
Gill Staniland - Interconnected BI - A systems thinking approach
Gill Staniland - Interconnected BI - A systems thinking approachGill Staniland - Interconnected BI - A systems thinking approach
Gill Staniland - Interconnected BI - A systems thinking approach
Saratoga
 
Gary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you ThinkGary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you Think
Saratoga
 
Jerry Chetty - Myth About Data Investigation
Jerry Chetty - Myth About Data InvestigationJerry Chetty - Myth About Data Investigation
Jerry Chetty - Myth About Data Investigation
Saratoga
 
Mike McDougall - Business Intelligence - Perdition or Paradise
Mike McDougall - Business Intelligence - Perdition or ParadiseMike McDougall - Business Intelligence - Perdition or Paradise
Mike McDougall - Business Intelligence - Perdition or Paradise
Saratoga
 
Mbwana Alliy - Big data from Silicon Valley to Africa
Mbwana Alliy - Big data from Silicon Valley to AfricaMbwana Alliy - Big data from Silicon Valley to Africa
Mbwana Alliy - Big data from Silicon Valley to Africa
Saratoga
 
The art of visualising requirements
The art of visualising requirementsThe art of visualising requirements
The art of visualising requirements
Saratoga
 
Getting investment ready tech4 africa (zach)
Getting investment ready   tech4 africa (zach)Getting investment ready   tech4 africa (zach)
Getting investment ready tech4 africa (zach)
Saratoga
 

More from Saratoga (16)

Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
Georgina Armstrong - Data Visualisations. Making Boring Data Exciting and Emp...
 
David Shorten - Artificial intelligence
David Shorten - Artificial intelligenceDavid Shorten - Artificial intelligence
David Shorten - Artificial intelligence
 
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk RealitiesTheo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
Theo Priestley - Internet of Things - Forget the Numbers, Let's Talk Realities
 
Jasper Horrell - SKA and Big Data: Up in Space and on the Ground
Jasper Horrell - SKA and Big Data: Up in Space and on the GroundJasper Horrell - SKA and Big Data: Up in Space and on the Ground
Jasper Horrell - SKA and Big Data: Up in Space and on the Ground
 
Barry Devlin - The Myth of Data-Driven Business
Barry Devlin - The Myth of Data-Driven BusinessBarry Devlin - The Myth of Data-Driven Business
Barry Devlin - The Myth of Data-Driven Business
 
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
Jeff Fletcher - Building a Hadoop based infrastructure as a service product a...
 
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
Anthony Miller - The second Half of the Chessboard: Thriving in a Time of Exp...
 
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
Marc Smith - Charting Collections of Connections in Social Media: Creating Ma...
 
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
Tristan Bergh - Predictive Analytics in Action: Real Business Results in Sout...
 
Gill Staniland - Interconnected BI - A systems thinking approach
Gill Staniland - Interconnected BI - A systems thinking approachGill Staniland - Interconnected BI - A systems thinking approach
Gill Staniland - Interconnected BI - A systems thinking approach
 
Gary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you ThinkGary Hope - Machine Learning: It's Not as Hard as you Think
Gary Hope - Machine Learning: It's Not as Hard as you Think
 
Jerry Chetty - Myth About Data Investigation
Jerry Chetty - Myth About Data InvestigationJerry Chetty - Myth About Data Investigation
Jerry Chetty - Myth About Data Investigation
 
Mike McDougall - Business Intelligence - Perdition or Paradise
Mike McDougall - Business Intelligence - Perdition or ParadiseMike McDougall - Business Intelligence - Perdition or Paradise
Mike McDougall - Business Intelligence - Perdition or Paradise
 
Mbwana Alliy - Big data from Silicon Valley to Africa
Mbwana Alliy - Big data from Silicon Valley to AfricaMbwana Alliy - Big data from Silicon Valley to Africa
Mbwana Alliy - Big data from Silicon Valley to Africa
 
The art of visualising requirements
The art of visualising requirementsThe art of visualising requirements
The art of visualising requirements
 
Getting investment ready tech4 africa (zach)
Getting investment ready   tech4 africa (zach)Getting investment ready   tech4 africa (zach)
Getting investment ready tech4 africa (zach)
 

Recently uploaded

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
2023240532
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 

Recently uploaded (20)

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 

Martin Willcox - What is a Data Lake, Anyway?

  • 1. DEBUNKING THE MYTHS Speaker 10 of 17 Martin Willcox @Willcoxmnk What is Data Lake, Anyway? Followed by Anthony Miller
  • 2. One of the Big Data labels that we risk over-loading to complete abstraction is the idea of a "Data Lake”… 2 © 2014 Teradata “…store all data present and future and create a centralised data archive location.” “A large object-based repository that holds data in its native format” “Sometimes called the bit bucket or the landing zone” “All Water and Little Substance” “As more and more applications are created that derive value from… new types of data… the Data Lake forms”
  • 3. “Data lakes can help resolve the nagging problem of accessibility and data integration” …and some of the discussions sound eerily familiar 3 © 2014 Teradata Data accessibility and integration? Isn’t that what the Data Warehouse is for?
  • 4. So is the Data Lake a new architectural construct? 4 © 2014 Teradata Or are we just re-platforming Data Marts? Simple, single subject area Dimensional Data Marts – with all of the dimensions pre-joined to the fact table? One-per-workload / application? Is this really the future of Enterprise Analytics? Or circa 1995 silo, departmental Decision Support Systems warmed-over?
  • 5. Take the merits of the different technologies out of the equation – and this is what some of us are thinking… 5 © 2014 Teradata
  • 6. …but there are no free lunches in Information Management – merely more and different options Explicit, or implicit, there is always, always, always (at least one) schema 6 © 2014 Teradata Agile application development, versus agile data acquisition None of the information management strategies / technologies are magic - “pay me now, or pay me later”
  • 7. 7 © 2014 Teradata Big Data Are Plural For the foreseeable future, we will need multiple Information Management strategies - and multiple Information Management technologies DATA WAREHOUSE DISCOVERY PLATFORM Integration becomes a critical concern DATA PLATFORM – Gartner – Logical Data Warehouse – Forrester – Enterprise Data Hub – Teradata – Unified Data Architecture
  • 8. 8 © 2014 Teradata A definition of the Data Lake (Data Reservoir) A centralised, consolidated, persistent store of raw, un-modelled and un-transformed data from multiple sources / silos (without an explicit, pre-defined schema, without externally defined metadata – and without guarantees about the quality, provenance and security of the data) Agile data acquisition – a haystack to go looking for needles… …with a natural storage model for complex, multi-structured data… …support for efficient non-relational computation… Now that is new, interesting and (potentially) very, very useful… …and provision for cost-effective storage of large and noisy data-sets.
  • 9. 9 © 2014 Teradata Data. Science
  • 10. does nature tend to give us a single, beautiful lake? Or a messy patchwork of lakes, plural? 10 © 2014 Teradata Left to its own devices, STOP PRESS: Laws of Physics* Unchanged! (* More specifically, the 2nd Law of Thermodynamics) None of the new information management strategies and technologies is by itself a cure for information entropy – data silos form naturally, just like lakes form naturally
  • 11. 11 © 2014 Teradata Summary and conclusions