Hadoop Data Lake Trends and Predictions
Mark van Rijmenam
Founder
Datafloq
Big Data Strategist
International Keynote speaker
Author of Think Bigger
Data Lakes: Store Anything, Analyze Everything
Ingest and store multiple data streams
From a variety of data sources
Enabling real-time in-memory & in-database analytics
Within an open and secure environment
Democratize Access to Data
6 Important Characteristics of a Hadoop Data Lake
Big Data Lake
Data source & latency demands agnostic
Data is stored in native format
Built with Hadoop framework
Role-based access to all data
Mix & match any data source
Complete flexibility and extremely scalable
Prediction 2015:
Time to Experiment with Data Lakes
The Growth in Data and the Advantages of Data
Lakes Result in an Increase in Data Lakes
Let’s look ahead and discuss a few
Data Lake Trends for 2016
1)  The Rise of IoT Data Lakes
2)  Spark-Enabled Data Lakes
3)  The Appearance of Data-Lake-as-a-Service
500 Billion connected devices need a good data
storage & analytics solution: the IoT Data Lake
The Challenges of the Internet of Things are the
drivers behind the IoT Data Lake
Shared Standards and Infrastructure
Data Control and Access
Data Security
The IoT Data Lake
Spark will Ignite Analytics on your Data Lake
Store = Hadoop & Analytics = Spark
There are several important components of
Apache Spark
General Purpose Engine
Large-scale data processing
In-memory distributed computing engine
Build quickly, iterate fast
Unified platform
The future of Big Data:
Data-Lake-as-a-Service Solutions
Data-Lake-as-a-Service
Image: EMC
Three important challenges solved with
Data-Lake-as-a-Service Solutions
Governance: Data Preparation, Metadata Management
Security: Data Breaches, Role-based Access
Value: Reduce Complexity, Reduce Costs, Improve Analytics
The three Data Lake trends will lead to a new
approach to Big Data
There is a bright future ahead for Data Lakes. Let
Kelly discuss examples and use cases
Thank you
@vanRijmenam
https://datafloq.com
Available on Amazon

Hadoop Big Data Lakes Keynote
