1 © Hortonworks Inc. 2011–2018. All rights reserved
© Hortonworks, Inc. 2011-2018. All rights reserved. | Hortonworks confidential and proprietary information.
The Implacable Advance of Data - Data Lifecycle in Hadoop
Niru Anisetti, Principal Product Manager
Hortonworks Legal Disclaimer
This document may contain product/services features and technology directions that are under development, may be under development in the future
or may ultimately not be developed. Project capabilities are based on information that is publicly available within the Apache Software Foundation
project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility,
market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from
Hortonworks to deliver these features in any generally available product. Product & Service features and technology directions are subject to change, and
must not be included in contracts, purchase orders, or sales agreements of any kind. Since this document contains an outline of general product
development plans, customers should not rely upon it when making a purchase decision. The security of your IT system is very important to us, but no
single product or service can make your IT systems completely secure or prevent all improper disclosures or access. Any effective security program
requires a layered approach, which will require various systems, products, and services, as well as operational policies and procedures. HORTONWORKS
DOES NOT WARRANT THAT ANY SERVICES, SYSTEMS OR PRODUCTS PREVENT ACCIDENTAL, ILLEGAL OR MALICIOUS CONDUCT OF ANY PARTY.
❑ This presentation may contain products, services, features, or technology directions that are under development, may be under development in
the future, or may ultimately not be under development or developed. The description herein of such products, services, features or technology
directions does not represent a contractual commitment, promise or obligation by Hortonworks to deliver them in any generally available product.
❑ Apache project timelines are based in part on information that is publicly available within the Apache Software Foundation (“Apache”) project
websites. Progress of Apache projects can be tracked through Apache announcements on these websites; however, technical feasibility, market
demand, user feedback, the overarching Apache community development process, and other factors can all affect timing and delivery of Apache
projects.
Big Data Trend
90% of the data in the world has been created in the last two years alone
Digital content is doubling every 18 months
Structured Data
- Database
- Data Warehouse
- ERPs
- CRMs
Unstructured Data
- Web blogs
- Social media
- Audio, Video
- Software file-systems
Source: Frost & Sullivan - World’s Top Global Mega Trends To 2025 and Implications to Business, Society and Cultures
The Hortonworks Opportunity
At the core of ~$1.9T in market opportunity over the next 5 years
- Cloud: ~$410 B
- Streaming: ~$1.65 B
- Data Science: ~$180 B
- Big Data: ~$210 B
- IoT: ~$1.1 T
Sources:
- IDC Worldwide Big Data and Analytics Software Forecast, 2017-2021, July 2017: forecasts continuous/streaming analytics revenue to be $1.65B by 2021.
- Allied Market Research, Data Science Platform Market by Type and End User: Global Opportunity and Forecast, 2017-2023: data science platform market size to reach $183.7B by 2023.
- IDC Worldwide Semiannual Big Data and Analytics Spending Guide Update, press release, March 2017: forecasts big data and business analytics revenues to be $210B by 2020.
- Gartner Worldwide Public Cloud Services Revenue, press release, October 2017: forecasts public cloud services revenue to be $411.4B by 2020.
- IDC Worldwide Semiannual IoT Spending Guide Update, press release, December 2017: forecasts worldwide IoT spending to be ~$1.1T by 2021.
DataPlane Service: Manage, Govern & Secure
DPS Extensible Services
- Data Lifecycle Manager (Oct, 2017)
- Data Steward Studio (Q2, 2018)
DPS Platform
- Native Capabilities: Clusters & Data Sources, Shared Services
- Core Services: Extensibility, Metering, Telemetry
Data at Rest, Data in Motion
“Data Lifecycle Manager” (DLM) Service
⬢ Is a DPS add-on service that safeguards enterprise data
⬢ Manages the data lifecycle:
– Data Replication/Failback for Disaster Recovery
– Auto Tiering to Optimize Storage Cost & Performance
– Backup & Recover Critical Business Data
– Offline replication for large datasets
⬢ Maintains common metadata and security policies across data sources and hybrid environments
[Figure: Disaster Recovery. Labels: Production Site, Offsite Replication, Disaster Recovery Site, Failback]
[Figure: Backup & Restore. Labels: Sunday through the following Sunday, Full Backup, Cumulative incremental backup, Accidental Deletion]
[Figure: Auto Tiering. Axes: Probability of Reuse (100% to 0%) against Time (0 days, 30 days, 90 days, Forever). Labels: Access to Data, Solid State Drive, Hard Drive, Archive, S3]
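The Backup & Restore panel pairs a weekly full backup with cumulative incremental backups. The sketch below illustrates that idea only; it is not DLM code. It assumes the usual meaning of "cumulative incremental" (each incremental holds every change since the last full backup), so a restore needs the latest full backup plus, at most, the latest cumulative incremental. `Backup`, `restore_chain`, and `WEEK` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int   # day index, 0 = the first Sunday
    kind: str  # "full" or "cumulative"


def restore_chain(backups: List[Backup], target_day: int) -> List[Backup]:
    """Return the backups needed to restore the state as of target_day."""
    usable = [b for b in backups if b.day <= target_day]
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the target day")
    last_full = max(fulls, key=lambda b: b.day)
    # Only cumulative incrementals taken after that full backup are relevant.
    later = [b for b in usable if b.kind == "cumulative" and b.day > last_full.day]
    if not later:
        return [last_full]
    # A cumulative incremental supersedes all earlier ones, so one is enough.
    return [last_full, max(later, key=lambda b: b.day)]


# One week: full backup on Sunday (day 0), cumulative incrementals Monday to Saturday.
WEEK = [Backup(0, "full")] + [Backup(d, "cumulative") for d in range(1, 7)]
```

For an accidental deletion noticed on Friday, `restore_chain(WEEK, 4)` selects the Sunday full backup and Thursday's cumulative incremental.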
DLM Features
• Incremental Hive replication & Hive metadata
• HDFS snapshot-based replication between HDP clusters
• Ranger policy replication to the target cluster
• TDE & TLS support
• Support multiple keys/KMS
• Cloud storage replication (AWS)
• Active/standby behavior on DR site using Ranger
Available Now
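Snapshot-based replication ships only what changed between two snapshots of the source. The sketch below is a minimal in-memory illustration of that principle, not how HDFS or DLM implement it. It models a snapshot as a hypothetical mapping from path to content checksum; `snapshot_diff` and `apply_diff` are made-up names.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, str]  # path -> content checksum (hypothetical model)
Diff = Tuple[List[str], List[str], List[str]]


def snapshot_diff(old: Snapshot, new: Snapshot) -> Diff:
    """Return (created, modified, deleted) paths between two snapshots."""
    created = sorted(p for p in new if p not in old)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    deleted = sorted(p for p in old if p not in new)
    return created, modified, deleted


def apply_diff(target: Snapshot, new: Snapshot, diff: Diff) -> Snapshot:
    """Bring a target that matches the old snapshot up to the new snapshot."""
    created, modified, deleted = diff
    result = dict(target)
    for path in deleted:
        result.pop(path, None)
    # Only created and modified files are copied; unchanged files stay put.
    for path in created + modified:
        result[path] = new[path]
    return result
```

Applying the diff to a target that still matches the older snapshot yields the newer snapshot while copying only the created and modified files.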
Data Lifecycle Manager Architecture
[Figure: architecture diagram. Components shown: Data Plane UI, HDP Distro, REST, DLM Service, Plugin Manager, REST infrastructure, Job Manager, Alerts Manager, Configuration Manager, Security Infrastructure, Copy Services, HDFS, Hive, Ranger, Log Manager, DLM DB, Logs]
DLM Deployment Model
⬢ DLM Deployment packages
– DLM App (Installed as part of DPS app)
– DLM Engine in HDP (Management Pack)
DLM Releases
⬢ DPS 1.0/DLM 1.0
– October, 2017
– HDP 2.6.3 (Cluster)
⬢ DPS 1.1/DLM 1.1
– May 2018
– HDP 2.6.5 (Cluster)
[Figure: deployment diagram. Labels: On-Premise Data Center 1 (Cluster 1, Cluster 2, DLM Engine), On-Premise Data Center 2 (Cluster 3, Cluster 4, DLM Engine), Operating Cluster, DPS-DLM App, Public Cloud, Push based replication to Cloud, Pull based replication]
DLM 1.1 User Flow
Cloud Replication and Encryption
DWS San Jose 2018 Summit DLM Demo scenarios
Demo Setup
• Data: NY Traffic Collision Data (partitioned by date/Boroughs)
• Size: ~2GB
• Interactive Application: Zeppelin & Shell
• Pre-setup: Bootstrap, Cloud credentials, and Ranger policies on Target
• DLM Policy schedule interval: 1 minute
Demo Scenarios
1. Onprem-Onprem HDFS: Snapshot-based incremental replication
2. Onprem-Cloud-HDFS: Replication of an HDFS folder to an S3 bucket
[Figure: demo topology. Labels: Cluster-1 (Source), Cluster-3, IaaS/HDP, VPC, Onprem, HMS, S3 Buckets, Onprem Hive replication setup, HDFS replication setup]
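The two demo scenarios can be approximated by hand with standard Hadoop commands, which helps show what a replication policy has to automate. The sketch below only builds the command lines and does not run them. Treating `hdfs dfs -createSnapshot` and `hadoop distcp -update -diff` as equivalent to what DLM runs internally is an assumption, and every path, snapshot name, and bucket name in the usage note is hypothetical.

```python
from typing import List


def create_snapshot_cmd(path: str, name: str) -> List[str]:
    """Command that snapshots a snapshottable HDFS directory."""
    return ["hdfs", "dfs", "-createSnapshot", path, name]


def incremental_copy_cmd(src: str, dst: str, from_snap: str, to_snap: str) -> List[str]:
    """Scenario 1: copy only the changes between two source snapshots.

    DistCp requires -update with -diff, and the target must still match
    from_snap (it needs that snapshot and no changes since it was taken).
    """
    return ["hadoop", "distcp", "-update", "-diff", from_snap, to_snap, src, dst]


def cloud_copy_cmd(src: str, bucket: str, prefix: str) -> List[str]:
    """Scenario 2: copy an HDFS folder to an S3 bucket through the s3a connector."""
    return ["hadoop", "distcp", "-update", src, f"s3a://{bucket}/{prefix}"]
```

For example, `incremental_copy_cmd("hdfs://cluster1/data/collisions", "hdfs://cluster3/data/collisions", "s1", "s2")` yields the argument list for one incremental run. A scheduler would repeat it every minute with fresh snapshot names to mirror the 1-minute policy interval.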
DLM Customer Use Cases
DLM Customer Use cases/Solutions
Pharmaceutical Industry
• Replicate 100+ TB of data between on-prem and cloud storage locations
• Metadata along with security policy replication is critical
• GDPR compliance is required
• Tiering has to be supported to reduce overall TCO
Finance & Banking Industry
• Replicate PB+ of data between various data centers
• Data has to be replicated along with metadata and security policies
• GDPR compliance is required
• Tiering has to be supported to reduce overall TCO
Employee Services Industry
• Replicate corporate events data between hybrid locations
• Build and fine-tune insights to prove ROI for each of the event-related algorithms
DLM 2.x: Tiering User Flow
What is Tiered Storage?
• Data with different characteristics is moved to various types of storage media to reduce total storage cost.
• Tiers are determined based on performance and cost of the media.
• DLM enables customers to define tiers through the DLM interface.
• DLM tiering is achieved by intra- and/or inter-cluster data movement.
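A tier definition can be thought of as a mapping from a data characteristic to a storage medium. The sketch below uses data age as that characteristic. The 0, 30, and 90 day boundaries and the media names come from the Auto Tiering figure on the DLM service slide; pairing them up this way is an assumption. This is not the DLM interface, and `DEFAULT_TIERS` and `tier_for_age` are hypothetical names.

```python
from typing import List, Tuple

# (minimum age in days, tier name), ordered from youngest to oldest data.
# Boundaries follow the 0/30/90-day marks of the Auto Tiering figure (assumed mapping).
DEFAULT_TIERS: List[Tuple[int, str]] = [
    (0, "solid_state_drive"),
    (30, "hard_drive"),
    (90, "archive"),
]


def tier_for_age(age_days: int, tiers: List[Tuple[int, str]] = DEFAULT_TIERS) -> str:
    """Return the tier whose minimum age is the largest one not exceeding age_days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    chosen = tiers[0][1]
    for min_age, name in tiers:
        if age_days >= min_age:
            chosen = name
    return chosen
```

With these defaults, a 10-day-old dataset stays on solid state, a 45-day-old one moves to hard drive, and anything 90 days or older goes to archive. Changing the list changes the policy without touching the function.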
DLM 2018+ Roadmap
Timeline: 2H 2017 | 1H 2018 | 2H 2018* | 1H 2019*
DLM: 1.0 | 1.1 | 1.x | 2.0 | 3.0
HDP: 2.6.3 | 2.6.5 | 2.6.x | 2.6.x/3.x | 3.x
Status: Released | Released | Planning | Planning-in-progress
Product Themes
Replication
• On-prem/HDP Cloud replication
• Cloud Storage replication (S3)
• Encryption (TDE & TLS)
• Cloud Storage replication (ADLS & WASB)
• Atlas support (GDPR)
• GCS
• Hybrid/Multi-Cloud support (One-to-many)
• HBase
• Offline replication
• Kafka support
DR (Failback & Failover): N/A | N/A | N/A | Failback | Failover
Auto-Tiering: N/A | N/A | N/A | N/A | Policy based Tiering - hot/warm/cold data to reduce TCO
* Subject to change. Features are not committed.
Thank you

More Related Content

What's hot

Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...
Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...
Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...DataWorks Summit
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureDataWorks Summit
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...DataWorks Summit
 
BI on Big Data with instant response times at Verizon
BI on Big Data with instant response times at VerizonBI on Big Data with instant response times at Verizon
BI on Big Data with instant response times at VerizonDataWorks Summit
 
Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...DataWorks Summit
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...DataWorks Summit/Hadoop Summit
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricDataWorks Summit
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...DataWorks Summit/Hadoop Summit
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseDataWorks Summit
 
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...DataWorks Summit
 
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...DataWorks Summit/Hadoop Summit
 
Data in Motion - Data at Rest - Hortonworks a Modern Architecture
Data in Motion - Data at Rest - Hortonworks a Modern ArchitectureData in Motion - Data at Rest - Hortonworks a Modern Architecture
Data in Motion - Data at Rest - Hortonworks a Modern ArchitectureMats Johansson
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Hortonworks
 
Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning EverywhereDataWorks Summit
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 

What's hot (20)

Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...
Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...
Running Enterprise Workloads with an open source Hybrid Cloud Data Architectu...
 
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an open source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an open source Hybrid Cloud Data Architecture
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...
Understanding Your Crown Jewels: Finding, Organizing, and Profiling Sensitive...
 
BI on Big Data with instant response times at Verizon
BI on Big Data with instant response times at VerizonBI on Big Data with instant response times at Verizon
BI on Big Data with instant response times at Verizon
 
Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...Global Data Management – a practical framework to rethinking enterprise, oper...
Global Data Management – a practical framework to rethinking enterprise, oper...
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun Murthy
 
The Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data CentricThe Car of the Future - Autonomous, Connected, and Data Centric
The Car of the Future - Autonomous, Connected, and Data Centric
 
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
A Tale of Two Regulations: Cross-Border Data Protection For Big Data Under GD...
 
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterpriseUsing Spark Streaming and NiFi for the next generation of ETL in the enterprise
Using Spark Streaming and NiFi for the next generation of ETL in the enterprise
 
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proo...
 
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...
Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Plat...
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data in Motion - Data at Rest - Hortonworks a Modern Architecture
Data in Motion - Data at Rest - Hortonworks a Modern ArchitectureData in Motion - Data at Rest - Hortonworks a Modern Architecture
Data in Motion - Data at Rest - Hortonworks a Modern Architecture
 
Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications Pivotal - Advanced Analytics for Telecommunications
Pivotal - Advanced Analytics for Telecommunications
 
Machine Learning Everywhere
Machine Learning EverywhereMachine Learning Everywhere
Machine Learning Everywhere
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 

Similar to Data Lifecycle in Hadoop

Hortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your dataHortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your dataScott Clinton
 
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data ArchitectureDataWorks Summit
 
Hortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceHortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceThiago Santiago
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 
Hortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopHortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopMats Johansson
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupMats Johansson
 
Are your Cloud Services Secure and Compliant today?
Are your Cloud Services Secure and Compliant today?Are your Cloud Services Secure and Compliant today?
Are your Cloud Services Secure and Compliant today?Sridhar Karnam
 
hitachi-content-platform-portfolio-esg-validation-report
hitachi-content-platform-portfolio-esg-validation-reporthitachi-content-platform-portfolio-esg-validation-report
hitachi-content-platform-portfolio-esg-validation-reportIngrid Fernandez, PhD
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit
 
Telecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | DiyottaTelecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | Diyottadiyotta
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Hortonworks
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks
 

Similar to Data Lifecycle in Hadoop (20)

Hortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your dataHortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your data
 
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data ArchitectureRunning Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
Running Enterprise Workloads with an Open Source Hybrid Cloud Data Architecture
 
Hortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data ScienceHortonworks - IBM Cognitive - The Future of Data Science
Hortonworks - IBM Cognitive - The Future of Data Science
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
Hortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with HadoopHortonworks & Bilot Data Driven Transformations with Hadoop
Hortonworks & Bilot Data Driven Transformations with Hadoop
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Hortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User GroupHortonworks Hadoop @ Oslo Hadoop User Group
Hortonworks Hadoop @ Oslo Hadoop User Group
 
Meetup oslo hortonworks HDP
Meetup oslo hortonworks HDPMeetup oslo hortonworks HDP
Meetup oslo hortonworks HDP
 
Are your Cloud Services Secure and Compliant today?
Are your Cloud Services Secure and Compliant today?Are your Cloud Services Secure and Compliant today?
Are your Cloud Services Secure and Compliant today?
 
hitachi-content-platform-portfolio-esg-validation-report
hitachi-content-platform-portfolio-esg-validation-reporthitachi-content-platform-portfolio-esg-validation-report
hitachi-content-platform-portfolio-esg-validation-report
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop components
 
Telecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | DiyottaTelecom Provider Case Study | Big Data Adoption | Diyotta
Telecom Provider Case Study | Big Data Adoption | Diyotta
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
Hortonworks and HP Vertica Webinar
Hortonworks and HP Vertica WebinarHortonworks and HP Vertica Webinar
Hortonworks and HP Vertica Webinar
 

Data Lifecycle in Hadoop

  • 1. The Implacable advance of Data - Data Lifecycle in Hadoop
      Niru Anisetti, Principal Product Manager
      © Hortonworks, Inc. 2011-2018. All rights reserved. | Hortonworks confidential and proprietary information.
  • 2. Hortonworks Legal Disclaimer
      This document may contain product/services features and technology directions that are under development, may be under development in the future or may ultimately not be developed. Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery. This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product. Product & Service features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Since this document contains an outline of general product development plans, customers should not rely upon it when making a purchase decision. The security of your IT system is very important to us, but no single product or service can make your IT systems completely secure or prevent all improper disclosures or access. Any effective security program requires a layered approach, which will require various systems, products, and services, as well as operational policies and procedures. HORTONWORKS DOES NOT WARRANT THAT ANY SERVICES, SYSTEMS OR PRODUCTS PREVENT ACCIDENTAL, ILLEGAL OR MALICIOUS CONDUCT OF ANY PARTY.
      ❑ This presentation may contain products, services, features, or technology directions that are under development, may be under development in the future, or may ultimately not be under development or developed. The description herein of such products, services, features or technology directions does not represent a contractual commitment, promise or obligation by Hortonworks to deliver them in any generally available product.
      ❑ Apache project timelines are based in part on information that is publicly available within the Apache Software Foundation (“Apache”) project websites. Progress of Apache projects can be tracked through Apache announcements on these websites; however, technical feasibility, market demand, user feedback, the overarching Apache community development process, and other factors can all affect timing and delivery of Apache projects.
  • 3. Big Data Trend
      – 90% of the data in the world has been created in the last two years alone
      – Digital content is doubling every 18 months
      – Structured Data: Database, Data Warehouse, ERPs, CRMs
      – Unstructured Data: Web blogs, Social media, Audio, Video, Software file-systems
      Source: Frost & Sullivan - World’s Top Global Mega Trends To 2025 and Implications to Business, Society and Cultures
  • 4. The Hortonworks Opportunity
      At the core of ~$1.9T in market opportunity over the next 5 years:
      – Cloud: ~$410 B
      – Streaming: ~$1.65 B
      – Data Science: ~$180 B
      – Big Data: ~$210 B
      – IoT: ~$1.1 T
      Sources:
      – IDC Worldwide Big Data and Analytics Software Forecast, 2017-2021, Forecasts Continuous/Streaming Analytics revenue to be $1.65B by 2021, July, 2017
      – Data Science Platform market size to reach $183.7B by 2023, Allied Market Research, Data Science Platform Market by Type and End User: Global Opportunity and Forecast, 2017-2023
      – IDC Worldwide Semiannual Big Data and Analytics Spending Guide Update, Forecasts Big Data & Business Analytics revenues to be $210B by 2020, Press Release March 2017
      – Gartner Worldwide Public Cloud Services Revenue, Forecasts Public Cloud Services Revenue to be $411.4B by 2020, Press Release October 2017
      – IDC Worldwide Semiannual IoT Spending Guide Update, Forecasts Worldwide IoT Spending forecast to be ~$1.1T by 2021, Press Release December 2017
  • 5. DataPlane Service: Manage, Govern & Secure
      Diagram labels: Native Capabilities; Clusters & Data Sources, Shared Services; Core Services; Extensibility, Metering, Telemetry; Data Lifecycle Manager (Oct, 2017); Data Steward Studio (Q2, 2018); DPS Extensible Services; DPS Platform; Data at Rest; Data in Motion
  • 6. “Data Lifecycle Manager” (DLM) Service
      ⬢ Is a DPS add-on service that safeguards enterprise data
      ⬢ Manages the data lifecycle:
        – Data Replication/Failback for Disaster Recovery
        – Auto Tiering to Optimize Storage Cost & Performance
        – Backup & Recover Critical Business Data
        – Offline replication for large datasets
      ⬢ Maintains common metadata and security policies across data sources and hybrid environments
      Diagram, Disaster Recovery: Production Site; Disaster Recovery Site; Offsite Replication; Failback
      Diagram, Backup & Restore: Sunday through the following Sunday; Full Backup; Cumulative incremental backup; Accidental Deletion
      Diagram, Auto Tiering: Solid State Drive; Hard Drive; Archive; Access to Data at 0 days, 30 days, 90 days, Forever; Probability of Reuse (100% to 0%) over Time
      Diagram label: S3
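
The Backup & Restore diagram on this slide pairs a weekly full backup with cumulative incremental backups and an accidental deletion later in the week. The sketch below illustrates the general idea of cumulative incrementals, where each one holds every change since the last full backup, so a restore needs only the full backup plus the latest cumulative taken before the incident. The `Backup` class, `restore_chain` function, and day numbering are hypothetical names for illustration; this is not DLM's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int   # day index within the week, 0 = Sunday (illustrative convention)
    kind: str  # "full" or "cumulative"


def restore_chain(backups: List[Backup], deletion_day: int) -> List[Backup]:
    """Return the backups needed to restore the state just before deletion_day."""
    # Only backups taken before the incident are usable.
    usable = [b for b in backups if b.day < deletion_day]
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        return []
    base = max(fulls, key=lambda b: b.day)
    # A cumulative incremental covers everything since the last full backup,
    # so only the most recent one after the base is required.
    incrementals = [b for b in usable if b.kind == "cumulative" and b.day > base.day]
    if not incrementals:
        return [base]
    return [base, max(incrementals, key=lambda b: b.day)]


# Full backup on Sunday, cumulative incrementals Monday through Saturday.
week = [Backup(0, "full")] + [Backup(d, "cumulative") for d in range(1, 7)]
chain = restore_chain(week, deletion_day=4)  # deletion noticed on Thursday
print(chain)
```

With differential (non-cumulative) incrementals, the chain would instead include every incremental since the full backup, which is the trade-off the cumulative scheme avoids at the cost of larger daily backups.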
  • 7. DLM Features (Available Now)
      – Incremental Hive replication & Hive metadata
      – HDFS snapshot based replication between HDP clusters
      – Ranger policy replication to Target cluster
      – TDE & TLS support
      – Support multiple keys/KMS
      – Cloud storage replication (AWS)
      – Active/standby behavior on DR site using Ranger
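
Snapshot based replication, as listed above, can be pictured as comparing two point-in-time snapshots of a directory and shipping only what changed. The following is a minimal sketch under stated assumptions: plain dictionaries (path to a content fingerprint) stand in for snapshots, and `snapshot_diff` and `apply_diff` are hypothetical helpers, not functions of HDFS or the DLM engine.

```python
from typing import Dict, NamedTuple, Set

# Hypothetical stand-in for a snapshot: path -> content fingerprint.
Snapshot = Dict[str, str]


class Diff(NamedTuple):
    created: Set[str]
    modified: Set[str]
    deleted: Set[str]


def snapshot_diff(old: Snapshot, new: Snapshot) -> Diff:
    """Compute which paths were created, modified, or deleted between snapshots."""
    created = set(new) - set(old)
    deleted = set(old) - set(new)
    modified = {p for p in set(old) & set(new) if old[p] != new[p]}
    return Diff(created, modified, deleted)


def apply_diff(target: Snapshot, source: Snapshot, diff: Diff) -> int:
    """Bring target in line with source by copying only changed paths.

    Returns the number of files copied.
    """
    for path in diff.deleted:
        target.pop(path, None)
    to_copy = diff.created | diff.modified
    for path in to_copy:
        target[path] = source[path]
    return len(to_copy)


s1 = {"/data/a": "v1", "/data/b": "v1", "/data/c": "v1"}
s2 = {"/data/a": "v1", "/data/b": "v2", "/data/d": "v1"}

target = dict(s1)              # target already holds the state of snapshot s1
diff = snapshot_diff(s1, s2)
copied = apply_diff(target, s2, diff)
print(copied, target == s2)
```

The point of the design is that the unchanged file `/data/a` is never re-copied; only the delta between the two snapshots crosses the wire.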
  • 8. Data Lifecycle Manager Architecture
      Diagram labels: HDP Distro; REST; Data Plane UI; DLM Service; Plugin Manager; REST infrastructure; Job Manager; Alerts Manager; Configuration Manager; Security Infrastructure; Copy Services; HDFS; Hive; Ranger; Log Manager; DLM DB; Logs
  • 9. DLM Deployment Model
      ⬢ DLM Deployment packages
        – DLM App (Installed as part of DPS app)
        – DLM Engine in HDP (Management Pack)
      DLM Releases
      ⬢ DPS 1.0/DLM 1.0 – October, 2017
        – HDP 2.6.3 (Cluster)
      ⬢ DPS 1.1/DLM 1.1 – May 2018
        – HDP 2.6.5 (Cluster)
      Diagram labels: On-Premise Data Center 1 (Cluster 1, Cluster 2, DLM Engine, DLM Engine); On-Premise Data Center 2 (Cluster 3, Cluster 4, DLM Engine); Cluster 3 Operating Cluster; DPS-DLM APP; Public Cloud; Push based replication to Cloud; Pull based replication
  • 10. DLM 1.1 User Flow: Cloud Replication and Encryption
  • 22. DWS San Jose 2018 Summit DLM Demo scenarios
      Demo Setup
        – Data: NY Traffic Collision Data (partitioned by date/Boroughs)
        – Size: ~2GB
        – Interactive Application: Zeppelin & Shell
        – Pre-setup: Bootstrap, Cloud credentials, and Ranger policies on Target
        – DLM Policy schedule interval: 1-minute interval
      Demo Scenarios
        1. Onprem-Onprem HDFS: Snapshot based incremental replication
        2. Onprem-Cloud-HDFS: Replication of an HDFS folder to an S3 bucket
      Diagram labels: Cluster-1 Source; ListofJiraRMPs; VPC; Cluster-3 IaaS/HDP; Onprem; HMS; S3 Buckets; Onprem Hive replication setup; HDFS replication setup
  • 24. DLM Customer Use cases/Solutions
      Pharmaceutical Industry
        – Replicate 100+ TB data between on-prem and cloud storage locations
        – Metadata along with security policy replication is critical
        – GDPR compliance is required
        – Tiering has to be supported to reduce overall TCO
      Finance & Banking Industry
        – Replicate PB+ TB data between various data centers
        – Data has to be replicated along with metadata and security policies
        – GDPR compliance is required
        – Tiering has to be supported to reduce overall TCO
      Employee services Industry
        – Replicate corporate events data between Hybrid locations
        – Build and fine-tune insights to prove ROI for each of the event related algorithms
  • 25. DLM 2.x: Tiering User Flow
  • 26. What is Tiered Storage?
      – Data with different characteristics is moved to various types of storage media to reduce total storage cost.
      – Tiers are determined based on performance and cost of the media.
      – DLM enables customers to define Tiers through the DLM interface.
      – DLM-Tiering is achieved by intra- and/or inter-cluster data movement.
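
A tiering policy of this kind can be sketched as a simple age-based rule. The example below is an illustration only: it borrows the 30-day and 90-day marks and the three media (solid state drive, hard drive, archive) from the Auto Tiering diagram on slide 6, but the exact mapping of ages to tiers, the tier names, and the `tier_for` function are assumptions, not the DLM policy model.

```python
from datetime import date, timedelta
from typing import List, Tuple

# Assumed policy: (upper age limit in days, tier name), checked in order.
# Anything older than the last limit goes to the archive tier.
TIERS: List[Tuple[int, str]] = [(30, "ssd"), (90, "hdd")]
ARCHIVE = "archive"


def tier_for(last_access: date, today: date) -> str:
    """Pick a storage tier from the number of days since the data was last accessed."""
    age_days = (today - last_access).days
    for limit, tier in TIERS:
        if age_days < limit:
            return tier
    return ARCHIVE


today = date(2018, 6, 20)
datasets = {
    "collisions_current": today - timedelta(days=5),
    "collisions_last_quarter": today - timedelta(days=45),
    "collisions_2016": today - timedelta(days=400),
}
placements = {name: tier_for(seen, today) for name, seen in datasets.items()}
print(placements)
```

Keeping the thresholds in a small ordered table means a different policy, such as extra tiers or different day counts, changes only the `TIERS` data and not the selection logic.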
  • 34. DLM 2018+ Roadmap
  • 35. DLM 2018+ Roadmap
      Timeline: 2H 2017, 1H 2018, 2H 2018*, 1H 2019*
      DLM: 1.0, 1.1, 1.x, 2.0, 3.0
      HDP: 2.6.3, 2.6.5, 2.6.x, 2.6.x/3.x, 3.x
      Status: Released, Released, Planning, Planning-in-progress
      Product themes
      Replication (in roadmap order):
        – On-prem/HDP Cloud replication
        – Cloud Storage replication (S3)
        – Encryption (TDE & TLS)
        – Cloud Storage replication (ADLS & WASB)
        – Atlas support (GDPR)
        – GCS
        – Hybrid/Multi-Cloud support (One-to-many)
        – HBase
        – Offline replication
        – Kafka support
      DR Failback & Failover: N/A, N/A, N/A, Failback, Failover
      Auto-Tiering: N/A, N/A, N/A, N/A, Policy based Tiering - hot/warm/cold data to reduce TCO
      * Subject to change. Features are not committed.
  • 36. Thank you