Cloud storage filesystems and Hive transactional tables
Hortonworks
HIVE-14535: Cloud Storage
1.
Page1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
HIVE-14535: Cloud storage
Gopal V
2.
Cloud “FileSystems” are Strange Beasts
“There are no directories. Only paths.”
“There are no users. Only keys.”
“There are no permissions. Only ACL rules.”
“There is consistency, but not as we know it.”
3.
“Directories vs Paths.”
• Storage of path information can be assumed to be a sorted hash table.
• File listings are no longer a walk of a tree, but a prefix search.
• A directory does not necessarily need to exist for a path below it to exist.
• Listing a single level is more complex than a full-depth traversal.
• Renames can cause rebalancing and movement of the underlying structure.
• Operations on adjacent files are sometimes more expensive than random ops.
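The prefix-search idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key names are made up, the store is modelled as a sorted in-memory list, and `list_recursive` and `list_one_level` are hypothetical helpers, not any object store's API. It shows why a full-depth listing is a single range scan, while a single-level listing needs extra grouping work on top of that scan.

```python
from bisect import bisect_left

# Hypothetical object keys, kept sorted as the slide assumes.
KEYS = sorted([
    "web_returns/wr_returned_date_sk=2450820/mm_0/000021_0",
    "web_returns/wr_returned_date_sk=2450821/mm_0/000022_0",
    "web_returns/wr_returned_date_sk=2450822/mm_0/000023_0",
    "web_sales/part-00000",
])


def list_recursive(keys, prefix):
    """Full-depth listing: one contiguous range scan over the sorted keys."""
    start = bisect_left(keys, prefix)
    out = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # keys sharing a prefix are adjacent, so the scan can stop
        out.append(key)
    return out


def list_one_level(keys, prefix, delimiter="/"):
    """Single-level listing: the same scan, plus grouping by the next delimiter."""
    children = []
    for key in list_recursive(keys, prefix):
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        child = prefix + head + sep  # "directory" if sep is present, else a file
        if not children or children[-1] != child:
            children.append(child)
    return children
```

Note that no "directory" entry is stored anywhere in `KEYS`; `web_returns/` appears in a single-level listing only because some key happens to start with it.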
4.
“Users & permissions vs keys & ACLs”
• Distinguishing the user behind an accessing process has no meaning.
• Access keys are often rotated and occasionally invalidated.
• User identity can be mapped to a key (externally or by identity management).
• Buckets are commonly used to differentiate stores, instead of permissions.
• Permissions are rarely set or applied per file, but across path patterns.
• Permissions set on a directory need extra user checks to be useful (chmod +x).
5.
“Consistency”
• Arguably the most complex issue.
• Renames needn’t be consistent; creates can have collisions.
• Reads can return old data for the same path while it is being overwritten.
• Versioned reads are complex to manage and hard to throw a “Time machine” over.
• Cross-region replication often lags and doubles stale-read issues.
6.
Micro-Managed Hive Tables
• Support for all Hive input formats, including user-defined ones
• Avoid rename operations as much as possible
• Never collide final paths for different inserts
• Ongoing inserts should be atomic across more than one partition
• Snapshot isolation for reads of existing partitions that are being back-filled
• Stage data without accidental partial reads for bucket replication
7.
Micro-Managed Hive Tables
CREATE TABLE `web_returns_hive_commit`(
  …
  `wr_net_loss` float)
PARTITIONED BY (`wr_returned_date_sk` int)
STORED AS <FORMAT>
LOCATION 's3a://hwdev-hive-14535/web_returns_hive_commit'
TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only');
8.
Micro-Managed Hive Tables
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450820
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450820/mm_0
-rw-rw-rw- 1 cloudbreak 1791 2016-12-07 00:55 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450820/mm_0/000021_0
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450821
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450821/mm_0
-rw-rw-rw- 1 cloudbreak 2186 2016-12-07 00:55 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450821/mm_0/000022_0
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450822
drwxrwxrwx - cloudbreak 0 2016-12-07 21:42 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450822/mm_0
-rw-rw-rw- 1 cloudbreak 1814 2016-12-07 00:55 s3a://hwdev-hive-14535/web_returns_hive_commit/wr_returned_date_sk=2450822/mm_0/000023_0
/web_returns_hive_commit/wr_returned_date_sk=2450820/mm_0/000021_0
9.
“Take a number” for inserts
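One way to read "take a number" together with the `mm_0` directories in the listing is sketched below in Python. This is an interpretation, not the branch's implementation: `TicketCounter` is a hypothetical stand-in for whatever hands out the numbers, `final_path` is an invented helper, and the assumption that the `mm_` suffix is the number taken by the insert is inferred only from the listing above.

```python
import itertools


class TicketCounter:
    """Hypothetical stand-in for the service that hands out insert numbers."""

    def __init__(self, start=0):
        self._next = itertools.count(start)

    def take(self):
        # Each caller gets a number nobody else will ever receive.
        return next(self._next)


def final_path(table_location, partition, ticket, task_file):
    """Build the final path for a file written by one insert.

    Assumption: the directory suffix after "mm_" is the insert's number, so
    two inserts never share a final path and no rename is needed afterwards.
    """
    return f"{table_location}/{partition}/mm_{ticket}/{task_file}"
```

Under these assumptions, two concurrent inserts into the same partition write under different `mm_<n>` directories, which covers both "never collide final paths" and "avoid rename operations" from the earlier goals.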
10.
Read: tracking committed data
• Similar to Hive-ACID (ORC)
• Committed txns disappear from the tracking data
• Each query takes the highest known txn plus a list of open/aborted txns
• All valid transactions are < max(transaction_id) and NOT IN (open_txns)
• The transaction filtering is done at the listing level, for all formats
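The read-side rule can be sketched as a filter over directory names. This Python fragment is a minimal illustration, not Hive's code: `is_visible` and `visible_dirs` are hypothetical names, the `mm_<n>` naming is taken from the listing shown earlier, and the strict `<` bound is copied from the slide as written.

```python
def is_visible(txn_id, high_watermark, excluded):
    """Rule from the slide: below the highest known txn and not open/aborted."""
    return txn_id < high_watermark and txn_id not in excluded


def visible_dirs(dir_names, high_watermark, excluded, prefix="mm_"):
    """Filter a directory listing down to committed inserts.

    Assumption: each insert's data sits in a directory named "mm_<txn id>".
    Because the filter works on names only, it is independent of file format.
    """
    out = []
    for name in dir_names:
        if not name.startswith(prefix):
            continue  # not an insert directory
        suffix = name[len(prefix):]
        if not suffix.isdigit():
            continue  # unexpected name, ignore rather than guess
        if is_visible(int(suffix), high_watermark, excluded):
            out.append(name)
    return out
```

For example, with a highest known txn of 3 and txn 1 still open, a listing of `mm_0` through `mm_3` would be reduced to `mm_0` and `mm_2`.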
11.
Branch + Future Work
Current measurement shows a 21% reduction in partition load time (+HIVE-15368)
Time taken to load dynamic partitions: 350.846 seconds -> 274.715 seconds
Work continues in the branch for HIVE-14535
Work is ongoing to take advantage of faster recursive listings
Discussions are under way on incremental refresh for cube engines, for backfill
Questions? Suggestions?