SlideShare a Scribd company logo
1 of 15
Download to read offline
Alluxio Product School
Alluxio and NetApp
Modern Data Architecture Requirements
Resiliency
Data Placement Scale
2
Joseph Kandatilparambil
Solutions Architect, StorageGRID Software
Group, NetApp
Joseph leads the Big Data and Analytics strategy
for StorageGRID and has been with the
StorageGRID team for two years now.
Linkedin:
https://www.linkedin.com/in/jkandatilparambil/
 
Michael Waldrop
Mike leads the Global Solutions Engineering team for Alluxio.
Mike works with enterprises to solve complex data problems
and modernize their Big Data platforms.
https://www.linkedin.com/in/mikewaldrop
Object Use Cases Are Evolving
Cold
Archive
App
Storage
Data
Streaming
Analytics
Active
Archive and
Blob Store
HPC
Cold
Archive
Active
Archive
and
Blob
Store
Data streaming
HPC
App Storage
Analytics
5
• Massive unstructured data growth
continues to drive adoption
• Tipping point for S3 adoption across
workloads and applications
achieved
• Hybrid cloud is the standard
• Migrations from existing / legacy
object installations (not just green
field)
Object Storage Growth is Accelerating
87%
6
Common User Journey
Mount HDFS and object storage
into a common namespace to
facilitate migration to object
store without changing analytics
tools
Take advantage of all the
benefits of Object storage
Burst Compute to cloud to
scale quickly
Leverage multiple cloud
providers to get best of breed
analytics tools without vendor
lock-in
7
What is Data Orchestration ?
8
Modern Data Analytics Architecture
StorageGRID S3 Data Lake
On-Prem
On-Prem
On-Prem
9
• Software-defined object
storage
• Policy-based information
lifecycle management at
scale
• Global Namespace and true
Multi-Tenancy
• Durable, available, and
scale-out
• Cloud integrations: AWS SNS
notification, cloud mirroring,
Metadata search
Why NetApp StorageGRID for your Data Lake?
10
Data Fabric
Seattle
Denver
New York
New Apps
StorageGRID
Up to 16 logical/physical sites
S3.company.com
Public Cloud
S3
Developer
Data Scientist
Use case 1: Migration from HDFS to Object Storage
• Reduces capacity overhead costs
• Decouple compute and Storage
• Performance and scale
• Policy-based data migration to S3
StorageGRID S3 Data
Lake
11
Use case 2: Expose data on-prem to compute in cloud
ON PREMISE
• Enable storage at a fraction of the
cost
• Scale compute on demand
• Data is always protected – dual
layered protection
• Have control over data locality
12
Use case 3: Enable Multi-Cloud workloads
● Minimize data movement
● Have full control over
data placement
● Avoid vendor lock-in
● Adapt to new
requirements
Compute
On-Prem
StorageGRID S3 Data Lake
StorageGRID Data lake storage is designed for high-performance, fault-tolerance, and scale with low
touch operations
13
alluxio.io/slack
www.alluxio.io
twitter.com/alluxio
linkedin.com/alluxio
www.storagegrid.com
twitter.com/NetApp
linkedin.com/NetApp
joseph.kandatilparambil@netapp.com michael.waldrop@alluxio.com
15

More Related Content

Similar to Geo-distributed Analytics with NetApp StorageGRID and Alluxio

Se training storage grid webscale technical overview
Se training   storage grid webscale technical overviewSe training   storage grid webscale technical overview
Se training storage grid webscale technical overviewsolarisyougood
 
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...Alluxio, Inc.
 
NetApp Se training storage grid webscale technical overview
NetApp Se training   storage grid webscale technical overviewNetApp Se training   storage grid webscale technical overview
NetApp Se training storage grid webscale technical overviewsolarisyougood
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureDenodo
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAlluxio, Inc.
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio, Inc.
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingMinhazul Arefin
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaCloudera, Inc.
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Alluxio, Inc.
 
Big Data on Public Cloud
Big Data on Public CloudBig Data on Public Cloud
Big Data on Public CloudIMC Institute
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoopgluent.
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
Alluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio, Inc.
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...Alluxio, Inc.
 
Big Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsBig Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsDavid Bennett
 
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOCloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOStorage Switzerland
 
Moving data to the cloud BY CESAR ROJAS from Pivotal
Moving data to the cloud BY CESAR ROJAS from PivotalMoving data to the cloud BY CESAR ROJAS from Pivotal
Moving data to the cloud BY CESAR ROJAS from PivotalVMware Tanzu Korea
 

Similar to Geo-distributed Analytics with NetApp StorageGRID and Alluxio (20)

Se training storage grid webscale technical overview
Se training   storage grid webscale technical overviewSe training   storage grid webscale technical overview
Se training storage grid webscale technical overview
 
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
 
NetApp Se training storage grid webscale technical overview
NetApp Se training   storage grid webscale technical overviewNetApp Se training   storage grid webscale technical overview
NetApp Se training storage grid webscale technical overview
 
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric ArchitectureShaping the Role of a Data Lake in a Modern Data Fabric Architecture
Shaping the Role of a Data Lake in a Modern Data Fabric Architecture
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud World
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
 
The rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computingThe rise of “Big Data” on cloud computing
The rise of “Big Data” on cloud computing
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS
 
Big Data on Public Cloud
Big Data on Public CloudBig Data on Public Cloud
Big Data on Public Cloud
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Gluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with HadoopGluent Extending Enterprise Applications with Hadoop
Gluent Extending Enterprise Applications with Hadoop
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
Alluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio Use Cases and Future Directions
Alluxio Use Cases and Future Directions
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
Big Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsBig Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate Systems
 
Bigdata-Intro.pptx
Bigdata-Intro.pptxBigdata-Intro.pptx
Bigdata-Intro.pptx
 
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOCloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
 
Moving data to the cloud BY CESAR ROJAS from Pivotal
Moving data to the cloud BY CESAR ROJAS from PivotalMoving data to the cloud BY CESAR ROJAS from Pivotal
Moving data to the cloud BY CESAR ROJAS from Pivotal
 

More from Alluxio, Inc.

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioAlluxio, Inc.
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingAlluxio, Inc.
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio, Inc.
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...Alluxio, Inc.
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionAlluxio, Inc.
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeAlluxio, Inc.
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudAlluxio, Inc.
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderAlluxio, Inc.
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionAlluxio, Inc.
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio, Inc.
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...Alluxio, Inc.
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAlluxio, Inc.
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...Alluxio, Inc.
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...Alluxio, Inc.
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAlluxio, Inc.
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAlluxio, Inc.
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio, Inc.
 

More from Alluxio, Inc. (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with Alluxio
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio Caching
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet Reader
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage Evolution
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI Era
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
 

Recently uploaded

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 

Recently uploaded (20)

Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 

Geo-distributed Analytics with NetApp StorageGRID and Alluxio

  • 2. Modern Data Architecture Requirements Resiliency Data Placement Scale 2
  • 3. Joseph Kandatilparambil Solutions Architect, StorageGRID Software Group, NetApp Joseph leads the Big Data and Analytics strategy for StorageGRID and has been with the StorageGRID team for two years now. Linkedin: https://www.linkedin.com/in/jkandatilparambil/  
  • 4. Michael Waldrop Mike leads the Global Solutions Engineering team for Alluxio. Mike works with enterprises to solve complex data problems and modernize their Big Data platforms. https://www.linkedin.com/in/mikewaldrop
  • 5. Object Use Cases Are Evolving Cold Archive App Storage Data Streaming Analytics Active Archive and Blob Store HPC Cold Archive Active Archive and Blob Store Data streaming HPC App Storage Analytics 5
  • 6. • Massive unstructured data growth continues to drive adoption • Tipping point for S3 adoption across workloads and applications achieved • Hybrid cloud is the standard • Migrations from existing / legacy object installations (not just green field) Object Storage Growth is Accelerating 87% 6
  • 7. Common User Journey Mount HDFS and object storage into a common namespace to facilitate migration to object store without changing analytics tools Take advantage of all the benefits of Object storage Burst Compute to cloud to scale quickly Leverage multiple cloud providers to get best of breed analytics tools without vendor lock-in 7
  • 8. What is Data Orchestration ? 8
  • 9. Modern Data Analytics Architecture StorageGRID S3 Data Lake On-Prem On-Prem On-Prem 9
  • 10. • Software-defined object storage • Policy-based information lifecycle management at scale • Global Namespace and true Multi-Tenancy • Durable, available, and scale-out • Cloud integrations: AWS SNS notification, cloud mirroring, Metadata search Why NetApp StorageGRID for your Data Lake? 10 Data Fabric Seattle Denver New York New Apps StorageGRID Up to 16 logical/physical sites S3.company.com Public Cloud S3 Developer Data Scientist
  • 11. Use case 1: Migration from HDFS to Object Storage • Reduces capacity overhead costs • Decouple compute and Storage • Performance and scale • Policy-based data migration to S3 StorageGRID S3 Data Lake 11
  • 12. Use case 2: Expose data on-prem to compute in cloud ON PREMISE • Enable storage at a fraction of the cost • Scale compute on demand • Data is always protected – dual layered protection • Have control over data locality 12
  • 13. Use case 3: Enable Multi-Cloud workloads ● Minimize data movement ● Have full control over data placement ● Avoid vendor lock-in ● Adapt to new requirements Compute On-Prem StorageGRID S3 Data Lake StorageGRID Data lake storage is designed for high-performance, fault-tolerance, and scale with low touch operations 13
  • 15. 15