These slides are from a webinar that featured a discussion on ways companies can address shifts in big data infrastructure and design the appropriate data management architecture for optimal performance and scale. It covers lessons gleaned from real customer challenges and implementations, offering attendees practical advice on the kinds of design decisions that can optimize the protection and management of modern data platforms, from Hadoop to NoSQL databases.
2. Confidential and Proprietary
Webinar Goals
• Outline key architecture principles for backup/recovery & test data management in a modern data world
• Illustrate the importance of these principles using real customer examples
• Identify specific tradeoffs that can be made when deploying your data management infrastructure
3. Attributes of Modern Data Platforms
• Scale out to petabyte workloads
• Analytics-driven intelligence
• Storage optimization across diverse data & environments
• Minimize copies
• Create storage and compute pools on commodity H/W
4. Why Incremental Forever: Backups

Approach             Day 1   Days 2-7      Day 8         Days 9-14     Day 15
Traditional          Full    Incremental   Full          Incremental   Full
Big data platform    Full    Incremental   Incremental   Incremental   Incremental
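The savings from dropping periodic fulls can be shown with simple arithmetic. A minimal sketch, using illustrative figures (a 1 TB full backup and 50 GB of daily change):

```python
# Total data moved over 15 days under the two schedules above.
# Sizes are illustrative assumptions, not measured values.
FULL_GB = 1000          # size of one full backup
INCR_GB = 50            # size of one daily incremental

def traditional(days, full_every=7):
    # Full backup on day 1 and every 7th day after; incrementals otherwise.
    return sum(FULL_GB if d % full_every == 1 else INCR_GB
               for d in range(1, days + 1))

def incremental_forever(days):
    # One full on day 1, then incrementals from that point on.
    return FULL_GB + INCR_GB * (days - 1)

print(traditional(15))          # fulls land on days 1, 8, 15
print(incremental_forever(15))
```

With these assumptions the traditional schedule moves 3.6 TB in 15 days versus 1.7 TB for incremental-forever, and the gap widens as the retention window grows.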
5. Why Incremental Forever: Restores

Day                             1       2         3         ...   80       81
Data size                       1 TB    1.02 TB   1.03 TB   ...   1.2 TB   Developer error
Changed data from last backup   -       50 GB     50 GB     ...   50 GB
Backup type                     Full    Incr      Incr      ...   Incr

Data recovered by traditional approach: 1 TB + 79 x 50 GB = 4.95 TB
Data recovered by big data approach: 1.2 TB

Key concept: the notion of a "virtualized full" image
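The "virtualized full" idea can be sketched as overlaying each incremental's changed blocks onto the base image, so a restore reads only the current 1.2 TB rather than the base full plus every incremental. The block-level dicts below are an illustrative stand-in for real backup images:

```python
# Sketch: synthesize a restore point ("virtualized full") from a base
# full backup plus a chain of incrementals. Illustrative, not Talena's
# actual on-disk format.
def virtualized_full(base, incrementals):
    """Overlay each incremental's changed blocks onto the base image."""
    image = dict(base)
    for incr in incrementals:
        image.update(incr)      # a later backup wins for a changed block
    return image

base = {"blk0": "a", "blk1": "b", "blk2": "c"}    # day-1 full
day2 = {"blk1": "b2"}                             # changed blocks only
day3 = {"blk2": "c2", "blk3": "d"}                # growth + change

restore = virtualized_full(base, [day2, day3])
print(restore)   # {'blk0': 'a', 'blk1': 'b2', 'blk2': 'c2', 'blk3': 'd'}
```

The restored image is exactly the current dataset size, which is why the big data approach above recovers 1.2 TB instead of 4.95 TB.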
6. The Importance of Parallelism

Test                                                         Platform Utility   Talena         Differential
Full backup                                                  8 hr 17 min        2 hr 20 min    3.5x
Incremental backup                                           4 hr 55 min        26 min 7 sec   11.3x
Full restore to different cluster                            40 hr 28 min       14 hr 55 min   2.7x
Full restore to same cluster                                 6 hr 21 min        1 hr 58 min    3.2x
Full restore using incremental restore point, same cluster   21 hr 28 min       2 hr 5 min     10.3x

• Eliminate choke points
• Tradeoff between backup/restore performance and production cluster load
• Bandwidth efficiency
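The parallelism behind these speedups can be sketched as splitting a backup into independent chunks and moving them concurrently, so no single stream becomes a choke point. The chunk "copy" below is a stand-in for a real network transfer, and all names are illustrative:

```python
# Sketch: chunked, parallel data movement. In practice the worker count
# would be tuned against the load it places on the production cluster.
from concurrent.futures import ThreadPoolExecutor

def copy_chunk(chunk):
    return len(chunk)           # stand-in for transferring one chunk

def parallel_copy(data, chunk_size=4, workers=4):
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(copy_chunk, chunks))   # total bytes moved

print(parallel_copy(b"0123456789abcdef"))   # 16
```

This also illustrates the tradeoff called out above: more workers shorten the backup window but draw more I/O and bandwidth from the production cluster.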
7. Elastic Scaling: What Are The Issues

Year 1: 50-node Cassandra cluster, 125 TB, in Data Center #1
Year 2: 100-node multi-DC Cassandra cluster, 320 TB, spanning Data Center #1 and Data Center #2, plus an archive tier

Topic                           Key consideration
Scaling backup infrastructure   Just adding nodes, or a forklift upgrade
Agents/listeners                Manageability
Multi-DC awareness              Minimize WAN bandwidth overhead
8. The Cloud Effect

Production cluster (NoSQL/Hadoop/EDW) with local storage → object storage → cold storage

• Storage tiering
• Transparent access
• Bandwidth impact
10. The Talena Architecture
• Deep de-duplication and compression with an app-aware architecture
• Incremental-forever backup architecture
• High availability via erasure coding in a distributed cluster architecture

Smart Storage Optimizer
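The availability property of erasure coding can be illustrated with XOR parity, the simplest erasure code: one lost data block can be rebuilt from the survivors plus the parity block. Production systems typically use Reed-Solomon-style codes that tolerate multiple failures; this is a minimal sketch of the idea only:

```python
# Sketch: XOR parity as a one-failure-tolerant erasure code.
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytes(len(blocks[0]))
    for b in blocks:
        out = bytes(x ^ y for x, y in zip(out, b))
    return out

data = [b"aaaa", b"bbbb", b"cccc"]
parity = xor_blocks(data)

# Lose data[1]; rebuild it from the remaining blocks plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])   # True
```

Compared with full replication, parity-style coding keeps data available after a node loss while storing far less redundant data.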
11. The Talena Architecture
• Native querying and analytics via active compute layer
• Unbounded scale with a Hadoop-native architecture

Smart Storage Optimizer | Active Compute Services | Distributed File System
12. The Talena Architecture
• Google-like catalog shortens data recovery time
• Automatic schema generation for mirroring and backups
• Granular recovery at an object level
• Recovery to multiple topologies
• Native integration with LDAP and Kerberos for authentication
• Role-based access control defines specific privileges
• Transparent data encryption
• Masking for PII data

Smart Storage Optimizer | Active Compute Services | Distributed File System | Metadata Catalog | Data Orchestration Services | Security Services
13. The Talena Architecture
• 'Single pane of glass' for multiple use cases and data platforms
• Agentless architecture minimizes management overhead
• GUI, CLI, and REST-based Talena API options

GUI | CLI | API
Smart Storage Optimizer | Active Compute Services | Distributed File System | Metadata Catalog | Data Orchestration Services | Security Services
14. Q&A
We'll send you a link to our architecture white paper.
Additional resources: talena-inc.com/resources and talena-inc.com/blog
Ping us with any additional questions: info@talena-inc.com
Starting over 20 years ago, the traditional database market became the foundation of enterprise applications. A whole ecosystem of data management products emerged to provide capabilities like backup/recovery (Veritas), storage pooling (Data Domain), test/dev management (Delphix), and archiving (Iron Mountain). But companies had to purchase separate products to assemble a full data management solution for their enterprise.
Over the past few years and into the foreseeable future, modern data platforms will become the new hubs of enterprise applications. These modern data platforms need the same data management capabilities, echoing what happened with traditional databases.
(Click for build) Our vision is to help companies with their critical data management needs in a single software product, one that is optimized specifically for these modern Big Data environments.
The next few slides will introduce the unique Talena architecture and highlight how this architecture delivers on these core business benefits.
One of the most significant components of our architecture is our Smart Storage Optimizer.
By integrating compute and storage management into our storage optimizer, we’re able to deliver significant cost savings. Our application-aware architecture enables us to do deep de-duplication and compression. Our backup process is incremental-forever, saving on storage costs, and by incorporating erasure coding we also ensure high availability no matter how large a Talena cluster you choose to deploy.
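Content-addressed chunk storage is one common way de-duplication is implemented: each chunk is stored once, keyed by its content hash, and files are kept as recipes of chunk references. A minimal sketch (fixed-size chunking here is a simplification of app-aware dedup, which would pick boundaries from the data format):

```python
# Sketch: de-duplicating chunk store keyed by SHA-256 content hash.
import hashlib

def dedup_store(stream, chunk_size=4):
    store, recipe = {}, []
    for i in range(0, len(stream), chunk_size):
        chunk = stream[i:i + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)   # identical chunks are stored once
        recipe.append(key)             # recipe rebuilds the original stream
    return store, recipe

store, recipe = dedup_store(b"aaaaaaaabbbbaaaa")
print(len(recipe), len(store))   # 4 chunks referenced, only 2 stored
```

Replaying the recipe against the store reconstructs the original byte stream, so repeated data costs storage only once.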
The security services layer also supports transparent data encryption.
Our agentless architecture makes Talena an ideal solution for big data architectures and minimizes your operational overhead. Furthermore, Talena can support multiple data platforms, versions, and use cases in a single deployment, thereby providing a "single pane of glass" for all your big data management needs.
While most of our clients work within our user interface, we also provide a REST-based API to accomplish the same tasks.