BUILDING A HIGHLY AVAILABLE DATA INFRASTRUCTURE 1
© COPYRIGHT 2015 BY THE DATA MANAGEMENT INSTITUTE, LLC. ALL RIGHTS RESERVED.
BUILDING A HIGHLY AVAILABLE
DATA INFRASTRUCTURE
By
Jon Toigo
Chairman, Data Management Institute
jtoigo@toigopartners.com
INTRODUCTION
In survey after survey of business IT planners, a leading reason cited for adopting server
virtualization technology is to improve application availability. Hypervisor-based computing
vendors promise to improve the availability of applications by enabling failover between
servers connected into clusters. For such failover clustering to work, servers typically use a
“heartbeat” channel – a signal exchange – to confirm that both systems are operational. If the
primary system ceases to send or respond to these signals, the issue is reported to an
administrator so that a failover can be initiated and workloads shift to the alternative servers in
the cluster.
In the view of many administrators, this is the meaning of high availability: active-passive server
nodes with heartbeats and failover. Such technology is not only useful in providing operational
continuity in the face of a user, hardware or software fault (unscheduled downtime) that
impairs a production server, it also provides a fairly elegant means to move workloads from the
active to the passive server when scheduled maintenance requires taking the active server
offline. In that way, server HA clustering mitigates application downtime for scheduled
maintenance.
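The heartbeat mechanism described above can be sketched minimally. This is an illustrative toy (all names hypothetical); production clustering stacks such as Pacemaker or Windows Server Failover Clustering implement heartbeating, quorum and fencing far more robustly.

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between expected heartbeats
MISSED_LIMIT = 3           # consecutive misses before declaring failure

class HeartbeatMonitor:
    def __init__(self):
        self.last_seen = time.monotonic()

    def record_heartbeat(self):
        # Called whenever a heartbeat signal arrives from the peer node.
        self.last_seen = time.monotonic()

    def peer_failed(self):
        # True once the peer has been silent for MISSED_LIMIT intervals,
        # at which point an administrator would initiate failover.
        silence = time.monotonic() - self.last_seen
        return silence > HEARTBEAT_INTERVAL * MISSED_LIMIT

monitor = HeartbeatMonitor()
monitor.record_heartbeat()
healthy = not monitor.peer_failed()   # peer just checked in
```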
That view of clustering and availability is incomplete, however. In addition to server
equipment, for a system or application to be highly available, its data infrastructure must also
be highly available. Without access to data, server failover alone will not ensure the continuity
of business application processing.
As illustrated above, the simple clustering of two servers, facilitated by a health monitoring
heartbeat process over a LAN, with both servers connected to a common storage platform,
does not provide sufficient redundancy to ensure data availability. The single point of failure is
the data storage infrastructure, which would make both servers and their hosted applications
incapable of processing workload.
The logical alternative is to provide storage on each server independently, and to establish an
on-going data mirroring process between the storage arrays. The above diagram illustrates such
a configuration using a direct attached storage topology in which two identical arrays are
tethered directly to each server via a host bus adapter, cabling, and ports on each array. Array-to-array replication can be implemented via server-based software or via software loaded on each array controller. Using on-array mirroring typically requires that identical equipment be used on both sides of the mirror operation.
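The mirroring semantics above can be illustrated with a toy synchronous write path (arrays modeled as dictionaries, names hypothetical): the write completes only when both sides hold the block.

```python
class MirrorError(Exception):
    pass

def mirrored_write(primary, mirror, block_id, data):
    # Synchronous mirroring: the write is acknowledged only after BOTH
    # arrays hold the block, so the copy never lags the original.
    primary[block_id] = data
    try:
        mirror[block_id] = data
    except Exception as exc:
        # A real product would mark the mirror degraded and track the
        # gap for later resynchronization; here we just surface it.
        raise MirrorError("mirror copy failed") from exc

# Two arrays modeled as simple dictionaries (block id -> contents):
primary, mirror = {}, {}
mirrored_write(primary, mirror, 7, b"payload")
```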
Regardless of whether you use a direct-attached storage array, a network-attached storage (NAS) appliance, or a storage area network (SAN) to host your data, if this data infrastructure is not designed for high availability, then the data it stores is not highly available. By extension, application availability is at risk – regardless of server clustering.
The purpose of this paper is to outline best practices for improving overall business application
availability by building a highly available data infrastructure.
DATA AVAILABILITY STRATEGY
A long-time dictum of continuity planning is that one of two strategies is required to ensure the availability of data. One is a strategy of replacement: essentially, you do nothing prior to a data corruption or deletion event; then, as soon as possible after the cause of the corruption or deletion has been resolved, you re-key all of the lost data from original source materials. If that strategy of replacement sounds far-fetched in these days of exponential data growth rates, it is!
For data, there is only one viable strategy for ensuring availability: a strategy of redundancy.
To make data highly available, you need a copy of the data.
By extension, to create a highly available data infrastructure, you need to eliminate single
points of failure. You need to ensure that both the storage devices that host the data, and also
the interconnect elements -- including switches, host bus adapters and/or network interface
cards, and the wiring and cabling connecting servers to storage infrastructure -- are also made
redundant.
As stated in the introduction, simple clustering of two servers, with both servers connected to a common storage platform, remains vulnerable to an interruption of access to the shared array. Adding a second array and mirroring data between direct-attached storage systems compensates for that vulnerability, but in a manner that can quickly become unwieldy as infrastructure grows.
That was one reason for the introduction of storage area networks in the 1990s. SANs enabled
a switched infrastructure for storage devices to be created and shared as a pooled resource to
servers. A simple SAN is illustrated in the following diagram.
Notice that the single switch in the data path provides access to either data repository, so if
storage array 1 becomes unavailable, I/O requests can be shunted to storage array 2, where a
mirrored copy of data should be found. However, the switch itself is a single point of failure in
terms of access to stored data. If the switch fails, neither clustered host has access to its data.
The logical resolution of this potential problem for data availability is to add a redundant
storage area network switch in order to provide alternative paths to the same data storage.
The benefit of this strategy is that it provides multiple paths to data from each host. The potential downside is that the expense of the configuration has multiplied: each server must be equipped with multiple host bus adapters, and additional cabling is required both from each server to both switches and from each switch to both storage arrays. Moreover, this configuration grows more complex with the addition of more servers and more storage arrays.
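Path failover in such a dual-switch topology can be sketched as follows. This is a toy path table with hypothetical names; real multipath I/O drivers (for example, Linux dm-multipath) apply much richer path-grouping and health-checking policies.

```python
# Hypothetical path table for a dual-switch topology: the host reaches
# the array over two independent HBA/switch combinations.
paths = [
    {"hba": "hba0", "switch": "switch1", "array": "array1", "healthy": False},
    {"hba": "hba1", "switch": "switch2", "array": "array1", "healthy": True},
]

def pick_path(candidates):
    # Route I/O down the first healthy path, failing over to an
    # alternate when a link, HBA, or switch drops.
    for path in candidates:
        if path["healthy"]:
            return path
    raise RuntimeError("no surviving path to storage")

route = pick_path(paths)   # switch1 is down, so I/O flows via switch2
```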
There is also another factor to consider: shared utility services. To ensure that this HA
topology is truly effective, different switches and storage need to be on different electrical
sources so that a single power outage doesn’t take out the entire infrastructure. In some cases,
it may be desirable to place different SANs into different locations (different buildings in a
campus environment, different branch office facilities, etc.). For this to work, heartbeat
functionality, plus data mirroring/replication services, must be extended across a Metropolitan
Area or Wide Area Network interconnect.
These illustrations make it easy to see that ensuring data availability requires more than simply clustering two servers together. From a data availability standpoint, planners need to consider:
• Redundancies at the storage media layer, including data replication, RAID or erasure coding on the storage media itself
• Redundancies at the storage system layer, including redundant power supplies, fans, interconnecting controllers, ports, etc.
• Redundancies at the switch layer
• Redundancies in the cabling and interconnect layer
• Redundancies in the host bus adapters or NIC cards in the server hosts
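A hardware checklist like the one above can be sanity-checked mechanically: a component is a single point of failure when it lies on every path between host and data. A minimal sketch, with illustrative component names:

```python
def single_points_of_failure(paths):
    # A component is a single point of failure when it appears on EVERY
    # host-to-storage path; losing it severs all access to the data.
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common

# One shared HBA and one shared switch serving both arrays:
shared = [
    ("server1", "hba0", "switch1", "array1"),
    ("server1", "hba0", "switch1", "array2"),
]
spofs = single_points_of_failure(shared)   # hba0 and switch1 show up
```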
Those are just the hardware components. From a storage software perspective, there is a need for dependable data mirroring software – software that works with all of the storage gear that is deployed and with the interconnect capabilities present in the topology. If the storage equipment is heterogeneous (that is, not all of the same model type or from the same manufacturer), third-party mirroring and replication software will likely be needed to ensure that bits written to one array are faithfully copied to mirror targets.
Mirroring software provides synchronous replication of data over reasonably jitter-free links of less than roughly 70 kilometers. To mirror data beyond that distance, asynchronous replication software will likely be needed.
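The physics behind that distance limit can be estimated with back-of-the-envelope propagation math (signal speed in fiber is assumed here at roughly 200,000 km/s, about two-thirds the speed of light in vacuum):

```python
# Light in optical fiber covers roughly 200 km per millisecond,
# i.e. about 5 microseconds per kilometer one way.
US_PER_KM_ONE_WAY = 5.0

def sync_write_penalty_us(distance_km):
    # Extra round-trip delay a synchronous mirror adds to every write
    # acknowledgment (propagation only; switch and array latency add more).
    return 2 * distance_km * US_PER_KM_ONE_WAY

penalty = sync_write_penalty_us(70)   # ~700 microseconds per write at 70 km
```

At 70 km, every acknowledged write pays on the order of 0.7 ms of propagation delay alone, which is why longer distances push planners toward asynchronous replication.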
Additionally, the entire end-to-end storage I/O path needs to be continuously monitored, as does the integrity of the mirrors themselves, to ensure that data states on the source and target arrays do not drift too far apart. Hence, there is a “transparency” requirement that must be addressed in the data availability infrastructure that the planner designs.
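One common way to check mirror integrity is to compare per-block checksums on both sides. A minimal sketch (block maps modeled as dictionaries; real products operate on live volumes):

```python
import hashlib

def mirror_drift(source, target):
    # Compare per-block checksums on both sides of a mirror and return
    # the IDs of blocks whose contents differ or are missing on the target.
    def digest(data):
        return hashlib.sha256(data).hexdigest()
    return [block_id for block_id, data in source.items()
            if block_id not in target or digest(target[block_id]) != digest(data)]

source = {1: b"aaa", 2: b"bbb"}
target = {1: b"aaa", 2: b"stale"}
drifted = mirror_drift(source, target)   # block 2 has fallen out of sync
```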
NEXT GENERATION: FROM HARDWARE-DEFINED TO SOFTWARE-DEFINED
Over the past few years, substantial push-back has been coming from business IT executives
regarding the cost and complexity of building resilient hardware storage topologies for data
availability. Critics have argued that embedding storage services on the controllers of individual
arrays has a tendency to obfuscate their ability to monitor the I/O path or to test and validate
data mirror processes.
Software-defined storage (SDS) offers an alternative model in which on-array storage services are abstracted away from the array controller and instead instantiated on the server, either as a layer of the server virtualization hypervisor’s software stack or as a standalone virtual machine. SDS, however, has no universal definition. Some vendors interpret it to mean moving only caching functionality off array controllers and into a server-based software layer. Other vendors include a broad range of storage services that may include thin provisioning, data mirroring, snapshots, de-duplication and other functionality. To optimize data availability, however, SDS must include capacity management as well as the other services traditionally provided by array controllers.
By providing all of this functionality in an SDS “storage controller in software,” the underlying physical storage infrastructure can be optimized, and both storage capacity and services can be doled out to the application workloads that need them in a more agile way.
Consider that, with traditional storage infrastructure, the movement of a virtualized application between servers (for example, via vMotion) must be accompanied by a manual intervention within the application itself to provide a “new route” to the data store that the application accesses and uses. The only alternative is to mirror application data locally to storage locations on every conceivable hosting system – an extremely expensive and complex proposition.
If the data infrastructure itself is virtualized – if capacity is pooled into a shared resource and surfaced, or exposed, as virtual volumes – these virtual volumes can travel with the virtual machine as it moves from server host to server host. The result is that path re-routing between the application and the physical location where its data is stored is managed transparently and automatically by the SDS controller.
Moreover, by abstracting capacity management (sometimes called virtualizing storage), the SDS
layer can enable the assignment of data availability services to virtual storage volumes so that
all data written to the target is immediately subjected to the right combination of protective
services and replication processes that are appropriate given its importance to the business and
criticality to operations.
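The capacity-abstraction idea can be modeled in a few lines. This is a toy sketch with hypothetical names: applications address a virtual volume, and the SDS layer resolves it to a physical replica, so a VM that changes hosts needs no manual re-routing.

```python
class SDSController:
    # Toy model of SDS capacity abstraction: the mapping from virtual
    # volume to physical replicas lives in the SDS layer, not in the app.
    def __init__(self):
        self.volumes = {}   # virtual volume name -> list of physical replicas

    def create_volume(self, name, replicas):
        self.volumes[name] = list(replicas)

    def resolve(self, name):
        # Return the first replica; because the mapping is host-independent,
        # the answer is the same before and after a VM migration.
        return self.volumes[name][0]

sds = SDSController()
sds.create_volume("vvol-app1", ["array1/lun7", "array2/lun7"])
path = sds.resolve("vvol-app1")   # unchanged when the VM moves hosts
```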
Such a strategy insulates planners from the complexities brought about by scaling storage
capacity, integrating different brands and models of hardware from different vendors, and the
typical adds, moves and changes to the storage switch and cabling infrastructure. Plus, the
contents of virtual volumes can be compared in real time to validate mirrors without disrupting
the data copy process: a real boon.
Additionally, this strategy enables a truly available data infrastructure that can handle not only active-passive but also active-active clustering and failover. With active-active clustering, the application workload is spread over multiple hosts that cooperate in processing data. In a true active-active cluster, if one server goes offline, the other active node simply takes on the entire workload burden until the failed server or a replacement can be brought online. When that happens, data is resynchronized and the active-active relationship is re-established. Without capacity abstraction, such an infrastructure is extremely difficult, if not impossible, to build and operate – at least not with less expensive heterogeneous storage.
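The active-active behavior described above can be sketched with a trivial load-spreading model (node names hypothetical): work is spread across whichever nodes are online, and a survivor absorbs the whole load when its peer fails.

```python
nodes = {"node_a": True, "node_b": True}   # node name -> online?

def assign(requests):
    # Spread requests round-robin across the online nodes; when a node
    # fails, the survivors absorb its share of the workload.
    live = sorted(name for name, up in nodes.items() if up)
    if not live:
        raise RuntimeError("cluster down: no active nodes")
    return {req: live[i % len(live)] for i, req in enumerate(requests)}

placement = assign(range(4))   # work spread over node_a and node_b
nodes["node_b"] = False        # node_b goes offline
placement = assign(range(4))   # node_a now carries the entire workload
```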
BUILDING A HIGHLY AVAILABLE DATA INFRASTRUCTURE USING HYPER-CONVERGED
STORAGE
Lately, software-defined storage has spawned pre-configured combinations of servers and
storage devices joined together with SDS “middleware,” and offered as “atomic units of
computing.” This model for deploying infrastructure, known as hyper-converged, is still
undergoing significant development, but it is likely to become a fixture in highly virtualized
server environments.
Hyper-converged infrastructure provides an opportunity to realize the goal of a high availability
data infrastructure. By enabling top-to-bottom management of the infrastructure stack, and by
standardizing the techniques for connecting and clustering hyper-converged appliances, one
can see how horizontal and vertical scaling of capacity could be handled in an efficient way.
Plus, the best products will enable clusters to stretch across metropolitan area network (MAN)
connections with the same alacrity as local area networks. These stretch clusters provide
another level of availability to infrastructure and data.
In some cases, hyper-converged infrastructure, server+SDS+storage “building blocks,” will come
to displace some “legacy” infrastructure consisting of servers connected to SANs and NAS
appliances. However, the latest word from analysts suggests that the environment going
forward will be a mixture of legacy and hyper-converged models.
What is needed is a robust “hybrid” software-defined storage technology that can embrace the data availability requirements of both hyper-converged infrastructure and legacy infrastructure like SANs. Finding such an SDS architecture will require a careful evaluation of available options. Here are some suggested questions for identifying the solution that best fits your needs.
First, planners need to ask whether the subject SDS solution works with their current
infrastructure and with future storage architectures, like hyper-converged, that may be under
consideration.
Second, planners need to know whether the subject SDS solution will support their current
workloads – virtualized and not.
Third, planners need to discover whether the SDS solution under consideration offers capacity
virtualization as well as storage services aggregation.
Fourth, planners need to assess their own on-staff skills and determine, based on this
assessment, the importance of user friendliness features of the SDS solution. Does it provide
simple configuration and wizard-based allocation of capacity and services?
Finally, since we are talking about data availability, planners need to look closely at what the
SDS solution offers from the standpoint of data mirroring and replication services. More to the
point:
1) Does the solution enable mirroring of data across unlike hardware?
2) Does the solution enable centralized management of diverse data replication
processes for ease of management?
3) Does the SDS solution offer mirror transparency – visibility into mirroring processes
for ease of testing and validation?
4) Does the solution automate the failover of data access between different media,
arrays or SANs that are part of the SDS-administered infrastructure so that the
re-routing of applications to their data requires no human intervention?
5) Does the solution provide the means to failback from a failover once the error
condition that necessitated the failover is resolved?
A word of warning: most SDS solutions will not pass muster given these criteria, mainly because only a handful embrace the notion of a hybrid legacy/hyper-converged infrastructure or the concept of virtualizing storage capacity. Planners need to press for this functionality and to prefer SDS solutions that meet these critical requirements if they are to build an infrastructure that delivers data availability.
From Disaster to Recovery: Preparing Your IT for the Unexpected
DataCore Software
 
How to Integrate Hyperconverged Systems with Existing SANs
How to Integrate Hyperconverged Systems with Existing SANsHow to Integrate Hyperconverged Systems with Existing SANs
How to Integrate Hyperconverged Systems with Existing SANs
DataCore Software
 
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site RecoveryHow to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
DataCore Software
 
Cloud Infrastructure for Your Data Center
Cloud Infrastructure for Your Data CenterCloud Infrastructure for Your Data Center
Cloud Infrastructure for Your Data Center
DataCore Software
 
TUI Case Study
TUI Case StudyTUI Case Study
TUI Case Study
DataCore Software
 
Thorntons Case Study
Thorntons Case StudyThorntons Case Study
Thorntons Case Study
DataCore Software
 
Top 3 Challenges Impacting Your Data and How to Solve Them
Top 3 Challenges Impacting Your Data and How to Solve ThemTop 3 Challenges Impacting Your Data and How to Solve Them
Top 3 Challenges Impacting Your Data and How to Solve Them
DataCore Software
 
Business Continuity for Mission Critical Applications
Business Continuity for Mission Critical ApplicationsBusiness Continuity for Mission Critical Applications
Business Continuity for Mission Critical Applications
DataCore Software
 
Dynamic Hyper-Converged Future Proof Your Data Center
Dynamic Hyper-Converged Future Proof Your Data CenterDynamic Hyper-Converged Future Proof Your Data Center
Dynamic Hyper-Converged Future Proof Your Data Center
DataCore Software
 
Community Health Network Delivers Unprecedented Availability for Critical Hea...
Community Health Network Delivers Unprecedented Availability for Critical Hea...Community Health Network Delivers Unprecedented Availability for Critical Hea...
Community Health Network Delivers Unprecedented Availability for Critical Hea...
DataCore Software
 
Case Study: Mission Community Hospital
Case Study: Mission Community HospitalCase Study: Mission Community Hospital
Case Study: Mission Community Hospital
DataCore Software
 
Emergency Communication of Southern Oregon
Emergency Communication of Southern OregonEmergency Communication of Southern Oregon
Emergency Communication of Southern Oregon
DataCore Software
 
DataCore At VMworld 2016
DataCore At VMworld 2016DataCore At VMworld 2016
DataCore At VMworld 2016
DataCore Software
 
Integrating Hyper-converged Systems with Existing SANs
Integrating Hyper-converged Systems with Existing SANs Integrating Hyper-converged Systems with Existing SANs
Integrating Hyper-converged Systems with Existing SANs
DataCore Software
 
Fighting the Hidden Costs of Data Storage
Fighting the Hidden Costs of Data StorageFighting the Hidden Costs of Data Storage
Fighting the Hidden Costs of Data Storage
DataCore Software
 
Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?
DataCore Software
 
The Need for Speed: Parallel I/O and the New Tick-Tock in Computing
The Need for Speed: Parallel I/O and the New Tick-Tock in ComputingThe Need for Speed: Parallel I/O and the New Tick-Tock in Computing
The Need for Speed: Parallel I/O and the New Tick-Tock in Computing
DataCore Software
 

More from DataCore Software (20)

Software-Defined Storage Accelerates Storage Cost Reduction and Service-Level...
Software-Defined Storage Accelerates Storage Cost Reduction and Service-Level...Software-Defined Storage Accelerates Storage Cost Reduction and Service-Level...
Software-Defined Storage Accelerates Storage Cost Reduction and Service-Level...
 
NVMe and Flash – Make Your Storage Great Again!
NVMe and Flash – Make Your Storage Great Again!NVMe and Flash – Make Your Storage Great Again!
NVMe and Flash – Make Your Storage Great Again!
 
Zero Downtime, Zero Touch Stretch Clusters from Software-Defined Storage
Zero Downtime, Zero Touch Stretch Clusters from Software-Defined StorageZero Downtime, Zero Touch Stretch Clusters from Software-Defined Storage
Zero Downtime, Zero Touch Stretch Clusters from Software-Defined Storage
 
From Disaster to Recovery: Preparing Your IT for the Unexpected
From Disaster to Recovery: Preparing Your IT for the UnexpectedFrom Disaster to Recovery: Preparing Your IT for the Unexpected
From Disaster to Recovery: Preparing Your IT for the Unexpected
 
How to Integrate Hyperconverged Systems with Existing SANs
How to Integrate Hyperconverged Systems with Existing SANsHow to Integrate Hyperconverged Systems with Existing SANs
How to Integrate Hyperconverged Systems with Existing SANs
 
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site RecoveryHow to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
 
Cloud Infrastructure for Your Data Center
Cloud Infrastructure for Your Data CenterCloud Infrastructure for Your Data Center
Cloud Infrastructure for Your Data Center
 
TUI Case Study
TUI Case StudyTUI Case Study
TUI Case Study
 
Thorntons Case Study
Thorntons Case StudyThorntons Case Study
Thorntons Case Study
 
Top 3 Challenges Impacting Your Data and How to Solve Them
Top 3 Challenges Impacting Your Data and How to Solve ThemTop 3 Challenges Impacting Your Data and How to Solve Them
Top 3 Challenges Impacting Your Data and How to Solve Them
 
Business Continuity for Mission Critical Applications
Business Continuity for Mission Critical ApplicationsBusiness Continuity for Mission Critical Applications
Business Continuity for Mission Critical Applications
 
Dynamic Hyper-Converged Future Proof Your Data Center
Dynamic Hyper-Converged Future Proof Your Data CenterDynamic Hyper-Converged Future Proof Your Data Center
Dynamic Hyper-Converged Future Proof Your Data Center
 
Community Health Network Delivers Unprecedented Availability for Critical Hea...
Community Health Network Delivers Unprecedented Availability for Critical Hea...Community Health Network Delivers Unprecedented Availability for Critical Hea...
Community Health Network Delivers Unprecedented Availability for Critical Hea...
 
Case Study: Mission Community Hospital
Case Study: Mission Community HospitalCase Study: Mission Community Hospital
Case Study: Mission Community Hospital
 
Emergency Communication of Southern Oregon
Emergency Communication of Southern OregonEmergency Communication of Southern Oregon
Emergency Communication of Southern Oregon
 
DataCore At VMworld 2016
DataCore At VMworld 2016DataCore At VMworld 2016
DataCore At VMworld 2016
 
Integrating Hyper-converged Systems with Existing SANs
Integrating Hyper-converged Systems with Existing SANs Integrating Hyper-converged Systems with Existing SANs
Integrating Hyper-converged Systems with Existing SANs
 
Fighting the Hidden Costs of Data Storage
Fighting the Hidden Costs of Data StorageFighting the Hidden Costs of Data Storage
Fighting the Hidden Costs of Data Storage
 
Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?
 
The Need for Speed: Parallel I/O and the New Tick-Tock in Computing
The Need for Speed: Parallel I/O and the New Tick-Tock in ComputingThe Need for Speed: Parallel I/O and the New Tick-Tock in Computing
The Need for Speed: Parallel I/O and the New Tick-Tock in Computing
 

Recently uploaded

Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Mind IT Systems
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
Roshan Dwivedi
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
abdulrafaychaudhry
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 

Recently uploaded (20)

Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 

Building a Highly Available Data Infrastructure

  • 1. BUILDING A HIGHLY AVAILABLE DATA INFRASTRUCTURE 1 © COPYRIGHT 2015 BY THE DATA MANAGEMENT INSTITUTE, LLC. ALL RIGHTS RESERVED. BUILDING A HIGHLY AVAILABLE DATA INFRASTRUCTURE By Jon Toigo Chairman, Data Management Institute jtoigo@toigopartners.com INTRODUCTION In survey after survey of business IT planners, a leading reason cited for adopting server virtualization technology is to improve application availability. Hypervisor-based computing vendors promise to improve the availability of applications by enabling failover between servers connected into clusters. For such failover clustering to work, servers typically use a “heartbeat” channel – a signal exchange – to confirm that both systems are operational. If the primary system ceases to send or respond to these signals, the issue is reported to an administrator so that a failover can be initiated and workloads shift to the alternative servers in the cluster. In the view of many administrators, this is the meaning of high availability: active-passive server nodes with heartbeats and failover. Such technology is not only useful in providing operational continuity in the face of a user, hardware or software fault (unscheduled downtime) that impairs a production server, it also provides a fairly elegant means to move workloads from the active to the passive server when scheduled maintenance requires taking the active server offline. In that way, server HA clustering mitigates application downtime for scheduled maintenance.
heartbeat process over a LAN, with both servers connected to a common storage platform, does not provide sufficient redundancy to ensure data availability. The single point of failure is the shared data storage platform: its loss would leave both servers and their hosted applications incapable of processing workload.

The logical alternative is to provide storage on each server independently, and to establish an ongoing data mirroring process between the storage arrays. The diagram above illustrates such a configuration using a direct-attached storage topology in which two identical arrays are
tethered directly to each server via a host bus adapter, cabling, and ports on each array. Array-to-array replication can be implemented via server-based software or via software loaded on each array controller. Using on-array mirroring typically requires that identical equipment be used on both sides of the mirror operation.

Regardless of whether you use a direct-attached storage array, a network-attached storage (NAS) appliance, or a storage area network (SAN) to host your data, if that data infrastructure is not designed for high availability, then the data it stores is not highly available; by extension, application availability is at risk, regardless of server clustering. The purpose of this paper is to outline best practices for improving overall business application availability by building a highly available data infrastructure.

DATA AVAILABILITY STRATEGY

A long-time dictum of continuity planning is that one of two strategies is required to ensure the availability of data. One is a strategy of replacement: essentially, you do nothing prior to a data corruption or deletion event; then, as soon as possible after the cause of the event has been resolved, you re-key all of the lost data from original source materials. If that sounds somewhat far-fetched in these days of exponential data growth rates, it is!

For data, there is really only one viable strategy for ensuring availability: a strategy of redundancy. To make data highly available, you need a copy of the data. By extension, to create a highly available data infrastructure, you need to eliminate single points of failure.
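Before turning to storage-layer redundancy, the server-side half of the picture, the heartbeat-driven failover described in the introduction, can be sketched in a few lines. This is a minimal illustration only, not any vendor's implementation; the `Node` and `Cluster` names are hypothetical.

```python
import time

class Node:
    """A cluster node that periodically emits a heartbeat timestamp."""
    def __init__(self, name):
        self.name = name
        self.last_heartbeat = time.monotonic()
        self.healthy = True

    def beat(self):
        # A faulted node stops refreshing its heartbeat.
        if self.healthy:
            self.last_heartbeat = time.monotonic()

class Cluster:
    """Active-passive pair: promote the passive node when heartbeats stop."""
    def __init__(self, active, passive, timeout=3.0):
        self.active, self.passive, self.timeout = active, passive, timeout

    def check(self):
        # If the active node has missed its heartbeat window, fail over.
        if time.monotonic() - self.active.last_heartbeat > self.timeout:
            self.active, self.passive = self.passive, self.active
            return True   # failover occurred
        return False

primary, standby = Node("server-a"), Node("server-b")
cluster = Cluster(primary, standby, timeout=0.1)

primary.beat()
assert cluster.check() is False          # heartbeat fresh: no failover

primary.healthy = False                  # simulate a server fault
time.sleep(0.2)                          # heartbeat window elapses
assert cluster.check() is True           # workload shifts to the standby
assert cluster.active.name == "server-b"
```

Note what the sketch does not do: nothing here guarantees that `server-b` can reach the failed node's data, which is precisely the gap this paper addresses.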
You need to ensure that both the storage devices that host the data and the interconnect elements -- including switches, host bus adapters and/or network interface cards, and the cabling connecting servers to the storage infrastructure -- are made redundant.

As stated in the introduction, the simple clustering of two servers, with both connected to a common storage platform, remains vulnerable to an interruption of access to the shared array. Adding a second array and mirroring data between direct-attached storage systems compensates for that vulnerability, but in a manner that can quickly become unwieldy as infrastructure grows. That was one reason for the introduction of storage area networks in the 1990s. SANs enabled a switched infrastructure in which storage devices could be created and shared with servers as a pooled resource. A simple SAN is illustrated in the following diagram.
Notice that the single switch in the data path provides access to either data repository, so if storage array 1 becomes unavailable, I/O requests can be shunted to storage array 2, where a mirrored copy of the data should be found. However, the switch itself is a single point of failure in terms of access to stored data. If the switch fails, neither clustered host has access to its data. The logical resolution of this problem is to add a redundant storage area network switch in order to provide alternative paths to the same data storage.

The benefit of this strategy is that it provides multiple paths to data from each host. The potential downside is that the cost of the configuration has multiplied: each server must be equipped with multiple host bus adapters, and additional cabling is required both from each server to both switches and from each switch to both storage arrays. Moreover, this configuration grows more complex with the addition of more servers and more storage arrays.
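The redundant-path behavior described above can be sketched as a simple multipath I/O loop: try each path in turn and fall back when one fails. This is an illustrative toy, not a real multipath driver; the `Path` class and path names are invented for the example.

```python
class Path:
    """One physical route to storage: HBA -> switch -> array port."""
    def __init__(self, name, up=True):
        self.name, self.up = name, up

def read_block(paths, array, lba):
    """Multipath I/O: try each redundant path until one succeeds."""
    for path in paths:
        if path.up:
            return array[lba], path.name
    raise IOError("all paths to storage failed")

array = {7: b"ledger"}
paths = [Path("hba1->switch1"), Path("hba2->switch2")]

data, used = read_block(paths, array, 7)
assert used == "hba1->switch1"           # healthy: primary path serves I/O

paths[0].up = False                      # switch 1 fails
data, used = read_block(paths, array, 7) # I/O survives via the second switch
assert used == "hba2->switch2" and data == b"ledger"
```

The point of the sketch is the last two lines: with two fully independent paths, a switch failure degrades the topology without interrupting access to data.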
There is also another factor to consider: shared utility services. To ensure that this HA topology is truly effective, the different switches and storage systems need to be on different electrical sources so that a single power outage does not take out the entire infrastructure. In some cases, it may be desirable to place the different SANs in different locations (different buildings in a campus environment, different branch office facilities, etc.). For this to work, heartbeat functionality, plus data mirroring/replication services, must be extended across a metropolitan area or wide area network interconnect.

From these illustrations it is easy to see that ensuring data availability requires more than simply clustering two servers together. From a data availability standpoint, planners need to consider:

• Redundancies at the storage layer, including data replication, RAID or erasure coding on the storage media itself
• Redundancies at the storage system layer, including redundant power supplies, fans, interconnecting controllers, ports, etc.
• Redundancies at the switch layer
• Redundancies in the cabling and interconnect layer
• Redundancies in the host bus adapters or network interface cards in the server hosts

Those are just the hardware components. From a storage software perspective, there is a need for reliable data mirroring software -- software that works with all of the storage gear that is deployed and with the interconnect capabilities present in the topology. If the storage equipment is heterogeneous (that is, not all of the same model or from the same manufacturer), third-party mirroring and replication software will likely be needed to ensure that bits written to one array are faithfully copied to the mirror targets.
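The core contract of such mirroring software, that bits written to one array are faithfully copied to the mirror target before the write is acknowledged, can be sketched as a synchronous mirrored write. This is a conceptual sketch only; the `Array` class and its methods are hypothetical.

```python
class Array:
    """A toy storage array: logical block address -> data."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, lba, data):
        if not self.online:
            raise IOError(f"{self.name} is offline")
        self.blocks[lba] = data

def mirrored_write(source, target, lba, data):
    """Synchronous mirroring: the application's write is acknowledged
    only after BOTH sides have committed the same bits."""
    source.write(lba, data)
    target.write(lba, data)   # a failure here means no ack to the application

a1, a2 = Array("array-1"), Array("array-2")
mirrored_write(a1, a2, lba=0, data=b"ledger")
mirrored_write(a1, a2, lba=1, data=b"payroll")
assert a1.blocks == a2.blocks   # mirror integrity: identical data states
```

The final assertion is the property the paper calls mirror integrity: at any acknowledged point, source and target hold the same data.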
Mirroring software provides synchronous replication of data over reasonably jitter-free links of less than 70 kilometers. To mirror data beyond that distance, asynchronous replication software will likely be needed. Additionally, the entire end-to-end storage I/O path needs to be continuously monitored, as does the integrity of the mirrors themselves, to ensure that the data states in the source and target arrays do not drift too far apart. Hence, there is a “transparency” requirement that must be addressed in the data availability infrastructure that the planner designs.

NEXT GENERATION: FROM HARDWARE-DEFINED TO SOFTWARE-DEFINED

Over the past few years, substantial push-back has come from business IT executives regarding the cost and complexity of building resilient hardware storage topologies for data availability. Critics have argued that embedding storage services on the controllers of individual arrays tends to obscure their ability to monitor the I/O path or to test and validate data mirroring processes.

Software-defined storage (SDS) offers an alternative model in which on-array storage services are abstracted away from the array controller and instead instantiated on the server, either as a layer of the server virtualization hypervisor stack or as a standalone virtual machine. SDS, however, has no universal definition. Some vendors interpret it to mean moving only caching functionality off array controllers and into a server-based software layer. Other vendors include a broad range of storage services that may include thin provisioning, data mirroring, snapshots, de-duplication and other functionality. To optimize data availability, however, SDS must include capacity management as well as the other services traditionally provided by array controllers.
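The asynchronous alternative described at the start of this section, together with the requirement to monitor how far the mirror has drifted, can be sketched as a replication queue with a lag check. This is a minimal illustration under invented names (`AsyncReplica`, `max_lag`), not a description of any product's replication engine.

```python
from collections import deque

class AsyncReplica:
    """Asynchronous replication: writes are acknowledged locally,
    queued, and shipped to the remote target in the background."""
    def __init__(self, max_lag=100):
        self.source, self.target = {}, {}
        self.queue = deque()
        self.max_lag = max_lag

    def write(self, lba, data):
        self.source[lba] = data          # local ack is immediate
        self.queue.append((lba, data))   # replicate later over the WAN link

    def drain(self, n=1):
        # Background shipper: apply up to n queued writes to the target.
        for _ in range(min(n, len(self.queue))):
            lba, data = self.queue.popleft()
            self.target[lba] = data

    def lag_ok(self):
        # The monitoring requirement: source and target data states
        # must not drift too far apart.
        return len(self.queue) <= self.max_lag

r = AsyncReplica(max_lag=2)
r.write(0, b"a"); r.write(1, b"b"); r.write(2, b"c")
assert not r.lag_ok()        # three outstanding writes exceed the limit
r.drain(3)
assert r.lag_ok() and r.target == r.source   # mirror has caught up
```

The trade this sketch makes explicit: asynchronous replication tolerates distance and latency, but at any moment the target may lag the source, which is why drift must be watched continuously.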
By providing all of this functionality in an SDS “storage controller in software,” the underlying physical storage infrastructure can be optimized, and both storage capacity and services can be doled out more agilely to the application workloads that need them. Consider that, with traditional storage infrastructure, the movement of a virtualized application between servers (for example, via vMotion) must be accompanied by a manual intervention within the application itself in order to provide a “new route” to the data store that the application accesses and uses. The only alternative is to mirror application data locally to storage locations on every conceivable hosting system, which can be an extremely expensive and complex proposition.

If the data infrastructure itself is virtualized -- if capacity is pooled into a shared resource and surfaced or exposed as virtual volumes -- those virtual volumes can travel with a virtual machine as it moves from server host to server host. The result is that path re-routing between the
application and the physical location where its data is stored is managed transparently and automatically by the SDS controller.

Moreover, by abstracting capacity management (sometimes called storage virtualization), the SDS layer can enable the assignment of data availability services to virtual storage volumes, so that all data written to a target is immediately subjected to the right combination of protective services and replication processes, appropriate to its importance to the business and its criticality to operations. Such a strategy insulates planners from the complexities brought about by scaling storage capacity, integrating different brands and models of hardware from different vendors, and the typical adds, moves and changes to the storage switch and cabling infrastructure. Plus, the contents of virtual volumes can be compared in real time to validate mirrors without disrupting the data copy process: a real boon.

Additionally, this strategy enables a truly available data infrastructure that can handle not only active-passive but also active-active clustering and failover. With active-active clustering, application workload is spread over multiple hosts that cooperate in the processing of data. In a true active-active cluster, if one server goes offline, the other active node simply takes on the entire workload burden until the failed server or a replacement can be brought online. When that happens, data is resynchronized and the active-active relationship is reestablished. Without capacity abstraction, such an infrastructure is extremely difficult, if not impossible, to build and operate -- at least not with less expensive heterogeneous storage.
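The capacity-abstraction idea described above, a virtual volume whose identity is stable while its blocks live on pooled, dissimilar physical arrays, can be sketched as follows. All names here (`SDSController`, `VirtualVolume`) are invented for illustration and do not correspond to any vendor's API.

```python
class VirtualVolume:
    """A virtual volume: a stable identity decoupled from physical arrays."""
    def __init__(self, name, extents):
        self.name = name
        self.extents = extents   # list of (physical array, lba) placements

class SDSController:
    """Pools heterogeneous arrays and resolves virtual -> physical I/O,
    so an application keeps the same volume name wherever it runs."""
    def __init__(self, arrays):
        self.arrays = arrays     # backend arrays, modeled as dicts
        self.volumes = {}

    def create_volume(self, name, nblocks):
        # Stripe block placements round-robin across the pooled arrays.
        extents = [(self.arrays[i % len(self.arrays)], i)
                   for i in range(nblocks)]
        self.volumes[name] = VirtualVolume(name, extents)

    def write(self, name, block, data):
        # The controller, not the application, resolves the physical path.
        array, lba = self.volumes[name].extents[block]
        array[lba] = data

pool = SDSController([{}, {}])            # two dissimilar backend arrays
pool.create_volume("vm-datastore", nblocks=4)
pool.write("vm-datastore", 0, b"x")
pool.write("vm-datastore", 1, b"y")
# Blocks landed on different physical arrays, invisibly to the application.
assert pool.arrays[0][0] == b"x" and pool.arrays[1][1] == b"y"
```

Because the application addresses only "vm-datastore," a VM carrying that volume name can move between hosts without manual re-routing; remapping physical placements is the controller's job.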
BUILDING A HIGHLY AVAILABLE DATA INFRASTRUCTURE USING HYPER-CONVERGED STORAGE

Lately, software-defined storage has spawned pre-configured combinations of servers and storage devices, joined together with SDS "middleware" and offered as "atomic units of computing." This model for deploying infrastructure, known as hyper-converged, is still undergoing significant development, but it is likely to become a fixture in highly virtualized server environments.

Hyper-converged infrastructure provides an opportunity to realize the goal of a highly available data infrastructure. By enabling top-to-bottom management of the infrastructure stack, and by standardizing the techniques for connecting and clustering hyper-converged appliances, horizontal and vertical scaling of capacity can be handled efficiently. Plus, the best products will enable clusters to stretch across metropolitan area network (MAN) connections with the same alacrity as local area networks. These stretch clusters provide another level of availability for infrastructure and data.
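One practical consideration for stretch clusters is whether the inter-site link can tolerate synchronous mirroring, where every write waits for the remote acknowledgement. The sketch below illustrates that trade-off; the 5 ms round-trip threshold is an illustrative assumption, not a figure from any particular product, and real solutions weigh bandwidth and write rates as well.

```python
def mirror_mode(round_trip_ms, sync_threshold_ms=5.0):
    """Pick a mirroring mode for a stretch-cluster link.

    Synchronous mirroring holds each write until the remote copy
    acknowledges, so it is only practical over low-latency metro links.
    Beyond the (assumed) threshold, asynchronous replication avoids
    penalizing every application write with the link's round trip.
    """
    return "synchronous" if round_trip_ms <= sync_threshold_ms else "asynchronous"


print(mirror_mode(2.0))    # short metro distance: writes can wait for the mirror
print(mirror_mode(40.0))   # long haul: replicate asynchronously instead
```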
In some cases, hyper-converged infrastructure (server+SDS+storage "building blocks") will come to displace some legacy infrastructure consisting of servers connected to SANs and NAS appliances. However, the latest word from analysts suggests that the environment going forward will be a mixture of legacy and hyper-converged models. What is needed is a robust "hybrid" software-defined storage technology that can embrace the requirements of data availability on hyper-converged infrastructure and on legacy infrastructure such as SANs alike.

Finding such an SDS architecture will require a careful evaluation of the available options. Here are some suggested questions for identifying the solution that best fits your needs:

First, planners need to ask whether the subject SDS solution works with their current infrastructure and with future storage architectures, like hyper-converged, that may be under consideration.

Second, planners need to know whether the subject SDS solution will support their current workloads, virtualized and not.

Third, planners need to discover whether the SDS solution under consideration offers capacity virtualization as well as storage services aggregation.

Fourth, planners need to assess their own on-staff skills and determine, based on this assessment, the importance of the SDS solution's user-friendliness features. Does it provide simple configuration and wizard-based allocation of capacity and services?
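The four questions above lend themselves to a simple evaluation worksheet. The sketch below is one hypothetical way to record and score yes/no answers per candidate; the vendor name is invented and the equal weighting of criteria is an assumption a real evaluation would refine.

```python
# The four evaluation criteria paraphrased from the questions above.
CRITERIA = [
    "works with current and future (e.g. hyper-converged) infrastructure",
    "supports current workloads, virtualized and not",
    "offers capacity virtualization plus storage services aggregation",
    "provides simple, wizard-based configuration and allocation",
]


def evaluate(candidate, answers):
    """Score one SDS candidate against the criteria.

    `answers` is a list of booleans, one per criterion, in order.
    Returns (candidate, score, list of unmet criteria).
    """
    unmet = [c for c, ok in zip(CRITERIA, answers) if not ok]
    return candidate, len(CRITERIA) - len(unmet), unmet


# Hypothetical candidate that falls short on capacity virtualization.
name, score, gaps = evaluate("Vendor X SDS", [True, True, False, True])
print(f"{name}: {score}/{len(CRITERIA)} criteria met")
for gap in gaps:
    print(f"  missing: {gap}")
```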
Finally, since we are talking about data availability, planners need to look closely at what the SDS solution offers from the standpoint of data mirroring and replication services. More to the point:

1) Does the solution enable mirroring of data across unlike hardware?

2) Does the solution enable centralized management of diverse data replication processes for ease of management?

3) Does the SDS solution offer mirror transparency (visibility into mirroring processes) for ease of testing and validation?

4) Does the solution automate the failover of data access between the different media, arrays, or SANs that are part of the SDS-administered infrastructure, so that re-routing applications to their data requires no human intervention?

5) Does the solution provide the means to fail back from a failover once the error condition that necessitated the failover is resolved?

A word of warning: most SDS solutions will not pass muster against these criteria. That is mainly because only a handful embrace the notion of a hybrid legacy/hyper-converged infrastructure or the concept of virtualizing storage capacity. Planners need to press for this functionality and to prefer SDS solutions that meet these critical requirements if they are to build an infrastructure that delivers data availability.
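Criteria 4 and 5 above, automated failover and failback, can be sketched as follows. The array names are hypothetical, and a real SDS layer would additionally resynchronize the mirror before restoring the primary path; this skeleton only shows the state transitions that should happen without human intervention.

```python
class MirroredVolume:
    """Sketch of automated failover and failback between mirrored targets."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror
        self.active = primary      # path currently serving application I/O
        self.failed_over = False

    def read_path(self):
        return self.active

    def on_path_failure(self):
        # Criterion 4: the SDS layer re-routes I/O to the surviving mirror
        # automatically; applications keep running with no human intervention.
        self.active = self.mirror
        self.failed_over = True

    def failback(self):
        # Criterion 5: once the fault is resolved (and, in practice, the
        # mirror has resynchronized), restore the original primary path.
        if self.failed_over:
            self.active = self.primary
            self.failed_over = False


vol = MirroredVolume("san-array-1", "san-array-2")
vol.on_path_failure()    # simulate loss of the primary array
print(vol.read_path())   # I/O now flows to the mirror
vol.failback()           # fault repaired: return to the primary
print(vol.read_path())
```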