Thinklogical White Paper: Redundant Fiber-Based Systems


A Thinklogical White Paper
By Larry Wachter
Senior Product Manager - Routing and Extension Solutions - Thinklogical

This white paper illustrates the concept of redundant and resilient
systems and how fiber-based extension and routing solutions can
maintain operability in the event of a failure.

www.thinklogical.com


Introduction

At the most basic level, availability can be defined as the probability that a system is operating
successfully when needed. The term high availability has been used to encompass all things related to
productivity, including reliability and maintainability. The adoption of high availability has led to
redundant and resilient systems, spurring a ripple effect that ends with the creation of fiber
infrastructures that require products and solutions providing various levels of fault tolerance. In
particular, this is true of fiber-based routing and extension solutions, which not only provide mechanisms
that aid in modular redundant system architecture, but also provide high bandwidth, cost-effectiveness,
and support for complex topologies. Consequently, Thinklogical has designed a redundant fiber-based
routing and extension solution that meets the requirements for reliable signal transmission in modular
redundant system deployments.
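
The definition of availability above can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below is illustrative only: the MTBF and MTTR figures are assumed values chosen for the example, not Thinklogical specifications, and the parallel formula assumes that the two units fail independently.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the unit is operable."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def parallel_availability(unit_availability: float, units: int = 2) -> float:
    """Availability of redundant units, assuming independent failures.

    The system is down only when every unit is down at the same time.
    """
    return 1.0 - (1.0 - unit_availability) ** units


# Assumed example figures, not vendor data.
single = availability(mtbf_hours=9990.0, mttr_hours=10.0)
dual = parallel_availability(single, units=2)

# Expected downtime per year for each configuration, in hours.
single_downtime = (1.0 - single) * HOURS_PER_YEAR
dual_downtime = (1.0 - dual) * HOURS_PER_YEAR

print(f"single unit: {single:.6f} ({single_downtime:.2f} h/year down)")
print(f"dual units:  {dual:.6f} ({dual_downtime:.4f} h/year down)")
```

Under these assumptions, duplicating a unit reduces expected downtime by roughly three orders of magnitude, which is the motivation for the redundant architectures discussed in the rest of this paper.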

High Availability Achieved Through Redundant and Resilient Systems

Redundancy can involve a variety of technologies, all of which pertain to physical backups, whereas
resiliency deals primarily with communication protocols. A redundant device may activate as a result of a
failure, but without built-in resiliency as well, there could be data loss or, worse, an inability to establish
the redundant connection. A resilient system will return to an operable state after encountering trouble.
Therefore, if a risk event knocks a system offline, a highly resilient system will resume its intended work
and function with minimal downtime.

Building a redundant and resilient system requires a holistic mentality. One must prioritize every
foreseeable risk and then determine not only how to reduce the risk in the first place, but also how
to minimize its impact on the system. The need or requirement for redundancy can be based on a set of
system criteria questions:

Does the system need to run around the clock and is downtime unacceptable?

If a system fault occurs, should the primary system switch over to the secondary system seamlessly?

What is the degree to which the data shared between sources and destinations must remain constant
and reliable?

How can single points of failure within the system be minimized and how can one ensure that
components within the infrastructure will not stop the overall operation of the system?


High availability, achieved through redundancy and fault tolerance, is a critical component of many
routing and extension installations, especially in secure visual computing environments. While the loss of
an enterprise system for a few minutes is inconvenient, losing a secure visual computing system can have
disastrous consequences. Some form of redundancy and fault tolerance is generally used if a control
system shutdown or loss of visibility causes a major loss of revenue, loss of equipment, disruption to
public services and/or safety. Redundancy in these situations means the duplication, or even triplication,
of equipment that is needed to operate without disruption, if and when the primary equipment fails
during the mission. In these types of environments the cost of failure is so high that a redundant system
approach is crucial.

By using a fiber-based solution that supports redundant system design, users enjoy highly reliable data
transmission, reduced costs of deployment and a guaranteed upgrade strategy as requirements evolve.
This white paper will touch upon several redundant and fault-tolerant features and architectures
for fiber-based infrastructures, but will focus primarily on Dual Modular Redundancy, otherwise known as
Parallel Redundancy, which is the approach taken by Thinklogical systems. This paper will also highlight
features within the Thinklogical product lines that can help achieve higher availability.

Redundancy on a Component Level

The most important place to start to guarantee reliable operation is to provide redundant,
hot-swappable components. It is also critical that modules or components can be
removed, replaced or added to the system without interruption. Replacements should not need rewiring
or reprogramming. In addition, many innovations have been created, such as state-based control and
self-learning diagnostic routines, which have raised the ability of the controller to detect, annunciate and
describe problems within the components. For many users, the ability to maintain and revise the system
without shutting down offers an acceptable level of availability, especially if the change or repair can be
completed in minutes.



Critical system components:

Uninterruptible power supply (UPS)
Redundant power supplies
Redundant components
- Chassis
- Processors
- I/O modules
- Sensors and actuators
- PCs/HMI
- Networks
- Media
- Servers
- Databases

Thinklogical’s System Contingencies

Power supply redundancy is a very popular means to increase system reliability. A single power supply
failure could have a catastrophic effect that equates to a tremendous amount of lost revenue. The need
for system integrity and guaranteed performance in these demanding conditions necessitates power
redundancy. Therefore, all of Thinklogical’s routing and modular extension products are equipped with
redundant, hot-swappable power supplies.

Thinklogical’s VX and HDX lines of routers are designed with hot-swappable critical system components,
such as cooling fans and pluggable optics (SFP+), thus minimizing business impact in the unlikely event
that a component should fail. The hot-swappable I/O boards also provide excellent in-service expansion
capabilities, allowing the router to be reconfigured without powering it down and interrupting signal
processing. In addition, the HDX Router line is equipped with dual controller cards with the ability
to switch between cards in the event of a failure.



Models of Redundancy

There are a number of common redundancy models used in the industry, such as Standby Redundancy
and Dual Modular Redundancy, or Parallel Redundancy.

Standby Redundancy

Standby Redundancy refers to a configuration where there is an identical secondary unit to back up the
primary unit. Under standby redundancy, the secondary units do not share any of the load and start
operating only when active components fail. In addition, a third party may be needed to monitor the
system and give the command when a switchover condition is met.

In standby redundancy, the components are set to one of three states: Cold, Warm or Hot Standby.
Typically in Cold Standby the secondary unit is powered off in order to preserve the life of the unit. The
disadvantage of this model is that there is a significant time delay in getting the replacement system up
and running. While the hardware and software are available, the unit needs to be powered up before it
can be brought online into a known state.

Warm Standby has a faster response time because the backup (redundant) system is always running and
regularly synchronized with the Device Under Control (DUC). When a failure occurs on the primary
system, the downstream client can disconnect from the failed system and connect to the backup system.
This allows the system to recover fairly quickly (usually within seconds) and continue to work. Although
some data will be lost during this disconnect/reconnect cycle, Warm Standby can be an acceptable
solution where some data loss can be tolerated.

In these types of redundant models the switching is not seamless and adds to the probability of failure
within a given system. To offset this increased probability, additional hardware (a third-party voter) can
be added to the redundancy configuration to assist in the switching from the primary to the secondary
source. While these system components add to the reliability, they are normally connected in series,
which creates a hybrid parallel-series connection and introduces another point of failure for the system.
In addition, the system cost typically doubles with the additional hardware.
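
The effect of a series-connected voter can be estimated with the usual series and parallel availability rules. The following is a rough sketch that assumes independent failures and uses made-up availability figures; the names `UNIT`, `VOTER` and the helper functions are hypothetical and do not describe any particular product.

```python
def parallel(a: float, b: float) -> float:
    """Two redundant units: down only if both are down (independent failures)."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def series(*parts: float) -> float:
    """Components in series: every part must be up for the system to be up."""
    result = 1.0
    for p in parts:
        result *= p
    return result


# Assumed example availabilities, for illustration only.
UNIT = 0.999     # each of the primary and secondary units
VOTER = 0.9995   # external switching device placed in series with the pair

redundant_pair = parallel(UNIT, UNIT)
with_voter = series(redundant_pair, VOTER)

print(f"redundant pair alone: {redundant_pair:.6f}")
print(f"pair behind a voter:  {with_voter:.6f}")
```

With these numbers the redundant pair alone is better than the pair behind the voter, because the overall figure can never exceed the availability of the series element itself. This is the sense in which the voter introduces another point of failure.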



Hot Standby means that both the primary and secondary data systems run simultaneously and both are
providing identical data streams to the downstream client. If the primary system fails, the switchover to
the secondary system is intended to be completely seamless, or “bumpless,” with no data loss. Hot
Standby is the best choice for systems that cannot tolerate the data loss of a Cold or Warm Standby
system. There are some variations of the Hot Standby model, such as Dual Modular Redundancy, also
called Parallel Redundancy. The differentiating factor between these variations is how tightly the
primary and secondary units are synchronized.

Dual Modular Redundancy (DMR) or Parallel Redundancy

The approach of having multiple units running completely synchronized and in parallel is known as
DMR, or Parallel Redundancy. This model typically has rapid switchover time.

There are three basic tenets of dual system redundancy:
1. Physical separation of signal paths
2. Dual-chassis redundant signal controllers
3. Synchronization of status information

A DMR routing and extension system is configured with tightly synchronized primary and
secondary routers running in parallel. These routers mirror one another, with identical signals being
sent through both of them at the same time. These signals are sent to their destination at a receiver
component. Deciding which unit is correct can be challenging if you have more than one router.
Having to choose which unit you are going to “trust the most” defeats the purpose, because it arbitrarily
gives one router priority without dynamic review of operating parameters. Monitoring the system and
determining when to switch to the secondary unit can also be complicated.



The Thinklogical Advantage

Thinklogical has designed a cost-effective, resilient solution to take the complexity out of the DMR approach.
The feature is designed into the SDI Xtreme 3G+ Receivers, and is known as a “switchover capability.” This allows
the component to receive identical streams on both input fibers. By default it will attempt to synchronize to the
‘primary’ fiber by searching for the synchronization characters in the received stream. Simultaneously, it will also
check the ‘secondary’ fiber and attempt to synchronize to its stream. After a pre-determined amount of time,
whichever stream the receiver locks on to will be selected and the SDI data will then be decoded from that
stream. In the event that the selected stream loses synchronization, the receiver will automatically switch to the
other stream. There will be minimal loss of SDI video during this switchover. In order to prevent switching back
and forth on an intermittent signal, the receiver will continue to use the ‘switched-over’ stream regardless
of whether or not it re-acquires lock to the original stream. If an event occurs such that the switched-over
stream loses lock, then the receiver will attempt to switch back over to the original stream.
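
The selection behavior described above can be sketched as a small state machine. This is a toy model written for illustration, not Thinklogical firmware: the class name `SwitchoverReceiver` and its `update` method are hypothetical, the pre-determined acquisition time is omitted, and two behaviors are assumptions not stated in the text (the primary is preferred when both streams lock at start, and the last selection is kept when neither stream has lock).

```python
from typing import Optional

PRIMARY = "primary"
SECONDARY = "secondary"


class SwitchoverReceiver:
    """Toy model of sticky stream selection between two input fibers."""

    def __init__(self) -> None:
        self.selected: Optional[str] = None

    def update(self, primary_locked: bool, secondary_locked: bool) -> Optional[str]:
        """Feed the current lock status of both fibers; return the selection."""
        locked = {PRIMARY: primary_locked, SECONDARY: secondary_locked}

        if self.selected is None:
            # Initial acquisition: take whichever stream has lock.
            # Preferring the primary when both lock is an assumption.
            for name in (PRIMARY, SECONDARY):
                if locked[name]:
                    self.selected = name
                    break
            return self.selected

        if locked[self.selected]:
            # Sticky: stay here even if the other stream has re-acquired lock.
            return self.selected

        other = SECONDARY if self.selected == PRIMARY else PRIMARY
        if locked[other]:
            self.selected = other
        # If neither stream is locked, keep the last selection (assumption).
        return self.selected


rx = SwitchoverReceiver()
history = [
    rx.update(True, True),    # both streams locked at start
    rx.update(False, True),   # primary loses lock
    rx.update(True, True),    # primary returns, receiver does not switch back
    rx.update(True, False),   # secondary loses lock
]
print(history)
```

The third step is the important one: the receiver stays on the switched-over stream even though the original stream has recovered, which is what prevents flapping on an intermittent signal.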



This synchronization scheme is designed to maximize uptime in the event of a failure at any point in the
system. Interestingly, this approach mirrors the classic design common among disaster recovery
implementations. In fact, most highly available systems stick to this simple design pattern: a single,
high-quality, multi-purpose physical system with comprehensive internal resiliency running
interdependent functions, paired with a second, physically separated, duplicate system. The overriding
purpose of this design is the prevention of, or rapid recovery from, a failure, which allows a system to
continue to operate despite a partial or complete failure of any significant component.

Summary

The idea of redundancy is not difficult to grasp, but implementing it takes some thought. An initial
decision on Cold, Warm or Hot Standby will impact all aspects of the implementation. The choice of
proper hardware and robust system architecture is critical for a well-functioning system.

It is clear that organizations cannot fully leverage the benefits of redundancy models without a
comprehensive routing and extension solution. Thinklogical’s system solutions offer innovative
organizations the ability to create high-density, scalable and redundant system architectures that
deliver broad functionality and provide high ROI. It is very important to keep in mind that lower
system cost doesn’t always equal lower total cost of ownership. More importantly, the cost of one
unplanned shutdown can far outweigh the costs of redundancy. If data connectivity is crucial to the
success of the company or organization, it would be wise to consider installing a
redundant system and to weigh the options carefully when choosing the key components.

About Thinklogical
Thinklogical is the leading manufacturer and provider of fiber optic KVM/video extension solutions,
and fiber matrix routers and switches. Organizations worldwide rely on Thinklogical's products and
solutions for optimal performance in secure visual computing environments. Through pioneering next
generation fiber optic extension, switching, and server management technologies, Thinklogical helps
customers reduce cost and simplify the management of complex computing infrastructures.

© 2011 Thinklogical. All rights reserved.

Thinklogical claims or other product information contained in this document are subject to change
without notice. This document may not be reproduced, in whole or in part, without the express written
consent of Thinklogical.

September 2011
