EOUG95 - Client Server Very Large Databases - Paper
1. David Walker Page 1
BUILDING LARGE SCALEABLE CLIENT/SERVER SOLUTIONS
David Walker
European Oracle User Group, Firenze, Italia, 1995
Summary
The replacement of 'Legacy' systems by 'Right-Sized' solutions has led to a growth in
the number of large open client/server installations. In order to achieve the required
return on investment these solutions must be at the same time flexible, resilient and
scaleable. This paper sets out to describe the building of an open client/server solution
by breaking it up into components. Each component is examined for scaleability and
resilience. The complete solution, however, is more than the sum of the parts and
therefore a view of the infrastructure required around each element will be discussed.
Introduction
This paper will explore the proposition that large scale client/server solutions can be
designed and implemented by de-constructing the architecture into quantifiable
components.
This can be expressed as a model based on the following four server types:
Database Server
Application Server
Batch Server
Print Server
Each of these components must have two characteristics. The first characteristic is
resilience. Having a single point of failure will mean that the whole solution may fail.
This must be designed out wherever possible. The second characteristic is scaleability.
All computers have limitations in processing power, memory and disk capacity. The
solution must take into account hardware limitations and be designed to make
optimum use of the underlying platform.
It is assumed that the system described will be mission critical, with over two hundred
concurrent users and a database not less than ten gigabytes. The growth of these sites to
thousands of concurrent users and hundreds of gigabytes of data is made easier by
designing the system to be scaleable.
Database Server
The Database Server is the component traditionally thought of as 'The Server'. It is here
that the Oracle RDBMS deals with requests, normally via SQL*Net. A resilient
Database Server will be spread over more than one node of a clustered system using
Oracle Parallel Server [OPS] in case of hardware failure within a single node.
A clustered system is one where homogeneous hardware nodes are linked together.
They share a common pool of disk, but have discrete processor and memory. There is a
Lock Manager [LM] which manages the locking of data and control objects on the
shared disks. The Lock Manager operates over a Low Latency Inter-connect [LLI]
(typically Ethernet or FDDI) to pass messages between nodes. The disks are connected
to all nodes through a High Bandwidth Inter-connect [HBI] (typically SCSI).
Clustering a system provides additional resilience, processing power and memory
capacity. However, since the complete database must be available on shared disk, the
database can grow no larger than the storage configurable on a single node.
The disks within such mission critical systems are also required to meet the resilience
and scaleability criteria. The Redundant Array of Inexpensive Disks [RAID]
technologies provide a solution. RAID level 0 provides striping across many disks for
high performance, but it is fault intolerant. RAID level 1 implements disk mirroring,
where each disk has an identical copy. RAID level 5 offers striping with parity. Whilst
RAID level 5 is cheaper than full mirroring there are performance issues when a failure
occurs. In the type of system that is being examined in this paper the solution is to use
RAID 0 and RAID 1; striping and mirroring.
The basic level of resilience is to start an OPS instance on each of two nodes. All users
are then connected to one node, the other node participates in the OPS database but no
connections are made to the database. When the active node fails, all users migrate to
the alternate node and continue processing. The precise method of migration will
depend on the network protocol in use. Any existing network connections will have to
be re-established to the alternate node and the current transaction will be lost. The
reconnection will require users to be directed to the alternate node. This would typically
be dealt with transparently from the 'Application Server' (See below).
As the application grows the load may exceed the processing power, or memory
capacity of a single node. At this point the users can be distributed over multiple nodes.
In order to do this the application must have been designed with OPS in mind. This
requires that, where possible, the applications are 'partitioned'.
A large airline system, for example, may have applications for 'Reservations' and for
'Flight Operations'. Select statements will involve blocks being read from common
lookup tables. Inserts and updates will take place on tables mainly within the set of
tables belonging to one or other application. Partitioning these tables reduces the lock
traffic being passed between each node and consequently improves the overall
performance of the OPS system. Where it is impossible to partition an application more
sophisticated techniques are required. A detailed discussion of these techniques is
beyond the scope of this paper.
It is also important to make some decisions about the use of stored procedures. These
are a powerful tool that can reduce coding in front-end applications. They can also
ensure code integrity as changes to standard packages need only be modified in one
place. They do, however, force the load back into the Database Server. This means that
scaleability is limited by the processing capacity of the Database Server. This, as it will
be shown later, may not be as scaleable as putting the work back into the Application
Server (See Below).
A number of concerns exist about buying 'redundant' equipment. For two node
systems the second node could, for example, be utilised for a Decision Support System
[DSS] database. This normally fits well with the business requirements. A DSS is
normally built from the transactional data of the On-line Transaction Processing
[OLTP] system. This will transfer the data via the system bus rather than over a network
connection. If the node running the OLTP system fails, DSS users can be disabled and
the OLTP load transferred to the DSS node.
Where there are two or more OLTP nodes a strategic decision can be made to limit
access to the 'most important' users up to the capacity of the remaining node(s). This is
inevitably difficult to decide and even harder to manage. When one node has failed
pressure is on to return the system to service. Having to make decisions about the
relative importance of users may be impossible.
The OPS option is available with Oracle 7.0. With Oracle 7.1 the ability to use the
Parallel Query Option [PQO] became available. This allows a query to be broken up
into multiple streams and run in parallel on one node. This is not normally useful in the
daytime operations of an OLTP system, as it is designed to make use of full table scans
that return a large volume of data. Batch jobs and the DSS database may however take
advantage of PQO for fast queries. In Oracle 7.2 the ability to perform PQO over
multiple nodes becomes available. This is most useful to perform the rapid extract of
data in parallel from the OLTP system running across N minus one nodes whilst
inserting into the DSS node.
Backup of the system is also critical. A system of this size will most probably use the
Oracle on-line or 'hot' backup ability of the database. Backup and recovery times are
directly related to the number of tape devices, the size of each device, the read/write
speed of the devices and the I/O scaleability of the system and the backup software. If
the system provides backup tape library management then it is desirable to use a larger
number of short tape devices with high read/write speeds. Tapes such as 3480 or 3490
compatible cartridges are therefore desirable over higher capacity but less durable and
slower devices such as DAT or Exabyte.
Recovery will require one or more datafiles to be read from tape. These can be quickly
referenced by a backup tape library manager that prompts for the correct tape to be
inserted. Large tapes may contain more than one of the required datafiles and may have
the required datafile at the end of a long tape. This does not affect backup time but can
have a considerable effect on the time taken to recover. Since all nodes are unavailable a
quick recovery time is important.
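The relationship between device count, device speed and recovery time can be sketched as a back-of-envelope calculation. The figures below are illustrative assumptions only, not vendor specifications:

```python
# Rough recovery-time estimate; all figures are illustrative assumptions.
def recovery_hours(data_gb, drives, mb_per_sec_per_drive):
    """Hours to restore data_gb across parallel tape drives, ignoring
    tape-change and positioning overhead (which favours short tapes)."""
    total_mb = data_gb * 1024
    seconds = total_mb / (drives * mb_per_sec_per_drive)
    return seconds / 3600

# e.g. a 50 GB database on four 3480-class drives at an assumed 3 MB/s each:
print(round(recovery_hours(50, 4, 3.0), 1))
```

Doubling the number of drives halves the restore time only while the I/O subsystem can feed them all, which is why the I/O scaleability of the system matters as much as the devices themselves.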
Further enhancements to the speed of backup will be made possible by using Oracle
Parallel Backup and Restore Utility [OPBRU]. This will integrate with vendor specific
products to provide high performance, scaleable, automated backups and restores via
robust mainframe class media management tools. OPBRU became available with
Oracle 7.0.
The optimal solution for the Database Server is based on an N node cluster. There
would be N minus one nodes providing the OLTP service and one node providing the
DSS service. The DSS functionality is replaced by OLTP when a system fails. The
number and type of backup devices should relate to the maximum time allowed for
recovery.
Application Server
The Application Server configuration can be broken down into two types. The first type
is called 'host based', the second is 'client based'. These two techniques can either be
used together in a single solution, or one can be used exclusively.
A host based Application Server is where the user connects to a host via telnet or rlogin
and starts up a session in either SQL*Forms or SQL*Menu. This typically uses
SQL*Forms v3, character based SQL*Forms v4, or the MOTIF based GUI front-end to
SQL*Forms v4. This type of application is highly scaleable and makes efficient use of
machine resources. The terminal is a low resource device of the 'Thin-client' type. The
amount of memory required per user can be calculated as the sum of the size of
concurrently open forms. The CPU load can be measured by benchmarking on a per
SQL*Forms session basis and scaled accordingly. Disk requirement equals the size of
the code set for the application, plus the size of the code set for Oracle.
Each host based Application Server can potentially support many hundreds of users,
dependent on platform capability, and the growth pattern is easily predicted. When a
host based Application Server is full, a second identical machine can be purchased. The
code set is then replicated to that machine and users connected. Resilience is achieved
by having N+1 host based Application Servers, where N is the number of machines
required to support the load. The load is then distributed over all the machines. Each
machine is then 100*N/(N+1)% busy. In the case of a host based Application Server
failure 100/(N+1)% of the users are redistributed causing minimal impact to the users as
a whole. It should be noted that these machines are not clustered; they only hold
application code that is static.
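The N+1 arithmetic above can be checked with a trivial sketch:

```python
# The N+1 resilience arithmetic from the text:
# N servers carry the full load, plus one spare.
def utilisation_pct(n):
    """Steady-state busy percentage of each of the N+1 servers."""
    return 100 * n / (n + 1)

def redistributed_pct(n):
    """Share of users moved when one server fails."""
    return 100 / (n + 1)

# With N = 3 (four machines in total):
print(utilisation_pct(3))    # each machine runs at 75% in normal operation
print(redistributed_pct(3))  # a failure redistributes 25% of the users
```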
Since the machines are not clustered it is possible to continue to add machines
indefinitely (subject to network capacity). This relates back to the issue of stored
procedures mentioned above. The Database Server is limited by the maximum
processing power of the cluster. The host based Application Server can always add an
additional node to add power. This means that the work should take place on the host
based Application Server rather than the Database Server. The more scaleable solution
is therefore to put validation into the SQL*Forms, rather than into stored procedures. In
practice, it is likely that a combination of stored procedures and client based procedures
will provide the best solution.
Host based Application Servers also make it easy to manage transition between nodes in
the Database Server when a node in the Database Server fails. A number of methods
could be employed to connect the client session to the appropriate server node. A simple
example is that of a script monitoring the primary and secondary nodes. If the primary
Database Server node is available then its host connection string is written to a
configuration file. If it is not available then the connection string for a secondary
Database Server node is written to the configuration file. When users reconnect they
read the configuration file and connect to the correct server.
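A minimal sketch of such a monitoring script follows, written here in Python purely for illustration. The host names, listener port, configuration file path and connect strings are all hypothetical:

```python
# Sketch of the failover monitor described above. Host names, the probe
# method and the connect strings are assumptions for illustration only.
import socket

PRIMARY, SECONDARY = "dbnode1", "dbnode2"
CONNECT = {"dbnode1": "T:dbnode1:ORCL", "dbnode2": "T:dbnode2:ORCL"}

def node_available(host, port=1521, timeout=2):
    """Crude liveness probe: can a TCP connection reach the listener?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def write_config(path):
    """Write the connect string of whichever node is currently available."""
    node = PRIMARY if node_available(PRIMARY) else SECONDARY
    with open(path, "w") as f:
        f.write(CONNECT[node] + "\n")
```

In practice the probe would be run periodically, and reconnecting users would simply read the latest connect string from the shared configuration file.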
Client based Application Servers are used in conjunction with the more traditional client
side machines. Here a Personal Computer [PC] runs the SQL*Forms which today is
likely to be Windows based SQL*Forms v4. This is the so called 'Fat Client'. In a large
environment the management of the distribution of application code to perhaps several
thousand PCs is a system administrator's nightmare. Each PC would more than likely
need to be a powerfully configured machine requiring a 486 or better processor, with 16
Mb of memory and perhaps 0.5 Gb of disk to hold the application. Best use of the
power and capacity available is not made as it is an exclusive, rather than shared,
resource.
Microsoft's Systems Management Server [SMS] has recently become available. This
may help with the management of many networked PCs. SMS provides four main
functions: an inventory of hardware and software across the network; the management
of networked applications; the automated installation and execution of software on
workstations; and the remote diagnostics and control of workstations. Although now in
production, this software has yet to be proven on large scale sites.
The client based Application Server can help by serving the source code to each PC,
typically by acting as a network file server. The SQL*Forms code is then downloaded
to the PC at run-time. This overcomes the problem of disk capacity on the client PC and
helps in the distribution of application code sets. It does create the problem of
potentially heavy local network traffic between the PC and the client based Application
Server as the SQL*Forms are downloaded. It also does not help the requirement for
memory and processor power on the PC. The client based Application Server can also
be used to hold the configuration file in the same manner as the host based Application
Server. This again deals with the problem of Database Server node failure.
The growing requirement for Graphical User Interfaces [GUI] may be driven by a
misguided focus on the part of those in control of application specification. Windows
based front-end systems are seen as very attractive by the managers who are considering them.
basic operator using the system however is usually keyboard, rather than mouse,
orientated. Users will typically enter a code and only go to a lookup screen when an
uncommon situation arises.
An example where a GUI is inappropriate is that of a Bureau de Change whose old
character based system accepted UKS for UK Sterling and USD for US Dollars. In all,
entering the codes required eight keystrokes and took about five seconds. The new system
requires the use of a drop down menu. A new 'mousing' skill was needed by the operator
to open the window. More time was spent scrolling to the bottom of the window. A
double click was then used to select the item. This was repeated twice. After a month of
use, the best time achieved by the operator was about thirty seconds. This is six times
longer than with the original character based system and any delay is immediately
obvious as the operation is carried out in front of the customer.
A large scaleable client/server system can take advantage of both methodologies. The
general operator in a field office can make a connection to a host based Application
Server and use the traditional character based system. This is a low cost solution as the
high power PCs are not required and the resources are shared. The smaller number of
managers and 'knowledge workers' who are based at the head office or in regional
offices can then take advantage of local client based Application Servers and GUIs.
Although these users require high specification PCs the ratio of knowledge workers to
operators makes the investment affordable.
In order to make the task of building application code simpler, it is advantageous to
code for a standard generic batch and print service. Since the platform each component
is running on may be different (for example: a UNIX Database Server, a Windows/NT
client based Application Server, etc.), SQL*Forms user exits should be coded to one
standard for batch and printing. A platform specific program should then be
implemented on each server to work with a local or remote batch queue manager.
Batch Server
Batch work varies from site to site, and is often constrained by business requirements. A
Batch Server will manage all jobs that can be queued. The Batch Server may also act as
the Print Server (See below). The batch queues have to be resilient and when one node
fails another node must be capable of picking up the job queues and continuing the
processing. Some ways of achieving this resilience are suggested below.
The first is to implement the queue in the database. The database is accessible from all
nodes and the resilience has been achieved. The alternate node therefore need only
check that the primary node is still running the batch manager, and if not start up a local
copy. The second method is for the Batch Server to be clustered with another machine
(perhaps the Print Server), and a cluster-aware queue manager product employed. The
queues set up under such a product can be specified as load-balancing across clustered
nodes, or queuing to one node, with automatic job fail-over and re-start to an alternate
node on failure. An alternative clustered implementation would be to maintain the
queue on shared disk. When the Batch Server node fails the relevant file systems are
released and acquired by the alternate node, which examines the queue and restarts from
the last incomplete job.
Two other requirements exist. The first is the submission of jobs; the second is the
processing of jobs. The submission of jobs requires that the submitting process sends a
message to the batch manager or, in the case of a queue held in the database, inserts a
row into the queue table. This can be done via the platform specific code discussed
above. If the queue is held in the database the workload is increased on the Database
Server as the batch queue is loaded, however it is also easy to manage from within the
environment.
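The database-held queue can be sketched as below. The example uses sqlite3 purely as a self-contained stand-in for the Oracle database, and the table and column names are assumptions:

```python
# The database-held batch queue pattern, with sqlite3 standing in for
# Oracle so the sketch is self-contained; names are assumptions.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE batch_queue (
    job_id       INTEGER PRIMARY KEY,
    submitted_by TEXT,
    command      TEXT,
    status       TEXT DEFAULT 'QUEUED')""")

def submit_job(user, command):
    """Submission is just an insert; every node can see the queue."""
    cur = db.execute(
        "INSERT INTO batch_queue (submitted_by, command) VALUES (?, ?)",
        (user, command))
    db.commit()
    return cur.lastrowid

def next_job():
    """A batch manager on any surviving node claims the oldest queued job."""
    row = db.execute(
        "SELECT job_id, command FROM batch_queue "
        "WHERE status = 'QUEUED' ORDER BY job_id LIMIT 1").fetchone()
    if row:
        db.execute("UPDATE batch_queue SET status = 'RUNNING' "
                   "WHERE job_id = ?", (row[0],))
        db.commit()
    return row
```

Because the queue lives in the shared database, the alternate node inherits it automatically when it starts its own batch manager.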
Job processing depends on the location of the resource to perform the task. The task can
be performed on the Batch Server, where it can have exclusive access to the machine's
resources. This requires that the code to perform the task is held in one place and the
ability to process batch requests is therefore limited by the performance of the Batch
Server node.
Alternatively the Batch Server can distribute jobs to utilise available Application Server
systems. These machines must each contain a copy of the batch processing code and a
batch manager sub-process to determine the level of resource available. If an
Application Server finds that the system has spare capacity then it may take on a job.
The batch manager sub-process then monitors the load, accepting more jobs until a pre-
defined load limit is reached.
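The sub-process's accept or decline decision might be sketched as follows. The `os.getloadavg` call is UNIX-specific, and the load limit is an assumed site policy rather than a prescribed value:

```python
# Sketch of the batch manager sub-process's accept/decline decision.
# os.getloadavg is UNIX-specific; the limit is an assumed site policy.
import os

LOAD_LIMIT = 4.0  # pre-defined limit for this Application Server

def spare_capacity(limit=LOAD_LIMIT):
    """True while the 1-minute load average is below the limit."""
    one_minute = os.getloadavg()[0]
    return one_minute < limit

def maybe_accept(job, run, limit=LOAD_LIMIT):
    """Take on a job only when the node reports spare capacity."""
    if spare_capacity(limit):
        run(job)
        return True
    return False
```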
In practice it is common to find many batch jobs are run in the evening when the
Application Servers are quiet, with a small number of jobs running during the day. To
manage this effectively the system manager may enable the batch manager sub-
processes on the Application Servers only outside the main OLTP hours. During the
OLTP operational window batch jobs are then limited to the Batch Server. This
capitalises on available resource whilst ensuring that batch and OLTP processing do not
interfere.
It should be noted that all jobs submit work to the Database Server. Large batch jobs
that take out locks will inhibit the performance of the database as a whole. It is
important to categorise jobs that can be run without affecting OLTP users and those
which can not. Only those tasks that do not impact the Database Server should be run
during the day.
Batch Servers of this nature are made from existing products combined with bespoke
work tailored to meet site requirements.
Print Server
The Print Server provides the output for the complete system. There are many possible
printing methods that could be devised. This paper will look at three. The first is the
distributed Print Server. Each place where output is produced manages a set of queues
for all possible printers it may use. Printers need to be intelligent and able to manage
new requests whilst printing existing ones. As the load increases the chance of a print-
out being lost or of a queue becoming unavailable increases. The management of
many queues also becomes very complex.
An alternative is to use a central queue manager. This can be integrated with the batch
manager mentioned above and use either of the queuing strategies suggested. The pitfall
with this method is that the output file has to be moved to the Print Server before being
sent to the printer. This involves a larger amount of network traffic that may cause
overall performance problems. The movement of the file from one system to another
may be achieved either by file transfer or by writing the output to a spool area that is
Network File System [NFS] mounted between the systems. This will also stress the
print sub-system on the Print Server.
A third method requires the print task be submitted to a central queue for each printer on
the Print Server (or in the database). This requires that the task submission includes the
originating node. When the task is executed a message is sent back to the submitting
node. The Print Server sub-process then sends the output to the local queue. On
completion the sub-process notifies the Print Server. The Print Server can then queue
the next job for a given printer. In this way the network is not saturated by sending
output files over the network twice, but there is a cost associated with managing the
queues on machines that can submit print requests.
As has already been suggested the Print Server can either be combined with the Batch
Server, or clustered with it for resilience. When a failure occurs each can take on the
role of the other.
The above illustrates that printer configuration requires considerable thought at the
development stage. Each site will have very specific requirements and generic solutions
will have to be tailored to meet those requirements.
As previously mentioned, this type of solution requires a combination of existing
product and bespoke work.
Backup and Recovery
Backup and recovery tasks have already been identified for the Database Server. This is
by far the most complex server to backup as it holds the volatile data. All the other
servers identified remain relatively static. This is because they contain application code
and transient files that can be regenerated if lost. Most systems are replicas of another
(Application Servers being clones and the Batch/Print Servers being clustered), but full
system backups should be regularly taken and recovery strategies planned.
In the case of a complete black-out of the data centre it is important to understand what
level of service can be provided by a minimal configuration. The size of the systems
means that a complete hot site is unlikely to be available. The cost of the minimal
configuration should be balanced against the cost to the business of running with a
reduced throughput at peak times. The full recovery of a system is non-trivial and
requires periodic full rehearsals to ensure that the measures in place meet the
requirements and that the timespans involved are understood.
Systems Management
Effective management of a large number of complex systems requires a number of
specialist tools. Many such tools have become available over the last two years. Most
have a component that performs the monitoring task and a component that displays the
information. If the monitoring component runs on a server then it can increase the
load on that server. Some sites now incorporate SNMP Management Information Bases
[MIBs] into their applications and write low-impact SNMP agents to look at the system
and database. Information from these MIBs can be requested by the remote console
component. It is desirable that any monitoring tool has the ability to maintain a history,
trigger alerts and perform recovery actions. Long term monitoring of trends can assist in
predicting failures and capacity problems. This information will help avoid system
outages and maintain availability.
Networking
The network load on these systems can be very complex. It is recommended that a
network such as FDDI with its very high bandwidth is provided between the servers for
information flow. Each machine should also have a private Ethernet or Token Ring for
systems management. This network supports the agent traffic for the system
management and system administrators' connections. The users are only attached to the
Application Servers via a third network. Where clusters are in use each LLI will also
require network connections. The distribution over many networks aids resilience and
performance.
Integrating the Solution
The methodology discussed above is complex. Much of the technology used will be
leading edge. Many design decisions will be made before the final hardware platform is
purchased. A successful implementation requires hardware (including the network), the
database and an application. Each element must have a symbiotic relationship with the
others. This has worked best where hardware vendor, database supplier, application
provider (where appropriate) and customer put a team together and work in partnership.
Staffing of the new system will be on a commensurate scale requiring operators and
administrators for the system, database and network. A '7 by 24' routine and cover for
leave and sickness means that one person in each role will be insufficient.
Conclusion
The building of a large scaleable client/server solution is possible with existing
technologies. An important key to success is to design the system to grow. This should
be done regardless of whether the current project requires it or not. The easiest way to
manage the design process is to break up the system into elements and then examine
each element for scaleability and resilience. Then bring all the elements together and
ensure that they work together. Do not attempt to do this on your own but partner with
your suppliers to get the best results from your investment.