Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Amazon Web Services
Get a look under the hood: understand how to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to speed up your queries and improve overall database performance. You’ll also hear how the University of Technology Sydney (UTS) is using Redshift. UTS will describe how Amazon Redshift enabled agility in dealing with data quality, the capacity to scale when required, and streamlined development processes through rapid provisioning of data warehouse environments.
Speaker: Ganesh Raja, Solutions Architect, Amazon Web Services with Susan Gibson, Manager, Data and Business Intelligence, UTS
Level: 300
Apache BookKeeper State Store: A Durable Key-Value Store - Pulsar Summit NA 2021StreamNative
Apache Pulsar is used for various streaming use cases. There is a strong requirement for storing checkpoints while processing streams in Pulsar Functions, so that after any interruption the stream processing engine can resume from the last checkpoint.
Pulsar uses ZooKeeper not only for critical tasks such as leader election and service discovery, but also for storing various metadata. This puts unnecessary load on ZooKeeper and hampers its mission-critical use.
A durable key-value store based on the Apache Pulsar ecosystem addresses the above use cases nicely.
This talk focuses on taking the existing Apache BookKeeper Table Service/State Store implementation to production. It also touches on contributing the features, bug fixes, tools, and other improvements back to open source.
Playing Flappy Bird with deep reinforcement learning in Keras, a deep learning library in Python, and optimizing the network using techniques like experience replay.
Amazon RDS with Amazon Aurora | AWS Public Sector Summit 2016Amazon Web Services
This session provides the attendee with an overview of Amazon RDS across different database types and then dives deep into the benefits and performance of Amazon Aurora.
PostgreSQL Replication High Availability MethodsMydbops
These slides illustrate the need for replication in PostgreSQL: why you need a replication DB topology, terminologies, replication nodes, and more.
Modeling Data and Queries for Wide Column NoSQLScyllaDB
Discover how to model data for wide column databases such as ScyllaDB and Apache Cassandra. Contrast the differences from traditional RDBMS data modeling, going from a normalized “schema first” design to a denormalized “query first” design. Plus, learn how to use advanced features like secondary indexes and materialized views to get the answers you need from the same base table.
jQuery is a fast, small, and feature-rich JavaScript library. It makes things like HTML document traversal and manipulation, event handling, animation, and Ajax much simpler with an easy-to-use API that works across a multitude of browsers. With a combination of versatility and extensibility, jQuery has changed the way that millions of people write JavaScript.
Come see how easy it is to build fast, accurate, and responsive web UIs using the React library. Even if you’ve never written Javascript before, React’s straightforward syntax can get you started with your UI project quickly. In this session, you’ll learn about React’s declarative syntax and state representation, explore some of the basic components that are used to build sophisticated UIs, and leave with a foundational application you can continue to build on.
In JavaScript, almost "everything" is an object.
-Booleans can be objects (if defined with the new keyword)
-Numbers can be objects (if defined with the new keyword)
-Strings can be objects (if defined with the new keyword)
-Dates are always objects
-Math is always an object
-Regular expressions are always objects
-Arrays are always objects
-Functions are always objects
-Objects are always objects
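The list above can be verified directly in any JavaScript runtime; a minimal sketch:

```javascript
// Primitives vs. objects in JavaScript: wrapping a primitive
// with the `new` keyword produces an object.
const primitiveString = "hello";
const objectString = new String("hello");

console.log(typeof primitiveString); // "string"
console.log(typeof objectString);    // "object"

// Dates, regular expressions, arrays, and functions are always objects.
console.log(new Date() instanceof Object);       // true
console.log(/abc/ instanceof Object);            // true
console.log([1, 2, 3] instanceof Object);        // true
console.log((function () {}) instanceof Object); // true
```

Note that the wrapper forms are rarely wanted in practice; the primitive forms compare by value, while the object forms compare by reference.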
This tutorial will explain Responsive Website Design.
Key Concepts: Responsive Web Design, Website Designing, Mobile is the Future, What is Responsive Web Design?, Why do we need responsive web design?, Adaptive vs. Responsive web design, Key benefits of Responsive web design, How does it work?, How can you implement Responsive Websites?, Frameworks, Demo
For more detail visit Tech Blog:
https://msatechnosoft.in/blog/
Node.js Tutorial for Beginners | Node.js Web Application Tutorial | Node.js T...Edureka!
This Edureka "Node.js tutorial" will help you to learn the Node.js fundamentals and how to create an application in Node.js. Node.js is an open-source, cross-platform JavaScript runtime environment for developing a diverse variety of server tools and applications. Below are the topics covered in this tutorial:
1) Client Server Architecture
2) Limitations of Multi-Threaded Model
3) What is Node.js?
4) Features of Node.js
5) Node.js Installation
6) Blocking vs. Non-Blocking I/O
7) Creating Node.js Program
8) Node.js Modules
9) Demo – Grocery List Web Application using Node.js
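As a taste of topics 8 and 9, here is a minimal sketch of a grocery-list module using Node.js CommonJS exports. The module name and API below are illustrative assumptions, not the tutorial's actual demo code:

```javascript
// grocery.js - a hypothetical in-memory grocery list module (CommonJS).
const items = [];

function addItem(name) {
  items.push(name);
  return items.length; // new list size
}

function listItems() {
  return [...items]; // return a copy so callers can't mutate internal state
}

// Node.js modules expose their public API via module.exports.
module.exports = { addItem, listItems };
```

A web application would typically expose such a module over HTTP, for example with the built-in `http.createServer` or a framework such as Express.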
Alcatel-Lucent Cloud: Network Functions Virtualization - The New Virtual Real...Alcatel-Lucent Cloud
Companies are facing cloud challenges: capacity expansion, commoditization, the merging of network and data center, the network becoming a programmable platform, and transformation at web speed. The answer is parallel scaling and the creation of the NFV Industry Group. NFV will greatly enhance the ability of "network applications" to elastically scale to meet changing demand patterns. Alcatel-Lucent is leading the way in this new reality.
Are you facing some, or all, of these challenges?
-Host Mobility (w/o stretching VLANs)
-Network Segmentation (w/o implementing MPLS)
-Roles-based Access Control (w/o end-to-end TrustSec)
-Common Policy for Wired and Wireless (w/o multiple tools)
Using Cisco technologies already available today, you can overcome these challenges and build an evolved Campus network to better meet your business objectives.
21st Century Service Oriented ArchitectureBob Rhubart
Service Oriented Architecture has evolved from concept to reality in the last decade. The right methodology coupled with mature SOA technologies has helped customers demonstrate success in both innovation and ROI. In this session you will learn how Oracle SOA Suite’s orchestration, virtualization, and governance capabilities provide the infrastructure to run mission critical business and system applications. And we’ll take a special look at the convergence of SOA & BPM using Oracle’s Unified technology stack.
(As presented by Samrat Ray at Oracle Technology Network Architect Day in Chicago, October 24, 2011.)
Stay productive while slicing up the monolithMarkus Eisele
Microservices-based architectures are in vogue. Over the last couple of years, we have learned how thought leaders implement them, and it seems like every other week we hear about how containers and platform-as-a-service offerings make them ultimately happen.
Tech Talent Night Copenhagen 11/22/17
https://greenticket.dk/techtalentnightcph
Presentation of Vincent Desveronnieres, Oracle at the TMT.CloudComputing'11 Warsaw conference organized in Warsaw, Poland on February 10th, 2011 by New Europe Events
Presentation of the talk given by Carmine Spagnuolo (Postdoctoral Research Fellow, Università degli Studi di Salerno / ACT OR), titled "Technology insights: Decision Science Platform", at the Decision Science Forum 2019, the most important Italian event on decision science.
Siddhi: A Second Look at Complex Event Processing ImplementationsSrinath Perera
Today there is so much data available from sources like sensors (RFID, Near Field Communication), web activities, transactions, social networks, etc. Making sense of this avalanche of data requires efficient and fast processing. Processing high volumes of events to derive higher-level information is a vital part of making critical decisions, and Complex Event Processing (CEP) has become one of the most rapidly emerging fields in data processing. e-Science use cases, business applications, financial trading applications, operational analytics applications, and business activity monitoring applications are some of the use cases that directly use CEP.
This paper discusses different design decisions associated with CEP engines, and proposes some approaches to improve CEP performance by using more stream-processing-style pipelines. Furthermore, the paper discusses Siddhi, a CEP engine that implements those suggestions. We present a performance study showing that the resulting CEP engine, Siddhi, has significantly improved performance. The primary contributions of this paper are performing a critical analysis of CEP engine design, identifying suggestions for improvement, implementing those improvements in Siddhi, and demonstrating the soundness of those suggestions through empirical evidence.
Known Unknowns: Testing in the Presence of Uncertainty (talk at ACM SIGSOFT F...David Rosenblum
talk presented in the Visions & Challenges Track of the ACM SIGSOFT 22nd International Symposium on the Foundations of Software Engineering (FSE 2014), Hong Kong, 20 November 2014; the paper won 2nd Prize in the track
Jogging While Driving, and Other Software Engineering Research Problems (invi...David Rosenblum
invited talk presented for the Distinguished Lecturer Series of the Department of Computer Science at the University of Illinois at Chicago, 10 April 2014
invited talk presented for the Distinguished Speaker Series of the Institute for Software Research (ISR) at the University of California, Irvine, 5 April 2013
SIGSOFT Impact Award: Reflections and Prospects (invited talk at SIGSOFT FSE ...David Rosenblum
Invited talk with Alexander L. Wolf upon receiving the first ACM SIGSOFT Impact Paper Award, at the 16th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (ACM SIGSOFT FSE), 13 November 2008.
Assertions a Decade Later (invited talk at ICSE 2002)David Rosenblum
Invited talk upon receiving the 2002 ICSE Most Influential Paper Award for ICSE 1992, at the 24th International Conference on Software Engineering (ICSE 2002), 22 May 2002.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I wondered, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need in order to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of the infrastructure requirements and technologies that could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply applying machine learning to just any symbolic structure is not sufficient to really reap the gains of NeSy. Those gains come only when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk encourages a more independent use of PHP frameworks, moving towards more flexible and future-proof PHP development.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front end. I have also often seen developers implement front-end features just by following the standard rules of a framework, thinking that this is enough to launch the project successfully, and then the project fails. How do you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
1. Software System Scalability:
Concepts and Techniques
David S. Rosenblum
University College London
United Kingdom
http://www.cs.ucl.ac.uk/staff/D.Rosenblum/
4. Scalability: A Widely Used Term
• The technical literature has many uses of the term
– Product brochures
– Research papers
– Design documents
– Standards specifications
• But there are very few precise definitions
ISEC 2009 4
5. A Typical Example
SAP Specification
Mark Handley, Colin Perkins and Edmund Whelan, Session Announcement Protocol, RFC 2974, October 2000.
• 5500 Words, Including 3 Occurrences of ‘Scalability’:
– Abstract: ‘This document describes version 2 of the multicast session directory announcement protocol, Session Announcement Protocol (SAP), and the related issues affecting security and scalability that should be taken into account by implementors.’
– Section on Terminology: ‘A SAP announcer periodically multicasts an announcement packet to a well known multicast address and port. The announcement is multicast with the same scope as the session it is announcing, ensuring that the recipients of the announcement are within the scope of the session the announcement describes (bandwidth and other such constraints permitting). This is also important for the scalability of the protocol, as it keeps local session announcements local.’
– Section Heading: ‘Scalability and Caching’
6. The Problem
‘I examined aspects of scalability, but did not find a useful, rigorous definition of it. Without such a definition, I assert that calling a system “scalable” is about as useful as calling it “modern”. I encourage the technical community to either rigorously define scalability or stop using it to describe systems.’
[Mark D. Hill, ‘What is Scalability?’, ACM SIGARCH Computer Architecture News, vol. 18, no. 4, Dec. 1990, pp. 18-21.]
7. Does This Lack of Rigour Matter?
Publications with the word scalable or scalability in the title [source: Engineering Village 2]
[Chart: publications per year, 1965-2010, growing from near zero to roughly 2,500 per year; growth takes off around 1980 (computer architecture) and accelerates again around 1988 (neural networks).]
8. Why Does It Matter to Software Engineering?
• Scalability is an important multi-dimensional concern
– And engineers have difficulty reasoning about multi-dimensionality!
• The dimensions exhibit highly unpredictable trends
– And engineers have difficulty anticipating these trends!
• Growth in users, growth in capacity requirements
• New deployment needs (mergers, miniaturisation)
• Software engineers have only primitive, ad hoc techniques to address scalability concerns
9. Some Typical Notions of Scalability
• Performance
– High throughput, low latency
• Parallel speedup
• Tractability of algorithms
– Polynomial versus Exponential
• Testing versus Verification
• State spaces in model checking
• Linear growth in resource usage
– What about quicksort???
10. Scalability of What?
• The running system?
• The software design?
• The number of users?
• Something else?
• All of the above???
12. So What Is Scalability?
• Scalability is a quality of a software system
characterising its ability …
– to satisfy its quality goals …
– to levels that are acceptable to its stakeholders …
– when characteristics of the execution environment …
– and the system design …
– vary over expected ranges.
Scalability is thus a meta-quality of other system qualities
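The definition above reads as an experimental design: vary an independent variable over its expected range and check that a dependent quality stays within the levels stakeholders find acceptable. A minimal sketch, where the response-time model and all numbers are purely illustrative assumptions:

```javascript
// Toy system model (assumption, for illustration only): each machine
// handles 100 queries/second before response time starts degrading.
function responseTimeMs(queriesPerSecond, machines) {
  const loadPerMachine = queriesPerSecond / machines;
  return 50 + Math.max(0, loadPerMachine - 100); // 50 ms baseline
}

// Scalable w.r.t. response time: for every expected load level, the
// (possibly scaled) design keeps the quality within acceptable bounds.
function isScalable(loads, machinesForLoad, acceptableMs) {
  return loads.every(
    (q) => responseTimeMs(q, machinesForLoad(q)) <= acceptableMs
  );
}

// Design that grows the cluster with the load: one machine per 100 q/s.
const scaledDesign = (q) => Math.ceil(q / 100);
console.log(isScalable([100, 1000, 10000], scaledDesign, 60)); // true

// A fixed 10-machine cluster stops meeting the goal at high load.
console.log(isScalable([100, 1000, 10000], () => 10, 60)); // false
```

The point of the sketch is the shape of the analysis, not the model: the same quality (response time) is judged scalable or not only relative to a stated variation and an acceptability threshold.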
13. A Scalability Framework
As a Form of Experimental Design
[Diagram: the independent variables (system design and execution environment characteristics, each subdivided into scaling and non-scaling) govern system behaviour, which determines the dependent variables (system qualities).]
14. Example
Google Search Engine
• Most people would agree that Google is scalable
– Dramatic growth in the size of the Web
– Dramatic growth in the rate of queries to Google
– Yet a virtually constant response time for users
• It’s a naturally parallelisable problem
– Implemented as a cluster of commodity PCs
– Cluster grows as the Web and the query load increase
15. The Scalability Framework
As Exemplified by Google
Google is scalable with respect to response time because it maintains a constant response time as the number of queries per second and the number of Web pages scale over time, by increasing the number of machines in the cluster.
[Diagram: the framework instantiated for Google. Scaling environment variables: size of the Web, queries per second. Non-scaling environment variables: network latency, available bandwidth. Design variables: cluster size, choice of algorithms. Dependent variables: response time, I/O usage, price/performance.]
17. Case Study
Fortent Data Analysis System
• Intelligent Enterprise Framework (IEF)
– Overnight analysis of transactional data to
identify unusual and possibly fraudulent
patterns of bank and credit card transactions
– Java - 1,556 classes - 326,293 lines of code
• Surrogate Key Server (SKS) Component
[Diagram: batches of transactions on business entities (BE) flow into the SKS, which replaces business entity identifiers with injected surrogate keys (SK) via an entity-key mapping.]
18. Case Study
SKS Implementation Details
• Scalability problem: support a growing number of business entities in overnight batches, while maintaining throughput and memory usage within acceptable levels
• First Generation Design (year 2000)
– In-memory cache
– High storage overhead, eventually crashing system
• Second Generation Design (year 2003)
– Disk-based cache for high-volume business entities
– In-memory cache for low-volume business entities
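The second-generation design can be sketched as a two-tier cache. The volume threshold, the API, and the in-memory stand-in for the disk store below are all illustrative assumptions, not Fortent's actual implementation:

```javascript
// Illustrative two-tier surrogate-key store: high-volume business
// entities go to a (simulated) disk-backed store, low-volume entities
// stay in the in-memory cache.
class SurrogateKeyStore {
  constructor(volumeThreshold) {
    this.volumeThreshold = volumeThreshold;
    this.memoryCache = new Map(); // low-volume entities
    this.diskCache = new Map();   // stand-in for a disk-backed cache
    this.nextKey = 1;
  }

  keyFor(entityId, dailyVolume) {
    const tier =
      dailyVolume >= this.volumeThreshold ? this.diskCache : this.memoryCache;
    if (!tier.has(entityId)) {
      tier.set(entityId, this.nextKey++); // inject a new surrogate key
    }
    return tier.get(entityId);
  }
}

const store = new SurrogateKeyStore(1000);
const k1 = store.keyFor("acct-42", 5);     // low volume: memory tier
const k2 = store.keyFor("acct-42", 5);     // same entity: same key
const k3 = store.keyFor("acct-99", 50000); // high volume: disk tier
console.log(k1 === k2, k1 !== k3); // true true
```

The design point is that memory usage is bounded by the low-volume entities only, while throughput for high-volume entities is traded against disk I/O rather than heap growth.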
19. Scalability of IEF’s SKS
Characterisation
[Diagram: the framework instantiated for the SKS. Scaling environment variable: number of business entities. Non-scaling design variables: number of threads, cache type (in-memory vs. disk), JVM heap size. Dependent variables: average throughput, memory usage, disk usage.]
20. Scalability of IEF’s SKS
Analysis in Terms of Microeconomics
[Diagram: the same framework annotated for analysis in terms of microeconomics. The independent variables are manipulated over their ranges, distinct behaviours of the old implementation and the new prototype are measured as raw data, and the raw measurements are mapped through preference functions t(), m(), d() into a utility function 10t() + 10m() + d() used for design comparison.]
21. Case Study
Preferences and Utility
• Throughput preference:
t(x) = -1, if x < 100
t(x) = (x - 100) / (400 - 100), otherwise
• Heap usage preference:
h(y) = -1, if y > 500
h(y) = (500 - y) / (500 - 0), otherwise
• Disk usage preference:
d(z) = -1, if z > 24
d(z) = (24 - z) / (24 - 0), otherwise
• System utility:
U(x, y, z) = 10·t(x) + 10·h(y) + d(z)
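The preference and utility functions above translate directly into code; a minimal sketch (the slide gives no units for the raw measurements, so the sample inputs below are only illustrative):

```javascript
// Preference functions from the case study: each maps a raw measurement
// into [-1, 1], with -1 marking a violated acceptability bound.
const t = (x) => (x < 100 ? -1 : (x - 100) / (400 - 100)); // throughput
const h = (y) => (y > 500 ? -1 : (500 - y) / (500 - 0));   // heap usage
const d = (z) => (z > 24 ? -1 : (24 - z) / (24 - 0));      // disk usage

// System utility: throughput and heap are weighted 10x over disk usage.
const U = (x, y, z) => 10 * t(x) + 10 * h(y) + d(z);

// Best case: throughput 400, no heap, no disk -> maximum utility 21.
console.log(U(400, 0, 0)); // 21
// Throughput below the floor of 100 is heavily penalised:
// 10*(-1) + 10*(0.5) + 0.5 = -4.5
console.log(U(50, 250, 12)); // -4.5
```

The weighting expresses stakeholder priorities: a design comparison then reduces to comparing utilities of measured behaviours.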
24. Where Do the Variables and Preferences and Utilities Come From?
• They must come from system stakeholders
– Stakeholders are able to identify important scalability variables
– But like to think in terms of simple bounds
• Rather than the underlying functions that relate them
– And are usually poor at estimating those bounds
• Typically underestimate system load and system lifetime
• Goal-Oriented Requirements Engineering can be used to elicit scalability requirements
– KAOS Method [van Lamsweerde, Letier]
25. The Scalability Framework
In the Context of Requirements Engineering
[Diagram: scalability goals identify and bound both the independent variables (scaling and non-scaling design and environment characteristics) and the dependent variables (system behaviour) of the framework.]
26. Goal-Oriented Requirements Engineering
As Exemplified by IEF
[Goal model: the goal “Fraudulent Transactions Handled” is AND-refined into the sub-goals “Fraudulent Transactions Detected Quickly” and “Fraudulent Transactions Acted Upon”. The requirement “Batch Processed Overnight” is assigned to the agent IEF; an expectation is assigned to the agent Bank IT Team. The obstacle “Fraudulent Transactions Not Acted Upon” is refined into the sub-obstacle “Too Many Alerts for IT Team”, associated with the Alert Generator agent.]
27. Scalability Requirements
• A scaling assumption is a goal specifying how some quantity in the application domain is assumed to vary over time or system variants
• A scalability goal is a goal specifying the required levels of satisfaction under the variations specified in associated scaling assumptions
• A scalability obstacle is a condition where the load imposed by a goal exceeds the capacity of the agent assigned to the goal
We can use goal-obstacle analysis to elicit these
28. Goal-Obstacle Analysis of IEF
[Goal-obstacle diagram:
– Scaling assumption “Expected Batch Size Variation”: over the next three years, daily batches for all customers are expected to have between 50,000 and 300 million transactions.
– Scalability requirement: “Batch Processed Overnight for Expected Batch Size Variation”, assigned to IEF.
– Scalability obstacle: “Batch Size Is Unbounded”, i.e. the number of transactions exceeds the Alert Generator processing speed.
– Resolution tactic “introduce scaling assumption”: the scaling assumption mitigates the obstacle.
– Resolution tactic “dynamically adapt agent capacity”: the goal “Adapt Alert Generator Processing Speed at Runtime”, refined into “Accurate Batch Size Prediction” (assigned to Fortent) and “Alert Generator Processing Speed Above Maximum Predicted Batch Size” (assigned to the Bank IT Team).]
29. Goal-Obstacle Analysis Summary
• Can now elicit scalability requirements with Goal-Oriented Requirements Engineering
– Identify the key independent and dependent variables
– Identify scalability obstacles
– Resolve scalability obstacles
– All precisely and quantitatively
• What’s Missing?
– Agent load feasibility analysis
– Cost/benefit analysis of obstacle resolutions
– Testing scalability requirements
31. Summary
• Scalability is an important software quality
• But it has been poorly understood
– And it’s not just about performance!
• A proper characterisation of a system’s scalability
must be qualified with reference to relevant
independent and dependent variables
• And these should be derived through a precise
elicitation of scalability requirements
ISEC 2009 31
As in any kind of analysis, you are trying to answer a question. We represent this question in terms of preferences and utility functions, which I'll explain later. As mentioned, scalability always has to do with the scaling or variation of characteristics of the application domain or the machine design. We call those independent variables: variables that can be manipulated in the analysis. Note that not all of them will vary, so we further subdivide them into scaling and non-scaling. Other variables may affect scalability, but we have no control over them; we call them nuisance variables. This is experimental design to reveal the causal relationship between factors and dependent variables.
Analysing the dependent variables in the presence of variation of certain factors turns an ordinary quality analysis into a scalability analysis. It is therefore vague to refer simply to "the scalability of a system"; instead one must refer to "the scalability with respect to throughput", or "the scalability with respect to latency and memory consumption". Scalability analysis should unveil this relationship in an explicit and continuous form. Any system analysis conducted with respect to variation over a range of environmental or design qualities is a scalability analysis: performance, reliability, availability, security, etc.
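The experimental-design view in these notes can be sketched as a full factorial over the scaling variables while the non-scaling variables are held fixed. Everything below is illustrative: the variable levels and the stand-in measurement model are assumptions of mine, not data from the talk; in a real study the measurement function would run the system under test.

```python
from itertools import product

# Scaling (independent) variables and their levels.
scaling = {"batch_size": [50_000, 1_000_000, 300_000_000],
           "users": [10, 100]}
# Non-scaling variables: held fixed across all runs.
non_scaling = {"node_count": 4}

def measure_throughput(batch_size, users, node_count):
    # Stand-in model: throughput degrades with load (purely illustrative).
    return node_count * 1e6 / (1 + users * batch_size / 1e9)

# Full factorial over the scaling variables; the dependent variable
# (throughput) is observed at each design point.
results = {
    cfg: measure_throughput(*cfg, **non_scaling)
    for cfg in product(scaling["batch_size"], scaling["users"])
}
for (batch, users), tput in results.items():
    print(f"batch={batch:>11,} users={users:>3} -> throughput={tput:,.0f}")
```

Nuisance variables would be handled separately, e.g. by randomisation or repeated runs, since they cannot be fixed by design.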
The Surrogate Key Server is a critical subsystem.
This is a retrospective study. Call attention to the multi-criteria trade-off: memory vs. throughput.
In hindsight, the file-based design may appear to be obviously superior to the memory-based design, but this was not at all obvious when the memory-based design was first developed. In fact, if the designs had been compared only in terms of the load at the time the memory-based system was first being developed, then the memory-based design would have been selected instead of the file-based design. Only by doing a proper analysis over the full range of the scaling dimensions are we able to select the most scalable design.
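The point about comparing designs over the full scaling range, rather than at today's load, can be illustrated with two invented cost curves. The numbers below are purely for illustration and are not measurements of the actual memory-based and file-based designs: one design is cheaper at low load but scales badly, the other has higher fixed cost but scales well.

```python
def memory_based_cost(load):
    return 1 + 0.10 * load       # cheap at first, grows quickly with load

def file_based_cost(load):
    return 20 + 0.01 * load      # higher fixed cost, grows slowly

today = 100
future_range = range(100, 2001, 100)  # assumed scaling dimension

# A point comparison at today's load favours the memory-based design...
print(memory_based_cost(today) < file_based_cost(today))   # True

# ...but averaged over the full scaling range the file-based design wins.
avg = lambda f: sum(f(x) for x in future_range) / len(future_range)
print(avg(file_based_cost) < avg(memory_based_cost))       # True
```

The crossover is exactly the trap described above: evaluating only at the initial load would have selected the design that is worse over the assumed variation.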