Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
1. Software System Scalability:
Concepts and Techniques
David S. Rosenblum
University College London
United Kingdom
http://www.cs.ucl.ac.uk/staff/D.Rosenblum/
4. Scalability: A Widely Used Term
• The technical literature has many uses of the term
– Product brochures
– Research papers
– Design documents
– Standards specifications
• But there are very few precise definitions
ISEC 2009 4
5. A Typical Example
SAP Specification
Mark Handley, Colin Perkins and Edmund Whelan, Session
Announcement Protocol, RFC 2974, October 2000.
• 5500 Words, Including 3 Occurrences of ‘Scalability’:
– Abstract: ‘This document describes version 2 of the multicast session
directory announcement protocol, Session Announcement Protocol (SAP), and
the related issues affecting security and scalability that should be taken
into account by implementors.’
– Section on Terminology: ‘A SAP announcer periodically multicasts an
announcement packet to a well known multicast address and port. The
announcement is multicast with the same scope as the session it is
announcing, ensuring that the recipients of the announcement are within
the scope of the session the announcement describes (bandwidth and
other such constraints permitting). This is also important for the scalability
of the protocol, as it keeps local session announcements local.’
– Section Heading: ‘Scalability and Caching’
6. The Problem
‘I examined aspects of scalability, but did not find a
useful, rigorous definition of it. Without such a
definition, I assert that calling a system “scalable”
is about as useful as calling it “modern”. I
encourage the technical community to either
rigorously define scalability or stop using it to
describe systems.’
[Mark D. Hill, ‘What is Scalability?’, ACM SIGARCH Computer
Architecture News, vol. 18, no. 4, Dec. 1990, pp. 18-21.]
7. Does This Lack of Rigour Matter?
Publications with the word scalable or scalability
in the title
[source: Engineering Village 2]
[Chart: Publications (y-axis, 0 to 2500) by Year (x-axis, 1965 to 2010). Annotations: 1980: Computer Architecture; 1988: Neural Networks.]
8. Why Does It Matter to Software Engineering?
• Scalability is an important
multi-dimensional concern
– And engineers have difficulty
reasoning about multi-dimensionality!
• The dimensions exhibit
highly unpredictable trends
– And engineers have difficulty anticipating these trends!
• Growth in users, growth in capacity requirements
• New deployment needs (mergers, miniaturisation)
• Software engineers have only primitive, ad hoc
techniques to address scalability concerns
9. Some Typical Notions of Scalability
• Performance
– High throughput, low latency
• Parallel speedup
• Tractability of algorithms
– Polynomial versus Exponential
• Testing versus Verification
• State spaces in model checking
• Linear growth in resource usage
– What about quicksort???
10. Scalability of What?
• The running system?
• The software design?
• The number of users?
• Something else?
• All of the above???
12. So What Is Scalability?
• Scalability is a quality of a software system
characterising its ability …
– to satisfy its quality goals …
– to levels that are acceptable to its stakeholders …
– when characteristics of the execution environment …
– and the system design …
– vary over expected ranges.
Scalability is thus a meta-quality of other system qualities
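The definition above can be read operationally: sweep a scaling variable over its expected range and check that every quality goal stays at an acceptable level. The following is a minimal sketch of that reading and is not part of the talk; `is_scalable`, the toy `toy_measure` model and the 100 ms threshold are all hypothetical.

```python
from typing import Callable, Dict, Iterable

# A quality goal maps a measured value to True when stakeholders accept it.
Goal = Callable[[float], bool]


def is_scalable(
    measure: Callable[[float], Dict[str, float]],
    goals: Dict[str, Goal],
    expected_range: Iterable[float],
) -> bool:
    """Return True if every quality goal holds at every sampled point
    of the expected range of one scaling variable."""
    for value in expected_range:
        qualities = measure(value)  # dependent variables at this point
        if not all(goal(qualities[name]) for name, goal in goals.items()):
            return False
    return True


# Hypothetical system model: latency grows linearly with load (illustrative only).
def toy_measure(load: float) -> Dict[str, float]:
    return {"latency_ms": 20.0 + 0.01 * load}


# Hypothetical stakeholder goal: latency of at most 100 ms.
toy_goals = {"latency_ms": lambda ms: ms <= 100.0}

print(is_scalable(toy_measure, toy_goals, range(0, 5001, 1000)))   # True
print(is_scalable(toy_measure, toy_goals, range(0, 10001, 1000)))  # False
```

The same toy system is scalable over one expected range and not over another, which is why the definition ties scalability to stated ranges rather than to the system alone.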
13. A Scalability Framework
As a Form of Experimental Design
[Diagram: independent variables (environment and design characteristics, each either scaling or non-scaling) govern system behaviour (system execution), which determines the dependent variables (system qualities).]
14. Example
Google Search Engine
• Most people would agree that Google is scalable
– Dramatic growth in the size of the Web
– Dramatic growth in the rate of queries to Google
– Yet a virtually constant response time for users
• It’s a naturally parallelisable problem
– Implemented as a cluster of commodity PCs
– Cluster increased as Web and query load increase
15. The Scalability Framework
As Exemplified by Google
Google is scalable with respect to response time because it maintains a constant response time as the number of queries per second and the number of Web pages scale over time, by increasing the number of machines in the cluster.
[Diagram: scalability framework instantiated for Google. Independent variables (environment and design characteristics): size of Web, queries per second, network latency, available bandwidth, cluster size, choice of algorithms. Dependent variables (system qualities): response time, I/O usage, price per performance.]
17. Case Study
Fortent Data Analysis System
• Intelligent Enterprise Framework (IEF)
– Overnight analysis of transactional data to
identify unusual and possibly fraudulent
patterns of bank and credit card transactions
– Java - 1,556 classes - 326,293 lines of code
• Surrogate Key Server (SKS) Component
[Diagram: batches of transactions on business entities (BE) pass through the SKS, which uses an entity-key mapping to replace business entity identifiers with injected surrogate keys (SK).]
18. Case Study
SKS Implementation Details
• Scalability problem: support a growing number of
business entities in overnight batches, while
maintaining throughput and memory usage within
acceptable levels
• First Generation Design (year 2000)
– In-memory cache
– High storage overhead, eventually crashing system
• Second Generation Design (year 2003)
– Disk-based cache for high-volume business entities
– In-memory cache for low-volume business entities
19. Scalability of IEF’s SKS
Characterisation
[Diagram: scalability framework instantiated for SKS. Independent variables (environment and design characteristics): number of business entities, memory cache vs disk cache, number of threads, JVM heap size. Dependent variables (system qualities): average throughput, memory usage, disk usage.]
20. Scalability of IEF’s SKS
Analysis in Terms of Microeconomics
[Diagram: scalability analysis of SKS. The independent variables (number of business entities, memory cache vs disk cache, number of threads, JVM heap size) are manipulated over ranges, giving distinct behaviours for the new prototype vs the old implementation. Raw data is measured for the dependent variables (average throughput, memory usage, disk usage). Preference functions t(), m(), d() turn the raw data into preference values, and the utility function 10t()+10m()+d() combines them for the design comparison.]
21. Case Study
Preferences and Utility
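• Throughput preference
$$\hat{t}(x) = \begin{cases} -1, & \text{if } x < 100 \\ \dfrac{x - 100}{400 - 100}, & \text{otherwise} \end{cases}$$
• Heap usage preference
$$\hat{h}(y) = \begin{cases} -1, & \text{if } y > 500 \\ \dfrac{500 - y}{500 - 0}, & \text{otherwise} \end{cases}$$
• Disk usage preference
$$\hat{d}(z) = \begin{cases} -1, & \text{if } z > 24 \\ \dfrac{24 - z}{24 - 0}, & \text{otherwise} \end{cases}$$
• System utility
$$U(x,y,z) = \frac{10\,\hat{t}(x) + 10\,\hat{h}(y) + \hat{d}(z)}{21}$$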
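The preference and utility functions on this slide translate directly into code. The sketch below is an illustration, not the authors' implementation: the function names are invented here, the units of x, y and z are not stated on the slide, and the division by 21 (the sum of the weights 10 + 10 + 1) is this sketch's reading of the slide layout.

```python
def throughput_preference(x: float) -> float:
    """Throughput preference: -1 below 100, else (x - 100) / (400 - 100)."""
    return -1.0 if x < 100 else (x - 100) / (400 - 100)


def heap_preference(y: float) -> float:
    """Heap usage preference: -1 above 500, else (500 - y) / (500 - 0)."""
    return -1.0 if y > 500 else (500 - y) / (500 - 0)


def disk_preference(z: float) -> float:
    """Disk usage preference: -1 above 24, else (24 - z) / (24 - 0)."""
    return -1.0 if z > 24 else (24 - z) / (24 - 0)


def utility(x: float, y: float, z: float) -> float:
    """Weighted sum of the three preferences.

    Assumption: the result is normalised by 21, the sum of the weights.
    As on the slide, preference values are not clamped at 1.
    """
    weighted = (
        10 * throughput_preference(x)
        + 10 * heap_preference(y)
        + disk_preference(z)
    )
    return weighted / 21


print(utility(400, 0, 0))     # best case for all three preferences: 1.0
print(utility(250, 250, 12))  # every preference at 0.5: 0.5
print(utility(50, 250, 12))   # throughput below 100 is penalised with -1
```

Because throughput and heap usage carry ten times the weight of disk usage, a single -1 on either of them dominates the result.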
24. Where Do the Variables and Preferences and
Utilities Come From?
• They must come from system stakeholders
– Are able to identify important scalability variables
– But like to think in terms of simple bounds
• Rather than the underlying functions that relate them
– And are usually poor at estimating those bounds
• Typically underestimate system load and system lifetime
• Goal-Oriented Requirements Engineering can be
used to elicit Scalability Requirements
– KAOS Method [van Lamsweerde, Letier]
25. The Scalability Framework
In the Context of Requirements Engineering
[Diagram: Scalability Goals identify and bound the independent variables (scaling and non-scaling design and environment characteristics) and the dependent variables. The independent variables govern system behaviour (system execution), which determines the dependent variables.]
26. Goal-Oriented Requirements Engineering
As Exemplified by IEF
[Diagram: KAOS goal model for IEF, with the following labelled elements.
– Goal: Fraudulent Transactions Handled
– AND-Refinement into Sub-Goals: Fraudulent Transactions Detected Quickly; Fraudulent Transactions Acted Upon
– Requirement: Batch Processed Overnight
– Agents: IEF Alert Generator; Bank IT Team (the latter attached to an Expectation)
– Obstacle: Fraudulent Transactions Not Acted Upon
– Obstacle Refinement into Sub-Obstacle: Too Many Alerts for IT Team]
27. Scalability Requirements
• A scaling assumption is a goal specifying how
some quantity in the application domain is
assumed to vary over time or system variants
• A scalability goal is a goal specifying the required
levels of satisfaction under variations specified in
associated scaling assumptions
• A scalability obstacle is a condition where the load
imposed by a goal exceeds the capacity of the
agent assigned to the goal
We can use goal-obstacle analysis to elicit these
28. Goal-Obstacle Analysis of IEF
[Diagram: goal-obstacle analysis of IEF, with the following labelled elements.
– Goal: Batch Processed Overnight
– Scalability Obstacle: Number of transactions exceeds Alert Generator processing speed
– Obstacle: Batch Size Is Unbounded
– Resolution Tactic: Introduce scaling assumption
– Scaling Assumption: Expected Batch Size Variation (instance of scaling assumption). Definition: Over the next three years, daily batches for all customers are expected to have between 50,000 and 300 million transactions
– Scalability Requirement: Batch Processed Overnight for Expected Batch Size Variation (agent: IEF Alert Generator)
– Resolution Tactic: Dynamically adapt agent capacity
– Adapt Alert Generator Processing Speed at Runtime (mitigates the obstacle)
– Accurate Batch Size Prediction
– Alert Generator Processing Speed Above Maximum Predicted Batch Size
– Agents: Fortent; Bank IT Team]
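A scalability obstacle arises when the load imposed by a goal exceeds the capacity of the agent assigned to it. The following worked sketch applies that check to the batch size range from the scaling assumption (50,000 to 300 million transactions). The 8-hour overnight window and the agent speed of 5,000 transactions per second are hypothetical values chosen for illustration; neither appears in the talk.

```python
# Assumption: "overnight" means an 8-hour processing window (not from the talk).
OVERNIGHT_WINDOW_SECONDS = 8 * 3600

# Bounds taken from the scaling assumption "Expected Batch Size Variation".
MIN_BATCH = 50_000
MAX_BATCH = 300_000_000


def required_speed(batch_size: int,
                   window_seconds: int = OVERNIGHT_WINDOW_SECONDS) -> float:
    """Transactions per second needed to finish a batch within the window."""
    return batch_size / window_seconds


def is_obstructed(batch_size: int, agent_speed: float,
                  window_seconds: int = OVERNIGHT_WINDOW_SECONDS) -> bool:
    """Scalability obstacle: the imposed load exceeds the agent's capacity."""
    return required_speed(batch_size, window_seconds) > agent_speed


# Hypothetical Alert Generator capacity, in transactions per second.
AGENT_SPEED = 5_000.0

for batch in (MIN_BATCH, MAX_BATCH):
    print(batch, round(required_speed(batch), 2), is_obstructed(batch, AGENT_SPEED))
```

Under these assumed numbers the lower bound is handled comfortably while the upper bound needs more than twice the assumed capacity, which is the situation the resolution tactics on the slide address.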
29. Goal-Obstacle Analysis Summary
• Can now elicit scalability requirements for
Goal-Oriented Requirements Engineering
– Identify the key independent and dependent variables
– Identify scalability obstacles
– Resolve scalability obstacles
– All precisely and quantitatively
• What’s Missing?
– Agent Load Feasibility Analysis
– Cost/Benefit Analysis of Obstacle
Resolutions
– Testing Scalability Requirements
31. Summary
• Scalability is an important software quality
• But it has been poorly understood
– And it’s not just about performance!
• A proper characterisation of a system’s scalability
must be qualified with reference to relevant
independent and dependent variables
• And these should be derived through a precise
elicitation of scalability requirements
As in any kind of analysis, you are trying to answer a question. We represent this question in terms of preferences and utility functions, which I'll explain later. As I have mentioned, scalability always has to do with the scaling or variation of application domain or machine design characteristics. We call those independent variables: variables that can be manipulated in the analysis. Not all variables will vary, so we further subdivide them into scaling and non-scaling. Other variables may affect scalability, but we have no control over them; we call them nuisance variables.
The framework is an experimental design to reveal the causal relationship between factors and dependent variables. Analysing the dependent variables in the presence of variation of certain factors turns an ordinary quality analysis into a scalability analysis. It is therefore vague to refer simply to “the scalability of a system”; instead one must refer to “the scalability with respect to throughput”, or “the scalability with respect to latency and memory consumption”. Scalability analysis should unveil this relationship in an explicit and continuous form. Any system analysis (of performance, reliability, availability, security, etc.) conducted with respect to variation over a range of environmental or design qualities is a scalability analysis.
The Surrogate Key Server is a critical subsystem.
This is a retrospective study. Call attention to the multi-criteria trade-off: memory vs throughput.
In hindsight, the file-based design may appear to be obviously superior to the memory-based design, but this was not at all obvious when the memory-based design was first developed. In fact, if the designs had been compared only in terms of the load at the time the memory-based system was first being developed, then the memory-based design would have been selected instead of the file-based design. Only by doing a proper analysis over the full range of the scaling dimensions are we able to select the most scalable design.
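The point about comparing designs over the full range can be illustrated with a small sketch. Everything numeric in the two design models below is invented for illustration (the talk gives no measurements), the number of business entities is taken in hypothetical units of millions, and averaging utility over the range is just one possible way to aggregate. The preference and utility functions follow the slide on preferences and utility, with the division by 21 being this sketch's reading of that slide.

```python
def t_pref(x: float) -> float:
    return -1.0 if x < 100 else (x - 100) / (400 - 100)


def h_pref(y: float) -> float:
    return -1.0 if y > 500 else (500 - y) / (500 - 0)


def d_pref(z: float) -> float:
    return -1.0 if z > 24 else (24 - z) / (24 - 0)


def utility(throughput: float, heap: float, disk: float) -> float:
    # Assumption: normalised by the sum of the weights (10 + 10 + 1).
    return (10 * t_pref(throughput) + 10 * h_pref(heap) + d_pref(disk)) / 21


# Hypothetical behaviour models; n is the number of business entities (millions).
def memory_design(n: float) -> dict:
    # Fast, but heap usage grows with the number of entities.
    return {"throughput": 400.0, "heap": 100.0 * n, "disk": 0.0}


def file_design(n: float) -> dict:
    # Slower, constant heap, disk usage grows with the number of entities.
    return {"throughput": 250.0, "heap": 50.0, "disk": 2.0 * n}


DESIGNS = {"memory": memory_design, "file": file_design}


def best_design(n: float) -> str:
    """Design with the highest utility at a single load point."""
    return max(DESIGNS, key=lambda name: utility(**DESIGNS[name](n)))


def best_over_range(loads) -> str:
    """Design with the highest mean utility over a range of load points."""
    loads = list(loads)

    def mean_utility(name: str) -> float:
        return sum(utility(**DESIGNS[name](n)) for n in loads) / len(loads)

    return max(DESIGNS, key=mean_utility)


print(best_design(1))                 # memory: wins at the initial load
print(best_design(10))                # file: wins once heap usage exceeds 500
print(best_over_range(range(1, 11)))  # file: wins over the whole range
```

With these made-up models, a comparison at the initial load alone selects the memory-based design, while the comparison over the full range selects the file-based one.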