3. The Importance of Scalability
Gartner predictions for 2008
Moore’s Law continues to hold
Desktop PC: 4–8 CPUs @ 40GHz, 4–12GB
RAM, 1.5TB storage, 100Gb network
Desktop PCs < 50% of end-user devices
Bandwidth more cost-effective than
computing
But only if software systems can scale!
4. Some Notions of Scalability (1)
“Scalability is a key requirement for the corporate
content infrastructure, … [which] needs to be
capable of handling high volumes of content as
well as of fulfilling high performance
requirements.”
— Documentum
“The Java 2 Platform, Micro Edition (J2ME) technology
from Sun Microsystems, Inc. is used by developers
to scale Java technology-based applications into
small consumer and embedded devices.”
— Sun Microsystems
5. Some Notions of Scalability (2)
“[A Session Announcement Protocol] announcement is
multicast with the same scope as the session it is
announcing … [This is] important for the
scalability of the protocol, as it keeps local
session announcements local.”
— Mark Handley et al., RFC 2974
“Scalability: the ease with which a system or
component can be modified to fit the problem
area.”
— Software Engineering Institute
6. What Is Scalability?
Is It High Performance?
Computations/messages/transactions per
second
Fixed-size and fixed-time parallel speedup
Is It Computational Complexity?
Time and space complexity of algorithms
Is It Abstraction?
Example: programmer productivity versus
expressive power of programming languages
It is a characterisation of resource
consumption as a function of problem size
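One way to read "resource consumption as a function of problem size" is to count a resource (here, comparisons) while the input grows. The sketch below is illustrative only: the function names and the choice of linear versus binary search are assumptions made for this example, not material from the talk.

```python
def count_linear(items, target):
    """Comparisons made by a linear scan for target."""
    steps = 0
    for x in items:
        steps += 1
        if x == target:
            break
    return steps


def count_binary(items, target):
    """Loop iterations made by a binary search over sorted items."""
    steps = 0
    lo, hi = 0, len(items)
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


def consumption_profile(sizes):
    """Return (n, linear_steps, binary_steps) for each problem size n."""
    profile = []
    for n in sizes:
        items = list(range(n))
        target = n  # absent from items, so both searches run to the end
        profile.append((n, count_linear(items, target), count_binary(items, target)))
    return profile


if __name__ == "__main__":
    for n, lin, log in consumption_profile([10, 100, 1000, 10000]):
        print(n, lin, log)
```

The linear count grows in step with n, while the binary count grows roughly with log n. The shape of that curve, rather than any single timing, is what the definition above characterises.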
8. Scalability in My Own Research
Continual interest in problems of
engineering large-scale software
systems
Scalable software infrastructures
Scalable software development tools
Many techniques used to achieve
scalability
Each has an associated cost
Each was chosen serendipitously
9. Technique #1
Abstraction
Intuition: Analyse larger systems by
ignoring the “nonessential detail”
Cost: Abstraction can hide useful
information and introduce
inaccuracies
Example: 5ESS Build Process Studies
(1991–1994)
A.L. Wolf and D.S. Rosenblum, “A Study in Software Process Data Capture and Analysis”,
Proc. 2nd Int’l Conf. on the Software Process, Berlin, Germany, Feb. 1993, pp. 115–124.
10. The 5ESS® Switching System Software
Primary central office switch product of
Lucent (formerly AT&T)
By the numbers (c. 1994)
5–7 million lines of code in 50+ subsystems
2000 developers
New version built every 4 weeks
2 hours downtime per year (“The Good”)
Except on 15 January 1990 (“The Ugly”)
“It’s not just a program—it’s a field of study!”
11. The 5ESS Build Process
Nominally required 1 week to compile new
version of software into executable form
Frequently took 2–3 weeks (“The Bad”)
Simple syntax and semantics errors required
frequent restart of build
What are the “problem subsystems”?
Events from build tools give very accurate
abstract characterisation of process
13. Technique #2
Execution
Intuition: Observing individual executions
is easier than analysing whole programs
Cost: Forgo results about all executions
Example: Runtime assertion checking
ANNA (ANNotated Ada) (1983–1988)
Attain benefits of systematic specification without
difficulties and costs of formal proofs of correctness
APP (1988–1996)
Assertions detected 80% of faults in case study
Assertion diagnostics quickly isolated faults
D.S. Rosenblum, “A Practical Approach to Programming with Assertions”,
IEEE Transactions on Software Engineering, Vol. 21, No. 1, Jan. 1995, pp. 19–31.
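A minimal sketch of runtime assertion checking, written with Python's built-in assert statement. It does not use ANNA or APP notation; the function and its pre- and postconditions are hypothetical, chosen only to show the idea of checking a specification on each individual execution.

```python
def isqrt_checked(n):
    """Integer square root with an asserted precondition and postcondition."""
    # Precondition: checked on every call, not proved for all inputs.
    assert isinstance(n, int) and n >= 0, "precondition: n must be a non-negative int"

    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition: r is the largest integer whose square does not exceed n.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r


if __name__ == "__main__":
    print(isqrt_checked(10))
    try:
        isqrt_checked(-1)
    except AssertionError as err:
        print("assertion failed:", err)
```

The cost named on the slide is visible here: a passing assertion says something only about the inputs actually exercised. Note also that Python skips assert statements when run with the -O option.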
14. Technique #3
Coarse-Grained Analysis
Intuition: Variant of abstraction
Cost: Reduced precision
Example: TestTube (1991–2001)
Selective regression testing of C programs
Existing approaches worked at statement level
TestTube analyses test coverage in terms of
Functions + Global Vars + Types + Macros
First large empirical study of selective
regression testing (30 KLOC)
Y.-F. Chen, D.S. Rosenblum and K.-P. Vo, “TestTube: A System for Selective Regression
Testing”, Proc. 16th Int’l Conf. on Software Engineering, Sorrento, Italy, May 1994,
pp. 211–220.
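The coarse-grained idea can be sketched as a set intersection: each test is associated with the program entities (functions, global variables, types, macros) it covers, and a test is rerun only if one of those entities changed. This is a simplified illustration, not TestTube's implementation; the test names, entity names, and select_tests helper are all hypothetical.

```python
def select_tests(coverage, changed_entities):
    """Return the tests whose covered entities overlap the changed ones.

    coverage: dict mapping test name -> iterable of entity names it covers
    changed_entities: iterable of entity names modified since the last run
    """
    changed = set(changed_entities)
    return sorted(test for test, entities in coverage.items()
                  if changed & set(entities))


if __name__ == "__main__":
    # Hypothetical coverage data at function/type granularity.
    coverage = {
        "test_parse": {"parse", "Token"},
        "test_eval": {"eval", "parse"},
        "test_print": {"show"},
    }
    print(select_tests(coverage, {"Token"}))
    print(select_tests(coverage, {"parse"}))
```

Working at entity level rather than statement level keeps the bookkeeping small, at the price named on the slide: a change to one statement in parse selects every test that touches parse, whether or not it executes that statement.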
15. Technique #4
Distribution
Intuition: Build or analyse larger systems
by partitioning the solution
Enhance with replication, decentralisation,
localisation, multicasting, …
Cost: Increased complexity of solution
Example: SIENA (1996– )
Internet-scale publish/subscribe network
communication
A. Carzaniga, D.S. Rosenblum and A.L. Wolf, “Design and Evaluation of a Wide-Area
Event Notification Service”, ACM Transactions on Computer Systems, Vol. 19, No. 3,
August 2001, pp. 332–383.
16. Publish/Subscribe
A natural communication style for asynchronous,
real-time, distributed content dissemination
[Diagram: Publishers, Publish/Subscribe Infrastructure, Subscribers]
Subscribers register subscriptions characterizing information interests
Publishers publish notifications with interesting information
Infrastructure determines which subscriptions match which notifications
Infrastructure delivers notifications to matching subscribers
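The roles on this slide can be sketched as a single-process broker. This is a toy illustration only: the Broker class and its methods are hypothetical, and it shows neither SIENA's interface nor its distributed routing.

```python
class Broker:
    """Toy publish/subscribe infrastructure (centralised, in-memory)."""

    def __init__(self):
        self._subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        """Register interest: predicate decides which notifications match."""
        self._subscriptions.append((predicate, callback))

    def publish(self, notification):
        """Match the notification against every subscription and deliver it.

        Returns the number of subscribers the notification was delivered to.
        """
        delivered = 0
        for predicate, callback in self._subscriptions:
            if predicate(notification):
                callback(notification)
                delivered += 1
        return delivered


if __name__ == "__main__":
    broker = Broker()
    inbox = []
    broker.subscribe(lambda n: n.get("symbol") == "LU" and n.get("price", 0) > 50,
                     inbox.append)
    broker.publish({"symbol": "LU", "price": 60})   # delivered
    broker.publish({"symbol": "LU", "price": 40})   # not delivered
    print(inbox)
```

Publishers and subscribers never refer to each other; only the broker sees both sides. The linear scan in publish is the matching step whose cost grows with the number of subscriptions.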
18. Technique #5
Approximation
Intuition: Handle larger problem sizes by forgoing
exact results
Cost: False positives and/or false negatives
Example: PreCache (2001–2003)
Conservative approximate matching algorithms for
improved performance in publish/subscribe networking
O(1) and O(log k) matching time against k subscriptions
Was scalability achieved?
Approximation increases message traffic
Increased traffic requires increased matching
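A minimal sketch of what "conservative approximate matching" can mean, assuming subscriptions are integer ranges on a single attribute. It is not PreCache's algorithm; the BucketIndex class and its bucketing scheme are assumptions made for illustration. Each range is recorded in every coarse bucket it overlaps, so a lookup is one dictionary access and returns a superset of the exact matches: false positives are possible, false negatives are not.

```python
from collections import defaultdict


class BucketIndex:
    """Coarse bucket index over integer range subscriptions."""

    def __init__(self, width):
        self.width = width
        self.buckets = defaultdict(set)  # bucket number -> subscription ids
        self.ranges = {}                 # subscription id -> (low, high)

    def subscribe(self, sub_id, low, high):
        """Register interest in values v with low <= v <= high."""
        self.ranges[sub_id] = (low, high)
        for b in range(low // self.width, high // self.width + 1):
            self.buckets[b].add(sub_id)

    def approx_match(self, value):
        """One bucket lookup; may include subscriptions that do not match."""
        return set(self.buckets.get(value // self.width, ()))

    def exact_match(self, value):
        """Reference answer: scan all k subscriptions."""
        return {s for s, (lo, hi) in self.ranges.items() if lo <= value <= hi}


if __name__ == "__main__":
    index = BucketIndex(width=10)
    index.subscribe("a", 12, 17)
    index.subscribe("b", 5, 25)
    print(sorted(index.approx_match(19)))  # includes "a", a false positive
    print(sorted(index.exact_match(19)))
```

Every false positive is a notification forwarded to a subscriber that did not want it, which is the extra message traffic the slide asks about.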
19. Other Techniques
6. Compositionality
o Intuition: Subdivide problem,
manipulate pieces, compose results
o Different from Distribution
7. Scale with Moore’s Law
o Intuition: Hope that physics saves you
o This has worked for regression testing
21. Scalability Questions
How can one demonstrate the ability of a
system like 5ESS to scale before trying to
implement it and deploy it on thousands of
computers across the world?
What information about a software system
enables prediction of its scalability?
What design characteristics increase or
decrease (or contribute to or detract from)
scalability?
22. Scalability Engineering
A principled basis for
Choosing and applying scalability-enabling
techniques
Evaluating scalability of software system
designs and implementations
Choosing scalable engineering methods
Comparing scalability of different
designs/implementations/methods
24. Conclusion
Scalability is an increasingly important
issue for software systems engineering
Everybody talks about it
Yet we lack well-defined, applicable,
measurable principles of scalability
Scalability engineering should become a part
of every software system engineering effort