The Importance of Scalability
  o Gartner predictions for 2008
      o Moore’s Law continues to hold
      o Desktop PC: 4–8 CPUs @ 40GHz, 4–12GB RAM, 1.5TB storage, 100Gb network
      o Desktop PCs < 50% of end-user devices
      o Bandwidth more cost-effective than computing
  o But only if software systems can scale!
Some Notions of Scalability (1)
  o “Scalability is a key requirement for the corporate content infrastructure, … [which] needs to be capable of handling high volumes of content as well as of fulfilling high performance requirements.” -- Documentum
  o “The Java 2 Platform, Micro Edition (J2ME) technology from Sun Microsystems, Inc. is used by developers to scale Java technology-based applications into small consumer and embedded devices.” -- Sun Microsystems
Some Notions of Scalability (2)
  o “[A Session Announcement Protocol] announcement is multicast with the same scope as the session it is announcing … [This is] important for the scalability of the protocol, as it keeps local session announcements local.” -- Mark Handley et al., RFC 2974
  o “Scalability: the ease with which a system or component can be modified to fit the problem area.” -- Software Engineering Institute
What Is Scalability?
  o Is It High Performance?
      o Computations/messages/transactions per second
      o Fixed-size and fixed-time parallel speedup
  o Is It Computational Complexity?
      o Time and space complexity of algorithms
  o Is It Abstraction?
      o Example: programmer productivity versus expressive power of programming languages
  o It is a characterisation of resource consumption as a function of problem size
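The closing definition on this slide, resource consumption as a function of problem size, can be made concrete with a small sketch. The two cost models below (`linear_scan_steps` and `pairwise_comparisons`) are hypothetical stand-ins for a real system's measured resource use, and the log-log slope is only one simple way to summarise how consumption grows.

```python
import math

def linear_scan_steps(n):
    """Hypothetical cost model: one step per item (e.g. a single pass)."""
    return n

def pairwise_comparisons(n):
    """Hypothetical cost model: comparisons in a naive all-pairs check."""
    return n * (n - 1) // 2

def growth_exponent(cost, sizes):
    """Estimate b in cost(n) ~ a * n**b from the smallest and largest size."""
    n0, n1 = sizes[0], sizes[-1]
    return math.log(cost(n1) / cost(n0)) / math.log(n1 / n0)

sizes = [1_000, 10_000, 100_000]
for name, cost in [("linear scan", linear_scan_steps),
                   ("all pairs", pairwise_comparisons)]:
    # Print the raw consumption at each size and the estimated exponent.
    print(name, [cost(n) for n in sizes], round(growth_exponent(cost, sizes), 2))
```

An exponent near 1 means consumption grows in proportion to problem size, while an exponent near 2 means a tenfold larger problem costs roughly a hundred times as much.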
Scalability in My Own Research
  o Continual interest in problems of engineering large-scale software systems
      o Scalable software infrastructures
      o Scalable software development tools
  o Many techniques used to achieve scalability
      o Each has an associated cost
      o Each was chosen serendipitously
Technique #1: Abstraction
  o Intuition: Analyse larger systems by ignoring the “nonessential detail”
  o Cost: Abstraction can hide useful information and introduce inaccuracies
  o Example: 5ESS Build Process Studies (1991–1994)
A.L. Wolf and D.S. Rosenblum, “A Study in Software Process Data Capture and Analysis”, Proc. 2nd Int’l Conf. on the Software Process, Berlin, Germany, Feb. 1993, pp. 115–124.
The 5ESS® Switching System Software
  o Primary central office switch product of Lucent (formerly AT&T)
  o By the numbers (c. 1994)
      o 5–7 million lines of code in 50+ subsystems
      o 2000 developers
      o New version built every 4 weeks
      o 2 hours downtime per year (“The Good”)
      o Except on 15 January 1990 (“The Ugly”)
  o “It’s not just a program--it’s a field of study!”
The 5ESS Build Process
  o Nominally required 1 week to compile new version of software into executable form
  o Frequently took 2–3 weeks (“The Bad”)
      o Simple syntax and semantic errors required frequent restart of build
  o What are the “problem subsystems”?
  o Events from build tools give very accurate abstract characterisation of process
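As a rough illustration of how build-tool events can answer the “problem subsystems” question, the sketch below counts restarts per subsystem. The event format, the subsystem names and the records themselves are invented for illustration only; they are not data or tooling from the 5ESS studies.

```python
from collections import Counter

# Hypothetical event records: (subsystem, event kind). All names are made up.
events = [
    ("subsys_a", "compile_ok"),
    ("subsys_b", "syntax_error"),
    ("subsys_b", "restart"),
    ("subsys_c", "compile_ok"),
    ("subsys_b", "semantic_error"),
    ("subsys_b", "restart"),
    ("subsys_c", "syntax_error"),
    ("subsys_c", "restart"),
]

def restarts_by_subsystem(events):
    """Abstract the event stream down to a restart count per subsystem."""
    return Counter(subsystem for subsystem, kind in events if kind == "restart")

# Subsystems ordered from most to fewest build restarts.
ranking = restarts_by_subsystem(events).most_common()
print(ranking)
```

The abstraction discards everything except the event kind and its subsystem, which is what keeps the analysis tractable and also what it costs in detail.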
Technique #2: Execution
  o Intuition: Observing individual executions is easier than analysing whole programs
  o Cost: Forgo results about all executions
  o Example: Runtime assertion checking
      o ANNA (ANNotated Ada) (1983–1988)
          o Attain benefits of systematic specification without difficulties and costs of formal proofs of correctness
      o APP (1988–1996)
          o Assertions detected 80% of faults in case study
          o Assertion diagnostics quickly isolated faults
D.S. Rosenblum, “A Practical Approach to Programming with Assertions”, IEEE Transactions on Software Engineering, Vol. 21, No. 1, Jan. 1995, pp. 19–31.
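A minimal sketch of runtime assertion checking follows. It is written in Python rather than Ada or C and does not reproduce ANNA or APP syntax; the `contract` decorator and the `largest` function are hypothetical names chosen for illustration. The specification is checked on each execution instead of being proved for all executions.

```python
import functools

def contract(pre=None, post=None):
    """Attach a precondition and postcondition that are checked on every call."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError(
                    f"precondition of {fn.__name__} violated: args={args!r}")
            result = fn(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise AssertionError(
                    f"postcondition of {fn.__name__} violated: result={result!r}")
            return result
        return checked
    return wrap

@contract(pre=lambda xs: len(xs) > 0,
          post=lambda r, xs: r in xs and all(r >= x for x in xs))
def largest(xs):
    """Return the largest element of a non-empty list."""
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best
```

Calling `largest([])` fails the precondition immediately, with a diagnostic naming the violated assertion, which mirrors the slide's point about assertion diagnostics isolating faults. Nothing is learned about inputs that are never executed.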
Technique #3: Coarse-Grained Analysis
  o Intuition: Variant of abstraction
  o Cost: Reduced precision
  o Example: TestTube (1991–2001)
      o Selective regression testing of C programs
      o Existing approaches worked at statement level
      o TestTube analyses test coverage in terms of Functions + Global Vars + Types + Macros
      o First large empirical study of selective regression testing (30 KLOC)
Y.-F. Chen, D.S. Rosenblum and K.-P. Vo, “TestTube: A System for Selective Regression Testing”, Proc. 16th Int’l Conf. on Software Engineering, Sorrento, Italy, May 1994, pp. 211–220.
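The selection idea can be sketched as a set intersection: rerun a test only if it covers an entity that changed. This is an illustration of coarse-grained selection in general, not TestTube's implementation; the `select_tests` function, the test names and the coverage map are all hypothetical.

```python
def select_tests(coverage, changed):
    """Return the tests whose covered entities overlap the changed entities.

    coverage maps a test name to the set of coarse-grained entities
    (functions, global variables, types, macros) that the test exercises.
    """
    changed = set(changed)
    return sorted(test for test, entities in coverage.items()
                  if entities & changed)

# Hypothetical coverage data for three tests of a small C program.
coverage = {
    "test_parse": {"parse", "Token", "MAX_LEN"},
    "test_eval": {"eval_expr", "parse", "symtab"},
    "test_print": {"print_tree"},
}

# Suppose a change touched the global `symtab` and the macro `MAX_LEN`.
selected = select_tests(coverage, {"symtab", "MAX_LEN"})
print(selected)
```

Working at entity level keeps the coverage data small, at the price of reduced precision: a test is selected even when the changed statement inside a covered function is one it never reaches.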
Technique #4: Distribution
  o Intuition: Build or analyse larger systems by partitioning the solution
      o Enhance with replication, decentralisation, localisation, multicasting, …
  o Cost: Increased complexity of solution
  o Example: SIENA (1996– )
      o Internet-scale publish/subscribe network communication
A. Carzaniga, D.S. Rosenblum and A.L. Wolf, “Design and Evaluation of a Wide-Area Event Notification Service”, ACM Transactions on Computer Systems, Vol. 19, No. 3, August 2001, pp. 332–383.
Publish/Subscribe
  o A natural communication style for asynchronous, real-time, distributed content dissemination
  o [Diagram: Publishers, Publish/Subscribe Infrastructure, Subscribers]
      o Publishers publish notifications with interesting information
      o Subscribers register subscriptions characterizing information interests
      o Infrastructure determines which subscriptions match which notifications
      o Infrastructure delivers notifications to matching subscribers
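The three roles on this slide can be sketched as a single in-process broker. This is a toy under stated assumptions, not SIENA's API or architecture: the `Broker` class, its method names and the attribute-predicate form of a subscription are all hypothetical.

```python
class Broker:
    """Toy publish/subscribe infrastructure running in one process."""

    def __init__(self):
        self._subscriptions = []  # list of (constraints, callback) pairs

    def subscribe(self, constraints, callback):
        """Register interest: constraints maps attribute name -> predicate."""
        self._subscriptions.append((constraints, callback))

    def publish(self, notification):
        """Deliver a notification to every matching subscription.

        Returns the number of deliveries made.
        """
        delivered = 0
        for constraints, callback in self._subscriptions:
            if all(attr in notification and test(notification[attr])
                   for attr, test in constraints.items()):
                callback(notification)
                delivered += 1
        return delivered

broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe({"symbol": lambda s: s == "XYZ",
                  "price": lambda p: p > 100}, inbox_a.append)
broker.subscribe({"symbol": lambda s: s == "ABC"}, inbox_b.append)

first = broker.publish({"symbol": "XYZ", "price": 120})
second = broker.publish({"symbol": "XYZ", "price": 90})
```

Publishers and subscribers never refer to each other directly; only the broker sees both sides. Matching here scans every subscription on every publish, which is the cost a wide-area service has to reduce.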
Technique #5: Approximation
  o Intuition: Handle larger problem sizes by forgoing exact results
  o Cost: False positives and/or false negatives
  o Example: PreCache (2001–2003)
      o Conservative approximate matching algorithms for improved performance in publish/subscribe networking
      o O(1) and O(log k) matching time against k subscriptions
  o Was scalability achieved?
      o Approximation increases message traffic
      o Increased traffic requires increased matching
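One way to picture conservative approximate matching is a fixed-size bit summary of the subscribed topics, in the style of a Bloom filter. This is a swapped-in technique used purely for illustration; the slide does not describe PreCache's algorithms, and the `TopicSummary` class and its parameters are hypothetical. The check costs the same regardless of how many subscriptions were added, and it may report false positives but never false negatives.

```python
import hashlib

class TopicSummary:
    """Fixed-size summary of subscribed topics (Bloom-filter style)."""

    def __init__(self, bits=64, hashes=2):
        self.bits = bits
        self.hashes = hashes
        self.mask = 0  # bit i is set if some added topic hashed to position i

    def _positions(self, topic):
        # Derive `hashes` bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{topic}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, topic):
        for pos in self._positions(topic):
            self.mask |= 1 << pos

    def may_match(self, topic):
        """True if the topic might be subscribed; False means definitely not."""
        return all((self.mask >> pos) & 1 for pos in self._positions(topic))

summary = TopicSummary()
for topic in ["sports", "weather", "finance"]:
    summary.add(topic)
```

Every added topic is guaranteed to pass `may_match`, so no interested subscriber is missed. A topic that was never added can also pass when its bit positions collide with others, and each such false positive becomes an extra forwarded message, which is the traffic cost raised under "Was scalability achieved?".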
Other Techniques
6. Compositionality
  o Intuition: Subdivide problem, manipulate pieces, compose results
  o Different from Distribution
7. Scale with Moore’s Law
  o Intuition: Hope that physics saves you
  o This has worked for regression testing
Scalability Questions
  o How can one demonstrate the ability of a system like 5ESS to scale before trying to implement it and deploy it on thousands of computers across the world?
  o What information about a software system enables prediction of its scalability?
  o What design characteristics increase or decrease (or contribute to or detract from) scalability?
Scalability Engineering
  o A principled basis for
      o Choosing and applying scalability-enabling techniques
      o Evaluating scalability of software system designs and implementations
      o Choosing scalable engineering methods
      o Comparing scalability of different designs/implementations/methods
Conclusion
  o Scalability is an increasingly important issue for software systems engineering
      o Everybody talks about it
      o Yet we lack well-defined, applicable, measurable principles of scalability
  o Scalability engineering should become a part of every software system engineering effort