B5 15 - large-scale complex it systems

444 views

Published on

Published in: Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
444
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

B5 15 - large-scale complex it systems

  1. 1. doi:10.1145/ 2209249 . 2 2 0 9 2 6 8 The reductionism behind today’s software- engineering methods breaks down in the face of systems complexity. by Ian Sommerville, Dave Cliff, Radu Calinescu, Justin Keen, Tim Kelly, Marta Kwiatkowska, John McDermid, and Richard PaigeLarge-ScaleComplexIT SystemsOn the afternoon of May 6, 2010, the U.S. equitymarkets experienced an extraordinary upheaval. Overapproximately 10 minutes, the Dow Jones IndustrialAverage dropped more than 600 points, representingthe disappearance of approximately $800 billion ofmarket value. The share price of several blue-chipmultinational companies fluctuated curred, it reversed, so over the next fewdramatically; shares that had been at minutes most of the loss was recoveredtens of dollars plummeted to a penny and share prices returned to levelsin some cases and rocketed to values close to what they had been before theover $100,000 per share in others. As crash.suddenly as this market downturn oc- This event came to be known as the “Flash Crash,” and, in the inquiry re- key insights port published six months later,7 the trigger event was identified as a single Coalitions of systems, in which the system block sale of $4.1 billion of futures con- elements are managed and owned independently, pose challenging new tracts executed with uncommon ur- problems for systems engineering. gency on behalf of a fund-management When the fundamental basis of company. That sale began a complex engineering—reductionism—breaks pattern of interactions between the down, incremental improvements to current engineering techniques are high-frequency algorithmic trading unable to address the challenges of systems (algos) that buy and sell blocks developing, integrating, and deploying of financial instruments on incredibly large-scale complex IT systems. short timescales. Developing complex systems requires A software bug did not cause the a socio-technical perspective involving human, organizational, social, and Flash Crash; rather, the interactions of political factors, as well as technical independently managed software sys- factors. tems created conditions unforeseen ju ly 2 0 1 2 | vo l. 55 | n o. 7 | c om m u n ic at ion s of t he acm 71
  2. 2. contributed articles (probably unforeseeable) by the own- and managerially independent. Char- ers and developers of the trading sys- acterizing SoS, he covered a range of tems. Within seconds, the result was a systems, from directed (developed for failure in the broader socio-technical a particular purpose) to virtual (lack- markets that increasingly rely on the algos (see the sidebar “Socio-Techni- Developers cannot ing a central management authority or centrally agreed purpose). LSCITS is a cal Systems”). Society depends on complex IT sys- analyze inherent type of SoS in which the elements are owned and managed by different orga- tems created by integrating and or- complexity nizations. In this classification, the col- chestrating independently managed systems. The incredible increase in during system lection of systems that led to the Flash Crash (an LSCITS) would be called a scale and complexity in them over the development, “virtual system of systems.” However, past decade means new software-engi- neering techniques are needed to help as it depends since Maier’s article was published in 1998, the word “virtual” has generally us cope with their inherent complexity. on the system’s taken on a different meaning—virtual Here, we explain the principal reasons today’s software-engineering meth- dynamic operating machines; consequently, we propose an alternative term that we find more ods and tools do not scale, proposing environment. descriptive—“coalition of systems.” a research and education agenda to The systems in a coalition of systems help address the inherent problems work together, sometimes reluctantly, of large-scale complex IT systems, or as doing so is in their mutual interest. LSCITS, engineering. Coalitions of systems are not explicitly designed but come into existence as Coalitions of Systems different member systems interact ac- The key characteristic of these systems cording to agreed-upon protocols. Like is that they are assembled from other political coalitions, there might even systems that are independently con- be hostility between various members, trolled and managed. While there is and members enter and leave accord- increasing awareness in the software- ing to their interpretation of their own engineering community of related is- best interests. sues,10 the most relevant background The interacting algos that led to the work comes from systems engineering. Flash Crash represent an example of a Systems engineering focuses on devel- coalition of systems, serving the pur- oping systems as a whole, as defined by poses of their owners and cooperating the International Council for Systems only because they have to. The owners Engineering (http://www.incose.org/): of the individual systems were compet- “Systems engineering integrates all the ing finance companies that were often disciplines and specialty groups into mutually hostile. Each system jealous- a team effort forming a structured de- ly guarded its own information and velopment process that proceeds from could change without consulting any concept to production to operation. other system. Systems engineering considers both Dynamic coalitions of software- the business and the technical needs intensive systems are a challenge for of all customers with the goal of pro- software engineering. Designing de- viding a quality product that meets the pendability into the coalition is not user needs.” possible, as there is no overall design Systems engineering emerged to authority, nor is it possible to central- take a systemwide perspective on com- ly control the behavior of individual plex engineered systems involving systems. The systems in the coalition structures and electrical and mechani- can change unpredictably or be com- cal systems. Almost all systems today pletely replaced, and the organiza- are software-intensive, and systems tions running them might themselves engineers address the challenge of cease to exist. Coalition “design” in- constructing ultra-large-scale software volves the protocols for communica- systems.17 The most relevant aspects of tions, and each organization using systems engineering is work on “sys- the coalition orchestrates the constit- tem of systems,” or SoS,12 about which uent systems its own way. However, Maier said the distinction between the designers and managers of each SoS and complex monolithic systems individual system must consider how is that SoS elements are operationally to make it robust enough to ensure 72 comm unicatio ns o f the ac m | j u ly 201 2 | vo l . 5 5 | no. 7
  3. 3. contributed articlestheir organizations are not threat-ened by failure or undesirable behav-ior elsewhere in the coalition. Socio-Technical SystemsSystem Complexity Engineers are concerned primarily with building technical systems from hardware and software components and assume system requirements reflect the organizationalComplexity stems from the number needs for integration with other systems, compliance, and business processes.and type of relationships between the Yet systems in use are not simply technical systems but “socio-technical systems.”system’s components and between the To reflect the fact they involve evolving and interacting communities that includesystem and its environment. If a rela- technical, human, and organizational elements, they are sometimes also called “socio- technical ecosystems,” though the term socio-technical systems is more common.tively small number of relationships Socio-technical systems include people and processes, as well as technologicalexist between system components and systems. The process definitions outline how the system designers intend the systemthey change relatively slowly over time, should be used, but, in practice, users interpret and adapt them in unpredictable ways,then engineers can develop determin- depending on their education, experience, and culture. Individual and group behavior also depend on organizational rules and regulations, as well as organizational culture,istic models of the system and make or “the way things are done around here.”predictions concerning its properties. Defining technical software-intensive systems intended to support an However, when the elements in a organization’s work in isolation is an oversimplification that hinders software engineering. So-called system requirements represent the interface between thesystem involve many dynamic relation- technical system and the wider socio-technical system, yet requirements are inevitablyships, complexity is inevitable. Com- incomplete, incorrect, and/or out of date. Coalitions of systems cannot operate on thisplex systems are nondeterministic, basis. Rather, engineers must recognize that by taking advantage of the abilities andand system characteristics cannot be inventiveness of people, the systems will be more effective and resilient.predicted by analyzing the system’sconstituents. Such characteristicsemerge when the whole system is put Even when the relationships be- and 1980s, modern software is muchto use and changes over time, depend- tween system elements are simpler, larger, more complex, more reliable,ing how it is used and on the state of its relatively static, and, in principle, un- and often developed more quickly.external environment. derstandable, there may be so many Software products deliver astonishing Dynamic relationships include elements and relationships that un- functionality at relatively low cost.those between system elements and derstanding them is practically impos- Software engineering has focusedthe system’s environment that change. sible. Such complexity is called “epis- on reducing and managing epistemicFor example, a trust relationship is a temic complexity” due to our lack of complexity, so, where inherent com-dynamic relationship; initially, com- knowledge of the system rather than plexity is relatively limited and a singleponent A might not trust component some inherent system characteris- organization controls all system ele-B, so, following some interchange, tics.16 For example, it may be possible ments, software engineering is highlyA checks that B has performed as ex- in principle to deduce the traceability effective. However, for coalitions ofpected. Over time, these checks may relationships between requirements systems with a high degree of inherentbe reduced in scope as A’s trust in B and design, but, if appropriate tools complexity, today’s software engineer-increases. However, some failure in B are not available, doing so may be prac- ing techniques are inadequate.may profoundly influence that trust, tically impossible. This is reflected in the failure that isand, after the failure, even more strin- If you do not know enough about all too common in large government-gent checks might be introduced. a system’s components and their re- funded projects. The software may be Complexity stemming from the lationships, you cannot make predic- delivered late, be more expensive todynamic relationships between ele- tions about overall behavior, even if the develop than anticipated, and inad-ments in a system depends on the ex- system lacks dynamic relationships equate for the needs of its users. Anistence and nature of these relation- between its elements. Epistemic com- example of such a project was the at-ships. Engineers cannot analyze this plexity increases with system size; as tempt, from 2000 to 2010, to automateinherent complexity during system ever-larger systems are built, they are U.K. health records; the project was ul-development, as it depends on the inevitably more difficult to understand timately abandoned at a cost estimatedsystem’s dynamic operating environ- and their behavior and properties at $5 billion–$10 billion.19ment. Coalitions of systems in which more difficult to predict. This distinc- The fundamental reason today’selements are large software systems tion between inherent and epistemic software engineering cannot effec-are always inherently complex. The complexity is important. As discussed tively manage inherent complexity isrelationships between the elements of in the following section, it is the prima- that its basis is in developing individ-the coalition change because they are ry reason new approaches to software ual programs rather than in interact-not independent of how the systems engineering are needed. ing systems. The consequence is thatare used or of the nature of their op- software-engineering methods areerating environments. Consequently, Reductionism and unsuitable for building LSCITS. To ap-the nonfunctional (often even the Software Engineering preciate why, we need to examine thefunctional) behavior of coalitions of In some respects, software engineering essential divide-and-conquer reduc-systems is emergent and impossible to has been incredibly successful. Com- tionist assumption that is the basis ofpredict completely. pared to the systems built in the 1970s all modern engineering. ju ly 2 0 1 2 | vo l. 55 | n o. 7 | c om m u n ic at ion s of t he acm 73
  4. 4. contributed articles Reductionism is a philosophical nical criteria. Decision making in orga- stitute at Carnegie Mellon University position that a complex system is no nizations is profoundly influenced by (http://www.sei.cmu.edu/) on ultra- more than the sum of its parts, and political considerations, with actors large-scale systems (ULSS)13 triggered that an account of the overall system striving to maintain or improve their research leading to creation of the can be reduced to accounts of individ- current positions to avoid losing face. Center for Ultra-Large Scale Software- ual constituents. From an engineering Technical considerations are rarely the Intensive Systems, or ULSSIS (http:// perspective, this means systems engi- most significant factor in large-system ulssis.cs.virginia.edu/ULSSIS), a re- neers must be able to design a system decision making; and search consortium involving the Uni- so it is composed of discrete smaller The problem is definable, and sys- versity of Virginia, Michigan State parts and interfaces allowing the parts tem boundaries are clear. The nature University, Vanderbilt University, to work together. A systems engineer of “wicked problems”15 is that the and the University of California, San then builds the system elements and “problem” is constantly changing, de- Diego. In the U.K., the comparable integrates them to create the desired pending on the perceptions and status LSCITS Initiative addresses problems overall system. of stakeholders. As stakeholder posi- of inherent and epistemic complexity Researchers generally adopt this tions change, the boundaries are like- in LSCITS, while Hillary Sillitto, a se- reductionist assumption, and their wise redefined. nior systems architect at Thales Land work concerns finding better ways to However, for coalitions of systems, Joint Systems U.K., has proposed decompose problems or systems (such these assumptions never hold true, ULSS design principles.17 as software architecture), better ways and many software project “failures,” Northrop et al.13 made the point to create the parts of the system (such where software is delivered late and/ that developing ultra-large-scale sys- as object-oriented techniques), or bet- or over budget, are a consequence of tems needs to go beyond incremental ter ways to do system integration (such adherence to the reductionist view. To improvements to current methods, as test-first development). Underlying help address inherent complexity, soft- identifying seven important research all software-engineering methods and ware engineering must look toward areas: human interaction, computa- techniques (see Figure 1) are three re- the systems, people, and organizations tional emergence, design, computa- ductionist assumptions: that make up a software system’s envi- tional engineering, adaptive system System owners control system devel- ronment. We need to represent, ana- infrastructure, adaptable and predict- opment. A reductionist perspective lyze, model, and simulate potential op- able system quality and policy, and ac- takes the view that an ultimate control- erational environments for coalitions quisition and management. The SEI ler has the authority to take decisions of systems to help us understand and ULSS report suggested it is essential to about a system and is therefore able manage, so far as possible, the com- deploy expertise from a range of disci- to enforce decisions on, say, how com- plex relationships in the coalition. plines to address these challenges. ponents interact. However, when sys- We agree the research required is tems consist of independently owned Challenges interdisciplinary and that incremental and managed elements, there is no Since 2006, initiatives in the U.S. improvement in existing techniques is such owner or controller and no cen- and in Europe have sought to ad- unable to address the long-term soft- tral authority to take or enforce design dress engineering large coalitions of ware-engineering challenges of ultra- decisions; systems. In the U.S., a report by the large-scale systems engineering. How- Decisions are rational, driven by tech- influential Software Engineering In- ever, a weakness of the SEI report was its failure to set out a roadmap outlin- Figure 1. Reductionist assumptions vs. LSCITS reality. ing how large-scale systems engineer- ing can get from where it is today to the research it proposed. Reductionist assumptions Software engineers worldwide cre- ating large complex software systems require more immediate, perhaps Owners of Decisions made Definable problem a system control rationally, driven by and clear system more incremental, research, driven by its development technical criteria boundaries the practical problems of complex IT systems engineering. The pragmatic Control Rationality Problem definition proposals we outline here begin to ad- dress some of them, aiming for medi- Wicked problem and um-term, as well as a longer-term, im- No single owner Decisions driven by constantly renegotiated or controller political motives system boundaries pact on LSCITS engineering. The research topics we propose here might be viewed as part of the roadmap that could lead us from cur- Large-scale complex IT systems reality rent practice to LSCITS engineering. We see them as a bridge between the short- and medium-term imperative to improve our ability to create coalitions 74 com municatio ns o f th e acm | j u ly 201 2 | vo l . 5 5 | no. 7
  5. 5. contributed articlesof systems and the longer-term vision a failure for some users may have no ef-set out in the SEI ULSS report. fect on others. Because some failures Developing coalitions of systems in- are ambiguous, automated systemsvolves engineering individual systems cannot cope on their own. Human op-to work in the orchestration, as well asconfiguration, of a coalition to meet The nonfunctional erators must use information from the system, intervening to enable it to re-organizational needs. Based on theideas in the SEI ULSS report and on our (and, often, the cover from failure. This means under- functional) behavior standing the socio-technical processesown experience in the U.K. LSCITS Ini- of failure recovery, the support thetiative, we have identified 10 questionsthat can help define a research agenda of coalitions operators need, and how to design co- alition members to be “good citizens”for future LSCITS software engineering: of systems is able to support failure recovery. How can interactions between inde-pendent systems be modeled and simu- emergent and How can socio-technical factors be integrated into systems and software-lated? To help understand and manage impossible to engineering methods? Software- andcoalitions of systems LSCITS engineersneed dynamic models that are updated predict completely. systems-engineering methods support development of technical systems andin real time with information from generally consider human, social, andthe system itself. These models are organizational issues to be outsideneeded to help make what-if assess- the system’s boundary. However, suchments of the consequences of system- nontechnical factors significantly af-change options. This requires new fect development, integration, andperformance- and failure-modeling operation of coalitions of systems.techniques where the models adapt au- Though a considerable body of worktomatically due to system-monitoring covers socio-technical systems, it hasdata. We do not suggest simulations not been industrialized or made acces-can be complete or predict all possible sible to practitioners. Baxter and Som-problems. However, other engineering merville2 surveyed this work and pro-disciplines (such as civil and aeronau- posed a route to industrial-scale usetical engineering) have benefited enor- of socio-technical methods. However,mously from simulation, and compa- much more research and experience israble benefits could be achieved for required before socio-technical analy-software engineering. ses are used routinely for complex sys- How can coalitions of systems be mon- tems engineering.itored? And what are the warning signs To what extent can coalitions of sys-problems produce? In the run-up to the tems be self-managing? Needed is re-Flash Crash, no warning signs indicat- search into self-management so sys-ed the market was tending toward an tems are able to detect changes in bothunstable state. To help avoid transition their operation and operational envi-to an unstable system state, systems ronment and dynamically reconfigureengineers need to know the indicators themselves to cope with the changes.that provide information about the The danger is that reconfiguration willstate of the coalition of systems, how create further complexity, so a key re-they may be used to provide both early quirement is for the techniques to op-warnings of system problems, and, if erate in a safe, predictable, auditablenecessary, switch to safe-mode operat- way ensuring self-management doesing conditions that prevent the possi- not conflict with “design for recovery.”bility of damage. How can organizations manage com- How can systems be designed to re- plex, dynamically changing system con-cover from failure? A fundamental prin- figurations? Coalitions of systems willciple of software engineering is that be constructed through orchestrationsystems should be built so they do not and configuration, and desired systemfail, leading to development of meth- configurations will change dynami-ods and tools based on fault avoidance, cally in response to load, indicatorsfault detection, and fault tolerance. of system health, unavailability ofHowever, as coalitions of systems are components, and system-health warn-constructed with independently man- ings. Ways are needed to support con-aged elements and negotiated require- struction by configuration, managingments, avoiding failure is increasingly configuration changes and recordingimpractical. Indeed, what seems to be changes, including automated chang- ju ly 2 0 1 2 | vo l. 55 | n o. 7 | c om m u n ic at ion s of t he acm 75
  6. 6. contributed articles es from the self-management system, sive. For some safety-critical systems, key problem will not be compatibility in real time, so an audit trail includes the cost of certification can exceed the but understanding what the informa- the configuration of the coalition at cost of development, and certification tion exchange really means. This is ad- any point in time. costs will increase as systems become dressed today on a system-by-system How should the agile engineering of larger and more complex. Though cer- basis through negotiation between coalitions of systems be supported? The tification as practiced today is almost system owners to clarify the meaning business environment changes quickly certainly impossible for coalitions of of shared information. However, if in response to economic circumstanc- systems, research is urgently needed dynamic coalitions are allowed, with es, competition, and business reorga- into incremental and evolutionary cer- systems entering and leaving the coali- nization. Likewise, coalitions of sys- tification so our ability to deploy criti- tion, negotiation is not practical. The tems must be able to change quickly to cal complex systems is not curtailed by key is developing a means of sharing reflect current business needs. A model certification requirements. This issue the meaning of information, perhaps of system change that relies on lengthy is social, as well as technical, as societ- through ontologies like those pro- processes of requirements analysis and ies decide what level of certification is posed by Antoniou and van Harmelen1 approval does not work. Agile methods socially and legally acceptable. involving the semantic Web. of programming have been success- How can systems undergo “probabilis- A major problem researchers must ful for small- to medium-size systems tic verification”? Today’s techniques of address is lack of knowledge of what where the dominant activity is systems system testing and more formal analy- happens in real systems. High-profile development. For large complex sys- sis are based on the assumption that a failures (such as the Flash Crash) lead tems, development processes are often system involves a definitive specifica- to inquiries, but more is needed about dominated by coordination activities tion and that behavior deviating from the practical difficulties faced by de- involving multiple stakeholders and it is recognized. Coalitions of systems velopers and operators of coalitions engineers who are not colocated. How have no such specification nor is sys- of systems and how to address them can agile approaches be effective for tem behavior guaranteed to be deter- as they arise. New ideas, tools, and “systems development in the large” to ministic. The key verification issue will methods must be supported by long- support multi-organization global sys- not be whether the system is correct term empirical studies of the systems tems development? but the probability that it satisfies es- and their development processes to How should coalitions of systems sential properties (such as safety) that provide a solid information base for re- be regulated and certified? Many such take into account its probabilistic, real- search and innovation. coalitions represent critical systems, time, nondeterministic behavior.8,11 The U.K. LSCITS Initiative5 address- failure of which could threaten individ- How should shared knowledge in a es some of them, working with partners uals, organizations, and national econ- coalition of systems be represented? We from the computer, financial services, omies. They may have to be certified by assume the systems in a coalition in- and health-care industries to develop regulators checking that, as far as pos- teract through service interfaces so an understanding of the fundamental sible, they do not pose a threat to their the system has no overarching con- systems engineering problems they operators or to the wider systems en- troller. Information is encoded in a face. Key to this work is a long-term en- vironment. But certification is expen- standards-based representation. The gagement with the National Health In- formation Center to create coalitions Figure 2. Outline structure for master’s course in LSCITS. of systems to provide external access to and analysis of vast amounts of health and patient data. The project is developing practical Systems Engineering Business techniques of socio-technical systems Ultra-large-scale Systems Organizational engineering2 and exploring design for systems engineering change failure.18 It has so far developed prac- Complexity Systems Legal and tical, predictable techniques for au- procurement regulatory issues tonomic system management3,4 and Mathematical modeling Systems Technology is investigating the scaling up of agile integration innovation methods14 and exploring incremental Socio-technical systems Systems Program system certification9 and development resilience management of techniques for system simulation and modeling. Education. To address the practical Options from computer science, engineering, psychology, business issues of creating, managing, and op- erating LSCITS, engineers need knowl- Industrial project edge and understanding of the systems and with techniques outside a “nor- mal” software- or systems-engineering education. In the U.K., the LSCITS Ini- tiative provides a new kind of doctoral 76 comm unicatio ns o f the acm | j u ly 201 2 | vo l . 5 5 | no. 7
  7. 7. contributed articlesdegree, comparable in standard to a equate, saying: “For 40 years, we have Regarding the Market Events of May 6th, 2010. Report of the CFTC and SEC to the Joint Advisory CommitteePh.D. in computer science or engi- embraced the traditional engineering on Emerging Regulatory Issues, 2010; http://www.neering. Students get an engineering perspective. The basic premise under- sec.gov/news/studies/2010/marketevents-report.pdf 8. Ge, X., Paige, R.F., and McDermid, J.A. Analyzingdoctorate, or EngD, in LSCITS,20 with lying the research agenda presented in system failure behaviors with PRISM. In Proceedingsthe following key differences between this document is that beyond certain of the Fourth IEEE International Conference on Secure Software Integration and ReliabilityEngD and Ph.D.: complexity thresholds, a traditional Improvement Companion (Singapore, June). IEEE Industrial problems. Students must centralized engineering perspective is Computer Society Press, Los Alamitos, CA, 2010, 130–136.work on and spend significant time no longer adequate nor can it be the 9. Ge, X., Paige, R.F., and McDermid, J.A. An iterativeon an industrial problem. Universities primary means by which ultra-complex approach for development of safety-critical software and safety arguments. In Proceedings of Agile 2010cannot simply replicate the complex- systems are made real.” A key contri- (Orlando, FL, Aug.). IEEE Computer Society Press, Los Alamitos, CA, 2010, 35–43.ity of modern software-intensive sys- bution of our work in LSCITS is articu- 10. Goth, G. Ultralarge systems: Redefining softwaretems, with few faculty members hav- lating the fundamental reasons this engineering. IEEE Software 25, 3 (May 2008), 91–94. 11. Kwiatkowska, M., Norman, G., and Parker D. PRISM:ing experience and understanding of assertion is true. By examining how en- Probabilistic model checking for performance andthe systems; gineering has a basis in the philosophi- reliability analysis. ACM SIGMETRICS Performance Evaluation Review 36, 4 (Oct. 2009), 40–45. Range of courses. Students must take cal notion of reductionism and how 12. Maier, M.W. Architecting principles for system ofa range of courses focusing on com- reductionism breaks down in the face systems. Systems Engineering 1, 4 (Oct. 1998), 267–284.plexity and systems engineering (such of complexity, it is inevitable that tradi- 13. Northrop, L. et al. Ultra-Large-Scale Systems: Theas for LSCITS, socio-technical systems, tional software-engineering methods Software Challenge of the Future. Technical Report. Carnegie Mellon University Software Engineeringhigh-integrity systems engineering, will fail when used to develop LSCITS. Institute, Pittsburgh, PA, 2006; http://www.sei.cmu.empirical methods, and technology in- Current software engineering is simply edu/library/abstracts/books/0978695607.cfm 14. Paige, R.F., Charalambous, R., Ge, X., and Brooke, P.J.novation); and not good enough. We need to think dif- Towards agile development of high-integrity systems. Portfolio of work. Students do not ferently to address the urgent need for In Proceedings of the 27th International Conference on Computer Safety, Reliability, and Security (Newcastle,have to deliver a conventional thesis, a new engineering approaches to help U.K., Sept.) Springer-Verlag, Heidelberg, 2008, 30–43.book on a single topic, but can deliver construct large-scale complex coali- 15. Rittel, H. and Webber, M. Dilemmas in a general theory of planning. Policy Sciences 4 (Oct. 1973), 155–73.a portfolio of work around their select- tions of systems we can trust. 16. Rushby, J. Software verification and system assurance. In Proceedings of the Seventh IEEEed area; it is a better reflection of work International Conference on Software Engineering andin industry and makes it easier for the Acknowledgments Formal Methods (Hanoi, Nov.). IEEE Computer Society Press, Los Alamitos, CA, 2009, 1–9.topic to evolve as systems change and We would like to thank our colleagues 17. Sillitto, H.T. In Proceedings of the 20th Internationalnew research emerges. Gordon Baxter and John Rooksby of St. Council for Systems Engineering International Symposium (Chicago, July). Curran Associates, Inc., However, graduating a few advanced Andrews University in Scotland and Red Hook, NY, 2010.doctoral students is not enough. Uni- Hillary Sillitto of Thales Land Joint 18. Sommerville, I. Designing for Recovery: New Challenges for Large-scale Complex IT Systems.versities and industry must also create Systems U.K. for their constructive Keynote address, Eighth IEEE Conference onmaster’s courses that educate com- comments on drafts of this article. The Composition-Based Software Systems (Madrid, Feb. 2008); http://sites.google.com/site/iansommerville/plex-systems engineers for the coming work report here was partially funded keynote-talks/DesigningForRecovery.pdfdecades; our thoughts on what might by the U.K. Engineering and Physical 19. U.K. Cabinet Office. Programme Assessment Review of the National Programme for IT. Major Projectsbe covered are outlined in Figure 2. Science Research Council (www.epsrc. Authority, London, 2011; http://www.cabinetoffice.The courses must be multidisciplinary, ac.uk) grant EP/F001096/1. gov.uk/resource-library/review-department-health- national-programme-itcombining engineering and business 20. University of York. The LSCITS Engineering Doctoratedisciplines. It is not only the knowl- Centre, York, England, 2009; http://www.cs.york.ac.uk/ References EngD/edge the disciplines bring that is im- 1. Antoniou, G. and van Harmelen, F. A Semantic Web Primer, Second Edition. MIT Press, Cambridge, MA,portant but also that students be sen- 2008. Ian Sommerville (Ian.Sommerville@st-andrews.ac.uk)sitized to the perspectives of a variety 2. Baxter, G. and Sommerville, I. Socio-technical systems: is a professor in the School of Computer Science, St. From design methods to systems engineering. Andrews University, Scotland.of disciplines and so move beyond the Interacting with Computers 23, 1 (Jan. 2011), 4–17. 3. Calinescu, R., Grunske, L., Kwiatkowska, M., Mirandola, Dave Cliff (dc@cs.bris.ac.uk) is a professor in thesilo of single-discipline thinking. R., and Tamburrelli, G. Dynamic QoS management Department of Computer Science, Bristol University, and optimisation in service-based systems. IEEE England. Transactions on Software Engineering 37, 3 (Mar.Conclusion 2011), 387–409. Radu Calinescu (raduc@cs.york.ac.uk) is a lecturer in the Department of Computer Science, Aston University,Since the emergence of widespread 4. Calinescu, R. and Kwiatkowska, M. Using quantitative England. analysis to implement autonomic IT systems. Innetworking in the 1990s, all societies Proceedings of the 31st International Conference Justin Keen (J.Keen@leeds.ac.uk) is a professor in thehave grown increasingly dependent on on Software Engineering (Vancouver, May). IEEE School of Health Informatics, Leeds University, England. Computer Society Press, Los Alamitos, CA, 2009,complex software-intensive systems, 100–110. Tim Kelly (Tim.Kelly@cs.york.ac.uk) is a senior lecturerwith failure having profound social 5. Cliff, D., Calinescu, R., Keen, J., Kelly, T., Kwiatkowska, in the Department of Computer Science, York University, M., McDermid, J., Paige, R., and Sommerville, I. The England.and economic consequences. Indus- U.K. Large-Scale Complex IT Systems Initiative 2010; http://lscits.cs.bris.ac.uk/docs/lscits_overview_2010. Marta Kwiatkowska (Marta.Kwiatkowska@comlab.ox.ac.trial organizations and government uk) is a professor in the Department of Computer Science, pdfagencies alike build these systems 6. Cliff, D. and Northrop, L. The Global Financial Markets: Oxford University England.without understanding how to analyze An Ultra-Large-Scale Systems Perspective. Briefing John McDermid (jam@cs.york.ac.uk) is a professor in paper for the U.K. Government Office for Science the Department of Computer Science, York University,their behavior and without appropriate Foresight Project on the Future of Computer Trading England.engineering principles to support their in the Financial Markets, 2011; http://www.bis.gov. uk/assets/bispartners/foresight/docs/computer- Richard Paige (paige@cs.york.ac.uk) is a professor inconstruction. trading/11-1223-dr4-global-financial-markets- the Department of Computer Science, York University, systems-perspective.pdf England. The SEI ULSS report14 argued that 7. Commodity Futures Trading Commission andcurrent engineering methods are inad- Securities and Exchange Commission (U.S.). Findings © 2012 ACM 0001-0782/12/07 $15.00 ju ly 2 0 1 2 | vo l. 55 | n o. 7 | c om m u n ic at ion s of t he acm 77

×