Why Bad Things Happen to Good Projects
KAREN MACKEY, Lotus Development Corporation
At the start of a distributed software project,
have you ever felt that you could finally “do it
right the first time?” You have all your plans
laid out, a super team, and you’re primed to
use the latest software technologies and
development methodologies. But telling
yourself you’re going to “do it right” can be dangerous: You
are likely to set yourself up for failure because of unforeseen
complications [1]. Furthermore, if you share your optimism
with your managers and they build business schedules around
it, you’re likely to both lose their trust and put your job or
enterprise at risk.
If you’re building a relatively complex system involving multi-
ple computers and multiple users, and if the system entails sig-
nificant innovation - such as new technology or expanded scale
- something will inevitably go wrong. Realizing this might
encourage you to use both design and user-interface
prototypes [2, 3] and the spiral model of development [4] so that you can
look ahead and assess risks as you go. Before you start a major
project, you need to understand what problems can affect it,
even if you use the best available techniques and methods.
IEEE Software, May 1996. 0740-7459/96/$05.00 © 1996 IEEE
This article characterizes two possi-
ble pitfalls: the Quality-Capacity
Syndrome and the Missing-Tools Crisis.
Since both have occurred with some fre-
quency in unrelated software-develop-
ment projects, they appear to be inde-
pendent of individual and management
skills - factors responsible for a large
variance in project success [5]. Likewise,
since the affected projects followed reasonably
good development practices, the
appearance of these problems serves to
underscore that serious problems can
occur even in the best-intentioned
multiuser, distributed-application devel-
opment project.
To examine these two pitfalls, I pasted
together a fictitious development
project called GEMS, Greatest
Electronic Mail Systems. GEMS is a
composite of real projects that experi-
enced the Quality-Capacity Syndrome
and Missing-Tools Crisis. Managers of
the various projects were so sensitive
about their project’s problems that the
only way they would release the infor-
mation was for me to create a composite
project. This sensitivity underscores the
difficulty within our industry of having
open discussions about lessons learned,
while the problems I describe under the
guise of GEMS emphasize the need for
such discussions.
The creators of GEMS wanted to
create a uniform user interface for an
electronic mail service in a heterogeneous
environment comprised of IBM
and Amdahl mainframes and DEC
minicomputers. The existing mail service
was an internal system developed
by the company’s computer services
department. It provided mail service
across the different systems, but on
each system the mail command
behaved differently. Also, because each
system had unique software, it was difficult
to maintain software and add
features. The developers and main-
tainers of the existing system decided
to create a replacement. They were
going to do it again, and they were
going to do it right.
The developers designed the new
system top-down. First they found out
what the users needed, and then they
developed requirements. They worked
from an understanding of the problem
to the design of a solution, rather than
conversely. The developers employed
functional decomposition, carefully
separating the transport facility and the
mail-handling functions. They also
used modularization [6] within the confines
of decomposition to encapsulate
related functionality. The design
decomposed functions in a layered
style, with complex layers built on top
of more rudimentary layers, rising to a
crescendo of sophisticated user capabilities
at the highest layer. Furthermore,
since the existing mail system
was working, the team had time for a
thoughtful design.
The design goal was to maximize
portability between the different sys-
tems. A standard mail interface
buffered the GEMS software from the
idiosyncrasies of different operating
systems. By using a single high-level
language that had compilers on the
various systems, the developers could
port much of the software. The porta-
bility in turn helped improve maintain-
ability, because a bug in the portable
portion of the code had to be fixed
only once.
The developers designed in many
fault-recovery features: Point-to-point
protocols detected errors and initiated
retransmissions; end-to-end protocols
resent mail if network hardware failed
during delivery; the system automati-
cally restarted if the code failed; if the
destination host was down, a queue
held the mail; the mail queue was
crash-resistant and audited at every
restart to protect mail integrity. In
summary, the developers were doing
things right. In fact, they considered
GEMS a really neat project.
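The article does not say how the GEMS mail queue was made crash-resistant or how it was audited. As a minimal sketch of one way such a queue could work, the example below journals each held message with a checksum and, at restart, keeps only the entries that verify. The class name `HoldQueue`, the journal format, and the host names are assumptions made for illustration.

```python
import hashlib
import json
import os
import tempfile

def _checksum(payload: str) -> str:
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class HoldQueue:
    """Append-only journal of undelivered mail (hypothetical format).

    Each line is "<sha256 of payload> <JSON payload>". Appending one whole
    line per message means a crash can damage at most the last, partial line.
    """

    def __init__(self, path):
        self.path = path

    def hold(self, destination, body):
        """Record a message for a destination host that is currently down."""
        payload = json.dumps({"to": destination, "body": body})
        with open(self.path, "a", encoding="utf-8") as journal:
            journal.write(f"{_checksum(payload)} {payload}\n")
            journal.flush()
            os.fsync(journal.fileno())  # push the entry to disk before returning

    def audit(self):
        """Run at restart: return (intact messages, number of damaged lines)."""
        intact, damaged = [], 0
        if not os.path.exists(self.path):
            return intact, damaged
        with open(self.path, encoding="utf-8") as journal:
            for line in journal:
                digest, _, payload = line.rstrip("\n").partition(" ")
                if payload and _checksum(payload) == digest:
                    intact.append(json.loads(payload))
                else:
                    damaged += 1
        return intact, damaged

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "mail.journal")
    queue = HoldQueue(path)
    queue.hold("hostB", "quarterly report")
    with open(path, "a", encoding="utf-8") as journal:
        journal.write("deadbeef {\"to\": \"hostC\"")  # simulate a torn write
    print(HoldQueue(path).audit())
```

Note that every audit pass is itself a recovery action, so a design like this trades some capacity for integrity.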
GEMS was originally deployed on a
three-node pilot system. As problems
arose, the team solved them. In gener-
al, the users were quite happy. Because
the system seemed to be working just
fine, the developers added five more
hosts to the pilot system and made
near-term plans to expand GEMS to
80 to 100 more hosts. That’s when
both the Quality-Capacity Syndrome
and Missing-Tools Crisis struck, and
GEMS took a critical turn for the
worse: The mail wasn’t being delivered
and the system was unresponsive.
Users couldn’t tell whether their indi-
vidual systems were locked up or the
mail software was just taking a long
time. Users were no longer happy, and
the system administrators faced a
major quandary.
QUALITY-CAPACITY SYNDROME
Quality means how well a system is
working. In the GEMS context, it corresponded
to the inverse of the number
of recovery actions per time interval.
Quality problems like software
bugs, hardware failures, and resource
contention would trigger the fault-
recovery features built into GEMS.
Capacity refers to how much work a
system can do, which in this case
meant the number of messages GEMS
delivered per time interval. Measures
taken to increase quality directly
reduced capacity.
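As a rough illustration of these two definitions, the sketch below computes both measures from a log of events. The event names and the log format are assumptions made for this example; the article does not describe how GEMS recorded them.

```python
from collections import Counter

# Hypothetical event kinds. "delivered" counts toward capacity;
# "retransmit" and "restart" are recovery actions, which count against quality.
RECOVERY_KINDS = {"retransmit", "restart"}

def summarize(events):
    """Return {interval: (capacity, quality)} for (interval, kind) events.

    capacity = messages delivered in the interval
    quality  = 1 / recovery actions in the interval, or None when there were
               no recovery actions (the inverse is undefined in that case)
    """
    delivered = Counter()
    recoveries = Counter()
    for interval, kind in events:
        if kind == "delivered":
            delivered[interval] += 1
        elif kind in RECOVERY_KINDS:
            recoveries[interval] += 1
    summary = {}
    for interval in sorted(set(delivered) | set(recoveries)):
        n = recoveries[interval]
        quality = None if n == 0 else 1.0 / n
        summary[interval] = (delivered[interval], quality)
    return summary

if __name__ == "__main__":
    log = [(0, "delivered"), (0, "delivered"), (0, "retransmit"),
           (1, "delivered"), (1, "restart"), (1, "retransmit")]
    print(summarize(log))
```

Tracking both numbers per interval makes the trade-off visible: an interval with many recovery actions should also show fewer deliveries.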
The Quality-Capacity Syndrome
has three symptoms:
+ With a light load, a system performs
well.
+ As system load increases, system
quality falls sharply.
+ Even if the quality problems are
solved, the capacity remains unsatisfactory.
GEMS’s system quality was good at
a low mail volume. As the number of
messages increased, the system had lots
of retransmissions and code restarts.
However, the main problem was that
system capacity fell rapidly as traffic
increased. Message queues backed up
and had to be processed at night.
System capacity seemed to be limited
by the implementation, not just the
bugs.
Causes. Why did GEMS develop the
Quality-Capacity Syndrome? Two
things contributed to the problem: the
design/development methodology and
the pilot project.
GEMS was inherently complex, so
it was essential to use a good
design/development methodology.
The use of modularity, clean interfaces,
common functions, a high-level
language, and a general-purpose operating
system did result in less efficient
code. However, once the designers
understood where the inefficiencies
were, they could improve efficiency
fairly quickly. Poorly designed software
would not have permitted such
easy efficiency improvements.
The GEMS design methods
encouraged functional partitioning and
localization of concerns. Partitioning
allowed the designers to break the pro-
ject into development tasks that could
proceed in parallel. However, because
all developers were assigned a specific
partition, there were no system gener-
alists who understood thoroughly how
the pieces fit together. Without this
perspective, it was very difficult to
debug across interfaces or boundaries.
For example, when a bug surfaced
between the transport mechanism and
the mail-handling layer, the two teams
assigned to those layers wasted a lot of
time chasing the bug back and forth
across the interface. Unfortunately, a
lot of demoralizing blaming and finger-pointing
accompanied the chase.
The project needed someone who
understood the interface from both
sides.
The pilot project also contributed
to the Quality-Capacity Syndrome.
Both prototypes and pilot projects are
important to developing large, com-
plex systems. In general, these are posi-
tive activities. In this case, however, a
demonstration of functionality early on
gave a false impression of progress.
The GEMS pilot succeeded in getting
a message to go between system A and
system B, so the developers broke out
the champagne and moved up the
schedule. The problem was that they
- and even worse, their bosses -
thought the project was further along
than it really was.
The focus of the pilot project was to
demonstrate feasibility and usability,
not test failure legs. With all the fault-recovery
features in GEMS, the latter
was a sizable task. The developers sim-
ply got caught up in the pilot project’s
success and failed to assess accurately
what essential testing they still needed
to do.
Treatment. How do you treat
Quality-Capacity Syndrome? The
GEMS crew applied the Hass Cure,
named after R.J. Hass of AT&T Bell
Laboratories, who first characterized,
named, and treated the syndrome. This
“cure” assumes that the underlying system
architecture is in fact capable of
handling the desired capacity. Without
this guarantee, no amount of work can
overcome the limitations.
The Hass Cure has four steps:
1. Stabilize the system.
2. Separate the quality and capacity
concerns, which work at cross-purposes.
3. Address quality problems.
4. Address capacity problems.
To stabilize the system, you reduce
the load to the level at which the system
runs well, then take baseline measurements.
To improve quality, reduce
and control change so you can identify
the sources of problems. To improve
capacity, seek out changes that will
have a large effect. Neither teams nor
people can focus effectively on both
thrusts at the same time. Thus, you can
either have a single development team
that deals first with quality issues and
then with capacity, or if you have sufficient
staff you can form a group to
address quality and one to address
capacity. In either case, you’ll need systems
generalists to complete a timely
cure.
The initial focus of the cure should
be to improve quality until the system
can handle an acceptable load. For
GEMS, this meant reducing the number
of restarts, rather than raising the
throughput level. Once this was
accomplished, the team could look for
ways to improve capacity. The developers
discovered a traffic pattern in
GEMS in which 70 percent of the
messages passed through one host.
Blocking the text in the message transfers
through this host increased the
transfer rate by two to five times. This
improved the overall system capacity
and successfully cured the Quality-Capacity
Syndrome.
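The article gives no detail on how the text was blocked. One plausible reading is that message text was grouped into larger blocks so that the fixed cost of each transfer was paid less often. The sketch below illustrates that reading with a simple cost model; the function names and all the numbers are assumptions, not measurements from GEMS.

```python
import math

def make_blocks(text, block_size):
    """Split text into consecutive blocks of at most block_size characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]

def transfer_time(total_bytes, block_size, per_transfer_overhead, per_byte_cost):
    """Time to move total_bytes when each transfer carries up to block_size bytes.

    Every transfer pays a fixed overhead (setup, acknowledgment) on top of
    the per-byte cost, so fewer, larger transfers spend less on overhead.
    """
    transfers = math.ceil(total_bytes / block_size)
    return transfers * per_transfer_overhead + total_bytes * per_byte_cost

if __name__ == "__main__":
    # Illustrative units and values only.
    message_bytes = 8000
    small = transfer_time(message_bytes, block_size=80,
                          per_transfer_overhead=40, per_byte_cost=1)
    large = transfer_time(message_bytes, block_size=800,
                          per_transfer_overhead=40, per_byte_cost=1)
    print(f"small blocks: {small}, large blocks: {large}, "
          f"speedup: {small / large:.2f}x")
```

The gain depends entirely on how large the per-transfer overhead is relative to the per-byte cost, which is why baseline measurements matter before choosing such a change.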
MISSING-TOOLS CRISIS
When the Quality-Capacity Syndrome
struck GEMS, the system administrators
faced a major crisis that they
were ill-equipped to handle. Although
the GEMS team had plenty of development,
debugging, and testing tools, their
administration tools were totally inadequate.
In addition to the quality-capacity
crisis, administrators were faced with the
crisis of missing tools.
The Missing-Tools Crisis has four
characteristics:
+ A major problem - such as bugs,
hardware failures, or a full-fledged
Quality-Capacity Syndrome - illuminates
the tool deficit.
+ The system lacks adequate monitoring
and control tools.
+ Administrators lack adequate tools
to change the software in the deployed
system.
+ The existing system administrative
procedures and tools do not scale up
adequately.
In GEMS, the Missing-Tools Crisis
emerged in the wake of the Quality-Capacity
Syndrome. GEMS notified
users of the success or failure of mail
delivery, but offered them no window
to watch their mail progress through
the system. Even system administrators
had no way to monitor what was going
on and had no tools to take corrective
action. If a problem arose, system
restart was the main recourse.
One illustration of the importance
of monitoring and control tools
involved a clever adaptive algorithm
for routing messages around failed
components. When a memory overwrite
confused GEMS, it responded by
looping messages around the network.
This looping was detected not by system
administrators - who had no
tools for this sort of surveillance - but
by frustrated users who never received
delivery notification. To reinitialize
and resynchronize the system, administrators
had to bring down the entire
system and restart it. Monitoring and
control tools could have prevented
such a drastic measure.
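To make the idea of a monitoring and control hook concrete, here is a minimal sketch of a monitor that notices a looping message and gives an operator a manual override for the adaptive routing. It is not how GEMS worked; the `Message` and `RoutingMonitor` classes, the hop limit, and the host names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    msg_id: str
    destination: str
    hops: list = field(default_factory=list)  # hosts visited so far, in order

class RoutingMonitor:
    """Hypothetical monitor that watches hop records and lets an operator intervene."""

    def __init__(self, max_hops=8):
        self.max_hops = max_hops
        self.adaptive_routing_enabled = True  # the manual override switch
        self.alerts = []

    def record_hop(self, message, host):
        """Note that `message` reached `host`; return False if it should be held."""
        revisit = host in message.hops
        message.hops.append(host)
        if revisit or len(message.hops) > self.max_hops:
            # Raise an alert and hold the message for an administrator
            # instead of forwarding it around the network again.
            self.alerts.append((message.msg_id, list(message.hops)))
            return False
        return True

    def disable_adaptive_routing(self):
        """Operator override: fall back to fixed routes until re-enabled."""
        self.adaptive_routing_enabled = False

if __name__ == "__main__":
    monitor = RoutingMonitor()
    msg = Message("m1", destination="hostD")
    for host in ["hostA", "hostB", "hostC", "hostA"]:  # hostA twice: a loop
        if not monitor.record_hop(msg, host):
            monitor.disable_adaptive_routing()
            break
    print(monitor.alerts, monitor.adaptive_routing_enabled)
```

With a hook like this, an administrator sees the loop as an alert and can switch off the adaptive behavior, rather than learning about it from users and restarting the whole system.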
To overcome the Quality-Capacity
Syndrome, the GEMS team needed to
integrate changes into the running software
quickly. Using the standard procedure,
this integration was slow. At one
point, a message with a bad address
slipped through GEMS defenses and
blocked the message queues. It brought
the system to its knees for a week -
despite the fact that the developers
understood the problem and provided a
solution within a day. They needed a
more responsive way to insert changes
into the system.
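The article does not describe the fix that was eventually deployed. As one hedged illustration of a more responsive mechanism, the sketch below shows a queue whose admission checks can be added while it is running, and which sets a bad message aside rather than letting it block the messages behind it. The `MailQueue` class and the deliberately simple address check are assumptions for this example.

```python
from collections import deque

class MailQueue:
    """Hypothetical queue whose admission checks can be changed while it runs."""

    def __init__(self):
        self.pending = deque()
        self.quarantined = []
        self.checks = []  # callables: message dict -> True if acceptable

    def add_check(self, check):
        """Quick change: install a new check without restarting the queue."""
        self.checks.append(check)

    def drain(self, deliver):
        """Deliver every pending message; set aside any that fail a check or raise."""
        while self.pending:
            message = self.pending.popleft()
            try:
                if all(check(message) for check in self.checks):
                    deliver(message)
                else:
                    self.quarantined.append(message)
            except Exception:
                # A bad message must not block the ones queued behind it.
                self.quarantined.append(message)

def has_valid_address(message):
    # Deliberately simplistic stand-in for real address validation.
    local, sep, host = message.get("to", "").partition("@")
    return bool(local and sep and host)

if __name__ == "__main__":
    queue = MailQueue()
    queue.pending.extend([{"to": "ann@hostA"}, {"to": "@@"}, {"to": "bob@hostB"}])
    queue.add_check(has_valid_address)
    sent = []
    queue.drain(sent.append)
    print(sent, queue.quarantined)
```

The quarantine list also gives administrators something to inspect, which is the kind of visibility the debugging-oriented logs did not provide.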
In place of an administrative interface
designed for the new system,
GEMS had different ad hoc tools on
each host system. Even the logged messages
generated by GEMS software
were messages for debugging rather
than management - but they were all
that administrators had to work with.
Furthermore, the administrators managed
the system using one terminal per
host. With the three-node pilot, this
management strategy was possible.
With the eight-node network, the task
became cumbersome. With a projected
addition of 80 to 100 nodes, it would be
impossible. The makeshift administrative
interface simply did not scale up.
Causes. Why did the system develop
the Missing-Tools Crisis? How could
developers have overlooked such critical
needs? Again, the two main contributors
to the problem were the
design/development methodology and
the pilot project. Actually, it was more
the “religion” of the methodology
than the methodology itself: People
did not consider monitoring and control
tools and the need for making corrective
changes because they felt the
system was going to work correctly.
The GEMS designers thought the sys-
tem would be totally automated and
self-correcting; they never foresaw the
need for humans to manage the system.
Also, many system designers had
little or no system-administration
experience. In many companies and
universities, software engineers have
their own personal computers and get
little exposure to distributed-systems
management. They are genuinely
naive - and understandably so. Few
career paths lead through systems
administration into development,
except in small enterprises where the
developer does both. Even worse, a
class distinction sometimes exists
between system administrators and
developers, which inhibits rapport and
sensitivity to the management side of
distributed systems.
Another unfortunate influence on
the design/development methodology
came from the focus infused into the
development process: The user func-
tions justified the funding for the
GEMS project. With this orientation,
support functions got slighted; they
were unimportant until the users grew
more sophisticated and demanded bet-
ter performance.
Yet another influence came from
assumptions about the users. When
GEMS was designed, developers envi-
sioned small, single-page messages
going between users. This was in the
early days of e-mail, before its usefulness
for file transfer was established,
and developers didn’t anticipate a user
sending a 3-Mbyte message and the
impact it would have on the system.
Because they failed to imagine things a
user could do with the system, devel-
opers provided no means to monitor
and control them.
The pilot project also helped bring
on the Missing-Tools Crisis by, again,
giving a false sense of progress. The
pilot project focused on user functions
rather than administration. It did make
some sense not to build elaborate management
tools before the feasibility of
user functions was proven. Also, since
the pilot focused on small-scale feasibility,
management needs were not
obvious. Scaling up the project uncovered
the need for administrative interface
and management tools.
Treatment. Treating the Missing-Tools
Crisis is straightforward:
+ retrofit an interface for monitoring
and control tools,
+ create quick-change tools and
procedures so developers can make
code corrections quickly, and
+ have the developers both use and
manage the system they have built.
GEMS developers had to modify
the system design to accommodate
monitoring and control features, as the
developers of many network systems
- including DECnet, IBM SNA, and
the ISO OSI model - have done
before them. The advantage of incorporating
tools after the system is running
is that developers have a better
idea of what administrative tools the
system needs and how users are likely
to use the system. If the software is
well designed, adding this interface can
be relatively easy. The work the
GEMS designers put into their original
design paid off here: the solid
design allowed them to add the tools
fairly quickly.
Many arguments show how much
more expensive it is to make changes
late in the development process than to
get it right the first time [7]. However,
even though GEMS developers were
extremely careful and followed a good
development methodology, they still
had to make late-stage or post-deployment
code changes. Large, complex
systems can require several rounds of
corrections, so every system should
incorporate quick-change tools.
A few years ago, a former student
of mine went on a job interview. At
that time, the academic community
had just fully embraced structured
coding as a useful software-engineering
technique. Wanting to make sure
that the company he might work for
was forward-looking and used up-to-date
software practices, the student
asked the interviewer if they did structured
coding. The interviewer said
yes, they structured their code with a
block of assembly code followed by a
block of reserved memory called a
patch area. If they needed to insert a
change, they could zap out the offend-
ing code, branch down to the correct-
ed code loaded into the reserved
block, then branch back in-line.
When he related the story in class,
we all had a good, self-righteous laugh
at the interviewer’s old-fashioned defin-
ition of structured coding. However, as
I realized later, we missed the usefulness
of this “antiquated” technique for
quickly fixing problems in deployed sys-
tems. We still have much to learn from
solutions out of our pre-structured-
coding and pre-object-oriented past,
even though newer software-engineer-
ing practices might alter their exact
form.
Finally, having developers use what
they build has gained widespread
acceptance in the software-engineer-
ing community as a way of improving
understanding of user needs.
However, system administration often
remains an unexplored perspective.
Managing the system for a period of
time definitely enhanced GEMS
developers’ awareness of the need for
administrative tools.
DOING THINGS BETTER
T o avoid the Quality-Capacity
Syndrome, G. Scott Graham suggest-
ed the following steps in a University
of Santa Cruz seminar:
+ Build a simple analytic perfor-
mance model early in the design
process, even as early as during system
definition, and improve it as the
design progresses.
+ Build a picture of logical resource
use or resource demand. For example,
determine how many disk accesses a
routine might make.
+ Tie logical resource use to physical
resource use. For example, tie the disk
accesses to the actual disk-access time.
+ Use the analytic model to identify
the most-used modules, then optimize
those modules.
+ Conduct a walkthrough explicitly
to review the design for performance.
+ Design into the system an interface
to capture data that measures
quality and capacity.
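The first few steps can be sketched as a very small analytic model. The module names, the per-message demands, and the 30 ms disk-access time below are invented for illustration; a real model would start from the design’s own estimates and be refined as measurements arrive.

```python
# Hypothetical logical resource demand per message, by module.
LOGICAL_DEMAND = {
    "parse_header":  {"disk_accesses": 0, "cpu_ms": 2},
    "queue_message": {"disk_accesses": 3, "cpu_ms": 1},
    "route_lookup":  {"disk_accesses": 1, "cpu_ms": 4},
}

DISK_ACCESS_MS = 30  # assumed physical cost of one disk access

def service_time_ms(demand, disk_access_ms=DISK_ACCESS_MS):
    """Tie logical demand to physical time for one module, per message."""
    return demand["disk_accesses"] * disk_access_ms + demand["cpu_ms"]

def model(demands, disk_access_ms=DISK_ACCESS_MS):
    """Return (per-module times in ms, total ms per message, messages per hour)."""
    times = {name: service_time_ms(d, disk_access_ms)
             for name, d in demands.items()}
    total = sum(times.values())
    per_hour = 3_600_000 / total  # assumes messages are handled one at a time
    return times, total, per_hour

if __name__ == "__main__":
    times, total, per_hour = model(LOGICAL_DEMAND)
    costliest = max(times, key=times.get)
    print(f"{total} ms per message, about {per_hour:.0f} messages/hour")
    print(f"optimize first: {costliest} ({times[costliest]} ms)")
```

Even a model this crude gives the capacity goal a number that can be compared against measurements once the data-capture interface exists.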
Frequently, capacity and performance
goals get shelved during development.
After the system is built, we
push it off a cliff and see if it flies. It
would be better to keep capacity goals
in mind during design/development
and to get performance feedback all
along. Conducting an explicit performance
walkthrough and designing in
appropriate data-collection mechanisms
are tangible activities that will
elevate the design’s performance
aspects. To improve capacity, you can
form a capacity group that follows
behind the implementation group.
Using the system model’s numeric
assumptions and analysis as the starting
point, the group can measure the
actual capacity and improve it.
Finally, an absolute necessity for
avoiding or treating the Quality-
Capacity Syndrome is to assign one or
more developers the role of systems
generalists. You must identify and cul-
tivate systems generalists throughout
the development process.
To avoid the Missing-Tools Crisis,
try the following suggestions:
+ Design interfaces into the system
that support monitoring and control
tools. You don’t have to build all the
envisioned monitoring and control
tools at the onset of the project, but at
least design a control scheme and
build in the appropriate interfaces,
with extra attention to manual over-
rides of clever adaptive algorithms.
+ Have the developers manage the
system prior to deployment. It’s espe-
cially helpful to study the administra-
tion procedures in light of the scale of
the final deployment.
+ Develop the tools and procedures
to support quick changes. A vital on-
line system will undoubtedly need
them.
+ Conduct a systems-administration
walkthrough, and include actual
administrators. Also, bring in the people
who will be using the system so
they can share their perspectives.
The developers of multiuser distributed
systems are particularly
vulnerable to the pitfalls of
Quality-Capacity Syndrome and
Missing-Tools Crisis. What makes
them so deadly is that they tend to
occur together just as the project is
nearing completion, putting the
schedule in jeopardy. The Quality-Capacity
Syndrome teaches us to start
both performance modeling and
model validation early. The Missing-Tools
Crisis teaches us to consider
the administration and management
of the system under development.
Perhaps the best lesson learned
from this experience is that we should
beware of relying too heavily on our
ability to “do it right.” There’s always
a pitfall waiting to educate us.
REFERENCES
1. F.P. Brooks, Jr., The Mythical Man-Month,
Addison-Wesley, Reading, Mass., 1975.
2. L. Bernstein, “Get the Design Right!,” IEEE
Software, Sept. 1993, pp. 61-63.
3. H. Ledgard, Software Engineering Concepts,
Addison-Wesley, Reading, Mass., 1987.
4. B.W. Boehm, “A Spiral Model of Software
Development and Enhancement,” Computer,
May 1988, pp. 61-72.
5. T. DeMarco and T. Lister, Peopleware, Dorset
House, New York, 1987.
6. D.L. Parnas, “On the Criteria to Be Used in
Decomposing Systems into Modules,” Comm.
ACM, Vol. 15, No. 12, Dec. 1972, pp. 1,053-1,058.
7. B.W. Boehm, Software Engineering Economics,
Prentice-Hall, Upper Saddle River, N.J.,
1981.
Karen Mackey is a development manager at Lotus
Development Corporation, a subsidiary of IBM.
Previously, she was a software engineer and manager
at TRW and AT&T Bell Laboratories.
Mackey received a PhD in computer science from
Pennsylvania State University, University Park.
She is a member of the
IEEE Computer Society, ACM, and Silicon Valley
SPIN.
Address questions about this article to Mackey at
I??!, Suwi $’ay, Sunnyvale, CA 94087;
kmackey@best.com.
Xp(Xtreme Programming) presentation
 
Periodic Table of Agile Principles and Practices
Periodic Table of Agile Principles and PracticesPeriodic Table of Agile Principles and Practices
Periodic Table of Agile Principles and Practices
 
3784_Streamlining_the_development_process_with_feature_flighting_and_Azure_cl...
3784_Streamlining_the_development_process_with_feature_flighting_and_Azure_cl...3784_Streamlining_the_development_process_with_feature_flighting_and_Azure_cl...
3784_Streamlining_the_development_process_with_feature_flighting_and_Azure_cl...
 
Week 4 Assignment - Software Development PlanScenario-Your team has be.docx
Week 4 Assignment - Software Development PlanScenario-Your team has be.docxWeek 4 Assignment - Software Development PlanScenario-Your team has be.docx
Week 4 Assignment - Software Development PlanScenario-Your team has be.docx
 
DevOps - Continuous Integration, Continuous Delivery - let's talk
DevOps - Continuous Integration, Continuous Delivery - let's talkDevOps - Continuous Integration, Continuous Delivery - let's talk
DevOps - Continuous Integration, Continuous Delivery - let's talk
 
Software process versus design quality a tug of war - ieee software july 2015
Software process versus design quality   a tug of war - ieee software july 2015Software process versus design quality   a tug of war - ieee software july 2015
Software process versus design quality a tug of war - ieee software july 2015
 
Note on Tool to Measure Complexity
Note on Tool to Measure Complexity Note on Tool to Measure Complexity
Note on Tool to Measure Complexity
 
A research on- Sales force Project- documentation
A research on- Sales force Project- documentationA research on- Sales force Project- documentation
A research on- Sales force Project- documentation
 
chapter-03-Agile view of process.ppt
chapter-03-Agile view of process.pptchapter-03-Agile view of process.ppt
chapter-03-Agile view of process.ppt
 
Tech challenges in a large scale agile project
Tech challenges in a large scale agile projectTech challenges in a large scale agile project
Tech challenges in a large scale agile project
 
Quality Software Development
Quality Software DevelopmentQuality Software Development
Quality Software Development
 
One XP Experience: Introducing Agile (XP) Software Development into a Culture...
One XP Experience: Introducing Agile (XP) Software Development into a Culture...One XP Experience: Introducing Agile (XP) Software Development into a Culture...
One XP Experience: Introducing Agile (XP) Software Development into a Culture...
 
Software process life cycles
Software process life cyclesSoftware process life cycles
Software process life cycles
 
01lifecycles
01lifecycles01lifecycles
01lifecycles
 
Durgesh o level_2nd_part
Durgesh o level_2nd_partDurgesh o level_2nd_part
Durgesh o level_2nd_part
 
Software process model
Software process modelSoftware process model
Software process model
 
How to test LLMs in production.pdf
How to test LLMs in production.pdfHow to test LLMs in production.pdf
How to test LLMs in production.pdf
 
AEM.Design - Project Introduction
AEM.Design - Project IntroductionAEM.Design - Project Introduction
AEM.Design - Project Introduction
 
Software modernization
Software modernizationSoftware modernization
Software modernization
 

Recently uploaded

mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 

Recently uploaded (20)

mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 

4 why bad_things_happen_to_goog_projects

  • 1. Why Bad Things Happen to Good Projects KAREN MACKEY, Lotus Development Corporation At the start of a distributed software project, have you ever felt that you could finally “do it right the first time?” You have all your plans laid out, a super team, and you’re primed to use the latest software technologies and development methodologies. But telling yourself you’re going to “do it right” can be dangerous: You are likely to set yourself up for failure because of unforeseen complications. Furthermore, if you share your optimism with your managers and they build business schedules around it, you’re likely to both lose their trust and put your job or enterprise at risk. If you’re building a relatively complex system involving multiple computers and multiple users, and if the system entails significant innovation - such as new technology or expanded scale - something will inevitably go wrong. Realizing this might encourage you to use both design and user-interface prototypes and the spiral model of development so that you can look ahead and assess risks as you go. Before you start a major project, you need to understand what problems can affect it, even if you use the best available techniques and methods. IEEE SOFTWARE 0740-7459/96/$05.00 © 1996 IEEE
  • 2. This article characterizes two possible pitfalls: the Quality-Capacity Syndrome and the Missing-Tools Crisis. Since both have occurred with some frequency in unrelated software-development projects, they appear to be independent of individual and management skills - factors responsible for a large variance in project success. Likewise, since the affected projects followed reasonably good development practices, the appearance of these problems serves to underscore that serious problems can occur even in the best-intentioned multiuser, distributed-application development project. To examine these two pitfalls, I pasted together a fictitious development project called GEMS, Greatest Electronic Mail Systems. GEMS is a composite of real projects that experienced the Quality-Capacity Syndrome and Missing-Tools Crisis. Managers of the various projects were so sensitive about their project’s problems that the only way they would release the information was for me to create a composite project. This sensitivity underscores the difficulty within our industry of having open discussions about lessons learned, while the problems I describe under the guise of GEMS emphasize the need for such discussions. The creators of GEMS wanted to create a uniform user interface for an electronic mail service in a heterogeneous environment comprised of IBM and Amdahl mainframes and DEC minicomputers. The existing mail service was an internal system developed by the company’s computer services department. It provided mail service across the different systems, but on each system the mail command behaved differently. Also, because each system had unique software, it was difficult to maintain software and add features. The developers and maintainers of the existing system decided to create a replacement. They were going to do it again, and they were going to do it right. The developers designed the new system top-down. 
First they found out what the users needed, and then they developed requirements. They worked from an understanding of the problem to the design of a solution, rather than conversely. The developers employed functional decomposition, carefully separating the transport facility and the mail-handling functions. They also used modularization6 within the confines of decomposition to encapsulate related functionality. The design decomposed functions in a layered style, with complex layers built on top of more rudimentary layers, rising to a crescendo of sophisticated user capabilities at the highest layer. Furthermore, since the existing mail system was working, the team had time for a thoughtful design. The design goal was to maximize portability between the different systems. A standard mail interface buffered the GEMS software from the idiosyncrasies of different operating systems. By using a single high-level language that had compilers on the various systems, the developers could port much of the software. The portability in turn helped improve maintainability, because a bug in the portable portion of the code had to be fixed only once. The developers designed in many fault-recovery features: Point-to-point protocols detected errors and initiated retransmissions; end-to-end protocols resent mail if network hardware failed during delivery; the system automatically restarted if the code failed; if the destination host was down, a queue held the mail; the mail queue was crash-resistant and audited at every restart to protect mail integrity. In summary, the developers were doing things right. In fact, they considered GEMS a really neat project. GEMS was originally deployed on a three-node pilot system. As problems arose, the team solved them. In general, the users were quite happy. 
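Two of the fault-recovery features listed above, retransmission on a failed transfer and a holding queue for mail addressed to a host that is down, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe how GEMS implemented these features, and every name here (`Message`, `MailRelay`, `make_flaky_link`, `max_retries`, `recovery_actions`) is hypothetical. The queue is held in memory, so it does not model the crash-resistant, audited queue the article mentions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: int
    dest: str
    body: str


@dataclass
class MailRelay:
    """Hypothetical relay: retries failed transfers, queues mail for down hosts."""
    hosts_up: set
    max_retries: int = 3
    queue: deque = field(default_factory=deque)
    delivered: list = field(default_factory=list)
    recovery_actions: int = 0  # retransmissions plus queueing events

    def send(self, msg, link_ok):
        # Destination host is down: hold the mail instead of dropping it.
        if msg.dest not in self.hosts_up:
            self.queue.append(msg)
            self.recovery_actions += 1
            return "queued"
        # One initial attempt plus up to max_retries retransmissions.
        for _ in range(1 + self.max_retries):
            if link_ok(msg):
                self.delivered.append(msg.msg_id)
                return "delivered"
            # Each failed attempt costs one recovery action: a
            # retransmission, or queueing after the final attempt.
            self.recovery_actions += 1
        self.queue.append(msg)
        return "queued"

    def host_restored(self, host, link_ok):
        # When a host comes back, drain the held mail through send() again.
        self.hosts_up.add(host)
        pending, self.queue = self.queue, deque()
        for msg in pending:
            self.send(msg, link_ok)


def make_flaky_link(failures):
    """Return a link check that fails the first `failures` calls, then succeeds."""
    remaining = {"n": failures}

    def link_ok(msg):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return False
        return True

    return link_ok


relay = MailRelay(hosts_up={"A", "B"})
status_b = relay.send(Message(1, "B", "hello"), make_flaky_link(2))
status_c = relay.send(Message(2, "C", "hello"), make_flaky_link(0))
relay.host_restored("C", make_flaky_link(0))
```

Counting recovery actions in the relay itself is a design choice of this sketch: every retransmission or queueing event does extra work on behalf of a single message, which is the kind of overhead that stays invisible under a light load.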
Because the system seemed to be working just fine, the developers added five more hosts to the pilot system and made near-term plans to expand GEMS to 80 to 100 more hosts. That’s when both the Quality-Capacity Syndrome and Missing-Tools Crisis struck, and GEMS took a critical turn for the worse: The mail wasn’t being delivered and the system was unresponsive. Users couldn’t tell whether their individual systems were locked up or the mail software was just taking a long time. Users were no longer happy, and the system administrators faced a major quandary. Quality means how well a system is working. In the GEMS context, it corresponded to the inverse of the number of recovery actions per time interval. Quality problems like software bugs, hardware failures, and resource contention would trigger the fault-recovery features built into GEMS. Capacity refers to how much work a system can do, which in this case meant the number of messages GEMS delivered per time interval. Measures taken to increase quality directly reduced capacity. The Quality-Capacity Syndrome has three symptoms: MAY 1996
  • 3. + With a light load, a system performs well. + As system load increases, system quality falls sharply. + Even if the quality problems are solved, the capacity remains unsatisfactory. GEMS’s system quality was good at a low mail volume. As the number of messages increased, the system had lots of retransmissions and code restarts. However, the main problem was that system capacity fell rapidly as traffic increased. Message queues backed up and had to be processed at night. System capacity seemed to be limited by the implementation, not just the bugs. Causes. Why did GEMS develop the Quality-Capacity Syndrome? Two things contributed to the problem: the design/development methodology and the pilot project. GEMS was inherently complex, so it was essential to use a good design/development methodology. The use of modularity, clean interfaces, common functions, a high-level language, and a general-purpose operating system did result in less efficient code. However, once the designers understood where the inefficiencies were, they could improve efficiency fairly quickly. Poorly designed software would not have permitted such easy efficiency improvements. The GEMS design methods encouraged functional partitioning and localization of concerns. Partitioning allowed the designers to break the project into development tasks that could proceed in parallel. However, because all developers were assigned a specific partition, there were no system generalists who understood thoroughly how the pieces fit together. Without this perspective, it was very difficult to debug across interfaces or boundaries. For example, when a bug surfaced between the transport mechanism and the mail-handling layer, the two teams assigned to those layers wasted a lot of time chasing the bug back and forth across the interface. Unfortunately, a lot of demoralizing blaming and finger-pointing accompanied the chase. 
The project needed someone who understood the interface from both sides. The pilot project also contributed to the Quality-Capacity Syndrome. Both prototypes and pilot projects are important to developing large, complex systems. In general, these are positive activities. In this case, however, a demonstration of functionality early on gave a false impression of progress. The GEMS pilot succeeded in getting a message to go between system A and system B, so the developers broke out the champagne and moved up the schedule. The problem was that they - and even worse, their bosses - thought the project was further along than it really was. The focus of the pilot project was to demonstrate feasibility and usability, not test failure legs. With all the fault-recovery features in GEMS, the latter was a sizable task. The developers simply got caught up in the pilot project’s success and failed to assess accurately what essential testing they still needed to do. Treatment. How do you treat Quality-Capacity Syndrome? The GEMS crew applied the Hass Cure, named after R.J. Hass of AT&T Bell Laboratories, who first characterized, named, and treated the syndrome. This “cure” assumes that the underlying system architecture is in fact capable of handling the desired capacity. Without this guarantee, no amount of work can overcome the limitations. The Hass Cure has four steps: 1. Stabilize the system. 2. Separate the quality and capacity concerns, which work at cross-purposes. 3. Address quality problems. 4. Address capacity problems. To stabilize the system, you reduce the load to the level at which the system runs well, then take baseline measurements. To improve quality, reduce and control change so you can identify the sources of problems. To improve capacity, seek out changes that will have a large effect. Neither teams nor people can focus effectively on both thrusts at the same time. Thus, you can either have a single development team that deals first with quality issues and then with capacity, or if you have sufficient staff you can form a group to address quality and one to address capacity. In either case, you’ll need systems generalists to complete a timely cure. The initial focus of the cure should be to improve quality until the system can handle an acceptable load. For GEMS, this meant reducing the number of restarts, rather than raising the throughput level. Once this was accomplished, the team could look for ways to improve capacity. The developers discovered a traffic pattern in GEMS in which 70 percent of the messages passed through one host. Blocking the text in the message transfers through this host increased the transfer rate by two to five times. This improved the overall system capacity and successfully cured the Quality-Capacity Syndrome. MISSING-TOOLS CRISIS When the Quality-Capacity Syndrome struck GEMS, the system
  • 4. administrators faced a major crisis that they were ill-equipped to handle. Although the GEMS team had plenty of development, debugging, and testing tools, their administration tools were totally inadequate. In addition to the quality-capacity crisis, administrators were faced with the crisis of missing tools. The Missing-Tools Crisis has four characteristics: + A major problem - such as bugs, hardware failures, or a full-fledged Quality-Capacity Syndrome - illuminates the tool deficit. + The system lacks adequate monitoring and control tools. + Administrators lack adequate tools to change the software in the deployed system. + The existing system administrative procedures and tools do not scale up adequately. In GEMS, the Missing-Tools Crisis emerged in the wake of the Quality-Capacity Syndrome. GEMS notified users of the success or failure of mail delivery, but offered them no window to watch their mail progress through the system. Even system administrators had no way to monitor what was going on and had no tools to take corrective action. If a problem arose, system restart was the main recourse. One illustration of the importance of monitoring and control tools involved a clever adaptive algorithm for routing messages around failed components. When a memory overwrite confused GEMS, it responded by looping messages around the network. This looping was detected not by system administrators - who had no tools for this sort of surveillance - but by frustrated users who never received delivery notification. To reinitialize and resynchronize the system, administrators had to bring down the entire system and restart it. Monitoring and control tools could have prevented such a drastic measure. To overcome the Quality-Capacity Syndrome, the GEMS team needed to integrate changes into the running software quickly. 
Using the standard procedure, this integration was slow. At one point, a message with a bad address slipped through GEMS defenses and blocked the message queues. It brought the system to its knees for a week - despite the fact that the developers understood the problem and provided a solution within a day. They needed a more responsive way to insert changes into the system. In place of an administrative interface designed for the new system, GEMS had different ad hoc tools on each host system. Even the logged messages generated by GEMS software were messages for debugging rather than management - but they were all that administrators had to work with. Furthermore, the administrators managed the system using one terminal per host. With the three-node pilot, this management strategy was possible. With the eight-node network, the task became cumbersome. With a projected addition of 80 to 100 nodes, it would be impossible. The makeshift administrative interface simply did not scale up. Causes. Why did the system develop the Missing-Tools Crisis? How could developers have overlooked such critical needs? Again, the two main contributors to the problem were the design/development methodology and the pilot project. Actually, it was more the “religion” of the methodology than the methodology itself: People did not consider monitoring and control tools and the need for making corrective changes because they felt the system was going to work correctly. The GEMS designers thought the system would be totally automated and self-correcting; they never foresaw the need for humans to manage the system. Also, many system designers had little or no system-administration experience. In many companies and universities, software engineers have their own personal computers and get little exposure to distributed-systems management. They are genuinely naive - and understandably so. 
Few career paths lead through systems administration into development, except in small enterprises where the developer does both. Even worse, a class distinction sometimes exists between system administrators and developers, which inhibits rapport and sensitivity to the management side of distributed systems. Another unfortunate influence on the design/development methodology came from the focus infused into the development process: The user functions justified the funding for the GEMS project. With this orientation, support functions got slighted; they were unimportant until the users grew more sophisticated and demanded better performance. Yet another influence came from assumptions about the users. When GEMS was designed, developers envisioned small, single-page messages going between users. This was in the early days of e-mail, before its usefulness for file transfer was established, and developers didn’t anticipate a user sending a 3-Mbyte message and the impact it would have on the system. Because they failed to imagine things a user could do with the system, developers provided no means to monitor and control them. The pilot project also helped bring on the Missing-Tools Crisis by, again, giving a false sense of progress. The pilot project focused on user functions rather than administration. It did make
  • 5. some sense not to build elaborate management tools before the feasibility of user functions was proven. Also, since the pilot focused on small-scale feasibility, management needs were not obvious. Scaling up the project uncovered the need for an administrative interface and management tools. Treatment. Treating the Missing-Tools Crisis is straightforward: + retrofit an interface for monitoring and control tools, + create quick-change tools and procedures so developers can make code corrections quickly, and + have the developers both use and manage the system they have built. GEMS developers had to modify the system design to accommodate monitoring and control features, as the developers of many network systems - including DECNet, IBM SNA, and the ISO OSI model - have done before them. The advantage of incorporating tools after the system is running is that developers have a better idea of what administrative tools the system needs and how users are likely to use the system. If the software is well designed, adding this interface can be relatively easy. The work the GEMS designers put into their original design paid off here: The solid design allowed them to add the tools fairly quickly. Many arguments show how much more expensive it is to make changes late in the development process than to get it right the first time. However, even though GEMS developers were extremely careful and followed a good development methodology, they still had to make late-stage or post-deployment code changes. Large, complex systems can require several rounds of corrections, so every system should incorporate quick-change tools. A few years ago, a former student of mine went on a job interview. At that time, the academic community had just fully embraced structured coding as a useful software-engineering technique. 
Wanting to make sure that the company he might work for was forward-looking and used up-to-date software practices, the student asked the interviewer if they did structured coding. The interviewer said yes, they structured their code with a block of assembly code followed by a block of reserved memory called a patch area. If they needed to insert a change, they could zap out the offending code, branch down to the corrected code loaded into the reserved block, then branch back in-line.

When he related the story in class, we all had a good, self-righteous laugh at the interviewer’s old-fashioned definition of structured coding. However, as I realized later, we missed the usefulness of this “antiquated” technique for quickly fixing problems in deployed systems. We still have much to learn from solutions out of our pre-structured-coding and pre-object-oriented past, even though newer software-engineering practices might alter their exact form.

Finally, having developers use what they build has gained widespread acceptance in the software-engineering community as a way of improving understanding of user needs. However, system administration often remains an unexplored perspective. Managing the system for a period of time definitely enhanced GEMS developers’ awareness of the need for administrative tools.

DOING THINGS BETTER

To avoid the Quality-Capacity Syndrome, G. Scott Graham suggested the following steps in a University of Santa Cruz seminar:
+ Build a simple analytic performance model early in the design process, even as early as during system definition, and improve it as the design progresses.
+ Build a picture of logical resource use or resource demand. For example, determine how many disk accesses a routine might make.
+ Tie logical resource use to physical resource use. For example, tie the disk accesses to the actual disk-access time.
+ Use the analytic model to identify the most-used modules, then optimize those modules.
+ Conduct a walkthrough explicitly to review the design for performance.
+ Design into the system an interface to capture data that measures quality and capacity.

Frequently, capacity and performance goals get shelved during development. After the system is built, we push it off a cliff and see if it flies. It would be better to keep capacity goals in mind during design/development and to get performance feedback all along. Conducting an explicit performance walkthrough and designing in appropriate data-collection mechanisms are tangible activities that will elevate the design’s performance aspects.

To improve capacity, you can form a capacity group that follows behind the implementation group. Using the system model’s numeric assumptions and analysis as the starting point, the group can measure the actual capacity and improve it.

Finally, an absolute necessity for avoiding or treating the Quality-Capacity Syndrome is to assign one or more developers the role of systems generalists. You must identify and cultivate systems generalists throughout the development process.

To avoid the Missing-Tools Crisis, try the following suggestions:
+ Design interfaces into the system that support monitoring and control tools. You don’t have to build all the envisioned monitoring and control tools at the onset of the project, but at least design a control scheme and build in the appropriate interfaces, with extra attention to manual overrides of clever adaptive algorithms.
+ Have the developers manage the system prior to deployment. It’s especially helpful to study the administration procedures in light of the scale of the final deployment.
+ Develop the tools and procedures to support quick changes. A vital on-line system will undoubtedly need them.
+ Conduct a systems-administration walkthrough, and include actual administrators. Also, bring in the people who will be using the system so they can share their perspectives.

The developers of multiuser distributed systems are particularly vulnerable to the pitfalls of Quality-Capacity Syndrome and Missing-Tools Crisis. What makes them so deadly is that they tend to occur together just as the project is nearing completion, putting the schedule in jeopardy. The Quality-Capacity Syndrome teaches us to start both performance modeling and model validation early. The Missing-Tools Crisis teaches us to consider the administration and management of the system under development. Perhaps the best lesson learned from this experience is that we should beware of relying too heavily on our ability to “do it right.” There’s always a pitfall waiting to educate us.
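The first four of Graham’s steps (an early analytic model, a picture of logical resource demand, a tie to physical cost, and a ranking of the costliest modules) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the module names, call counts, and the 25-ms disk-access time are invented for the example and are not GEMS measurements, and the capacity estimate assumes a single server that handles messages one at a time.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Logical resource demand of one (hypothetical) module."""
    name: str
    calls_per_message: float       # how often the module runs per message
    disk_accesses_per_call: float  # logical demand: disk accesses
    cpu_ms_per_call: float         # logical demand: CPU time in ms

    def ms_per_message(self, disk_access_ms: float) -> float:
        # Tie logical demand to physical cost: accesses * time per access.
        per_call = self.disk_accesses_per_call * disk_access_ms + self.cpu_ms_per_call
        return self.calls_per_message * per_call


def rank_modules(modules, disk_access_ms):
    """Return (name, ms per message) pairs, most expensive first."""
    costs = [(m.name, m.ms_per_message(disk_access_ms)) for m in modules]
    return sorted(costs, key=lambda c: c[1], reverse=True)


def messages_per_hour(modules, disk_access_ms):
    """Capacity estimate, assuming messages are processed serially."""
    total_ms = sum(m.ms_per_message(disk_access_ms) for m in modules)
    return 3_600_000 / total_ms


# Illustrative assumptions only; replace with measured values as they arrive.
DISK_ACCESS_MS = 25.0
MODULES = [
    Module("parse_header", 1, 0, 2.0),
    Module("lookup_recipient", 3, 2, 1.0),
    Module("store_message", 1, 4, 3.0),
]

if __name__ == "__main__":
    for name, cost in rank_modules(MODULES, DISK_ACCESS_MS):
        print(f"{name}: {cost:.1f} ms per message")
    print(f"capacity: {messages_per_hour(MODULES, DISK_ACCESS_MS):.0f} messages/hour")
```

With these made-up numbers, lookup_recipient dominates at 153 ms per message, so it would be the first candidate for optimization. Replacing the assumed figures with measurements as the design progresses is what lets a capacity group check the model against the running system.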
REFERENCES
1. F.P. Brooks, Jr., The Mythical Man-Month, Addison-Wesley, Reading, Mass., 1975.
2. L. Bernstein, “Get the Design Right!,” IEEE Software, Sept. 1993, pp. 61-63.
3. H. Ledgard, Software Engineering Concepts, Addison-Wesley, Reading, Mass., 1987.
4. B.W. Boehm, “A Spiral Model of Software Development and Enhancement,” Computer, May 1988, pp. 61-72.
5. T. DeMarco and T. Lister, Peopleware, Dorset House, New York, 1987.
6. D.L. Parnas, “On the Criteria to Be Used in Decomposing Systems into Modules,” Comm. ACM, Vol. 15, No. 12, Dec. 1972, pp. 1,053-1,058.
7. B.W. Boehm, Software Engineering Economics, Prentice-Hall, Upper Saddle River, N.J., 1981.

Karen Mackey is a development manager at Lotus Development Corporation, a subsidiary of IBM. Previously, she was a software engineer and manager at TRW and AT&T Bell Laboratories. Mackey received a PhD in computer science from Pennsylvania State University, University Park. She is a member of the IEEE Computer Society, ACM, and Silicon Valley SPIN.

Address questions about this article to Mackey at I??!, Suwi $’ay, Sunnyvale, CA 94087; kmackey@best.com.