Reviving the Phoenix: Dealing with Failed or Failing Projects Tom Rethard Lecturer Department of Computer Science and Engineering The University of Texas at Arlington
Agenda Introduction 4 Case Studies Case Study #1 - Mainframe System Software Case Study #2 - Mission Critical Asset Scheduling Software Case Study #3 - Tax Software Case Study #4 - Mission Critical Financial Analysis and Planning Sales Tool Symptoms of a project in trouble Causes Cures Basic Recovery Process Resources Questions?
The University of Texas at Arlington Also known as “The Stealth University” Approximately 25,000 students College of Engineering: 4,000 Computer Science and Engineering: 1,000 (600 undergrad, 400 grad) BSCS, BCCSE, BSSE, MSCS, MSCSE, MSSE, MCS, PhD Accredited by ABET and CSAB
Introduction 60% of our (software) projects fail! Failure means: Never delivered or cancelled Delivered significantly late Delivered significantly over budget Delivered with significantly reduced functionality
Caveat Please note that the parties involved in these case studies are not explicitly identified herein. To protect the innocent To protect the guilty To avoid general embarrassment of anyone connected to them Please also note that you may be able to identify the parties involved
Case Study #1   Mainframe System Software Application COTS Application to Monitor and Control MVS Storage Subsystems Customer Storage Peripheral Manufacturer, for sale to MVS Licensees with Large Storage Subsystems Developer Developed In House
Initial Situation Prototype product developed in Australia “Artificial Intelligence” (i.e., rule-based system) running on a PC Performed rudimentary analysis of mainframe SMF and Logrec data Too slow to be practical Resulted from brainstorming session between the Aussie branch and the CS department at a major Aussie University
Initial Situation (cont’d) Marketing proposed an integrated product: Higher performance Mainframe-based Common code framework “Add-on” functions (“Several”) A year after initial conception, there was still no coherent set of individual product ideas, much less requirements Development team had been assembled and was trained
Realizing It’s in Trouble No requirements after a year No sign of agreements on requirements Lots of finger-pointing: Marketing vs. Development Increasing doubts about feasibility of the few ideas floating about Increasing pressure to start coding (!)
Fixing the Problem All but two members of development staff (myself and one other on a three-month contract) “temporarily” reassigned Two remaining members built a simulation model of the MVS storage subsystem (in PROLOG) To better understand behavior To (hopefully) find potential requirements
The Model Worked quite well Was extremely flexible Showed us enough about the hidden behaviors to convince me we shouldn’t be there I began movement to cancel the project
The Battle to Kill It Some stakeholders had more ego than money at stake, so no support VP of development didn’t want to be embarrassed, so no support Marketing wanted the products to help sell hardware, so no support Ever feel like the lone voice in the wilderness?
The Final Outcome A new senior exec with a storage products background at IBM was hired VP of Software Development reported to him One meeting with the new VP was enough to get the project cancelled I was reassigned to outer Slobovia VP of Software Development was moved to “Special Projects”
Case Study #2   Mission Critical Asset Scheduling Software Application “Speed up” Existing Scheduling Modeler Customer A Major Transportation Company Developer In House
Initial Situation Basic scheduling problem is an NP-complete problem “Schedulers” were a small staff of mathematicians Project had been running for 2½ years No documentation 35 developers on staff LOTS of code, not much functionality
Initial Situation Just hired 3rd Project Manager Several key developers were not trusted by the new PM, and vice versa New PM hired me as a consultant architect Original solution (from a statement by the team member who proposed this) We’ll use a massively parallel computer, an object database and object-oriented analysis, design and coding to solve all our problems.
Realizing It’s in Trouble First clue was the PM turnover! Second clue was the lack of tangible progress 3rd PM realized there were problems But VP of Development didn’t – or didn’t want to admit it About half of the developers knew something was wrong, but didn’t know what
Recovery I came in primarily as an architect Placed myself as a buffer between the various leads and the PM Met with PM on a daily basis Intent to build trust, initially through me, within their organization Spent several months interviewing key developers and users
Recovery After 6 months, we had A real, live, agreed upon architectural definition that supported the requirements A parallel computer company that went out of business before delivering, but after being paid An object database company that showed no signs of delivering
Recovery But we also had A study done with the users on the performance problem: 90% of elapsed time was human wait time, and eliminating it would effectively yield a 10x speedup (NB: at a cost of $0) A VP of Development who was still in denial One developer (the original proposer) who continued to sabotage efforts to move the project forward
The Final Outcome The 3rd Project Manager was fired two weeks after I left The saboteur became the 4th Project Manager Scrapped the architecture Changed to a vector machine instead of MPP Most team leads left Project eventually scrapped
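The 10x figure on this slide is plain arithmetic: if 90% of elapsed time disappears, the remaining 10% is one tenth of the original, so the job finishes ten times sooner. A minimal Python sketch of that calculation follows. The function name and parameters are hypothetical, and the second call (speeding up only the computation) is an illustrative assumption that is not taken from the study itself.

```python
import math


def overall_speedup(wait_fraction, wait_speedup=1.0, compute_speedup=1.0):
    """Overall speedup when the wait and compute portions of elapsed
    time are each accelerated by their own factor (Amdahl-style)."""
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    # Fraction of the original elapsed time that is left after the change.
    remaining = (wait_fraction / wait_speedup
                 + (1.0 - wait_fraction) / compute_speedup)
    return math.inf if remaining == 0 else 1.0 / remaining


if __name__ == "__main__":
    # Eliminate the human wait time entirely (infinite speedup of that part).
    print(f"{overall_speedup(0.90, wait_speedup=math.inf):.1f}x")
    # Hypothetical contrast: infinitely fast hardware, wait time untouched.
    print(f"{overall_speedup(0.90, compute_speedup=math.inf):.1f}x")
```

Under these assumptions the first call gives roughly 10x, while the second gives only about 1.1x, which is why faster hardware alone could not have addressed the users' performance complaint.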
Case Study #3   Tax Software Application Final Release of a Tax Software Product  (COTS) Customer A Software Company, for Shipment to Existing Customers Developer In House
Initial Situation Product originally designed as a single-user system Users discovered networks, and began to share the database! With the expected result of data corruption Assigned to in-house team (23 people) Intent was to add database sharing and stabilize the product as the last release Had a 6-month schedule
Realizing It’s in Trouble Project Manager and Director No clue until a few weeks before the ship date Never asked why the staff needed to be so large Team lead requested doubling of his staff (to 46) and said the product would be late
Recovery Dump it in Tom’s lap Met with project team Requested all pertinent docs Received a one-page list of 22 “enhancements” Sought the team’s justification of each “enhancement” Bottom line: lots of gold-plating by the team lead All but one enhancement was unjustified
Recovery Removed all but two developers from the project Remaining two (not including the team lead) spent two days analyzing the code in light of the real requirement One day spent with the developers and myself designing the solution Solution was implemented in 5 days
The Final Outcome Product was tested and turned over for shipment two weeks ahead of schedule I made a few enemies…
Case Study #4 Mission Critical Financial Analysis and Planning Sales Tool Application Mission-Critical Sales Tool to Analyze Client Needs and Design a Financial Plan Customer A Financial Services and Planning Company Developer A Major Software Consulting Firm
Initial Situation Original project on fixed-price bid of $3.8M Let to major national consulting firm Had also completed the “Requirements” Went through 3 Project Managers Last PM requested, and got, a 6-month extension on Alpha Test date Extension included bonus to all contractors who stayed through the end date
Initial Situation Final PM produced a status report on Friday before Alpha Test start: “AOK” On Monday of Alpha Test, requested another 6-month extension, and more money Client terminated the relationship Project had generated 550 KLOC of C++ code at time of termination
Realizing It’s in Trouble First observed clue: project was late (the first extension) Second observed clue: the lie on the status report Third observed clue: the request for more money
Immediate First Aid Removed the consulting firm from the premises, but retained some “key” contractors Froze all work and began gathering the ashes Began search for an independent project manager who could “finish” the project
Recovery Client made a good start by stabilizing the project I had two weeks to study the project and make a recommendation to the Board of Directors I was given complete control over the project, including hire/fire of contractors and even permanent staff (from the project)
Interesting Observations Consultant wrote 50 KLOC to produce a generalized report writing function. All “design documents” were done via Rational Rose reverse engineering of code after it was written The “scope control” process noted only a couple dozen requests for changes Client reps to project meetings were from VP level or higher
Interesting Observations There was no usable schedule There was no source control Several “master copies” Most programmers could not see others’ code There was no quality assurance group - or processes There was absolutely no measurement of progress or productivity
Interesting Observations Database Architect had not been allowed to determine the database architecture Programmers told him what they wanted Code revealed that programmers had NO understanding of relational databases Client had been physically excluded from the project area, even though it was on their own property There were not even any project files – just stacks of unorganized paper
Recommendation to the Board Three possible approaches Scrap everything and start over Keep everything and debug the existing code Take the triage approach and minimize our costs (and time), but also cripple long term potentials Board chose the last
Triage Classical Triage, just like an emergency room in a hospital Some of the project could be used as is (very little) A slightly larger portion of existing code could be fixed, although it would be more difficult to support long term Most of the code was useless due to scaffolding of internals and would be scrapped
Recovery Reorganized the staff into multiple teams Teams owned one or more functional areas Included a QA team Empowered the DBA And got him an assistant Directed teams to take detailed inventory
Recovery Established essential processes Change control (PVCS Tracker) NO changes without approval, including programmers Provided the basis for much of the scheduling Source Control (MS SourceSafe) Single master copy of source Schedule Control (MS Project) Reporting
Recovery Opened the area to all Client employees But don’t bother my staff without going through me Maintained a “war room” that contained weekly progress charts emphasizing the growth of  function Directly invited the CEO and all senior executives to examine anything and everything
Recovery Determined appropriate order of development to allow for incremental delivery to QA Fixed and/or created functionality in the prescribed order One “new” subsystem was built with JAD, using a clerk and a very good designer/programmer in record time
Recovery Actual recovery overall Uneventful No real surprises Relatively low turnover of personnel Entire staff remained focused Little overtime necessary
Final Outcome Went to Alpha Test on time (recovery schedule) after about 13 months Product deployed to 12 sites at acceptance, to 180 sites within 6 months First year savings from use of the product indicated a lost savings of $28.8M from the delays
Final Outcome Client eventually sued the original Consulting Firm for the original $3.8M and the $28.8M in lost savings. After several years, the Consulting Firm made a settlement offer that amounted to $5M, which was accepted
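The figures on this slide can be put side by side with a short calculation. This Python sketch uses only the three amounts stated above ($3.8M, $28.8M, and $5M). The variable names are hypothetical, and amounts are held in thousands of dollars to keep the arithmetic exact.

```python
# Amounts from the slide, in thousands of US dollars.
ORIGINAL_BID_K = 3_800      # original fixed-price bid
LOST_SAVINGS_K = 28_800     # first-year savings lost to the delays
SETTLEMENT_K = 5_000        # settlement offer that was accepted

claimed_k = ORIGINAL_BID_K + LOST_SAVINGS_K
recovered_fraction = SETTLEMENT_K / claimed_k

print(f"Total claimed: ${claimed_k / 1000:.1f}M")             # $32.6M
print(f"Recovered by settlement: {recovered_fraction:.1%}")   # 15.3%
```

On these numbers the settlement covered the original bid but only a small part of the lost savings.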
Symptoms of a Project in Trouble Late Delivery (most common) Involuntary Overtime Non-specific Tasks on Schedule Large Tasks on Schedule (over 24 man-hours) No Schedule Complaints About “Scope Creep” High Turnover Rate, Especially Managers Secrecy Adding Resources No Measurements Against Estimates (or No Estimates) No Requirements or Design Documentation No Reviews or Review Results Documented
Symptoms of a Project in Trouble PAIN in general Everybody knows when a project is going sour – most won’t admit to it, except by jumping ship
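A few of the schedule-related symptoms listed above can be checked mechanically. The following Python sketch flags a missing schedule, tasks without estimates, and tasks over the 24 man-hour threshold from the slide. The `Task` type, the function name, and the sample tasks are hypothetical illustrations and do not describe any specific scheduling tool.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_TASK_HOURS = 24  # threshold from the slide: "over 24 man-hours"


@dataclass
class Task:
    name: str
    estimate_hours: Optional[float]  # None means nobody estimated the task


def schedule_symptoms(tasks: List[Task]) -> List[str]:
    """Return warning strings for the schedule symptoms that can be
    detected from task names and estimates alone."""
    if not tasks:
        return ["No schedule"]
    findings = []
    for task in tasks:
        if task.estimate_hours is None:
            findings.append(f"No estimate: {task.name}")
        elif task.estimate_hours > MAX_TASK_HOURS:
            findings.append(
                f"Large task ({task.estimate_hours:g} h): {task.name}")
    return findings


if __name__ == "__main__":
    sample = [
        Task("Design report layout", 16),
        Task("Do database", 200),
        Task("Testing", None),
    ]
    for finding in schedule_symptoms(sample):
        print(finding)
```

Symptoms such as secrecy, involuntary overtime, or manager turnover are not visible in a task list, so a check like this can only supplement the observations a project manager makes in person.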
Causes Too many to enumerate! Often people issues Seldom technology issues! Failure to plan, or failure to follow the plan Lack of risk management Poor communication, both within and without the team Hiding of problems (lies) Failure to use change control processes
Causes Failure to use source control processes “Build it all at one time” approach “Code like hell” approach Unk-Unks Failure to maintain focus Use of unqualified personnel without training them Wishful thinking
Causes Nearly everything you can think of! See McConnell’s book
Cures Only one known: Good project management from day one
Basic Process for Recovery Stop EVERYTHING! No more coding No more designing No more writing Etc. Gather the artifacts  Assess the situation Take a deep breath
Basic Process for Recovery Plan as if it were a new project Give special attention to Communications Change control Risk management Personnel issues Report the bad news FIRST And immediately!
Basic Process for Recovery Be flexible There will be changes in the scope You will find many things that were hidden from you initially Remember that nobody tries to do a bad job, but some need help in learning how to do a good job. And that’s the PM’s job Lead by example
Bottom Line There is no silver bullet The best recovery manager is a good project manager With a good track record Often with a high degree of technical knowledge Stay FOCUSED
How to Get There Experience is the best teacher Education (other people’s experience) is the second best teacher And will help to solidify your own experience Professional organizations (e.g., PMI) Formal Training Reading
Resources Rapid Development – Steve McConnell And anything else he wrote! Griffin-Tate Group: Project Recovery UC Davis: Project Recovery Management University of Texas at Arlington: Project Recovery Techniques Part of our Advanced Project Management Certificate program University of Washington: Managing Project Complexities Part of their Certificate Program in [sic] Senior Project Manager Oklahoma State University/OKC: Project Recovery Management
Certification ACTP: Association of Certified Turnaround Professionals Offer the “ACTP” Certificate – BUT it’s aimed at Corporate turnarounds, not software project turnarounds Others: None found But there are a number of “Advanced Project Management” Certificate programs that are more appropriate
Questions?
