OUTLINE
A plane crash on 8 January 1989
British Midland Flight 92, flying from Heathrow to Belfast
Crashed by the M1 motorway near Kegworth while attempting an emergency landing at East Midlands Airport
The plane was a Boeing 737-400, a new variant of the Boeing 737, in use by British Midland for less than two months
There were 118 passengers and 8 crew; 47 died and 74 were seriously injured
SEQUENCE OF EVENTS
• The pilots hear a pounding noise and feel vibrations (subsequently found to be caused by a fan blade breaking inside the left engine).
• Smoke enters the cabin, and passengers sitting near the rear of the plane notice flames coming from the left engine.
• The flight is diverted to East Midlands Airport.
• The pilot shuts down the engine on the right.
SEQUENCE OF EVENTS
• The pilots can no longer feel the vibrations, and do not notice that the vibration detector is still reporting a problem. The smoke disperses.
• The pilot informs the passengers and crew that there was a problem with the right engine and that it has been shut down.
• 20 minutes later, on approach to East Midlands Airport, the pilot increases thrust. This causes the left engine to burst into flames and cease operating.
• The pilots try to restart the left engine, but crash short of the runway.
WRONG ENGINE SHUTDOWN. WHY?
Incorrect assumption: The pilots believed the “bleed air” was taken from the right engine, and therefore that the smoke must be coming from the right. This was true of the earlier 737, but not of the 737-400. Psychologists call this a mistake in “knowledge based performance”.
Design issues: The pilots had no view of the engines, so they relied on other information sources to explain the vibrations. The vibration meters were tiny and had a new style of digital display. The vibration sensors were inaccurate on the earlier 737, but not on the 737-400.
Inadequate training: A one day course, and no simulator training.
ERROR NOT TRAPPED. WHY?
Coincidence: The smoke disappeared after the right engine was shut down, and the vibrations lessened. Psychologists call this “confirmation bias”.
Lapse in procedure: After shutting down the right engine, the pilot began checking all meters and reviewing decisions, but stopped after being interrupted by a transmission from the airport asking him to descend to 12,000 ft.
Lack of communication: Some cabin crew and passengers could see that the left engine was on fire, but did not inform the pilot, even when the pilot announced he was shutting down the right engine.
Design issue: The vibration meters would have shown a problem with the left engine, but were too difficult to read. There was no alarm.
VIEWPOINTS
Traditional engineering view
• The crash was caused by an engine failure. Therefore we must design better engines.
Traditional managerial view
• The crash was caused by the pilots. We must hire better pilots.
The socio-technical systems engineering view, or new view
• The crash had no single cause, but involved problems in testing, design, training, teamwork, communications, procedure following, decision making, poor ‘upgrade’ management (and more).
• We need better engines, but we also need to expect problems to happen and to be adequately prepared for them.
THE “NEW VIEW” OF HUMAN ERROR
The old view
• Human error is the cause of accidents
• Systems are inherently safe and people introduce errors
• Bad things happen to bad people
The new view
• Human error is a symptom of trouble deeper inside a system
• Systems are inherently unsafe and people usually keep them running well
• All humans are fallible
THE “NEW VIEW” OF HUMAN ERROR
Is not new! This is just a name; it has been around for 20 years.
Draws the emphasis away from modelling human error, and towards understanding what underlies human actions when operating technology
• How do people get things right?
Argues too much emphasis is placed on “the sharp end”. It argues that error is symptomatic of deeper trouble.
Opposes the “blame culture” that has arisen in many organisations. We are too quick to blame system operators when managers and engineers are at fault.
HUMAN RELIABILITY
Humans don’t just introduce errors into systems, but are often responsible for avoiding and correcting them too.
What do people really do when they are operating a technology?
• Very little human work is driven by a clear and unambiguous set of recipes or processes, even when these are available.
• All human work is situationally contingent. Work must inevitably be more than following a set of steps.
• If people work to rule, accidents can happen. For example, prior to the sinking of the MS Estonia, a crew member did not report a leak because it was not his job.
CORRECT PROCEDURE?
There is not always a ‘correct’ procedure by which to judge any action.
Sometimes trial and error processes are necessary
• In young organisations, best practices may not yet exist.
• New and unusual situations may occur in which a trial and error approach is appropriate.
• Sometimes it is appropriate to play or experiment. This is how innovation often happens.
So deciding when something is an error, and judging whether an error was appropriate to a set of circumstances, can be highly context dependent.
FIELDWORK
Often we don’t notice that people need to do things to keep complex systems running smoothly.
• Fieldwork is an important aspect of understanding how systems are operated and how people work.
STUDYING SUCCESS
It is important to study and understand ordinary work.
We can also learn lessons from “successful failures”, including
• The Apollo 13 mission
• The Airbus A380 engine explosion over Batam Island
• The Sioux City crash
However, accounts of successful failures can turn into a form of hero worship, and organisations that experience these kinds of success against the odds can build a false sense of invulnerability.
PROBLEMS WITH AUTOMATION
As work becomes automated, engineers often make the mistake of automating the aspects that are easy to automate.
• The Fitts list MABA-MABA approach can lead to a dangerous lack of awareness and control for systems operators.
• The “paradox of automation” is that automation creates and requires new forms of labour.
• The major design problem is no longer how to support workflow, but how to support awareness across a system and organisation, and how to support appropriate kinds of intervention.
CREW RESOURCE MANAGEMENT
One approach to improving reliability and reducing human error is crew resource management (CRM)
• Developed in the aviation industry, and now widely used
• Formerly Cockpit Resource Management
CRM promotes
• The effective use of all resources (human, physical, software)
• Teamwork
• Proactive accident prevention
CREW RESOURCE MANAGEMENT
The focus of CRM is upon
• Communication: How to communicate clearly and effectively
• Situational awareness: How to build and maintain an accurate and shared picture of an unfolding situation
• Decision making: How to make appropriate decisions using the available information (and how to make appropriate information available)
• Teamwork: Effective group work, effective leadership, and effective followership
• Removing barriers: How to remove barriers to the above
KEY POINTS
It can be too narrow to focus on human error
• Human errors are usually symptomatic of deeper problems.
• Human reliability is not just about humans not making errors, but about how humans maintain dependability.
We cannot rely on there being correct procedures for every situation. Procedures are important, but we need to support cooperative working.
Design approaches, as well as human and organisational approaches, can be taken to support human reliability.