Why would experienced pilots shut down the wrong engine?
On the 8th of January 1989, a Boeing 737-400 (registration G-OBME) was on a routine flight from London Heathrow to Belfast. The aircraft was almost new, having been in service for just 2 months. No major work had been carried out since delivery.
A fault developed in one of the two engines. In response, the crew shut down that engine. Or so they believed.
Shortly afterwards, the crew realised their mistake. They had shut down the good engine, not the engine that was faulty. With 118 passengers and eight crew on board, this aircraft had become a huge glider, and with no power, the aircraft crash-landed on the M1 motorway (freeway) close to Kegworth village. Since then, this has become known as the Kegworth disaster.
The pilots were diverting to East Midlands airport, but crashed short of the runway. The aircraft hit one side of the motorway, remaining relatively intact, before coming to a standstill on the embankment on the other side of the road, miraculously missing all vehicles. The aircraft split into three main sections.
In total, 47 people died. Many more suffered horrific injuries; in fact, almost all of the remaining 79 people were seriously injured. The majority of passengers were trapped due to injury, seat failure or debris from overhead, and it took 8 hours to evacuate all passengers. Both pilots were trapped, but survived.
The investigation report focussed on the actions of the two crew members. After all, they shut down the wrong engine! The engine malfunction wasn’t a disaster, but the failure of the crew to respond appropriately to that engine failure led to disaster. The investigation report and the media blamed the two pilots.
At the time of writing, March 2022, this remains the worst aviation incident in the UK. And human factors was central to this disaster.
In the following sections we will explore decision-making, mental models and human error.
Before we explore the human factors issues, let’s briefly recap the incident.
During the climb from Heathrow, a fan blade fractured in the No.1 (left) engine, leading to heavy vibration and the ingress of smoke into the flightdeck. Initially believing that the damage was to the No.2 (right) engine, the pilots throttled that engine back, and the heavy vibration stopped as they did so. The smoke also cleared from the flightdeck. This feedback reinforced the crew’s belief that they had reacted appropriately to the emergency. The crew then fully shut down the No.2 engine and diverted to East Midlands airport.
As they approached East Midlands, the aircraft appeared to be responding normally, albeit with a high level of vibration from the No.1 engine. However, on the descent towards the runway, when power was increased, engine No.1 failed completely, leaving the aircraft with no power. At this point an accident was inevitable. Fuel lines to the engine failed and sparks ignited the fuel.
The crew realised their mistake and unsuccessfully attempted to restart engine No.2.
The Commander made an announcement to passengers: “Prepare for crash landing, prepare for crash landing”. Ten seconds later, the aircraft struck a field on the eastern side of the M1 motorway then crashed on the western embankment of the motorway, approximately 1500 metres from the East Midlands runway.
Of the 118 passengers, 39 died in the accident and a further 8 died later from their injuries.
Had the engine fire not been extinguished so rapidly by the East Midlands airfield fire service, there may have been a much greater loss of life.
Confusion and panic in the sky
The three flight attendants in the rear of the cabin saw evidence of fire from the No.1 (left) engine. Many passengers also reported seeing signs of fire from the left engine. Following a message from the flight service manager that “the passengers are very very panicky”, the Commander informed passengers over the PA system that there had been an issue with the right engine leading to the smoke, and that this engine had been shut down. This puzzled some passengers, who had seen fire from the left engine, but the discrepancy was not relayed to the cabin crew.
The recording from the Cockpit Voice Recorder (CVR) was recovered. The First Officer stated “It’s a fire coming through”, and the Commander was heard to ask “Which one is it?”. The First Officer replied “It’s the le. . . it’s the right one”.
The initial symptoms were severe, but also novel. The symptoms were outside the experience and training of the pilots, and there was no specific procedure or drill to support them.
The “causes” of the accident
The outer panel of a fan blade detached in the No.1 engine (left side of the aircraft). This led to the tip of the blade rubbing – causing smoke and the smell of burning to be passed into the air conditioning system.
The pilots also experienced severe engine-induced vibration on the flight deck.
As a result of these symptoms, they decided to shut down the working No.2 engine (right side of the aircraft). They also disengaged the autopilot and the autothrottle.
During the final approach to land at East Midlands airport, No.1 engine failed completely and the aircraft lost all power.
The investigation states that:
“In the event, both pilots reacted to the emergency before they had any positive evidence of which engine was operating abnormally. Their incorrect diagnosis of the problem, must, therefore, be attributed to their too rapid reaction and not to any failure of the engine instrument system to display the correct indications” (p.98)
But there must be more to this incident than human error?
After the Kegworth disaster, in June 1989, there were two further instances of fan blade fracture on engines fitted to Boeing 737-400 aircraft. In both cases, the failed engine was identified correctly, and the aircraft landed without incident. Although the high vibration indicators were used to assess the situation, the crew in one of these events reported that the small size of the instruments and their manner of presenting information made them difficult to read.
Human factors analysis
We know from these similar events that a failure of one engine on this type of aircraft does not necessarily lead to disaster. The Kegworth investigation – and reports in the media – appear to point to a failure of the pilots to respond correctly to the engine problem.
But the investigation showed that a fatigue fracture of a fan blade in the No.1 engine (left side of the aircraft) was caused by the blade being exposed to vibration stress greater than that for which it was designed. The vibration stress was not detected during engine testing. So, before we examine the decisions and actions of the pilots, let’s look back at the engine design and testing.
Given that two other aircraft had suffered similar engine failures, the fan blade fractures were examined and highlighted a generic problem affecting this engine. In all three cases, the fan blade had fractured across the same blade section, at similar flight conditions (engine speed and altitude). All three engines had completed a similar number of flight cycles.
Vibration testing during engine development included running the engines at physical speeds much higher than would be experienced in flight. However, these airworthiness tests were undertaken on a test-bed.
Although the engine was run through all anticipated speeds in order to check for vibration stresses, when testing on the ground the airflow through the engine is different to that experienced at high altitude. No indication of vibration issues was found during this testing, and so no tests were performed in an altitude test cell or in flight.
“Had a similar flight test been performed during the certification testing, the manufacturer and certificating authorities would have become aware of the vibration mode and the engine could not have been considered acceptable for introduction into service before the characteristic had been eliminated” (p.120)
During the investigation, a flight test confirmed the vibration and that it could produce stress levels approaching the endurance limit. It was concluded that only by testing in the real operating environment could issues be revealed.
The crew reaction to the engine problem
The Commander disengaged autopilot eight seconds after the first indication of an issue, and therefore his attention would have been focussed on handling the aircraft. The Commander’s decision to shut down the good right-hand engine may have been influenced by his assumption that the First Officer had seen positive indications from the instruments.
The official investigation pulls no punches when commenting on the speed with which the pilots acted. This is reported to be contrary to training and the Operations Manual. The investigation states:
“If they had taken more time to study the engine instruments it should have been apparent that the No.2 engine indications were normal and that the No.1 engine was behaving erratically” (p.98)
However, the Operations Manual contained no guidance for pilots on the action to take in the event that vibration and smoke/fumes occurred together. Boeing was requested to amend the flight manual to indicate actions to be taken in these circumstances.
A key factor in pilot decision-making would be their experience of training in a flight simulator. During such training, virtually all engine problems result in an engine shutdown. In a real emergency such as this, it is not surprising that their actions were those that they had practised in a flight simulator.
The Commander stated that he judged the No.2 engine to be at fault based on his knowledge of the air conditioning system. He thought that the smoke and fumes were coming forward from the passenger cabin, and as the air for the cabin came from the No.2 engine, the fault must be with that engine. However, the Commander’s mental model of the aircraft air conditioning system was faulty. On this new aircraft, some of the air for the cabin comes from the No.1 engine.
And yet, when the throttle to engine No.2 was retarded, the vibration symptoms appeared to go away. This reinforced the pilots’ belief that they had taken the correct action and that the problem was with engine No.2.
Unfortunately, the problematic engine No.1 ceased to surge when engine No.2 was throttled back. The reduction in the vibrations felt in the cockpit confirmed the crew’s faulty mental model of what was happening. Unknown to the pilots, the earlier disconnection of the autothrottle led to the reduction in engine No.1 vibration when engine No.2 was throttled back.
After shutting down the No.2 engine, the Commander decided to land at the nearest suitable airfield. The decision to divert to East Midlands airport was considered appropriate by the investigation.
The fact that this airport was nearby might appear to be a positive factor. But it required the crew to immediately make preparations to land. This increased their workload and prevented them from reconsidering the nature of the emergency or the actions that had been taken. The high workload was considered to contribute to the failure of the crew to notice the high vibration readings.
The Commander did not re-engage the autopilot. His attempt to review events was interrupted by communications with Air Traffic Control and the emergency fire vehicles.
The First Officer tried unsuccessfully to programme the flight management system for a diversion to East Midlands. Reprogramming the system to land at a new destination whilst mid-flight is unusual and rarely (if ever) practised.
Restarting engine No.2
The Commander instructed the First Officer to restart the No.2 engine 50 seconds before the crash. This was attempted, but not successfully. The cockpit procedures were only suitable for the restart of a No.1 engine. An attempt to restart the No.2 engine would have required some improvisation.
The crew did not attempt a double engine failure restart, which would have required them to improvise and accomplish an unlisted procedure. It is doubtful that engine No.2 could have been made functional in the time available.
Training and competence
The Boeing 737 Series 400 aircraft were relatively new. The cockpit was very different to that of previous aircraft and was equipped with an Engine Instrument System (EIS). This system provided engine-related parameters using an LED display, replacing the previous electro-mechanical instruments.
However, the flight simulators used for training were not yet equipped with these new displays. The first time that a pilot would encounter abnormal indications on the Engine Instrument System would be in-flight on an aircraft with a failing engine. This is far from ideal. The crew of G-OBME were interpreting novel readings, under the worst possible conditions.
As the complexity of aircraft has increased, and technical systems become increasingly fail-safe, pilot technical training seems to be based on the need-to-know principle. But this assumes that all technical failures can be anticipated. The investigation states:
“In this accident, the pilots were suddenly presented with an unforeseen combination of symptoms that was outside their training or experience” (p.110)
Pilot training might be considered to address three areas:
- Skills for handling an aircraft, or flying skills
- Understanding of procedures (e.g., how to deal with a specific event)
- How to deal with unexpected and/or poorly defined problems.
At the time of this disaster, technical training in the industry focussed on the first two of these aspects.
But if every event cannot be anticipated, it is the third type of training, Decision Making, that may be key. The balance between technical competence, and training in decision making, was inappropriate.
The crew had very different experience levels. The Commander had been a Captain with the company for 14 years. The First Officer had six months’ experience. This mismatch in experience appears to have had no impact on their coordination. They worked together as a team throughout and decisions were all accepted jointly.
The First Officer was the handling (or flying) pilot until the emergency occurred. The Commander was the non-handling pilot up to this point, and was responsible for monitoring instrumentation.
At the emergency, the Commander disengaged the autopilot and took control of the aircraft. The First Officer would have been focussed on handling the aircraft until then. Rapidly moving into the monitoring role may have influenced his identification of the wrong engine.
The lack of coordination between the flight deck and the cabin was noted by the investigation:
“It must be stated that had some initiative been taken by one or more of the cabin crew who had seen the distress of the left engine, this accident could have been prevented” (p.106)
The pilots were unaware of the sparks and flames from the No.1 engine, observed by many passengers and three of the cabin crew.
It is unfortunate that information from passengers or cabin crew was not communicated to the flight deck. However, at the time of the accident, training by airlines did not address coordination between cabin and flight crew in emergencies. The cabin crew would also have been aware that intrusion into the flight deck during busy phases of flight, or in an emergency, may be distracting. The incident led to a review of training in Non-Technical Skills.
Engine vibration gauges
The design of these gauges is key to the disaster.
When engine No.2 was throttled back, the vibration in the cockpit reduced. And yet the vibration indicator for engine No.1 continued to show a high reading.
The Commander stated that he rarely scanned the engine vibration gauges, as he believed them to be unreliable and prone to giving spurious readings. In previous models of aircraft, the vibration indicators were widely known to be unreliable. They could, for example, indicate high vibration from sources other than the engines that they were monitoring.
However, the vibration monitoring system fitted to this aircraft, the Boeing 737 Series 400, was more reliable and only displayed vibration signals from the rotating assemblies within the engines. The investigation notes that the Commander’s training on the 737 did not draw his attention to the much-improved performance of the newer vibration monitoring system. Neither had the Commander practised an emergency in which the vibration indications were used as a cue in fault diagnosis.
The official investigation states that the pilots should have been aware of a Bulletin issued a year prior, which introduced a procedure for managing high engine vibration. But this Bulletin only “implicitly drew attention to the vibration indicators”.
The vibration indicators on this model of aircraft were designed as part of a ‘glass cockpit’. Computer-generated displays had replaced the mechanical instruments. The LED pointers of the new vibration indicators were smaller and much less conspicuous than a mechanical pointer. At maximum vibration, the pointer was even less conspicuous because it sat close to the engine oil digital display. This oil display is more prominent, and is the same colour as the vibration pointer.
There is some recognition in the investigation that by not following the principles of good human factors engineering, the design of these gauges set the pilots up to fail:
“. . . there is a possibility that the methods of displaying information on these instruments may have contributed to the error” (p.101)
The investigation recommended that the engine instrument system be modified to include “an attention-getting facility to draw attention to each vibration indicator when it indicates maximum vibration” (p.101).
The “incorrect response” of the flight crew is emphasised in the investigation.
The overall conclusion was that:
“The cause of the accident was that the operating crew shut down the No.2 engine after a fan blade had fractured in the No.1 engine. This engine subsequently suffered a major thrust loss due to secondary fan damage after power had been increased during the final approach to land” (p.148)
I agree that the flight crew failed in their diagnosis and decision-making.
However, wider organisational issues were at play. The investigation makes reference to many of these wider issues within the main body of the report. And yet it’s unfortunate that the conclusion focusses entirely on the pilots. This focus has led to the media, the general public and history remembering this disaster as being the result of “pilot error”.
Let’s apply the substitution test: in the same situation, would two different pilots have made the same mistake?
The failure to fully understand and share the reasons why the pilots acted the way that they did is perhaps the largest error in this disaster.
Report on the accident to Boeing 737-400 G-OBME near Kegworth, Leicestershire on 8 January 1989, Aircraft Accident Report 4/90, Air Accidents Investigation Branch, Department of Transport, London, HMSO. Published 25 August 1990.
For a discussion of human error, blame and investigations, see this article.
For an introduction to Non-Technical Skills (or “Crew Resource Management”) see this page.
For an introduction to human factors in design (“Human Factors Engineering”) and mental models, see this page.