Space Shuttle Challenger: 40 years on

What the incident still teaches us about investigations, risk and organisational failure

TL;DR – Key insights

  1. The Challenger disaster is often explained as the result of “pressure to launch”, but this explanation may be too simplistic.
  2. Many major incidents emerge not from a single bad decision, but from normal organisational processes unfolding over time.
  3. Investigation findings are shaped by where investigators choose to focus their attention.
  4. Media reporting of major events may not accurately reflect official findings.
  5. If organisations want to learn from disasters, they must look beyond individual mistakes and examine how decisions are shaped by organisational context.

As the 40th anniversary of the Space Shuttle Challenger disaster approached in January 2026, I watched a documentary that caused me to reflect on how such events are investigated, how they are portrayed in the media and how they are later remembered.

This article is therefore a little different from my other incident reviews. As usual, I outline some of the key facts of the disaster. But I also reflect on what the Challenger investigation can teach us about how we understand incidents and how investigations shape the stories we tell.

The NASA Space Shuttles were reusable spacecraft, and in 1986 there were four in the fleet: Challenger, Columbia, Discovery and Atlantis. Prior to this disaster there had been 24 successful launches – and space travel had come to seem routine. The January 1986 flight was Challenger’s 10th mission. On board was a teacher selected from thousands of applicants, partly to reignite public interest in the shuttle program.

The shuttles were launched into space with the aid of two Solid Rocket Boosters (SRBs), which provided most of the thrust at lift-off. The shuttle’s own main engines, fed with liquid hydrogen and oxygen from the larger external fuel tank, burned throughout the ascent. The SRBs fell away about two minutes into the flight (to be recovered and reused), and the external fuel tank was jettisoned later in the ascent, never to be recovered.

The Space Shuttle system comprised these two SRBs, the external fuel tank and the airplane-like spacecraft (the orbiter).

The main components of the Space Shuttle system (Image: NASA.gov)

The SRBs were built in several large cylindrical segments by Morton Thiokol, a NASA contractor in Utah, and shipped for final assembly at the Kennedy Space Centre (the launch site). In the joints between these segments, two gaskets (the primary and secondary O-rings) were installed to seal any gaps created during ignition, preventing high-pressure hot propellant gases from escaping the SRB.

Space Shuttle Challenger: The disaster

On 28 January 1986, the NASA Space Shuttle Challenger broke apart 73 seconds after launch, killing all seven crew members, including a school teacher. The disaster ended any perception that the shuttle program was operational and that human spaceflight was routine.

Put simply, the technical issue was failure of the O-ring seals in a joint on the right Solid Rocket Booster (SRB). Hot gases breached both the primary and secondary O-rings and ignited, damaging the SRB support structure and leading to rupture of the external liquid fuel tank. The liquid fuel released from this tank ignited, engulfing the shuttle in flames, while huge aerodynamic and structural forces acted on the vehicle.

“In view of the findings, the Commission concluded that the cause of the Challenger accident was the failure of the pressure seal in the aft field joint of the right Solid Rocket Motor”

Presidential Commission Report, Volume I, p.73

There was a long history of O-ring damage on previous flights. In cold launch conditions the O-rings hardened, reducing the effectiveness of this critical seal between segments of the SRB. There were concerns and debate about the effect of low temperatures on the O-rings right up to (and immediately prior to) this launch of Challenger. However, there was no predictable pattern to the O-ring anomalies, and when there was an issue, only a few joints were affected – leading engineers to conclude that it was not a design problem.

The risk was deemed to be “acceptable”, and the issue was seen as something to be corrected, rather than grounds for a radical redesign.
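
As an aside for the technically minded, it is worth dwelling on how a “no predictable pattern” conclusion can be reached. Analyses that focus only on the launches that showed O-ring anomalies can hide a temperature pattern that becomes visible when all launches – including the trouble-free, mostly warm-weather ones – are considered. The short Python sketch below illustrates the effect; the numbers are invented for illustration and are not the real flight data.

```python
# A minimal sketch (invented numbers, NOT the real flight data): how selecting
# only the flights that showed O-ring anomalies can hide a temperature pattern.

# (launch temperature in deg F, number of O-ring anomalies observed)
flights = [
    (53, 3), (57, 1), (58, 1), (63, 1), (66, 0), (67, 0), (67, 0),
    (68, 0), (69, 0), (70, 1), (70, 0), (70, 0), (72, 0), (73, 0),
    (75, 2), (75, 0), (76, 0), (78, 0), (79, 0), (81, 0),
]

def mean(values):
    return sum(values) / len(values)

# View 1: only the flights that showed anomalies. Their temperatures are
# scattered across the whole range, so there is "no predictable pattern".
damaged = [(temp, n) for temp, n in flights if n > 0]
temps = [temp for temp, _ in damaged]
print(f"Flights with anomalies span {min(temps)}-{max(temps)} deg F")

# View 2: all flights, cold vs warm. The contrast only appears once the
# anomaly-free (mostly warm-weather) flights are included in the comparison.
cold = [n for temp, n in flights if temp < 65]
warm = [n for temp, n in flights if temp >= 65]
print(f"Mean anomalies below 65 deg F:    {mean(cold):.2f}")
print(f"Mean anomalies at/above 65 deg F: {mean(warm):.2f}")
```

With these illustrative numbers, the first view supports a “no pattern” conclusion (anomalies occurred from 53 to 75 deg F), while the second shows cold launches averaging far more anomalies than warm ones. The same data, examined two ways, gives two different pictures of risk – which is exactly the point about where investigators (and engineers) choose to look.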

Challenger Shuttle launch January 1986
Investigations showed that escaping gases were seen from the lowest Solid Rocket Booster (SRB) joint at lift-off

Although the direct failure was a technical issue, many contributing factors are documented in the official investigation (the Report of the Presidential Commission) and in the subsequent report of the House Committee on Science and Technology.

Challenger: Pressure to launch

The Presidential Commission Report discusses schedule pressures and a flawed decision-making process. It refers to serious management failures; for example, failing to report O-ring problems up the NASA hierarchy. This combination of production pressures and managerial wrongdoing, widely reported in the media, is how many people remember the event.

It would be easy to assume that these managers took a calculated risk and lost. They were reported to be influenced by pressure to launch a certain number of flights per year – and to launch them on time. “Pressure to launch” became the convenient and accepted explanation of the disaster: pressures that took priority over safety and overrode engineering concerns. With the benefit of investigative hindsight, managerial misconduct appears to be a suitable explanation.

“The decision to launch the Challenger was flawed”.

Presidential Commission Report, Volume I, p.83

Would professional NASA engineers and managers engage in misconduct, in order to further the goals of the organisation? Can the Challenger event be explained by managers intentionally violating rules in order to achieve the organisation’s launch targets?

This “pressure to launch” may not be sufficient to explain the circumstances that led to the Challenger disaster. There are many examples of NASA delaying previous shuttle launches – including for safety concerns, and even when the delay disrupted launch schedules. Furthermore, there were many systems, checks and balances in place prior to a shuttle launch – some of them costly, and adopted for the benefit of safety.

Of course, some decisions may have been made at the expense of safety, but there are also many examples of decisions that enhanced safety, increased cost or delayed launch schedules. A complex organisation such as NASA made thousands of key decisions each year.

The NASA decisions that you choose to examine will influence the conclusions that you reach. In any investigation, the factors on which you focus will influence the outcome.

The official explanation of events, widely reported in the media (and often reframed and heavily summarised), appears to be too simplistic. It’s easy to look back at certain decisions made prior to the disaster and create a narrative that concludes “flight safety was traded for production pressure”.

Essentially – in investigations, you find what you go looking for.

Human error. Again.

Previously, it was not uncommon for serious incidents to be blamed on ‘human error’. We now see this explanation of events less frequently, and the pitfalls of concluding that an incident was due to human error are widely known. Unfortunately, other terms have crept into the safety vocabulary to replace a conclusion of ‘human error’ – misuse of the term ‘situation awareness’ being one of them.

There has been a move away from explaining incidents through the lens of front-line behaviours, and we’re seeing fewer incidents blamed on these individuals – such as a pilot, nurse, driver or control room operator. However, prominent cases that focus on “the last person to touch the equipment” continue; for example, see the article Is human error a crime?.

The more mature incident investigations attempt to uncover why individuals failed, noting perhaps that human performance was influenced by fatigue, distractions or workload. You can read more about these Performance Influencing Factors in the Homer Simpson article. They are much better explanations than simple ‘human error’, as they allow us to make tangible changes to the workplace to prevent similar events.

Performance Influencing Factors (pdf version available)

The CSB’s investigation into the 2005 Texas City explosion is a good example of examining what lies beneath ‘human error’.

More recently, a ‘failure of leadership’ has emerged as a common explanation. However, in many cases this conclusion has merely shifted the blame from a front-line individual to another individual who is higher up the organisational hierarchy. Is this any better than concluding ‘human error’?

In a previous article, I commented on how leaders in the healthcare sector can be set up to fail by unrelenting financial and operational pressures, media attention and increasing complexity (see Leadership – Delivering the Impossible). This focus on leadership failure is having a negative impact on recruitment into senior executive positions in healthcare (particularly in those organisations that would benefit most from experienced leadership).

There is a real danger that focusing on human error or failures of leadership will prevent learning.

Good managers – Poor managers?

In the various terrible events described on my Incidents page, the investigation reports often conclude that managers and leaders failed. They failed to spot certain warning signs, failed to heed warnings, or failed to assess the risks. However, these are the same people who create success each and every day – despite significant challenges and obstacles. Were they good managers until a significant failure occurred that redefined their actions?

If we want to support learning from events, it doesn’t help to focus on a few individuals in an organisation, regardless of their position in the hierarchy – particularly if we don’t try to understand the rich context that influenced their decisions and behaviours.

The better we understand how decisions are made and influenced within an organisation, the more we can support people to make “good” decisions.

Organisational failures

Although focusing attention on an individual who ‘failed’ is neat and tidy from an investigative or legal viewpoint, many incidents (and the systems in which they occur) are just too complex to conclude that an individual or small group of individuals somehow failed. How do such extraordinary sequences of events happen, resulting in the Challenger, Texas City, Piper Alpha, Macondo and many other disasters?

Rather than simply focusing on individuals in organisations who make mistakes, we need to acknowledge that these mistakes are often produced by normal organisational life. The NASA organisation, like many complex organisations, created a way of seeing that was also a way of not seeing.

The Challenger launch decisions may not have been conscious acts of wrongdoing, but more likely the result of people doing what made sense to them at the time. On the ‘inside’ of the organisation, their actions were not seen as wrong or deviant.

Debates around the Challenger launch decision should not be taken to imply that the conditions that led to disaster were unique to this event, or to NASA. Many other well-known incidents and disasters show that organisations proceed as if nothing is wrong when, in fact, a wealth of information shows that something IS wrong.

The failures are often at an organisational level, not at the level of individuals (regardless of whether these individuals are at the sharp end, middle managers, or senior leaders). To really understand these events, we need to examine how organisations work.

The sociological approach

Although my higher education is in psychology, I studied sociology for a few years, and the two subjects complement each other. Human factors is a multi-disciplinary field that draws on both psychology and sociology, so it’s helpful to look at organisations through both lenses. A sociological perspective has been applied to several major disasters, including Longford and Macondo (Hopkins, 2000, 2009) and Challenger (Vaughan, 1996). I’ve written elsewhere about the “Organising for Safety” report (1993), which discusses the importance of understanding how organisations work and how they are structured.

When we look at normal life in large organisations, including NASA, we find that decisions are influenced by the social context. No doubt there were several social factors at play, including pressure to launch. If managers and engineers were influenced by production or schedule pressures, it may have been a subtle influence, rather than a conscious input to their decision making. It would be easy to conclude that rules were violated, or that decisions were made based on schedules; but perhaps instead the production pressures were institutionalised, simply a part of how the organisation “worked”.

If we examine the decision-making process in NASA from a wider viewpoint, we see that NASA slowly and incrementally created a system that enabled poor judgment. In her extensive analysis of the disaster, Diane Vaughan concludes:

“No extraordinary actions by individuals explain what happened: no intentional managerial wrongdoing, no rule violations, no conspiracy. The cause of the disaster was a mistake embedded in the banality of organizational life and facilitated by an environment of scarcity and competition, elite bargaining, uncertain technology, incrementalism, patterns of information, routinization, organizational and interorganizational structures, and a complex culture”

Diane Vaughan, The Challenger Launch Decision (1996, p.xiv)

I’ve written other articles about “normalisation of deviance”, where teams and organisations become conditioned over time to see failures or technical issues as normal. Previous successes, even when known failures are present, reinforce this normalisation. In the case of Challenger, the team accepted increased risk once more, even in discussions on the evening before the disaster.

The declaration by President Reagan on 4 July 1982 that the shuttle was “operational” and ready for routine spaceflight changed the perception of risk for key stakeholders and within NASA.

Risk is socially constructed as much as it is constructed from technical analysis.

The social context within NASA (and the related organisations) can be described as follows:

  • learning happened by doing,
  • ambiguity was a fact of life, and
  • problems were normal.

They were making engineering decisions with unruly technology, in an unforgiving environment. When we review the full data from interviews and documentation, we find work groups that were often cautious, conservative and vigilant – not complacent or overly optimistic.

Decision – what decision?

As a result of the investigations, the decision to launch Challenger on that cold morning in January 1986 has received much attention. The focus has been on who decided that the risk was “acceptable”, whether the launch decision was made with all available information, and the possible schedule pressures that influenced it.

But to understand this event more deeply, we should be interested in other key decisions, such as:

  • Selecting this particular design for the Solid Rocket Boosters, where sections were assembled at the Kennedy Space Centre (so-called ‘field joints’).
  • Selecting O-rings as the solution to sealing the joints.
  • Definitions of acceptable risk, given that the joint/seal design flaws had been identified as early as 1978.

These and other decisions have received a lower profile than the launch decision, and organisations are therefore less likely to understand what can be learned from them. As with any investigation, it’s important to consider a much longer timeline of events – one that starts several years before the disaster.

Understanding the launch decision is key, but the disaster timeline started with design decisions almost a decade earlier.

The magical power of hindsight

In the course of a detailed investigation, with significant resources applied over a period of months or years, the issue of the SRB joints seems clear and well-defined. Looking back at the decisions that contributed to the disaster, it’s not difficult to piece together a narrative that “explains” the Challenger disaster. This is not unique – tragedy always leads to a re-examination, and sometimes a re-interpretation, of data, methods and decisions.

With the benefit of hindsight, we might question why certain decisions were made; an adverse consequence always seems more obvious after the event. But at the time, this narrative of events may not have been so neat and well-structured for those within NASA making these decisions – decisions made in a very different time and place from the investigation. Although they were looking at the same information as the investigation team, the NASA workforce were seeing it incrementally, one flight at a time.

These people were doing what made sense to them at the time, with the information that they had, based on their prior experiences, with their biases and assumptions, and with the various social and cultural forces at play.

Unintentionally, as they attempt to reconstruct what happened, incident investigations can create an account of behaviours and decisions that differs greatly from what was experienced by those involved at the time. Without reference to the social, cultural and organisational context, these behaviours and decisions may seem inappropriate to us after the event.

How investigations shape the story

Let’s take a concrete example: a few weeks before the Challenger launch, the O-ring erosion problems were “closed out” in the computer tracking system (known as MPAS). This was seen by investigators as a deviant action – perhaps an attempt to hide problems from senior managers.

But closing out the O-ring erosion issue in MPAS did not mean that the problem was being ignored – on the contrary, there were two relevant Task Forces, daily reviews, and ongoing research into the issue. Those closely involved knew that they were not actually “closing out” the issue; they were simply removing it from one of many tracking systems (and thus reducing their administrative workload).

This action, therefore, was not seen as “deviant” by those engineers at the time the decision was made. Without understanding this wider context, it would be easy to conclude that the O-ring problems were not being addressed when, in fact, the opposite was true.

Our understanding of the Challenger event is partly shaped by who the Commission chose to interview, what information they had available, and the lines of questioning taken during the investigation. For example:

  • At the time of the televised hearings, the Commission was unaware of a video showing a briefing to senior NASA administrators about the O-ring issue following a previous cold-weather launch. Evidence of this briefing might have changed the conclusions of Volume 1 of the Report of the Presidential Commission.
  • Of the fifteen engineers who participated in the final teleconference prior to launch, seven were chosen to testify in the televised hearings. These seven engineers had all opposed the launch. If engineers with more diverse views had been chosen to testify, the questions and conclusions might have been different.

How events are reported in the media: TL;DR

In writing about incidents on this website, I often review hundreds of pages of original source documents, as well as the official investigation reports. The challenge here is time – understanding these documents takes hours (or, more realistically, several days!). For example, the Challenger Presidential Commission Report runs to five volumes, including many detailed appendices. These volumes were themselves produced from a significantly larger set of documents. And the Presidential Commission is just one of several detailed investigations into Challenger.

When major disasters are reported in the media, the focus will, understandably, be on a selection of the issues. Sometimes, these will be issues that make good headlines and generate more online views (and therefore advertising revenue). Attention-grabbing ‘soundbites’ are great for TV news channels. The general public absorbs some of this selective reporting, which becomes the accepted version of events. There is a danger that such selected issues form the basis of any future learning.

Some issues were reported widely; others were ignored by the media. The Presidential Commission Report received extensive media coverage (at least Volume 1 did), but the subsequent report by the House Committee on Science and Technology (which contradicted some of those findings) didn’t receive the same attention.

My review of the media reporting suggests that the focus on management wrongdoing was stronger in the media than in the Commission Report itself. There were poor decisions made under uncertainty, but the Commission did not conclude that there had been intentional wrongdoing or misconduct.

Given the significant amount of material published in detailed inquiry or investigation reports, it’s not surprising that much of it remains unread by the majority (even by those who specialise in safety and risk!). Hence the acronym TL;DR, which stands for Too Long; Didn’t Read. At the start of a long article, you may see a TL;DR version – essentially a summary of the key points.

In the case of Challenger, emails uncovered in the investigation and the testimony of engineers fed a line of inquiry relating to “pressures and launch decisions”. This featured prominently in the Presidential Commission Report, made good headlines in the media, and thus became the “cause” of the disaster for much of the general public.

Unfortunately, major investigations uncover significant detail and produce overwhelming reports. These are then summarised to make them manageable, and in doing so we often lose context. This is a problem that I face when writing incident summaries for this website: trying to balance detail with readability.

If we are to learn from disasters in complex organisations, there is limited value in addressing only the headlines or high-level summaries.

As an investigator, or the author of an incident report, bear in mind that your key findings (and how you present them) can have a lasting impact on which lessons are learnt and what people remember about the event. This is particularly the case when your readers are time-poor.

Final thoughts

The Space Shuttle program was simply amazing – carrying people into orbit, launching satellites and enabling construction of the International Space Station. However, despite 30 years of firsts and achievements at the frontiers of technology, the program is remembered by many for the loss of the shuttles Challenger in 1986 and Columbia in 2003.

But this disaster did not occur because a few individuals made poor decisions on a cold January morning. It emerged from the normal workings of an organisation: from the way risks are understood, the subtle influences on decisions, and the way problems become accepted as routine.

If our understanding of the social and cultural influences on people in complex organisations fails to keep pace with developments in technology and engineering, then other industries can expect to experience their own Challenger event.

Reflection and discussion questions

  1. How does your organisation define “acceptable risk”? How often are those definitions challenged or revisited?
  2. When incidents occur, what tends to receive the most attention: technical failures, individual mistakes, or organisational factors? What might be overlooked as a result?
  3. What production, performance or schedule pressures exist in your organisation, and how might these subtly influence everyday decision making?
  4. When something goes wrong, how often is the explanation framed around individual error, rather than the conditions in which people were working?
  5. Are there issues in your organisation that were once considered unusual, but over time have become accepted as “just the way things are”?
  6. When reviewing past decisions, do you attempt to understand the context in which those decisions were made?
  7. When investigating incidents, how far back do you look? Are early design, policy or structural decisions given attention?
  8. Does identifying who was responsible end the conversation, or does it prompt deeper questions about how the organisation works?

Related articles

Human factors and Homer Simpson

What is human factors? Do you have difficulty explaining the topic to others? And what value does human factors add?

This case study examines the factors that might influence a control room operator’s behaviour…

Principles of human performance

Human factors applies an understanding of humans in order to create the best possible fit between people, the things they use and the systems in which they find themselves. In order to achieve this, we…

Space Shuttle Columbia

The Columbia space shuttle disaster on February 1, 2003, resulted from a foam strike that compromised its wing’s thermal protection. NASA’s organisational culture, characterized by complacency and inadequate safety assessments, contributed to this tragedy. Despite…

Learning from adverse events

There are nine key principles that organisations can apply to capture the human contribution to adverse events. These principles will help you to apply human factors in the investigation process. They also demonstrate how organisations…

Human error, human performance and investigations

Human error is a central concept in ergonomics and human factors. But what is ‘human error’? Is it helping us to improve safety? The language we use may be preventing us from learning or improving.…

Normalisation of deviance

“Normalisation of deviance” is when deviations from agreed standards or working practices become incorporated into the routine. Small changes, slight deviations from the norm, gradually become the norm.

Here’s some guidance on identifying and managing…

Incident investigations

This topic aims to support anyone who is involved in an investigation to understand the underlying causes of human performance issues. These articles will help to explain why people did what they did – the…

Ten facts about human failure

I find these ten facts about human failure a great way to engage delegates on human factors training courses.

This post also discusses Performance Influencing Factors, the things that make human failures more or…

Further reading

Report of the Presidential Commission on the Space Shuttle Challenger Accident, Volumes I–V, Washington, D.C., 6 June 1986. Chaired by William Rogers (hence the report is often referred to as the ‘Rogers Commission Report’). Volume I contains the key findings and recommendations; Volumes II–V contain appendices and transcripts of the Commission hearings.

Committee on Science and Technology, Investigation of the Challenger Accident, Union Calendar No. 600, 99th Congress, 2d Session, House of Representatives, Report 99-1016, 29 October 1986.

Vaughan, Diane, The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, The University of Chicago Press, Chicago and London, 1996. (A second edition was published in 2016 with a new preface.)
