THORP (Thermal Oxide Reprocessing Plant), 20 April 2005
The Sellafield site in the north west of England is described by the UK National Audit Office as the UK’s largest and most hazardous nuclear site, due to what it calls a legacy of poor planning and neglect. The Nuclear Decommissioning Authority (NDA) plans that decommissioning at Sellafield will be complete by 2120, at a projected cost of approximately £67 billion (~US$95 billion).
Sellafield is split into several operating units, one of which is THORP (Thermal Oxide Reprocessing Plant). This plant reprocesses fuel from nuclear power plants. Part of the process involves used nuclear fuel being dissolved in nitric acid to form dissolver product liquor, before being separated into uranium, plutonium and highly radioactive liquid waste effluent.
On 20 April 2005 a leak of highly radioactive product liquor inside a heavily shielded cell of the THORP plant was reported to the UK HSE. In total, approximately 83000 litres of dissolver product liquor, containing approximately 22000 kilograms of nuclear fuel (mostly uranium incorporating around 160 kilograms of plutonium), had leaked onto the floor of the cell. At the time of the event, the nuclear safety and security regulator was Her Majesty’s Nuclear Installations Inspectorate (HM NII, part of the HSE); the regulator is now the Office for Nuclear Regulation (ONR), established on 1 April 2014.
The UK Guardian newspaper ran a story on 9 May 2005, “Huge radioactive leak closes THORP nuclear plant”:
“A leak of highly radioactive nuclear fuel dissolved in concentrated nitric acid, enough to half fill an Olympic-size swimming pool, has forced the closure of Sellafield’s THORP reprocessing plant. The highly dangerous mixture, containing about 20 tonnes of uranium and plutonium fuel, has leaked through a fractured pipe into a huge stainless steel chamber which is so radioactive that it is impossible to enter. Recovering the liquids and fixing the pipes will take months and may require special robots to be built and sophisticated engineering techniques devised to repair the £2.1bn plant”.
The THORP Safety Case assumes that any leaks of product liquor would be detected and recovered within a few days. However, the HSE investigation revealed that the leak of dissolver product liquor into the cell went undetected from before August 2004 until April 2005 (although it appears to have been a small leak until January 2005). HSE later noted that the liquor was returned to primary containment on 14 June 2005.
Although a ‘criticality incident’ (a nuclear chain reaction) would not have been possible, and there is no evidence of any harm to workers or the public, there had been a significant prolonged reduction in the high standards required of a nuclear operation. HSE requires that “leaders and managers of the industry instil an open challenging, questioning culture that continuously strives for sustained excellence in operation” (HSE, 2007).
The HSE investigation (2007) made recommendations concerning design, engineering, human factors, organisational issues and safety management.
The leakage of process liquor occurred from the failure of a nozzle where it met a vessel. The nozzle failed due to vibration-induced fatigue, as the vessel moved visibly during certain operations. The technical origins of the leak were a combination of design and construction issues, together with inadequately assessed subsequent modifications.
Human factors findings
The investigation considered a range of technical, human and organisational factors. I have extracted the main human factors issues from the official investigation report; background information on these topics can be found in the Key Topics section of this website.
Competence issues arose both in the maintenance of equipment and in the identification and diagnosis of faults.
A number of the instrument maintainers were ‘cross trained’ electricians, who did not receive the same level of training as a fully trained instrument mechanic. This may have been a contributing factor in preventing maintainers from identifying that there was a problem with instruments over a long period of time. The investigation questioned whether these maintenance staff were ‘SQEP’ for their tasks (Suitably Qualified and Experienced Personnel).
The investigation also considered the training for personnel who may need to analyse data trends from the plant systems. It is key that these personnel are SQEP for the task of retrieving, trending, analysing and evaluating data. This will improve skills in diagnosing non-routine faults on complex plant.
The investigating team concluded that there were significant deficiencies in the scope, content and implementation of instructions (particularly for maintenance and proof testing of sump pneumercators). These maintenance and proof test procedures only tested part of the pneumercator system.
HSE recommended that the company should provide adequate written instructions for plant operations for taking, processing and responding to the results from routine sump sampling operations.
In order to reduce the likelihood that short cuts or workarounds will emerge over time, there should be adequate control, supervision and monitoring to confirm that such operations are actually carried out as required.
“The HSE investigation team found that there were significant operational problems with the management of a vast number of alarms in THORP, resulting in important alarms being missed” (HSE, 2007).
A key piece of equipment in this incident, the buffer sump pneumercator, had been behaving erratically for some time. The means of ‘topping up’ the sump with acid was difficult and personnel had significant difficulty in adjusting the sump level to the required depth. Because of this, staff had problems in getting the pneumercator level instrument to operate between the low and high alarm levels (i.e. ‘normal’ operating conditions).
The pneumercator had therefore been below both the ‘low’ alarm and the ‘low low’ alarm setting for extended periods. The HSE investigation team concluded that the pneumercator was routinely left in low alarm, and this was routinely condoned by supervisors.
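To illustrate what routinely left in low alarm means in data terms, here is a minimal Python sketch that measures the longest continuous period a level instrument spends below its low alarm setting. It is not taken from the HSE report or from any THORP system: the `Reading` type, the function names, the setpoint and the tolerance are all hypothetical, and the units are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Reading:
    """A single timestamped level reading (hypothetical, arbitrary units)."""
    time: datetime
    level: float


def longest_low_alarm(readings: List[Reading], low_setpoint: float) -> timedelta:
    """Return the longest continuous period spent below low_setpoint.

    Readings are assumed to be sorted by time. A period starts at the first
    reading below the setpoint and ends at the next reading at or above it,
    or at the last reading if the level never recovers.
    """
    longest = timedelta(0)
    start: Optional[datetime] = None
    for r in readings:
        if r.level < low_setpoint:
            if start is None:
                start = r.time  # the instrument has just entered low alarm
            longest = max(longest, r.time - start)
        else:
            if start is not None:
                longest = max(longest, r.time - start)
            start = None  # the level has recovered
    return longest


def is_standing_alarm(readings: List[Reading], low_setpoint: float,
                      tolerance: timedelta) -> bool:
    """True if the instrument stayed in low alarm for longer than tolerance."""
    return longest_low_alarm(readings, low_setpoint) > tolerance
```

A check of this kind only helps if someone owns the result and acts on it; the investigation's point is that the standing alarm was visible but condoned.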
It was found that supervisors did not regard low-level alarms as being as important as high alarms. This may in part be due to the fact that the safety case did not recognise a low sump alarm as being significant, but it did for a high alarm. In the low alarm case, the safety case appears inadequate and this is thought to have had an influence on the level of attention of supervisors.
There was also an issue with how alarms were presented to operators on the screens of the DCS (Distributed Control System). As new alarms are displayed, existing alarms are pushed down the page, with the result that if they are not reset or return within acceptable parameters, alarms would be pushed back several pages and likely be forgotten by the operators. Furthermore, alarms were presented from other unrelated plant that the operators were not responsible for, leading to alarm floods, distraction and an increase in their workload.
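The presentation problem can be sketched in a few lines of Python. The example below is an illustration only, not a description of the THORP DCS: the `Alarm` fields, the area names and the priority scheme are assumptions. It shows one alternative to a purely chronological list, in which active alarms are filtered to the operator's own plant area and ordered by priority and then age, so that an old unresolved alarm stays near the top instead of scrolling off the page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record; field names are illustrative only."""
    tag: str           # instrument identifier
    area: str          # plant area the alarm belongs to
    priority: int      # 1 = highest priority
    raised_at: float   # time raised, in seconds since an arbitrary epoch
    active: bool = True


def operator_view(alarms: List[Alarm], area: str) -> List[Alarm]:
    """Return active alarms for one area, highest priority first.

    Within the same priority the oldest alarm comes first, so a
    long-standing alarm is not displaced by newer ones.
    """
    mine = [a for a in alarms if a.active and a.area == area]
    return sorted(mine, key=lambda a: (a.priority, a.raised_at))
```

Filtering by area addresses the alarm floods from unrelated plant; sorting by priority and age addresses alarms being pushed back several pages. Whether low-level alarms receive a suitable priority still depends on the safety case recognising their significance.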
The culture at the THORP Plant seemed to allow instruments to operate in alarm mode, rather than questioning the alarm and rectifying the relevant fault. This culture was condoned by supervisors, and was ‘encouraged’ by the large number of alarms. Operators were therefore ‘nudged’ into an alarm-tolerant culture. This culture also condoned a failure to comply with alarm response instructions – as unless an alarm was dealt with soon after occurring, it would likely not be attended to.
“The fact that the plant had deliberately been operating for some time in alarm mode, and was therefore non-compliant with instructions, raises concerns about control and supervision as well as the effectiveness of the safety management system and safety culture existing in the plant at the time of the leak” (HSE, 2007).
“An underlying cause was the culture within the plant that condoned the ignoring of alarms, the non-compliance with some key operating instructions, and safety-related equipment which was not kept in effective working order for some time, so this became the norm” (HSE, 2007).
These findings for the alarm management and safety culture topics show how the various human factors topics are interrelated. The list of key human factors topics is a useful structure, but it’s often difficult to separate out the issues in an investigation.
Roles, responsibilities and learning
There was some confusion between manufacturing personnel and the manufacturing support team as to who was responsible for aspects of plant monitoring and trending of plant safety data, and what data should be trended.
The need for clarity of responsibility was a finding of BNFL’s own investigation in 1998 of a similar event in the Head End Dissolver cell when pipework had eroded through and leaked a small quantity of diluted dissolver product liquor into the sump. This event should have resulted in the implementation of recommendations to improve leak detection and monitoring at Head End. HSE concluded that these recommendations were directly relevant to the 2004/05 leak.
Furthermore, the company concluded in a 2005 internal investigation report that ‘the shift management structure within Head End can be confusing and that control and supervision arrangements are often unclear among the workforce’.
The failure to learn from previous events was therefore a contributory factor in this incident.
The job scope and range of duties and responsibilities for the Shift Team Manager at this area of the plant was considered by the HSE team to be more demanding than that of other THORP Shift Team Managers (e.g. they had additional safe systems of work duties). It is HSE’s opinion that this pressure may have contributed to plant monitoring not being carried out sufficiently rigorously.
Decision-making and managing change
It appeared that the sump sampling requirement had been reduced some years earlier, from monthly to three-monthly intervals. The reason for this change of frequency could not be established. If sump sampling had been carried out properly (and the results assessed), the leak would have been identified much earlier.
Decisions were also made to change the agitation system for the vessel under investigation – to agitate the contents for prolonged periods and to agitate the tanks when less than full. These changes had the effect of increasing the horizontal movement of the vessel, further stressing the fixed pipework. It is not clear how these decisions were audited or assessed prior to changes being introduced. The full implications of these changes do not appear to have been considered.
There are many examples across the safety-critical and high-hazard industries of incidents due to inadequately-assessed changes – whether these are changes to design, equipment, procedures, processes or organisational arrangements. The HSE concluded with a message for the company and the wider industry:
“It is essential that changes, even those that are apparently minor, are carried out with appropriate assessment of their potential impact by people who understand their safety significance in relation to the original design intent of the plant or processes to be changed” (HSE, 2007).
During the period before the leak was discovered, there appeared to be an absence of a questioning attitude, for example, where evidence from accountancy data was indicating that something was untoward. The possibility of a leak did not appear to be considered as a credible explanation. The need for a questioning attitude towards potential safety issues and the need to encourage challenge are aspects of a strong positive safety culture.
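As a rough illustration of the kind of trend that should prompt questions, the Python sketch below accumulates the difference between material measured into and out of a process stage, and reports the first period in which the running shortfall exceeds a limit. This is an assumption-laden sketch, not the accountancy method used at THORP: the function names, the units and the limit are hypothetical, and real nuclear material accountancy must also allow for measurement uncertainty and material held up in the process.

```python
from itertools import accumulate
from typing import List, Optional


def cumulative_shortfall(inputs: List[int], outputs: List[int]) -> List[int]:
    """Running total of (input - output) for each accounting period."""
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must cover the same periods")
    return list(accumulate(i - o for i, o in zip(inputs, outputs)))


def first_period_over(inputs: List[int], outputs: List[int],
                      limit: int) -> Optional[int]:
    """Index of the first period whose running shortfall exceeds limit.

    Returns None if the limit is never exceeded.
    """
    for idx, total in enumerate(cumulative_shortfall(inputs, outputs)):
        if total > limit:
            return idx
    return None
```

The value of looking at the running total is that a small shortfall in each period, easy to dismiss on its own, becomes hard to explain away once it is accumulated. A leak then has to be considered as one credible explanation.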
Oversight and audit
The fact that there were some long-standing failings in some key safety arrangements raises questions about the effectiveness of the company’s arrangements for monitoring, audit and review. There need to be effective arrangements to provide assurance that the safety controls intended to be in place actually are in place and are working effectively.
“In the investigating team’s view, this lack of management oversight and consequent lack of proper ongoing proactive monitoring and audit was one of the principal reasons why this event proceeded for as long as it did” (HSE, 2007).
It was apparent from discussions with some Shift Team Managers that infrequent routine samples, such as the sump sampling, are not high on their list of priorities. HSE concluded that operational samples needed to keep the plant processing appear to be the priority.
No news is good news?
It’s not unusual for organisations to fail to seek out bad news. High Reliability Organisations (HROs) proactively look for warning signals and act on them before they lead to a major event. In his excellent book ‘Lessons from Longford’, Andrew Hopkins states that ‘The focus on success breeds confidence that all is well’ (2000, p.141). In my summary of the NASA Shuttle Columbia incident, I discuss how the fact that Shuttles continued to arrive home safely, even after foam strikes during take-off, led to NASA operating on the principle that ‘nothing bad has happened yet’.
According to a report prepared for the Nuclear Decommissioning Authority (NDA), there existed a ‘new plant culture’ at THORP (Sellafield’s flagship plant) which was simply not expected to have problems. The investigation into the THORP nuclear event concluded that:
“Senior managers cannot rely on the absence of incidents as an indicator that everything is as it should be or as they would wish” (HSE, 2007).
Although the THORP plant was Sellafield’s flagship installation, it was not operated, maintained and managed to the high standards required of the nuclear industry. The HSE served two Improvement Notices during the initial investigation due to deficiencies in relation to compliance with the nuclear licence conditions.
British Nuclear Group Sellafield Limited (BNGSL) pleaded guilty to three offences under the Nuclear Installations Act 1965 and was fined a total of £500,000 plus costs. In his sentencing remarks Mr Justice Openshaw stated that:
“these were serious offences, the breaches amount to a significant departure from the relevant safety standard over a prolonged period of time and a failure to comply with important conditions concerned with safety attached to a licence to operate the most hazardous nuclear undertaking in the United Kingdom” (16th October 2006).
Report of the investigation into the leak of dissolver product liquor at the Thermal Oxide Reprocessing Plant (THORP), Sellafield, notified to HSE on 20 April 2005. Published by the Health and Safety Executive, February 2007.