The Curious Cases of the Tenerife Plane Crash and Medical Errors
- What we see through the Swiss Cheese Model -
This piece was featured on KevinMD.com, the leading social media platform for healthcare providers and researchers. You can read the published version here (https://www.kevinmd.com/2023/03/the-curious-cases-of-the-tenerife-plane-crash-and-medical-errors-what-we-see-through-the-swiss-cheese-model.html).
Last October, I wrote this piece about losing my grandfather to medical errors. Throughout the writing process, I felt both close to and distant from my experience. It felt distant because, as a public health professional, part of me was doing a post-mortem of the incident, reflecting on medical errors and systems thinking. Although the last thing I wanted was to relive what happened to my grandfather scene by scene, I could not help but analyze how we could have prevented the tragedy. In healthcare, we often use lessons from the aviation industry to improve team-based care and patient outcomes. I hope this post will help patients, their families, and healthcare providers explore the layers beneath the immediate problem they are looking at. Every solution is context-specific and none is perfect; however, examining the root cause and developing situational awareness is the first step in guiding us in the right direction.
“Even a room with flammable gas will not explode unless someone strikes a match.” – Dr. Bob Wachter
Case 1:
On March 27, 1977, two 747s collided on the runway at Tenerife in the Canary Islands, killing 583 people. It was a foggy morning, and the KLM 747 was waiting for clearance to take off. Captain Van Zanten was a well-known pilot with an excellent safety record. The KLM crew had spotted a Pan Am 747 taxiing toward the lone runway a while earlier and assumed the plane was already out of the way. The fog was so thick that the crew had to rely on the Air Traffic Controller (ATC). The KLM copilot said to the ATC, “We are now at takeoff,” which is a nonstandard statement in aviation. The pilot then added, “We are going.” The ATC assumed the KLM plane was in a takeoff position (not actually taking off) and replied “OK,” another nonstandard reply, which led the KLM captain to believe they were cleared for takeoff. Due to interference on the radio frequency, the KLM crew could not hear accurate information about the Pan Am 747. The KLM flight engineer asked the captain, “Is [the Pan Am] not clear?” The captain simply replied “Yes” and pushed the throttles forward, accelerating for takeoff. Emerging from the fog was the Pan Am plane, sitting on the main runway right in front of them. The captain got the nose of the KLM over the Pan Am, but the tail dragged along the runway and tore through the upper deck of the Pan Am’s fuselage. Both planes exploded, resulting in the deadliest accident in aviation history. This tragedy led to the development of Crew Resource Management (CRM), a set of training practices to improve interpersonal communication, leadership, and decision-making in the cockpit and minimize human error.
Case 2:
On April 21, 2013, an 86-year-old male patient who was recovering from pneumonia at a community hospital experienced fluctuating blood pressure throughout the day. A nurse noticed the trend in the morning yet did not voice her concern to the resident or attending physician. The resident later noted the patient’s unstable condition and was not sure about the next steps; however, he did not escalate it to the attending because the attending was off duty. Later in the day, the patient’s daughter noticed the patient’s decreasing heart rate. She pressed the nurse call button numerous times, but nobody at the nurse station responded. The daughter rushed to the nurse station and realized that all the heart monitors at the station were turned off. She went back to her father’s bedside and held his hand until he passed away four minutes later.
The seemingly unrelated cases above share several common underlying themes:
1) A single event/person did not cause either incident.
2) Authority gradients (psychological distance between a worker and supervisor) and miscommunication played huge roles in the buildup of each event.
3) Neither workplace had a system to catch human errors.
We will explore each theme separately.
Theme #1: A single event/person did not cause either incident.
An accumulation of multiple lapses led to the plane collision and the patient’s death. Let’s look at each event using the Swiss Cheese Model. This model is often used in commercial aviation and healthcare to demonstrate that a single error at the “sharp end” (e.g., the pilot operating the plane or the surgeon making the incision) is rarely enough to cause harm. The error must penetrate multiple incomplete layers of protection (the Swiss cheese layers) to cause an accident.1 The goal of each organization is to shrink the holes in the Swiss cheese (latent errors) through multiple overlapping layers of protection, decreasing the probability that the holes will align and cause harm.2
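To make the model concrete, here is a minimal sketch (not from the original essay or the cited sources) of the arithmetic behind it: if each protective layer fails independently with some small probability, harm requires every layer to fail at once, so adding layers or shrinking any one layer’s holes multiplies the protection. The layers and failure rates below are purely hypothetical, and the independence assumption is a simplification.

```python
# Minimal Swiss Cheese Model sketch: harm occurs only when the "holes"
# in every protective layer line up. The probabilities are hypothetical
# illustrations, and the layers are assumed to fail independently.
from math import prod

def probability_of_harm(hole_probabilities):
    """Probability that an error penetrates every layer of protection."""
    return prod(hole_probabilities)

# Four hypothetical layers, each failing 10% of the time.
baseline = [0.10, 0.10, 0.10, 0.10]
print(probability_of_harm(baseline))   # ~0.0001, roughly 1 in 10,000

# Shrinking the holes in a single layer (say, standardized readbacks
# cutting communication failures from 10% to 2%) lowers the overall risk.
improved = [0.02, 0.10, 0.10, 0.10]
print(probability_of_harm(improved))   # ~0.00002, roughly 1 in 50,000
```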
Case 1: Plane collision at Tenerife
Multiple factors caused the plane collision. The crew and ATC had control over three out of four factors, or “holes,” in the Swiss cheese layers.
Factor 1: Weather (natural force) - Foggy weather contributed to poor vision -> The crew had no control.
Factor 2: Communication between the KLM copilot and Air Traffic Controller (ATC) (human factors) - Both parties were using nonstandard terminology when communicating their plane position and status. ATC assumed the KLM plane was in a takeoff position and not actually taking off (cognitive error). ATC’s “OK” in response to the copilot/pilot assured the pilot that he was cleared for takeoff. -> The ATC and crew did have control.
Factor 3: Technology (human factors to some extent) - Interference on the radio frequency precluded the KLM crew from hearing Pan Am’s status message and the ATC’s response. -> The crew could have some control by asking follow-up questions.
Factor 4: Authority gradient (human factors) - Hierarchy in the aviation industry enforced the psychological distance between the flight engineer/copilot and the pilot. The flight engineer and copilot did not raise concerns about the missed radio transmission from Pan Am. Although the KLM flight engineer asked a nudging question, it was not strong enough to stop the pilot. The pilot was not receptive to, or aware of, his colleagues’ mild voices of concern. -> The crew did have control.
Now let’s move on to Case 2. Here, the hospital care team had control over four out of four factors that could have prevented the incident.
Case 2: Medical Errors
Factor 1: Authority Gradient (human factors) - Hierarchy in healthcare enforced the psychological distance between the nurse and the resident/attending physician. The nurse noticed the patient’s fluctuating blood pressure in the morning and felt concerned; however, she didn’t voice it to the resident/attending physician. -> The healthcare team did have control.
Factor 2: Cognitive & Communication Lapses/Authority Gradient (human factors) - When the resident physician later noticed the patient’s condition, he didn’t know what to do. He didn’t ask for help by escalating the issue to the attending physician. -> The healthcare team did have control.
Factor 3: Alarm Fatigue (human factors) - The main crisis monitor was turned off at the nurse station. As a result, healthcare providers could not detect the warning signals showing the patient’s dangerously low heart rate for nearly 30 minutes before his heart stopped. Too many insignificant alarms, including false ones, from health information technology (HIT) create mental fatigue among healthcare workers. -> The healthcare team did have control.
Factor 4: Staffing issue (human factors) - Nobody was at the nurse station to respond to the nurse call. Low nurse staffing led to a compromised safety culture. -> Hospital administration did have control.
Theme #2: Authority gradients (psychological distance between a worker and supervisor) and miscommunication played huge roles in the buildup of each event.
Communication structures in both incidents were heavily influenced by authority gradients/hierarchy in the cockpit and hospital.
Here's the interesting question: Were these people, especially individuals of higher status such as the pilot and the resident/attending physician, aware of authority gradients and their potential repercussions for flight or patient safety? The captain and the resident/attending physicians might have been cognizant to some degree, but had they ever considered the workflow from other roles’ perspectives?
Figure 1 shows that surgeons and nurses/residents often have completely different perceptions of the efficacy of their communication structures. While attending surgeons in the survey felt that teamwork in their OR was solid, the rest of the team members disagreed. This means that followers, not just leaders, should also evaluate the quality of communication and teamwork.3 Acknowledging the differences in degrees of perception among team members is critical because we cannot design effective solutions unless we become aware of existing problems.
Figure 1.
Percentage of members of different groups of operating room personnel who rated teamwork in their OR as “high.” (Sexton JB, Thomas EJ, Helmreich RL. Error, stress, and teamwork in medicine and aviation: cross-sectional surveys. BMJ 2000;320:745–749.)
Another important question is: Why does this discrepancy in perception happen in the first place? It is easy to blame people for attitudes such as arrogance and complacency, but that alone does not explain the whole picture. The core issue is the lack of systems thinking, a paradigm that looks at relationships among parts, rather than the parts in isolation, to understand the complexity of the world. In many industries like aviation and healthcare, each role is highly specialized, and individuals are often not aware of the interrelationships among different roles, let alone how those relationships affect their workflow and outcomes. As a result, cross-departmental communication rarely happens (or did not happen in aviation until the Tenerife disaster), and each role relies on its assumptions about other roles.
In his interview reflecting on human-factors issues, Donald Norman points out the tendency in engineering to forget that individual elements work together, creating a system. Each instrument may be designed well and function perfectly on its own, but when you put them together, you could create a disaster.4 Everyone wants to do the right thing and perhaps functions well alone; however, most work requires teamwork, and when professionals come together without understanding how each project works as a system, things fall apart.
Theme #3: Neither workplace had a system to catch human errors.
“Errors are largely unintentional. It is very difficult for management to control what people did not intend to do in the first place.” – James Reason
Humans err. Telling people not to slip is not realistic because we work in a dynamic, not static, environment. Multiple factors, including fatigue, mental state, and interpersonal dynamics, influence our decision-making and actions. Understanding the types of risks is critical because our solutions need to match the types of problems.5 For example, a checklist might be useful for simple steps such as preparing and operating surgical instruments, but it could be distracting if the problem requires creative solutions.
There are two types of errors: slips and mistakes. Slips are inadvertent, unconscious lapses in performing automatic tasks, such as pilots taking off or healthcare providers writing prescriptions. In Case 1, slips happened for both the copilot/pilot and the ATC when they used nonstandard statements during their status communication (“We’re now at takeoff” instead of “We’re now at takeoff position” by the copilot, and “OK” instead of “(Callsign) 123 RNAV to MPASS Runway […], Cleared for Takeoff” by the ATC). In Case 2, slips happened when the heart monitor was turned off at the nurse station and nobody was there to respond to the nurse call. If the nurse station had been adequately staffed, someone might have been able to prevent the sharp-end harm by responding to the call despite the heart monitor being turned off. Another potential slip is that the nurse who noticed the patient’s fluctuating blood pressure might have forgotten to bring it up to the resident because of her heavy workload, or the information might not have been communicated well to the incoming nurse during the handoff (the healthcare providers’ shift change).
The second type of error, mistakes, results from incorrect choices due to insufficient knowledge, lack of experience or training, inadequate information, or applying the wrong set of rules to a decision.6 The KLM flight crew made the wrong assumption that the Pan Am plane was clear of the runway based on inadequate information (they could not hear the Pan Am due to radio frequency interference) and did not attempt to clarify further. The resident physician did not know the next step when he noticed the patient’s unstable condition, due to insufficient knowledge and lack of experience/training, and he did not call the attending physician.
While mistakes require discipline-specific approaches such as addressing weaknesses in flight training and medical education, slips can be prevented by relatively simpler approaches. For example, built-in redundancies, cross-checks/checklists, readbacks (“let me read your order back to you”), and safety practices such as asking patients for their name and date of birth before administering medications have been successfully implemented in the U.S.7 Case 1 spurred the adoption of readbacks, in which the pilot reads the flight plan clearance back to the ATC to prevent misunderstanding. In Case 2, simple, standardized communication procedures among healthcare providers might have helped create a complete and accurate information flow within the team. Teaching hospitals in the U.S. have increasingly adopted the I-PASS mnemonic (Illness Severity, Patient Summary, Action List, Situation Awareness and Contingency Planning, Synthesis by Receiver) to standardize provider-to-provider signout, which has improved safety.
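As an illustration only (not a tool described in the cited sources), here is a minimal sketch of the forcing-function idea behind a standardized signout such as I-PASS: when every field must be filled in explicitly, a missing piece of information becomes a visible gap rather than a silent omission. The field names follow the I-PASS mnemonic; the function name and the example patient details are hypothetical.

```python
# Minimal sketch of a standardized handoff check. The I-PASS field names
# follow the mnemonic above; the example signout data are hypothetical.

I_PASS_FIELDS = [
    "illness_severity",
    "patient_summary",
    "action_list",
    "situation_awareness_and_contingency_planning",
    "synthesis_by_receiver",
]

def missing_fields(signout):
    """Return the I-PASS fields that are absent or left blank."""
    return [field for field in I_PASS_FIELDS if not signout.get(field)]

# Hypothetical signout that omits the contingency plan and the receiver's
# synthesis, the kind of slip a free-form verbal handoff can let through.
signout = {
    "illness_severity": "watcher: blood pressure fluctuating since morning",
    "patient_summary": "86-year-old man recovering from pneumonia",
    "action_list": "recheck vitals hourly; notify resident if unstable",
}

gaps = missing_fields(signout)
if gaps:
    print("Incomplete handoff; clarify before accepting:", gaps)
```

The point is not the code itself but the structure: a checklist-style signout turns “I forgot to mention it” into a gap that either the sender or the receiver must close before the handoff is complete.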
So What Can We Do?
The good news is that Crew Resource Management (CRM), developed after the Tenerife disaster, has been one of the most successful practices for improving safety culture in aviation, and its procedures have also influenced healthcare in the U.S. Following the adoption of CRM in the aviation industry, U.S. and Canadian airlines saw a remarkable reduction in the annual fatal accident rate (Figure 2). CRM trains crews in communication and teamwork and encourages crew members to speak up across the authority gradient. Communication skills such as SBAR (Situation, Background, Assessment, and Recommendations) and briefing/debriefing techniques have been applied in healthcare to improve staff communication, especially between nurses and physicians. CUS words (“I’m concerned about…,” then “I’m uncomfortable…,” and finally, “This is a safety issue!”) are also an effective way for anyone lower on a hierarchy to escalate levels of concern and get the attention of someone higher.8 While aviation and healthcare each have unique challenges, CRM combined with multi-pronged approaches is critical to addressing inevitable yet preventable human factors errors.
Figure 2.
Commercial aviation's remarkable safety record. The graph shows the annual fatal accident rate in accidents per million departures, broken out for U.S. and Canadian operators (solid line) and the rest of the world's commercial jet fleet (hatched line). (Source: http://www.boeing.com/news/techissues/pdf/statsum.pdf, Slide 10.)
To recap, here are three main points about systems thinking:
1. Systems thinking and the Swiss cheese model are critical to 1) understanding the root cause of each incident and designing system-specific solutions and 2) developing situational awareness about how an individual’s role affects other roles and impacts the workflow.
2. Systems thinking aims to anticipate and catch human factors errors such as slips and mistakes before they cause sharp-end harm.
3. Crew Resource Management (CRM) from the aviation industry has wide application in other industries, including healthcare, to dampen authority gradients.
Writing this, I cannot help but feel a twinge of regret because I wish I had known about systems thinking back in 2013. I wish I had asked questions of my grandfather’s healthcare providers to guide them toward situational awareness. I wish I had been able to advocate for my grandfather and my family. I wish my grandfather were still here with us. At the same time, I wonder whether my questions to healthcare providers would have made much difference given the cultural differences (in Japan, authority gradients between physicians and nurses are much steeper than in the U.S., and we would need culturally competent tools to address the safety culture). I hope safety culture initiatives will continue to solidify teamwork among healthcare providers and between healthcare teams and patients/families. At the end of the day, we all want the same thing: saving our loved ones’ lives, one at a time.
1. Chapter 2. Basic Principles of Patient Safety. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414534
2. Chapter 2. Basic Principles of Patient Safety. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414534
3. Chapter 9. Teamwork and Communication Errors. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414542
4. Chapter 7. Human Factors and Errors at the Person–Machine Interface. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414540
5. Chapter 2. Basic Principles of Patient Safety. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414534
6. Chapter 2. Basic Principles of Patient Safety. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414534
7. Chapter 2. Basic Principles of Patient Safety. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414534
8. Chapter 9. Teamwork and Communication Errors. In: Wachter RM, ed. Understanding Patient Safety, 2e. McGraw Hill; 2012. Accessed January 25, 2023. https://accessmedicine.mhmedical.com/content.aspx?bookid=396&sectionid=40414542