Matthew E. May Advises Us to “Boot Your Root (Cause)”

This is a great post from Matthew E. May on Root Cause Analysis.

Boot Your Root (Cause)

Thursday, June 4th, 2015

Process improvers the world over rally around root cause analysis as if it were the Holy Grail of all things organizational. But is it?

Understanding the root cause of a problem certainly makes sense in the context of a present day situation carrying the potential for a correct answer or solution. In the process improvement world, problems center on reducing some form of excess, which comes in several traditional flavors…all of which center on something not working as well as it should be in a perfect world.
But the one critical place in business where root cause analysis has no real place is in strategy formulation.

I’m sure I’ll be taken to task on this by the lean/kaizen/six sigma crowd, but bear with me, because I’ve witnessed repeated attempts to apply root cause analysis to strategy, only to be met with derailment and eventual failure.

The difference between a fix for an existing process or pain point and a set of choices about the future is night and day. Process problems are generally focused inward on activities you presently control. Strategic problems are generally focused outward on the future, and forces you cannot control. In process improvement, you’re pursuing perfection. In strategy formulation, there’s no such thing as a perfect strategy, so you couldn’t pursue one even if you wanted to.

If I own a traditional taxi or limo company, for example, I don’t need to know specifically why Uber entered my market, only that they did, and that my market share is dwindling and my growth and profitability are eroding.

Looked at another way, all strategic problems boil down to a single root cause: customers are finding superior value elsewhere, from a competing offer.
This may seem blazingly obvious. But that doesn’t seem to deter organizations (and their consultants) from applying traditional problem solving to strategy development, spiraling ever downward in an endless series of “why?” questions. The result is an emphasis on drafting a perfect plan and a futile attempt to craft a detailed articulation of the perfect future for the company.

It’s unnecessary, mostly irrelevant, and doesn’t work.



Why 5 Whys?

I’m a big fan of Pete Abilla’s shmula blog.  Way back in the spring of 2007, the title of one of Pete’s posts was “Ask ‘Why’ Five Times About Every Matter”. This is a quote from the visionary Taiichi Ohno. I’ve re-posted it below and I hope you get as much out of it as I have.

“Ask ‘Why’ Five Times About Every Matter”

by Pete Abilla on April 16, 2007

Taiichi Ohno is known to have said that “having no problems is the biggest problem of all.”  He viewed problems not as a negative but as a “Kaizen opportunity in disguise.”  Whenever problems arose, he encouraged his staff to investigate the problem at the source and to “ask ‘why’ five times about every matter.”

In any series of events where people are involved, mistakes happen.  Functional areas such as software engineering and industrial engineering, and more general fields such as medicine, law, or sociology, are all composed of series of events involving people, process, machines, environment, and other items.  Undoubtedly, mistakes will happen. The typical response to a mistake is to throw blame around, which builds resistance; communication then fails, which can lead to project failure. The better approach is to identify the root causes of mistakes and attack those, instead of what might be perceived as the cause. Perceived causes are most likely symptoms and not the root cause, in which case the problem is never really solved. This more rigorous and longer-lasting approach to solving problems is called Root Cause Analysis.

Ohno was fond of using the following example to illustrate Root Cause Analysis:

1. “Why did the robot stop?”
The circuit has overloaded, causing a fuse to blow.
2. “Why is the circuit overloaded?”
There was insufficient lubrication on the bearings, so they locked up.
3. “Why was there insufficient lubrication on the bearings?”
The oil pump on the robot is not circulating sufficient oil.
4. “Why is the pump not circulating sufficient oil?”
The pump intake is clogged with metal shavings.
5. “Why is the intake clogged with metal shavings?”
Because there is no filter on the pump.
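For readers who think in code, the chain above can be written down as a small data structure. This is only a minimal sketch: the `Why` class and the `root_cause` helper are names made up for illustration, and treating the last answer as the root cause is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Why:
    """One question-and-answer step in a 5-whys chain (illustrative name)."""
    question: str
    answer: str


# Ohno's robot example, transcribed from the list above.
chain = [
    Why("Why did the robot stop?",
        "The circuit has overloaded, causing a fuse to blow."),
    Why("Why is the circuit overloaded?",
        "There was insufficient lubrication on the bearings, so they locked up."),
    Why("Why was there insufficient lubrication on the bearings?",
        "The oil pump on the robot is not circulating sufficient oil."),
    Why("Why is the pump not circulating sufficient oil?",
        "The pump intake is clogged with metal shavings."),
    Why("Why is the intake clogged with metal shavings?",
        "Because there is no filter on the pump."),
]


def root_cause(steps):
    """Return the final answer, treated here as the root cause (an assumption)."""
    if not steps:
        raise ValueError("the chain of whys is empty")
    return steps[-1].answer


# Print the chain, then the answer the last "why" arrived at.
for depth, step in enumerate(chain, start=1):
    print(f"{depth}. {step.question}\n   {step.answer}")
print("Root cause:", root_cause(chain))
```

The point of writing it this way is that each answer becomes the subject of the next question, which is exactly the discipline the 5 whys asks for.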

There are several tools that can aid in the process of Root Cause Analysis.  Basically, it is a simple approach of asking “why” several times until you arrive at an atomic but actionable item. To visualize the “5-why’s” process, a tool called an Ishikawa Diagram, also known as a Cause-and-Effect Diagram or a Fishbone Diagram, is often helpful; the three names are used interchangeably.

[Figure: Ishikawa diagram]

Main Components of an Ishikawa Diagram

  1. At the head of the Fishbone is the defect or effect, stated in the form of a question.
  2. The major bones are the capstones, or main groupings of causes.
  3. The minor bones are detailed items under each capstone.
  4. There are common capstones, but they may or may not apply to your specific problem. The common ones are:
  • People
  • Equipment
  • Material
  • Information
  • Methods/Procedures
  • Measurement
  • Environment

After completing your Fishbone Diagram exercise as a group, it is helpful to test your logic by working the bones, top-down OR bottom-up, like this:

this happens because of g; g happens because of f; f happens because of e; e happens because of d; d happens because of c; c happens because of b; b happens because of a.

The exercise above is crucially important: you must test your logic so that it makes pragmatic sense and confirm that the atomic root cause is actionable, that is, that you can do something to correct, reduce, or eliminate it.
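One way to picture “working the bones” is as a walk along a chain of “because of” links. The sketch below is an illustration rather than part of the original method: the function names are invented, and it assumes each effect has exactly one recorded cause, which a real fishbone will not always satisfy.

```python
# Each pair reads "effect happens because of cause", as in the sentence above.
links = [("this", "g"), ("g", "f"), ("f", "e"), ("e", "d"),
         ("d", "c"), ("c", "b"), ("b", "a")]


def walk(mapping, start):
    """Follow the mapping from start until no further link exists."""
    path = [start]
    while path[-1] in mapping:
        nxt = mapping[path[-1]]
        if nxt in path:
            # A repeated item means the reasoning loops back on itself.
            raise ValueError(f"circular reasoning at {nxt!r}")
        path.append(nxt)
    return path


def walk_top_down(pairs, effect):
    """From the observed effect down to the atomic cause."""
    return walk(dict(pairs), effect)


def walk_bottom_up(pairs, cause):
    """From the atomic cause back up to the observed effect."""
    return walk({c: e for e, c in pairs}, cause)


top_down = walk_top_down(links, "this")
bottom_up = walk_bottom_up(links, "a")
print(" <- ".join(top_down))
print(" -> ".join(bottom_up))

# If the logic holds, both directions trace the same chain.
assert top_down == bottom_up[::-1]
```

If the two walks disagree, or one stops early, there is a gap in the reasoning that the group should revisit before settling on a root cause.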

Once you or your team arrive at a root cause for a specific capstone, then you typically “cloud” it to identify it as a root cause. A good rule is that there is typically *NOT* 1 root cause for a problem, but potentially several. Below is a diagram of one fishbone, decomposed:

[Figure: one fishbone of an Ishikawa diagram, decomposed]
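The structure described above (a head, capstones, minor bones, and “clouded” root causes) can also be sketched in code. Everything here is hypothetical: the `Fishbone` and `Cause` classes are invented for illustration, and filing Ohno’s robot causes under the Equipment capstone is my own assumption.

```python
from dataclasses import dataclass, field

# The common capstones listed earlier in the post.
COMMON_CAPSTONES = ["People", "Equipment", "Material", "Information",
                    "Methods/Procedures", "Measurement", "Environment"]


@dataclass
class Cause:
    """A minor bone; clouded=True marks it as an identified root cause."""
    text: str
    clouded: bool = False


@dataclass
class Fishbone:
    """Head question plus causes grouped under each capstone."""
    head: str
    bones: dict = field(
        default_factory=lambda: {name: [] for name in COMMON_CAPSTONES})

    def add(self, capstone, text, clouded=False):
        """Attach a cause to a capstone, creating the capstone if needed."""
        self.bones.setdefault(capstone, []).append(Cause(text, clouded))

    def root_causes(self):
        """Return (capstone, text) for every clouded cause."""
        return [(capstone, cause.text)
                for capstone, causes in self.bones.items()
                for cause in causes if cause.clouded]


diagram = Fishbone("Why did the robot stop?")
diagram.add("Equipment", "Pump intake clogged with metal shavings")
diagram.add("Equipment", "No filter on the pump", clouded=True)
print(diagram.root_causes())
```

Because `root_causes` returns a list, the sketch leaves room for several clouded causes across different capstones, in line with the rule of thumb above.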

A Few Helpful Hints

  1. It is helpful to pull many people into the construction of these diagrams, as this ensures enough diversity of thought to make sure you get the right potential root causes.
  2. Keep asking “why” until you arrive at something atomic and actionable.
  3. The purpose of this tool is to answer a question, then brainstorm about how to fix the identified root cause.
  4. Getting more people involved will give them a sense of ownership — and that sense of ownership is very important because now that they feel part of the process, resistance to change will likely be less of a problem.

Real-World Example of Root Cause Analysis

I once helped a large healthcare organization save several million dollars. This organization had the largest call center in California, handling over 8 million calls per year. These were mostly inbound calls, resulting from some internal mistake that caused people to call. My job was to identify the largest opportunity (call type), determine why people were calling, and eliminate or reduce the root causes.

After pulling log file data and running this enormous data set against my handy-dandy, home-grown regular expression engine (written in Python), we stratified the data into logical groups and found that the largest number of calls into the call center related to a specific product. This discovery naturally raises the question “why,” and that question forms the head of our fishbone: “Why are x% of calls related to x product type?”
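The original engine is not shown in the post, so the following is only a rough sketch of the general idea of stratifying log lines with regular expressions. The call types, the patterns, and the sample lines are all made up for illustration and do not come from the actual project.

```python
import re
from collections import Counter

# Hypothetical call types and patterns; the real ones are not in the post.
PATTERNS = {
    "billing": re.compile(r"\b(bill|invoice|charge)\w*", re.IGNORECASE),
    "claims": re.compile(r"\bclaim\w*", re.IGNORECASE),
    "id_card": re.compile(r"\b(id|member)\s+card\b", re.IGNORECASE),
}


def classify(line):
    """Return the first call type whose pattern matches, else 'other'."""
    for call_type, pattern in PATTERNS.items():
        if pattern.search(line):
            return call_type
    return "other"


def stratify(lines):
    """Count how many log lines fall into each call type."""
    return Counter(classify(line) for line in lines)


# Invented sample lines standing in for the real call log.
sample_log = [
    "caller asked about a claim denial",
    "caller disputes charge on invoice",
    "caller needs a replacement member card",
    "caller asked about claim status",
    "caller wants office hours",
]

counts = stratify(sample_log)
total = sum(counts.values())
for call_type, n in counts.most_common():
    print(f"{call_type}: {n} calls ({n / total:.0%})")
```

The largest bucket in the output is the natural candidate for the head of the fishbone.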

We got a cross-functional team together, proceeded with the Ishikawa exercise, and identified several root causes. After running the root causes against a prioritization matrix, we went after the low-hanging fruit first, then the more difficult root causes. The result? We demonstrated a quantifiable reduction in inbound calls of this specific type: a reduction of ~8%, which amounted to over $2 million in cost savings.
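The post does not say how the prioritization matrix was scored, so here is one common way it might look, offered as a sketch only. The impact-over-effort score, the 1-to-5 scales, and the placeholder causes are all assumptions, not details from the project.

```python
from dataclasses import dataclass


@dataclass
class RootCause:
    """A candidate root cause with team-estimated scores (hypothetical)."""
    name: str
    impact: int  # 1 (low) to 5 (high): expected reduction in calls
    effort: int  # 1 (easy) to 5 (hard): cost of fixing it

    @property
    def score(self):
        """Higher means more benefit per unit of effort."""
        return self.impact / self.effort


# Placeholder causes; the real ones are not named in the post.
causes = [
    RootCause("cause A", impact=4, effort=1),
    RootCause("cause B", impact=5, effort=5),
    RootCause("cause C", impact=2, effort=1),
    RootCause("cause D", impact=3, effort=2),
]

# Low-hanging fruit (high impact, low effort) sorts to the top.
ranked = sorted(causes, key=lambda c: c.score, reverse=True)
for cause in ranked:
    print(f"{cause.name}: impact={cause.impact} "
          f"effort={cause.effort} score={cause.score:.1f}")
```

A ratio is only one possible scoring rule; some teams prefer a two-by-two grid of impact against effort and simply work one quadrant at a time.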

Facts Versus Data

Ohno seems to see a difference between the two.  In his words,

“The root cause of any problem is the key to a lasting solution,” Ohno used to say.  He constantly emphasized the importance of genchi genbutsu, or ‘going to the source,’ and clarifying the problem with one’s own eyes. “‘Data’ is of course important in manufacturing,” he often remarked, “but I place greatest emphasis on ‘facts.’”

I believe what he means is that data is a degree removed from the actual place where the phenomenon is happening.  Data is often on a computer screen or on paper, whereas he placed a greater value on being where the work is done and where value is added.  He preferred to be at the source of the phenomenon.

More Fish To Go Around

Root Cause Analysis can be used anywhere. In software engineering, it can be used to identify the root causes of bugs in code; in industrial engineering, root cause analysis can be used to identify defects in design; in medicine, root cause analysis can be used to arrive at the reasons for mistakes or lack of patient satisfaction.

Root Cause Analysis is a helpful business tool with application to all areas of business and technology. Eliminating or reducing the root cause is much more effective than fixing a symptom. Involving people in the process will ensure buy-in and the elimination of resistance.

Creating an Improvement Culture

Ohno believed that innovation is achieved, and a culture of improvement created, by empowering each associate to take ownership of and improve their work and the Gemba. It is about empowering your people to make the right changes using the tools that work. Root Cause Analysis is one of the tools that can help empower your employees and create a culture of improvement throughout your enterprise.  One item missed by most people is that Toyota doesn’t just build cars; it also builds people.  Root Cause Analysis is an effective tool that helps associates feel empowered to make their everyday work better.


Root Cause Analysis (Overview)

Root cause analysis (RCA) is a class of problem solving methods aimed at identifying the root causes of problems or incidents. The practice of RCA is predicated on the belief that problems are best solved by attempting to correct or eliminate root causes, as opposed to merely addressing the immediately obvious symptoms. By directing corrective measures at root causes, it is hoped that the likelihood of problem recurrence will be minimized. However, it is recognized that complete prevention of recurrence by a single intervention is not always possible. Thus, RCA is often considered to be an iterative process, and is frequently viewed as a tool of continuous improvement.

[Figure: Ishikawa fishbone-type cause-and-effect diagram. Image via Wikipedia.]

RCA is initially a reactive method of problem detection and solving: the analysis is done after an incident has occurred. With growing expertise, RCA becomes a proactive method that can forecast the possibility of an incident before it occurs. While one follows the other, RCA is a completely separate process from Incident Management.

Root cause analysis is not a single, sharply defined methodology; there are many different tools, processes, and philosophies of RCA in existence. However, most of these can be classed into five very broadly defined “schools” that are named here by their basic fields of origin: safety-based, production-based, process-based, failure-based, and systems-based.

  • Safety-based RCA descends from the fields of accident analysis and occupational safety and health.
  • Production-based RCA has its origins in the field of quality control for industrial manufacturing.
  • Process-based RCA is basically a follow-on to production-based RCA, but with a scope that has been expanded to include business processes.
  • Failure-based RCA is rooted in the practice of failure analysis as employed in engineering and maintenance.
  • Systems-based RCA has emerged as an amalgamation of the preceding schools, along with ideas taken from fields such as change management, risk management, and systems analysis.

Despite the seeming disparity in purpose and definition among the various schools of root cause analysis, there are some general principles that could be considered as universal. Similarly, it is possible to define a general process for performing RCA.

General principles of root cause analysis

  1. The primary aim of RCA is to identify the root cause of a problem in order to create effective corrective actions that will prevent that problem from ever re-occurring, otherwise known as the ‘100 year fix’.
  2. To be effective, RCA must be performed systematically as an investigation, with conclusions and the root cause backed up by documented evidence.
  3. There is always one true root cause for any given problem; the difficult part is having the stamina to reach it.
  4. To be effective the analysis must establish a sequence of events or timeline to understand the relationships between contributory factors, the root cause and the defined problem.
  5. Root cause analysis can help to transform an old culture that reacts to problems into a new culture that solves problems before they escalate and, more importantly, reduces the instances of problems occurring over time within the environment where the RCA process is operated.

General process for performing and documenting an RCA-based Corrective Action

Notice that RCA (in steps 3, 4 and 5) forms the most critical part of successful corrective action, because it directs the corrective action at the true root cause of the problem. The root cause is secondary to the goal of prevention, but without knowing the root cause, we cannot determine what an effective corrective action for the defined problem will be.

  1. Define the problem.
  2. Gather data/evidence.
  3. Ask why and identify the true root cause associated with the defined problem.
  4. Identify corrective action(s) that will prevent recurrence of the problem (your 100 year fix).
  5. Identify effective solutions that prevent recurrence, are within your control, meet your goals and objectives and do not cause other problems.
  6. Implement the recommendations.
  7. Observe the recommended solutions to ensure effectiveness.
  8. Apply variability reduction methodology for problem solving and problem avoidance.

Root cause analysis techniques

  • Barrier analysis – a technique often used, particularly in process industries. It is based on tracing energy flows, with a focus on barriers to those flows, to identify how and why the barriers did not prevent the energy flows from causing harm.
  • Bayesian inference
  • Causal factor tree analysis – a technique based on displaying causal factors in a tree-structure such that cause-effect dependencies are clearly identified.
  • Change analysis – an investigation technique often used for problems or accidents. It is based on comparing a situation that does not exhibit the problem to one that does, in order to identify the changes or differences that might explain why the problem occurred.
  • Current Reality Tree – A method developed by Eliyahu M. Goldratt in his theory of constraints that guides an investigator to identify and relate all root causes using a cause-effect tree whose elements are bound by rules of logic (Categories of Legitimate Reservation). The CRT begins with a brief list of the undesirable things we see around us, and then guides us towards one or more root causes. This method is particularly powerful when the system is complex, there is no obvious link between the observed undesirable things, and a deep understanding of the root cause(s) is desired.

  • Failure mode and effects analysis
  • Fault tree analysis
  • 5 Whys
  • Ishikawa diagram, also known as the fishbone diagram or cause-and-effect diagram. The Ishikawa diagram is the preferred method for project managers for conducting RCA, mainly due to its simplicity, and the complexity of the rest of the methods[1].
  • Pareto analysis
  • RPR Problem Diagnosis – An ITIL-aligned method for diagnosing IT problems.

Common cause analysis (CCA) and common modes analysis (CMA) are evolving engineering techniques for complex technical systems, used to determine whether common root causes in hardware, software, or highly integrated systems interaction may contribute to human error or improper operation of a system. Systems are analyzed for root causes and causal factors to determine the probability of failure modes, fault modes, or common mode software faults due to escaped requirements. Complete testing and verification are also used to ensure that complex systems are designed with no common causes that lead to severe hazards. Common cause analysis is sometimes required as part of the safety engineering tasks for theme parks, commercial/military aircraft, spacecraft, complex control systems, large electrical utility grids, nuclear power plants, automated industrial controls, medical devices, or other safety-critical systems with complex functionality.

Basic elements of root cause

  • Materials
    • Defective raw material
    • Wrong type for job
    • Lack of raw material
  • Manpower
    • Inadequate capability
    • Lack of Knowledge
    • Lack of skill
    • Stress
    • Improper motivation
  • Machine / Equipment
    • Incorrect tool selection
    • Poor maintenance or design
    • Poor equipment or tool placement
    • Defective equipment or tool
  • Environment
    • Orderly workplace
    • Job design or layout of work
    • Surfaces poorly maintained
    • Physical demands of the task
    • Forces of nature
  • Management
    • No or poor management involvement
    • Inattention to task
    • Task hazards not guarded properly
    • Other (horseplay, inattention….)
    • Stress demands
    • Lack of Process
  • Methods
    • No or poor procedures
    • Practices are not the same as written procedures
    • Poor communication
  • Management system
    • Training or education lacking
    • Poor employee involvement
    • Poor recognition of hazard
    • Previously identified hazards were not eliminated

From many sources including Wikipedia, the free encyclopedia (16 Sept 2010)

Getting to Root Cause

from Evan Durant at the Kaizen Notebook

August 3, 2010 by Evan Durant 

[Photo: This is real 'root cause'. Caption: Root Cause Analysis is rarely this easy!]

The picture above is of a sidewalk over which I run most days.  If you look closely, you’ll see that a section of the sidewalk has been replaced with fresh, new cement.  The reason is that the old section was pushed up, broken, and a terrible trip hazard on dark winter mornings.

Naturally I’m pleased with the improvement, but the lean thinker in me can’t help but see a flaw in this fix.  After all, look just to the left of the sidewalk and you can see the tree that caused the sidewalk damage in the first place.  In this case the root cause of the problem is, well, a root.  Assuming the tree continues to grow and the roots continue to spread, it seems like we’ll be paying to replace this sidewalk again in the future.

Anyone else out there patching a sidewalk with a tree growing underneath it?

5 Whys in the Real World

I love poking around the internet looking for what I call “real world” examples. I ran across this example of Amazon CEO Jeff Bezos whiteboarding his own quick & dirty 5 whys. It demonstrates that root cause analysis need not be time intensive. This post comes from Pete Abilla’s blog. I hope you enjoy it as much as I have.


Jeff Bezos and Root Cause Analysis

Posted by Pete Abilla
January 23, 2009

I’m always impressed when CEOs demonstrate Deming-like behavior as they lead. It’s rare, but there’s an almost magical, mobilizing, and inspiring force that happens when CEOs or corporate leaders behave in a respectful, inspiring, common-sense, and thoughtful way.

[Photo: Amazon CEO Jeff Bezos. Image via Wikipedia.]

Today, I’m reminded of an experience back in 2004 while I worked for — something Jeff Bezos did that I still carry with me to this day.

During Q4, Bezos and his leadership team have a tradition of visiting the Fulfillment Centers, spending time with the associates, and physically working on the floor alongside everyone else.

During one visit, there had just been a safety incident where an associate had damaged his finger.  When Jeff learned of this during a meeting, he was very disturbed and got very emotional — angry at first, then felt very bad for this associate and his family.  Then, he did something remarkable.

He got up, walked to the whiteboard and began to ask the 5-why’s (I quote the below from memory):

Why did the associate damage his thumb?

Because his thumb got caught in the conveyor.

Why did his thumb get caught in the conveyor?

Because he was chasing his bag, which was on a running conveyor.

Why did he chase his bag?

Because he placed his bag on the conveyor, but it then turned on by surprise.

Why was his bag on the conveyor?

Because he used the conveyor as a table.

So, the likely root cause of the associate’s damaged thumb is that he simply needed a table, there wasn’t one around, so he used a conveyor as a table. To prevent further safety incidents, we need to provide tables at the appropriate stations or provide portable, light tables for the associates to use, and also update and place a greater focus on safety training. Also, look into preventative maintenance standard work.

Several things are amazing about this experience:

  1. Jeff Bezos cared enough about an hourly associate and his family to spend time discussing his situation.
  2. Jeff properly facilitated the 5-why exercise to arrive at a root cause: he did not blame people or groups — no finger pointing.
  3. He involved a large group of stakeholders, demonstrated by example, and arrived at a root cause and he didn’t focus on symptoms of the problem.
  4. He is the founder and CEO of, yet he got involved in the dirt and sweat of his employees’ situation.
  5. In that simple moment, he taught all of us to focus on root causes — quickly — not heavily relying on data or overanalysis of the situation, and yet he was spot-on in identifying the root causes of the safety incident.

Every company has its warts and zits, but, make no mistake — Jeff Bezos is a Lean and Six Sigma fanatic and, in my opinion, makes a strong effort to run his company in a very Deming-like way.

How will you apply the 5-why’s today?  Will you focus on the root causes of your challenges and not just on the symptoms?
