Root Cause Analysis thoughts

Within an organization’s Quality Assurance (QA) program there is a need to continuously improve processes throughout the Software Delivery Life Cycle (SDLC). One key process improvement initiative is Root Cause Analysis (RCA); however, a common concern is achieving buy-in from the Business to pay for these initiatives. It is essential that the organization recognize that the metrics from this process are valuable to its future success. Previous research supports the importance of a communicative relationship between QA and Business groups (Berriault, 2012). At QUEST 2014, discussion focused on recognizing both disciplines’ needs and reaching consensus on high-level points that encourage a shift in communication focus, for far-reaching improvements within any organization. The QA Value Chain is one of the key processes within Root Cause Analysis that assists with justification.

QA Value

RCA is a critical component within any SDLC, whether it is Waterfall, Iterative or Agile delivery. There must be a form of tracking and identifying consistent issues that allows for the development of solutions to improve delivery quality and speed. A good Root Cause Analysis will impact each of the support activities: innovation, technology efficiency, output and efficiency. The RCA process should not be taken lightly, viewed as an afterthought, or treated as a quick and easy vehicle or a reason to place blame. It is a slow and methodical process that uses statistics to drive out solutions within a collaborative team environment.

Some groups see the end product of this process as a set of final defect metrics as follows:

Root Cause | Defect count

This negative outcome is one of the issues some organizations see with Root Cause Analysis, as with QA. A defect count is sometimes viewed as a good metric; however, it runs counter to the purpose of RCA. Organizations usually expand resources to deal with defects faster.

The three alternatives that come out of this type of RCA are:

  • Expand capacity – more staff
  • Work harder – overtime
  • Keep fixing – postpone for future releases

The first two are considered the easiest to accomplish. Here is the kicker: they are not. Each will increase overall costs and decrease the organization’s Return on Investment (ROI), while not resolving the repetitive problematic issue. This is a keep-fixing solution, which just adds more cost and gives the organization’s customers a faulty product. The ROI is impacted through continued associated costs, and tucking the issue away in a scheduled release only fools the organization into thinking it is doing the right thing. To achieve quality output, a clear definition of the needed data is required before the RCA process can be considered.

Defects, variances or bugs are terms used to identify process issues. Ann Hunngate (QUEST 2014) coined the term “saves”, a more positive and forward-looking term.

A good definition of a save should be: Any situation that adversely affects the project’s ability to deliver a product within the project’s expectations. To drill down on the definition, consider: Any situation that requires unplanned investigation causing delays. This means anything from a coding issue to a printer jam will get a save documented against it. Adopting this definition could increase the number of saves, creating anxiety due to elevated statistics. A high number of saves might be viewed as a sign of a heavily flawed product, which could not be further from the truth. There needs to be an organizational change away from the mindset that saves only happen during test case execution. The outcome provides valuable data on the inner workings of the SDLC. Think of it as the computer centre within a car: a central location that watches all aspects of the car and provides warnings when something occurs.

As stated earlier, the RCA process is more than collecting and displaying final save numbers. The actual analysis is the critical aspect, providing the organization’s decision makers with the metrics needed to make positive assessments for future improvements.

RCA requires a focus on processes and process tools. This eliminates the human aspect, where emotions could add an unwanted hurdle, and zeroes in on what is truly causing the issues.

“If you don’t ask the right questions, you don’t get the right answers. Asking questions is the ABC of diagnosis. Only the inquiring mind solves problems” – Edward Hodnett, American poet.

Utilizing a change management process is the best way to ensure stakeholders are comfortable with change. The Why, How and What are components recognised in most models. Communication is the link between the disciplines. Begin with why the change is needed. Once that is clear, the how and what of the process are the next explanations to be shared. This will allow for easier utilization and acceptance (Sinek, 2009). A side effect of this course of action is an improved overall perception of the QA team, highlighting the team as more than test executors. There are plenty of change management techniques that provide additional suggestions for introducing change.

Appropriate use of this process prevents laying blame or placing the failure of some or all of the components on any individual or group, which would create conflict. Conflict decreases employee morale, creates a barrier to communication between groups and breaks down trust.

Quite simply, RCA has two parts:

  • Collecting Data
  • Asking questions

The data collection is a little more difficult than identifying a save, such as a code, requirements or environment issue. It begins early within the SDLC and not just in the execution phases. This is important when explaining the why of the process to the stakeholders and the rest of the organization. The following are some points that can be used when having these discussions.

Process focused

RCA will help determine which processes need adjusting or possibly revamping. Much like an assembly line, the movement of data and work products follows a design where information from one area feeds into another. On such lines, the flow and/or quality throughout could be improved to increase efficiency and the bottom line for the organization. RCA will provide the data to make informed decisions on improvement initiatives (Slack, Chambers, & Johnston, 2010).

One of the key ways to make stakeholders take notice of a process issue is to attach a dollar figure to it. Here is where some additional work is needed when documenting saves. Depending on how the organization is set up from a time tracking perspective, the data could already be present and formatted to provide the information needed. If not, there is a simple workaround for calculating the financial impact that can be used with any tool set. There are three components that need to be documented: Investigation, Fixing and Retest. As stated earlier, this process occurs throughout the entire SDLC.

These terms are not used solely in test execution. Investigation and Fixing are universal terms for any work that is done throughout the SDLC, while Retest would be new in some areas. The easiest way to think about Retest is as confirmation that the Fix is what is expected. For example, confirming that a reference previously missing from a requirement is now present.

For each component there must be a financial component. Most large organizations have a time tracking system from which data, such as hourly rates, can be pulled. Defect tracking tools can be customized to allow each group working on the save to key in the amount of time spent on it. With these two sets of data you now have a cost associated with each issue.

Here is an example using a base day cost of an issue:

First, record how many hours each person worked on the issue.

E.g. Lead 1 hour, Tester 2 hours, Developer 3 hours (all rough estimates, keyed into the comment section of the save).

Then determine the number of days, where 7.5 hours = 1 day.

Then multiply by the average “person day” cost (for this example use $450).

Using the example above, 6 hours would be 0.8 person days × $450.00 = $360.00.
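The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration and not part of any tracking tool: the function name `save_cost` and the dictionary of hours per role are assumptions made for the example, while the 7.5-hour day and the $450 person-day rate come from the text.

```python
# Hypothetical helper: names and structure are illustrative only.
HOURS_PER_DAY = 7.5      # from the example: 7.5 hours = 1 person day
PERSON_DAY_COST = 450.0  # from the example: average person-day cost in dollars


def save_cost(hours_by_role, hours_per_day=HOURS_PER_DAY, day_cost=PERSON_DAY_COST):
    """Return (person_days, cost) for the hours logged against one save."""
    total_hours = sum(hours_by_role.values())
    person_days = total_hours / hours_per_day
    # Multiply before dividing so the example figures stay exact in floating point.
    cost = total_hours * day_cost / hours_per_day
    return person_days, cost


# Rough estimates keyed into the save, as in the example above.
hours = {"Lead": 1, "Tester": 2, "Developer": 3}
person_days, cost = save_cost(hours)
print(f"{person_days:.1f} person days, ${cost:.2f}")  # 0.8 person days, $360.00
```

Keeping the hours per role separate, rather than keying in a single total, leaves room to apply different rates per role later if the time tracking system provides them.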

Therefore, the cost associated with the fix is $360.00. Next, a process or non-employee issue must be associated with it as well. This step can become difficult due to the amount of analysis that is needed. Frequently it is much simpler to put a quick label on it and forget about it.

Here are some guidelines to determine whether time should be spent on more detailed analysis:

  • Do I have an idea how this save could have been prevented in the first place?
  • Is the save significant? (i.e. Was there a lot of rework time spent on the defect?)
  • Could small saves turn into monsters?
    • Some saves could be seen as bricks. On its own in one project, a brick could be nothing.
    • If it is repeated in other projects it will turn into a wall.
    • Unresolved saves would have to be monitored throughout the year.

One of the simplest RCA tools that would help determine the root of the issue is the why-why analysis (Slack, Chambers, & Johnston, 2010). At the end of this type of analysis the human component will not be present.

Here is an example:

A coding issue caused all the reports to be deleted.

1 – Why did the reports get deleted?

Developer incorrectly coded the module

2 – Why did the developer incorrectly code the module?

The developer did not follow the coding standards

3 – Why did the developer not follow the coding standards?

The developer is new to the organization

4 – Why was the developer not trained on the new standards?

The developer was only given access to the SharePoint site where the standards reside

5 – Why didn’t anyone confirm with the developer that the standards were understood?

There was no time

As per the above example, the broken process here could be attributed to a poor training procedure. With this information, the issue can now be tracked to see whether it continues to occur. Also, now that there is a cost associated with it, a cost-benefit analysis can be applied to process breaks of this kind that recur regularly, and solution costs can be calculated.
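One way to track whether a process break keeps recurring is to tally saves by the process identified as the root cause. The sketch below is only an illustration: the record layout, the category labels and the sample figures are invented for the example and are not taken from any real project or tool.

```python
from collections import defaultdict

# Made-up records for illustration: (project, root-cause process, cost in dollars).
saves = [
    ("Project A", "training procedure", 360.00),
    ("Project B", "training procedure", 360.00),
    ("Project A", "requirements review", 120.00),
]


def summarize_by_process(records):
    """Group saves by root-cause process and return {process: (count, total_cost)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for _project, process, cost in records:
        totals[process][0] += 1
        totals[process][1] += cost
    return {process: (count, total) for process, (count, total) in totals.items()}


summary = summarize_by_process(saves)
# List the most expensive process breaks first.
for process, (count, total) in sorted(summary.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{process}: {count} save(s), ${total:.2f}")
```

Grouping by process rather than by person keeps the focus on what is broken, and a root cause that appears across several projects stands out as a candidate for deeper analysis.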

Taking the why-why example above and the $360.00 cost per save calculated earlier, the cost-benefit analysis could be as follows:

Each occurrence costs $360.00

It occurred 8 times in the past 12 months

Total costs over the year: $2880.00 (8 × $360.00)

Potential solution: a two-hour job shadow session for new developers with senior developers. With a daily average per-person cost of $450 for 8 hours, two hours each for the new and the senior developer brings the total solution cost to $225.00 per session.

If the department hires 4 developers a year, the total yearly cost would be approximately $900.00. That would be a cost saving of $1980.00 a year ($2880.00 − $900.00). Senior management would find this information meaningful due to the ROI.
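The same comparison can be expressed as a short Python sketch so that the inputs are easy to change. The function and parameter names are hypothetical. The inputs are the ones stated in the example (cost per occurrence, occurrences per year, a two-hour session, a $450 eight-hour day and four hires a year), and the figure of two participants per session is an assumption inferred from pairing one new developer with one senior developer.

```python
def cost_benefit(cost_per_occurrence, occurrences_per_year,
                 session_hours, participants, day_cost, hours_per_day,
                 hires_per_year):
    """Compare the yearly cost of a recurring save with the yearly cost of a fix."""
    problem_cost = cost_per_occurrence * occurrences_per_year
    hourly_rate = day_cost / hours_per_day
    session_cost = session_hours * participants * hourly_rate
    solution_cost = session_cost * hires_per_year
    return {
        "problem_cost": problem_cost,
        "session_cost": session_cost,
        "solution_cost": solution_cost,
        "savings": problem_cost - solution_cost,
    }


result = cost_benefit(
    cost_per_occurrence=360.00,
    occurrences_per_year=8,
    session_hours=2,
    participants=2,      # assumption: one new developer plus one senior developer
    day_cost=450.00,
    hours_per_day=8,
    hires_per_year=4,
)
for name, value in result.items():
    print(f"{name}: ${value:.2f}")
```

A positive `savings` value means the proposed fix costs less per year than the recurring save it is meant to prevent.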

RCA is a very powerful tool to help organizations improve the SDLC, and it can be adapted for use in all forms of development, from Agile to Waterfall. These metrics will assist the QA department in providing true quality assurance.

Works Cited

Berriault, J. (2012, September 16). Breaking down the root cause for the knowledge gap between testing and business. Retrieved February 5, 2013, from Athabasca Library:

Sinek, S. (2009). How great leaders inspire action [Motion Picture]. TED (Producer).

Slack, N., Chambers, S., & Johnston, R. (2010). Operations Management (6th ed.). Edinburgh Gate: Prentice Hall.