Applying the Five Ws to Incident Management

Blogs · 4 min · July 26, 2023

In this blog post, David introduces us to the five Ws of information gathering: Who? What? When? Where? Why? Answering the five Ws helps Incident Managers gain a deeper understanding of the cause and impact of incidents, not just their remedy, leading to more robust solutions. Fixing the cause of an outage is only the beginning, and the five Ws pave the way for team collaboration during investigations.

How do we fix it?

If you're new to the ideas of Incident Management, you can read David's introductory blog post giving a quick intro to the day-to-day life of an Incident Manager.

One of the first things that comes to mind when an incident occurs is a single question: OK, we have a problem. How do we fix it? That is a perfectly valid and logical response, but as Incident Managers it is not always the one that meets our goals most effectively. Before we can even start to think about how to fix an issue, we have a series of rapid-fire questions to process.

Information is key!

Who? What? When? Where? Why?

Those are the questions we really need to answer before we can move on to how.

The five Ws of information gathering

Who?

  • Who is impacted?
  • Who do we need to call?
  • Who knows more about this?
  • Who are our end users?

If we have no idea who is impacted by this issue, or what kind of expertise we are likely to need to fix it, how are we going to answer our overall question of how do we fix it?

What?

  • What is not working as intended?
  • What has changed recently?
  • What is the impact on our clients?
  • What is the impact on end-users?

If we have no idea what is not working as we expect it to, and no idea what the impact is, how can we communicate effectively with our customers, and how can we start working on a resolution?

When?

  • When did this all start?
  • When will the impact get worse?
  • When do we need fresh eyes?

If we can't see the point at which an issue started, it becomes very hard to identify a cause and a resolution. Some issues, such as a high CPU or low memory warning, can be just that: a warning. So we need to know when that warning is going to become a problem, and we need to be able to articulate that to our customers if needed. Likewise, we as a company have a duty of care. We need to acknowledge that an engineer may be at the end of an on-call period and that a handover may be more beneficial.
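As a rough illustration of asking when a warning becomes a problem, the sketch below fits a straight line to recent metric samples and extrapolates to a threshold. It assumes the trend is roughly linear, and the function name, the sample values and the 90% threshold are all hypothetical, not taken from any Form3 tooling.

```python
from typing import Optional, Sequence, Tuple


def minutes_until_threshold(
    samples: Sequence[Tuple[float, float]], threshold: float
) -> Optional[float]:
    """Estimate minutes until a metric crosses `threshold`.

    `samples` holds (minute, value) pairs, oldest first.
    A least-squares straight line is fitted, so this is only a rough guide.
    Returns None if there is no upward trend, 0.0 if already at the threshold.
    """
    if len(samples) < 2:
        return None
    last_t, last_v = samples[-1]
    if last_v >= threshold:
        return 0.0

    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, no trend to fit

    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None  # flat or falling: the warning is not getting worse

    intercept = mean_v - slope * mean_t
    crossing_time = (threshold - intercept) / slope
    return max(crossing_time - last_t, 0.0)


# Hypothetical CPU readings (%), taken every five minutes.
cpu = [(0, 60.0), (5, 65.0), (10, 70.0), (15, 75.0)]
print(minutes_until_threshold(cpu, 90.0))  # 15.0
```

An estimate like this is only a conversation starter for customer updates and handover planning; real metrics are rarely this well behaved.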

Where?

  • Where is the stopping point?
  • Where, geographically, are our blockers?

In infrastructure, a blocker can be a strong sign of where to start looking for a fix. If we look at the flow of information, we could start at the very beginning, with the health check, but how long will that take? If we know where our stopping point is, our blocker, we can "skip" five steps and focus on the place the logs tell us to check first.
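The idea of finding the stopping point can be sketched in a few lines. This is a minimal example, assuming we already know the ordered stages of a flow and which stages left evidence in the logs; the stage names and the function are hypothetical and do not describe any real system.

```python
from typing import Optional, Sequence, Set


def find_stopping_point(
    stages: Sequence[str], stages_seen: Set[str]
) -> Optional[str]:
    """Return the first stage in the flow with no log evidence.

    Returns None when every stage was reached.
    """
    for stage in stages:
        if stage not in stages_seen:
            return stage
    return None


# Hypothetical flow, in the order a request passes through it.
flow = ["health-check", "load-balancer", "api", "queue", "database"]
# Hypothetical set of stages that appear in the logs for the failing request.
seen_in_logs = {"health-check", "load-balancer", "api"}

print(find_stopping_point(flow, seen_in_logs))  # queue
```

Here the investigation can start at the queue rather than re-checking the three stages that are known to have worked.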

Why?

  • Why does the network traffic route that way?
  • Why is this customer affected and this one is not?
  • Why did this start at this time?

Why can be one of the most effective initial tools in our arsenal. It is perhaps the key question Incident Managers can at times forget, because why can be seen as a Problem Management question. Why did it break? Well, that's for root cause analysis. Yet why can point an investigation in a direction very quickly, because it can identify something out of the normal, something unusual and unexpected.

Why can lead us to the answers to so many questions and open our minds to more questions that could have been overlooked in a rush. The Incident Management process should never be a rush; it should be a smooth process of decisive but deliberate choices. All our questions lead us to more questions, often more questions than answers. We must work together to choose which of these questions need answers, to find our priorities.

Answering the Five Ws

One of the huge benefits at Form3 is the tooling we have and, more than that, the tooling Incident Management has. It can be easy to think that a technical tool, for monitoring for example, should be used by Development to provide information to Operations. But good dashboards and metrics that Incident Management has access to can answer most of these questions without an Incident Manager ever needing to ask an engineer. Our Incident Managers frequently rely on logz.io and Grafana as their sources of information.

  • Collaboration.
  • Communication.
  • Consideration.

When we work with Development in a DevOps environment, be that DevOps or SecDevOps, we give Incident Management more tools and more options, so our engineers can focus on the fix while we focus on the impact and the five Ws.

Conclusions

The five Ws can be the best place to start in an incident scenario, but not all at once. We don't need answers to why, what, when and who before we can reach how, but asking them means we know the purpose of being there, and as our decisions are made, they build the foundation of the best possible solution, the best way to restore service. I've seen so many people seek how out of a need for instant gratification. Sometimes it works; often, however, it removes the confidence of technical teams who must say "we don't know".

I say, we don't know yet, but we have questions and that's the perfect start.

Written by

David Macarthur Incident Management Specialist

David is one of our Incident Managers at Form3. He has a focus on continual improvement in our processes across the whole company. He is also passionate about accessibility, diversity, inclusion and leadership.

You can find David on LinkedIn where he has several articles on various topics.