Improved IT Service Management requires better process management

According to ITIL Version 3, 55 percent of all incidents involving outages are actually self-inflicted. Does this mean that change management or incident management is the problem? Or is this really a symptom of viewing change and incidents as siloed activities? I want to suggest that viewing incident and change separately is like the blind monks who each touch a different part of an elephant and then try to describe the whole animal. The fact is that incident, problem, and change are all part of one system focused on improving the effectiveness and efficiency of service delivery.

 

Using a real case example to “make it real”

Several years ago I was able to spend some time with a leading Fortune 500 financial organization. This customer was interested in using analytics to improve the quality of their incident management. However, they already thought that they were “pretty good at incident management”. The interesting thing was that, in the end, it was unclear whether they were indeed “good at incident management” or not. Who knows, viewed in isolation, maybe they were?

They really needed to ask, “How good are we at service delivery?”  When this question was asked the answer was clear—but I am getting ahead of myself here… Let’s first take a peek at their data so you can draw your own conclusions.

 

First call resolution (FCR)

 

[Figure: FCR blog.jpg]

 

I have shown this company’s first call resolution percentage to many audiences without mentioning the customer’s name, and asked for a show of hands on whether the percentage is “good”, “okay” or “bad”. According to my (highly scientific) results, the vast majority of hands are raised for “bad”. Everyone seems to say that a “good” first call resolution rate is 75 percent or higher.

In fact, a recent survey of HP customers showed the same results. So clearly, January FCR for this company is lower than anyone would desire. With that being said, I was left asking what happened to push the number lower than it should be.
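For readers who want to reproduce this metric from their own ticket data, here is a minimal sketch of how a first call resolution percentage can be computed. The record layout and the `resolved_on_first_contact` field are assumptions made for illustration, not the schema of any HP product, and the sample tickets are invented.

```python
def first_call_resolution(tickets):
    """Return the percentage of tickets resolved on first contact.

    Each ticket is a dict with a hypothetical boolean field
    'resolved_on_first_contact'; real service desk tools will differ.
    """
    if not tickets:
        return 0.0
    resolved = sum(1 for t in tickets if t["resolved_on_first_contact"])
    return 100.0 * resolved / len(tickets)


# Invented sample: 3 of 4 tickets were closed on the first call.
sample = [
    {"id": 1, "resolved_on_first_contact": True},
    {"id": 2, "resolved_on_first_contact": True},
    {"id": 3, "resolved_on_first_contact": False},
    {"id": 4, "resolved_on_first_contact": True},
]
print(first_call_resolution(sample))  # 75.0, the "good" threshold audiences cite
```

Tracking this number per month, rather than as one yearly figure, is what makes a dip like the January one visible.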

 

Incident Escalations

Next, let’s look at Incident Escalations. At first blush, things looked okay for incident escalations by category.

 

[Figure: incident escal by category.jpg]

 

I have been told by our consultants that an average of 1.2 escalations per incident is actually pretty good. If every incident is escalated from the service desk at least once and none more than twice, 1.2 means that 20 percent of the incidents are escalated more than once. In other words, 20 percent of the incidents are first escalated to an organization that cannot fix the customer’s problem and have to be sent somewhere else. Read the same way, 2.5 means that all incidents are escalated more than once and 50 percent are escalated three times.
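The arithmetic behind these readings can be checked with a short sketch. It assumes, as the interpretation above does, that every incident is escalated either the whole number just below the average or one more than that; the function name is made up for this example.

```python
from math import floor


def escalation_split(average):
    """Split an average escalation count into two adjacent whole counts.

    Assumes every incident is escalated either floor(average) or
    floor(average) + 1 times, which is the reading used in the text.
    Returns (low_count, share_low, high_count, share_high).
    """
    low = floor(average)
    # Round to absorb floating-point noise such as 1.2 - 1 = 0.1999...
    share_high = round(average - low, 4)
    share_low = round(1 - share_high, 4)
    return low, share_low, low + 1, share_high


for avg in (1.2, 2.5):
    low, share_low, high, share_high = escalation_split(avg)
    print(f"average {avg}: {share_low:.0%} escalated {low}x, "
          f"{share_high:.0%} escalated {high}x")
```

For 1.2 this gives 80 percent escalated once and 20 percent escalated twice; for 2.5 it gives 50 percent escalated twice and 50 percent escalated three times. A real distribution could be more skewed than this, so the split is best treated as a rough reading of the average rather than a measurement.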

 

However, when drilling into incidents by subcategory, you can see that there is an issue here. Take a look at the incident escalations for check processing (a made-up name to protect the not-so-innocent).

 

[Figure: incident escalation by subcategory.jpg]

 

 

Think about what a 2.5 says for this subcategory—the organization is not only ineffective but inefficient. And think about the customer satisfaction of customers that have averaged 2.5 escalations for their issues. What a terrible user experience! Armed with this information, let’s look at Incident Volume.

 

Incident Volume

As can be seen below, the volume of incidents dramatically rose in November and stayed high into January.

 

[Figure: incident volume.jpg]

 

And even more interesting, incident resolution times lengthened continuously throughout the period.

 

[Figure: incident resolution.jpg]

 

To understand more about this, I looked at Incident Aging. Here everything came together. Incident volume increased dramatically and, even though throughput also rose, the number of incidents taking more than a day to fix grew dramatically.

 

[Figure: incident aging.jpg]
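As a rough illustration of the aging measure, the sketch below computes the share of incidents that took more than a day to fix. The `(opened, resolved)` tuple layout and the sample timestamps are invented for this example and do not come from the customer's data.

```python
from datetime import datetime, timedelta


def share_older_than(incidents, threshold=timedelta(days=1)):
    """Return the fraction of incidents that took longer than threshold to fix.

    incidents is a list of (opened, resolved) datetime pairs; this tuple
    layout is an assumption for illustration only.
    """
    if not incidents:
        return 0.0
    aged = sum(1 for opened, resolved in incidents
               if resolved - opened > threshold)
    return aged / len(incidents)


# Invented sample data: two of four incidents took more than a day.
sample = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),  # 2 hours
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 2, 8)),   # 23 hours
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 3, 9)),   # 2 days
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 5, 17)),  # 3 days 8 hours
]
print(share_older_than(sample))  # 0.5
```

Computed month by month, a rising share alongside rising volume is the pattern described above: more incidents are being closed, yet a growing portion of them take longer than a day.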

 

 

Finally, let’s look at incidents by severity—especially across core systems. (I will warn you, this figure is scary!)

 

[Figure: incident table.jpg]

 

 

Customer debrief

Armed with this information, I wanted to know what caused such a spike in incidents and, even worse, in the number of Sev 1 incidents. After all, this is a financial institution whose customers like to be able to get at their money at the end of the year. So I called a meeting to review. As I presented the data, I asked the simplest of questions, “What the h… is going on?”

Critical systems were clearly being brought to their knees. At this moment, everyone in the room smiled. Someone answered that they had gotten all of their changes done by year end. You see, their change window was closed in December, so they pushed everything through in November.

 

Conclusions

The problem that prevented me from seeing whether they had a good incident process or not was change. The question still remains, “Can a good incident process exist without a good change process or, for that matter, a good problem process?” The answer is no. As I suggested at the beginning, we should no longer view performance in isolation. What we are delivering is not incident, problem, or change; it is service delivery. And these are the measures that matter. This is how IT can truly make it matter!

 


 

Related links:

Solution page:  Service Management

Twitter: @MylesSuer

Comments
StuartRance | ‎09-09-2013 01:21 PM

That is a nice use of data to help inform an intelligent discussion of how well IT is meeting the customer needs. Thank you for sharing.

 

I tried to locate where in the ITIL books this figure of 55% comes from, but could not find it.

MylesS | ‎09-09-2013 01:25 PM

Thank you, Stuart. I was an ITIL Version 3.0 reviewer. It was in the original service transition book. It would take some work, but I could go back and find the quote.

 

Myles

chuck_darst | ‎09-10-2013 08:45 AM

The percentage of incidents caused by change is elusive. In my v3 Foundations class ~6 years ago, 60% was in the student workbook.

 

The past couple of years, Forrester and the itSMF AMS have run a survey which includes one question on this topic. The results are presented at the FUSION event in the fall and published later as "The State Of IT Service Management In 2012". The most common answer last year (after Don't Know) is 10-39%. I don't think you can consider this scientifically accurate, but it is interesting.

 

Here is another interesting one. Gartner published a paper this past May "Know the Top Five Reasons Why New Change Management Implementations Fail and How to Avoid Failure". In this paper, there is a statement to the effect of 80% of incidents encountered by infrastructure and operations will be caused by failed changes (and here is the key) in organizations that have not implemented an effective change management process.

 

BTW, it would be hard to prove, but I think you could make an interesting analysis/case for higher percentages of major service disrupting incidents to be caused by change in larger, distributed organizations. See http://h30499.www3.hp.com/t5/IT-Service-Management-Blog/Improving-Service-Quality-by-the-Numbers-11-...

 

Last thing as I love this topic, historically to get up into the 60-80% type numbers of incidents caused by change, you had to include introducing "bugs" during a change. The people tasked with the change management process and tasks could have done everything right. Luckily Agile will fix this :-).

 

Chuck

MylesS | ‎09-10-2013 08:55 AM

Thanks Chuck! Now I don't have to look this up. The fact is that the number is large, and if we can do a better job at change and, as part of this, lock down configuration, we can reduce incident volume and outages. This improves service resilience and availability. This is important for IT and for the rest of the business that runs on IT!

| ‎09-12-2013 10:22 AM

Does HP SM 9.32 have the ability to generate reports such as you use in this article?

MylesS | ‎09-12-2013 10:52 AM

These reports were generated through a supporting product called HP Executive Scorecard. It also has the ability to create KPIs and drive performance and accountability to process improvement goals. To see a talk that includes this, please click this link: http://ow.ly/oOsVt. Please note that some supporting reports are also provided with HP Service Manager.

About the Author
Mr. Suer is a senior manager for IT Performance Management. Prior to this role, Mr. Suer headed IT Performance Management Analytics Product ...


The opinions expressed above are the personal opinions of the authors, not of HP.