IT operations running well? 6 activities will let you know.

IT value chains: detect to correct

In “What every business leader should know about IT management,” I shared that it was possible for business leaders to understand “what’s going on” in IT by understanding the five value chains of IT:

 

-strategy to portfolio

-require to deploy

-request to fulfill

-detect to correct

-data to information

 

In this post, I will review the fourth value chain, detect to correct, having reviewed request to fulfill in my last post.

 

Detect to correct

Detect to correct is about how well your IT organization prevents services and supporting infrastructure from breaking down or degrading, as well as how well it manages things when the inevitable happens—something breaks. Simply put, this value chain aims to, as COBIT 5 suggests, “increase user productivity and minimize disruptions.” As you’d expect, the detect-to-correct value chain touches many IT activity categories. On my list are the following: capacity, availability, operations, incident, problem, and security.

 

What is the goal for detect to correct?

To successfully manage this value chain, your IT department must ensure its operational activities are running efficiently and effectively. Increasingly, accomplishing this requires that your IT organization automate its ability to drive compliance against IT standards and policies. IT should begin by making sure operational activities are performed as required and on schedule. Ask whether standard operational procedures have been defined, and ascertain how operations are monitored, measured, reported, and remediated. This includes making sure that the availability plan anticipates business needs and provides appropriate capacity based on your company’s requirements. At the same time, you want to know that capacity, availability, and performance issues are identified and routinely resolved; again, this cries out for automation. Ask which elements of an automation stack have been implemented.

 

When things break, you want to know that incidents are resolved according to service levels. You should ask your IT department for a comparison between service-level goals and actual delivery. IT should always aim to reduce the recurring causes of poor delivery; this is accomplished through policies, standards, automation, and problem management. Problem management is about decreasing the number of recurring incidents over time; ultimately, it is about making problems go away permanently. Finally, IT’s goal should be to run operations securely, which means running them in accordance with security policy. The goal here is to minimize the business impact stemming from information vulnerabilities and incidents over time.

 

Measuring whether improvement is indeed happening

Your IT organization should be continuously improving how it detects the need for corrections across its operational activity categories. IT should be tracking performance against a number of key performance indicators (KPIs) and metrics. Given the broad scope of this value chain, I will highlight just a few KPIs here and why they are important. The percent of services that meet performance goals and the percent of applications with high availability are great starting points. Together these numbers tell you how well you are running the activities around availability and capacity. Next, ask how well incident response times are being delivered against goal. You want to see the percent of incident response times met improve month over month. This KPI shows both the quality of the processes and how responsive IT is in fixing important issues.
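To make the month-over-month KPI concrete, here is a minimal Python sketch of how the percent of incident response times met could be computed from incident records. The `Incident` record, its field names, and the 30-minute goal are assumptions made for illustration; they do not come from any particular service desk tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record layout; real ticketing tools will differ.
    opened: datetime
    responded: datetime
    response_goal: timedelta  # assumed service-level target for first response


def pct_response_goals_met(incidents):
    """Return {(year, month): percent of incidents responded to within goal}."""
    met = defaultdict(int)
    total = defaultdict(int)
    for inc in incidents:
        key = (inc.opened.year, inc.opened.month)
        total[key] += 1
        # A response exactly at the goal counts as met.
        if inc.responded - inc.opened <= inc.response_goal:
            met[key] += 1
    return {key: 100.0 * met[key] / total[key] for key in sorted(total)}


goal = timedelta(minutes=30)  # illustrative value only
incidents = [
    Incident(datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 20), goal),
    Incident(datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 0), goal),
    Incident(datetime(2024, 2, 2, 8, 0), datetime(2024, 2, 2, 8, 10), goal),
    Incident(datetime(2024, 2, 20, 8, 0), datetime(2024, 2, 20, 8, 30), goal),
]
monthly = pct_response_goals_met(incidents)
for (year, month), pct in monthly.items():
    print(f"{year}-{month:02d}: {pct:.1f}% of response goals met")
```

With this made-up sample, January comes out at 50% and February at 100%, which is the kind of upward trend you would hope to see in a real report.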

 

Next, since problem management aims to decrease repeating issues, it is good to get a sense of IT’s responsiveness when it comes to getting things fixed permanently in a timely manner. For this activity category, a good KPI is the percent of problems resolved by their due dates. Finally, to measure the security of operations as a whole, you want to look at the percent of incidents classified as security incidents. This number should be going down over time. And to view security proactively, you should look at the average time to run a policy check and the mean time to recover from non-compliance. These two KPIs will tell you whether infrastructure is being driven to standard, which in turn prevents downtime for changing and patching software in addition to limiting security vulnerability issues.
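The problem and security KPIs above are simple ratios and averages. The following Python sketch shows one way they might be calculated; the function names, the tuple layouts, and the "security" category label are assumptions for illustration, and each function expects a non-empty list.

```python
from datetime import date, datetime, timedelta


def pct_problems_resolved_on_time(problems):
    """problems: list of (due_date, resolved_date or None if still open)."""
    on_time = sum(
        1 for due, resolved in problems
        if resolved is not None and resolved <= due
    )
    return 100.0 * on_time / len(problems)


def pct_security_incidents(categories):
    """categories: one category label per incident (labels are hypothetical)."""
    return 100.0 * sum(1 for c in categories if c == "security") / len(categories)


def mean_time_to_recover(findings):
    """findings: list of (detected, remediated) datetimes for non-compliance."""
    durations = [remediated - detected for detected, remediated in findings]
    return sum(durations, timedelta()) / len(durations)


# Made-up sample data, for illustration only.
problems = [
    (date(2024, 3, 1), date(2024, 2, 27)),   # resolved early
    (date(2024, 3, 10), date(2024, 3, 15)),  # resolved late
    (date(2024, 3, 20), None),               # still open, counts as missed
    (date(2024, 3, 25), date(2024, 3, 25)),  # resolved on the due date
]
categories = ["outage", "security", "performance", "outage", "security"]
findings = [
    (datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 12, 0)),
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 11, 0)),
]

on_time_pct = pct_problems_resolved_on_time(problems)
security_pct = pct_security_incidents(categories)
mttr = mean_time_to_recover(findings)
print(on_time_pct, security_pct, mttr)
```

Counting still-open problems as missed is a design choice in this sketch; your IT organization may define the KPI differently, so ask how open items are treated before comparing numbers.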

 

Where do you go from here?

Start by asking your IT organization about its detect-to-correct value chain. Then make sure it is actively working to improve the process; this is never-ending work. That way, when you need IT to perform well, you get what you need!

 

Related links:

Solution page: IT Performance Management

Twitter: @MylesSuer

About the Author
Mr. Suer is a senior manager for IT Performance Management. Prior to this role, Mr. Suer headed IT Performance Management Analytics Product ...


The opinions expressed above are the personal opinions of the authors, not of HP.