IT value chain: detect to correct value stream ensures efficiency and effectiveness!

The detect to correct value stream is about the efficiency of IT operations and day-to-day IT management. Using a tennis analogy, the “forehand” of detect to correct is concerned with how well IT monitors, detects and corrects issues, that is, how well IT prevents business services and supporting infrastructure from breaking down or degrading in performance. The “backhand” covers what happens when the inevitable occurs and something breaks. It is about how well IT manages its internal processes when an event triggers an incident, self-service fails to solve a user issue, or a user calls to say there is an issue. In other words, how efficiently and effectively are the following handled:

 

  • Incidents get managed
  • Problems prevent further incidents from happening
  • Changes are managed in response to incidents and/or critical events

This can involve how IT manages the state of configuration or how IT uses automation to eliminate/decrease the time it takes to fix issues. Finally, detect to correct must capture the knowledge and share it effectively. Simply put, this value stream aims to, as COBIT 5 suggests, “increase (end) user productivity and minimize disruptions.” As you’d expect, the detect to correct value stream touches many IT activity categories.

 

What are the goals for detect to correct?

To successfully manage this value stream, IT must ensure its operational activities are running efficiently and effectively. Increasingly, this requires IT organizations to automate their ability to drive compliance against IT standards and policies. This is particularly important for activities such as configuration management. IT should ensure that operational activities are performed as required. IT leaders, in particular, should lead by determining standard operational procedures, and they should ensure IT operations are regularly monitored, measured, reported and remediated. This includes making sure that the availability plan anticipates business needs and provides appropriate capacity based on business requirements. At the same time, IT must know that service monitoring is occurring and, where possible, capacity, availability and performance issues are being identified and routinely resolved. These areas cry out for automation in configuration management, and diagnostics and remediation.

 

When things break, you want to know that incidents are resolved according to service levels, regardless of how they are initiated: self-service, phone call or event. You should reconcile service-level goals with actual performance. IT should always aim to reduce the repeat offenders of poor delivery; this is accomplished through policies, standards, automation and problem management. The goal of problem management is a decrease in the number of recurring incidents over time. Ultimately, it is about making problems go away permanently. IT should also aim to improve the knowledge base and the ability to share information and collaborate. Where change is invoked, you want it to introduce as few new errors as possible on top of the offending issue. Finally, the goal should be to run operations securely, which means running them in accordance with security policy. The objective should be to minimize the business impact stemming from information vulnerability and incidents over time.

 

Measuring whether improvement is happening

IT should be continuously improving how it detects the need for corrections across activity categories. IT should track its performance against a number of key performance indicators (KPIs) and metrics. Given the broad scope of this value stream, I will selectively highlight operational KPIs and why they are important. Everything should start with whether operational procedures are in place to ensure the delivery of IT services. Here, you want to know about the number of non-standard operational procedures executed and the number of incidents caused by operational problems. From there, you want to look at the percentage of services that meet performance goals and the percentage of applications with high availability. These two KPIs tell you how well you are running activities associated with availability and capacity. Next, you should be managing how well incident response times are being delivered against their goals. You want the percentage of incident response times that are met to improve month over month. This is a great way to show your business customers the quality of your process management as well as your responsiveness to their issues.
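To make the response-time KPI concrete, here is a minimal Python sketch of how the percentage of met response targets could be computed per month. The function name `response_sla_attainment`, the (opened, responded) record layout, the 30-minute target and the sample data are all illustrative assumptions, not something taken from a particular service desk tool.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def response_sla_attainment(incidents, target):
    """Return {(year, month): percent of incidents responded to within target}.

    `incidents` is an iterable of (opened, responded) datetime pairs.
    This record layout is a hypothetical one chosen for the example.
    """
    met = defaultdict(int)
    total = defaultdict(int)
    for opened, responded in incidents:
        key = (opened.year, opened.month)
        total[key] += 1
        if responded - opened <= target:
            met[key] += 1
    return {key: 100.0 * met[key] / total[key] for key in sorted(total)}


# Made-up sample data: two incidents in January, two in February.
incidents = [
    (datetime(2014, 1, 6, 9, 0), datetime(2014, 1, 6, 9, 20)),      # within target
    (datetime(2014, 1, 9, 14, 0), datetime(2014, 1, 9, 15, 30)),    # target missed
    (datetime(2014, 2, 3, 8, 0), datetime(2014, 2, 3, 8, 10)),      # within target
    (datetime(2014, 2, 11, 16, 0), datetime(2014, 2, 11, 16, 25)),  # within target
]

attainment = response_sla_attainment(incidents, timedelta(minutes=30))
for (year, month), pct in attainment.items():
    print(f"{year}-{month:02d}: {pct:.0f}% of response targets met")
```

Grouping by the month in which the incident was opened keeps the trend comparable from one month to the next; a real report would likely also split the figure by priority, since response targets usually differ by severity.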

Next, it is good to get things fixed permanently. Another good KPI here is the percentage of problems resolved by due dates. For changes initiated during detect to correct, we want to know about the change success rate—a lower number means your change process is actually contributing to poorer availability. At the same time, you want configuration management to provide sufficient information about services so they are effectively managed, change impact is thoroughly assessed, and service incidents are dealt with appropriately. It is essential that configuration is not only up-to-date, but controlled and managed. You should measure configuration by the following two KPIs: 1) the number of deviations between the configuration repository and live configuration; and 2) the number of discrepancies relating to incomplete or missing configuration information.
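The two configuration KPIs above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the repository and the discovered live state are modeled as plain dictionaries keyed by configuration item name, and `configuration_kpis`, the attribute names and the sample data are hypothetical rather than the schema of any real configuration management database.

```python
def configuration_kpis(repository, live, required_fields):
    """Return (deviations, discrepancies) for two configuration snapshots.

    deviations:    attribute values that differ between repository and live,
                   plus items that exist on only one side.
    discrepancies: required fields that are missing or empty in the repository.
    """
    deviations = 0
    for name, recorded in repository.items():
        actual = live.get(name)
        if actual is None:
            deviations += 1  # recorded in the repository but not found live
            continue
        # Compare only attributes that both sides actually report.
        deviations += sum(
            1 for key, value in recorded.items()
            if key in actual and actual[key] != value
        )
    # Items running live that the repository does not know about.
    deviations += sum(1 for name in live if name not in repository)

    discrepancies = sum(
        1
        for recorded in repository.values()
        for field in required_fields
        if not recorded.get(field)
    )
    return deviations, discrepancies


# Made-up sample data for illustration.
repository = {
    "web01": {"os": "RHEL 6.5", "owner": "ecommerce", "memory_gb": 16},
    "db01": {"os": "RHEL 6.4", "owner": "", "memory_gb": 64},
}
live = {
    "web01": {"os": "RHEL 6.5", "memory_gb": 32},  # memory differs from the record
    "db01": {"os": "RHEL 6.4", "memory_gb": 64},
    "app07": {"os": "RHEL 6.5", "memory_gb": 8},   # unknown to the repository
}

deviations, discrepancies = configuration_kpis(repository, live, ("os", "owner"))
print(f"Deviations: {deviations}, incomplete records: {discrepancies}")
```

Whether an unrecorded live item counts as a deviation or as missing information is a policy choice; the sketch counts it as a deviation, and both numbers should trend toward zero as configuration comes under control.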

 

Finally, to measure the security of operations as a whole, you want to look at the percentage of incidents classified as security incidents. This number should be going down over time if an IT operation has effective control. And to view security proactively, you should look at the average time to run a policy check and the mean time to recover from non-compliance. These KPIs tell you if infrastructure is being driven to standard, which in turn prevents downtime and eliminates a source of security vulnerability.
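As a rough illustration of two of these security measures, the Python sketch below computes the share of incidents classified as security incidents and the mean time to recover from non-compliance. The function names, the "security" category label, the (detected, remediated) record layout and the sample values are assumptions made for the example only.

```python
from datetime import datetime, timedelta


def security_incident_share(categories):
    """Percent of incidents whose category is "security" (label assumed)."""
    if not categories:
        return 0.0
    hits = sum(1 for category in categories if category == "security")
    return 100.0 * hits / len(categories)


def mean_time_to_recover(findings):
    """Average time from detecting non-compliance to remediating it.

    `findings` holds (detected, remediated) pairs; findings that are still
    open (remediated is None) are left out of the average.
    """
    durations = [fixed - found for found, fixed in findings if fixed is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Made-up sample data for illustration.
categories = ["security", "hardware", "software", "security", "network"]
findings = [
    (datetime(2014, 3, 1, 8, 0), datetime(2014, 3, 1, 12, 0)),  # 4 hours
    (datetime(2014, 3, 2, 9, 0), datetime(2014, 3, 2, 11, 0)),  # 2 hours
    (datetime(2014, 3, 3, 10, 0), None),                        # still open
]

share = security_incident_share(categories)
mttr = mean_time_to_recover(findings)
print(f"Security incidents: {share:.0f}% of all incidents")
print(f"Mean time to recover from non-compliance: {mttr}")
```

Leaving open findings out of the average keeps the number computable, but it can flatter the result, so it is worth reporting the count of open findings next to it.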

 

Where do you go from here?

Start where you drive the most improvement in business performance. Often this can involve fixing your change process or the method by which you queue up incidents for remediation. Regardless of where you start, be sure to measure performance against concrete goals to drive real improvement.

 

Related links: IT value chain: The request to fulfill value stream

Finding your true value

Value streams: A user-centric model for the enterprise CIO

Solution page: IT Performance Management

Twitter: @MylesSuer

 

Labels: IT operations
About the Author
Mr. Suer is a senior manager for IT Performance Management. Prior to this role, Mr. Suer headed IT Performance Management Analytics Product ...
The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.