Business Service Management (BAC/BSM/APM/NNM)
More than just monitoring, BSM provides the means to determine how IT impacts the bottom line. Its purpose and main benefit is to ensure that IT Operations can reactively and proactively determine where to spend their time to best impact the business. This covers event management to solve immediate issues, resource allocation, and performance reporting based on data from applications, infrastructure, networks and third-party platforms. BSM includes powerful analytics that give IT the means to prepare, predict and pinpoint by learning behavior and analyzing IT data forward and backward in time, applying Big Data analytics to IT Operations.

Freddy Krueger, James Bond and IT Cost Optimization

by Michael Procopio


Top 10 ways to improve IT Cost Optimization. Another fun video from Mark Leake, this time with appearances from Freddy and James.


 












Get the latest updates on our Twitter feed @HPITOps http://twitter.com/HPITOps


Join the HP Software LinkedIn Group


For the Business Availability Center, Michael Procopio

HP Software Universe – Mainstage Andy Isherwood


Andy Isherwood, VP of Support & Services, kicked off Mainstage.


There are four key areas shown in the picture above. HP announced its IT Financial Management offering this week. Andy likened ITFM to an ERP system for IT. Information Management magazine wrote an article on HP ITFM.


HP has had offerings in IT Performance Analytics and IT Resource Optimization for a while. HP Cloud Assure was announced in May 2009: HP Unveils “Cloud Assure” to Drive Business Adoption of Cloud Services


Some key points from his opening remarks:



  1. Prepare for coming out of the recession while cutting costs and innovating.


  2. Best in class means being good at all four: aligning to the business, taking out costs, increasing efficiency and consolidating.


  3. JetBlue, Altec and T-Mobile were the winners of the HP Software Award of Excellence.


  4. As an example of the quick ROI companies can get, Altec produced a 10% reduction in application downtime, 20% faster response time, a 15% increase in customer satisfaction and a 300% improvement in application transaction time in six months.


  5. Last year we were HP Software; this year we are HP Software and Solutions. This reflects the combining of HP Software with HP Consulting and Integration. The net result is increased delivery options. In addition to offering software for in-house use, HP now has EDS, SaaS, and continues with its partners.


  6. The HP SaaS business is seven years old this year and has 650 customers.


You can read other coverage of HP Software Universe in the ITOpsBlog. There are a variety of Twitter accounts you can follow:


HPITOps – Covers BSM, ITFM, ITSM, Operations and Network Management


HPSU09 – Show logistics and other information


HPSoftwareCTO


informationCTO


HPSoftware


BTOCMO – HP BTO Chief Marketing Officer


as well as the Twitter hashtag #HPSU09


For HP BSM, Michael Procopio


 

HP Software Universe - day 1

by Michael Procopio


 


Today was the first day of Software Universe. I had customer meetings all day today. Here are some interesting items from my conversations.



  1. Most said budgets were down in 2009 and will be flat to down in 2010. But a few whose work is related to government stimulus said theirs will be up.

  2. Co-sourcing and outsourcing continue as ways to reduce costs.

  3. A few were focusing on asset management with the express purpose of getting rid of things in the environment they don’t need anymore. They know the assets are out there, but they need to find them first.

  4. Most customers I spoke to said they keep aggregated performance data for 2 years; the range was 18 months to 5 years.

  5. There was an interesting discussion about the definition of a business service versus an IT service. The point being made was that a business service by definition involves more than IT. While I agree this is a good point, I think the IT industry has focused on “business service” as a way to say, “I’m thinking about this IT service in the context the business thinks about it, not just from my own IT-based perspective.”

  6. A number of customers have implemented or are about to implement NNMi. If this is something you are interested in, check out the NNMi Portal.

  7. Many customers are moving to virtualized environments; the highest percentage I heard was 70%. Another customer forces all internal developers to deliver software as a virtual image.

  8. Another topic was how to monitor out-tasked items. For example, some part of what you offer is delivered by a third party — how do you make sure they are living up to your standards? Two methods I heard were: 1) use HP Business Process Monitor; 2) get the third party to send you alerts from their monitoring system.

  9. On the question of whether your manager of managers sends data back to sync the original tools: one did, one didn’t. For the one that did, it was part of a closed-loop process, sketched in code after this list.

    • The monitoring tool finds a problem and sends an alert to the MOM (manager of managers).

    • The MOM sends an event ID back to the monitoring tool.

    • A subject matter expert uses the monitoring tool to diagnose the problem.

    • Once diagnosed, the expert updates the monitoring tool, which updates the MOM.
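Here is a minimal Python sketch of that closed-loop sync. All class, method and ID names are hypothetical, invented to illustrate the four steps above; this is not any vendor's API.

```python
# Minimal sketch of the closed-loop alert sync described above.
# All class and method names are hypothetical, not an HP API.
import itertools

class MonitoringTool:
    def __init__(self, name):
        self.name = name
        self.alerts = {}   # local alert ID -> alert record

    def detect_problem(self, description):
        local_id = f"{self.name}-{len(self.alerts) + 1}"
        self.alerts[local_id] = {"description": description,
                                 "status": "open", "mom_event_id": None}
        return local_id

    def link_mom_event(self, local_id, mom_event_id):
        # Step 2: the MOM sends its event ID back so the records stay in sync.
        self.alerts[local_id]["mom_event_id"] = mom_event_id

    def resolve(self, local_id, diagnosis, mom):
        # Step 4: the diagnosing expert updates the tool, which updates the MOM.
        alert = self.alerts[local_id]
        alert.update(status="resolved", diagnosis=diagnosis)
        mom.close_event(alert["mom_event_id"], diagnosis)

class ManagerOfManagers:
    _ids = itertools.count(1)

    def __init__(self):
        self.events = {}

    def receive_alert(self, tool, local_id, description):
        # Step 1: the monitoring tool forwards the alert to the MOM.
        event_id = f"MOM-{next(self._ids)}"
        self.events[event_id] = {"source": tool.name, "local_id": local_id,
                                 "description": description, "status": "open"}
        tool.link_mom_event(local_id, event_id)   # step 2
        return event_id

    def close_event(self, event_id, diagnosis):
        self.events[event_id].update(status="closed", diagnosis=diagnosis)

# Walk through the four steps:
tool = MonitoringTool("net-monitor")
mom = ManagerOfManagers()
alert_id = tool.detect_problem("WAN link latency above threshold")
mom.receive_alert(tool, alert_id, "WAN link latency above threshold")
tool.resolve(alert_id, "Misconfigured QoS policy on edge router", mom)  # steps 3-4
```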




A very productive day for me. I hope some of this is useful information to you.


For additional coverage my blogger buddy Pete Spielvogel is also here and beat me to the first post. You can read his posts at the ITOps Blog.


There are a variety of Twitter accounts you can follow as well as the hashtag #HPSU09


HPITOps – Covers BSM, Operations and Network Management


HPSU09 – show logistics and other information


HPSoftwareCTO


informationCTO


HPSoftware


BTOCMO – HP BTO Chief Marketing Officer 


 


For HP BSM, Michael Procopio

Fuel Efficient IT Operations

Mike Shaw, BSM Product Marketing.


My wife just bought a BMW 118D. The 118D won the "Green Car of the Year" award in 2008 at the New York Auto Show.  It does an amazing number of miles to the gallon (kilometres to the litre / miles to the US gallon). Her old car (also a BMW) did about 26 miles per gallon. The 118D does 63 miles per gallon. Now, the new car is slightly smaller, so we're not comparing apples to apples. However, you get the point -- car manufacturers are pushing fuel economy to new limits. At the cost of acceleration? Not that I've noticed - when you put your foot to the floor in the 118D, it most certainly accelerates.


I think there are parallels between fuel economy and IT operations.  During a down-turn, because there is less activity, there is less pressure on IT operations (fewer events, fewer system overloads, etc). This is like a car that is only required to go at 30 miles per hour and accelerate slowly because that's what everyone else on the road is doing.  In an attempt to cut the costs of motoring, one might be tempted to adjust the fuel injector so that a smaller amount of fuel is available. This will cut fuel costs during this recessionary period.


 


BUT, when we come out of recession (some time in 2010??), acceleration will be required. Actually, our competitors will be accelerating - it's up to us whether or not we match them. If we've chosen to create a fuel efficient car (like the BMW 118D), then we can match the required acceleration and have fuel efficiency. If we've decided to simply cut the fuel that goes into the car without any consideration for fuel efficiency, our competitors will accelerate away from us come the upturn.


 


During a down-turn, we are under pressure to cut IT operations costs. In fact, in a recent IDC study performed for HP Europe, 40% of customers surveyed said they were very likely to cut IT operating costs while 74% said it was likely they would cut IT ops costs.


 


We have two choices in how we behave in response to this pressure to cut costs. We can take the simple "let's cut people and that's it" path, or we can take the "fuel efficiency" path and create an IT operations engine to match the BMW 118D. If we just cut people, we'll drown in IT operations work when the upturn comes. If we create a fuel efficient IT ops engine, we'll be able to embrace the acceleration when the upturn comes.


 


This sentiment is echoed by recent comments made by HP's CEO, Mark Hurd (I'm sure Mark will be greatly comforted to know that he and I are in sync on this one). Mark said he didn't want to simply cut heads because when the upturn comes, he won't have the "people muscle" required to handle it. HP's IT department is taking the BMW 118D approach - data centre consolidation, network operations efficiency, centralized event management, pro-active user experience management, constrained self-serve of IT product, etc.


 


So, how do we create a fuel efficient IT operations? I'm not an expert across the whole IT operations stack, so I'll talk to the area I know about - availability and performance management.  And in the interests of keeping these blog posts to a manageable size, I'll do that in the next post.


 


(Footnote: I'm sure all car manufacturers are producing more fuel efficient cars. My wife just happens to like BMWs, and she only looked at BMWs!  I'll bet the average HP sales rep wishes their customers were so loyal (naive??))

BSM Evolution: The CIO/Ops Perception Gap

 


There are many potential culprits for why IT organizations struggle to make substantive progress in evolving their ITSM/BSM effectiveness. A customer research project we did a few years ago offered an interesting insight into one particular issue that I rarely see the industry address. The research showed that most CIOs simply had a different perception – when compared to their IT operations managers – of their IT organization’s fundamental service delivery maturity and capability. This seemingly benign situation often proved to be a powerful success inhibitor.


 


The Gap:


A substantial sample of international, Global 2000 enterprise IT executives participated in the study. When asked to rank investment priorities across a broad range of IT capabilities, we saw a definite gap. IT operations managers consistently ranked “Investing to improve general IT service support and production IT operations” as a top 1 or 2 priority, whereas CIOs ranked this same capability much lower, as priority 6 or 7.


 


The Perception:


When pressed further, CIOs believed that the IT service management basics of process and technology were already successfully completed, and they had mentally moved on to other priorities such as rolling out new applications, IT financial management, or project and portfolio management.


 


Most of the CIOs in the study could clearly recall spending thousands of dollars sending IT personnel to ITIL education, and thousands more purchasing helpdesk, network, and system management software. Apparently, these CIOs thought of their investment in service operations as a one-time project, rather than an ongoing journey that requires multiple years of investment, evolution, reevaluation, and continuous improvement.


 


IT operations managers, on the other hand, clearly had a different view of the world. They were generally pleased with the initial progress from the service operations investments, but realized they were far from the desired end state. The Ops managers could plainly see the need to get proactive, to execute advanced IT processes and deploy more sophisticated management tools, but could not drain the proverbial swamp while fighting off the alligators.


 


The Trap:


We probed deeper in the research, diligently questioning the IT operations managers on why they didn’t dispel the CIO’s inaccurate perception. In order to secure the substantial budget, these Ops managers had fallen into the trap of over-promising the initial service management project’s end-state, ROI and time to value. (I wouldn’t be surprised if they had been helped along by the process consultants and software management vendors!)


 


These Ops managers saw it as “a personal failure” to re-approach the CIO and ask for additional budget to continue improving the IT fundamentals. Worse yet, they had to continually reinforce the benefits from the original investment so the CIO didn’t think they had wasted the money. So, the IT operations staff enjoyed the result: reactively working nights and weekends to meet the business’s expectations and make sure everyone kept their jobs. Meanwhile, the CIOs slept well at night thinking, “Hey, we are doing a pretty darn good job”, but faced the next day asking, “Why are my people burnt out?” A vicious cycle.


 


Recommendation through Observation:


I’m not wild about making recommendations since I merely research this stuff… I don’t actually perform hands-on implementations. Instead, I will offer some observations of best practices from companies that appear to be breaking through on BSM, lowering costs, raising efficiency and improving IT quality of service.


 



  1. Focus on Fundamentals: It is boring and basic, but absolutely critical to continually look for ways to improve the foundational service management elements of event, incident, problem, change, and configuration management. Successful IT organizations naturally assume that if they implemented these core processes more than three years ago, they likely need to update both technology and process. If FIFA World Cup football clubs and Major League Baseball teams revisit their fundamental skills each and every year, why wouldn’t IT?

 



  2. Assume a Journey: IT leaders who develop a step-wise, modular path of realistic projects that deliver a defined ROI at each step have the best track record of securing ongoing funding from the business. The danger here is defining modular steps that are so disconnected and siloed that IT never progresses toward an integrated BSM/ITSM process and technology architecture. This balance continues to be one of the most difficult to manage.

 



  3. Empowered VP of IT Operations: The advantages of a CIO empowering a VP of IT operations and holding them accountable for end-to-end business service have been discussed in previous posts. The practice of having a strong VP of operations with executive focus on service operations and continual service improvement, along with end-to-end service performance responsibility, does appear to be a growing trend and success factor.

 



  4. Focus on the Applications: In the same research study that showed the perception gap on “Investing to improve general IT service support and production IT operations”, there was consistent agreement on “Investing to improve business critical application performance and availability”. The CIOs, Ops managers and Business Relationship Managers all ranked this capability as a top 1 or 2 priority.

 


Successful BSM implementations focus on the fundamentals of process and infrastructure management, but do so from a business service, or an application perspective. This approach not only enables an advantageous budget discussion with the business, but it also hones the scope and execution of projects.


 


It is difficult to assess the relative impact of this CIO/IT Ops perception gap, considering the wide variety of challenges that IT faces. But hopefully, this post gives you something to consider when assessing your own IT organization’s situation and evolution.


 


Let us know where your organization fits – please take our two-question survey (plus two demographic questions). We’ll publish the results on the blog.


 



  • Describe the perception of your IT's fundamental service delivery process

  • How often does your IT organization significantly evaluate and invest to update your fundamental IT processes?

 


Click Here to take survey


 


Bryan Dean – BSM Research

Monitoring your cloud computing is as easy as calling an airport shuttle

HP made an announcement about new cloud computing management capabilities today: HP Unveils "Cloud Assure" to Drive Business Adoption.


HP currently offers Software-as-a-Service (SaaS) for individual management applications such as HP Business Availability Center (BAC) and HP Service Manager primarily for intranet and extranet applications.




HP Cloud Assure helps customers validate:



  • Security – by scanning networks, operating systems, middleware layers and web applications. It also performs automated penetration testing to identify potential vulnerabilities. This provides customers with an accurate security-risk picture of cloud services to ensure that provider and consumer data are safe from unauthorized access.

  • Performance – by making sure cloud services meet end-user bandwidth and connectivity requirements and provide insight into end-user experiences. This helps validate that service-level agreements are being met and can improve service quality, end-user satisfaction and loyalty with the cloud service.

  • Availability – by monitoring cloud-based applications to isolate potential problems and identify root causes with end-user environments and business processes and to analyze performance issues. This allows for increased visibility, service uptime and performance.




HP Cloud Assure provides control over the three types of cloud service environments:



  • For Infrastructure as a Service, it helps ensure sufficient bandwidth availability and validates appropriate levels of network, operating system and middleware security to prevent intrusion and denial-of-service attacks.

  • For Platform as a Service, it helps ensure customers who build applications using a cloud platform are able to test and verify that they have securely and effectively built applications that can scale and meet the business needs.

  • For Software as a Service, it monitors end-user service levels on the cloud applications, load tests from a business process perspective and tests for security penetration.




A diagram showing the differences in the services is at Cloud Computing Basics.




In the end it doesn't matter where the service is; you need to be sure it is available and performing to expectations. Cloud Assure provides the capability in a way that is very agile. You say "I need this service monitored" and it is monitored. It's just like calling for an airport shuttle -- you call, they show up.
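As an illustration of what "I need this service monitored" can boil down to, here is a minimal synthetic availability/performance probe in Python. The endpoint URL and SLA threshold are placeholders; this is only a sketch of the concept, not how Cloud Assure itself works.

```python
# Minimal sketch of a synthetic availability/performance probe.
# The URL and thresholds are placeholders, not part of Cloud Assure.
import time
import urllib.request

def probe(url, timeout_s=10, sla_ms=2000):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            ok = 200 <= resp.status < 400
    except OSError:          # covers DNS failures, timeouts, HTTP errors
        ok = False
    elapsed_ms = (time.monotonic() - start) * 1000
    if not ok:
        return "DOWN", elapsed_ms
    # Flag a breach if the service responds but misses the SLA target.
    return ("SLOW" if elapsed_ms > sla_ms else "OK"), elapsed_ms

status, ms = probe("https://example.com/")   # placeholder endpoint
print(f"{status} ({ms:.0f} ms)")
```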





For Business Availability Center, Michael Procopio, product manager, HP Problem Isolation.

BSM Evolution Paths: Financial Services Example

 


When two Fortune 500 companies merge, the IT convergence can feel like two high-speed trains on parallel tracks speeding toward a single-track tunnel. Not only is IT tasked with maintaining or increasing quality of service, but the CEOs are quite impatient to quickly rationalize the IT operating expense equation of “1+1=1.25”. Maybe 1.50 if you have an extremely benevolent Board of Directors.


 


Unlike the Automotive Industry example posted earlier (BSM Evolution Paths: Auto Industry Sample), this Financial Services example has much less top-down roadmap direction, and much more independent parallel paths. Let’s take a look at three of the key personas and evolutions within these parallel paths.


 


Data Center Operations Manager; Infrastructure Operations path:


 


The new Data Center Operations Manager (DCOM; reporting to VP of IT Ops) commissioned a tools architecture analysis. They inventoried their management tools and counted over 80 major “platforms” in the fault, performance and availability category alone!


 


The DCOM empowered a Global Software Management Architect to drive a “limited vendor” strategy to simplify and standardize the tool environment. Although there were many individual domain experts bent out of shape, this standardized environment limited the vendor touches, enabled renegotiated license/support contracts, concentrated tool expertise and resulted in improved quality of service.


 


The fault, performance and availability architecture was boiled down to three major vendors covering three broad categories (plus device specific element plug-ins):



  • System Infrastructure (Server, OS, Database, storage middleware, LAN feeds, etc.)

  • Network Services (WAN, LAN, advanced protocols, route analytics, etc.)

  • Enterprise Event (consolidated event console, correlation, filtering, root cause)

 


The DCOM could have pushed harder for a single vendor covering all three categories, but it was a matter of time-to-deploy pragmatism. A vendor could only be selected as a category solution if the product had been successfully deployed previously, and internal deployment expertise existed to lead the global implementation. This “survival of the fittest” approach did not necessarily drive the most elegant architecture, but it did speed deployment and limit risk.


 


Independent roadmaps and key integration capabilities were developed for each category to meet 6, 12, 18 and 24 month milestones.


 


CTO; Business Service Oversight path


 


Early on in the merger process, there was a power struggle to own the business service visibility and accountability solution. The VP of IT Operations wanted the tools, process and organizational power, but the Lines of Business insisted on a more independent group that would sit between IT Operations and the business-aligned Application Owners.


 


The Online Banking Group from one of the pre-merger divisions had successfully implemented a business service dashboard and Service Level Agreement reporting solution (based primarily on end-user experience monitoring). Using an “adopt and go” strategy, the CIO empowered the CTO to develop an end-to-end group and expand the solution to all six major business units.


 


This business unit expansion rolled out over 12-18 months and was successful, but limited to monitoring and reporting. Over the next 12 months, Application Owners, Line of Business CIO’s and VP of IT Operations all wanted to extend the business service monitoring to:



  1. Problem isolation, application diagnostics, and incident resolution

  2. In-depth transaction management of composite applications

 


Director Service Management; Enterprise CMDB path


 


The Director of Service Management, reporting to VP IT Ops, drove two major initiatives over the first 12 months of the merger.



  1. Consolidate to a single, global, follow-the-sun service desk

  2. Rationalize and standardize the request and incident management process

 


I could easily spend an entire blog post discussing the IT process convergence and standardization, but I refuse! Instead, I’ll focus on what happened in the 12 months following the service desk consolidation.


 


The Director of Service Management launched a CMDB RFP which was originally grounded in incident, problem and configuration management. The RFP touched an enterprise-wide nerve, not to mention setting off a flurry of vendor responses. The project quickly expanded, and changed focus to the “hotter” driver of change and (release) risk management, and how to drive all IT processes from an enterprise service model.


 


Once the application owners got involved (from a change/release control perspective), the infrastructure operations teams got involved (from a change and performance/availability perspective), and the CTO got involved (from a business service reporting and accountability perspective), all of a sudden incident management took a back seat in the decision process.


 


In the end, the service discovery, dependency mapping and change/release management solution selected came from an altogether different vendor than the incumbent service desk solution.


 


An interesting journey… so far


 


The three paths described above are clearly a small subset of the overall work done for this corporate merger, but hopefully they give a glimpse into the BSM evolution dynamics. By all accounts, this company has been successful in its journey; you may be interested to know that this financial services company is not participating in the government bail-out program.


 


The lack of a top-down “enterprise IT transformation” roadmap did not hinder their progress… in fact some will argue it enabled their progress! You can observe, however, that at the end of each path there is a drive towards further integration and cross-IT dependence. It will be interesting to watch this company, and see how their approach evolves as they continue down the intersecting evolution paths.


 


Bryan Dean, BSM Research

Prediction: BSM Evolution

I usually do not like making public predictions because I hate being wrong. But I was discussing my last blog post (Business Service Visibility & Accountability: Where is it Homed?) with a colleague and he pressed me for my prediction of where I thought this function will eventually live in the organization. Maybe more of you have the same question, so in this post I will lay out what I believe is the compelling evidence… and I might even make a prediction.


Let’s take a quick look at some of the key evidence or clues:


 


CIO role continues to shift


This has been researched to death, but is still true. CIOs are spending more time on business innovation and less time on production IT operations. The range of issues that CIOs drive and influence is staggering. Does this mean they don’t care about production operations and business service accountability? No, they care greatly; it is just that most CIOs have learned that having a top-notch, empowered VP of IT Operations is the only way to be a proactive CIO.


 


Application owners want to focus on development


My previous post looked historically at application owners and line of business CIOs buying their own business service visibility and accountability tools because of the pressure they felt from the business. This did happen and continues to happen in many organizations, but research shows that after a couple of years of owning, architecting and maintaining these tools, application owners realize that production management tools take valuable time away from their primary goals.


 


Their primary goal is to get new functionality out the door that meets business requirements for function, quality, performance and security. It is still vitally important to the application owners to maintain visibility and accountability once their applications are in production. They will continue to be a catalyst in purchasing performance tools, and providing the intellectual property for rules, thresholds and reporting metrics. But ownership of the tools, configuration, vendor management and ongoing maintenance is clearly shifting to the production operations teams.


 


Line of Business CIOs don’t own enough


Line of business CIOs love to have business visibility and accountability tools in their hot hands, but they also recognize the issues of owning tools without owning the IT infrastructure. Security access and rights are a constant issue for them. Management process and tool architecture is also becoming a more standardized, centralized function that line of business IT participates strongly in, but really is not in a position to own.


 


Successful customers adding problem resolution


Something I have observed in customers who have successfully implemented a business service visibility and accountability solution is that the next step in their evolution is to tie issue visibility to issue diagnostics and resolution. They find it wonderful that they now have a business-relevant way to measure IT service performance, but their constituents quickly move to, “OK, now fix it when it breaks”.


 


Nobody in IT will be shocked to hear that the business takes a “what have you done for me lately” stance. So, the tool owners now find themselves sorting through how to integrate into the established event, incident and problem management processes. Depending on where they sit organizationally, this can be a painful yet necessary adjustment when trying to improve efficiency and time to diagnose/repair.


 


VP of IT Operations taking on end-to-end responsibilities


Seven years ago we conducted an extensive ITSM customer research project. At that time, there were a large number of CIOs and industry pundits who had taken on the mantra, “Run IT like a business”. IT consolidation, adoption of ITIL process standards, organization alignment and tool deployment were all solid benefits from this era (and continue as we speak), but did not solve the issue of managing end-to-end business services.


 


At the time, too many IT Operations managers became “infrastructure service providers”, and when polled did not feel responsibility for the application, the end user experience, or the final business service. The CIO and many of the application owners ended up shouldering the business service responsibility. Today, this is radically changing and ITIL V3 clearly reflects this evolution.


 


Application development teams continue to be organizationally aligned to the line of business more often than not, but research is showing a dramatic shift in mindset as to who is responsible for end-to-end application performance. The majority of IT Operations organizations today own Level 1 application monitoring and often own Level 2 application support. Level 3 application support typically remains aligned with the development teams, but there is no doubt that the VP of IT Operations is taking on end-to-end responsibility.


 


Tools vendors are getting their act together


Alas, I must at least touch on technology… but only briefly! The major tools vendors have done a commendable job putting together portfolios of solutions that span the BSM/ITSM lifecycle. Plenty of improvement can still be made on integration, interoperability and ease of use; but I think it is fair to say IT finally has access to a management technology architecture that can be leveraged and multi-purposed to serve a wide range of persona needs and management disciplines.


 


The Prediction


You have probably guessed my prediction by now based on my biased presentation of the six clues above. I believe the ownership of the business service visibility and accountability solution will be homed under the VP of IT Operations, and purpose-specific instances will be customized for the application owners, business relationship managers, line of business CIOs and executive IT management.


 


The VP of IT Operations – empowered by the CIO - will continue to drive compliance to a single, standardized IT process and software management architecture (not “single vendor”, but “limited vendors”). This will irritate many ‘best-of-breed’ fans, but in the end it will pay off.


 


The business service visibility and accountability function will be a module of a more comprehensive fault, performance and availability solution set that effectively ties together discovery, visibility, accountability, issue detection, isolation, diagnosis, business impact analysis and direct connection to the enterprise service model.


 


Implementing this cross-IT management solution will not be easy. Organizationally, the VP of IT Operations will have to empower an independent executive-level manager to drive it, similar to an ERP application owner. This will fail if owned by an “ivory tower” type; it must be practically driven by balancing the lobbying of existing IT domain and function specialists against the need to consolidate, standardize and implement a modular, multi-purposed solution.


 


Ok, maybe I got a little carried away at the end there, but one cannot ignore the demonstrable evidence of business, organization and persona driver dynamics. I would be surprised if we do not see a pragmatic, yet steady evolution toward this ultimate model.

Business Service Visibility & Accountability: Where is it Homed?

Virtually every customer that I have studied has a critical moment in their BSM evolution where they realize the need for viewing, measuring and reporting business service performance in a business-relevant way.  We could discuss the technical complexities of integrating service model discovery, end-user experience, transaction management, performance and event data to develop this business service view, but in this post I’m going to examine the most common key personas, core motivations, and organizational impact.

In the previous post, BSM Evolution Paths: Auto Industry Sample, we saw how the core motivation came top-down from senior IT management.  Let’s compare three different models.


Line of Business / Application Driven


Key Personas:  Application owner, Business Relationship Manager, Business Unit CIO


 


Core Motivations:  These personas are typically closest to how business utilizes IT to execute a business process or function.  They usually report into the business unit itself, rather than into IT Operations.  They have responsibility for the application, but the business perceives them as owning the end-to-end service performance, even though they often have little control of the underlying IT infrastructure and service delivery processes.


 


At some point, a business critical service melts down, or endures a never-ending spree of performance degradations where Global IT Operations says, “All the systems and network are green”.   This is the point where many business unit managers take matters into their own hands and fund a significant investment in end-to-end business service visibility tools.


 


Software:  Since they do not control the infrastructure, the application owners often look for tools that require minimal agentry and do not require a lot of feeds from the individual domain management tools.  They gravitate towards sophisticated end-user experience tools, probes, application diagnostics, and the ability to traverse composite application middleware. 


 

Organization:  They use these tools to prove accountability to the business units, but they also use the tools -not always politely- to hold infrastructure operations accountable. The animosity usually wanes, and the separate IT groups work out the process integration… but often not the tool integration. This leaves the end-to-end group outside of IT Operations. We also see this model where the infrastructure operations are outsourced, and the service provider is held accountable to specific Service Level Agreements. 


Infrastructure Operations Hero


Key Personas:  Infrastructure Operations Manager, Data Center Manager, NOC manager


 


Core Motivations:  These personas traditionally have the responsibility for care and feeding of the vast shared-service IT infrastructure environment.  They have likely done a reasonable job of consolidated event management, and domain-level configuration, performance and capacity management.  But, they have a vision of elevating IT to demonstrate the value delivered to the business, and proactively solve issues before end users report them.  This effort can be either in conjunction or parallel to an ITIL-driven service management initiative.


 


Software:  Often very budget constrained, they don’t always have the funding that the application owners do.  They look first toward leveraging their existing tool set, gathering agent-based data from their infrastructure and augmenting it with lighter-weight end-user experience tools.  Converting this data to business-relevant information is difficult, as they often don’t have the deep business process or application knowledge, but it is much better than the previous IT element-level statistics.


 

Organizational:  The Hero Operations manager then faces the daunting task of taking the new service oriented visibility and reporting capability to upper management and business unit managers.  Sometimes they yawn. Sometimes the strategy is embraced, and the operations manager is elevated to strategic status. The Operations manager keeps both tools and processes very integrated.  New end-to-end skill sets are developed, but usually not new organizational groups.  

Top-down Service Management
Key Personas:  CIO, CTO, VP IT Operations


Core Motivations: These personas have the luxury of controlling the organization, budget and overall priority of IT, yet their job is likely on the line.  Pressure from the business units, a personal drive to elevate IT to a strategic partner and sometimes fear of being outsourced are the powerful drivers.


 


Business service visibility and accountability is usually part of a larger, multi-project, multi-step roadmap that includes a hefty process component.  Since these initiatives tend to be “horizontal” in nature across all IT, many companies fall into the trap of trying to institute end-to-end business service performance tools too broadly.  The successful organizations focus on a discrete business service and satisfy key metrics that are specific to the particular business and application. 


 


Software:   These personas tend to focus on service level management, and the ability to demonstrate the value IT is delivering to the business.  This typically requires a substantial investment in tools that can abstract the business services into something meaningful to the business, looks hot to business stakeholders, yet also improves service delivery time to diagnose and repair.  This ends up requiring a rationalization of the service discovery model, the CMDB, and the enterprise operational tools.


 


Organization:  I’ve seen some CIOs form executive business relationship management functions, keeping the team independent from both business and IT Operations.  Other CIOs formally extend the VP of IT Operations charter to include this new end-to-end function that bridges the infrastructure operations teams and the helpdesk/service desk teams.

Conclusion?
Here’s a news flash… there is a wide variety of organizational models.  But there are some definite patterns, and in my next post, I will offer some evidence that the model will be more predictable in the future.

Bryan Dean, BSM Research

 

Webinar announcement: "Decrease IT Operational Costs by Accelerating Problem Resolution"

In a recent post, I talked about moving “from user experience monitoring to user experience management”.


Related to this, a recent Forrester study found that 74% of problems with business services are reported by the end users through the service desk, not by infrastructure management tools. The same survey found that an average of six service desk calls are needed to identify the problem owner for a top-down performance problem.


 


So, if we want to increase IT Ops efficiency and stop using our customers as the most expensive monitoring devices there are, we need to proactively monitor customer experience and spot problems before our customers do, and we need the tools to quickly and accurately pinpoint the cause of business service performance problems.


Senior Enterprise Management Associates (EMA) Analyst Liam McGlynn and my colleague Sanjay Anne are conducting a webinar on this topic on March 19. The webinar is entitled “Decrease IT Operational Costs by Accelerating Problem Resolution”.


 


For more details and to register, please go to: http://www.enterprisemanagement.com/research/asset.php?id=1127.


 


 

BSM Evolution Paths: Auto Industry Sample

In the last post, Bryan Dean, our research expert on the BSM team, outlined the different ways in which customers evolve towards Business Service Management. In the next few posts, Bryan will give an example of each of the different types of evolution. Over to you, Bryan ....
_______
About three years ago, the business division managers of a multinational automobile manufacturing company planned a bold transformation of their distribution network to leapfrog the competition.  They enthusiastically laid out a roadmap for business process innovation and aggressive customer/dealer satisfaction initiatives.  

Only one real problem: the CIO knew that building, rolling out, and operating the underlying IT for this future business vision exceeded their current capabilities.  The CIO eventually had to raise the red flag and explain to the executive committee why IT was the bottleneck.  Ouch, not a good day.

 

In the previous post, BSM Evolution Paths: Samples and Observations, we talked about five common evolution paths and the organizational and persona dynamics of an automated BSM/ITSM journey.  In this post we will overview a specific example.


 


To be fair, the CIO spent years driving significant investment in process, tools and the organization.  Let’s look at a subset of key personas and BSM/ITSM foundation: 

 

Director of Infrastructure (reporting to the VP Global IT Ops):



  • Enterprise-class central event/performance platform and console

  • WAN/LAN network management platform

  • Basic, component level performance and availability reporting

  • Dozens of vendor-specific configuration and admin tools

Director of Service Management (reporting to the VP Global IT Ops):



  • Global, consolidated helpdesk/service desk

  • Well defined and automated incident process; basic level problem, configuration, and a manual change process

Director of Applications (development, test & level 3 support.  Reports to business divisions):



  • Suite of pre-production stress-test quality and performance tools

  • End-user  and application performance/diagnostic tools (test environment)

The Key Evolution Steps


Step 1:  CIO empowers and holds the VP of Global IT Operations (VPITops) accountable for end-to-end business service responsibility.  Imagine the panic on his face!  VPITops launches key lieutenants on quick gap analysis.

 

Step 2:  The VPITops needed a quick win.  He believed that visually demonstrating and reporting performance and availability from a business service perspective -versus an infrastructure perspective- would be a catalyst for driving “aligned” IT behavior.  The current network and infrastructure products didn’t have this capability, so VPITops leveraged the tools already proven by the application test and level 3 support team.


VPITops established a new team within Operations (parallel to infrastructure event management) to own and run the end-to-end business service visibility/accountability solution.  Integration was established between the two teams and tools.

 

Step 3a:  VPITops took his new business service visibility/accountability tool (in dashboard/report form) to key business division managers, and established a business relationship management function.  This converted the conversation from anecdotal complaints, to measurable service levels.  The CIO had tangible proof of progress.

 

Step 3b:  While engineering step 2, the Tools and Process Architect realized they needed a better means of discovering and maintaining the IT/Business service models.  Their infrastructure environment was shared, complex and dynamic enough that static service models were not effective, so they brought in an application dependency mapping technology.  This success spawned a serendipitous benefit to another team in step 4a.

 

Step 4a:  The application quality/test and release team realized the service model could be utilized in the service transition process.  They previously had several very painful episodes of moving complex applications from test into production.  With an accurate, up to date service model of the production environment they could better identify dependency issues before roll-out.  Speed and accuracy...  Happy CIO.

 

Step 4b:  The Director of Service Management and the architect evaluated how to federate the data between the application dependency mapping service model and the CI configuration data in the helpdesk.   The software vendor provided a federation / reconciliation adaptor, so the helpdesk was able to leverage the CI relationships and operate off a “single version of the truth” (sounds eerily like an ITIL V3 CMS!). 
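As an illustration of what such a federation/reconciliation adaptor has to do, here is a minimal Python sketch. The matching key (normalized hostname plus domain) and the merge policy are assumptions for illustration, not the vendor's actual adaptor logic.

```python
# Minimal sketch of CI federation/reconciliation between a dependency-mapping
# service model and a helpdesk CI store. Matching rule and merge policy are
# illustrative assumptions.

def reconcile(dependency_cis, helpdesk_cis):
    """Merge two CI lists into a single 'version of the truth'."""
    def key(ci):
        # Normalize the natural key so "APP01.CORP" and "app01.corp" match.
        return (ci["hostname"].lower(), ci["domain"].lower())

    merged = {key(ci): dict(ci) for ci in helpdesk_cis}
    for ci in dependency_cis:
        k = key(ci)
        if k in merged:
            # Same CI seen by both sources: dependency mapping wins for
            # relationships (it is fresher); helpdesk keeps ownership data.
            merged[k]["depends_on"] = ci.get("depends_on", [])
        else:
            merged[k] = dict(ci)   # CI only the dependency mapper knows about
    return list(merged.values())

helpdesk = [{"hostname": "app01", "domain": "corp", "owner": "payments team"}]
discovered = [
    {"hostname": "APP01", "domain": "CORP", "depends_on": ["db01.corp"]},
    {"hostname": "db01", "domain": "corp", "depends_on": []},
]
for ci in reconcile(discovered, helpdesk):
    print(ci)
```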

 

Near Term Roadmap


  • Automate change/configuration workflow and provisioning


  • Upgrade/replace the enterprise event and performance console to leverage the service model for root cause analysis and business impact assessment


  • Apply business service relationship management to additional business divisions


  • End-to-end visibility of composite MQ application business transactions

 

The Verdict of the Journey so far


The CIO still has a job, and has a funded roadmap.  One might ask why they didn’t start with step 4b and establish the CMDB and service model first.  Well, the CIO was on the hot seat, and they were concerned about getting bogged down in an enterprise-wide CMDB architecture project.

 

This exemplifies the unpredictable and unique nature of evolution paths.  More can be said about the delicate balance between top-down guidance and fostering organic innovation from within the ranks of IT.  In future posts, I will discuss and analyze this further, as well as introduce other examples.

BSM customer evolution paths: Samples and observations

When developing and marketing products, we often have questions  which can only be answered by going out there and seeing what people are doing. We have a guy on the BSM team who does this for us. His name is Bryan Dean. I've worked with Bryan for many years and I've always been impressed by his objectivity and the insight he brings to his analysis (i.e. he doesn't just present a set of figures - he gets behind the figures).


 


At the end of last year, we asked Bryan to analyze the top 20-odd BSM deals of 2008. He formed a number of conclusions from this research. One set of conclusions concerned how people "get to BSM" - how they evolve towards an integrated BSM solution. I asked Bryan to help me with a series of posts to share what he learnt about evolutions towards BSM because I think that knowing what our other BSM customers are doing may help you.


 


________


 


Mike: Bryan, can you give a summary of what you learnt?


Bryan: There is no one evolution path. It's fascinating to me that a hundred different IT organizations can have virtually the same high-level goals, fundamentally agree on the key factors for success, and yet end up with a hundred unique execution paths.


 


Before I answer your question, can I create a definition? The term "BSM" is very poorly defined within the IT industry - different vendors have different versions, and so do the industry analysts (in fact, some other research I did last year concluded that very few people had a clear idea of what BSM means).  So, I'd like to introduce the term "Automated Business/IT Service Management"  or AB/ITSM.


 


Back to your question, I think I can group all the different evolution paths into five key types:  




  1. ITSM incident, problem change & configuration:  this evolution is driven out of the need for process-driven IT service management with the service desk as a key component


  2. Consolidated infrastructure event, performance and availability: this is driven by a recognition that having a whole ton of event management and performance monitoring systems is not an efficient way to run IT, and so there is a drive to consolidate them into one console.


  3. Business service visibility & accountability:  this is more of a top-down approach - start with monitoring the customer's quality of experience and then figure out what needs to happen underneath. This is popular in industries where the "web customer experience" is everything - if it's not good, you lose your business


  4. Service discovery & model: this is where evolution towards integration is driven from the need for a central model (the CMDB). Often, the main driver for such a central model is the need to control change


  5. Business transaction management: today, this is the rarest starting point. It's driven by a need to monitor and diagnose complex composite transactions. We see this need most strongly in the financial services sector

Mike: How about the politics of such AB/ITSM projects?  (I don't see the AB/ITSM term taking hold, by the way :-) )


Bryan: Politics (or, more specifically, the motivational side) is important. I think many heavy thinkers in our industry have the mistaken assumption that there is a single evolution path, controlled from the top down by the CIO following a master plan. Trying to manage such a serialized mega-project is a huge challenge and too slow, not to mention that 99% of CIOs are not in the habit of forcing tactical execution edicts on their lieutenants (I know I’ll get some argument on that one :-) ).


 


What I see from my research is that the most successful IT organizations are those who have figured out how to balance between discrete, doable projects and an overall AB/ITSM end-goal context and roadmap.  Typically, the CIO lays down a high-level vision that ties to specific business results, and then allows key lieutenants to assess and drive a prioritized set of federated, manageable projects that independently drive incremental ROI. Some IT organizations may have a well-defined integrated roadmap, but the majority run their federated projects in a fairly disjointed fashion.


 


These parallel paths are owned by many independent personas within IT, each trying to solve the specific set of issues at hand. For them, being bogged down in how their federated project aligns and integrates with all the other AB/ITSM projects is daunting… if not fatal.


 


And on reflection this makes sense to me - the human side of things plays a large role in such endeavors.


 


Mike: What do you mean?


Bryan: IT organizations of all shapes and sizes have goals to reduce costs, increase efficiency, improve business/IT service quality, and mitigate risk all while applying technology in an agile way to boost business performance.   What I find interesting is how specific, funded initiatives are created by specific personas to achieve the goals.


 


In future posts, I will share some specific examples of how customers evolved through these paths, the key driver personas, the core motivations and how these paths come together.

There are a number of ways of populating the service dependency map

 


In a post two weeks ago on this blog, I listed all the ways that we use service dependency maps (model-based event correlation, service impact analysis, top-down performance problem isolation, SLAs, etc).  What can be used to discover service dependency information?


 


OperationsCenter Smart Plug-ins (SPIs) now discover to the CMDB


If you're using the agent-based side of OperationsCenter (OpC), then each managed node will have an agent on it. You can put a smart plug-in (SPI) onto that agent. SPIs have specialized knowledge of the domain they are managing. There are many SPIs for all kinds of things from infrastructure up to applications like SAP. Many of the SPIs discover (and continue to discover) the environment they are monitoring. This is agent-based discovery using all the credentials you've already configured into the OpC agent.




The OMi team are working on putting SPI-based discovery information into the HP CMDB (the Universal CMDB or uCMDB).
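To make the idea concrete, here is a minimal Python sketch of agent-based discovery publishing CIs and relationships into a CMDB. The CI types, record layout and in-memory store are illustrative assumptions, not the actual SPI or uCMDB schema.

```python
# Minimal sketch of agent-based SPI discovery feeding a CMDB. The record
# layout and the in-memory "CMDB" are illustrative assumptions, not the
# actual SPI/uCMDB schema or API.

cmdb = {"cis": {}, "relationships": []}

def discover(hostname):
    """Pretend SPI discovery: report the CIs and dependencies on one node."""
    host = {"id": hostname, "type": "host"}
    db = {"id": f"oracle@{hostname}", "type": "database"}
    app = {"id": f"sap@{hostname}", "type": "sap_instance"}
    return [host, db, app], [(app["id"], db["id"]), (db["id"], host["id"])]

def publish(cis, relationships):
    """Upsert discovery results so repeated runs keep the model current."""
    for ci in cis:
        cmdb["cis"][ci["id"]] = ci                  # create or refresh the CI
    for parent, child in relationships:
        if (parent, child) not in cmdb["relationships"]:
            cmdb["relationships"].append((parent, child))

publish(*discover("erp-prod-01"))
publish(*discover("erp-prod-01"))   # continuous re-discovery is idempotent
print(len(cmdb["cis"]), "CIs,", len(cmdb["relationships"]), "relationships")
```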


 


Agentless monitoring populates the uCMDB


If you have agentless monitoring (HP SiteScope) this will populate the uCMDB too (as of SiteScope version 10).




Whatever SiteScope monitors you have configured will send their configuration information to the uCMDB. So, if you're monitoring a server with a database on it, all the information about the server and its database will be sent to the uCMDB.


 


Network Node Manager populates the uCMDB


As of the latest version of Network Node Manager (NNMi 8.10), discovered network end-points are also put into the uCMDB. "Network end-points" are anything with a network terminator - network devices, servers, and printers. NNMi provides no service dependency information, but it does provide an inventory of what's out there.




This inventory discovery is useful for rogue device investigation - noticing an unknown device and creating a ticket for the group responsible for that type of device so they can look into it. A minimal sketch of that triage loop follows.
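Here is the sketch in Python. The inventory format and ticketing function are hypothetical stand-ins for NNMi and a service desk integration.

```python
# Minimal sketch of rogue-device triage against the discovered inventory.
# Inventory layout, routing table and ticket function are hypothetical.

known_devices = {"10.0.0.1": "core-router", "10.0.0.17": "print-server"}

def open_ticket(group, summary):
    # Stand-in for a real service desk integration.
    print(f"[ticket -> {group}] {summary}")

def triage(discovered_endpoints):
    """Flag discovered endpoints that nobody has on record."""
    for ip, device_type in discovered_endpoints:
        if ip not in known_devices:
            # Route by device type so the right team investigates.
            group = {"printer": "desktop-support",
                     "server": "server-ops"}.get(device_type, "network-ops")
            open_ticket(group, f"Unknown {device_type} discovered at {ip}")

triage([("10.0.0.1", "router"), ("10.0.0.99", "printer")])
```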


 


Standard Discovery


Our Standard Dependency Discovery Mapping product (DDM-Standard) will discover your hosts for you. This also discovers network artifacts (but, see NNM discovery above - if you have NNMi, this is a more detailed network discovery mechanism).


 


Advanced Discovery


Advanced Dependency Discovery Mapping will discover storage, mainframes, virtualized environments, LDAP, MS Active Directory, DNS, FTP, MQSeries buses, app servers, databases, Citrix, MS Exchange, SAP, Siebel, and Oracle Financials.




You can also create patterns for top-level business services and DDM-Advanced will discover those too.


 


Transaction Discovery


Our Business Transaction Management product, TransactionVision,  deploys sensors to capture application events (not operational events) from the application and middleware tiers. These sensors feed the events to the TransactionVision Analyzer which automatically correlates these events into an instance of a transaction. TransactionVision also classifies the transactions by type - bond trade, transfer request, etc. Thus, TransactionVision is discovering transactions for you.
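To illustrate the correlation step, here is a minimal Python sketch that groups raw events into transaction instances and classifies them. The event fields and the classification rule are assumptions for illustration; the actual TransactionVision Analyzer is far richer.

```python
# Minimal sketch of correlating application events into transaction
# instances. Event fields and the classification rule are illustrative.
from collections import defaultdict

def correlate(events):
    """Group app/middleware events by correlation ID into transactions."""
    transactions = defaultdict(list)
    for event in events:
        transactions[event["correlation_id"]].append(event)
    # Order each instance's events by timestamp to recover the flow.
    return {tx_id: sorted(evts, key=lambda e: e["ts"])
            for tx_id, evts in transactions.items()}

def classify(transaction_events):
    """Classify by type, e.g. from the first step's payload (assumed rule)."""
    return transaction_events[0].get("operation", "unknown")

events = [
    {"correlation_id": "tx-1", "ts": 2, "tier": "mq", "operation": "bond_trade"},
    {"correlation_id": "tx-1", "ts": 1, "tier": "app", "operation": "bond_trade"},
    {"correlation_id": "tx-2", "ts": 1, "tier": "app", "operation": "transfer"},
]
for tx_id, evts in correlate(events).items():
    print(tx_id, classify(evts), [e["tier"] for e in evts])
```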




TransactionVision puts this transaction information into the CMDB. In other words, the CMDB doesn't just know about "single node" CI types like servers, it also knows about flow CI types - transactions.




Also, if the CMDB notices that the transaction flows over a J2EE application, it links the transaction to information in the CMDB about this J2EE application - the transaction step and the J2EE app are now linked in the model.


__________


 


By the way, my colleague Jon Haworth has just posted on the value of discovery in the realm of Operations Management at ITOpsBlog (28th January, "Automated Infrastructure Discovery - Extreme Makeover").

Answers to questions on "what's new in Business Availability Center 8.0?"


I recently mentioned a "what's new" webinar conducted on BAC v8.0. You can now access this on-demand webinar at


https://h30406.www3.hp.com/campaigns/2008/events/sw-01-20-09/index.php?rtc=3-2CDASIY


 


Here are some of the questions which came up during the live webinar.


 


Q: When will 8.0 be available?


A: The 8.0 release will be made available the first week of February


 


Q: How will the new modeling changes affect my current custom views?


A: There are no more instance views; there are just views, with perspectives that provide the content in the view. The upgrade for most customers should be straightforward, unless they have changed the model, created new CI types with custom links, or are heavily using pattern views with impact analysis, correlation rules and alerts.


 


Q: How about integration with HP Operations Manager? Can we leverage our current HPOV infrastructure monitoring capabilities and marry data with BAC application monitoring?


A: Yes. With HP Problem Isolation we support OM (Operations Manager) through event correlation to the application problem/anomaly start time.


 


Q: Does v8.0 support the Oracle 11g platform?


A: Yes, it is supported with v8.0.


 


Q: I was told that the DDM portion of the new 8.0 can discover WebLogic 10.x. Is it true?


A: Yes, it is supported with v8.0.

Can I get away without using discovery?

When I was at our European HP Software event before Christmas / The Holidays, I spent a good deal of time talking to people about our new product releases and the future of BSM. One customer looked a little worried and said, "wow - you seem to rely on discovery a lot".


I guess there are two things to say in answer to that observation. The first is "yes - because we rely on the service hierarchy model a lot". And the second is, "but there are a number of different types of discovery - and a number of them you already have".


 


So, in a two-part post, I thought I'd answer that observation more comprehensively. Let's first look at how we use inventory and service hierarchy information in the management of service health (and thanks to Jon Haworth from the OperationsManager team for his significant help on this post):


 



  • It helps with administering the monitoring deployment of the managed environment. It tells us what is out there, what we need to manage, what has disappeared, and so on. This only requires discovery of the infrastructure inventory – "tell me what servers exist" (unless everything is virtualized, in which case it needs a lot more. The OperationsCenter team has posted on the new virtualization SPI recently at ITOpsBlog. This SPI discovers, and more importantly, continues to discover, virtualized environments).

 



  • It helps OMi understand the stream of events being detected in the infrastructure and applications. The hierarchy of the monitored items ("configuration items" or CIs) allows OMi to tell us which events are causal and which are symptoms – what we need to work on and what we can ignore. I talked about how OMi does this in a post last year.

 



  •  It allows all parts of the BSM stack to perform service impact analysis. This is where events are related to infrastructure and applications, and their impact or potential impact on the services above them in the hierarchy is established. We can then use this impact information to prioritize the events. Service impact analysis requires a model of the hierarchy of CIs and services. Maintaining the service hierarchy manually is untenable -- things just change too rapidly for humans to keep up. (A minimal sketch of impact analysis appears after this list.)

 



  • When a disk has a set of read/write errors, is that catastrophic? If it's a single disk, then yes - the infrastructure element is in trouble. If it's part of a RAID array, then no - provided the rest of the array is OK. If we know the type of CI that we are seeing events against, we can make better decisions about its true health.

 


This is also a new feature in OMi: when CIs are discovered, we know their type. OMi ships with a database of health indicators for each CI type. For example, for a single disk, it's a problem if the disk gets bad errors; for a RAID array, provided a high percentage of the other disks are OK, it is not a serious problem; and so on.


 


This feature makes calculating the true health of a CI much easier. You don't need to define a set of propagation rules; OMi uses the discovered CI type information and its lookup table to figure out propagation itself.


 


This all ties into a new feature in OMi called "Health Indicators". Jon Haworth has promised to post on this at his team's OperationsCenter blog.
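As a rough, hypothetical illustration of such a type-keyed lookup (the rules and thresholds here are invented; OMi's real indicator database is far richer), in Python:

# Invented, simplified health-indicator rules keyed by CI type.
# The point: the discovered CI type drives the health calculation.

def disk_health(ci):
    # A standalone disk with read/write errors is in trouble.
    return "critical" if ci["io_errors"] > 0 else "healthy"

def raid_health(ci):
    # A RAID array tolerates failures while most members are OK.
    ok_ratio = ci["disks_ok"] / ci["disks_total"]
    return "healthy" if ok_ratio >= 0.9 else "degraded"

HEALTH_INDICATORS = {
    "single_disk": disk_health,
    "raid_array": raid_health,
}

def evaluate(ci):
    """Pick the health rule based on the CI's discovered type."""
    return HEALTH_INDICATORS[ci["type"]](ci)

print(evaluate({"type": "single_disk", "io_errors": 3}))                    # critical
print(evaluate({"type": "raid_array", "disks_ok": 11, "disks_total": 12}))  # healthy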


 



  • Our top-down performance Problem Isolation software needs to understand the service hierarchy on which the end user application rests. For example, if I have a web user interface, I need to understand what services that user interface depends on. As I discussed in a post last year, problem isolation uses statistical correlation analysis to suggest the likely cause of such top-down performance problems.

 



  • We need the service hierarchy for defining SLAs. I may define a compound SLA that depends on a number of OLAs and a top-level measured SLA. The modeling user interface for this and the subsequent off-line SLA calculation is done based on the service hierarchy.
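Here is the minimal impact-analysis sketch promised above - an invented hierarchy and illustrative Python, not product code. It walks upwards from a failing CI to every service that CI can impact, which is exactly the information we need to prioritize the event:

# Invented service hierarchy: child CI -> list of parents that depend on it.
DEPENDS_ON_ME = {
    "disk-7":      ["db-server-2"],
    "db-server-2": ["billing-db"],
    "billing-db":  ["online-checkin"],  # top-level business service
}

def impacted_services(ci):
    """Walk upwards from a failing CI to every service it can impact."""
    impacted, frontier = set(), [ci]
    while frontier:
        current = frontier.pop()
        for parent in DEPENDS_ON_ME.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                frontier.append(parent)
    return impacted

# An event on disk-7 ultimately threatens the online check-in service,
# so it should be prioritized above events that impact nothing critical.
print(impacted_services("disk-7"))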

 


In the second part of the post, I'll talk about all the things that now populate the host inventory and service dependency map.  Hint: if you have SPIs, you'll like what we have to say :-)


 


Mike Shaw

Announcing Business Availability Center 8.0

In a post last year, I talked about how, to move from user experience monitoring to user experience management, you need to be able to figure out the cause of a measured user experience problem, like slow online check-in times. I talked about a tool we have called Problem Isolation that helps do this figuring out.


Up till now, Problem Isolation has used just performance data measured by our agentless probes (from a product called SiteScope) to correlate a top-line performance metric (like online check-in times) with the health of the services that top-line metric depends on (database, app server, integration bus, etc.). But there is another source of data we haven't included until now -- the events collected by our operations product, HP Operations Manager. If you have HP Operations Manager, you have a massive source of information that can also be used to determine where top-down performance problems lie.


 


This is how Problem Isolation now uses HP Operations Manager data:


 



  1. A business service problem is identified. For example, through synthetic or real-user monitoring we determine that online check-ins are running too slowly

  2. A “time buffer” around the problem start time is determined

  3. The model for the business service in the CMDB is traversed, returning a list of all services supporting the business service

  4. Events that occurred within the above-mentioned time-buffer relating to those supporting services are determined

  5. The services with the best-correlating events (taking into account severity as well) are identified as likely suspects

 


This algorithm applies to any event, whether it's from a third-party enterprise management system (e.g. Tivoli), from HP Operations Manager, or from HP Network Node Manager.
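For illustration only, here is a highly simplified Python sketch of those five steps. The data is invented, and a naive severity-weighted score stands in for the product's statistical correlation:

from datetime import datetime, timedelta

# Step 1: the problem start time detected by user-experience monitoring.
problem_start = datetime(2009, 2, 2, 10, 15)

# Step 2: a time buffer around the problem start.
buffer = timedelta(minutes=15)
window = (problem_start - buffer, problem_start + buffer)

# Step 3: supporting services from traversing the CMDB model (invented).
supporting_services = {"web-tier", "app-server", "billing-db", "integration-bus"}

# Step 4 inputs: events from OM / NNM / third-party systems (invented data).
SEVERITY_WEIGHT = {"warning": 1, "minor": 2, "major": 4, "critical": 8}
events = [
    {"service": "billing-db", "severity": "critical", "time": datetime(2009, 2, 2, 10, 12)},
    {"service": "app-server", "severity": "warning",  "time": datetime(2009, 2, 2, 10, 14)},
    {"service": "web-tier",   "severity": "major",    "time": datetime(2009, 2, 2, 9, 0)},  # outside window
]

# Steps 4-5: keep events in the window against supporting services,
# then rank services by a severity-weighted score.
scores = {}
for e in events:
    if e["service"] in supporting_services and window[0] <= e["time"] <= window[1]:
        scores[e["service"]] = scores.get(e["service"], 0) + SEVERITY_WEIGHT[e["severity"]]

suspects = sorted(scores, key=scores.get, reverse=True)
print(suspects)  # ['billing-db', 'app-server'] -- likely suspects first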


 


-------------


 


In our quest to move from service health monitoring to service health management, we're trying to provide as much information relating to a problem/incident as possible - all in one place, in a way that makes the information easy to visualize.


 


In BAC 8.0, you can see the following regarding a problem service, all from one place:


 



  • The current performance of the service

  • The performance of the service over time

  • SLAs resting on the service and their closeness to jeopardy

  • Business processes using that service and the impact of the problem on those business processes. If you have our Business Transaction Management modules of BAC, you can see exactly which business process instances are affected or at risk. In industries like financial services this matters because the value of transactions can vary hugely, and business operations wants to know which important business instances are affected (e.g. a $10m inter-bank transfer) so that they can initiate manual work-arounds

  • Measured user experiences resting on this service. Imagine an app server is having a problem. You can "look upwards" and see that this app server is used to serve the online check-in business service. You can then see the measured impact of the app server problem on the online check-in user experience. This would be measured using either synthetic or real-user monitoring

  • Real changes that have occurred under the problem service. The real changes are inferred by the discovery technology, which notices deltas between the state of CIs today versus yesterday

  • Planned changes against the problem service as taken from the change/release management system

  • Outstanding incidents against the problem service. You can "look across" to the details of the incidents to see if they provide the app support team with any insight into how to solve the problem

  • Non-compliance state of servers under the problem service. Our server automation technology now updates server compliance state into the CMDB, and this can be viewed in this 360-degree view of the problem service

 


------------


Mike Shaw.

The new Operations Manager i (OMi)

Fifteen years ago, when HP Operations Manager (or OpenView Operations as it used to be called) was released, event management was really “infrastructure event management”. The concepts of middleware, of customer experience, of SOA, and of automated business processes didn’t exist. But now they do, and we need a consolidated management solution that does full “consolidated business service management” rather than simply “consolidated infrastructure management”, so that all events come into one place where highly empowered operators can deal with them quickly and accurately.


 


This is the aim of the Operations Manager i (OMi) product we announced at Vienna Universe on December 9th.


 


OMi and a shared service dependency model


OMi shares the same discovered service model as BAC. The service dependency model holds information on business transactions, customer experience, applications, middleware, infrastructure and now, network information discovered by our network management product, NNMi.


 


Using a common service dependency model means you can look upwards in the model (if the event comes below) and understand what services, user experiences and transactions are affected. The SLAs are in the model - so you can see how they are affected, and how close to jeopardy this problem brings you.


 


The model also tells you changes that have been made under a CI; what changes are planned; and what incidents are outstanding in Service Manager. Also the HP Server Automation product puts the compliance state of servers into the same model, so you can see if anything under a CI is out of compliance.


 


OMi and root event analysis using a discovered model
The perennial problem with any event management system, whether it be infrastructure or network management, is event noise. A problem with one object can cause a whole array of dependent objects to fire off events too.  For example:


 



  • the SAN has a problem and fires off an event ...

  • the Active Directory using that SAN fires off an event...

  • the Exchange Server using that Active Directory has just lost its directory and fires off an event ...

  • and the proxy server that is driving the web UI to Exchange fires off an event too.

 


Four events  - but one “root event” – one “actionable condition”.


 


Typically, event management systems have created “event correlation languages” so you can program up rules to eliminate these noise events, but such rules are simply not robust to change (and we all know how fast IT systems change). Also, the number of events that can be generated is so large that it’s impossible to write all the rules you need.


 


What we do with OMi is use the service dependency map to get to the root event when a series of events is generated. The actual technology used is the causal engine we released as part of NNMi, but it uses our discovered service dependency map rather than NNMi's discovered network topology.
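In spirit - and this is only a sketch with an invented dependency map, not the causal engine itself - root event identification works like this: an event is a symptom if any CI beneath it is also firing, and the root event is the one with no firing CI beneath it.

# Invented dependency map: CI -> the CIs it depends on.
DEPENDS_ON = {
    "exchange-server":  ["active-directory"],
    "proxy-server":     ["exchange-server"],
    "active-directory": ["san"],
    "san":              [],
}

# The four events from the example above.
firing = {"san", "active-directory", "exchange-server", "proxy-server"}

def is_symptom(ci):
    """An event is a symptom if any CI beneath it is also firing."""
    frontier = list(DEPENDS_ON.get(ci, []))
    while frontier:
        below = frontier.pop()
        if below in firing:
            return True
        frontier.extend(DEPENDS_ON.get(below, []))
    return False

root_events = [ci for ci in firing if not is_symptom(ci)]
print(root_events)  # ['san'] -- the one actionable condition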


 


OMi's health views


With OMi you can create health KPIs against the things (CIs) that you are managing. These can be combinations of attributes -- like CPU utilization and free memory on a server -- so that you can see at a glance the health of the CIs under your management, rather than having to achieve the same thing by wading through a ton of events. In other words, OMi is mapping the events onto CIs and building up a health picture for you.
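For example, a health KPI over those two attributes might, in toy form with invented thresholds, look like this:

def server_health_kpi(cpu_utilization_pct, free_memory_mb):
    """Toy health KPI combining two attributes with invented thresholds."""
    cpu_ok = cpu_utilization_pct < 85
    mem_ok = free_memory_mb > 512
    if cpu_ok and mem_ok:
        return "green"
    if cpu_ok or mem_ok:
        return "yellow"
    return "red"

# One glance at the KPI replaces wading through the raw event stream.
print(server_health_kpi(cpu_utilization_pct=92, free_memory_mb=256))  # red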


 


OMi and existing HP Operations Manager installations


OMi actually sits on top of existing HP Operations Manager installations, acting as a manager of managers for them. It can also take events from other event management systems such as SCOM.

One brand new product and two major enhancements to the BSM stack - Vienna HP Software Universe 2008

Today is the first day of our software user conference here in sunny Vienna, Austria. We just announced a brand new product and two major upgrades.


I'll start with the new product ...


 


HP Operations Manager i (part of HP Operations Center) is our next-generation consolidated event and performance management product, following on from HP Operations Manager. Internally, we call it OMi. Three key points about OMi...


 



  • You can take events from anywhere into OMi because it sits directly on top of our CMDB which holds business transaction, user experience, application, middleware, and infrastructure information.

  • OMi does root event analysis using the discovered service dependency map held in the CMDB. This means that only root events are shown in the console and subsequent events caused by the root event are hidden.

  • OMi gives you more than simply an "event stream" view of the world. It can also give you a service health view of the services you are responsible for. The exact make-up of a service's health is up to you - it will obviously include availability and performance, but it can also include the number of open incidents, for example.

 


I'll write more about OMi in a post at the end of this week.


 


HP Business Availability Center 8.0 (BAC 8.0) for application management uses HP Labs' patented statistical analysis to cut through the volume of performance and operations event data, helping customers predict and proactively resolve business service performance problems before they impact end users.


 


I did a post recently on the difference between user experience monitoring and user experience management, noting how important performance problem isolation was. BAC 8.0 does such analysis using both performance information and the rich source of events that HP Operations Manager, our operations management software, can give you.


 


I'll post on BAC 8.0 in more detail next week.



HP Network Node Manager i Advanced: we released a brand new network management product, NNMi, this time last year, incorporating a clever root event analysis engine (now found in OMi) and a new, much faster spiral network discovery engine. The new Advanced edition of NNMi is targeted at large enterprises.


 


NNMi Advanced helps you predict the service impact of network degradation before business services are negatively affected, through its integration with our CMDB.


 


And it natively uses our run-book automation technology, Operations Orchestration, to automatically collect data, fix problems and verify state once a fix has been actioned.


 


More on NNMi Advanced in the NNM blog. 

Why do we need Consolidated Event Management?

A lot of customers we talk to are trying to get to a situation where all events go to one place for initial processing - something often referred to as "Consolidated Event Management". Why this drive - what bad things happen if we don't have Consolidated Event Management?


In broad terms, there are three problem situations:



  • The first is where events do all come to one place (the "operations bridge" or "NOC" or "centralized first level support" - we'll use the term "Operations Bridge" from now on) but the Operations Bridge doesn't have the tools to effectively deal with these events.  Typically there are two ways in which the Operations Bridge is unempowered:




    • There are so many events that they can't figure out what's going on. They can't see how events relate to each other, so they can't clean away the "event noise" that is being generated.

     



    • There is a performance problem with a service (e.g. Online check-in for an airline). An event comes to the Operations Bridge, but they don't have the tools to figure out what is causing the problem.  Diagnosing such complex performance problems is hard. Firstly because of the sheer amounts of data involved and secondly because the interrelationships between IT elements are complex and numerous.  However, it's important we do furnish the Operations Bridge with tools to deal with these types of problems because otherwise we get "allocation ping-pong" as the problem bounces around the different domain expert groups.

     


  • Second is where all infrastructure events come into one place, but events from monitoring user experience or business transactions go elsewhere. Typically when people say "we are doing consolidated operations" they really mean "we are doing consolidated infrastructure operations" - there are two parallel systems, one for infrastructure events and one for events from monitoring user experience and business transactions.



    When we have such a parallel system it's hard to triage user experience and business transaction problems unless you can see all events. You need the events from the infrastructure to understand what's causing user experience / business transaction problems. If you don't, solving such problems takes a long time and results in allocation ping-pong, thus wasting the time of valuable domain experts.


     


  • And the third problem is that we have a number of different event management systems. Typically there is one for each domain we are managing. This situation is probably the worst because:





    • Duplication of effort. Events aren't raised in isolation. One system has a problem, which causes a dependent system to have a problem, which causes a dependent system to have a problem, and so on. If each domain is receiving events in isolation, then we have duplication of effort as each domain works on "their problem". In fact, only one domain has a problem - but we can't see that because all the events don't come into one place. Domain experts are expensive resources, and when they aren't dealing with problems they can be doing things that move the business forward rather than simply "keeping the lights on".



    • No overall view of what's happening. When there is a problem with a key business service, the first question the business asks is, "when will it be fixed?" If all events are going to individual domain consoles, we can't answer that question. While we can't always give a fix time when all events come to one place, we at least have a good chance of doing so.



    • It's very time consuming and inefficient to solve complex performance problems.  Let's imagine we have an "application performance problem". The event comes straight to the application support guys. They look at it and realize it's not their problem. In other words, the application domain experts have had their time wasted. Had the event gone to a central place, the performance problem could have been correlated against the performance of the dependent infrastructure services, against any events coming out of dependent services, against any changes that have occurred, against historical incident data for that service, and against the compliance state of dependent services.  It would have been allocated to the correct domain group more quickly and by less expensive 1st level support staff.

     


So, if all events come to a central Operations Bridge where we have the tools to cut out event noise, to triage complex performance problems, and to understand how all IT services relate to each other, we can solve problems faster and more efficiently, making good use of our 1st level support and our expensive domain experts.

How does BSM improve IT Operations' efficiency?

Welcome to the BSM BLOG.


I was on the phone last night talking to two Gartner consultants about a couple of announcements we'll be making soon. Every time I mentioned a new feature, they said, "how does this improve IT operations' efficiency?" And of course they were right to ask - in these hard times, everyone's boss is asking the same thing.


So .. how can BSM help improve IT operations' efficiency?


BSM is all about managing the health of services in a way such that IT's priorities are aligned with the business's priorities - it might almost be better if we called it "Business Service Health Management".


The best thing we can do is ensure that all the key business services are always healthy - ensure that they never have a problem. This may seem like a statement of the obvious, but I think it's an important goal for us BSM vendors to bear in mind. If we can do anything to help proactively avoid service health problems ever occurring, then that's best. We've started down this path with the proactive anomaly detection technology we introduced in our Problem Isolation product - but we want to take it further in the future.


However - let's imagine we didn't prevent the problem with the health of a service. How can BSM help improve efficiency now? There are a number of obstacles to solving health problems efficiently:




  • Allocation churn: we find out the online check-in business service for which we are responsible is not performing well. Where does the problem lie? Such a complex business service can rest on 20 or 30 IT artifacts. Which one is the cause? Which team shall we give the problem to? Tell you what - "it looks like a network problem" (i.e. "I've not really got any idea, but my intuition tells me it's the network guys"), let's give it to them. We all know that such complex performance problems "bounce around support" because we don't have the tools to let us analyze exactly where the problem really is. In fact, Forrester estimates that 80% of the solution time for a performance problem is spent figuring out where the problem lies, and only 20% is spent fixing the problem once we know where it lies. So - if BSM can give us the tools to figure out where the problem lies, this will help "cut the 80%".



  • "Domain expert inefficiency": In IT support, we typically have our first line support, and behind them, second and third level support. We often refer to the 2nd and 3nd level support groups as "domain experts". And when domain experts are not fixing support problems, they can be "moving the business forward" - merging duplicate IT systems resulting from an acquisition, moving more infrastructure to virtualized systems, etc. In fact, IDC estimates that 73% of IT budgets are spent "keeping the lights on". If we could be more efficient in our use of domain experts, we could shift some of this 73% towards things that give us consolidated billing systems, consolidated ordering systems, consolidated HR systems, etc. So, how do we make our domain experts more efficient?

    Let's imagine all events - from business transactions, through user experience problems, applications and middleware, down to infrastructure and networks - come to one place. And let's imagine that the first level support is actually empowered to understand each one of these event types - they have tools that clean out irrelevant events, understand the business priority of each event, execute automated run-books to fix simple problems, and figure out who to give complex problems to without causing churn. We could then use our domain experts more effectively. The experts wouldn't get events that hadn't been pre-processed by 1st level support. They would get incidents that were caused by their domain - no allocation churn. And all the "trivial stuff" would have been filtered out, allowing them to focus on just those incidents which 1st line really couldn't fix.

For the BSM BLOG team - Mike Shaw.
