Infrastructure Management Software Blog

OH, IL, WI, IN, MI Operations Center Technical Roadshow - April 20th to April 29th - Don't miss it!

Ever wish you could talk face-to-face with more technical people about Operations Center and Network Management Center products? Don’t have the time or budget to travel very far to do so?  Well, here is a great opportunity to meet and talk with technical experts on products like Operations Manager and NNMi – right in your backyard.


Vivit will be hosting a series of six (6) one-day sessions, each with a mix of presentations and Q&A sessions around these products.  The sessions will be held in the following states on the following days:


- (Columbus) Ohio – April 20, 2010


- (Orrville) Ohio – April 21, 2010


- (Dearborn) Michigan – April 22, 2010


- Wisconsin – April 27, 2010


- (Chicago) Illinois – April 28, 2010


- (Fishers) Indiana – April 29, 2010


If you have any further questions about this roadshow, feel free to contact me at asksonja@hp.com.


Learn how Independence Blue Cross reduced IT Operations costs

Join HP Software and Solutions for a live InformationWeek webcast with special guests Maryann Phillip, Director of Service Delivery at Independence Blue Cross (IBC), and Ken Herold, Practice Manager & Principal Architect with Melillo Consulting.


Hear first-hand how IBC is using HP Operations Center products like Operations Manager, Performance Manager, and DDM in addition to agentless and agent-based data collection to:



  • achieve profitable growth through enabling technologies

  • reduce costs by achieving a competitive cost structure

  • manage medical costs better -- through operational stability & improvements


Register today and learn how you can streamline and make YOUR processes more efficient.

Event Correlation: OMi TBEC and Problem Isolation - What's the difference (part 2 of 3)

If you have not done so already, you may want to start with part 1 in this series.
http://www.communities.hp.com/online/blogs/managementsoftware/archive/2009/09/25/event-correlation-omi-tbec-and-problem-isolation-what-s-the-difference-part-1-of-3.aspx


This is part 2 of 3 of my discussion of the event correlation technologies within OMi Topology Based Event Correlation (TBEC) and Problem Isolation. I'm going to focus on talking about how TBEC is used and how it helps IT Operations Management staff be more effective and efficient. My colleague Michael Procopio has discussed PI in more detail over in the BAC blog here: PI and OMi TBEC blog post 


If you think about an Operations Bridge (or "NOC"… but I've blogged my opinion of that term previously) then fundamentally its purpose is very simple.


 


The Ops Bridge is tasked with monitoring the IT Infrastructure (network, servers, applications, storage etc.) for events and resource exceptions which indicate a potential or actual threat to the delivery of the business services which rely on the IT infrastructure. The goal is to fix issues as quickly as possible in order to reduce the occurrence or duration of business service issues.


 


Event detection is an ongoing process: the Ops Bridge monitors events during all production periods, often 24x7 using shift-based teams.


 


Event monitoring is an inexact discipline. In many cases a single incident in the infrastructure will result in numerous events. Only one of these actually relates to the cause of the incident; the other events are just symptoms.


 


The challenge for the Ops Bridge staff is to determine which events they need to investigate and to avoid chasing the symptom events. The operations team must prioritize their activities so that they invest their finite resources in dealing with causal events based on their potential business impact. They need to avoid wasting time on duplicated effort (chasing symptoms) or, even worse, chasing symptoms down in a serial fashion before they finally investigate the actual causal event, as this extends the downtime of business services.


 


TBEC helps the Operations Bridge in addressing these challenges. TBEC works 24x7, examining the event stream, relating it to the monitored infrastructure and the automatically discovered dependencies between the monitored components. TBEC works to provide a clear indication that specific events are related to each other (related to a single incident) and to identify which event is the causal event and which are symptoms.


 


Consider a disk free space issue on a SAN which is hosting an Oracle database. With comprehensive event monitoring in place, this will result in three events:



  • a disk space resource utilization alert

  • quickly followed by an Oracle database application error

  • and a further event which indicates that a WebSphere server which uses the Oracle database is unhappy


 


Separately, all three events seem ‘important’ – so considerable time could be wasted in duplicate effort as the Ops Bridge tries to investigate all three events. Even worse, with limited resources, it is quite possible that the Operations staff will chase the events ‘top down’ (serially) – look at WebSphere first, then Oracle, and finally the SAN. This extends the time to rectification and increases the duration (or potential) of a business outage.


 


TBEC will clearly show that the event indicating the disk space issue on the SAN is the causal event – and the other two events are symptoms.
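
To make the SAN example concrete, here is a minimal sketch of the general idea behind topology-based correlation. It is not how OMi implements TBEC; the component names, the `DEPENDS_ON` map, and both functions are hypothetical and exist only for illustration. The rule it applies is simple: an event is treated as a symptom if something its component depends on also has an open event.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it depends on.
# In a real deployment this would come from discovered topology.
DEPENDS_ON = {
    "websphere-server": ["oracle-db"],
    "oracle-db": ["san-volume"],
    "san-volume": [],
}


def reachable_dependencies(component, depends_on):
    """Return every component that `component` depends on, directly or indirectly."""
    seen = set()
    queue = deque(depends_on.get(component, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(depends_on.get(node, []))
    return seen


def classify_events(events, depends_on):
    """Split events into (causes, symptoms).

    An event is a symptom when any component it depends on also has an event.
    """
    affected = {event["component"] for event in events}
    causes, symptoms = [], []
    for event in events:
        if reachable_dependencies(event["component"], depends_on) & affected:
            symptoms.append(event)
        else:
            causes.append(event)
    return causes, symptoms


# The three events from the example, in the order an operator might see them.
events = [
    {"component": "websphere-server", "message": "application unhappy"},
    {"component": "oracle-db", "message": "database application error"},
    {"component": "san-volume", "message": "disk space utilization alert"},
]

causes, symptoms = classify_events(events, DEPENDS_ON)
for event in causes:
    print("CAUSE:  ", event["component"], "-", event["message"])
for event in symptoms:
    print("SYMPTOM:", event["component"], "-", event["message"])
```

A real correlation engine would also need to consider timing and the type of each event; this sketch only looks at the dependency relationships.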


 


In a perfect world the Ops Bridge could monitor everything, detect every possible event or compromised resource that might impact a business service and fix everything before a business service impact occurs.


 


The introduction of increasingly redundant and flexible infrastructure helps with this – redundant networks, clustered servers, RAID disk arrays, load balanced web servers etc. But it can also add complications, which I’ll illustrate later.


 


One of the challenges of event monitoring is that it simply can NOT detect everything that can impact business service delivery. For example, think about a complex business transaction, which traverses many components in the IT infrastructure. Monitoring of each of the components involved may indicate that they are heavily utilized – but not loaded to the point where an alert is generated.


 


However, the composite effect on the end-to-end response time of the business transaction may be such that response time is simply unacceptable. For a web-based ordering system where customers connect to a company’s infrastructure and place orders for products, this can mean the difference between getting orders and the customer heading over to a competitor’s web site.
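
A small worked example shows how this can happen. All numbers below are invented for illustration: each component's time stays under its own alert threshold, yet the sum across the transaction is well over a hypothetical end-to-end target.

```python
# Hypothetical per-component timings for one business transaction,
# as (observed_ms, alert_threshold_ms). None of these values come from
# a real system; they only illustrate the arithmetic.
COMPONENTS = {
    "web server": (900, 1000),
    "application server": (1400, 1500),
    "database": (1800, 2000),
    "storage": (700, 800),
}

# Hypothetical acceptable end-to-end response time for the customer.
END_TO_END_TARGET_MS = 3000

# Component-level monitoring: alert only when a component exceeds its own threshold.
component_alerts = [
    name
    for name, (observed_ms, threshold_ms) in COMPONENTS.items()
    if observed_ms > threshold_ms
]

# End-to-end view: what the customer actually experiences.
total_ms = sum(observed_ms for observed_ms, _ in COMPONENTS.values())

print("Component alerts raised:", component_alerts)
print("End-to-end response time (ms):", total_ms)
print("Target exceeded:", total_ms > END_TO_END_TARGET_MS)
```

No component raises an alert, but the transaction as a whole misses its target, which is exactly the gap that component-level event monitoring cannot see.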


 


This is why End User Monitoring technologies are important. I'll talk about EUM in the next, and final, edition of this blog series.




Read part 3 in the series.
http://www.communities.hp.com/online/blogs/managementsoftware/archive/2009/09/25/event-correlation-omi-tbec-and-problem-isolation-what-s-the-difference-part-3-of-3.aspx



For HP Operations Center,  Jon Haworth.

Event Correlation: OMi TBEC and Problem Isolation - What's the difference (part 1 of 3)

I often get asked questions about the differences between two of the products in our Business Service Management Portfolio: BAC Problem Isolation and OMi Topology Based Event Correlation. Folks seem to get a little confused by some of the high level messaging around these products and gain the impression that the two products "do the same thing".


I guess that, as part of HP's Marketing organization, I have to take some of the blame for this so I'm going to blog my conscience clear (or try to).


To aid brevity I'll use the acronyms PI for Problem Isolation and TBEC to refer to OMi Topology Based Event Correlation.


On the face of it, there are distinct similarities between what PI and TBEC do.



  • Both products try to help operational support personnel to understand the likely CAUSE of an infrastructure or application incident.

  • Both products use correlation technologies (often referred to as event correlation) to achieve their primary goal.



I'll try to summarize the differences in a few sentences.



  • TBEC correlates events (based on discovered topology and dependencies) continuously to indicate the cause event in a group of related events. TBEC is "bottom up" correlation that works even when there is NO business impact - it is driven by IT infrastructure issues.

  • PI correlates data from multiple sources to determine the cause (or causal configuration item) where a business service impacting incident has occurred. PI performs correlation "on demand" and based on a much broader set of data than TBEC. PI might be considered "top down" correlation because it starts from the perspective of a business service impacting issue.



In reality, the differences between the products are best explained by looking at how they are used, and I'll use my next couple of blog posts to do exactly that for TBEC. If you want the detail on PI then visit the post "PI and OMi in the BAC blog" from my colleague, Michael Procopio.


Read part 2 in the series.
http://www.communities.hp.com/online/blogs/managementsoftware/archive/2009/09/25/event-correlation-omi-tbec-and-problem-isolation-what-s-the-difference-part-2-of-3.aspx


Read part 3 in the series.
http://www.communities.hp.com/online/blogs/managementsoftware/archive/2009/09/25/event-correlation-omi-tbec-and-problem-isolation-what-s-the-difference-part-3-of-3.aspx


For HP Operations Center, Jon Haworth.

Using Effective Incident Management to Minimize IT Outages

I had an interesting meeting this week with Christel Mes, Director of Marketing for one of our technology partners, AlarmPoint. They are a leader in Alert Management solutions, transforming events from monitoring, planning and help desk applications into role-specific information. AlarmPoint extends Operations Manager’s event consolidation capabilities by eliminating alert overload and reducing operational costs by making personnel more efficient and allowing them to resolve incidents faster.


We have a number of joint customers that are managing some very complex IT environments. AlarmPoint is hosting a free webinar on Wednesday, May 13, 2009 in which Wells Fargo will talk about how they are using the combination of Operations Manager and AlarmPoint for incident management. Their new approach now allows Wells Fargo to notify personnel of critical events and gives them the tools to intervene before there is a business impact to their customers.


Wells Fargo, like many online enterprises, found that downtime and unresolved issues were negatively affecting response times, mean time to repair and resolution of incidents. In order to manage, respond to and resolve critical alerts rapidly, the bank turned to automated Alert Management, consolidating IT staff profile information and combining it with complicated rotation schedules to ensure alerts never fell through the cracks.


I don’t want to give the entire story away here. If you want to learn the details, including some tips and tricks for increasing efficiency to reduce costs, please register for the AlarmPoint webinar.


For Operations Center, Peter Spielvogel.

A New Data Center is an Opportunity for New Thinking

With all the doom and gloom in the news these days, it was a bright spot in my week to have a meeting with a customer that is planning to build a new data center next year. Even more surprising is that the company is in the financial services industry. And no, they are not receiving any government money to finance this project.


They visited our Executive Briefing Center to learn about best practices in IT transformation. In the introductory comments, the Director of IT (who reports to the CIO) stated “A new data center is an opportunity for new thinking.” So, this set the context of looking at the state of the art in data center management and how to build it right if you are starting with the proverbial clean sheet of paper, which in this case, they are.


First, let’s cover their existing IT environment. 900 people (mix of on-shore and off-shore) managing an assortment of hardware (most of it non-HP), running UNIX (not HP-UX), using enterprise storage from one of the major vendors. For management tools, they own a large collection of tools from a single vendor (not HP), most of which has not been deployed because of its complexity and problems with the parts they have put into production. But, in all fairness, what they have does work at some level as they do not currently have issues with outages.


So, where are they going? On the infrastructure side, they are planning to move to Linux, blade servers (likely HP) running Oracle 11g, and VMware. They also plan to refresh their enterprise applications to the latest versions. And, they plan to experiment with some software as a service (SaaS) to see if it meets their needs and fits with their culture.


The overall IT infrastructure management strategy is (1) prevent, (2) detect, (3) respond. Currently, they do not have a true NOC. They are moving in that direction following an ITIL model, building an Operations Bridge. Their IT management goals are to reduce time on incident management and to add automation as much as possible to reduce human error.


On the IT management tools side, they need a way to manage the physical and virtual infrastructure, from the OS through the applications, in a single enterprise event consolidation console. This will capture all the events (after they are de-duplicated upstream), prioritize according to business goals, and then respond appropriately, either by automatically fixing them or by routing to the right subject matter expert.
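
The flow described above (consolidate, prioritize by business goals, then either fix automatically or route to an expert) can be sketched roughly as follows. This is an illustration only, not a description of any HP product: the `Event` fields, the lookup tables, and the `consolidate` function are all hypothetical. The customer expects de-duplication to happen upstream; it is included here only so the sketch is complete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str    # hypothetical: technology domain that raised the event
    message: str   # hypothetical: short event text
    service: str   # hypothetical: business service the event affects


# Hypothetical lookup tables; lower priority number means more important.
SERVICE_PRIORITY = {"online-banking": 1, "internal-reporting": 3}
AUTO_FIX = {"disk full": "run cleanup job"}
EXPERTS = {"database": "dba-team", "network": "network-team"}


def consolidate(events):
    """De-duplicate, order by business priority, then decide how to respond."""
    # dict.fromkeys drops exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(events))
    # sorted() is stable, so events of equal priority keep their arrival order.
    ranked = sorted(unique, key=lambda e: SERVICE_PRIORITY.get(e.service, 99))
    actions = []
    for event in ranked:
        if event.message in AUTO_FIX:
            actions.append((event, "auto", AUTO_FIX[event.message]))
        else:
            actions.append((event, "route", EXPERTS.get(event.source, "ops-bridge")))
    return actions


incoming = [
    Event("database", "disk full", "internal-reporting"),
    Event("network", "link down", "online-banking"),
    Event("network", "link down", "online-banking"),  # duplicate
]

actions = consolidate(incoming)
for event, kind, target in actions:
    print(f"{event.service}: {event.message} -> {kind}: {target}")
```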


They generally liked HP’s vision and the success stories we shared about other organizations that had already implemented all or part of their vision. Interestingly, the place that generated the most skepticism was our discussion about runbook automation. While they saw the value of automating IT processes, they just could not believe that they could use this technology to streamline some of their common IT problems. Even talking about specific use cases (from a pool of hundreds of customers) did not sway them. Since seeing is believing, the sales rep took an action item to schedule a follow-up meeting where we can show them a demo.


Overall, a great discussion. And, a happy day to hear that a customer is planning a new data center. Even more so when they want to use the opportunity to re-architect their systems to build in the latest and greatest business technology optimization.


For Operations Center, Peter Spielvogel


 

The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.