Infrastructure Management Software Blog

Mobile Provider uses HP Operations Manager to increase availability by 10% (Customer Success Story)

Consolidated operations drive efficiency. In this example, the leading mobile telephony distribution network in France has standardized on HP Operations Manager and HP Network Node Manager to optimize its IT operations.

Smart Decisions Today instead of Desperate Ones Tomorrow

Yesterday, I had the opportunity to participate in a briefing in which a dozen IT executives from a leading financial institution came to discuss their unique requirements and how we can help them meet their aggressive growth goals. The first slide that their head of technology presented contained the equation “change = opportunity + risk”. This started a six-hour discussion about market and technology disruptions and how they affect their quest for increased performance, capacity, and efficiency.


The theme of the day was business value. The customer had recently made several major changes to their IT infrastructure, all to embrace innovations that provided more performance for less money. Some of these upgrades meant dropping vendors of proprietary technologies that had served them well for years in favor of open platforms such as Linux. In financial markets, efficiency rules.


While efficiency was the foundation of many of their technology decisions, innovation was their passion. Advanced technology was the enabler that allowed them to be first, best, and fastest in meeting their customers’ demanding needs. The company needs to develop new financial instruments to address rapidly changing and unprecedented (at least in recent history) market conditions. With rising transaction volumes (a long-term trend, anyway) and rising customer expectations, there is a constant need to increase system performance and capacity, and for IT solutions that can manage growing complexity.


In addition to technology changes, the regulatory environment for financial services firms is also shifting rapidly. While some decade-old regulations such as Glass-Steagall are now gone, new ones are taking their place and there will be new regulations to fix perceived free market inadequacies. This places additional load on IT systems as they must track and log every moving part to ensure compliance with new rules.


Reporting granularity is a major requirement for any IT system. While averages provide useful trend information, they fall short in delivering actionable intelligence for troubleshooting. Generally, it is spikes that cause system outages rather than averages. And if your data collection intervals are too broad, you lose the ability to focus on the incident that caused an outage or lost data. This was a big discussion topic.
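To make the point concrete, here is a minimal sketch (the metric values and sampling interval are invented for illustration, not taken from any HP tool) of how a one-minute average can completely hide the five-second spike that actually caused the trouble:

```python
# Hypothetical CPU utilization samples, one every 5 seconds over one minute.
samples = [12, 15, 11, 14, 98, 13, 12, 10, 11, 15, 13, 12]

average = sum(samples) / len(samples)
print(f"1-minute average: {average:.1f}%")   # 19.7% - looks perfectly healthy
print(f"5-second peak:    {max(samples)}%")  # 98% - the spike behind the outage
```

Roll those samples up into one averaged data point and the evidence of the spike disappears from your records.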


Another memorable takeaway from the day was when one of their executives said “I would rather make smart decisions today instead of desperate ones in the future.” This was the premise for setting up what turned out to be a very productive meeting.


What smart decisions are you making to simplify managing your IT infrastructure?


For HP Operations Center, Peter Spielvogel

Automated Infrastructure Discovery - Extreme Makeover

Good Discovery Can Uncover Hidden Secrets
Infrastructure discovery has something of a bad reputation in some quarters. We've done some recent surveys of companies using a variety of vendors’ IT operations products. What's interesting is that, in our survey results, automated infrastructure discovery fared pretty badly, both in terms of the support it received within organizations and in terms of the success that respondents believed they had achieved.
 
There are a number of reasons underlying these survey results. Technology issues and organizational challenges were highlighted in our survey. But I believe that one of the main 'issues' discovery has is that people have lost sight of its basic value and the benefits it can bring. Organizations see 'wide-reaching' discovery initiatives as complex to implement and maintain - and they do not see compelling short-term benefits.
 
I got to thinking about discovery and the path that it has taken over the last 15 or 20 years. I remember the excitement when HP released its first cut of Network Node Manager. It included discovery that showed people things about their networks that they just did not know. There were always surprises when we took NNM into new sites to demonstrate it. Apart from showing folks what was actually connected to the network, NNM also showed how the network was structured, the topology.
 
Visualization --> Association --> Correlation
And once people can see and visualize those two sets of information, they start to make associations about how events detected in the network relate to each other - they use the discovery information to optimize their ability to operate the network infrastructure.
 
So the next logical evolution for tools like NNM was to start building some of the analysis into the software as 'correlation': for example, the ability to determine that the 51 'node down' events you just received are actually one 'router down' event and 50 symptoms generated by the nodes that are 'behind' the router in the network topology. Network operators could ignore the 'noise' and focus on the events that were likely causes of outages. Pretty simple stuff (in principle) but very effective at optimizing operational activities.
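As an illustration, here is a minimal sketch of that style of correlation. The topology, device names, and event list are hypothetical, and this is not NNM's actual algorithm - just the principle it applies:

```python
from collections import deque

# Hypothetical discovered topology: which nodes sit 'behind' each device.
downstream = {
    "router-1": ["node-1", "node-2", "node-3"],   # imagine 50 nodes here
    "node-1": [], "node-2": [], "node-3": [],
}

def behind(device):
    """All nodes transitively reachable behind a device."""
    seen, queue = set(), deque(downstream.get(device, []))
    while queue:
        n = queue.popleft()
        if n not in seen:
            seen.add(n)
            queue.extend(downstream.get(n, []))
    return seen

def correlate(down_devices):
    """Split 'device down' events into likely causes and mere symptoms."""
    devices, causes, symptoms = set(down_devices), [], []
    for d in devices:
        # If another down device sits between us and the management station,
        # this device is probably just a symptom of that failure.
        if any(d in behind(u) for u in devices if u != d):
            symptoms.append(d)
        else:
            causes.append(d)
    return causes, symptoms

causes, symptoms = correlate(["router-1", "node-1", "node-2", "node-3"])
print("work on:", causes)      # ['router-1']
print("suppress:", symptoms)   # the three nodes behind it
```

The key ingredient is not the correlation logic, which is trivial here, but the discovered topology that feeds it.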
 
Scroll forward 15 years. Discovery technologies now extend across most aspects of infrastructure and the use cases are much more varied. Certainly inventory maintenance is a key motivator for many organizations - both software and hardware discovery play important roles in supporting asset tracking and license compliance activities. Not hugely exciting for most Operational Management teams.
 
Moving Towards Service Impact Analysis
Service impact analysis is a more significant capability for Operations Management teams and is a goal that many organizations are chasing. Use discovery to find all my infrastructure components - network devices, servers, application and database instances - and tie them together so I can see how my Business Services are using the infrastructure. Then, when I detect an event on a network device or database I can understand which Business Services might be impacted and I can prioritize my operational resources and activities. Some organizations are doing this quite successfully and getting significant benefits in streamlining their operational management activities and aligning them with the priorities of the business.
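Stripped to its essentials, the impact lookup is just a reverse walk over the discovered dependencies. Here is a rough sketch, assuming a hypothetical component-to-service map of the kind discovery would populate:

```python
# Hypothetical output of discovery: which components each Business Service uses.
depends_on = {
    "online-banking": ["web-srv-1", "app-srv-1", "db-srv-1"],
    "payments":       ["app-srv-2", "db-srv-1", "san-array-1"],
}

def impacted_services(failed_component):
    """Which Business Services rely on the failed component?"""
    return [service for service, components in depends_on.items()
            if failed_component in components]

print(impacted_services("db-srv-1"))    # ['online-banking', 'payments']
print(impacted_services("app-srv-2"))   # ['payments'] - smaller blast radius
```

An event on the shared database server immediately reads as "two Business Services at risk", which is exactly the information needed to prioritize the response.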
 
But there is one benefit of discovery which seems to have been left by the side of the road. The network discovery example I started with provides a good reference. Once you know what is 'out there' and how it is connected together, you can use that topology information to understand how failures in one part of the infrastructure cause 'ghost events' - symptom events - to be generated by infrastructure components which rely in some way on the errant component. When you get five events from a variety of components - storage, database, email server, network devices - and you know how those components are 'connected', you can relate the events together and determine which are symptoms and which is the likely cause.
 
Optimizing the Operations Bridge
Now, to be fair, many organizations understand that this is important in optimizing their operational management activities. In our survey, we found that many companies deploy skilled people with extensive knowledge of the infrastructure into the first level operations bridge to help make sense of the event stream - try to work out which events to work on and which are dead ends. But it's expensive to do this - and not entirely effective. Operations still end up wasting effort by chasing symptoms before they deal with the actual cause event. Inevitably this increases mean time to repair, increases operational costs and degrades the quality of service delivered to the business.
 
So where is the automation? We added correlation to network monitoring solutions years ago to do exactly this, so why not do 'infrastructure-wide' correlation?
 
Well, it's a more complex problem to solve of course. And there is also the problem that many (most?) organizations just do not have comprehensive discovery across all of their infrastructure. Or if they do have good coverage it's from a variety of tools so it's not in one place where all of the inter-component relationships can be analyzed.
 
Topology Based Event Correlation - Automate Human Judgment
This is exactly the problem we've been solving with our Topology Based Event Correlation (TBEC) technology. Back to basics - although the developers would not thank me for saying that, as it's a complex technology. Take events from a variety of sources, do some clever work to map them to the components in the discovery database (populated by a number of discrete tools), and then use the relationships between the discovered components to automatically do what human operators are trying to do manually: indicate the cause event.
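To show the shape of the reasoning being automated, here is an illustrative sketch; the dependency data and event formats are invented for the example, and the real product works against a discovery database populated by multiple tools:

```python
# Hypothetical discovered relationships: component -> what it depends on.
runs_on = {
    "email-svc": ["db-inst-1"],
    "db-inst-1": ["san-lun-7"],
    "san-lun-7": [],
}

def upstream(component):
    """Everything a component transitively depends on."""
    deps, stack = set(), list(runs_on.get(component, []))
    while stack:
        c = stack.pop()
        if c not in deps:
            deps.add(c)
            stack.extend(runs_on.get(c, []))
    return deps

def indicate_cause(events):
    """Mark each event as cause or symptom using the relationships."""
    errant = {e["component"] for e in events}
    for e in events:
        # An event is a symptom if something it depends on is also errant.
        e["role"] = "symptom" if upstream(e["component"]) & errant else "cause"
    return events

events = [
    {"component": "email-svc", "msg": "mail queue backed up"},
    {"component": "db-inst-1", "msg": "tablespace unavailable"},
    {"component": "san-lun-7", "msg": "LUN offline"},
]
for e in indicate_cause(events):
    print(e["role"], "-", e["component"], "-", e["msg"])
# Only the SAN event is flagged as the cause; the other two are symptoms.
```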
 
Doing this automatically for network events made sense 15 years ago; doing it across the complexity of an entire infrastructure makes even more sense today. It eliminates false starts and wasted effort.
 
This is a 'quick win' for Operational Management teams: improved efficiency, reduced operational costs, senior staff freed up to work on other activities… better value delivered to the business (and of course huge pay raises for the Operations Manager).
 
So what do you need to enable TBEC to help streamline your operations? Well, you need events from infrastructure monitoring tools - and most organizations have more than enough of those. But you also need infrastructure discovery information - the more, the better.
 
Maybe infrastructure discovery needs a makeover.

 

For HP Operations Center, Jon Haworth


 

Gaining Insight into your IT Operations

Here are some links to recent press coverage of HP’s latest infrastructure management software.


HP extends virtual systems management
Network World, 1/15, Denise Dubie


Means to an end: New HP management tools address green IT challenges
ZDNet's GreenTech Pastures, 1/15, Heather Clancy


HP Extends Orchestration Of Virtual, Physical Servers
InformationWeek, 1/15, Charles Babcock


HP adds server management, efficiency features to reduce IT costs
SearchDataCenter, 1/15, Bridget Botelho


Note: System administrators can use Insight Software on its own or use Operations Center to consolidate their operations and manage events from across the enterprise (any brand of hardware, both physical and virtual servers).


For HP Operations Center, Peter Spielvogel

Consolidated IT Operations: Return of the Prodigal Son

Let's face it, the concept of bringing together all of your IT infrastructure monitoring into a single "NOC" or Operations Bridge has been around for years. Mainframe folks will tell you they were doing this stuff 30 years ago.

 

Unfortunately, in the distributed computer systems world, a lot of organizations have still not managed to successfully consolidate all of their IT infrastructure operations. I see a lot of companies that believe they have made good progress: often they've managed to pull together most of the server and application operations activities, and maybe minimized the number of monitoring tools that they use.

 

But when you dig below the surface, often there will be a separate network operations team, and maybe an application support team that owns a 'special' application. And of course the admins who are responsible for the rollout of the new virtualization technology - which just "cannot" be monitored by the normal operations tools and processes.

 

And that's the problem... Often there is resistance from a number of different angles to initiatives which try to pull end-to-end infrastructure monitoring into a single place. Legacy organizational resistance is probably the biggest challenge - silos have a tendency to be very difficult to 'flatten'.

 

Another common theme is that the technical influencers (architects, consultants, application specialists etc.) in the organization create FUD that the toolset used by the operations teams is not suitable for monitoring the new technology that they are rolling out. They need to use their own special monitoring solution or the project will fail. Because it's a new technology and everyone is scared of a failed rollout, management acquiesces and another little fragmented set of monitoring technology, organization and processes is born. Every new technology has potential for this - I've seen it happen with MS Windows, Linux, Active Directory, Citrix, VMware - the list is endless.

 

So what? I hear you say, what's your point? Well I'm seeing a lot of organizations revisiting the whole topic of consolidating their IT operations and establishing a single Operations Bridge - and making some significant changes.

 

Why now? Simple - to reduce the operational expenditures associated with keeping the lights on. In the current economic climate, organizations are motivated 'top down' to drive cost out wherever they can. Initiatives that deliver cost reductions in the short term get executive sponsors. There is also much lower tolerance for the kinds of hurdles that used to be raised as objections - organizational silos get flattened, tool portfolios are rationalized.

 

It's not just about cutting cost of course. Simply reducing headcount would achieve that goal, but the chances are that the quality of IT service delivered to the business would suffer, and there would be direct impacts on the ability of the business to function.

 

Of course, the trick is to consolidate into an Operations Bridge, and be able to deliver the same or higher quality IT services to the business but with reduced cost. Often the economies of scale and streamlined, consistent processes that are enabled by an Operations Bridge will deliver significant benefits - and reduce OpEx.

 

This is where HP's Operations Center solutions have focused for the last 12 or 15 years. In my next post, I'll talk about where HP sees the next significant gains being made - where we are focusing so we can help our customers take their existing Operations Bridge and significantly increase its efficiency and effectiveness.

 

In the meantime, if you want to read a little more about the case for consolidated operations, take a look at this white paper "Working Smart in IT Operations - the case for consolidated operations".

 

For HP Operations Center, Jon Haworth.

 

 

ROI for IT Infrastructure Monitoring - Measuring what Matters

I read an interesting post and related ebook by David Meerman Scott on why traditional marketing ROI measures lead to failure. His premise is that measuring marketing metrics such as the number of sales leads captured and press mentions leads to the wrong behaviors and in some ways undermines one of the primary goals of marketing, which is to increase sales and market share. So, why is everyone in both marketing and IT so focused on ROI?

 

The answer is that focusing on the return of your investments in different parts of the business allows you to allocate scarce resources and drive the best returns for the shareholders. The key is tracking the metrics that matter. Here are two examples, one from marketing and one from IT.

 

In previous positions, I have created marketing dashboards that charted many of the items that Mr. Scott slammed. Why would I or other seasoned marketing professionals do this? One reason is that some metrics are relatively easy to track (such as leads captured at a trade show or the number of responses to a marketing campaign). Correlating these to the real goal of increasing sales is much trickier and requires a much heavier measurement infrastructure, including accurate input from sales and customers about the number of touches and how individual marketing campaigns or programs influenced each stage of the sales process. Few companies have the will or discipline to do this.

 

On the IT side, there are also easily-trackable metrics. Server utilization, power consumption, and application uptime appear on many IT dashboards. While these are certainly important, what really matters is how the IT infrastructure supports the business goals. Business owners care about:



  • Availability - can my users access the applications they need?

  • Performance - does the application deliver an acceptable response time?

  • Data accuracy - does the application maintain data integrity?
 

Again, tracking these business-focused metrics is harder than focusing on ones that are easy to gather from the element managers that often accompany systems. But the right management software and some automated processes make it straightforward to create IT dashboards with meaningful, business-focused metrics. This is what the field of business service management is all about: BSM links the underlying infrastructure and applications to business outcomes such as those listed above.
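As a simple illustration of what those dashboard numbers look like (the sample data and figures are hypothetical, not tied to any HP product), the business-facing metrics are easy to compute once monitoring data is consolidated:

```python
import math

# Availability: outage minutes come from monitoring and outage records.
minutes_in_month = 30 * 24 * 60
minutes_down = 130
availability = 100 * (1 - minutes_down / minutes_in_month)
print(f"Availability: {availability:.2f}%")  # 99.70%

# Performance: percentiles show what users feel; averages hide it.
response_times = sorted([0.21, 0.25, 0.31, 0.28, 0.22,
                         1.90, 0.27, 0.24, 0.26, 0.23])  # seconds
average = sum(response_times) / len(response_times)
rank = math.ceil(0.95 * len(response_times))  # nearest-rank method
p95 = response_times[rank - 1]
print(f"Average response: {average:.2f}s")    # 0.42s - looks fine
print(f"95th percentile:  {p95:.2f}s")        # 1.90s - what users complain about
```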

 

To learn more about BSM, please visit my colleague Mike Shaw’s BSM blog or download a white paper about HP’s approach to BSM.

 

In future posts, my fellow bloggers and I will address how robust IT infrastructure monitoring contributes to delivering availability, performance, and accurate data.

 

For HP Operations Center, Peter Spielvogel.


 

The opinions expressed above are the personal opinions of the authors, not of HP. By using this site, you accept the Terms of Use and Rules of Participation.