Operations Log Intelligence – How to get to the bottom of root cause analysis

Guest post by Shimrit Yacobi, Engineer, HP IT Operations

 

This article walks through a use case scenario, from an SME's point of view, in which the OLI solution collects and stores all log data to support fast and efficient real-time incident management.

HP Operations Log Intelligence (OLI) is designed to offer Log Management capabilities to IT Operations teams. It collects and analyzes log data from any log-generating source.

 

OLI uses high-speed, high-compression technology to store log entries for years, yet still returns results quickly for searches launched by IT Operations teams and Subject Matter Experts (SMEs). With OLI, organizations are no longer constrained in their ability to store and analyse data by technology or cost.

 

For more details, this blog shows you the advantages OLI brings to your organization. I have also written a second blog on the 3 key features for Log Management with HP OLI. You can read it here.

 

OLI dashboards allow IT support staff at all levels to investigate log data according to the best practices defined by your operator. Dashboards are designed to coordinate cross-team investigations and enhance collaboration. OLI includes out-of-the-box (OOTB) dashboards tailored to IT Operations needs, and also lets you easily design and create custom dashboards.

 

Scenario

Bill is an SME monitoring an increasing number of log files in his virtualized environment. He has been experiencing performance issues with Windows and needs to find the root cause of the problem. Bill has a Windows application running on top of an Apache web service. Because he has used OLI to collect Apache and Windows log data, Bill has the data readily available to analyze and can diagnose the root cause of the issues using the OLI application.

 

Bill opens the OLI web application and begins on the summary page, which gives him a view of all the system components being monitored:

OLI 1.png

 

Bill scans the summary page for suspicious behaviour that might be related to the Windows issues he encountered. He notices that the apache_access_file agent count is higher than usual.

 

He clicks on the apache_access_file to drill down and is directed to the analyze page. 

 

OLI 2.png

 

You can download HP OLI for yourself here

 

The drill-down operation automatically creates a query for the analyze page. Bill sets the time range to cover the period in which his issues occurred. This results in an aggregate bar chart displaying the number of requests for every day within the selected time range.

 

As an experienced SME, Bill knows that 2000 requests sent from one host within one hour might indicate the cause of his performance issues.
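
To make the idea of "requests from one host within one hour" concrete, here is a minimal Python sketch of that kind of aggregation outside OLI. It is not OLI's query language or implementation. It assumes the access log uses the Apache Common Log Format, and the function names and the 2000-request default threshold (taken from the figure above) are illustrative only.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the start of an Apache Common Log Format line:
#   host ident user [timestamp] "request" status bytes
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def requests_per_host_hour(lines):
    """Count requests per (host, hour) bucket; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        host, timestamp = match.groups()
        when = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")
        hour = when.replace(minute=0, second=0)  # truncate to the start of the hour
        counts[(host, hour)] += 1
    return counts

def busy_hosts(counts, threshold=2000):
    """Return the (host, hour) buckets whose request count meets the threshold."""
    return {key: n for key, n in counts.items() if n >= threshold}
```

Calling `busy_hosts(requests_per_host_hour(open_log_file))` on an iterable of log lines would return only the host and hour combinations worth a closer look.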

 

He decides to drill deeper into the problem and opens the Apache Web Server dashboard, a preconfigured customized dashboard.

 

The dashboard allows him to view IT logs, machine data, and a combination of different aggregations and charts over time. Bill adjusts the date range to see the analysis of the day in question all on one screen.

 

OLI 3.png

 

Bill investigates the decoded data from log files on the dashboard, and reviews the server behaviour for the last day. He looks for any suspicious error messages with a high severity, but according to the “Apache Error Count by Severity” pane, none appear in the logs.
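
For readers curious what a count of errors by severity involves, the following Python sketch tallies severities from raw Apache error log lines. It only illustrates the idea and is not how the OLI pane is built. It assumes the default Apache error log layout, where the severity is the second bracketed field, written either as `[error]` or as `[module:error]`; the function name is hypothetical.

```python
import re
from collections import Counter

# The first bracketed field is the timestamp; the second holds the severity,
# optionally prefixed with the module name (for example "core:error").
SEVERITY_RE = re.compile(r'^\[[^\]]*\] \[(?:[\w.]+:)?(\w+)\]')

def error_count_by_severity(lines):
    """Return a Counter mapping severity (e.g. 'error', 'notice') to line count."""
    counts = Counter()
    for line in lines:
        match = SEVERITY_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A result with no entries for high severities such as `emerg`, `alert`, `crit`, or `error` corresponds to what Bill sees in the pane.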

 

Bill looks at the aggregate count of requests and notices a few peaks of 1000-2000 requests. He looks at the chart and sees a recurring anomaly of stress inputs on the Apache server within the same time range in which he encountered the performance issue.
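
The pattern Bill spots by eye, peaks that come back at regular intervals, can also be checked with a few lines of Python. This is a rough sketch rather than anything OLI does internally. It assumes the request counts are already available as one number per hour; the 1000-request threshold comes from the low end of the peaks described above, and the function names and the one-hour tolerance are illustrative choices.

```python
def find_peaks(hourly_counts, threshold=1000):
    """Return the indices of the hours whose request count meets the threshold."""
    return [i for i, n in enumerate(hourly_counts) if n >= threshold]

def peak_intervals(peak_hours):
    """Return the gaps, in hours, between consecutive peaks."""
    return [later - earlier for earlier, later in zip(peak_hours, peak_hours[1:])]

def looks_periodic(intervals, tolerance=1):
    """Treat the peaks as recurring if all gaps agree to within `tolerance` hours."""
    if len(intervals) < 2:
        return False
    return max(intervals) - min(intervals) <= tolerance
```

Peaks that arrive at a steady interval suggest a scheduled job rather than organic traffic, which is the conclusion Bill reaches next.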

 

Bill realizes there was an Apache script running every few hours that caused the peaks that eventually resulted in the Windows performance issues. After removing the script, the anomalies stopped and the corresponding Windows performance issues stopped as well. Using the OLI application, Bill determined that the Apache script was the root cause of the performance issues he encountered. Bill can now share his dashboards with IT staff at all levels in his organization to pinpoint and prevent similar performance issues in the future.  

 

If you would like to know what HP OLI can do for you, check out the product page here.

You can also download OLI to try it yourself here.

 

Read more about OLI in the second blog in this series, 3 key features for log management with HP OLI.

 
