Modernize IT Monitoring Event Analytics with Predictive Analytics

IT monitoring technology is moving forward rapidly, thanks in large part to machine learning and predictive analytics. If you’re still getting by with a legacy IT solution, you’re missing out on a lot more than a shiny object. The benefits of a New IT approach are being proven every day.

Read on to find out how a data-driven, predictive approach to IT monitoring can bring your organization into the world of New IT. You’ll find out where you are on the IT Maturity Framework and get practical tips for how to move up the curve. Plus, you’ll learn how to:

  • Break down data silos to get the most value from all your data without extensive cleansing and preparation
  • Build alignment between your IT department and business stakeholders
  • Speed up the process of identifying bottlenecks and performing root-cause analysis, letting you solve problems faster and prevent them from recurring

Content Summary

The Problem With Legacy IT Monitoring
Too Many Data Types and Formats
Too Many Silos
Too Much Time to Resolution
Lack of Alignment Between IT and Business Stakeholders
Long Time to Value
Why You Can’t Afford to Wait
IT Bears an Unfair Share of the Burden
Traditional IT Operating Model
The New IT Operating Model
Data Is the Foundation
The Three Pillars of IT
The Journey to World Class IT Monitoring
Level 1: Search and Monitor – Reactive
Level 2: Operational Visibility – Proactive
Level 3: Business Insights – Predictive
Level 4: Prediction and Improvement – Preventative
Getting There With Splunk
Conclusion

The Problem With Legacy IT Monitoring

It’s hard to overcome inertia, especially when it comes to replacing an IT operations monitoring solution. You’ve spent the money. People are used to it. When you look at replacement options, every vendor seems to promise end-to-end everything. It’s easy to fall back on the maxim that “If it ain’t broke, don’t fix it.”

Just because your IT monitoring solution isn’t broke(n) doesn’t mean it’s giving you all the value you should expect. Your monitoring solution should be preserving uptime and ensuring great customer experiences. But systems still go down. Customer experience suffers. Millions of dollars in revenue are lost. To monitor and maintain the customer experience, IT must be able to measure uptime, performance, and response time of mission-critical applications and the underlying infrastructure they run on.

The average enterprise runs hundreds of applications, servers, virtual machines, containers, and microservices which produce constant streams of data in disparate forms. But legacy IT tools can’t see how the layers of the stack play together — they’re usually point solutions, designed to focus on one system and ignore the world around them.

Meanwhile IT leaders are faced with shrinking budgets and growing demands, expanding data complexity and pressure to keep up with digital transformation and the move to the cloud.

IT teams feel the stress of continuous war room misery, arguments over accountability and the inevitable finger-pointing.

Root cause analysis goes by the wayside and the same problems resurface over and over again. Too much time is wasted in debugging old solutions rather than coming up with new and better ones.

So if there’s a high price to pay for staying with old, outdated solutions, why aren’t IT departments continually improving and updating their legacy monitoring systems? Or better yet, starting over with new ones? You can probably list half a dozen reasons without thinking hard.

Too Many Data Types and Formats

Ingesting and normalizing data of different formats and types is tedious and unmanageable, which makes real-time decisions difficult. Running too many monitoring tools, each covering a single layer of the IT stack such as networks or applications, creates silos and inefficiencies.
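As a rough illustration of what normalizing mixed formats involves, the sketch below maps a JSON log line and a key=value log line onto one shared record shape. The field names (ts, host, msg), the normalize function and both sample lines are hypothetical and not taken from any particular monitoring tool.

```python
import json


def normalize(line: str) -> dict:
    """Map a JSON or key=value log line onto one shared record shape."""
    line = line.strip()
    if line.startswith("{"):
        raw = json.loads(line)
    else:
        # Assumes space-separated key=value pairs whose values contain no spaces.
        raw = dict(pair.split("=", 1) for pair in line.split())
    # Accept either the short or the long spelling of each (hypothetical) field.
    return {
        "ts": raw.get("ts") or raw.get("timestamp"),
        "host": raw.get("host") or raw.get("hostname"),
        "msg": raw.get("msg") or raw.get("message"),
    }


json_line = '{"timestamp": "2024-01-01T00:00:00Z", "hostname": "web01", "message": "timeout"}'
kv_line = "ts=2024-01-01T00:00:05Z host=db01 msg=disk_full"
records = [normalize(json_line), normalize(kv_line)]
```

Even with only two formats, the mapping logic has to know every source's field names, which hints at why hand-built normalization becomes unmanageable as sources multiply.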

Too Many Silos

When data lives inside one tool but can’t access or communicate with data confined to other tools, IT practitioners lose context on what’s happening in their environment because they’re seeing only a part of the picture.

Too Much Time to Resolution

Infrastructure complexity affects IT’s ability to quickly determine the root cause of an issue. Ingesting data from different formats and making sense of it to diagnose and determine the root cause is problematic.

Lack of Alignment Between IT and Business Stakeholders

As digital business infrastructure increases in complexity, IT teams feel more pressure than ever to reduce business-impacting incidents. When IT systems fail, the ramifications go beyond the immediate financial loss of downtime — a business could lose customers and jeopardize its reputation.

Long Time to Value

Implementing a new IT monitoring tool is an important investment, but it can take a while to show return. Companies that move from one legacy system to another can find themselves hogtied by the need to retrain employees and update processes, plus address unexpected but inevitable compatibility issues. Monitoring tools that can’t automatically ingest and use data in multiple formats bring their own brand of headache.

Why You Can’t Afford to Wait

Every business is a technology business, and the IT department is essential to delivering a new wave of services. Digital evolution has changed the way every business operates. To compete, you need to be available whenever and wherever your customers are, because every issue and outage gives customers one more chance to click away to a competitor.

IT Bears an Unfair Share of the Burden

The IT department carries the weight of the technology burden experienced by modern organizations. While end users may think of IT as the people they call when their laptop crashes, IT’s responsibilities are increasingly varied — and vital. Whether you’re looking at the old model or the new, the job of IT spans four general categories:

Building IT: Building and delivering new business services through technology

Supporting IT: Creating and maintaining the infrastructure to monitor, support and run these new services

Optimizing IT: Measuring the efficiency and effectiveness of service delivery based on the organization’s priorities, goals and metrics

Fixing IT: Diagnosing and fixing problems and preventing them from recurring

Too often, those core tasks were separate and conducted in silos, resulting in wasted time and money, unhappy customers and frustrated IT teams running from war room to war room.

Let’s take a look at the traditional IT operating model and compare it to the new, improved version.

Traditional IT Operating Model

In the traditional IT model, development and operations were separated by a wall, whether metaphorical or physical. Their work cycles were unique. Development built new applications with a cycle that started with planning and ran through deployment.

This may have worked in large organizations where teams worked in the same location, used simple tools built for simple technology environments and had downtime overnight to implement changes. But that’s not the environment we live in now.

The New IT Operating Model

The digital revolution requires that the development and operations cycles come together into one that can react to the speed and complexity of new IT environments. Integrating the two cycles creates an open feedback loop, where problems can be quickly identified and fixed without interruption. Customer expectations are continually fulfilled through new capabilities and new code, and by Development and Operations working together to meet expectations.

This cycle runs much faster, which is the whole point. In the traditional model, IT departments didn’t have the ability to work together and collaborate feature by feature. Development worked by themselves and delivered a new release every six months, which Operations maintained, scaled and monitored.

The New IT operating model unifies the development and operations cycles into one continuous, coordinated cycle. This model reflects a DevOps mindset, letting companies get features to market faster, innovate faster, test and iterate without interrupting service delivery.

Data Is the Foundation

For years, organizations have been experimenting with new ways to get value out of data. The biggest roadblock? Silos. IT ops had monitoring tools. Development had their own tools for integrated development environments. Network and security operations had their own tools. Nobody could see one another’s data. When something went wrong, finger-pointing ensued.

The New IT model creates a common and shared data substrate to form the foundation of IT. The tools for monitoring, collaboration and automation are built on top of that shared foundation, creating a closed loop to deliver and operate new IT services. The right stakeholders are engaged in delivering business value, identifying and understanding issues and working together to find the root cause of problems.

The data foundation also allows teams to work together on post-incident reviews, reflect on what happened and feed that information back into the cycle. Problems are solved once, not over and over across the organization.

Data is the foundation of the New IT environment, with three pillars built upon it to provide an operational structure.

The Three Pillars of IT

Monitoring
Monitoring in a New IT environment is a collaborative and centralized activity. Data for diagnostics, remediation and automation is available in one place for everyone to use. Happily, you don’t have to jump in feet first to get the benefits; you can build up your capabilities over time as you adopt new cloud and digital technologies.

Collaboration
Digital transformation relies on teamwork to get the right capabilities to every part of the organization. Collaboration across organizational and data silos means getting the right people with the right skills involved in solving problems. If you can provide them with all the information they need, they can do their jobs faster. Once you’ve fixed the problem, you reflect and learn, to make sure what you’ve learned will help prevent the same problem from happening again.

This process, known as collaborative incident resolution, is made possible by the new generation of monitoring and analytics solutions powered by artificial intelligence for IT Operations (AIOps).

Automation
Modern IT departments are filled with highly skilled professionals whose time is too valuable to be wasted on mundane, repetitive tasks that can be done by computers. Automation allows your IT team to focus their expertise and energy on finding creative, innovative ways to deliver business value. Applying automation to systems monitoring means taking your data, working collaboratively and then feeding the right tools with what they need to do the work.

Machine learning applied to historical and real-time data provides the raw material for effective automation: taking action, remediating, analyzing the effects and handling routine administrative tasks like opening and closing tickets.

The Journey to World Class IT Monitoring

The journey to world-class IT monitoring starts with the fundamentals and builds on them at each step to provide not just more features but more business value.

Level 1: Search and Monitor – Reactive

Level 1 features real-time monitoring of performance, collecting data from a number of sources and using it to search for errors and investigate the cause of potential IT performance issues. This data may include web infrastructure logs, network data, application diagnostics and cloud service information. At Level 1, organizations react and respond to incidents, outages and other issues to find and troubleshoot performance bottlenecks. The first step to world-class IT monitoring, Level 1 is reactive: reacting to events as they happen and providing information to fix them.

Reactive monitoring is at the start of the maturity curve, but it can still provide significant benefits, including centralized visibility to all data and the ability to collaborate across all teams. Benefits like faster mean time to repair boost customer satisfaction while demonstrating the value of the New IT model.
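Mean time to repair is simply the average gap between detecting an incident and resolving it. A minimal sketch of the arithmetic follows; the incident timestamps and the mean_time_to_repair function are made up for illustration.

```python
from datetime import datetime


def mean_time_to_repair(incidents):
    """Average minutes between detection and resolution."""
    durations = [
        (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds() / 60
        for start, end in incidents
    ]
    return sum(durations) / len(durations)


# Hypothetical (detected, resolved) pairs in ISO 8601 format.
incidents = [
    ("2024-01-01T10:00:00", "2024-01-01T10:20:00"),  # 20 minutes
    ("2024-01-02T14:00:00", "2024-01-02T15:00:00"),  # 60 minutes
    ("2024-01-03T09:00:00", "2024-01-03T09:10:00"),  # 10 minutes
]
mttr = mean_time_to_repair(incidents)  # (20 + 60 + 10) / 3 = 30.0 minutes
```

Tracking this number before and after a monitoring change is one straightforward way to show whether the change is paying off.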

Level 2: Operational Visibility – Proactive

At Level 2 of monitoring maturity, organizations become more proactive and deliver flexible monitoring and alerting across the entire IT and application suite. In proactive monitoring, you bring in a more service-aware mindset, looking at how different components affect one another and how you can start identifying and fixing problems before they have an impact. At Level 2, you can monitor individual key performance indicators (KPIs) and look into them more closely to identify specific issues. Operational visibility provides an end-to-end understanding of what IT functions drive service delivery, helping to further instill a system-wide mindset.
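KPI monitoring at this level can be pictured, in its simplest form, as comparing current values against agreed thresholds. The sketch below assumes static thresholds. The KPI names, limits and the breached function are hypothetical, and real products typically offer far richer alerting.

```python
def breached(kpis, thresholds):
    """Return the names of KPIs whose value exceeds its threshold, sorted."""
    return sorted(
        name
        for name, value in kpis.items()
        if name in thresholds and value > thresholds[name]
    )


# Hypothetical limits and current readings.
thresholds = {"cpu_pct": 90, "p95_latency_ms": 500, "error_rate_pct": 1.0}
kpis = {"cpu_pct": 72, "p95_latency_ms": 640, "error_rate_pct": 2.5}

alerts = breached(kpis, thresholds)  # ["error_rate_pct", "p95_latency_ms"]
```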

Level 3: Business Insights – Predictive

Level 3 introduces the capability to predict outcomes, using artificial intelligence applied to historical and real-time data to compare the baseline of normal business operations to an unlimited number of potential outcomes. Predictive monitoring is at the core of the concept of observability: the ability not just to monitor your IT stack, but to fully understand it. Level 3 allows you to go beyond monitoring individual KPIs to see how each one affects performance. The obvious benefit of Level 3 is the ability to provide insight to make better operational decisions by predicting what will happen under a given set of circumstances, as well as the ability to model different, more beneficial outcomes.
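To make the idea of comparing against a baseline of normal operations concrete, here is a deliberately simple statistical stand-in: a z-score check of the latest reading against recent history. It is not the AI-based prediction described above, only a sketch of the baseline comparison underneath it. The readings, the is_anomalous function and the limit of 3 standard deviations are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, z_limit=3.0):
    """Flag a reading that sits more than z_limit standard deviations from the baseline."""
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation; needs at least two readings
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > z_limit


# Hypothetical response times in milliseconds: mean 100, sample stdev 2.
history = [100, 102, 98, 101, 99, 100, 103, 97]

normal = is_anomalous(history, 104)  # 2 standard deviations away: False
spike = is_anomalous(history, 120)   # 10 standard deviations away: True
```

A static limit like this ignores seasonality and trends, which is one reason production systems tend to rely on learned, adaptive baselines instead.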

Level 4: Prediction and Improvement – Preventative

At the highest level of IT monitoring maturity, you’re not only able to predict outcomes and provide insight, but also predict and prevent potential issues before they occur. Imagine how much more your team could accomplish and how much more business value you would provide if you eliminated a significant amount of your troubleshooting, root-cause analysis and remediation. A fully mature IT monitoring practice delivers the most value to the rest of the business and to customers.

Getting There With Splunk

Hundreds of highly successful organizations have selected Splunk to help them move from outdated legacy IT monitoring solutions and get the benefits of New IT.

Viasat
Viasat is a global communications company that enables consumers, businesses, governments and militaries worldwide to communicate through high quality, secure, affordable and fast connections on the ground, in the air or at sea.

Viasat first implemented Splunk Enterprise to solve a particular business challenge: understanding the impact of legacy network conditions on residential customers. Viasat also wanted a solution that could ingest varying data sets and address the need for custom data interrogation.

“Splunk remains the best tool in our box to make sure we’re operationalizing our data and making it easy to use.” – Lead Solutions Engineer, Viasat

Their top priority is protecting the customer experience by preventing technology problems from harming service performance, whether the customer is using the internet at home or streaming videos in a jet aircraft. Splunk gives Viasat visibility into the health and key performance indicators of critical IT infrastructure and business services. Viasat uses Splunk’s machine learning capabilities to predict the potential impact of IT maintenance activities and to strengthen cross-team collaboration. When network events do occur, remediation takes five minutes, compared with 20 minutes to an hour or longer in the past.

The lead solutions engineer at Viasat says, “We predict the likelihood of maintenance actions having an adverse effect on the environment. Splunk IT Service Intelligence (ITSI) finds outliers and lets us know when a threshold is in breach. Its predictive algorithms are very effective. Combining that insight with our orchestration and auto-remediation capabilities, we take scripted actions. So, not only are we advising people that their maintenance is likely to have an adverse effect, we’re ahead of the game in preventing impact.”

The benefits extend beyond happier customers to happy executives. “Demonstrating tangible value in business results and resource time saved wins executive buy-in,” he says.

Cox Automotive
Cox Automotive, a subsidiary of Atlanta-based Cox Enterprises, owns many global brands including Manheim, AutoTrader.com and Kelley Blue Book. Manheim conducts dealer-to-dealer car auctions, registering nearly seven million used vehicles annually, and facilitating transactions representing almost $46 billion.

Manheim auctions thousands of cars daily, using more than 850 cameras to broadcast real-time video streams. An audio-video scanner pings and polls devices at each auction lane every 30 to 45 seconds, resulting in massive amounts of data, transferred via network gear, switches and routers.

In the fast-paced, often emotional environment of an auction, failure of any device could lead to highly unsatisfied customers, affecting not only the company’s bottom line but its reputation. When a problem occurred, Manheim’s operations teams couldn’t tell if the disruption occurred across the network or was isolated to a single auction lane. Manheim wanted visibility into uptime and application stability challenges to quickly identify the root cause of problems and fix them before they affected customers, and chose Splunk for its data aggregation strategy.

Now if an incident with a camera, microphone or other device occurs, staff members get an alert within seconds, allowing them to quickly troubleshoot and identify the issue and determine the exact location so that an auction technician can minimize disruption. Moreover, using advanced analytics and machine learning, staff can predict outages and can monitor equipment degradation for proactive replacement.

“With Splunk ITSI, we have been able to reduce the number of incidents at our auctions by 90%,” says Ken Gavranovic, vice president of technology, Cox Automotive. “We have proactive infrastructure monitoring to ensure a consistent level of customer service for interested buyers to bid on cars.”

Cox can’t do business if it doesn’t provide a fast and reliable auction platform for its customers. The move to New IT has given them the ability to do that with full confidence.

Conclusion

In business, the path of least resistance is seldom the path to success. It can be tempting to stay with a legacy IT monitoring solution for as long as it continues to deliver at least a minimal degree of value. But given the pace of change in both IT and business, you can’t afford to wait until you’re forced to act. The benefits of New IT are clear, and becoming clearer every day. Now is the time to investigate the value of advanced IT monitoring for your organization, and get on the right path.

Source: Splunk