Tame Complex Hybrid Cloud Environments with a Business-Oriented Monitoring Strategy

  • December 13, 2019
NTT DATA Services Business Oriented Monitoring Blog

The modern enterprise is becoming more complex. A majority of the organizations I work with are embracing cloud platforms while still retaining on-premise infrastructures. By keeping some workloads on-premise and others in public or private clouds, they are creating complex hybrid IT environments. The challenge for these businesses is how to monitor this new hybrid environment. Public cloud providers offer built-in cloud monitoring tools, such as Amazon CloudWatch, Azure Monitor, Google's Stackdriver and others. However, these tools do not provide end-to-end visibility of business-critical applications running across private, public and hybrid cloud environments.

Adding to this complexity — the rise of microservices and containers
Microservices and containers are rapidly gaining traction as they address scalability, make continuous delivery easier, isolate failures and allow developers to work on a smaller codebase. However, as the number of microservices rises, it can be hard to keep track of them. Since containers are ephemeral, most of the performance data they produce loses its value quickly, which makes monitoring difficult with a traditional infrastructure monitoring tool.

In a hybrid cloud environment, applications might be divided between several data centers and cloud providers, which makes observability into your applications, infrastructure, and network challenging. Without a more in-depth understanding of your workload and network performance, you won't be able to evaluate and optimize your infrastructure.

Event noise has also increased, making it harder for IT Ops teams to prioritize business-critical issues and distracting them to the point of hindering the very people they are supposed to help. Event noise refers to the notifications and alarms (e.g., CPU utilization, memory utilization, end-user response time), delivered by monitoring systems to IT Ops teams showing the health and performance of infrastructure and applications across their IT environment. Organizations require a process and the technology maturity to apply correlation, predictive intelligence, and machine learning, using advanced analytics of events, performance metrics, logs, and knowledgeable resources to reduce the noise and find the root cause.
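As a rough illustration of the simplest form of correlation mentioned above, the sketch below collapses repeated events for the same host and metric into a single incident when they arrive close together in time. The `Event` fields, the severity scale, and the five-minute window are assumptions made for this example, not the behavior of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since epoch
    host: str
    metric: str       # e.g. "cpu_utilization"
    severity: int     # hypothetical scale: higher means more severe


def correlate(events, window_seconds=300):
    """Group events that share host and metric into incidents.

    An event joins the open incident for its (host, metric) pair when it
    arrives within window_seconds of that incident's latest event;
    otherwise it opens a new incident.
    """
    incidents = []
    open_incident = {}  # (host, metric) -> most recent incident for that pair
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.host, event.metric)
        incident = open_incident.get(key)
        if incident is not None and event.timestamp - incident["last_seen"] <= window_seconds:
            incident["count"] += 1
            incident["last_seen"] = event.timestamp
            incident["max_severity"] = max(incident["max_severity"], event.severity)
        else:
            incident = {
                "host": event.host,
                "metric": event.metric,
                "first_seen": event.timestamp,
                "last_seen": event.timestamp,
                "count": 1,
                "max_severity": event.severity,
            }
            incidents.append(incident)
            open_incident[key] = incident
    return incidents
```

With this kind of grouping, an IT Ops team would see one incident carrying a count and a peak severity instead of a stream of near-identical alarms. Real AIOps platforms go much further, for example by correlating across hosts and services.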

With the increase in breaches, an organization needs strong event tracking and tracing capabilities to analyze vast amounts of data from on-premise and cloud sources in order to detect threats, generate prioritized data breach alerts, ensure compliance and optimize security investment.

The need for a business-oriented monitoring approach
Today’s hybrid and multi-cloud IT requires a combination of monitoring techniques to create a modern, full-stack cloud monitoring capability. To remain ahead, I&O leaders must consider a business-oriented strategy that focuses not only on server uptime but also on how to manage and measure the performance of business applications. The end-user experience of on-premise or cloud-hosted applications is now a high priority. End-user transactions should be visible in real time, and the capacity to monitor actual usage and KPIs becomes vital to improving monitoring, costs and delivery.
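To make the idea of a business-oriented KPI more concrete, here is a minimal sketch that measures, per business transaction, the share of end-user requests that both succeeded and finished within a response-time target. The `Transaction` record, the transaction names, and the 2000 ms target are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    name: str           # business transaction, e.g. "checkout"
    response_ms: float  # end-user response time in milliseconds
    succeeded: bool


def kpi_by_transaction(transactions, target_ms=2000.0):
    """Return, per transaction name, the fraction of requests that
    succeeded and completed within target_ms."""
    totals = {}
    good = {}
    for t in transactions:
        totals[t.name] = totals.get(t.name, 0) + 1
        if t.succeeded and t.response_ms <= target_ms:
            good[t.name] = good.get(t.name, 0) + 1
    return {name: good.get(name, 0) / count for name, count in totals.items()}
```

A KPI like this can stay healthy while a server is busy, or drop while every server reports as up, which is the reason to track it alongside infrastructure metrics.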

There is a plethora of monitoring solutions available in the market. Implementing a holistic monitoring solution will help your enterprise move towards Artificial Intelligence for IT Operations (AIOps).

AIOps enables IT teams to be more efficient and effective, reducing human errors and cutting Mean Time to Resolution (MTTR). Some of the most essential capabilities of AIOps are:

  • Big data management
  • Root cause determination and faster resolution time of issues
  • Predictive analytics used with continuous Machine Learning

However, deployments are often tricky and must be approached gradually. To stay competitive, companies need to understand and implement technologies not just because they are new but because they will work in the best interest of the organization. The most important thing is to know where the company stands today and where it wants to be. The approach differs depending on whether a company has no monitoring in place, already has a robust platform, or is trying to get to the next level.

    Getting started
    Before identifying an appropriate business-oriented monitoring solution, companies must look at the big picture of their environment.

    1. Determine monitoring maturity and define the “As-Is” and the “To-Be” states: A company needs to understand the business value of the technology in place to help define what it needs to monitor and how monitoring will be aligned to business KPIs. An “As-Is” analysis will help outline the current state of monitoring, including any gaps or issues the company faces with its current mode of operation. Here are some representative questions companies must answer to start a monitoring maturity assessment:

      Does your company:
      • Understand the business value of your IT estate?
      • Know how data flows affect the performance of key systems?
      • Monitor the right things at the right times?
      • Have baselines defined for each class of equipment?
      • Manage thresholds and iterations in advance of notifications?
      • Alert the right people by having an incident process in place?

      Have you:
      • Analyzed notification requirements like Paging/Email vs. ITSM Ticketing?
      • Analyzed your need for SLAs or non-SLA driven processes?
      • Chosen the right tool with enough capabilities to expand your scope and focus?
      • Considered on-premise vs. Cloud-based infrastructure for your monitoring?

    2. Decide whether to deploy basic or advanced monitoring, based on your organization’s maturity: If most of the answers to the above questions are affirmative, the company is likely ready to adopt advanced monitoring solutions built with AIOps technology. If not, basic monitoring should come first. Once the “As-Is” state has been mapped out, companies can continue with the “To-Be” state, working with stakeholders to understand and define what needs urgent improvement.

    3. Determine your business requirements and create a tool analysis: Make sure your business requirements are fulfilled. Meet with your technical architects to discuss the requirements and the functionality deemed necessary, and complete a fit-gap analysis between the requirements and the current state.
      • Make a comparison of your tools: Based on your environment as it is today and as you expect it to be in the future, you might also include a comparison of the following product capabilities: Server OS, Network, Storage, Databases, Hypervisors, Cloud, Integration, Service and Support, and Cost.
      • Select the proper tools and start a Proof of Concept to prove the feasibility and power of the selected tools.
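One possible way to turn the tool comparison and fit-gap analysis into numbers is a weighted scoring matrix, sketched below. The capability names come from the list above; the weights, the 0 to 5 rating scale, and the tool names in the usage note are hypothetical and would come from your own requirements workshop.

```python
def weighted_score(weights, ratings):
    """Weighted average of a tool's ratings; a capability the tool was
    not rated on counts as 0."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * ratings.get(cap, 0) for cap, w in weights.items())
    return weighted_sum / total_weight


def gaps(weights, ratings, minimum=3):
    """Capabilities that matter (weight > 0) but are rated below minimum."""
    return [cap for cap, w in weights.items()
            if w > 0 and ratings.get(cap, 0) < minimum]


def rank_tools(weights, tools):
    """Return (tool_name, score) pairs sorted from best to worst fit."""
    scored = [(name, weighted_score(weights, r)) for name, r in tools.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, calling `rank_tools({"Cloud": 3, "Network": 2, "Cost": 1}, {"tool_a": {...}, "tool_b": {...}})` with your own ratings gives a shortlist for the Proof of Concept, while `gaps` lists the capabilities where a candidate falls short of the minimum you set.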

    Classic high-availability and disaster-recovery solutions are aimed at solving problems after they have appeared, and standard monitoring helps you react faster and minimize the ill effects. Proactive monitoring goes a step further: it addresses the conditions that lead to an event and prevents it from happening.
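A small example of what proactive monitoring can mean in practice: instead of alerting when a resource is already exhausted, fit a trend to recent samples and estimate how long it will take to reach a threshold. The sketch below uses an ordinary least-squares line; the hourly sampling and the 90 percent threshold in the usage note are assumptions for illustration, and real workloads are rarely this linear.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours from the latest sample until a fitted linear trend
    reaches threshold.

    samples is a list of (hour, value) pairs. Returns None when the trend
    is flat or decreasing, and 0.0 when the fitted line has already
    reached the threshold by the time of the latest sample.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward trend, nothing to forecast
    intercept = mean_y - slope * mean_x
    crossing_hour = (threshold - intercept) / slope
    latest_hour = max(x for x, _ in samples)
    return max(0.0, crossing_hour - latest_hour)
```

Feeding this function recent disk-usage percentages with a threshold of 90, for instance, yields a lead time that can drive a ticket or an automated cleanup before users are affected.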

    Business-Oriented Monitoring is like insurance for managing the complexity of a hybrid IT environment. Not only does it provide catastrophic coverage, but it also provides wellness checks, often allowing companies to take preventative measures well in advance of an outage.

    Want to build a flexible, business-aligned infrastructure? Check out NTT DATA’s end-to-end data center services.


    Bogdan Ionut Buruiana

    Bogdan Ionut Buruiana is a Senior Technical Consultant in the Infrastructure, Cloud and Security business unit. He has more than 15 years of IT experience and has implemented many automation and monitoring projects. Bogdan has intimate knowledge of the advanced monitoring, cloud and automation tools used in complex IT environments.
