DevOps teams to benefit from greater AIOps insights

OpsRamp, an AIOps software-as-a-service (SaaS) platform, has announced the release of OpsQ Observed Mode, which is designed to build confidence in machine learning models for IT event and performance analysis.

The Summer 2019 Release also introduces automated alert suppression to reduce the time staff spend on first response to alerts, continuous learning-based alert escalation using live event data, and new infrastructure monitoring capabilities for cloud native environments.

According to OpsRamp’s 2019 State of AIOps report, 67% of respondents have concerns about the relevance and reliability of the insights delivered by artificial intelligence for IT operations (AIOps) tools.

OpsQ Observed Mode enables IT teams to assess the accuracy of machine-learning-driven correlation decisions in preview mode, before those decisions take effect in production.

Tim Hebert, chief managed services officer of Carousel Industries, said: “We already use the OpsQ event management engine to reduce alert storms from 200,000 raw events per month down to a more manageable 10,000 incidents per month.

“The OpsRamp Summer Release allows our infrastructure teams to understand the alert suppression capabilities of inference models before we commit to them, and that’s tremendously beneficial in our event management workflow.”

The OpsRamp Summer 2019 release includes:

  • Service-Centric AIOps: OpsQ is OpsRamp’s intelligent event management, alert correlation, and remediation solution. New OpsQ capabilities help IT teams prioritise incidents faster and shorten mean time to resolution (MTTR) for dynamic infrastructure workloads. They include:
      • OpsQ Observed Mode: OpsQ Observed Mode helps incident management teams assess the accuracy of the OpsRamp machine learning algorithms in a live production environment before they take effect. Observed Mode creates shadow inferences that show the alert correlation decisions OpsQ would have made if enabled.
      • Learning-Based Auto-Alert Suppression: OpsQ looks for recurring alert patterns in production environments and suppresses those alerts that occur at a predictable cadence. OpsQ uses seasonality-based and attribute-based auto-alert suppression techniques as a first-response mechanism so that incident responders no longer have to acknowledge, process, and triage every alert that they receive.
      • Automatic Resource Creation from Third-Party Events: OpsQ can now auto-extract metadata for resources managed by other tools and use this information to automatically contextualise future alerts from these resources.
      • Continuous Learning for Alert Escalation: Alert escalation policies support a continuous learning option for auto-incident creation. The OpsRamp platform continuously re-trains its machine learning models using live alert data, adapting to dynamic environments.
  • Service and Topology Maps: The Summer 2019 Release introduces new impact visibility and service context features that deliver dynamic relationship data for public cloud services and actionable insights for understanding cross-site interconnections.
      • Cloud Topology for AWS: The new AWS topology map shows dependency information for cloud resources such as AWS EC2, VPC, RDS, or ELB instances so that DevOps teams can keep track of all the different moving parts in their public cloud estate.
      • Cross-Site Connection Topology: OpsRamp network topology maps now incorporate routing layer relationships (BGP and OSPF) across WAN links.
  • Cloud Native Discovery and Monitoring: DevOps and site reliability engineering (SRE) teams can now monitor popular open source applications used in cloud native environments and access relevant performance insights for Mesosphere and Azure Stack in the OpsRamp platform.
      • Out-of-the-Box Kubernetes Dashboards: OpsRamp can automatically create performance management dashboards for Kubernetes environments. IT teams can gain instant visibility into the health of containerised deployments by tracking cluster-, pod-, and node-level metrics.
      • Expanded Application Monitoring: OpsRamp now provides agentless monitoring for commonly used applications (ActiveMQ, Apache Spark, Apache Solr, CockroachDB, Couchbase, Apache CouchDB, Elasticsearch, Fluentd, Neo4j Graph Platform, RabbitMQ) within cloud and cloud native stacks.
      • Mesosphere: OpsRamp can now discover and monitor Mesosphere-based cloud native environments. The integration captures performance metrics for Mesos master and agent nodes that help optimise and scale modern enterprise apps built on dynamic infrastructure.
      • Azure Stack: OpsRamp can discover and monitor network connections, virtual networks and load balancers in an Azure Stack environment. Cloud admins can analyse the availability and performance of their hybrid infrastructure in Azure Stack through the integration.
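
To make the idea of suppressing alerts that recur "at a predictable cadence" and previewing the result concrete, here is a minimal Python sketch. It is not OpsRamp's algorithm, which the announcement does not describe. It assumes a simple heuristic (near-constant gaps between occurrences of the same alert), and the function names, thresholds, and alert signatures are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_predictable_alerts(alerts, min_occurrences=4, max_jitter=0.1):
    """Return alert signatures whose arrivals are nearly evenly spaced.

    alerts: iterable of (signature, timestamp_seconds) pairs.
    min_occurrences and max_jitter are illustrative thresholds (assumptions).
    """
    times_by_signature = defaultdict(list)
    for signature, timestamp in alerts:
        times_by_signature[signature].append(timestamp)

    predictable = set()
    for signature, times in times_by_signature.items():
        if len(times) < min_occurrences:
            continue  # too few samples to call it a pattern
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        average_gap = mean(gaps)
        # Relative spread of the gaps: small values mean a steady cadence.
        if average_gap > 0 and pstdev(gaps) / average_gap <= max_jitter:
            predictable.add(signature)
    return predictable


def shadow_decisions(alerts, predictable):
    """Label each alert with the action that *would* be taken, without acting."""
    return [
        (signature, timestamp,
         "would_suppress" if signature in predictable else "would_pass")
        for signature, timestamp in alerts
    ]


if __name__ == "__main__":
    sample = [("backup-cpu", t) for t in (0, 3600, 7200, 10800)]
    sample += [("disk-full", t) for t in (100, 5000, 5200, 9000)]
    for decision in shadow_decisions(sample, find_predictable_alerts(sample)):
        print(decision)
```

In this sketch the hourly "backup-cpu" alert is flagged for suppression while the irregular "disk-full" alert passes through. Because the second function only labels alerts, a team could review the labels before enabling any real suppression, which is the spirit of a preview mode.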

Mahesh Ramachandran, VP of product management for OpsRamp, said: “Our customers have told us that they’d like to see how AIOps inferences proactively detect, diagnose, and address service continuity issues. OpsQ Observed Mode is a no-risk option for IT operations and DevOps teams to assess the accuracy and power of machine intelligence-driven event management.

“The Summer 2019 Release provides modern IT infrastructure teams the real-time intelligence to fix visibility gaps in their hybrid and multi-cloud environments.”
