Unified operational monitoring with Cluster Insights
Amazon OpenSearch Service now includes Cluster Insights, a monitoring solution that provides comprehensive operational visibility of your clusters through a single dashboard. This eliminates the complexity of having to analyze and correlate various logs and metrics to identify potential risks to cluster availability or performance. The solution automates the consolidation of critical operational data across nodes, indices, and shards, transforming complex troubleshooting into a streamlined process. You can detect issues like large shards and low disk watermarks, view detailed metrics at the node, index, and shard levels, and access security and resiliency best practices.
Note
Cluster Insights is available through OpenSearch Service UI at no additional cost to all users running OpenSearch version 2.17 or later.
Benefits
Proactive monitoring - Monitor cluster health with detailed performance metrics across all components, from individual nodes and indices to shards and search queries.
Unified visibility - Consolidate monitoring data into a single dashboard.
Actionable recommendations - Get step-by-step guidance for issue resolution.
Comprehensive coverage - Monitor security, stability, and resiliency across your OpenSearch clusters.
Query optimization - Identify resource-intensive queries and optimize performance.
With Cluster Insights, you can maintain optimal cluster performance, reduce operational overhead, and ensure consistent best practices across your OpenSearch clusters.
Create and configure an OpenSearch application to view Cluster Insights
You can view insights for a specific OpenSearch Service cluster through the OpenSearch UI (Dashboards). In OpenSearch UI, an application is simply an organizational construct like a folder. Each application can connect to and display insights for multiple OpenSearch Service clusters.
Note
Accessing Cluster Insights requires an administrative role in the OpenSearch UI application.
Create and configure an application to view Cluster Insights
- Open the OpenSearch Service console at https://console.aws.amazon.com/aos/home.
- Choose OpenSearch UI (Dashboards) from the left navigation.
- Complete the following steps to create and configure an application:
- After you complete the above two steps, you can view Cluster Insights in the OpenSearch UI dashboard under Settings > Data administrator > Cluster Insights. The Settings icon is located at the bottom left of the OpenSearch UI screen.
Screen-1: Access Data Administrator from OpenSearch UI
Screen-2: Cluster Insights under the Manage data section
Understanding Cluster Insights
This section describes the various insights available in Cluster Insights.
Overview Dashboard
The Cluster Insights Overview page, as shown in the following screenshot, provides a high-level view of your cluster health at the application level and comprises the following sections:
Screen-3: Cluster Insights landing page in OpenSearch UI application.
Current cluster status
A donut chart displays your cluster health status:
Green - All primary shards and replicas are allocated to nodes
Yellow - All primary shards are allocated, but some replicas aren't
Red - At least one primary shard is not allocated to any node
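The three status rules above can be expressed as a small decision function. The following Python sketch is illustrative only: the function name and its two count parameters are assumptions made for this example, not part of Cluster Insights or any OpenSearch API.

```python
def cluster_status(unassigned_primaries: int, unassigned_replicas: int) -> str:
    """Map shard allocation counts to a health color.

    Hypothetical helper: the parameter names are assumptions for this sketch.
    """
    if unassigned_primaries < 0 or unassigned_replicas < 0:
        raise ValueError("shard counts must be non-negative")
    # Red: at least one primary shard is not allocated to any node.
    if unassigned_primaries > 0:
        return "red"
    # Yellow: all primaries are allocated, but some replicas aren't.
    if unassigned_replicas > 0:
        return "yellow"
    # Green: all primary shards and replicas are allocated.
    return "green"
```

The primary check comes first because a red status takes precedence even when replicas are also unassigned.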
Insights trend
The trend graph tracks issue patterns over the past 30 days, helping you identify emerging problems and monitor resolution progress.
Current open insights
A count of open insights from the last 30 days, organized by severity.
OpenSearch Service Clusters
This section lists all your OpenSearch clusters with key statistics including node count, shard count, and active queries.
Top insights by severity
You can review insights across all domains in your application. This section prioritizes issues that need immediate attention (Critical and High severity). Each insight includes a description and specific recommendations, which can help you focus on critical issues first.
Insight details
Each insight in the Top insights by severity section is interactive and provides detailed analysis. For example, when you choose the Large Shard Size insight:
You see how many shards exceed the threshold and which indices are affected.
A resource map identifies each oversized shard with its index, ID, and current size.
The Recommendations tab provides step-by-step remediation guidance.
The History tab displays a timeline of resource remediation actions.
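To make the Large Shard Size example concrete, the following Python sketch shows one way such a check could be computed from a list of shard sizes. It is not how Cluster Insights is implemented. The `Shard` type, the `find_large_shards` function, and the 50 GiB threshold are all assumptions for illustration; this page does not state the threshold the insight actually uses.

```python
from dataclasses import dataclass

GIB = 1024 ** 3
# Hypothetical threshold chosen for this sketch only.
LARGE_SHARD_BYTES = 50 * GIB


@dataclass(frozen=True)
class Shard:
    index: str       # name of the index the shard belongs to
    shard_id: int    # shard number within the index
    size_bytes: int  # current on-disk size


def find_large_shards(shards, threshold=LARGE_SHARD_BYTES):
    """Return (oversized shards sorted largest first, affected index names)."""
    oversized = sorted(
        (s for s in shards if s.size_bytes > threshold),
        key=lambda s: s.size_bytes,
        reverse=True,
    )
    affected_indices = sorted({s.index for s in oversized})
    return oversized, affected_indices
```

The two return values correspond to the details described above: the resource list of oversized shards (index, ID, size) and the set of affected indices.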
Cluster Details
When you select a specific cluster in the OpenSearch Service Clusters section, OpenSearch displays insights for that cluster across the following tabs: Cluster health, Nodes view, Index view, Shard view, and Query view. The Cluster health tab displays the following information:
Overview
Key information includes cluster health, shard count, node count, index count, and document statistics.
Configuration best practices
Donut charts show compliance with recommended settings for resilience and security.
Insights
A table lists recent insights generated for the cluster, with the same detailed breakdown and remediation guidance available from the overview page.
Screen-4: Cluster Health overview provides key metrics, best practices, and Insights
When you choose an insight, you can see its details, impacted resources, and recommendations. You can also see the history of fixed resources.
Screen-5: Insight details, including recommendations and a historical timeline.
Metrics Section
Interactive charts in this section display the following cluster metrics:
Overall cluster health metrics such as Cluster Status, Write status, and searchable documents
KPIs (Key Performance Indicators) like Indexing and Search rates and latencies
Resource Utilization metrics like JVM and CPU utilization
Node, Index, and Shard views
The Node, Index, and Shard views use OpenSearch stats to provide detailed visibility into cluster operations. You can view:
Real-time metrics such as CPU utilization and JVM memory pressure
Search and indexing performance data
Resource hotspots across cluster components
Granular node-level diagnostics
Top shards by allocated heap
Screen-6: Node, Index, and Shard level metrics
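As an illustration of what a resource hotspot means at the node level, the following Python sketch flags nodes whose CPU or JVM heap usage crosses a limit. It assumes input shaped like the response of the OpenSearch `_nodes/stats` API (`os.cpu.percent` and `jvm.mem.heap_used_percent` under each node). The function name and the default limits of 80% and 75% are hypothetical values for this example, not thresholds used by Cluster Insights.

```python
def find_hotspots(node_stats, cpu_limit=80, heap_limit=75):
    """Return a list of (node name, reasons) for nodes over either limit.

    The limits are hypothetical defaults chosen for this sketch.
    """
    hotspots = []
    for node_id, stats in node_stats.get("nodes", {}).items():
        cpu = stats.get("os", {}).get("cpu", {}).get("percent")
        heap = stats.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        reasons = []
        if cpu is not None and cpu >= cpu_limit:
            reasons.append(f"cpu {cpu}%")
        if heap is not None and heap >= heap_limit:
            reasons.append(f"heap {heap}%")
        if reasons:
            # Fall back to the node ID when the stats carry no name.
            hotspots.append((stats.get("name", node_id), reasons))
    return hotspots
```

Missing fields are skipped rather than treated as zero, so a node with partial stats is never flagged on data it did not report.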
Query View
Note
The Query View feature is supported on OpenSearch version 2.19 or later.
The Query View page helps you monitor resource-intensive queries with:
Live dashboards
View execution stats, CPU and memory usage, and completion progress for every query.
Top N queries
A ranked table shows the most significant queries with details including:
Query count
Latency, CPU, and memory usage
Search type and coordinator node
Target indices and shard count
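The idea behind a Top N ranking can be sketched in a few lines of Python. This is a simplified illustration, not the Query View implementation: the `QueryRecord` type, its field names, and the `top_n` function are assumptions for this example and cover only the latency, CPU, and memory metrics listed above.

```python
import heapq
from dataclasses import dataclass

# Metrics this sketch can rank by (hypothetical field names).
RANKABLE_METRICS = ("latency_ms", "cpu_ms", "memory_bytes")


@dataclass(frozen=True)
class QueryRecord:
    query_id: str
    latency_ms: float
    cpu_ms: float
    memory_bytes: int


def top_n(records, n, metric="latency_ms"):
    """Return the n records with the highest value of the chosen metric."""
    if metric not in RANKABLE_METRICS:
        raise ValueError(f"unknown metric: {metric}")
    # heapq.nlargest avoids sorting the full list when n is small.
    return heapq.nlargest(n, records, key=lambda r: getattr(r, metric))
```

Ranking by a different metric only requires changing the `metric` argument, which mirrors how a ranked table can be re-sorted by latency, CPU, or memory.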
Query details
Double-click any query to see:
Exact query payload and execution steps
Latency breakdown for each phase (expand, query, fetch)
Optimization recommendations
Screen-7: In-flight live view. You can also view Top-N queries