Modern distributed systems generate vast amounts of log data that can be challenging to analyze manually. In this blog post, we’ll explore how you can integrate Amazon Q CLI with OpenSearch’s advanced analysis tools to transform complex log investigations into natural language queries. We’ll demonstrate the integration of two powerful OpenSearch agent tools—the Log Pattern Analysis Tool and Data Distribution Tool—through the Model Context Protocol (MCP), showing how this combination enables streamlined command-line diagnostics and enhanced system troubleshooting.
Through a real-world OpenTelemetry Demo scenario investigating payment failures, you’ll learn how to set up the integration, configure the necessary components, and use these tools to perform automated pattern recognition and statistical analysis. By the end of this post, you’ll understand how to use conversational commands to quickly identify root causes in distributed system logs, significantly reducing time to resolution for critical issues.
Log Pattern Analysis tool
The Log Pattern Analysis tool is an OpenSearch agent tool that automates log analysis through multiple analysis modes. It performs differential pattern analysis between baseline and problem periods and analyzes log sequences using trace correlation. The tool provides log insights by automatically extracting error patterns and keywords from log data to accelerate troubleshooting.
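Because the tool runs as an OpenSearch agent tool, one direct way to exercise it is through the standard ml-commons agent-execute API. The following Python sketch assumes an agent already wraps the tool (registration is sketched in the next section); the parameter names are illustrative assumptions rather than the tool's confirmed contract.

import requests

# Hedged sketch: call an ml-commons agent that wraps the Log Pattern Analysis
# tool. The _execute endpoint is standard ml-commons; the parameter names
# below are illustrative assumptions.
OPENSEARCH_URL = "https://<your-opensearch-cluster-endpoint>"
AGENT_ID = "<your-log-pattern-agent-id>"

response = requests.post(
    f"{OPENSEARCH_URL}/_plugins/_ml/agents/{AGENT_ID}/_execute",
    auth=("<your-opensearch-username>", "<your-opensearch-password>"),
    json={
        "parameters": {
            "index": "otel-logs",                         # assumed log index name
            "timeField": "time",                          # assumed timestamp field
            "baseTimeRangeStart": "2025-01-01T00:00:00Z",       # healthy baseline window
            "baseTimeRangeEnd": "2025-01-01T01:00:00Z",
            "selectionTimeRangeStart": "2025-01-01T01:00:00Z",  # problem window
            "selectionTimeRangeEnd": "2025-01-01T02:00:00Z",
        }
    },
)
print(response.json())  # an inference_results payload like the samples later in this post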
Data Distribution tool
The Data Distribution tool is an OpenSearch agent tool that analyzes data distribution patterns within datasets and compares distributions between different time periods. It supports both single dataset analysis and comparative analysis to identify significant changes in field value distributions, helping detect anomalies, trends, and data quality issues.
The tool generates statistical summaries, including value frequencies, percentiles, and distribution metrics, to help understand data characteristics and identify potential data quality issues.
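Both tools plug into the same ml-commons agent framework. As a hedged sketch (the _register endpoint and the flow agent type are standard ml-commons; any per-tool configuration is omitted here and depends on your tool versions), registering an agent that exposes both looks roughly like this:

import requests

# Hedged sketch: register a flow agent exposing both analysis tools.
response = requests.post(
    "https://<your-opensearch-cluster-endpoint>/_plugins/_ml/agents/_register",
    auth=("<your-opensearch-username>", "<your-opensearch-password>"),
    json={
        "name": "log-analysis-agent",
        "type": "flow",
        "tools": [
            {"type": "LogPatternAnalysisTool"},
            {"type": "DataDistributionTool"},
        ],
    },
)
print(response.json())  # contains the agent_id used for later _execute calls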
Amazon Q CLI integration
Amazon Q CLI can seamlessly integrate with these OpenSearch analysis tools through MCP. By configuring the OpenSearch MCP server, you can access both the Log Pattern Analysis tool and Data Distribution tool directly from your command-line interface.
This integration enables natural language queries for log analysis and data distribution insights, making complex diagnostic tasks accessible through simple conversational commands.
Implementation using opensearch-mcp-server-py
This integration is built on the opensearch-mcp-server-py project, which provides a Python-based MCP server for OpenSearch. To enable the Log Pattern Analysis tool and Data Distribution tool integration, you need to clone this project and add custom integration code for both tools. This extends the server’s capabilities to support these advanced analysis features.
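The authoritative integration code lives in the opensearch-mcp-server-py repository referenced in the workflow below. As a simplified, standalone sketch of the idea, the MCP Python SDK's FastMCP helper can expose a tool that proxies requests to the agent-execute API shown earlier; every name here (environment variables, parameters, tool name) is illustrative, not the project's actual registry API.

import os

import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("opensearch-skill-tools")

@mcp.tool()
def log_pattern_analysis(index: str, baseline_start: str, baseline_end: str,
                         selection_start: str, selection_end: str) -> dict:
    """Proxy a Log Pattern Analysis request to an ml-commons agent (illustrative)."""
    url = (f"{os.environ['OPENSEARCH_URL']}/_plugins/_ml/agents/"
           f"{os.environ['LOG_PATTERN_AGENT_ID']}/_execute")  # hypothetical env var
    response = requests.post(
        url,
        auth=(os.environ["OPENSEARCH_USERNAME"], os.environ["OPENSEARCH_PASSWORD"]),
        json={"parameters": {
            "index": index,
            "baseTimeRangeStart": baseline_start,
            "baseTimeRangeEnd": baseline_end,
            "selectionTimeRangeStart": selection_start,
            "selectionTimeRangeEnd": selection_end,
        }},
    )
    return response.json()

if __name__ == "__main__":
    mcp.run(transport="streamable-http")  # serve the MCP endpoint over HTTP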
Complete integration workflow
To set up Amazon Q CLI with OpenSearch MCP integration, follow these steps:
- Clone the MCP server repository:
git clone https://github.com/opensearch-project/opensearch-mcp-server-py.git
cd opensearch-mcp-server-py
- Add tool integration code:
- Implement Log Pattern Analysis tool integration in the MCP server.
- Implement Data Distribution tool integration in the MCP server.
- Register both tools with the MCP server’s tool registry.
- For a complete implementation example, see the demo at opensearch-mcp-server-py/integrate-skill-tool.
- Start the MCP server:
OPENSEARCH_URL="<your-opensearch-cluster-endpoint>" \
OPENSEARCH_USERNAME="<your-opensearch-username>" \
OPENSEARCH_PASSWORD="<your-opensearch-password>" \
python -m src.mcp_server_opensearch --transport stream --host 0.0.0.0 --port 9900
This command starts the MCP server on localhost:9900. When using this configuration, set the url field in your configuration file to "http://localhost:9900/mcp".
- Configure Amazon Q CLI:
- Open your Amazon Q CLI configuration file.
- Add the MCP server configuration.
- Submit natural language queries:
- Launch Amazon Q CLI.
- Query your OpenSearch data using conversational commands.
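For example, once the MCP server from step 3 is running and registered in your configuration (shown in the next section), a session starts with:

q chat

From there, a conversational prompt such as "Compare log patterns in the otel-logs index between the last hour and the previous day, and explain why checkout payments are failing" (the index name is illustrative) invokes the tools and returns results like those shown later in this post.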
MCP configuration example
To configure the OpenSearch MCP server for Amazon Q CLI, add the following configuration:
{
  "mcpServers": {
    "opensearch-mcp-server": {
      "type": "http",
      "url": "<your-mcp-server-url>",
      "env": {
        "OPENSEARCH_URL": "<your-opensearch-cluster-endpoint>",
        "OPENSEARCH_USERNAME": "<your-opensearch-username>",
        "OPENSEARCH_PASSWORD": "<your-opensearch-password>"
      }
    }
  }
}
Log Pattern Analysis tool results
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": {
            "logInsights": [
              {
                "pattern": "*** METRIC EVENT: Recording checkout.result=<*> status=failure error=could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold ***",
                "count": 63.0,
                "sampleLogs": [
                  "*** METRIC EVENT: Recording checkout.result=0, status=failure, error=could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold ***",
                  "*** METRIC EVENT: Recording checkout.result=0, status=failure, error=could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold ***"
                ]
              },
              {
                "pattern": "could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold",
                "count": 63.0,
                "sampleLogs": [
                  "could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold",
                  "could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold"
                ]
              },
              {
                "pattern": "failed to charge card: could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold",
                "count": 63.0,
                "sampleLogs": [
                  "failed to charge card: could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold",
                  "failed to charge card: could not charge the card: rpc error: code = Unknown desc = Payment request failed. Invalid token. app.loyalty.level=gold"
                ]
              },
              {
                "pattern": "*** METRIC EVENT: Recording checkout.result=<*> status=failure error=failed to prepare order: failed to get product #\"<*>Z\" ***",
                "count": 19.0,
                "sampleLogs": [
                  "*** METRIC EVENT: Recording checkout.result=0, status=failure, error=failed to prepare order: failed to get product #\"OLJCESPC7Z\" ***",
                  "*** METRIC EVENT: Recording checkout.result=0, status=failure, error=failed to prepare order: failed to get product #\"OLJCESPC7Z\" ***"
                ]
              },
              {
                "pattern": "failed to get product #\"<*>Z\"",
                "count": 19.0,
                "sampleLogs": [
                  "failed to get product #\"OLJCESPC7Z\"",
                  "failed to get product #\"OLJCESPC7Z\""
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}
Data Distribution tool results
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "result": {
            "singleAnalysis": [
              {
                "field": "droppedAttributesCount",
                "divergence": 1.0,
                "topChanges": [
                  {
                    "value": "0",
                    "selectionPercentage": 1.0
                  }
                ]
              },
              {
                "field": "severityNumber",
                "divergence": 0.689,
                "topChanges": [
                  {
                    "value": "9",
                    "selectionPercentage": 0.69
                  },
                  {
                    "value": "0",
                    "selectionPercentage": 0.31
                  }
                ]
              },
              {
                "field": "severityText",
                "divergence": 0.609,
                "topChanges": [
                  {
                    "value": "INFO",
                    "selectionPercentage": 0.61
                  },
                  {
                    "value": "",
                    "selectionPercentage": 0.18
                  },
                  {
                    "value": "info",
                    "selectionPercentage": 0.12
                  },
                  {
                    "value": "Information",
                    "selectionPercentage": 0.08
                  },
                  {
                    "value": "error",
                    "selectionPercentage": 0.01
                  }
                ]
              },
              {
                "field": "flags",
                "divergence": 0.594,
                "topChanges": [
                  {
                    "value": "0",
                    "selectionPercentage": 0.59
                  },
                  {
                    "value": "1",
                    "selectionPercentage": 0.41
                  }
                ]
              },
              {
                "field": "schemaUrl",
                "divergence": 0.573,
                "topChanges": [
                  {
                    "value": "",
                    "selectionPercentage": 0.57
                  },
                  {
                    "value": "https://opentelemetry.io/schemas/1.24.0",
                    "selectionPercentage": 0.25
                  },
                  {
                    "value": "https://opentelemetry.io/schemas/1.34.0",
                    "selectionPercentage": 0.18
                  }
                ]
              },
              {
                "field": "serviceName",
                "divergence": 0.223,
                "topChanges": [
                  {
                    "value": "kafka",
                    "selectionPercentage": 0.22
                  },
                  {
                    "value": "product-catalog",
                    "selectionPercentage": 0.18
                  },
                  {
                    "value": "frontend-proxy",
                    "selectionPercentage": 0.15
                  },
                  {
                    "value": "checkout",
                    "selectionPercentage": 0.13
                  },
                  {
                    "value": "load-generator",
                    "selectionPercentage": 0.11
                  },
                  {
                    "value": "cart",
                    "selectionPercentage": 0.08
                  },
                  {
                    "value": "shipping",
                    "selectionPercentage": 0.04
                  },
                  {
                    "value": "ad",
                    "selectionPercentage": 0.03
                  },
                  {
                    "value": "currency",
                    "selectionPercentage": 0.02
                  },
                  {
                    "value": "recommendation",
                    "selectionPercentage": 0.01
                  }
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}
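Because the results are plain JSON, they are also easy to post-process outside of Amazon Q CLI. Here is a minimal sketch, assuming the tool output above has been saved to a local file named distribution_results.json (a hypothetical name):

import json

# Load the saved Data Distribution tool output and rank fields by divergence.
with open("distribution_results.json") as f:
    results = json.load(f)

fields = results["inference_results"][0]["output"][0]["result"]["singleAnalysis"]
for field in sorted(fields, key=lambda x: x["divergence"], reverse=True):
    print(f'{field["divergence"]:.3f}  {field["field"]}')

# Expected ranking from the sample above:
# 1.000  droppedAttributesCount
# 0.689  severityNumber
# 0.609  severityText
# 0.594  flags
# 0.573  schemaUrl
# 0.223  serviceName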
Log Pattern Analysis tool contribution
The Log Pattern Analysis tool contributed to identifying the root cause of payment failures in the following ways.
Core contribution: Automated error pattern recognition
The Log Pattern Analysis tool played a crucial pattern identification role in this payment failure investigation:
- Primary failure pattern identification (63 occurrences):
  - Automatically identified three related patterns for payment token validation failures.
  - All patterns pointed to the same root cause: Payment request failed. Invalid token.
  - Specifically identified that failures are associated with app.loyalty.level=gold.
- Secondary failure pattern identification (19 occurrences):
  - Identified two product-catalog-related failure patterns.
  - Pattern: failed to get product #"<*>Z".
  - Provided a specific product ID example: OLJCESPC7Z.
- Pattern classification and quantification:
- Automatically grouped related error messages into logical failure categories.
- Provided exact occurrence count statistics.
- Delivered actual log samples for each pattern for validation.
Value delivered: This tool eliminated the need for manual pattern recognition, automatically discovering that payment token validation failures occur 3.3 times more frequently than product catalog issues, clearly establishing primary and secondary failure mode priorities.
Data Distribution tool contribution
The Data Distribution tool contributed to identifying the root cause of payment failures in the following ways.
Core contribution: Statistical analysis and context provision
The Data Distribution tool provided critical statistical background and field distribution analysis for the investigation:
- Service distribution analysis (divergence: 0.223):
  - kafka: 22% (highest log volume service)
  - product-catalog: 18% (secondary failure source)
  - frontend-proxy: 15% (user-facing errors)
  - checkout: 13% (primary failure point)
  - load-generator: 11%
  - Other services: 21%
- Severity level analysis (divergence: 0.609):
  - INFO level: 81% total (same severity, different service formats)
    - INFO: 61% (Java-based services)
    - info: 12% (Python-based services)
    - Information: 8% (.NET-based services)
  - error level: 1% (concentrated failures)
  - Empty values: 18%
- Field anomaly detection:
  - droppedAttributesCount: divergence = 1.0 (complete anomaly)
  - severityNumber: divergence = 0.689 (high anomaly)
  - flags: divergence = 0.594 (moderate anomaly)
  - schemaUrl: divergence = 0.573 (moderate anomaly)
Value delivered: This tool revealed that despite the checkout service representing only 13% of total logs, it contains the highest concentration of critical failures. The severity distribution showed that error-level logs are rare (1%), making the 63 payment failures statistically significant. This quantitative context helped prioritize the checkout service investigation over higher-volume but less critical services like kafka.
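For intuition about the divergence scores above, here is a hedged sketch of one way such a score can be computed: the total variation distance between a field's value distributions in two datasets. The actual metric used by the Data Distribution tool may differ; this only illustrates the idea.

from collections import Counter

def field_divergence(baseline_values, selection_values):
    # Total variation distance between two empirical value distributions:
    # 0.0 means identical distributions, 1.0 means completely disjoint ones.
    base, sel = Counter(baseline_values), Counter(selection_values)
    base_total = sum(base.values()) or 1
    sel_total = sum(sel.values()) or 1
    return 0.5 * sum(
        abs(sel[v] / sel_total - base[v] / base_total)
        for v in set(base) | set(sel)
    )

print(field_divergence(["1", "1", "2"], ["0", "0", "0"]))  # 1.0: complete shift
print(field_divergence(["a", "b"], ["a", "b"]))            # 0.0: no change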
Both tools working together
The combination of both tools created a synergistic effect that enhanced the investigation’s effectiveness:
- Qualitative + quantitative analysis: The Log Pattern Analysis tool provided specific error patterns, while the Data Distribution tool provided statistical validation.
- Priority guidance: Combined analysis showed the checkout service had disproportionately high failure impact despite lower log volume.
- Root cause validation: Both tools confirmed payment token validation as the primary issue, with the product catalog as secondary.
- Actionable insights: The tools worked together to provide specific error messages and statistical significance, supporting clear remediation recommendations.
This investigation demonstrates Amazon Q CLI’s orchestration of multiple OpenSearch tools: ListIndexTool and IndexMappingTool for data discovery, SearchIndexTool for targeted queries, DataDistributionTool for statistical analysis of field patterns, CountTool for quantitative assessment, and LogPatternAnalysisTool for automated pattern extraction.
The Log Pattern Analysis tool provided precise error pattern identification with exact occurrence counts (63 payment failures, 19 product catalog issues), while the Data Distribution tool offered statistical context that validated the significance of checkout service failures despite lower log volume. The combination generated a comprehensive root cause analysis that pinpointed invalid payment tokens as the primary issue affecting gold-tier customers, complete with actionable recommendations for token validation, service dependencies, and monitoring improvements.
Conclusion
The integration of Amazon Q CLI with OpenSearch’s Log Pattern Analysis tool and Data Distribution tool transforms complex log investigation into conversational analysis. Through MCP, these tools become accessible via natural language queries, significantly reducing diagnostic complexity.
Key benefits demonstrated:
- Conversational interface: Complex log analysis through natural language queries.
- Automated pattern recognition: No manual log parsing or pattern identification required.
- Statistical validation: Quantitative analysis supporting qualitative findings.
- Comprehensive investigation: Orchestration of multiple tools in a single conversation.
- Actionable results: Clear root cause identification with specific recommendations.
Together, these tools delivered comprehensive root cause analysis through simple conversational commands, transforming what traditionally required multiple manual queries and domain expertise into an automated, intelligent investigation process. This integration makes advanced log analysis accessible to broader audiences while significantly reducing time to resolution in distributed system troubleshooting.

Jiaru Jiang is an AWS software development engineer intern working on the OpenSearch Project.

Hailong Cui is an AWS software development engineer working on the OpenSearch Project.