A comprehensive walkthrough for tracing using ClaudeAgentSDK and Skills.
15 min read · Dec 7, 2025
This guide provides a comprehensive walkthrough for implementing CloudWatch logging in Bedrock AgentCore environments using ClaudeAgentSDK and Skills.
Why Agent Tracing Matters
Effective tracing is critical for AI agents — it enables debugging complex multi-turn conversations, tracking token usage and costs, identifying performance bottlenecks, and ensuring runtime reliability. Without proper tracing, diagnosing issues in production agents becomes nearly impossible.
CloudWatch as a Reliable Fallback
While dedicated observability platforms (like Langfuse or LangSmith) offer richer visualization, CloudWatch provides a convenient, always-available tracing solution that requires no additional infrastructure. Although the log viewing experience is not as polished, CloudWatch is automatically integrated with AgentCore and serves as a dependable last-resort option for monitoring agent and skill behavior.
This guide covers the fundamentals of Skills, their advantages over alternatives like MCP, and provides a complete step-by-step example project.
What we will cover:
- Introduction to Skills
- Skills vs MCP: Understanding the Differences
- Using Skills in ClaudeAgentSDK
- Bedrock AgentCore Environment Specifics
- Logging Strategy for CloudWatch
- Complete Example Project
- Best Practices and Troubleshooting
1. Introduction to Skills
What Are Skills?
Skills are modular capability packages that enhance Claude’s ability to perform specialized tasks. Each Skill consists of:
- Instructions: A SKILL.md file containing procedural knowledge and usage guidelines.
- Scripts: Optional executable code (Python, Bash) that Claude can invoke.
- Resources: Supporting files like templates, reference documentation, and examples.
When Claude encounters a task that matches a Skill’s description, it automatically loads the Skill’s instructions and uses the associated tools to complete the task.
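The discovery convention can be sketched with plain filesystem code. This is a hypothetical illustration of the layout described above (any subdirectory of a skills root that contains a SKILL.md is a candidate Skill), not the SDK's actual internal scanner:

```python
from pathlib import Path

def discover_skills(root: str = ".claude/skills") -> list[str]:
    """List candidate skills: subdirectories that contain a SKILL.md."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir() if (p / "SKILL.md").is_file())
```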
What Problems Do Skills Solve?
Traditional approaches to extending AI agent capabilities have several limitations:
- Prompt Engineering Complexity: Without Skills, developers must craft complex system prompts that include all possible instructions, leading to token bloat and reduced effectiveness.
- Static Capabilities: Agents without Skills have fixed capabilities that cannot be easily extended or customized for specific domains.
- Knowledge Fragmentation: Without a structured approach, domain knowledge gets scattered across multiple prompts, making maintenance difficult.
- Execution Reliability: Pure LLM-generated code can be unreliable for complex operations. Skills allow developers to provide tested, reliable scripts.
Skills address these problems by:
- Modular Loading: Skills are loaded on-demand, keeping the context window efficient.
- Structured Knowledge: Instructions, examples, and references are organized in a standard format.
- Reliable Execution: Pre-written scripts handle complex operations reliably.
- Easy Maintenance: Skills can be updated independently without affecting other capabilities.
2. Skills vs MCP: Understanding the Differences
Model Context Protocol (MCP)
MCP (Model Context Protocol) is a standardized protocol for connecting AI models to external services and data sources.
- Focus: Provides access to external tools and APIs.
- Architecture: Client-server model with JSON-RPC communication.
- Deployment: Requires running MCP servers (local or remote).
- Use Case: Integrating with external services (databases, APIs, SaaS tools).
Skills
Skills are filesystem-based capability packages.
- Focus: Provides procedural knowledge and local execution capabilities.
- Architecture: Filesystem-based, discovered from .claude/skills/ directories.
- Deployment: Packaged with the application, no additional servers needed.
- Use Case: Domain-specific tasks, local file processing, custom workflows.
Comparison at a Glance
| Aspect | Skills | MCP |
|---|---|---|
| Focus | Procedural knowledge, local execution | External tools and APIs |
| Architecture | Filesystem-based (`.claude/skills/`) | Client-server, JSON-RPC |
| Deployment | Packaged with the application | Requires running MCP servers |
| Use Case | Domain tasks, file processing, custom workflows | External services (databases, APIs, SaaS) |
When to Use Each
Use Skills when:
- Processing local files
- Implementing custom domain workflows
- Packaging reusable capabilities
- Minimizing external dependencies
Use MCP when:
- Integrating with external APIs (Jira, Confluence, databases)
- Accessing remote data sources
- Sharing tools across multiple agents
Use Both when:
- Building complex agents that need both local processing and external integration.
- Example: A document agent that processes PDFs locally (Skills) and stores results in Confluence (MCP).
3. Using Skills in ClaudeAgentSDK
Skill Directory Structure
Skills are organized in the .claude/skills/ directory. Your project structure should look like this:
```
project_root/
├── .claude/
│   └── skills/
│       └── my-custom-skill/
│           ├── SKILL.md        # Required: Main skill definition
│           ├── REFERENCE.md    # Optional: Detailed API documentation
│           ├── EXAMPLES.md     # Optional: Usage examples
│           └── scripts/        # Optional: Executable scripts
│               ├── tool_a.py
│               └── tool_b.py
├── src/
│   └── ...
└── requirements.txt
```
SKILL.md Format
The SKILL.md file defines the Skill using YAML frontmatter and Markdown content:
```markdown
---
name: my-custom-skill
description: Brief description of what this skill does and when to use it.
allowed-tools: Bash, Read, Write
---

# My Custom Skill

## Quick Start
Provide a simple example showing how to use the skill.

## Instructions
Step-by-step instructions for Claude on how to perform the task.

## Available Scripts
List and describe the scripts available in this skill.
```
Configuring ClaudeAgentSDK
To enable Skills in your agent, you need to configure the ClaudeAgentOptions.
```python
from claude_agent_sdk import ClaudeAgentOptions, query

options = ClaudeAgentOptions(
    system_prompt="Your agent's system prompt...",
    # Enable Skills discovery from filesystem
    setting_sources=["project", "user"],
    # Include "Skill" in allowed tools
    allowed_tools=[
        "Skill",     # Required for Skills to work
        "Bash",      # Used by Skills to execute scripts
        "Read",      # File reading
        "Write",     # File writing
        "WebFetch",  # HTTP requests
    ],
    permission_mode="bypassPermissions",  # For automated agents
    max_turns=50,
)

# Use the agent
async for message in query(prompt="Analyze this document...", options=options):
    # Process messages
    pass
```
The setting_sources parameter controls where Skills are loaded from:
- “project”: Scans .claude/skills/ in the project root.
- “user”: Scans ~/.claude/skills/ in the user’s home directory.
4. Bedrock AgentCore Environment Specifics
AgentCore Architecture
Amazon Bedrock AgentCore provides a fully managed runtime environment for AI agents. Key characteristics include:
- MicroVM-based: Each agent runs in an isolated microVM for security.
- Container Deployment: Agents are packaged as Docker containers.
- Managed Scaling: Automatic scaling based on demand.
- VPC Integration: Secure deployment within your VPC.
How Skills Execute in AgentCore
In AgentCore, when Claude invokes a Skill, the process flows as follows:
- Claude identifies a matching Skill based on the task.
- Claude loads the Skill’s instructions into context.
- Claude uses the Bash tool to execute scripts.
- Scripts run as subprocess calls within the container.
- Script output is captured and returned to Claude.
This execution model creates a unique logging challenge.
The Logging Challenge
**Problem:** Skills execute as separate processes via Bash, and their stdout is captured by Claude as the tool result (JSON). Logging debug information to stdout would therefore corrupt the JSON result and confuse the agent.

**Solution:** Skills must log to stderr. In the AgentCore environment, stderr flows through to CloudWatch independently, allowing debugging without interfering with the agent's operation.
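The split can be demonstrated with a plain subprocess call, which mirrors how a skill script's output is captured (a minimal sketch, not the Bash tool's actual implementation):

```python
import json
import subprocess
import sys

# A toy "skill script": JSON result on stdout, trace log on stderr.
script = (
    "import sys, json\n"
    "print('[START] doing work', file=sys.stderr)\n"
    "print(json.dumps({'status': 'success'}))\n"
)

proc = subprocess.run([sys.executable, "-c", script],
                      capture_output=True, text=True)

# stdout stays clean JSON for the agent to parse...
result = json.loads(proc.stdout)
# ...while stderr carries the trace logs, which AgentCore forwards to CloudWatch.
print("result:", result)
print("logs:", proc.stderr.strip())
```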
5. Logging Strategy for CloudWatch
Understanding the Log Flow
In AgentCore, CloudWatch captures all output from the container’s stdout and stderr streams. The logging strategy must account for two distinct contexts:
- Agent Process Logs: Python logging from the main agent code.
- Skill Script Logs: Output from subprocess-executed scripts.
Agent Process Logging
The agent process uses Python’s standard logging module, configured to output to stdout.
logging_config.py
```python
import os
import sys
import logging

LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO").upper()
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def setup_logging() -> None:
    """Configure logging for the entire application."""
    numeric_level = getattr(logging, LOG_LEVEL, logging.INFO)

    # Output to stdout - CloudWatch captures this
    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(numeric_level)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, DATE_FORMAT))

    root_logger = logging.getLogger()
    root_logger.setLevel(numeric_level)
    root_logger.handlers.clear()
    root_logger.addHandler(handler)


def get_logger(name: str) -> logging.Logger:
    """Get a logger with the specified name."""
    return logging.getLogger(name)
```
Skill Script Logging (Critical Pattern)
For Skill scripts, the logging pattern is different and critical to understand:
- stdout: Reserved for JSON results ONLY (captured by Claude).
- stderr: Used for debug/trace logs (sent to CloudWatch).
Example Skill Script:
```python
#!/usr/bin/env python3
"""Skill script logging pattern.

CRITICAL:
- stdout: JSON results ONLY (captured by Claude)
- stderr: Debug/trace logs (sent to CloudWatch)
"""
import sys
import json
import logging

# Configure logging to stderr (NOT stdout!)
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - [SKILL:my_tool] - %(levelname)s - %(message)s',
    stream=sys.stderr  # This is the key!
)
logger = logging.getLogger(__name__)


def my_function(arg):
    """Example function with proper logging."""
    logger.info(f"[START] Processing: {arg}")
    try:
        # Do work here
        result = {"status": "success", "data": "..."}
        logger.info("[END] Processing complete")
        return result
    except Exception as e:
        logger.error(f"[ERROR] {str(e)}", exc_info=True)
        return {"error": str(e)}


if __name__ == "__main__":
    # Parse arguments
    arg = sys.argv[1] if len(sys.argv) > 1 else ""

    # Execute and output JSON to stdout
    result = my_function(arg)
    print(json.dumps(result))  # Only JSON goes to stdout
```
Log Format Conventions
Use consistent prefixes to make logs easily searchable in CloudWatch:
- [START]: Beginning of an operation
- [END]: Completion of an operation
- [ERROR]: Error conditions
- [SKILL:name]: Identifies which skill generated the log
- [TOOL_CALL]: Agent tool invocation
- [TOOL_RESULT]: Tool execution result
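One way to apply these prefixes consistently is a small `logging.LoggerAdapter` that stamps the `[SKILL:name]` tag on every record. This is a hypothetical helper, not part of the SDK:

```python
import logging
import sys

class SkillLogger(logging.LoggerAdapter):
    """Prefix every message with [SKILL:<name>] so CloudWatch searches stay simple."""
    def process(self, msg, kwargs):
        return f"[SKILL:{self.extra['skill']}] {msg}", kwargs

# Skill scripts log to stderr, per the pattern above.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(asctime)s - %(levelname)s - %(message)s")

log = SkillLogger(logging.getLogger("skills"), {"skill": "pdf_tools"})
log.info("[START] Command: extract_text")
```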
CloudWatch Log Groups
In AgentCore, logs generally appear in this location:
- Standard Logs: /aws/bedrock-agentcore/runtimes/<agent_id>/standard-logs
6. Complete Example Project
This section provides a complete, working example that you can use as a template.
Project Structure
```
cloudwatch-logging-agent/
├── .claude/
│   └── skills/
│       └── greeting-skill/
│           ├── SKILL.md
│           └── scripts/
│               └── greeter.py
├── src/
│   ├── __init__.py
│   ├── logging_config.py
│   ├── agent.py
│   └── server.py
├── Dockerfile
├── requirements.txt
└── README.md
```
Step 1: Create the Logging Configuration
src/logging_config.py
```python
"""Centralized Logging Configuration

Key features:
- Outputs to stdout (captured by CloudWatch in AgentCore)
- Consistent format across all modules
- Configurable log level via environment variable
"""
import os
import sys
import logging

LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO").upper()
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def setup_logging() -> None:
    """
    Configure logging for the entire application.

    Call this once at application startup.
    """
    numeric_level = getattr(logging, LOG_LEVEL, logging.INFO)

    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(numeric_level)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, DATE_FORMAT))

    root_logger = logging.getLogger()
    root_logger.setLevel(numeric_level)
    root_logger.handlers.clear()
    root_logger.addHandler(handler)

    # Configure uvicorn loggers
    for logger_name in ["uvicorn", "uvicorn.access", "uvicorn.error"]:
        uvicorn_logger = logging.getLogger(logger_name)
        uvicorn_logger.handlers.clear()
        uvicorn_logger.addHandler(handler)
        uvicorn_logger.setLevel(numeric_level)

    logger = logging.getLogger(__name__)
    logger.info(f"Logging configured: level={LOG_LEVEL}, output=stdout")


def get_logger(name: str) -> logging.Logger:
    """Get a logger with the specified name."""
    return logging.getLogger(name)
```
Step 2: Create the Skill
.claude/skills/greeting-skill/SKILL.md
````markdown
---
name: greeting-skill
description: Generate personalized greetings. Use when users ask for greetings or welcome messages.
allowed-tools: Bash
---

# Greeting Skill

Generate personalized greeting messages using Python.

## Quick Start

```bash
python3 .claude/skills/greeting-skill/scripts/greeter.py greet "Alice"
```

## Available Commands

### greet
Generate a personalized greeting.

```bash
python3 .claude/skills/greeting-skill/scripts/greeter.py greet "<name>"
```

### farewell
Generate a farewell message.

```bash
python3 .claude/skills/greeting-skill/scripts/greeter.py farewell "<name>"
```

## Output Format

All commands return JSON:

```json
{"message": "Hello, Alice!", "success": true}
```
````
.claude/skills/greeting-skill/scripts/greeter.py
```python
#!/usr/bin/env python3
"""Greeting Skill Script

IMPORTANT: This script demonstrates proper logging for Skills:
- stdout: JSON results returned to Claude (do NOT print debug info here)
- stderr: Debug/trace logs (safe for debugging, captured by CloudWatch)
"""
import os
import sys
import json
import time
import logging
from datetime import datetime

# Configure logging to stderr
# This is CRITICAL - stdout is reserved for JSON results
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - [SKILL:greeter] - %(levelname)s - %(message)s',
    stream=sys.stderr
)
logger = logging.getLogger(__name__)


def log_operation_start(operation: str, **kwargs):
    """Log the start of an operation."""
    logger.info(f"[START] Operation: {operation}")
    for key, value in kwargs.items():
        logger.info(f"  {key}: {value}")


def log_operation_end(operation: str, success: bool, elapsed: float):
    """Log the end of an operation."""
    status = "SUCCESS" if success else "FAILED"
    logger.info(f"[END] Operation: {operation} - {status} - {elapsed:.3f}s")


def greet(name: str) -> dict:
    """Generate a greeting message."""
    start_time = time.time()
    log_operation_start("greet", name=name)
    try:
        if not name or not name.strip():
            logger.warning("Empty name provided")
            return {"error": "Name cannot be empty", "success": False}

        hour = datetime.now().hour
        if hour < 12:
            greeting = "Good morning"
        elif hour < 18:
            greeting = "Good afternoon"
        else:
            greeting = "Good evening"

        message = f"{greeting}, {name}! Welcome to the CloudWatch Logging Demo."
        logger.debug(f"Generated greeting: {message}")
        log_operation_end("greet", success=True, elapsed=time.time() - start_time)
        return {"message": message, "success": True}
    except Exception as e:
        logger.error(f"Error generating greeting: {str(e)}", exc_info=True)
        log_operation_end("greet", success=False, elapsed=time.time() - start_time)
        return {"error": str(e), "success": False}


def farewell(name: str) -> dict:
    """Generate a farewell message."""
    start_time = time.time()
    log_operation_start("farewell", name=name)
    try:
        if not name or not name.strip():
            logger.warning("Empty name provided")
            return {"error": "Name cannot be empty", "success": False}

        message = f"Goodbye, {name}! Thank you for using the CloudWatch Logging Demo."
        logger.debug(f"Generated farewell: {message}")
        log_operation_end("farewell", success=True, elapsed=time.time() - start_time)
        return {"message": message, "success": True}
    except Exception as e:
        logger.error(f"Error generating farewell: {str(e)}", exc_info=True)
        log_operation_end("farewell", success=False, elapsed=time.time() - start_time)
        return {"error": str(e), "success": False}


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(json.dumps({"error": "Usage: greeter.py <command> [args...]"}))
        sys.exit(1)

    command = sys.argv[1]

    if command == "greet":
        if len(sys.argv) < 3:
            print(json.dumps({"error": "Usage: greeter.py greet <name>"}))
            sys.exit(1)
        name = sys.argv[2]
        result = greet(name)
        print(json.dumps(result))
    elif command == "farewell":
        if len(sys.argv) < 3:
            print(json.dumps({"error": "Usage: greeter.py farewell <name>"}))
            sys.exit(1)
        name = sys.argv[2]
        result = farewell(name)
        print(json.dumps(result))
    else:
        print(json.dumps({"error": f"Unknown command: {command}"}))
        sys.exit(1)
```
Step 3: Create the Agent Logic
src/agent.py
```python
"""Core Agent Logic with Comprehensive Logging

This module demonstrates proper logging for:
- Tool calls and results
- Thinking blocks
- Text responses
- Cost tracking
"""
import json

from claude_agent_sdk import (
    AssistantMessage,
    ResultMessage,
    UserMessage,
    TextBlock,
    ThinkingBlock,
    ToolUseBlock,
    ToolResultBlock,
    ClaudeAgentOptions,
    query,
)

from src.logging_config import get_logger

logger = get_logger(__name__)
LOG_SEPARATOR = "-" * 60


def create_agent_options() -> ClaudeAgentOptions:
    """Create configured ClaudeAgentOptions."""
    config = {
        "system_prompt": (
            "You are a helpful assistant with greeting capabilities. "
            "When users ask for greetings, use the greeting-skill. "
            "Always complete the task using the available Skills."
        ),
        "permission_mode": "bypassPermissions",
        "max_turns": 20,
        "setting_sources": ["project", "user"],
        "allowed_tools": [
            "Skill",
            "Bash",
            "Read",
            "Write",
        ],
    }

    logger.info("=" * 60)
    logger.info("Agent Configuration:")
    logger.info(f"  Setting Sources: {config['setting_sources']}")
    logger.info(f"  Allowed Tools: {config['allowed_tools']}")
    logger.info("=" * 60)

    return ClaudeAgentOptions(**config)


def _log_tool_use(block: ToolUseBlock, turn: int) -> None:
    """Log tool use information."""
    logger.info(LOG_SEPARATOR)
    logger.info(f"[TOOL_CALL] Turn {turn}")
    logger.info(f"  Tool: {getattr(block, 'name', 'unknown')}")
    logger.info(f"  ID: {getattr(block, 'id', 'unknown')}")
    try:
        input_str = json.dumps(getattr(block, 'input', {}), ensure_ascii=False)
        logger.info(f"  Input: {input_str}")
    except Exception as e:
        logger.info(f"  Input: <serialization error: {e}>")


def _log_tool_result(block: ToolResultBlock, turn: int) -> None:
    """Log tool result information for tracing."""
    tool_use_id = getattr(block, 'tool_use_id', 'unknown')
    content = getattr(block, 'content', '')
    is_error = getattr(block, 'is_error', False)

    logger.info(LOG_SEPARATOR)
    logger.info(f"[TOOL_RESULT] Turn {turn}")
    logger.info(f"  Tool Use ID: {tool_use_id}")
    logger.info(f"  Is Error: {is_error}")
    # Log full result content (includes skill JSON output)
    content_str = str(content)
    logger.info(f"  Result: {content_str}")


async def process_prompt(prompt: str) -> str:
    """Process a user prompt and return the response."""
    if not prompt or not prompt.strip():
        logger.warning("Received empty prompt")
        return "No text content provided."

    logger.info("=" * 60)
    logger.info("[REQUEST_START] Processing prompt")
    logger.info(f"  Length: {len(prompt)} chars")
    logger.info(f"  Content: {prompt[:200]}...")
    logger.info("=" * 60)

    options = create_agent_options()
    full_response = []
    turn_count = 0
    tool_call_count = 0

    try:
        async for message in query(prompt=prompt, options=options):
            if isinstance(message, AssistantMessage):
                turn_count += 1
                logger.info(f"[TURN {turn_count}] AssistantMessage")
                for block in message.content:
                    if isinstance(block, ToolUseBlock):
                        tool_call_count += 1
                        _log_tool_use(block, turn_count)
                    elif isinstance(block, ToolResultBlock):
                        _log_tool_result(block, turn_count)
                    elif isinstance(block, ThinkingBlock):
                        logger.debug(f"[THINKING] Turn {turn_count}")
                    elif isinstance(block, TextBlock):
                        logger.info(f"[RESPONSE] Turn {turn_count}: {len(block.text)} chars")
                        full_response.append(block.text)

            elif isinstance(message, UserMessage):
                # UserMessage contains tool results after skill/tool execution
                logger.info(f"[TURN {turn_count}] UserMessage (tool results)")
                content = message.content
                # Iterate through content to log each tool result
                if isinstance(content, list):
                    for block in content:
                        if isinstance(block, ToolResultBlock):
                            _log_tool_result(block, turn_count)
                        elif isinstance(block, TextBlock):
                            logger.info(f"[USER_TEXT] Turn {turn_count}: {block.text[:200]}")
                else:
                    logger.info(f"[USER_CONTENT] Turn {turn_count}: {str(content)[:500]}")

            elif isinstance(message, ResultMessage):
                logger.info("=" * 60)
                logger.info("[REQUEST_COMPLETE]")
                logger.info(f"  Turns: {turn_count}")
                logger.info(f"  Tool Calls: {tool_call_count}")
                if message.total_cost_usd > 0:
                    logger.info(f"  Cost: ${message.total_cost_usd:.6f}")
                logger.info("=" * 60)

    except Exception as e:
        logger.error(f"[REQUEST_ERROR] {str(e)}", exc_info=True)
        return f"Error: {str(e)}"

    return "".join(full_response) or "No response generated."
```
Step 4: Create the Server
src/server.py
```python
"""FastAPI Server for AgentCore

Implements the HTTP protocol required by AgentCore runtime.
"""
import uuid
from contextlib import asynccontextmanager

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

from src.logging_config import setup_logging, get_logger
from src.agent import process_prompt

# Initialize logging first
setup_logging()
logger = get_logger(__name__)


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Application lifecycle manager."""
    logger.info("Server starting up...")
    yield
    logger.info("Server shutting down...")


app = FastAPI(
    title="CloudWatch Logging Demo Agent",
    version="1.0.0",
    lifespan=lifespan
)


@app.get("/ping")
async def ping():
    """Health check endpoint for AgentCore."""
    return {"status": "Healthy"}


@app.post("/invocations")
async def invocations(request: Request):
    """Main processing endpoint for AgentCore."""
    request_id = str(uuid.uuid4())[:8]

    try:
        body = await request.json()
    except Exception as e:
        logger.error(f"[{request_id}] JSON parse error: {e}")
        return JSONResponse({"error": "Invalid JSON"}, status_code=400)

    prompt = body.get("prompt", "")
    if not prompt:
        return JSONResponse({"error": "Missing prompt"}, status_code=400)

    logger.info(f"[{request_id}] Processing request")
    try:
        response = await process_prompt(prompt)
        logger.info(f"[{request_id}] Request completed")
        return JSONResponse({"response": response, "status": "success"})
    except Exception as e:
        logger.error(f"[{request_id}] Error: {e}", exc_info=True)
        return JSONResponse({"error": str(e)}, status_code=500)


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)
```
Step 5: Create requirements.txt
requirements.txt
```
claude-agent-sdk>=0.1.0
fastapi
uvicorn
```
Step 6: Create the Dockerfile
Dockerfile
```dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim

WORKDIR /app

# Environment variables
ENV UV_SYSTEM_PYTHON=1 \
    UV_COMPILE_BYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    HOME=/home/bedrock_agentcore \
    PYTHONPATH=/app \
    PORT=8080 \
    LOG_LEVEL=INFO \
    # Required for Claude Agent SDK to use Bedrock
    CLAUDE_CODE_USE_BEDROCK=1 \
    ANTHROPIC_MODEL=us.anthropic.claude-sonnet-4-20250514-v1:0

# Install system dependencies
RUN apt-get update && apt-get install -y \
    nodejs \
    npm \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Claude Code CLI (required for claude-agent-sdk)
RUN npm install -g @anthropic-ai/claude-code

# Install Python dependencies
COPY requirements.txt requirements.txt
RUN uv pip install -r requirements.txt

# Create non-root user
RUN useradd -m -d /home/bedrock_agentcore -u 1000 bedrock_agentcore
RUN mkdir -p /home/bedrock_agentcore/.config && \
    chown -R bedrock_agentcore:bedrock_agentcore /home/bedrock_agentcore

# Copy application code
COPY . .
RUN chown -R bedrock_agentcore:bedrock_agentcore /app

EXPOSE 8080

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/ping || exit 1

# Run as non-root user
CMD ["runuser", "-u", "bedrock_agentcore", "--", \
     "uvicorn", "src.server:app", \
     "--host", "0.0.0.0", \
     "--port", "8080", \
     "--log-level", "info"]
```
Step 7: Create README.md
README.md
````markdown
# CloudWatch Logging Demo Agent

Demonstrates proper logging patterns for Skills in Bedrock AgentCore.

## Key Concepts

1. **Agent logs**: Use Python logging to stdout
2. **Skill logs**: Use Python logging to stderr (stdout is for JSON results)

## Local Development

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# (Required for Bedrock backend)
export CLAUDE_CODE_USE_BEDROCK=1

# Run server
python -m src.server
```

## Testing

```bash
# Health check
curl http://localhost:8080/ping

# Test greeting
curl -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Please greet Alice"}'
```

## Deployment to AgentCore

```bash
# Step 1: Configure AgentCore (one-time setup)
agentcore configure -e Dockerfile

# Step 2: Launch the agent
agentcore launch

# Step 3: Test the deployment
agentcore invoke '{"prompt": "Please greet Alice"}'
```

The `agentcore launch` command automatically:

- Builds the Docker image from your Dockerfile
- Pushes it to Amazon ECR
- Deploys to the AgentCore runtime
- Sets up CloudWatch logging automatically

Logs will appear in CloudWatch at:

- `/aws/bedrock-agentcore/runtimes/<agent_id>/standard-logs`
````
7. Best Practices and Troubleshooting
Best Practices
1. **Consistent Log Format.** Use a consistent format across all components: `TIMESTAMP - [COMPONENT:name] - LEVEL - MESSAGE`
```
2025-01-15 10:30:45 - [SKILL:pdf_tools] - INFO - [START] Command: extract_text
2025-01-15 10:30:46 - [SKILL:pdf_tools] - INFO - [END] Command: extract_text - SUCCESS - 1.234s
```
2. **Operation Bracketing.** Always log the start and end of operations:
```python
def my_operation():
    start_time = time.time()
    logger.info("[START] my_operation")
    try:
        # Do work
        result = do_work()
        logger.info(f"[END] my_operation - SUCCESS - {time.time() - start_time:.3f}s")
        return result
    except Exception:
        logger.error(f"[END] my_operation - FAILED - {time.time() - start_time:.3f}s", exc_info=True)
        raise
```
3. **Structured Data in Logs.** Include relevant context in logs for easier debugging:
```python
logger.info(f"[TOOL_CALL] tool={tool_name} input_size={len(input_data)} turn={turn_count}")
```
4. **Log Levels.** Use appropriate log levels:
- DEBUG: Detailed diagnostic information
- INFO: General operational events
- WARNING: Unexpected but handled situations
- ERROR: Errors that need attention
5. **Avoid Sensitive Data.** Never log sensitive information. Always truncate or mask API keys and secrets.
```python
# Bad
logger.info(f"Processing with API key: {api_key}")

# Good
logger.info(f"Processing with API key: {api_key[:4]}***")
```
Troubleshooting
Problem: Skill logs not appearing in CloudWatch
**Symptoms:** Agent logs appear, but the Skill script's internal debug logs (like `[START]`, `[END]`) are missing.

**Cause:** The Skill script is logging to stdout instead of stderr.

**Solution:** Ensure skill scripts log to stderr (not stdout):
```python
# In skill scripts, configure logging to stderr
logging.basicConfig(
    stream=sys.stderr,  # CRITICAL: Must be stderr, not stdout
    level=logging.DEBUG,
    format='%(asctime)s - [SKILL:my_tool] - %(levelname)s - %(message)s'
)
```
Expected CloudWatch output when configured correctly:
```
[TOOL_RESULT] Turn 4
  Tool Use ID: toolu_bdrk_xxx
  Is Error: False
  Result: {"file_path": "/tmp/doc.pdf", "success": true}
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO - [START] Command: download
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO -   url: https://example.com/doc.pdf
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO - Downloading from: https://...
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO - Download complete: 42237 bytes
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO - [END] Command: download - SUCCESS
2025-12-06 07:06:35 - [SKILL:file_tools] - INFO - Elapsed: 0.356s
```
Problem: Logs are truncated
**Symptoms:** Long log messages are cut off in CloudWatch.

**Cause:** CloudWatch enforces a 256 KB limit per log event.

**Solution:** Truncate oversized messages before logging so that each event stays under the limit.
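A minimal truncation helper might look like this. The 8 KB cap is an arbitrary choice, comfortably under the 256 KB per-event limit:

```python
MAX_LOG_BYTES = 8_000  # arbitrary cap, well under CloudWatch's 256 KB per-event limit

def truncate_for_log(message: str, limit: int = MAX_LOG_BYTES) -> str:
    """Return the message unchanged if it fits, otherwise cut it and mark the cut."""
    encoded = message.encode("utf-8")
    if len(encoded) <= limit:
        return message
    # errors="ignore" drops a multi-byte character that may be split at the boundary
    return encoded[:limit].decode("utf-8", errors="ignore") + "...[truncated]"
```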
Problem: Duplicate logs
**Symptoms:** The same log message appears multiple times.

**Cause:** Multiple logging handlers are attached to the same logger.

**Solution:** Clear existing handlers before adding new ones, as `setup_logging` does with `root_logger.handlers.clear()`.
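A sketch of the dedup pattern, mirroring what `setup_logging` does for the root logger:

```python
import logging
import sys

def get_clean_logger(name: str) -> logging.Logger:
    """Return a logger with exactly one stdout handler, however often it is called."""
    logger = logging.getLogger(name)
    logger.handlers.clear()  # drop handlers left over from earlier setup calls
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # prevent the root logger from emitting a duplicate
    return logger
```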
Viewing Logs in CloudWatch
Using AWS Console
- Navigate to CloudWatch > Log groups.
- Find your agent’s log group: /aws/bedrock-agentcore/runtimes/<agent_id>/standard-logs.
- Select the log stream for your time range.
- Use CloudWatch Insights for advanced queries.
CloudWatch Insights Queries
Search for specific operations:
```
fields @timestamp, @message
| filter @message like /\[SKILL:pdf_tools\]/
| sort @timestamp desc
| limit 100
```
Find errors:
```
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 50
```
Calculate operation duration:
```
fields @timestamp, @message
| filter @message like /\[END\]/
| parse @message /elapsed=(?<duration>[0-9.]+)s/
| stats avg(duration), max(duration), min(duration) by bin(1h)
```
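Note that the `parse` step only matches log lines that actually contain an `elapsed=<seconds>s` token; the `[END]` lines emitted by this guide's example scripts log the duration as a bare `1.234s`, so adjust either the log format or the regex to match. The capture itself can be sanity-checked locally:

```python
import re

# The same capture the Insights query uses, checked against a sample log line
# (assumes the script logs an "elapsed=<seconds>s" token).
line = "2025-01-15 10:30:46 - [SKILL:pdf_tools] - INFO - [END] extract_text - SUCCESS - elapsed=1.234s"
match = re.search(r"elapsed=(?P<duration>[0-9.]+)s", line)
print(match.group("duration"))  # -> 1.234
```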
Summary
This guide covered:
- Skills fundamentals: What they are, how they differ from MCP, and when to use each.
- ClaudeAgentSDK configuration: How to enable and use Skills in your agents.
- AgentCore specifics: Understanding the microVM environment and subprocess execution.
- Logging strategy: The critical pattern of stdout for results, stderr for logs.
- Complete example: A working project you can use as a template.
- Best practices: Patterns for maintainable, debuggable logging.
The key insight is that Skills execute as subprocesses in AgentCore, and their stdout is captured by Claude as the tool result. All logging must go to stderr to appear in CloudWatch without corrupting the JSON response.
By following these patterns, you can build observable, debuggable agents with comprehensive logging that works seamlessly in the Bedrock AgentCore environment.