Agent developers shouldn’t be writing custom message converters every time providers change their APIs. You need to work with messages from OpenAI, Anthropic, and other providers—but each one structures messages differently. Without a unified approach, you end up with fragmented, inconsistent data that’s hard to store, query, and learn from.
Building reliable agent systems requires treating message data as first-class infrastructure. Acontext provides a unified and durable message layer that makes context observable, persistent, and reusable. You send messages in any format—OpenAI, Anthropic, or Acontext’s unified format—and they just work.
This post explores how Acontext stores and normalizes multi-format messages, the architecture that makes it possible, and what this means for building production-ready agents.
The Problem: Format Fragmentation
Today, agent context is scattered across different formats, storage systems, and provider-specific structures. The result? Fragmented, transient, and nearly impossible to analyze over time.
Consider what happens when you build an agent that needs to work with multiple providers:
Format Fragmentation: Each provider uses different message structures. OpenAI uses a `tool_calls[]` array with nested `function.name` and `function.arguments`, while Anthropic uses `tool_use` content blocks with `name` and `input`. Without normalization, you can’t query or analyze messages uniformly.
Information Loss: During format conversion, important metadata (tool call IDs, message names, source format) often gets lost or becomes inconsistent. You end up with incomplete context that can’t be reliably used for learning or debugging.
Storage Inefficiency: Storing large message payloads (images, files, long text) directly in PostgreSQL bloats the database and slows queries. But storing everything in object storage makes querying complex and loses transactional guarantees.
Provider Lock-in: Systems become tightly coupled to one provider’s format. When a new provider emerges or an existing one updates their API, you’re rewriting converters and risking breaking changes.
All these challenges stem from one root cause: message data is not treated as first-class infrastructure.
We don’t need another database or storage system. We need a unified layer that accepts messages in any format, normalizes them to a consistent representation, and stores them efficiently—all while preserving every piece of context. That’s what Acontext does.
The Solution: Unified Message Processing
Acontext’s message processing system provides a single interface through which your agents can flexibly receive, normalize, and store messages from any provider. Here’s how it works:
Multi-Format Support
Instead of requiring clients to convert messages to a single format, Acontext accepts messages in their native format (OpenAI, Anthropic, or Acontext’s unified format) and normalizes them internally.
This means:
- Clients can use their existing SDKs without modification
- The system validates messages using official provider SDKs, ensuring spec compliance
- Format conversion happens once at ingestion, not repeatedly during processing
Acontext uses normalizers as adapters that translate provider-specific formats into a unified internal representation, so your downstream systems only need to understand one format.
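As a rough illustration of that adapter idea, here’s a minimal Python sketch of an OpenAI-to-unified normalizer. The function name and exact part shapes are assumptions for illustration; Acontext’s real normalizers run server-side and use official provider SDKs:

```python
# Minimal sketch (assumptions, not Acontext's actual internals) of a
# normalizer that adapts an OpenAI-style message to a unified representation.

def normalize_openai(message: dict) -> dict:
    """Translate an OpenAI chat message into a unified parts-based format."""
    parts = []

    # Text content: OpenAI allows either a string or a list of content blocks.
    content = message.get("content")
    if isinstance(content, str):
        parts.append({"type": "text", "text": content})
    elif isinstance(content, list):
        for block in content:
            if block.get("type") == "text":
                parts.append({"type": "text", "text": block["text"]})

    # Tool calls: OpenAI nests name/arguments under "function";
    # the unified format flattens them into a "meta" object.
    for call in message.get("tool_calls") or []:
        parts.append({
            "type": "tool-call",
            "meta": {
                "id": call["id"],
                "name": call["function"]["name"],
                "arguments": call["function"]["arguments"],
            },
        })

    return {
        "role": message["role"],
        "parts": parts,
        "meta": {"source_format": "openai"},  # preserve provenance
    }
```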
Complete Context Preservation
Different providers structure messages differently. Without careful normalization, this information gets lost.
Acontext preserves all metadata by:
- Storing the original source format in message metadata
- Unifying field names (e.g., `tool_use` → `tool-call`, `input` → `arguments`)
- Preserving provider-specific metadata in a flexible `meta` field
This ensures downstream systems can handle tool calls uniformly while still accessing original provider-specific information when needed.
Efficient Hybrid Storage
Messages can contain large payloads—images, files, or long text content. Storing these directly in PostgreSQL would bloat the database and slow queries.
Acontext uses a three-tier storage strategy:
- PostgreSQL stores lightweight metadata (message ID, session ID, role, timestamps)
- S3 stores the actual message parts as JSON files, referenced by the database
- Redis provides hot cache for frequently accessed message parts, reducing S3 reads
- Content-addressed storage using SHA256 hashes enables automatic deduplication
This separation allows the database to remain fast for queries while S3 handles large payloads efficiently. Redis caching ensures hot data is served instantly without hitting S3, dramatically improving read performance for active sessions. It’s like having a filesystem with intelligent caching for your agent’s context data.
Provider Flexibility
As new providers emerge or existing providers update their formats, the system needs to adapt without breaking existing functionality.
Acontext achieves this through:
- Pluggable normalizers: Each provider has its own normalizer that can be updated independently
- Unified internal format: Downstream systems only need to understand one format
- Format versioning: Message metadata tracks the source format, enabling format-specific handling when needed
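To make “pluggable normalizers” concrete, here’s a hedged sketch of how such a registry could be wired up. The interface is an illustrative assumption, not Acontext’s actual code:

```python
# Illustrative sketch of pluggable normalizers: each provider registers a
# function that converts its native message format to the unified format.
from typing import Callable

Normalizer = Callable[[dict], dict]

_NORMALIZERS: dict[str, Normalizer] = {}

def register_normalizer(fmt: str, fn: Normalizer) -> None:
    """Plug in a provider: one function per native format."""
    _NORMALIZERS[fmt] = fn

def normalize(fmt: str, message: dict) -> dict:
    """Dispatch to the right normalizer; downstream code sees one format."""
    if fmt not in _NORMALIZERS:
        raise ValueError(f"unsupported format: {fmt}")
    return _NORMALIZERS[fmt](message)

# Adding a new provider is just one registration, e.g. the
# normalize_openai function from the sketch above:
# register_normalizer("openai", normalize_openai)
```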
Architecture: Layered Design
Acontext’s message processing follows a clean layered architecture that separates concerns, making it easy to add new providers or modify existing behavior:
Router Layer: Handles HTTP routing, authentication, and extracts URL parameters. It’s format-agnostic and focuses on request routing.
Handler Layer: Detects content type (JSON vs multipart), parses the request, and validates the format parameter. It orchestrates the normalization process but doesn’t perform format-specific parsing.
Normalizer Layer: This is where format-specific logic lives. Each provider has its own normalizer that:
- Uses the official provider SDK for validation (ensuring spec compliance)
- Converts provider-specific structures to the unified internal format
- Preserves metadata and handles edge cases
Service Layer: Handles business logic—file uploads, S3 storage, Redis caching, reference counting. It works with the unified format, so it doesn’t need to know about provider differences. The caching layer ensures hot data is served instantly while maintaining S3 as the source of truth.
Repository Layer: Manages database operations and transactions. It stores metadata in PostgreSQL and references to S3 objects.
This separation means adding a new provider only requires implementing a new normalizer—the rest of the system remains unchanged. It’s designed to scale with your needs.
Data Flow: From Request to Storage
Here’s how a message flows through the system:
Step 1: Ingestion
- Client sends message in native format (OpenAI, Anthropic, or Acontext’s unified format)
- Router extracts format parameter and routes to handler
Step 2: Normalization
- Handler detects content type and parses request
- Normalizer uses official provider SDK to validate and convert
- Message is transformed to unified internal format
Step 3: Storage (Write Path)
- Service layer handles file uploads and calculates SHA256 hashes
- Content-addressed storage enables automatic deduplication
- S3: Parts JSON and files uploaded to S3, organized by SHA256 hash
- Redis: After successful S3 upload, parts data cached in Redis with fixed TTL
- PostgreSQL: Message metadata (ID, session ID, role, timestamps, source format) and S3 asset reference stored
- Reference counting ensures safe deletion when messages are removed
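Sketched in code, the write path above might look roughly like this; the client handles, bucket, table, and column names are illustrative assumptions, not Acontext’s actual implementation:

```python
# Hedged sketch of the write path: S3 for parts, Redis for hot cache,
# PostgreSQL for lightweight metadata. All names are illustrative.
import hashlib
import json
from datetime import datetime, timezone

PARTS_TTL_SECONDS = 3600  # illustrative fixed TTL for the hot cache

def store_message(db, s3, redis, project_id: str, session_id: str, msg: dict) -> str:
    payload = json.dumps(msg["parts"]).encode("utf-8")
    sha256 = hashlib.sha256(payload).hexdigest()

    # Content-addressed key: identical payloads map to the same object.
    now = datetime.now(timezone.utc)
    key = f"parts/{project_id}/{now:%Y/%m/%d}/{sha256}.json"
    s3.put_object(Bucket="acontext-parts", Key=key, Body=payload)

    # Cache only after a successful upload, so the cache never leads S3.
    redis.setex(f"message:parts:{sha256}", PARTS_TTL_SECONDS, payload)

    # PostgreSQL keeps lightweight metadata plus the S3 reference.
    db.execute(
        "INSERT INTO messages (session_id, role, parts_sha256, s3_key, meta) "
        "VALUES (%s, %s, %s, %s, %s)",
        (session_id, msg["role"], sha256, key, json.dumps(msg.get("meta", {}))),
    )
    return sha256
```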
Step 4: Retrieval (Read Path with Caching)
- System queries PostgreSQL for message metadata
- For each message, first checks Redis cache using SHA256 hash from metadata
- Cache hit: Parts returned instantly from Redis (sub-millisecond latency)
- Cache miss: Parts loaded from S3, then cached in Redis for subsequent requests
- This ensures hot data (recently accessed messages) is served with minimal latency
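And a corresponding sketch of the cached read path, again with assumed client handles and key names:

```python
# Hedged sketch of the read path: Redis first, S3 on a miss,
# then repopulate the cache for subsequent readers.
import json

CACHE_TTL_SECONDS = 3600  # illustrative

def load_parts(redis, s3, s3_key: str, sha256: str) -> list:
    cache_key = f"message:parts:{sha256}"

    cached = redis.get(cache_key)
    if cached is not None:
        return json.loads(cached)  # cache hit: no S3 round trip

    # Cache miss: S3 is the source of truth.
    body = s3.get_object(Bucket="acontext-parts", Key=s3_key)["Body"].read()
    redis.setex(cache_key, CACHE_TTL_SECONDS, body)  # warm for the next reader
    return json.loads(body)
```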
Format Comparison: Provider Differences
Each provider structures messages differently. Here’s how they compare:
Example: Same Message, Different Formats
OpenAI Format:
```json
{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    }
  ],
  "tool_calls": [
    {
      "id": "call_123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"San Francisco\"}"
      }
    }
  ]
}
```
Anthropic Format:
```json
{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    },
    {
      "type": "tool_use",
      "id": "toolu_123",
      "name": "get_weather",
      "input": {
        "location": "San Francisco"
      }
    }
  ]
}
```
Acontext Unified Format:
```json
{
  "role": "assistant",
  "parts": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    },
    {
      "type": "tool-call",
      "meta": {
        "name": "get_weather",
        "arguments": "{\"location\":\"San Francisco\"}",
        "id": "call_123"
      }
    }
  ],
  "meta": {
    "source_format": "openai"
  }
}
```
The unified format means downstream systems only need to understand one structure, while original format information is preserved for debugging and auditing.
Storage: Three-Tier Architecture (PostgreSQL + S3 + Redis)
Why Three-Tier Storage?
Storing everything in PostgreSQL leads to database bloat and slow queries. Storing everything in S3 makes querying complex and loses transactional guarantees. Reading from S3 on every request adds latency. Acontext uses PostgreSQL for metadata, S3 for durable storage, Redis for hot cache—the best of all worlds.
PostgreSQL stores:
- Message identifiers (ID, session ID, parent ID)
- Lightweight fields (role, timestamps, status)
- S3 reference (asset metadata pointing to the parts JSON file)
- Message-level metadata (source format, name, etc.)
S3 stores:
- Message parts as JSON files (content-addressed by SHA256)
- Uploaded files (images, documents, etc.)
- Organized by project and date (format: `parts/{project_id}/{YYYY/MM/DD}/{sha256}.json`) for efficient access
- Serves as the source of truth for all message content
Redis caches:
- Frequently accessed message parts (hot data)
- Keyed by SHA256 hash for content-based caching
- Fixed TTL strategy ensures automatic expiration and bounded memory usage
- Dramatically reduces S3 read operations for active sessions
This three-tier architecture allows fast SQL queries on metadata, efficient storage for large payloads, and instant access to hot data.
Content-Addressed Storage and Deduplication
When the same content is uploaded multiple times (e.g., the same image used in different messages), Acontext automatically deduplicates it:
- Calculate SHA256 hash of the content
- Check if an object with this hash already exists in S3
- If found, return the existing object metadata
- If not found, upload with a path like `parts/{project_id}/{YYYY/MM/DD}/{sha256}.json`
This means identical content is stored only once, saving storage costs and improving cache efficiency. The SHA256-based addressing also enables efficient Redis caching—the same hash is used as the cache key, so identical content benefits from cache hits regardless of which message references it. It’s automatic—you don’t need to think about it.
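A minimal sketch of that dedup check, assuming a boto3-style S3 client; the bucket layout and helper name are hypothetical:

```python
# Illustrative dedup check using head_object: if an object with this
# hash already exists, skip the upload. Sketch, not Acontext's actual code.
import hashlib
import botocore.exceptions

def upload_deduplicated(s3, bucket: str, key_prefix: str, payload: bytes) -> str:
    """Upload content only if no object with the same SHA256 already exists."""
    sha256 = hashlib.sha256(payload).hexdigest()
    key = f"{key_prefix}/{sha256}.json"
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return key  # identical content already stored: reuse it
    except botocore.exceptions.ClientError:
        # Not found: store it once under its hash.
        s3.put_object(Bucket=bucket, Key=key, Body=payload)
        return key
```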
Reference Counting for Safe Deletion
When a message is deleted, we can’t immediately delete its S3 assets—other messages might reference the same content. Acontext uses reference counting:
- Each asset has a `ref_count` in the database
- When a message references an asset, increment the count
- When a message is deleted, decrement the count
- Only delete from S3 when `ref_count` reaches zero
This is handled atomically at the database level using PostgreSQL’s row-level locking, eliminating race conditions without application-level synchronization. It’s safe, automatic, and efficient.
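A hedged sketch of what that decrement-and-delete step could look like with a psycopg-style connection; the table and column names are assumptions:

```python
# Sketch of atomic reference counting in PostgreSQL (psycopg-style API).
# Table, column, and bucket names are illustrative assumptions.

def release_asset(conn, s3, asset_id: str) -> None:
    """Decrement an asset's ref_count; delete from S3 only at zero."""
    with conn:  # one transaction: commit on success, roll back on error
        with conn.cursor() as cur:
            # The UPDATE takes a row-level lock, so concurrent
            # decrements serialize safely without app-side locking.
            cur.execute(
                "UPDATE assets SET ref_count = ref_count - 1 "
                "WHERE id = %s RETURNING ref_count, s3_key",
                (asset_id,),
            )
            ref_count, s3_key = cur.fetchone()
            if ref_count == 0:
                cur.execute("DELETE FROM assets WHERE id = %s", (asset_id,))
                s3.delete_object(Bucket="acontext-parts", Key=s3_key)
```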
Redis Hot Cache for Performance
To minimize latency when reading messages, Acontext uses Redis as a hot cache layer:
On Write (SendMessage):
- After successfully uploading parts to S3, the parts data is cached in Redis
- Cache key: `message:parts:{sha256}` (using SHA256 for content-based caching)
- Uses a fixed TTL strategy: entries automatically expire after a set duration
On Read (GetMessages):
- System first checks Redis cache using the SHA256 hash from message metadata
- Cache hit: Parts returned instantly from Redis (sub-millisecond latency)
- Cache miss: Parts loaded from S3, then cached in Redis for subsequent requests
- This ensures recently accessed messages are served with minimal latency
Benefits:
- Reduced S3 reads: Hot data (active sessions) rarely hits S3
- Lower latency: Cache hits are orders of magnitude faster than S3 reads
- Content-based caching: Same content (same SHA256) benefits from cache regardless of which message references it
- Automatic cache management: Fixed TTL strategy ensures cache automatically expires stale entries without manual intervention or complex eviction policies
- Predictable behavior: Simple, deterministic expiration makes cache behavior easy to reason about
- Bounded memory: Cache size naturally self-regulates as entries expire, preventing unbounded growth
The cache is transparent—if Redis is unavailable, the system gracefully falls back to S3. This ensures reliability while maximizing performance for the common case.
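One way to get that graceful fallback is to treat every cache call as best effort, as in this illustrative wrapper (not Acontext’s actual code):

```python
# Illustrative wrappers: treat the cache as optional so a Redis outage
# degrades to slower S3 reads instead of failing requests.

def get_cached(redis, cache_key: str):
    try:
        return redis.get(cache_key)
    except Exception:
        return None  # cache unavailable: behave like a miss

def set_cached(redis, cache_key: str, payload: bytes, ttl: int) -> None:
    try:
        redis.setex(cache_key, ttl, payload)
    except Exception:
        pass  # best effort: S3 remains the source of truth
```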
Parsing: Multi-Format Normalization
Using Official SDKs for Validation
Rather than manually parsing JSON and hoping we got it right, Acontext uses official provider SDKs:
- OpenAI: Uses `openai-go` types
- Anthropic: Uses `anthropic-sdk-go` types
This ensures:
- Type safety: Compile-time checking of message structure
- Spec compliance: SDKs stay updated with provider changes
- Error handling: SDKs handle edge cases we might miss
The normalizers act as thin adapters that use SDK types for parsing, then convert to the unified format. It’s reliable, maintainable, and future-proof.
The Unified Internal Format
Acontext converts all formats to a unified structure:
```json
{
  "role": "user",
  "parts": [
    {
      "type": "text",
      "text": "Hello"
    },
    {
      "type": "tool-call",
      "meta": {
        "name": "get_weather",
        "arguments": "{\"location\":\"SF\"}"
      }
    }
  ],
  "meta": {
    "source_format": "openai"
  }
}
```
This unified format means:
- Downstream systems only need to understand one structure
- Tool calls are handled uniformly regardless of source
- Original format is preserved in metadata for debugging/auditing
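For example, once everything shares this shape, downstream code can pull tool calls out of any message with a single helper (illustrative, not part of the SDK):

```python
# Because every provider is normalized to the same shape, downstream code
# can extract tool calls with one function, regardless of source format.
def extract_tool_calls(message: dict) -> list[dict]:
    return [
        part["meta"]
        for part in message.get("parts", [])
        if part.get("type") == "tool-call"
    ]

# Works identically for messages that arrived as OpenAI or Anthropic:
# extract_tool_calls(msg) -> [{"name": "get_weather", "arguments": "...", ...}]
```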
Use Cases: Real-World Applications
Multi-Provider Agent Systems
You’re building an agent that needs to work with both OpenAI and Anthropic, depending on the task. Without Acontext, you’d need separate storage systems, different query logic, and custom converters for each provider.
With Acontext:
```python
from acontext import AcontextClient

client = AcontextClient(api_key="your-api-key")
# session_id refers to a session created earlier via the client

# Send OpenAI format message
openai_message = {"role": "user", "content": "Hello"}
client.sessions.send_message(
    session_id=session_id,
    blob=openai_message,  # Native OpenAI format
    format="openai",
)

# Send Anthropic format message
anthropic_message = {"role": "user", "content": "Hello"}
client.sessions.send_message(
    session_id=session_id,
    blob=anthropic_message,  # Native Anthropic format
    format="anthropic",
)

# Query uniformly
messages = client.sessions.get_messages(session_id=session_id)
# All messages in unified format, regardless of source
```
Context Learning and Analysis
You want to analyze tool usage patterns across all your agent interactions, but messages are stored in different formats with inconsistent structures.
With Acontext:
- All messages normalized to unified format
- Tool calls queryable uniformly: `WHERE parts->>'type' = 'tool-call'`
- Source format preserved for debugging: `WHERE meta->>'source_format' = 'openai'`
- Fast queries on metadata, efficient storage for payloads
Large Payload Handling
Your agent processes images, documents, and long-form content. Storing everything in PostgreSQL would bloat the database, but object storage makes querying difficult.
With Acontext:
- Metadata in PostgreSQL for fast queries
- Payloads in S3 for efficient storage
- Redis cache for instant access to hot data
- Automatic deduplication saves storage costs
- Reference counting ensures safe deletion
Provider Migration
You’re migrating from OpenAI to Anthropic, or need to support both during a transition period. Without normalization, you’d need separate code paths and risk data inconsistency.
With Acontext:
- Send messages in either format to the same API
- Unified storage means consistent queries
- Source format tracked for auditing
- No code changes needed for downstream systems
Built to Scale
Acontext’s message processing is designed to grow with your needs:
- Add new providers: Just implement a new normalizer
- Handle large payloads: Hybrid storage scales automatically
- Support new formats: The unified internal format remains stable
- Maintain reliability: Official SDK validation ensures spec compliance
It’s infrastructure that works, so you don’t have to think about it.
Vision: Message Data as First-Class Infrastructure
Building a multi-format message processing system is about treating message data as first-class infrastructure. By providing a unified layer that normalizes, stores, and preserves context data, Acontext enables agents to work with consistent, observable, and learnable context.
This is what a Context Data Platform should do: make context data observable, persistent, and reusable, so you can focus on building agents that solve real problems instead of babysitting message formats.
The system:
- Accepts messages in any supported format (OpenAI, Anthropic, or Acontext’s unified format)
- Normalizes them to a unified internal representation
- Stores them efficiently using hybrid storage
- Preserves all necessary metadata
This enables Acontext to serve as a reliable foundation for context data management, allowing your agents to work with consistent, queryable, and learnable context data—the foundation for self-learning AI systems.