Building MeridianDB: Solving AI’s Memory Crisis with Multi-Dimensional RAG
Why I Built This
When exploring cloud platforms, I don’t just read documentation—I build something substantial. Recently, I dove deep into Cloudflare Workers, and I wanted to tackle a problem that’s becoming critical in today’s AI landscape: catastrophic forgetting.
The Problem: AI Agents That Forget
Traditional RAG (Retrieval-Augmented Generation) systems use vector databases to enhance AI outputs by storing data as embeddings—multi-dimensional vectors that machines can understand. When you search, the system transforms your query into vectors and performs similarity searches using mathematical distance calculations.
This approach searches for meaning, not just text. But it fails to solve a fundamental problem in agentic AI: catastrophic forgetting—when AI systems learn new information, they often forget old knowledge.
Standard RAG mitigates this issue but doesn’t fundamentally solve it. As user data grows exponentially, two critical questions emerge:
- How does retrieved data affect AI generation quality?
- How relevant is this data over time?
The Solution: Multi-Dimensional Memory
MeridianDB goes beyond traditional RAG by adding multiple dimensions on top of semantic search. Built entirely on Cloudflare’s infrastructure (Workers, D1, Vectorize, KV, Queues, and R2), it provides Auto-RAG that’s highly scalable, performant, and runs at the edge—near your users, without headaches.
The Four Dimensions of Memory
1. Semantic Search
Like any RAG database, MeridianDB uses Cloudflare Vectorize at its core. When your AI agent sends a query, it performs semantic search to retrieve meaningful data. We recommend over-fetching to allow other features to refine results.
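As a rough illustration of over-fetching (a hedged sketch using the SDK client introduced later in this post; the limit option and array handling are assumptions, not confirmed API):
// Over-fetch candidates so the behavioral, temporal, and contextual filters
// still leave enough high-quality results after refinement.
// NOTE: the "limit" option and array return shape are assumptions.
const candidates = await client.retrieveMemoriesSingleAgent({
  query: "user preferences",
  limit: 50, // fetch more than the handful you actually need
});
const topMemories = candidates.slice(0, 10); // keep only the best after refinement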
2. Behavioral Learning
When your agent retrieves data, you can attach like/dislike buttons to the generated responses. User feedback creates behavioral signals: every memory that was retrieved for a response receiving negative feedback is penalized. Combined with agent configuration, this filters out memories that produce poor results.
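For example, a dislike handler might report the memories behind a bad response (a minimal sketch reusing the recordFeedback call from the SDK section below; the handler and variable names are illustrative):
// Negative feedback penalizes every memory used to build the disliked response.
async function onDislike(retrievedMemoryIds: string[]) {
  await client.recordFeedback({
    success: false,
    memories: retrievedMemoryIds,
  });
}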
3. Temporal Decay
Facts become irrelevant over time. We provide temporal features where you can:
- Mark data as factual (always included, no decay)
- Mark data as irrelevant (always excluded)
- Let intelligent active/passive learning determine inclusion based on smart filtering and access patterns
Our exponential decay algorithm with frequency boost ensures recent and frequently accessed memories stay relevant while old, unused memories naturally fade.
4. Contextual Filtering
Developers or other AI agents can describe memories for specific tasks. This additional metadata helps task-performing agents find precisely what they need.
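For instance, a memory can be tagged with a task-specific description at write time (a sketch based on the storeMemory call shown later; the agentId and context values are illustrative, and whether retrieval exposes a matching context filter is an assumption):
// Tag the memory with a contextual description so task-performing agents
// can find exactly what they need.
await client.storeMemory({
  agentId: "support-bot",
  content: "Customer is on the enterprise plan",
  context: "billing questions",
});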
The Science Behind It
We considered adding graph capabilities—giving agentic AI the ability to build knowledge graphs would be powerful. We could implement this with edge columns and JOIN queries, but decided against it for now to maintain simplicity and performance.
The core challenge is balancing stability and plasticity:
- Stability: AI systems must consolidate old knowledge when learning new things
- Plasticity: AI agents must learn new things quickly
This balance varies wildly by use case. A chatbot’s stability-plasticity requirements differ dramatically from those of a coding agent, which needs longer memory consolidation and slower learning rates.
MeridianDB’s federated database is extremely configurable, with passive/active learning controlled through agent configuration.
Architecture Decisions
Handling Consistency
Many developers overlook a critical question: when building RAG, your queries are federated (affecting multiple databases)—how do you handle consistency?
Data can go out of sync. Embeddings may succeed while record insertion fails. Lots can go wrong.
MeridianDB handles all of this out of the box.
Our white paper details our approach:
- Queue-based writes ensure eventual consistency without manual orchestration
- Data is stored redundantly: Vectorize holds the embedding along with only the ID of the corresponding D1 record, while D1 holds the memory content, preserving multi-dimensional context
- Automatic retries, failover, graceful degradation on retrieval, NewSQL-inspired transactions, and event-driven processing
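As a rough sketch of what a queue-based write path looks like on Workers (binding names, message shape, and the SQL below are illustrative assumptions, not MeridianDB’s actual internals):
import type { D1Database, MessageBatch, Queue, VectorizeIndex } from "@cloudflare/workers-types";

// Illustrative bindings; the names are assumptions.
interface Env {
  MEMORY_QUEUE: Queue;
  DB: D1Database;
  VECTORIZE: VectorizeIndex;
}

export default {
  // Producer: acknowledge the client immediately and let the queue do the rest.
  async fetch(req: Request, env: Env): Promise<Response> {
    const memory = await req.json();
    await env.MEMORY_QUEUE.send(memory);
    return new Response("accepted", { status: 202 });
  },

  // Consumer: write to D1 and Vectorize; retry the whole message on failure
  // so both stores eventually converge.
  async queue(batch: MessageBatch<any>, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      try {
        const m = msg.body;
        await env.DB.prepare(
          "INSERT INTO memories (id, agent_id, content) VALUES (?, ?, ?)"
        ).bind(m.id, m.agentId, m.content).run();
        await env.VECTORIZE.upsert([
          { id: m.id, values: m.embedding, metadata: { agentId: m.agentId } },
        ]);
        msg.ack();
      } catch {
        msg.retry();
      }
    }
  },
};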
The Learning Phases
We recommend operating agents in two phases:
Phase 1: Passive Learning
Start with successRate: 0.0 and stabilityThreshold: 0.0. This prevents false positives when the system lacks sufficient data. The agent collects interaction data without aggressive filtering.
Phase 2: Active Learning
Once you’ve accumulated meaningful data, activate filtering by setting appropriate thresholds. The system automatically filters out:
- Memories with low success rates (behavioral)
- Memories with low stability scores (temporal)
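In agent-configuration terms, the two phases might look like this (a hedged sketch; only successRate and stabilityThreshold come from the guidance above, the threshold values and object shape are illustrative):
// Phase 1: passive learning - collect interaction data, no filtering yet.
const passiveConfig = {
  successRate: 0.0,        // don't filter on behavioral score
  stabilityThreshold: 0.0, // don't filter on temporal stability
};

// Phase 2: active learning - once enough feedback exists, raise the thresholds
// so low-performing and decayed memories are filtered out at retrieval time.
const activeConfig = {
  successRate: 0.4,
  stabilityThreshold: 0.3,
};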
Temporal Configuration
We use exponential decay with frequency boost. Each agent has its own configuration:
Balanced (Default)
{
  halfLifeHours: 168, // 7 days
  timeWeight: 0.6,
  frequencyWeight: 0.4,
  decayCurve: 'hybrid',
  decayFloor: 0.15
}
Aggressive Decay (for chatbots)
{
  halfLifeHours: 72, // 3 days
  timeWeight: 0.7,
  frequencyWeight: 0.3,
  decayCurve: 'exponential'
}
Long-Term Memory (for knowledge bases)
{
  halfLifeHours: 720, // 30 days
  timeWeight: 0.5,
  frequencyWeight: 0.5,
  decayCurve: 'polynomial'
}
The recency score calculation runs in SQL, keeping retrieval latency at 300-500ms.
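In production the score is computed inside the D1 query, but conceptually it is roughly the following (a TypeScript sketch; the exact formula and the frequency normalization are assumptions derived from the configuration fields above):
// Exponential decay with a frequency boost: recent and frequently accessed
// memories score high, old unused ones fade toward the decay floor.
function recencyScore(
  hoursSinceLastAccess: number,
  accessCount: number,
  cfg = { halfLifeHours: 168, timeWeight: 0.6, frequencyWeight: 0.4, decayFloor: 0.15 }
): number {
  // Halves every halfLifeHours since the last access
  const timeScore = Math.pow(0.5, hoursSinceLastAccess / cfg.halfLifeHours);
  // Log-scaled access count so heavy use saturates instead of dominating
  const freqScore = Math.min(1, Math.log1p(accessCount) / Math.log1p(100));
  const combined = cfg.timeWeight * timeScore + cfg.frequencyWeight * freqScore;
  return Math.max(cfg.decayFloor, combined);
}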
Behavioral Configuration
Behavioral features use the Wilson score confidence interval—a statistically robust method for scoring with sparse data:
function wilsonScore(success: number, failure: number, z = 1.96) {
  // z = 1.96 is the standard normal value for a 95% confidence level
  const total = success + failure;
  if (total === 0) return 0;
  const p = success / total;
  const denominator = 1 + (z * z) / total;
  const center = p + (z * z) / (2 * total);
  const spread = z * Math.sqrt((p * (1 - p) + (z * z) / (4 * total)) / total);
  // Lower bound of the Wilson interval: conservative when evidence is sparse
  return Math.max(0, (center - spread) / denominator);
}
This prevents manipulation from sparse data and provides conservative scoring for new memories.
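Two quick examples of that conservative behavior (values computed with the z = 1.96 default above):
wilsonScore(1, 0);   // ~0.21: one like is not enough evidence to trust the memory
wilsonScore(90, 10); // ~0.83: with ample feedback, the score approaches the raw 90% rate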
Developer Experience
Simple SDK
Install via npm:
npm i meridiandb-sdk
Three core methods: store, retrieve, recordFeedback.
Example usage:
import { MeridianDBClient } from "meridiandb-sdk";

const client = new MeridianDBClient({
  url: "https://api.meridiandb.com",
  accessToken: "your-token"
});

// Retrieve memories
const memories = await client.retrieveMemoriesSingleAgent({
  query: "user preferences"
});

// Store new memory
await client.storeMemory({
  agentId: "chatbot-v1",
  content: "User prefers dark mode",
  isFactual: true,
  context: "UI preferences"
});

// Record feedback
await client.recordFeedback({
  success: true,
  memories: ["memory-id-1", "memory-id-2"]
});
Admin Portal
Built with React and Vite, deployable to Cloudflare Pages. The operator UI provides observability, data management, and debugging tools.
Technical Stack
- Cloudflare D1: Relational metadata & feature storage
- Cloudflare Vectorize: Embedding storage & similarity search
- Cloudflare KV: Session state, counters, cache
- Cloudflare R2: Object storage for models, artifacts, backups
- Cloudflare Workers: Edge-native compute
- Cloudflare Queues: Event-driven processing (enterprise version)
For development/free tier, we provide cfw-poor-man-queue—a lightweight distributed queue implementation that lets you run MeridianDB on Cloudflare’s free plan.
Performance & Scalability
- <500ms retrieval latency including multi-dimensional filtering
- Global edge deployment for low-latency access worldwide
- SQL-based scoring for maximum scalability
- Event-driven updates prevent write-on-read latency penalties
- Horizontally scalable architecture
Limitations
Being transparent about trade-offs:
- Eventual consistency: Reads may slightly lag behind writes
- Manual context: Developers must supply contextual features (auto-generation coming)
- Storage constraints: D1 has a 10GB limit per database
- Platform coupling: Optimized for the Cloudflare ecosystem, though swapping D1 for SQLite, Workers for Node.js, Vectorize for ChromaDB, and Cloudflare Queues or PMQ for RabbitMQ or Kafka is entirely doable.
- Learning curve: Multi-dimensional retrieval differs from traditional vector search
Getting Started
- Clone the repository
git clone https://github.com/ARAldhafeeri/MeridianDB
cd MeridianDB
npm install
- Set up Cloudflare resources
# Create vectorize index
npx wrangler vectorize create meridiandb --dimensions=768 --metric=cosine
# Create metadata index for agent isolation
npx wrangler vectorize create-metadata-index meridiandb --property-name=agentId --type=string
- Run migrations
npm run server:migrations
npm run server:migrate:local
- Start development
npm run dev
- Initialize super admin
Hit the /auth/init endpoint to set up admin access
Resources
- Home Page
- GitHub Repository: Source code
- Documentation: Full API reference and guides
- White Paper: Mathematical foundations and research
- Postman Collections: API examples and testing
Why This Matters
Cloudflare offers Auto-RAG as a product. But if you want state-of-the-art RAG that actively learns from user behavior, adapts over time, and balances stability with plasticity—try MeridianDB.
The future of AI agents depends on memory systems that don’t just store and retrieve, but actively curate knowledge based on utility, recency, and performance. MeridianDB makes this vision practical and deployable today.
Interested in using MeridianDB for your team? Book a meeting to discuss your use case.
Scientific Foundation
MeridianDB’s approach is grounded in established research:
- Ebbinghaus (1885): Forgetting curve and memory decay models
- Wilson (1927): Confidence intervals for behavioral scoring
- Mikolov et al. (2013): Word embeddings and semantic representations
- Parisi et al. (2019): Continual learning in neural networks
- Randazzo et al. (2022): Memory models for spaced repetition
By combining neuroscience-inspired principles with modern vector databases and edge computing, MeridianDB offers a mathematically grounded solution to one of AI’s most challenging problems: building agents that learn continuously without forgetting what matters.