🚀 The Problem with "Chatting" with PDFs
We’ve all built the standard RAG app:
- Upload a PDF ✅
- Ask a question ✅
- Get an answer ✅
It’s useful, but it’s passive.
Real learning isn’t passive. A real teacher doesn’t just answer; they quiz you, they track your progress, and they remind you when you’re about to forget something.
I built Echo-Learn to solve this.
Echo-Learn is an open-source Agentic RAG Engine designed to turn any document—whether it’s a coding manual, a medical textbook, or legal guidelines—into an active, voice-enabled study partner.
Here’s the deep dive into building an AI that "learns how you learn."
🧠 The Core Concept: "Agentic RAG" + "Biological Memory"
Echo-Learn isn’t just a wrapper around an LLM. It combines two powerful concepts:
| Concept | What It Means |
|---|---|
| Mode-Aware Agency | The AI switches gears between "Casual Chat" (low latency) and "Deep Learning" (analytical, saves progress) |
| Smart Memory Clusters | Uses the Ebbinghaus Forgetting Curve to predict when you’ll forget a fact and schedules reviews |
🛠️ The Architecture & Code
The stack is built on:
- 🧠 Google Gemini 3 Flash → Reasoning
- 💾 Upstash Redis → Memory & State
- 📐 Upstash Vector → Hybrid Search (Vectors + BM25)
- 👁️ Mistral OCR → Document Understanding
- 🔊 ElevenLabs → Real-time Voice Streaming
Let’s look at the code.
1️⃣ The "Echo" Mechanism: Coding the Forgetting Curve
The heart of Echo-Learn is its ability to track Mastery. We don’t just store what you know; we store how well you know it and when you last recalled it.
We implemented a decay algorithm directly in our storage layer:
📁 packages/storage/src/redis/mastery.ts
```typescript
const DECAY_RATE = 0.1; // λ - controls how fast you forget

// daysBetween is defined elsewhere in the original file; a minimal version:
function daysBetween(from: Date, to: Date): number {
  return Math.max(0, (to.getTime() - from.getTime()) / (1000 * 60 * 60 * 24));
}

function calculateEffectiveMastery(
  storedMastery: number,
  lastInteractionDate: Date
): number {
  const daysSince = daysBetween(lastInteractionDate, new Date());
  // Exponential decay: mastery drops over time without review
  // Formula: M = S × e^(-λt)
  const decayFactor = Math.exp(-DECAY_RATE * daysSince);
  return storedMastery * decayFactor;
}
```
🎯 How It Works In Practice:
```
Day 0: You learn "React Hooks" → Mastery = 100%
Day 5: You haven't reviewed it → Mastery decays to ~60%
        ↓
Echo-Learn: "Quick quiz on useEffect before we continue?"
```
The AI doesn’t wait for you to ask—it proactively teaches.
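If you want to sanity-check the math, here's the decay function from above run on a five-day gap. With λ = 0.1, e^(−0.5) ≈ 0.61, which is where the ~60% figure comes from (the date here is just an illustration):

```typescript
// Illustrative check: mastery stored at 100% five days ago, λ = 0.1
const fiveDaysAgo = new Date(Date.now() - 5 * 24 * 60 * 60 * 1000);

const effective = calculateEffectiveMastery(100, fiveDaysAgo);
console.log(effective.toFixed(1)); // ≈ 60.7 → time for that useEffect quiz
```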
2️⃣ The Brain: Mode-Aware Agentic Strategy
A study partner needs to know when to lecture and when to listen. We built a Mode-Aware Strategy using Vercel AI SDK’s ToolLoopAgent.
📁 packages/agentic/src/strategies.ts
```typescript
export async function executeUnifiedAgenticStrategy(
  query: string,
  options: ModeAwareQueryOptions
) {
  // The AI decides: are we chatting, or are we LEARNING?
  const mode = options.mode || "learn";

  // Different modes get different prompts & tools
  // (modeResult and userProfile are resolved earlier in the file; elided here)
  const systemPrompt = buildModeSystemPrompt(mode, modeResult, userProfile);

  const agent = new ToolLoopAgent({
    model: google("gemini-3-flash-preview"),
    tools: getToolsForMode(allTools, mode),
    // Dynamic tool selection per step
    prepareStep: ({ stepNumber }) => {
      if (stepNumber === 0) return { toolChoice: "required" }; // Force RAG search
      return { toolChoice: "auto" }; // Then let the AI decide
    },
  });

  // The agent is then run with systemPrompt + query and the result is
  // streamed back to the client (response handling elided here)
}
```
💬 The Personality Switch:
| Mode | Behavior | Example Response |
|---|---|---|
| Chat | Fast, casual, no tracking | "React is a UI library." |
| Learn | Deep, contextual, saves progress | "React is a UI library. You struggled with State management yesterday—let me connect this concept to that..." |
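The post doesn't show getToolsForMode, but the idea is simple: chat mode gets a lean toolbox for low latency, learn mode gets the full set including progress tracking. Here's a minimal sketch with illustrative tool names (the actual registry in Echo-Learn may differ):

```typescript
// Illustrative sketch — tool names here are placeholders, not the real registry
type Mode = "chat" | "learn";

function getToolsForMode<T extends Record<string, unknown>>(
  allTools: T,
  mode: Mode
): Partial<T> {
  // Chat: retrieval only, keep latency low
  const chatTools = ["searchDocuments"];
  // Learn: retrieval + progress tracking + quiz generation
  const learnTools = ["searchDocuments", "updateMastery", "generateQuiz"];

  const allowed = mode === "chat" ? chatTools : learnTools;
  return Object.fromEntries(
    Object.entries(allTools).filter(([name]) => allowed.includes(name))
  ) as Partial<T>;
}
```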
3️⃣ The Eyes: Semantic Chunking for Complex Docs
Textbooks have structure. If you just chop them into arbitrary 500-char blocks, you lose the lesson.
We use Semantic Chunking that detects topic shifts:
📁 packages/ingest/src/chunker/semantic-chunker.ts
```typescript
function findTopicBreakPoints(
  sentences: SentenceInfo[],
  threshold: number
): number[] {
  const breakPoints: number[] = [];

  for (let i = 2; i < sentences.length - 1; i++) {
    // Compare text BEFORE and AFTER this point
    const prevContext = getCombinedText(sentences, i - 2, i);
    const nextContext = getCombinedText(sentences, i, i + 2);
    const similarity = calculateTextSimilarity(prevContext, nextContext);

    // Topic changed? Start a new chunk here.
    if (similarity < threshold) {
      breakPoints.push(i);
    }
  }
  return breakPoints;
}
```
This ensures that a "Chapter on Authentication" stays together, not split across 3 random chunks.
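calculateTextSimilarity isn't shown in the snippet above; a cheap and common choice is plain lexical overlap (Jaccard similarity over word sets), which is enough to catch hard topic shifts. A minimal sketch under that assumption (Echo-Learn may use embeddings here instead, the break-point logic stays the same):

```typescript
// Illustrative sketch: Jaccard similarity over lowercase word sets
function calculateTextSimilarity(a: string, b: string): number {
  const wordsA = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const wordsB = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  if (wordsA.size === 0 || wordsB.size === 0) return 0;

  let intersection = 0;
  for (const w of wordsA) if (wordsB.has(w)) intersection++;

  const union = wordsA.size + wordsB.size - intersection;
  return intersection / union; // 1 = same topic, 0 = completely different
}
```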
4️⃣ The Knowledge: Hybrid Search with Upstash Vector
We don’t just use semantic similarity. We use Hybrid Search (Vectors + BM25 Keywords).
📁 packages/storage/src/vector/client.ts
```typescript
import { Index } from "@upstash/vector";

// Hybrid index client (reads UPSTASH_VECTOR_REST_URL / _TOKEN from env)
const vectorIndex = Index.fromEnv();

export async function searchWithEmbedding(
  query: string,
  options: SearchWithEmbeddingOptions = {}
): Promise<Array<VectorSearchResult>> {
  const { topK = 10, fusionAlgorithm = "RRF" } = options;

  // Upstash handles both dense (semantic) and sparse (BM25) search
  const results = await vectorIndex.query({
    data: query, // Text → Upstash auto-generates embedding
    topK,
    fusionAlgorithm, // Reciprocal Rank Fusion by default
  });
  return results;
}
```
Why Hybrid Matters:
| Query Type | Pure Semantic | Hybrid (Semantic + BM25) |
|---|---|---|
| "What is Form N-648?" | ❌ Misses exact term | ✅ BM25 catches "N-648" |
| "How do I prove a disability?" | ✅ Gets concept | ✅ Gets concept + exact forms |
💡 Unlimited Use Cases: What Can YOU Build?
Echo-Learn is a platform, not just a single app. Swap the knowledge source, and you get a completely different product:
🎓 1. The Ultimate Study Guide
Input: University textbooks (Biology, History, Physics)
Experience: Upload a chapter → AI quizzes you → Tracks which formulas you keep forgetting → Creates personalized study schedule for finals week
🏢 2. Corporate "Brain" & Onboarding
Input: Employee Handbooks, Compliance PDFs, Technical Docs
Experience: New hire talks to Echo-Learn instead of reading 50 pages
- AI: "You read the security policy last week. Do you remember the phishing protocol?"
- User: "Uh..."
- AI: "Let’s review that section."
⚕️ 3. Medical & Legal Certification Prep
Input: DSM-5, Legal Codes, Case Law
Experience: Professionals studying for Boards or Bar exams. The "Smart Memory" ensures high-stakes info is retained, not just skimmed.
🛠️ 4. Technical DIY Assistant
Input: Car repair manuals, Appliance guides
Experience: "I’m under the car—what’s the torque spec for this bolt?" → AI retrieves the exact table row and reads it aloud.
📊 The Tech Stack At a Glance
| Layer | Technology |
|---|---|
| Frontend | TanStack Start, React 19, Tailwind v4 |
| Backend | Hono.js on Bun |
| LLM | Google Gemini 3 Flash |
| Voice | ElevenLabs Streaming |
| OCR | Mistral OCR |
| Vector DB | Upstash Vector (Hybrid Index) |
| Cache/State | Upstash Redis |
| Storage | Google Cloud Storage |
🔗 Open Source & Ready to Fork
I believe Personalized Education is the killer use case for AI. Echo-Learn is the foundation.
It’s fully open source. Fork it, plug in your own documents, and you have a custom AI Tutor in minutes.
| Resource | Link |
|---|---|
| 📦 GitHub | github.com/jacksonkasi1/echo-learn |
| 🎬 Demo Video | YouTube |
⭐ Show Some Love
If you found this useful, please:
- Star the repo → It helps others discover Echo-Learn
- Fork & build → I’d love to see what you create
- Drop a comment → What would YOU use this for?
Let’s build the future of learning, one echo at a time. 🔊
Built with ❤️ using Gemini 3, Upstash, Mistral AI, and ElevenLabs