Note: I wrote this article for the Google Cloud Run Hackathon 2025.
I built LearnForge, an AI-powered learning platform that does something no other platform does: it has a conversation with you, figures out what you actually want to learn (not what you think you want), researches the topic in real-time, and then creates a personalized learning journey that adapts to how you learn.
Built on Google Cloud Run and Agent Development Kit (ADK) with 12 specialized AI agents working together, it handles everything from vague “I want to learn AI” statements to structured, adaptive learning missions that remember where you left off—even if you come back weeks later.
The result? A learning experience that feels like having a personal tutor who never forgets, never gets tired, and actually knows what they’re talking about.
The Problem: Why Online Learning is Hard (And How I Fixed It)
Let me paint you a picture. You’re sitting at your computer, motivated, ready to learn something new. You type “learn Python” into a course platform. You get 47 courses. You pick one. Three hours later, you realize:
- You’re learning about data types when you wanted to build web APIs
- The instructor assumes you know nothing, but you’ve coded in JavaScript for years
- The examples are boring and don’t relate to what you actually want to build
- You’re on lesson 12 of 200, and you have no idea if you’re learning the right things
Sound familiar? This is the fundamental problem with online learning: one-size-fits-all doesn’t fit anyone.
The Real Problem
Here’s what’s broken:
1. The Goal Mismatch
When someone says “I want to learn AI,” they could mean:
- Building chatbots for customer service
- Fine-tuning language models for text summarization
- Understanding neural network architectures
- Creating computer vision systems for medical imaging
But traditional platforms give everyone the same “Introduction to AI” course. It’s like asking “Where do you want to go?” and getting the same map regardless of whether you said “the grocery store” or “Mount Everest.”
2. The Experience Gap
A beginner needs hand-holding. An intermediate learner needs to skip the basics. An expert needs advanced concepts. But courses are static; they can’t adapt.
3. The Context Loss
You start a course on Monday. Life happens. You come back on Friday. The platform has no idea:
- What you were learning
- How well you understood it
- Where you left off
- What you were confused about
4. The Content Problem
Most courses are pre-recorded, static content. But learning is dynamic. You might need:
- A different explanation when you’re confused
- Real-world examples relevant to your goals
- Video tutorials when you’re a visual learner
- Step-by-step guides when you prefer structured learning
The Solution: AI That Actually Understands
What if, instead of picking from a catalog, you could just... talk to an AI?
You: “I want to learn AI”
AI: “Great! I checked what’s trending in AI learning. Most people start with either model training, data pipelines, or building AI-powered applications. Which direction do you want to focus on?”
You: “Building applications, I think”
AI: “Got it. Are you thinking of chatbots, image generators, or recommendation systems?”
You: “Chatbots”
AI: “Perfect! So you want to learn AI chatbot development, at an intermediate level, focusing on building real applications. Does that sound right?”
And then comes the magic. The AI:
- Researches the topic using Google Search
- Creates a personalized learning mission with checkpoints
- Guides you through it conversationally
- Adapts content based on your responses
- Remembers everything, even if you come back weeks later
This isn’t a chatbot. This is a multi-agent AI system that coordinates 12 specialized agents to create a complete learning experience.
How It Works: The Magic Behind the Scenes
LearnForge uses Google’s Agent Development Kit (ADK) to orchestrate 12 specialized AI agents. Think of it like a well-coordinated team where each agent has a specific job, but to the user, it feels like talking to one intelligent tutor.
Phase 1: Mission Creation (Meet Polaris)
When you first connect, you meet Polaris, the Pathfinder. Polaris doesn’t just ask questions. It researches your topic in real time so that the questions it asks are informed ones.
You: "I want to learn machine learning"
Polaris: [Searches Google for "machine learning learning paths 2025"]
[Finds that most people focus on: model training, data prep, or deployment]
"When people explore 'machine learning,' they often focus on
model training, data prep, or deployment. Which of these do
you want to master first?"
Behind the scenes, Polaris uses:
- Pathfinder Agent: Conversational goal clarification with research
- Search Agent: Google Search API for real-time topic research
- Mission Curator Agent: Converts your goals into structured learning missions
The result? Instead of a generic “Machine Learning 101” course, you get a mission tailored to your specific goal: “Building Production ML Systems with Python and TensorFlow” or “Data Preparation for Machine Learning Models.”
Phase 2: Learning Execution (Meet Lumina)
Once your mission is created, Lumina takes over. Lumina is your personal learning companion: patient, adaptive, and genuinely helpful.
Lumina guides you through checkpoints (bite-sized learning goals) using a sophisticated multi-agent system:
Lumina Orchestrator (invisible coordinator)
├── Greeter: "Welcome! Let's start your journey..."
├── Flow Briefer: "Next up: Understanding Neural Networks. Ready?"
├── Sensei (your teacher)
│ ├── Content Composer
│ │ ├── Content Searcher: Finds educational articles via Google Search
│ │ ├── Video Selector: Curates YouTube videos (4-20 min, educational)
│ │ └── Content Formatter: Personalizes content for your learning style
│ └── Evaluates your understanding
├── Help Desk: Answers off-topic questions
└── Wrapper: Celebrates your completion
Here’s what makes this special:
1. Content is Generated in Real-Time
When Sensei teaches you about “neural networks,” it doesn’t pull from a static database. Instead:
- Content Searcher finds the latest, most relevant articles
- Video Selector curates educational YouTube videos (filtered by duration, category, relevance)
- Content Formatter adapts everything to your learning style (visual? examples? step-by-step?)
2. It Adapts to Your Understanding
Sensei: "Can you explain how backpropagation works?"
You: "It's like... adjusting weights based on errors?"
Sensei: "You're on track with the error part! Let me clarify the
weight adjustment mechanism..."
[Delegates to Content Composer for a clearer explanation]
[Presents it naturally, as if Sensei knew it all along]
3. It Remembers Everything
This is where it gets interesting. Most chatbots lose context when you close the browser. LearnForge uses Cloud SQL with DatabaseSessionService to persist everything:
- Which checkpoint you’re on
- What content was presented
- How you responded to questions
- What you were confused about
- Your learning preferences
Close your browser, come back next week, switch devices: it all just works.
The Architecture: How 12 Agents Work Together Seamlessly
The technical magic is in the orchestration. LearnForge uses Google ADK’s hierarchical agent system to coordinate specialized agents without the user ever knowing.
Silent Orchestration: The Invisible Hand
The orchestrators are completely invisible. Users never see messages like “Let me hand you over to the Sensei...” Instead, transitions are seamless:
# What the user sees:
Sensei: "Let's explore neural networks! Here's how they work..."
# What's actually happening:
Orchestrator → delegates to Sensei → Sensei delegates to Content Composer
→ Content Composer chains: Searcher → Video Selector → Formatter
→ Content flows back → Sensei presents it naturally
The orchestrator’s instruction is explicit:
root_agent = LlmAgent(
    instruction="""
    YOU MUST NEVER TALK TO THE USER DIRECTLY.
    YOU MUST NEVER ACKNOWLEDGE DELEGATIONS.
    The user should ONLY see responses from sub-agents.
    """
)
Content Authority Separation: Why Teaching Agents Don’t Generate Content
Here’s an insight that improved content quality dramatically: teaching agents shouldn’t generate content. They should delegate to specialized agents.
sensei_agent = LlmAgent(
    instruction="""
    YOU ARE FORBIDDEN FROM CREATING ANY TEACHING CONTENT.
    You must delegate ALL content creation to content_composer_agent.
    YOU CAN:
    - Ask questions
    - Evaluate answers
    - Provide feedback
    YOU CANNOT:
    - Explain concepts yourself
    - Provide examples yourself
    """
)
Why? Because:
- Content Searcher has access to Google Search (real-time, research-backed)
- Video Selector has access to YouTube API (curated, filtered)
- Content Formatter knows your learning preferences
Sensei focuses on pedagogy. Content creation agents focus on quality. Separation of concerns, even for AI.
The DatabaseSessionService Breakthrough: Why This Changes Everything
Here’s where LearnForge diverges from every other AI learning platform I’ve seen.
The Problem Nobody Talks About
Most AI chatbots use in-memory sessions. This works fine for:
- 5-minute conversations
- Simple Q&A
- Demos
But learning is different. Learning sessions can span:
- Hours (deep dive sessions)
- Days (coming back to continue)
- Weeks (long-form courses)
- Months (mastery journeys)
In-memory sessions fail catastrophically:
- Server restart? Session lost.
- Connection drop? Session lost.
- Switch devices? Session lost.
- Come back tomorrow? Session lost.
The Solution: Persistent State with Cloud SQL
LearnForge uses DatabaseSessionService with Cloud SQL (PostgreSQL) to persist everything:
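from google.adk.sessions import DatabaseSessionService
from google.cloud.sql.connector import Connector

# One connector per process; LAZY refreshes credentials on demand
connector = Connector(refresh_strategy="LAZY")

session_service = DatabaseSessionService(
    db_url="postgresql+pg8000://",
    # Every new pooled connection is opened through the connector
    creator=lambda: connector.connect(
        instance_connection_name,
        "pg8000",
        user=db_user,
        password=db_password,
        db=db_name,
    ),
    pool_size=10,       # base pool
    max_overflow=5,     # extra connections under load
    pool_timeout=60,    # seconds to wait for a free connection
    pool_recycle=1800,  # recycle connections after 30 minutes
)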
What gets persisted:
- Current checkpoint index
- Completed checkpoints
- Content search results (so we don’t re-search)
- Video selections (so we remember what was shown)
- User responses and comprehension checks
- Learning preferences
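To make the list above concrete, here is a minimal sketch of what such a state object could look like. Only `current_checkpoint_index` and `completed_checkpoints` appear in the LearnForge snippets in this article; the class name and the other field names are hypothetical, and the sketch assumes the state is stored as JSON.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MissionSessionState:
    """Hypothetical shape of the persisted state (illustration only)."""

    current_checkpoint_index: int = 0
    completed_checkpoints: list[int] = field(default_factory=list)
    # The field names below are assumptions, one per bullet in the list above.
    content_search_results: dict[str, list[str]] = field(default_factory=dict)
    video_selections: dict[str, list[str]] = field(default_factory=dict)
    user_responses: list[dict] = field(default_factory=list)
    learning_preferences: dict[str, str] = field(default_factory=dict)


state = MissionSessionState(learning_preferences={"learning_style": "examples"})
state.completed_checkpoints.append(0)
state.current_checkpoint_index = 1

# Round-trip through JSON, as a database-backed store would on save and load.
serialized = json.dumps(asdict(state))
restored = MissionSessionState(**json.loads(serialized))
```

Keeping every value JSON-friendly (strings, numbers, lists, dicts) is what lets the same state be reloaded later on another instance or device.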
The impact:
You can:
- Close your browser mid-checkpoint and resume exactly where you left off
- Switch from laptop to phone seamlessly
- Come back weeks later and continue your mission
- Share sessions with team members (collaborative learning)
This isn’t just a feature. It’s what makes LearnForge production-ready for real learning, not just demos.
Cloud SQL Connector: Security Without the Headache
Traditional database access requires:
- IP whitelisting (nightmare in serverless)
- VPNs (complex setup)
- Exposed connection strings (security risk)
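Cloud SQL Connector authorizes each connection through IAM and encrypts it in transit, so there is no network configuration to maintain and no connection string to expose. In the snippet below, the database user still signs in with a password, which is loaded from Secret Manager rather than hardcoded.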
def _create_cloud_sql_connection(self):
    return self._connector.connect(
        settings.INSTANCE_CONNECTION_NAME,
        "pg8000",
        user=settings.DB_USER,
        password=settings.DB_PASSWORD,
        db=settings.DB_NAME,
    )
Production-grade security, with no network setup to maintain.
Real-Time WebSocket: The Conversation That Never Lags
Both Polaris and Lumina use WebSocket connections for real-time, bidirectional communication. But here’s what makes it special: session resume.
The Flow
@router.websocket("/ws")
async def mission_ally_websocket(websocket: WebSocket, mission_id: str):
    await websocket.accept()
    user_id = await authenticate(websocket)
    # session_id is resolved elsewhere (not shown in this excerpt)
    # Check for existing session
    existing_session = await session_service.get_session(
        app_name="mission-ally",
        user_id=user_id,
        session_id=session_id,
    )
    if existing_session:
        # Resume from last checkpoint
        current_checkpoint = existing_session.state["current_checkpoint_index"]
        completed = existing_session.state["completed_checkpoints"]
        # Send historical messages, continue from where they left off
    else:
        # Start new mission
        session = await session_service.create_session(...)
If you disconnect and reconnect, the system:
- Loads your session from Cloud SQL
- Sends you historical messages (so you see the conversation)
- Continues from your last checkpoint
- Feels completely seamless
No “start over” button. No lost progress. Just... continue.
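The resume-or-start decision can be sketched without any framework. This is a simplified illustration, not the LearnForge code: `InMemoryStore` and `resume_or_start` are hypothetical names, and the in-memory dict stands in for the Cloud SQL-backed session service.

```python
import asyncio


class InMemoryStore:
    """Stand-in for a persistent session service (hypothetical, illustration only)."""

    def __init__(self):
        self._sessions = {}

    async def get_session(self, user_id, session_id):
        return self._sessions.get((user_id, session_id))

    async def create_session(self, user_id, session_id, state):
        session = {"state": state, "messages": []}
        self._sessions[(user_id, session_id)] = session
        return session


async def resume_or_start(store, user_id, session_id):
    """Return ("resumed", session) if one exists, else create a fresh one."""
    session = await store.get_session(user_id, session_id)
    if session is not None:
        return "resumed", session
    initial_state = {"current_checkpoint_index": 0, "completed_checkpoints": []}
    session = await store.create_session(user_id, session_id, initial_state)
    return "started", session


async def demo():
    store = InMemoryStore()
    first, session = await resume_or_start(store, "u1", "mission-42")
    # Simulate progress, then a disconnect and a reconnect.
    session["state"]["current_checkpoint_index"] = 2
    session["messages"].append("Sensei: Let's explore neural networks!")
    second, resumed = await resume_or_start(store, "u1", "mission-42")
    return first, second, resumed


first, second, resumed = asyncio.run(demo())
```

On reconnect, the stored messages can be replayed to the client before the conversation continues from the saved checkpoint index.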
Content Composition: How AI Curates Your Learning Materials
When Sensei needs to teach you about “neural networks,” it doesn’t pull from a static database. Instead, it orchestrates a three-stage pipeline:
Stage 1: Content Searcher
Uses Google Search API to find the latest, most relevant educational content:
content_searcher = LlmAgent(
    name="lumina_content_searcher",
    tools=[google_search_tool],
    instruction="Search for educational content about the concept..."
)
Real-time search means you get current information, not outdated course materials.
Stage 2: Video Selector
Uses YouTube Data API v3 to curate educational videos:
def search_youtube_videos(
    query: str,
    max_results: int = 3,
    duration_filter: str = "medium",  # 4-20 minutes (optimal for learning)
    video_category_id: str = "27",    # Education category
) -> list[dict]:
    # Filters by: duration, category, relevance
    # Returns: title, channel, description, duration, thumbnail
    ...  # implementation omitted in this excerpt
Why this matters: Not all YouTube videos are educational. Not all educational videos are the right length. The selector finds videos that are:
- Actually educational (category 27)
- The right duration (4-20 min is the sweet spot)
- Relevant to the concept
- From reputable channels
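The first three criteria map directly onto query parameters of the YouTube Data API v3 `search.list` endpoint. Here is a minimal sketch of how such a request could be assembled; `build_search_params` and the placeholder API key are assumptions for illustration, not LearnForge code, and nothing is sent over the network.

```python
from urllib.parse import urlencode

YOUTUBE_SEARCH_URL = "https://www.googleapis.com/youtube/v3/search"


def build_search_params(
    query: str,
    api_key: str,
    max_results: int = 3,
    duration_filter: str = "medium",
    video_category_id: str = "27",
) -> dict:
    """Build search.list query parameters (hypothetical helper)."""
    return {
        "part": "snippet",
        "q": query,
        "type": "video",  # required when filtering by duration or category
        "videoDuration": duration_filter,      # "medium" = 4 to 20 minutes
        "videoCategoryId": video_category_id,  # "27" = Education
        "maxResults": max_results,
        "order": "relevance",
        "key": api_key,
    }


params = build_search_params("neural networks explained", api_key="YOUR_API_KEY")
request_url = f"{YOUTUBE_SEARCH_URL}?{urlencode(params)}"
```

Search results carry snippet metadata only, so exact durations would need a follow-up `videos.list` call with `part=contentDetails`. Channel reputation is not something the API filters on, so that check has to happen in the selector's own logic.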
Stage 3: Content Formatter
Personalizes everything based on your learning profile:
# learning_style: e.g. ["examples", "step-by-step"]
# level: "Beginner" | "Intermediate" | "Advanced"
content_formatter = LlmAgent(
    instruction=f"""
    Format content for:
    - Learning style: {user_profile['learning_style']}
    - Level: {user_profile['level']}
    - Preferences: {user_preferences}
    """
)
If you’re a visual learner who prefers examples, you get examples. If you prefer step-by-step guides, you get structured explanations. The content adapts to you.
SequentialAgent: Chaining It All Together
ADK’s SequentialAgent makes this elegant:
content_composer = SequentialAgent(
    name="lumina_content_composer_agent",
    sub_agents=[
        content_searcher,   # Stage 1: Search
        video_selector,     # Stage 2: Curate videos
        content_formatter,  # Stage 3: Personalize
    ]
)
Each stage passes its output to the next. Clean, simple, powerful.
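To show the pattern itself, here is a plain-Python sketch of a three-stage sequential pipeline. It is not ADK code: the stage functions are hypothetical stand-ins that return placeholder data, and a shared dict plays the role of the state that flows from one stage to the next.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def search_content(state: dict) -> dict:
    # Stand-in for the Content Searcher: records a placeholder article title.
    return {**state, "articles": [f"Intro to {state['concept']}"]}


def select_videos(state: dict) -> dict:
    # Stand-in for the Video Selector: records a placeholder video title.
    return {**state, "videos": [f"{state['concept']} in 10 minutes"]}


def format_content(state: dict) -> dict:
    # Stand-in for the Content Formatter: combines earlier outputs into a lesson.
    lesson = (
        f"[{state['learning_style']}] {state['concept']}: "
        f"{len(state['articles'])} article(s), {len(state['videos'])} video(s)"
    )
    return {**state, "lesson": lesson}


def run_sequential(stages: list[Stage], state: dict) -> dict:
    """Run each stage in order, feeding it the state produced so far."""
    for stage in stages:
        state = stage(state)
    return state


result = run_sequential(
    [search_content, select_videos, format_content],
    {"concept": "neural networks", "learning_style": "examples"},
)
print(result["lesson"])  # [examples] neural networks: 1 article(s), 1 video(s)
```

The ordering matters: the formatter can only personalize what the searcher and selector have already placed in the state.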
Technical Deep Dives: The Decisions That Matter
1. Why Cloud SQL Connector Over Traditional Connections
The problem: In Cloud Run (serverless), you can’t whitelist IPs. Traditional database connections require network configuration.
The solution: Cloud SQL Connector uses IAM credentials. Zero network configuration. Secure by default.
@property
def use_cloud_sql_connector(self) -> bool:
    return self.is_cloud_run and all([
        self.INSTANCE_CONNECTION_NAME,
        self.DB_USER,
        self.DB_PASSWORD,
        self.DB_NAME,
    ])
Impact: Production-ready security without the infrastructure headache.
2. Secret Manager: Zero Hardcoded Credentials
All sensitive configuration comes from Secret Manager:
import os

def _read_secret(env_var: str, default: str = "") -> str:
    value = os.getenv(env_var, default)
    # Cloud Run mounts secrets as files
    if value and os.path.exists(value):
        with open(value) as f:
            return f.read().strip()
    return value
Impact: Credential rotation? Just update the secret. No code changes. No redeploys.
3. Multi-Stage Docker Build: Faster Cold Starts
Optimized container images reduce cold start time:
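# Builder stage: Install dependencies
FROM python:3.11-slim AS builder
RUN pip install poetry
# Install into the system site-packages instead of a virtualenv,
# so the COPY --from=builder below actually picks up the dependencies
ENV POETRY_VIRTUALENVS_CREATE=false
COPY pyproject.toml poetry.lock ./
RUN poetry install --no-root --only main
# Runtime stage: Copy only what's needed
FROM python:3.11-slim
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY . .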
Impact: 60% smaller images, faster cold starts, lower costs.
4. State-Driven Checkpoint Progression
Checkpoints advance through state, not hardcoded logic:
# next_checkpoint_name is resolved elsewhere (not shown in this excerpt)
increment_checkpoint_tool = FunctionTool(
    func=lambda ctx: {
        "current_checkpoint_index": ctx.state["current_checkpoint_index"] + 1,
        "current_checkpoint_goal": next_checkpoint_name,
    }
)
Impact: Mission structures are flexible. Add checkpoints? Remove checkpoints? Change order? It just works.
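A framework-free sketch of the same idea, assuming the mission is just an ordered list of checkpoint names. `advance_checkpoint` and the `mission_complete` flag are hypothetical; the other state keys match the ones used in the snippets in this article.

```python
def advance_checkpoint(state: dict, checkpoints: list[str]) -> dict:
    """Return a new state moved forward by one checkpoint."""
    current = state["current_checkpoint_index"]
    next_index = current + 1
    done = next_index >= len(checkpoints)
    return {
        **state,
        "completed_checkpoints": state.get("completed_checkpoints", []) + [current],
        "current_checkpoint_index": next_index,
        "current_checkpoint_goal": None if done else checkpoints[next_index],
        "mission_complete": done,  # hypothetical flag, for illustration
    }


checkpoints = ["Perceptrons", "Backpropagation", "Training loops"]
state = {
    "current_checkpoint_index": 0,
    "current_checkpoint_goal": checkpoints[0],
    "completed_checkpoints": [],
}

state = advance_checkpoint(state, checkpoints)

# The mission changes mid-flight: a checkpoint is inserted, no code is touched.
checkpoints.insert(2, "Activation functions")
state = advance_checkpoint(state, checkpoints)
```

Because the next goal is looked up from data at the moment of advancing, the learner lands on the newly inserted checkpoint without any change to the progression logic.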
Performance & Scalability: Built for Real Usage
Cloud Run Auto-Scaling
- Min instances: 0 (scale to zero = cost savings)
- Max instances: 100 (handles traffic spikes)
- CPU: 2 vCPU per instance
- Memory: 4Gi per instance
- Concurrency: 80 requests per instance
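What this means: Zero users? The service scales to zero and the Cloud Run compute bill goes with it. Traffic spike? Cloud Run adds instances automatically, up to 100 instances × 80 concurrent requests = 8,000 in-flight requests with these settings. Each open WebSocket holds one of those slots for as long as it stays connected. No manual intervention.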
Database Connection Pooling
- Base pool: 10 connections
- Overflow: 5 connections
- Timeout: 60 seconds
- Recycle: 30 minutes
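What this means: Efficient connection management. Each instance reuses up to 15 connections (10 pooled plus 5 overflow) instead of opening one per request, and recycling them every 30 minutes avoids stale connections.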
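It is worth doing the arithmetic on these settings before relying on them. The sketch below only multiplies the numbers listed above; it assumes one connection pool per Cloud Run instance (one process per container) and says nothing about the connection limit of any particular Cloud SQL tier.

```python
# Settings taken from the lists above
pool_size = 10
max_overflow = 5
max_instances = 100
concurrency = 80

# SQLAlchemy opens at most pool_size + max_overflow connections per pool
connections_per_instance = pool_size + max_overflow

# Worst case if every instance is running and every pool is saturated
max_db_connections = connections_per_instance * max_instances

# Upper bound on requests (including open WebSockets) handled at once
max_concurrent_requests = concurrency * max_instances

print(connections_per_instance, max_db_connections, max_concurrent_requests)
# 15 1500 8000
```

At full scale-out the database could see up to 1,500 connections, so the Cloud SQL instance's `max_connections` setting needs to be checked against that number, or the pool sized down.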
Session State Efficiency
Average session state: ~50KB. That’s:
- Checkpoint progress
- Content search results (cached)
- Video selections
- User responses
Cloud SQL handles this efficiently. Thousands of concurrent sessions? No problem.
Lessons Learned: What I Wish I Knew Earlier
1. DatabaseSessionService Is Non-Negotiable for Production
In-memory sessions work for demos. Production learning platforms need persistence. The moment I switched to DatabaseSessionService, everything changed:
- Users could resume sessions
- Progress tracked across weeks
- Concurrent learners with isolated state
Learning: Choose session storage based on use case duration, not convenience.
2. Silent Orchestration Creates Better UX
Users don’t care about agent architecture. They want a seamless conversation. The orchestrator should be invisible:
# WRONG: User sees the machinery
Orchestrator: "Let me hand you over to the Sensei..."
Sensei: "Hello, let's learn..."
# RIGHT: User only sees the teacher
Sensei: "Hello, let's learn..."
Learning: Hide complexity, show simplicity.
3. Content Authority Separation Improves Quality
Teaching agents shouldn’t generate content. They should delegate to specialized agents with:
- Access to search APIs (real-time content)
- Video curation (filtered, relevant)
- Personalization (user preferences)
Learning: Separation of concerns applies to AI agents too.
4. Cloud SQL Connector Simplifies Security
No IP whitelisting. No VPNs. No exposed connection strings. Just IAM credentials and secure connections.
Learning: Use managed services for security, not manual configuration.
5. SequentialAgent Reduces Boilerplate
Content composition requires: search → video selection → formatting. SequentialAgent chains these automatically. No manual coordination needed.
Learning: ADK’s built-in patterns are powerful. Use them.
6. State-Driven Flow Enables Flexibility
Hardcoded checkpoint logic breaks when mission structures change. State-driven progression adapts automatically.
Learning: Data-driven > code-driven for dynamic systems.
The Impact: What This Actually Solves
For Learners
- Personalized learning paths: No more generic courses
- Real-time adaptation: Content adjusts to your understanding
- Session persistence: Learn at your own pace, resume anytime
- Research-backed content: Latest information, not outdated materials
- Multi-modal learning: Text + videos + interactive teaching
For Educators
- Scalable tutoring: One AI system can teach thousands simultaneously
- Adaptive content: Each learner gets personalized materials
- Progress tracking: See exactly where learners are stuck
- Content curation: AI finds and filters the best resources
For the Industry
- Proof that multi-agent AI works: 12 agents coordinating seamlessly
- Production-ready patterns: DatabaseSessionService, Cloud SQL Connector, WebSocket resume
- Scalable architecture: Cloud Run auto-scaling, connection pooling
- Security best practices: Secret Manager, IAM-based connections
What’s Next: The Future of AI-Powered Learning
Short-term:
- Multi-modal content (images, diagrams, interactive exercises)
- Collaborative learning (team missions, peer review)
- Advanced analytics (learning velocity, concept mastery tracking)
Long-term:
- Fine-tuned models for domain-specific teaching
- Adaptive difficulty based on real-time comprehension
- Integration with external platforms (Coursera, edX, Khan Academy)
The vision: Every learner gets a personal tutor that:
- Understands their goals
- Adapts to their learning style
- Remembers everything
- Never gets tired
- Scales to millions
Conclusion: Why This Matters
Building LearnForge taught me something important: the future of education isn’t about better content, it’s about better personalization.
Traditional platforms give everyone the same course. LearnForge gives everyone a personalized learning journey that:
- Starts with a conversation (not a catalog)
- Adapts in real-time (not static content)
- Remembers everything (not ephemeral sessions)
- Scales seamlessly (not manual infrastructure)
By combining:
- Google ADK’s hierarchical agent orchestration for complex workflows
- Cloud SQL with DatabaseSessionService for persistent state
- Cloud Run’s auto-scaling for seamless scalability
- WebSocket real-time communication for responsive UX
I created a platform that transforms how people learn, from static courses to dynamic, personalized, adaptive journeys.
The technology is here. The infrastructure is ready. The future of education is AI-powered, serverless, and personalized.
LearnForge is just the beginning.
Google Cloud Services Utilized
| Service | Purpose | Why It Matters |
|---|---|---|
| Cloud Run | Serverless container hosting | Auto-scales from 0 to 100 instances, zero infrastructure management |
| Artifact Registry | Container image storage | Versioned images, CI/CD integration |
| Cloud SQL (PostgreSQL) | Persistent session state | Sessions survive restarts, connection drops, device switches |
| Cloud SQL Connector | Secure database connections | IAM-based security, no IP whitelisting needed |
| Firebase Authentication | User authentication | Google OAuth 2.0, secure session management |
| Firestore | Mission data storage | User profiles, mission definitions, enrollments |
| Cloud Logging | Application logs | Centralized logging, debugging, monitoring |
| Cloud Trace | Distributed tracing | Performance analysis, bottleneck identification |
| Secret Manager | Credential storage | Zero hardcoded secrets, rotation-friendly |
| Agent Development Kit (ADK) | Multi-agent orchestration | Hierarchical agents, sequential pipelines, tool integration |
| Gemini 2.5 Flash | LLM for agents | Fast, cost-effective, powerful reasoning |
| YouTube Data API v3 | Video curation | Educational video search, filtering, metadata |
| Google Search API | Content discovery | Real-time research, up-to-date information |
Built with Google Cloud Run and Agent Development Kit (ADK)
Transforming how people learn, one conversation at a time.