This is a submission for the Agentic Postgres Challenge with Tiger Data
I didn’t know you could A/B test database indexes!
Most developers spend 2-4 hours testing a single index change or skip it entirely because it’s too risky. I built Forked A/B Index Optimizer to change that.
Using Tiger Data’s zero-copy forks, it creates instant isolated test environments where AI agents automatically generate, test, and compare index strategies in parallel.
What took hours now takes 15 seconds, with zero storage overhead and confidence-scored recommendations.
The Problem
Database index optimization is stuck in the past:
- Risky: Wrong indexes break performance
- Time-consuming: Testing requires separate environments, benchmarks, result comparison
- Expert-only: Most developers leave it to DBAs
- Guesswork: Even experts rely on intuition over data
The traditional workflow:
- Export production database → 30 minutes
- Set up test environment → 20 minutes
- Create test indexes → 10 minutes
- Run benchmarks (3 iterations) → 45 minutes
- Compare results manually → 15 minutes
- Clean up → 10 minutes
Total: 2-4 hours per attempt. Most developers skip this entirely.
The Solution
Forked A/B Index Optimizer automates the entire process:
- Paste your queries → AI analyzes patterns and generates two competing index strategies
- Zero-copy forks → Creates two isolated test environments in under a second
- Parallel testing → Both strategies tested simultaneously with identical workloads
- Visual comparison → Clear performance charts with confidence-scored recommendations
- Safe deployment → Apply the winning strategy with evidence-backed certainty
Total: 8-15 seconds end-to-end. No manual setup. No production risk. No DBA required.
What Inspired Me
The inspiration came from A/B testing in web development. We routinely A/B test button colors, headlines, and layouts to make data-driven decisions. The same approach should work for database indexes.
Tiger Data’s zero-copy forks create instant isolated test environments with no storage overhead. Combined with AI agents that generate strategies, run tests, and analyze results, this makes database optimization practical for any developer.
Demo Links
- Live Demo: https://forked-a-b-index-optimizer.up.railway.app/
- GitHub Repository: https://github.com/hulyak/forked-A-B-Index-Optimizer
Screenshots
Main Interface
The main interface presents the value proposition and highlights the Tiger Data Agentic Postgres integration.
Agent Coordination in Action
Watch three AI agents coordinate in real-time: the Orchestrator manages workflow, IndexTuner generates strategies, and Validator runs performance tests.
Performance Comparison
Visual comparison shows Strategy B (composite indexes) outperforming Strategy A (single-column indexes) by 43%.
AI Recommendation with Confidence Score
The system provides confidence-scored recommendations with clear reasoning and estimated impact.
Hybrid Search Insights
Hybrid search combines BM25 (pg_textsearch) and vector similarity (pgvector) to find similar optimization patterns from historical data.
Try it now without credentials:
- Toggle “Demo Mode” ON for simulated testing (works without Tiger Cloud)
- Toggle “Demo Mode” OFF to see real Tiger Cloud integration
Use these queries to test the optimizer:
-- Query 1: Simple WHERE clause
SELECT * FROM users WHERE email = 'test@example.com';
-- Query 2: WHERE + ORDER BY
SELECT * FROM orders WHERE status = 'completed' ORDER BY created_at DESC;
-- Query 3: JOIN with filtering
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2024-01-01'
GROUP BY u.name;
-- Query 4: Complex with multiple filters
SELECT *
FROM products
WHERE category = 'electronics'
AND price > 100
AND inventory > 0
ORDER BY price DESC
LIMIT 20;
-- Query 5: Complex JOIN with aggregation
SELECT u.name, u.city, COUNT(o.id) as order_count, SUM(o.total_amount) as total_spent
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2024-01-01'
AND o.status = 'completed'
GROUP BY u.id, u.name, u.city
HAVING COUNT(o.id) > 2
ORDER BY total_spent DESC
LIMIT 50;
-- Query 6: Multiple JOINs
SELECT
p.name as product_name,
p.category,
SUM(oi.quantity) as total_sold,
SUM(oi.quantity * oi.unit_price) as revenue
FROM products p
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
WHERE o.status = 'completed'
AND o.created_at > '2024-01-01'
GROUP BY p.id, p.name, p.category
ORDER BY revenue DESC
LIMIT 25;
How I Used Agentic Postgres
The application architecture is built around Agentic Postgres features: zero-copy forks, Tiger MCP coordination, hybrid search, and the Tiger CLI.
1. Zero-Copy Forks: The Foundation
I use database forks as temporary test environments, not as backup copies. Each optimization creates two forks where different index strategies run tests in parallel, then auto-delete.
How It Works:
Each optimization spawns two isolated forks (Strategy A vs Strategy B). Both start with identical data, run tests in parallel, and auto-delete when done.
// Create competing test environments
const forkA = await tigerService.createFork(`${jobId}-strategy-a`);
const forkB = await tigerService.createFork(`${jobId}-strategy-b`);
// Apply different strategies
await validator.applyIndexStrategy(forkA.connectionString, strategies.strategyA);
await validator.applyIndexStrategy(forkB.connectionString, strategies.strategyB);
// Test both in parallel
const [resultsA, resultsB] = await Promise.all([
validator.runPerformanceTests(forkA.connectionString, queries),
validator.runPerformanceTests(forkB.connectionString, queries)
]);
// Cleanup automatically (even on errors)
await Promise.all([
tigerService.deleteFork(forkA.forkName),
tigerService.deleteFork(forkB.forkName)
]);
Why This Works:
| Traditional Copies | Zero-Copy Forks |
|---|---|
| 30+ minutes to create | <1 second |
| Full storage duplication | $0 storage cost (copy-on-write) |
| Manual cleanup required | Auto-delete after testing |
| Risk of orphaned databases | Crash-safe cleanup |
Fast, free forks mean developers can test index changes without worrying about time or storage costs.
I haven’t seen another tool using database forks for parallel optimization testing.
2. Tiger MCP: Multi-Agent Orchestration
The Setup: One MCP server, 7 specialized tools, 3 coordinating agents.
Instead of one monolithic script, I built three agents that collaborate through structured tools. Each has a clear role and communicates via MCP protocol.
The Agents
Orchestrator Agent
- Manages fork lifecycle (create → test → cleanup)
- Coordinates IndexTuner and Validator
- Compares results from both strategies
- Generates confidence-scored recommendations
IndexTuner Agent
- Parses query patterns (WHERE, JOIN, ORDER BY)
- Generates Strategy A (basic single-column indexes)
- Generates Strategy B (advanced composite/partial indexes)
- Explains the rationale for each index (see the strategy sketch after the agent list)
Validator Agent
- Applies indexes to forks
- Runs EXPLAIN ANALYZE (3 iterations for statistical validity)
- Collects metrics: execution time, planning time, buffer hits, I/O
- Analyzes query execution plans
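To make the Strategy A / Strategy B split concrete, here is a minimal sketch of how IndexTuner could turn a detected WHERE + ORDER BY pattern into competing index definitions. The pattern shape and helper names are illustrative assumptions, not the actual internals.
// Illustrative sketch: the pattern shape (table, whereColumns, orderByColumns)
// is an assumption, not the real IndexTuner data structure.
function generateStrategies(pattern) {
  const { table, whereColumns, orderByColumns } = pattern;
  // Strategy A: one basic single-column index per filtered column
  const strategyA = whereColumns.map(col => ({
    ddl: `CREATE INDEX idx_${table}_${col} ON ${table}(${col})`,
    rationale: `Single-column index for the WHERE filter on ${col}`
  }));
  // Strategy B: one composite index so the filter and the sort share a single scan
  const cols = [...whereColumns, ...orderByColumns];
  const strategyB = [{
    ddl: `CREATE INDEX idx_${table}_${cols.join('_')} ON ${table}(${cols.join(', ')})`,
    rationale: 'Composite index covering the WHERE columns plus the ORDER BY column'
  }];
  // Identifiers are sanitized before any DDL is built (see Challenge 4 below)
  return { strategyA, strategyB };
}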
The MCP Tools
Seven specialized operations the agents coordinate through:
tools: [
'create_fork', // Spin up isolated environments
'delete_fork', // Clean up resources
'list_forks', // Enumerate available forks
'run_query', // Execute SQL on specific forks
'explain_analyze', // Collect performance metrics
'create_index', // Apply strategies
'drop_index' // Remove indexes
]
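Under the hood, each tool name maps to a handler that either shells out to Tiger CLI or runs SQL on a specific fork. Here is a minimal sketch of that mapping; the handler wiring is simplified and the helper names are assumptions, since the real server registers these through the MCP protocol.
// Simplified sketch of the tool-to-handler mapping; names are assumptions.
const toolHandlers = {
  create_fork:     ({ forkName }) => tigerService.createFork(forkName),
  delete_fork:     ({ forkName }) => tigerService.deleteFork(forkName),
  list_forks:      () => tigerService.listForks(),
  run_query:       ({ connectionString, sql }) => tigerService.executeQuery(connectionString, sql),
  explain_analyze: ({ connectionString, sql }) =>
    tigerService.executeQuery(connectionString, `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ${sql}`),
  create_index:    ({ connectionString, ddl }) => tigerService.executeQuery(connectionString, ddl),
  drop_index:      ({ connectionString, indexName }) => // indexName is sanitized first
    tigerService.executeQuery(connectionString, `DROP INDEX IF EXISTS ${indexName}`)
};

// Agents invoke tools by name; every call also feeds the real-time activity log.
async function callTool(agent, tool, args) {
  logActivity(agent, `${tool} ${JSON.stringify(args)}`);
  return toolHandlers[tool](args);
}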
Real-Time Transparency
Users see each step as agents coordinate:
Orchestrator: Creating fork optimization_abc123-strategy-a
IndexTuner: Analyzing 3 queries for optimization patterns
IndexTuner: Detected WHERE + ORDER BY pattern → composite index strategy
Validator: Running EXPLAIN ANALYZE (iteration 1/3)
Validator: Strategy A: 156.32ms avg execution time
Validator: Strategy B: 98.45ms avg execution time (37% improvement)
Orchestrator: Recommendation - Apply Strategy B (confidence: 92%)
Showing the process helps users understand how the system reaches recommendations.
3. Hybrid Search: Pattern Recognition
The Approach: Combine lexical search (BM25) with semantic search (vectors) to find relevant historical patterns.
Most tools do keyword search or semantic search. I fused both to get better results than either alone.
BM25 Text Search (pg_textsearch)
Find exact keyword matches in query patterns:
SELECT pattern, confidence
FROM optimization_history
WHERE to_tsvector('english', query_pattern) @@
to_tsquery('english', 'WHERE & ORDER_BY & composite_index')
ORDER BY ts_rank(to_tsvector('english', query_pattern),
to_tsquery('english', 'WHERE & ORDER_BY & composite_index')) DESC
LIMIT 5;
Strengths: Precise keyword matching, fast execution
Weaknesses: Misses semantic similarity ("sort by" vs "order by")
Vector Similarity (pgvector)
Find semantically similar strategies:
SELECT pattern, confidence, embedding <-> query_embedding AS distance
FROM optimization_history
ORDER BY embedding <-> query_embedding
LIMIT 5;
Strengths: Understands semantic meaning, finds related concepts
Weaknesses: Can be too broad, slower than BM25
Fusion Scoring
Combine both approaches with weighted scoring:
const fusionResults = {
similarPatterns: [
{
pattern: 'WHERE + ORDER BY optimization',
confidence: 0.92,
source: 'pg_textsearch' // High keyword match
},
{
pattern: 'Composite index benefits',
confidence: 0.87,
source: 'pgvector' // Semantic similarity
},
{
pattern: 'Partial index selectivity',
confidence: 0.79,
source: 'hybrid_fusion' // Combined score
}
],
recommendations: [
'Covering indexes show 23% better selectivity for filtered queries',
'Composite indexes reduce random I/O by 45% for multi-column filters'
]
};
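The fusion score itself is a weighted blend of the two result sets. A minimal sketch of that step, assuming both searches return scores normalized to 0..1; the weights and field names are illustrative.
// Illustrative fusion step; weights, field names, and normalization are assumptions.
function fuseResults(bm25Results, vectorResults, wText = 0.5, wVector = 0.5) {
  const byPattern = new Map();
  // ts_rank scores, pre-normalized to 0..1
  for (const r of bm25Results) {
    byPattern.set(r.pattern, { pattern: r.pattern, score: wText * r.rank, source: 'pg_textsearch' });
  }
  // pgvector distances converted to a 0..1 similarity
  for (const r of vectorResults) {
    const similarity = 1 / (1 + r.distance);
    const hit = byPattern.get(r.pattern);
    if (hit) {
      hit.score += wVector * similarity;
      hit.source = 'hybrid_fusion'; // found by both searches
    } else {
      byPattern.set(r.pattern, { pattern: r.pattern, score: wVector * similarity, source: 'pgvector' });
    }
  }
  return [...byPattern.values()].sort((a, b) => b.score - a.score).slice(0, 5);
}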
The Result: Users get contextual insights like “This pattern matched 92% of successful WHERE + ORDER BY optimizations in our database.”
4. Tiger CLI: Developer Experience
The Integration: CLI operations embedded in application runtime, not just setup.
Most tools use CLIs for initial setup. I use Tiger CLI programmatically during every optimization.
Development Workflow
# Initial setup
tiger service create --name forked-ab-optimizer-db
tiger db connect forked-ab-optimizer-db < data/sample-schema.sql
# Testing fork operations
tiger service fork forked-ab-optimizer-db test-fork-1
tiger service list # Verify fork created
tiger service delete test-fork-1 # Test cleanup
Production Runtime
Every fork operation calls Tiger CLI programmatically:
// execAsync is util.promisify(exec) from node:child_process
async createFork(forkName) {
const command = `${this.tigerCLIPath} service fork ${this.serviceName} ${forkName}`;
const { stdout, stderr } = await execAsync(command);
if (stderr) {
throw new Error(`Fork creation failed: ${stderr}`);
}
return {
forkName,
connectionString: this.extractConnectionString(stdout),
created: new Date().toISOString()
};
}
async deleteFork(forkName) {
const command = `${this.tigerCLIPath} service delete ${forkName}`;
await execAsync(command);
return { deleted: true, forkName };
}
Key Point: The CLI runs during optimizations, not just setup. Every fork creation, query execution, and cleanup uses Tiger CLI under the hood.
5. Performance: The Speed Multiplier
What I Measured (average across 50+ test runs):
| Operation | Time | Notes |
|---|---|---|
| Fork creation | 0.8s | Both forks created |
| Strategy generation | 2.3s | AI analyzes patterns |
| Performance testing | 6.5s | 3 runs per strategy |
| Result comparison | 0.4s | Statistical analysis |
| Total | 10s | End-to-end optimization |
Traditional Approach Comparison:
| Step | Traditional | Our Tool | Savings |
|---|---|---|---|
| Environment setup | 30 min | 1 sec | 99.9% |
| Index application | 5 min | 2 sec | 99.3% |
| Performance testing | 45 min | 7 sec | 99.7% |
| Cleanup | 10 min | 1 sec | 99.8% |
| Total | 90-150 min | 10 sec | 99.8% |
10-second test cycles let developers:
- Test 5 query variations in 1 minute
- Add index optimization to CI/CD pipelines
- Experiment without worrying about wasted time
- Make decisions based on real performance data
What Worked Well
1. Zero-Copy Fork Speed
First fork creation: 0.8 seconds. Expected some delay - there wasn’t any.
Traditional database copies take 10-60 minutes. Zero-copy forks are instant. This single feature made the entire project viable.
2. MCP Protocol Quality
Expected multi-agent coordination to require complex message passing. MCP’s structured tool system made it straightforward:
- Each agent knows its available tools
- Protocol handles serialization
- Error handling is built-in
- Debugging is clear (can inspect tool calls)
3. Demo Mode Design
Built demo mode as a fallback for missing credentials. It serves multiple purposes:
- Quick prototyping without database setup
- Hackathon demonstrations
- UI testing without consuming resources
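Wiring the toggle is essentially a service swap. A minimal sketch, where the mock simply mirrors the shape of tigerService; the mock itself is an assumption, not the shipped code.
// Sketch only: return the real service or an in-memory stand-in, depending on the toggle.
function getTigerService({ demoMode }) {
  if (!demoMode) return tigerService; // real Tiger Cloud via Tiger CLI

  return {
    createFork: async (forkName) => ({
      forkName,
      connectionString: 'postgres://demo', // never actually dialed in demo mode
      created: new Date().toISOString()
    }),
    deleteFork: async (forkName) => ({ deleted: true, forkName })
  };
}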
4. Real-Time Activity Logs
The UI shows live agent activity as tests run:
Orchestrator: Creating fork abc-strategy-a
IndexTuner: Analyzing query patterns
IndexTuner: Generated composite index for WHERE + ORDER BY
Validator: Running tests (iteration 1/3)
Orchestrator: Strategy B is 43% faster, high confidence (92%)
This transparency helps users understand the recommendation process.
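The feed is just agent events pushed to the browser as they happen. A minimal sketch using server-sent events on the Express backend; the transport here is my assumption, and the real app may stream these differently.
// Sketch of a live activity feed over server-sent events (transport is an assumption).
const clients = new Set();

app.get('/api/activity', (req, res) => {
  res.set({ 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', Connection: 'keep-alive' });
  res.flushHeaders();
  clients.add(res);
  req.on('close', () => clients.delete(res));
});

// Agents call this whenever they start or finish a step
function logActivity(agent, message) {
  const payload = JSON.stringify({ agent, message, at: Date.now() });
  for (const res of clients) res.write(`data: ${payload}\n\n`);
}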
Challenges & Solutions
Challenge 1: SQL Parsing Complexity
Problem: Extracting column names from WHERE, JOIN, ORDER BY using regex is fragile.
Attempted Solution: Use a full SQL parser library
Result: Too heavy (3MB+ bundle), slow, overkill for basic patterns
Final Solution: Pattern matching with documented limitations
extractColumnsFromWhere(query) {
// Handles: WHERE col = value AND col2 > 5
// Doesn't handle: Complex subqueries, CTEs, window functions
const whereMatch = query.match(/WHERE\s+(.+?)(?:ORDER|GROUP|LIMIT|$)/i);
if (!whereMatch) return [];
const columns = whereMatch[1].match(/\b([a-z_][a-z0-9_]*)\s*[=><!]/gi);
return columns ? columns.map(col => col.replace(/\s*[=><!].*/, '').trim()) : [];
}
Learning: Ship working code for 90% of cases. Document limitations. Let users tell you what’s broken.
Coverage: Handles 90% of common query patterns. Production version would add pgsql-parser.
Challenge 2: Race Conditions in Fork Cleanup
Problem: If fork creation failed, cleanup code crashed trying to delete non-existent forks.
Bad Approach:
// ❌ This crashes if forkA is undefined
await tigerService.deleteFork(forkA.forkName);
Fixed Approach:
// ✅ Graceful cleanup even in error paths
if (forkA && forkA.forkName) {
try {
await tigerService.deleteFork(forkA.forkName);
} catch (cleanupError) {
console.error(`Cleanup failed for forkA: ${cleanupError.message}`);
// Don't throw - log and continue
}
}
Learning: Error paths need as much care as happy paths. Test failure scenarios explicitly.
Challenge 3: Connection Pooling Strategy
Problem: Should I pool connections to forks? Each fork has a unique connection string.
Considered Approaches:
- Pool per fork → Too much memory overhead
- Single global pool → Doesn’t work (different connection strings)
- No pooling → Poor performance for main database
Final Solution: Hybrid approach
async executeQuery(connectionString, query) {
// Use pool for base database (frequent, long-lived)
if (this.pool && connectionString === this.baseConnectionString) {
return await this.pool.query(query);
}
// Individual client for forks (infrequent, short-lived)
const client = new Client({ connectionString }); // Client from the 'pg' (node-postgres) package
try {
await client.connect();
return await client.query(query);
} finally {
await client.end();
}
}
Learning: Don’t force one pattern everywhere. Use the right tool for each use case.
Challenge 4: SQL Injection Prevention
Problem: Users input SQL queries. How do I prevent injection when building index DDL from them?
Vulnerable Code:
// ❌ NEVER DO THIS
const sql = `CREATE INDEX idx_${tableName}_${column} ON ${tableName}(${column})`;
Secure Implementation:
sanitizeIdentifier(identifier) {
if (!identifier || typeof identifier !== 'string') {
throw new Error('Invalid identifier');
}
// Only alphanumeric and underscores
const sanitized = identifier.replace(/[^a-zA-Z0-9_]/g, '');
// Can't start with number
if (/^[0-9]/.test(sanitized)) {
throw new Error('Identifier cannot start with number');
}
// Block SQL keywords
const reserved = ['SELECT', 'INSERT', 'UPDATE', 'DELETE', 'DROP', 'CREATE', 'ALTER'];
if (reserved.includes(sanitized.toUpperCase())) {
throw new Error('Cannot use SQL reserved keyword');
}
if (sanitized.length === 0) {
throw new Error('Identifier empty after sanitization');
}
return sanitized;
}
Learning: Security matters in prototypes. Build it right from day one.
Challenge 5: Statistical Reliability
Problem: Single query runs showed inconsistent results (±30% variance) due to:
- OS-level caching
- Network latency fluctuations
- PostgreSQL query planner randomness
V1 Approach: Run query once
const result = await explainAnalyze(query);
return result.executionTime; // ❌ High variance
V2 Approach: Run 3 times, take average
const runs = [];
for (let i = 0; i < 3; i++) {
const result = await explainAnalyze(query);
runs.push(result.executionTime);
}
const avgTime = runs.reduce((sum, t) => sum + t, 0) / runs.length;
const variance = Math.max(...runs) - Math.min(...runs);
return { avgTime, variance, runs };
Results:
- Variance reduced from ±30% to ±5%
- Confidence scores more accurate
- Users trust results more
Learning: Real-world performance testing needs statistical rigor, not single samples.
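The confidence number falls out of the same runs. Here is a minimal sketch of how improvement and variance could be combined into a score; the formula and constants are illustrative, not the shipped math.
// Illustrative confidence scoring; the formula and constants are assumptions.
function scoreRecommendation(resultsA, resultsB) {
  const improvement = (resultsA.avgTime - resultsB.avgTime) / resultsA.avgTime; // > 0 means B is faster
  const noise = Math.max(resultsA.variance / resultsA.avgTime,
                         resultsB.variance / resultsB.avgTime);                 // relative run-to-run spread

  // Bigger wins raise confidence, noisier measurements lower it
  const confidence = Math.max(0, Math.min(0.99, 0.5 + Math.abs(improvement) - noise));

  return {
    winner: improvement > 0 ? 'Strategy B' : 'Strategy A',
    improvementPct: Math.round(Math.abs(improvement) * 100),
    confidencePct: Math.round(confidence * 100)
  };
}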
Tech Stack
- Database: Agentic Postgres (Tiger Cloud)
- Agent Coordination: Tiger MCP with 7 custom tools
- Backend: Node.js 18 + Express.js
- Frontend: React 18 + Vite + Recharts
- Search: pg_textsearch (BM25) + pgvector (embeddings)
- CLI: Tiger CLI (programmatic integration)
- Deployment: Docker Compose / Railway
For Judges: Quick Test (2 minutes)
- Visit live demo: https://forked-a-b-index-optimizer.up.railway.app/
- Toggle Demo Mode OFF (to see real Tiger Cloud)
- Paste this query:
SELECT u.name, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2024-01-01'
AND o.status = 'completed'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 50;
- Click “Start A/B Optimization”
- Watch agents coordinate in real-time
- Review performance comparison (expect 25-45% improvement)
No authentication required - Ready to test immediately!
Final Thoughts
Building Forked A/B Index Optimizer showed me that databases can work with developers rather than against them: databases that understand intent, let you test safely, and back recommendations with evidence.
Tiger Data’s Agentic Postgres enables this. Zero-copy forks, MCP tools, and hybrid search are building blocks for new database applications.
I started out wanting to build something useful and learned that good infrastructure enables creativity. Remove the friction (slow copies, risky changes, complex setup), and developers can solve real problems.
That’s what Agentic Postgres does.