RLLM: Recursive Large Language Models (TypeScript)
A TypeScript implementation of Recursive Language Models for processing large contexts with LLMs.
Inspired by Cloudflare's Code Mode approach.
Key differences from the Python version:
- V8 isolates instead of subprocess/TCP
- Zod schema support for typed context
- TypeScript-native
Installation
pnpm add rllm
# or
npm install rllm
Quick Start
LLM writes JavaScript code that runs in a secure V8 isolate:
import { createRLLM } from 'rllm';
const rlm = createRLLM({
model: 'gpt-4o-mini',
verbose: true,
});
// Full RLM completion - prompt first, context in options
const result = await rlm.completion(
"What are the key findings in this research?",
{ context: hugeDocument }
);
console.log(result.answer);
console.log(`Iterations: ${result.iterations}, Sub-LLM calls: ${result.usage.subCalls}`);
Structured Context with Zod Schema
For structured data, you can provide a Zod schema. The LLM will receive type information, enabling it to write better code:
import { z } from 'zod';
import { createRLLM } from 'rllm';
// Define schema for your data
const DataSchema = z.object({
users: z.array(z.object({
id: z.string(),
name: z.string(),
role: z.enum(['admin', 'user', 'guest']),
activity: z.array(z.object({
date: z.string(),
action: z.string(),
})),
})),
settings: z.record(z.string(), z.boolean()),
});
const rlm = createRLLM({ model: 'gpt-4o-mini' });
const result = await rlm.completion(
"How many admin users are there? What actions did they perform?",
{
context: myData,
contextSchema: DataSchema, // LLM sees the type structure!
}
);
The LLM will know it can access context.users, context.settings, etc. with full type awareness.
For a large unstructured string context (like the huge document in the Quick Start), the LLM will write code like:
// LLM-generated code runs in V8 isolate
const chunks = [];
for (let i = 0; i < context.length; i += 50000) {
chunks.push(context.slice(i, i + 50000));
}
const findings = await llm_query_batched(
chunks.map(c => `Extract key findings from:\n${c}`)
);
const summary = await llm_query(`Combine findings:\n${findings.join('\n')}`);
print(summary);
FINAL(summary);
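For the typed context from the Zod example, the generated code can work with the object directly instead of chunking a string. A hypothetical sketch (the exact code the model produces will vary):

// Hypothetical LLM-generated code against the typed context above
const admins = context.users.filter(u => u.role === 'admin');
const lines = admins.map(u => {
  const actions = u.activity.map(a => `${a.date}: ${a.action}`).join('; ');
  return `${u.name}: ${actions}`;
});
const answer = `There are ${admins.length} admin users.\n${lines.join('\n')}`;
print(answer);
FINAL(answer);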
API Reference
createRLLM(options)
Create an RLLM instance with sensible defaults.
const rlm = createRLLM({
model: 'gpt-4o-mini', // Model name
provider: 'openai', // 'openai' | 'anthropic' | 'openrouter' | 'custom'
apiKey: process.env.KEY, // Optional, uses env vars by default
baseUrl: undefined, // Optional, required for 'custom' provider
verbose: true, // Enable logging
});
Custom Provider (OpenAI-Compatible APIs)
Use the custom provider to connect to any OpenAI-compatible API (e.g., vLLM, Ollama, LM Studio, Azure OpenAI):
const rlm = createRLLM({
provider: 'custom',
model: 'llama-3.1-8b',
baseUrl: 'http://localhost:8000/v1', // Required for custom provider
apiKey: 'your-api-key', // Optional, depends on your API
verbose: true,
});
Note: When using provider: 'custom', the baseUrl parameter is required. An error will be thrown if it's not provided.
RLLM Methods
| Method | Description |
|---|---|
| `rlm.completion(prompt, options)` | Full RLM completion with code execution |
| `rlm.chat(messages)` | Direct LLM chat |
| `rlm.getClient()` | Get underlying LLM client |
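`rlm.chat()` skips the sandbox loop and sends messages straight to the model, and `rlm.getClient()` exposes the configured client for lower-level use. A sketch (the message shape assumes the usual role/content convention; check the package typings for the exact return type):

// Direct chat: no code execution, no recursion
const reply = await rlm.chat([
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'In one sentence, what is a Recursive Language Model?' },
]);
console.log(reply);

// Lower-level access to the underlying LLM client
const client = rlm.getClient();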
CompletionOptions
| Option | Type | Description |
|---|---|---|
| `context` | `string \| T` | The context data to process (a raw string or a typed object) |
| `contextSchema` | `ZodType<T>` | Optional Zod schema describing the context structure |
Sandbox Bindings
The V8 isolate provides these bindings to LLM-generated code:
| Binding | Description |
|---|---|
| `context` | The loaded context data |
| `llm_query(prompt, model?)` | Query sub-LLM |
| `llm_query_batched(prompts, model?)` | Batch query sub-LLMs |
| `FINAL(answer)` | Return final answer |
| `FINAL_VAR(varName)` | Return variable as final answer |
| `print(...)` | Console output |
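Inside the isolate these bindings are plain globals, so generated code calls them directly. A minimal sketch (assuming a plain string context) that exercises them, including FINAL_VAR, which names a variable rather than passing a value; the prompts are illustrative only:

// Runs inside the V8 isolate; `context` (a string here) is already injected
print('Context length:', context.length);

// Single sub-LLM call
const topic = await llm_query(`Give a one-line topic label for:\n${context.slice(0, 2000)}`);

// Fan out several prompts at once
const [tone, audience] = await llm_query_batched([
  `Describe the likely tone of a document about: ${topic}`,
  `Who is the likely audience for a document about: ${topic}`,
]);

const report = `Topic: ${topic}\nTone: ${tone}\nAudience: ${audience}`;
FINAL_VAR('report'); // return the `report` variable as the final answer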
Architecture
┌────────────────────────────────────────────────────────────────┐
│                        RLLM TypeScript                         │
│                                                                │
│   ┌─────────────┐      ┌──────────────────────────────────┐    │
│   │    RLLM     │      │       V8 Isolate (Sandbox)       │    │
│   │    Class    │─────▶│                                  │    │
│   └─────────────┘      │  • context (injected data)       │    │
│          │             │  • llm_query()  ──────────────┐  │    │
│          │             │  • llm_query_batched()        │  │    │
│          ▼             │  • print() / console          │  │    │
│   ┌─────────────┐      │  • FINAL() / FINAL_VAR()      │  │    │
│   │  LLMClient  │◀─────┼───────────────────────────────┘  │    │
│   │  (OpenAI)   │      │                                  │    │
│   └─────────────┘      │  LLM-generated JS code runs here │    │
│                        └──────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────┘
No TCP. No subprocess. Direct function calls via bindings.
Why V8 Isolates? (Not TCP/Containers)
The Python RLLM uses a subprocess plus TCP sockets for code execution. We use V8 isolates instead (a minimal binding sketch follows the benefits list below):
Python RLLM: LLM → Python exec() → subprocess → TCP socket → LMHandler
TypeScript: LLM → V8 isolate (same process) → direct function calls
Benefits:
- No TCP/network - Direct function calls via bindings
- Fast startup - Isolates spin up in milliseconds
- Secure - V8's built-in memory isolation
- Simple - No containers, no socket servers
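To make "direct function calls via bindings" concrete, here is a minimal, hypothetical sketch of exposing a host function to an isolate, assuming the isolated-vm package; the real RLLM wiring (async llm_query plumbing, context injection, result capture) is more involved:

import ivm from 'isolated-vm';

// Spin up an isolate with a memory cap and create a fresh context
const isolate = new ivm.Isolate({ memoryLimit: 128 });
const sandbox = await isolate.createContext();

// Expose a host function as a global binding: no sockets, no subprocess
await sandbox.global.set(
  'print',
  new ivm.Callback((...args: unknown[]) => console.log('[sandbox]', ...args)),
);

// LLM-generated code is evaluated inside the isolate the same way
await sandbox.eval(`print('hello from inside the isolate')`);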
Development
# Install dependencies
pnpm install
# Build
pnpm build
# Run example
pnpm example
# Run tests
pnpm test
License
MIT - Same as the original Python RLLM.
Credits
Based on the Recursive Language Models paper and Python implementation by Alex Zhang et al.
Reference: RLM Blogpost