# GLM Enhanced Proxy
HTTP proxy server that transforms Anthropic Messages API requests to Z.ai GLM-4.7 API format, enabling Claude-compatible tools and applications to use GLM models.
## Why GLMProxy?

### The Problem with GLM-4.7
GLM-4.7 is a powerful model, but it has limitations when used directly:
| Limitation | Impact |
|---|---|
| No web search | Can't access current information or search the web |
| No reasoning mode | Lacks step-by-step thinking like Claude's extended thinking |
| Manual model switching | Developers must manually route between text/vision models |
| Limited tool ecosystem | Official docs state "does not support custom tools" |
| Complex integration | Each AI tool needs custom GLM API integration |
### How GLMProxy Solves These
| Problem | Solution |
|---|---|
| No web search | MCP web_search/web_reader injection; intercepts Claude Code's native WebSearch/WebFetch |
| No reasoning | Automatic reasoning prompt injection with `<reasoning_content>` parsing to thinking blocks |
| Manual model switching | Auto-detects images/video in current message → routes to glm-4.6v, switches back to glm-4.7 for text |
| Limited tools | Dynamic MCP registry - add Playwright, Context7, or any MCP server via dashboard |
| Complex integration | Drop-in Anthropic API compatibility - works with any tool that supports custom base URLs |
### Real Benefits
- Claude Code users: Get web search without an Anthropic subscription
- Vision tasks: Automatic model switching - no manual configuration
- Reasoning: Step-by-step thinking blocks for complex problems
- Extensible: Add your own MCP servers for specialized tools
- Zero code changes: Point your tools at `http://127.0.0.1:4567` and go
## Features
- Web Dashboard: Settings panel and MCP management (vanilla JS, no dependencies)
- Smart Backend Routing: Automatically routes text requests via Anthropic endpoint and vision requests via OpenAI endpoint for optimal results
- API Translation: Transparent conversion between Anthropic Messages API and OpenAI-compatible GLM API
- Intelligent Model Selection: Automatic selection of text (glm-4.7) or vision (glm-4.6v) models based on current message content
- Video Analysis: Full video support with automatic file path detection - just mention a video file and it's analyzed
- Reasoning Injection: Automatic reasoning prompt injection for step-by-step thinking with `<reasoning_content>` tag parsing
- Tool Execution: Internal tool loop for web_search and web_reader via Z.ai MCP servers, plus automatic interception of Claude Code's native WebSearch/WebFetch tools
- Client Tools: Pass-through support for client-defined tools
- Streaming: Full SSE streaming support for both backend paths
- Production Ready: Structured logging, error handling, graceful shutdown
## Quick Start

### Prerequisites
- Node.js 18.0.0 or later
- Z.ai API key (get one at https://z.ai)
- Claude Code CLI (optional, for the `ccglm` command)
### Installation
```bash
# Clone and install
git clone <repository-url>
cd glmproxy
npm install

# Install globally for ccglm command
npm install -g .
```
### Configuration

#### Setting Up API Keys

API keys are configured via a `.env` file in the project root. This file is automatically loaded on startup.

Create your `.env` file:
```bash
# Copy the example file
cp .env.example .env

# Edit with your API key
nano .env  # or use your preferred editor
```
Required keys:
| Variable | Description |
|---|---|
| `ZAI_API_KEY` | Your Z.ai API key (required) - get one at https://z.ai |
Optional keys for MCP servers:
| Variable | Description |
|---|---|
| `REF_API_KEY` | API key for Ref Tools MCP (documentation search) |
| `CONTEXT7_API_KEY` | API key for Context7 MCP (library docs lookup) |
Security best practices:
- Never commit `.env` to git - it is already in `.gitignore`
- Never share API keys - treat them like passwords
- Use environment variables in CI/CD - don't store keys in code or config files
- Rotate keys periodically - regenerate if you suspect exposure
- Dashboard API key entry - keys entered via the web UI are saved to `.env` automatically
If you don't have a `.env.example` file, create `.env` manually:
```bash
# .env
ZAI_API_KEY=your_api_key_here

# Optional: Server configuration
# PORT=4567
# HOST=127.0.0.1
# LOG_LEVEL=info
```
Alternatively, set the environment variable directly in your shell:
```bash
export ZAI_API_KEY="your-api-key-here"
```
### Running with CLI (Recommended)
The easiest way to use the proxy:
```bash
# Start proxy and launch Claude Code in one command
ccglm

# Skip permission prompts (use with caution)
ccglm yolo

# Open the web dashboard to configure settings
ccglm ui

# Check proxy status
ccglm status
```
### Running Manually
```bash
# Start proxy server
npm start

# Development (with auto-reload)
npm run dev
```

The proxy will start on `http://127.0.0.1:4567` by default.
### Access the Dashboard

Open your browser and navigate to `http://127.0.0.1:4567/`.
You'll see the settings dashboard where you can:
- Configure your Z.ai API key in the Settings panel
- Select endpoint mode (Anthropic, OpenAI, BigModel)
- Toggle features like web search, reasoning, and streaming
- Manage custom MCP servers
### Test the connection
```bash
# Health check
curl http://127.0.0.1:4567/health

# Simple request
curl -X POST http://127.0.0.1:4567/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, what model are you?"}
    ]
  }'
```
## Configuration Options
All configuration is done via environment variables:
| Variable | Default | Description |
|---|---|---|
| `ZAI_API_KEY` | (required) | Your Z.ai API key |
| `PORT` | `4567` | Server port |
| `HOST` | `127.0.0.1` | Server host |
| `LOG_LEVEL` | `info` | Logging level: `debug`, `info`, `warn`, `error` |
| `ZAI_BASE_URL` | `https://api.z.ai/api/paas/v4/chat/completions` | GLM API endpoint (OpenAI path) |
| `ZAI_ANTHROPIC_URL` | `https://api.z.ai/api/anthropic/v1/messages` | GLM API endpoint (Anthropic path) |
| `STREAMING_ENABLED` | `false` | Enable SSE streaming for responses |
| `STREAMING_CHUNK_SIZE` | `20` | Characters per streaming chunk |
| `STREAMING_CHUNK_DELAY` | `0` | Delay between chunks (ms) |
| `USE_ANTHROPIC_ENDPOINT` | `true` | Use native Anthropic-compatible endpoint for text requests |
| `WEB_SEARCH_ENABLED` | `true` | Enable web_search/web_reader tools and Claude Code tool interception |
## CLI Reference

The `ccglm` command provides a convenient way to use the proxy:
| Command | Description |
|---|---|
| `ccglm` | Start proxy and launch Claude Code |
| `ccglm yolo` | Same as above, with `--dangerously-skip-permissions` |
| `ccglm ui` | Open the web dashboard in browser |
| `ccglm start` | Start proxy server in foreground |
| `ccglm stop` | Stop background proxy server |
| `ccglm status` | Check if proxy is running |
| `ccglm activate` | Print shell exports for manual use |
| `ccglm help` | Show help message |
### What `ccglm` does

When you run `ccglm`, it:
- Starts the proxy server in the background (if not already running)
- Sets environment variables to route Claude Code through the proxy:
  - `ANTHROPIC_BASE_URL` → proxy URL
  - `ANTHROPIC_AUTH_TOKEN` → dummy token (proxy uses your `ZAI_API_KEY`)
  - `ANTHROPIC_DEFAULT_*_MODEL` → glm-4.7 for all model tiers
- Launches Claude Code
Use `ccglm yolo` to skip permission prompts.
### Examples
```bash
# Start proxy + Claude Code
ccglm

# Skip permission prompts (use with caution)
ccglm yolo

# Open settings UI to configure API key and features
ccglm ui

# Use with shell activation (for advanced users)
eval $(ccglm activate)
claude
```
## Usage with AI Tools

### Claude Code
The easiest way (using `ccglm`):

```bash
ccglm
```
Or configure manually:
```bash
# In your shell config (.bashrc, .zshrc, etc.)
export ANTHROPIC_BASE_URL="http://127.0.0.1:4567"
```
Or in the Claude Code settings, set the API base URL to `http://127.0.0.1:4567`.
### Other AI Coding Tools
Any tool that supports the Anthropic Messages API with a custom base URL can use this proxy. Simply configure:
- Base URL: `http://127.0.0.1:4567`
- API Key: Any value (the proxy uses your configured `ZAI_API_KEY`)
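For example, a minimal sketch of pointing the official Anthropic Node SDK at the proxy (assumes `@anthropic-ai/sdk` is installed; the model name is just an alias the proxy maps to GLM):

```javascript
// Minimal sketch: route the Anthropic Node SDK through the local proxy.
// The apiKey is a placeholder -- the proxy substitutes your ZAI_API_KEY upstream.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://127.0.0.1:4567", // the proxy, not api.anthropic.com
  apiKey: "not-used-by-proxy",
});

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514", // served by glm-4.7 behind the proxy
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello, what model are you?" }],
});

console.log(message.content);
```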
## API Reference

### POST /v1/messages
Anthropic Messages API compatible endpoint.
Request:
```json
{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 4096,
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "tools": [...],
  "stream": false
}
```
Response:
```json
{
  "id": "msg_1245677890_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help you?"}
  ],
  "model": "glm-4.7",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 8
  }
}
```
### GET /health
Health check endpoint with status and configuration.
Response:
```json
{
  "status": "ok",
  "version": "1.0.0",
  "uptime": 12345,
  "config": {
    "toolsEnabled": true,
    "streamingEnabled": false,
    "models": ["glm-4.7", "glm-4.6v"]
  },
  "validation": {
    "isValid": true,
    "errors": []
  }
}
```
### GET /config
Detailed configuration endpoint (for debugging).
Response:
```json
{
  "port": 4567,
  "host": "127.0.0.1",
  "apiKeyConfigured": true,
  "models": {
    "text": "glm-4.7",
    "vision": "glm-4.6v"
  },
  "toolExecution": {
    "maxIterations": 5,
    "timeout": 30000
  }
}
```
### POST /config
Update runtime configuration. Changes apply to all clients (Claude Code, Cline, dashboard, etc.).
Request:
```json
{
  "streaming": false,
  "webSearch": true,
  "apiKey": "your-api-key",
  "endpoint": "anthropic"
}
```
Response:
```json
{
  "success": true,
  "config": {
    "streaming": false,
    "webSearch": true,
    "apiKeyConfigured": true,
    "endpoint": "anthropic"
  }
}
```
## Backend Endpoints
The proxy supports two backend paths to Z.ai with intelligent routing:
### Automatic Routing (Default)
The proxy automatically selects the best endpoint based on content:
- Text-only requests → Anthropic endpoint (glm-4.7) - faster, native format
- Vision requests → OpenAI endpoint (glm-4.6v) - full image analysis

This avoids Z.ai's server_tool_use interception on the Anthropic endpoint, which truncates image analysis results.
### OpenAI-Compatible Path
- Transforms Anthropic Messages API to OpenAI Chat Completions format
- Routes to `https://api.z.ai/api/paas/v4/chat/completions`
- Used automatically for vision requests (glm-4.6v)
- Provides complete, untruncated image analysis
### Anthropic-Compatible Path

- Native passthrough to Z.ai's Anthropic-compatible endpoint
- Routes to `https://api.z.ai/api/anthropic/v1/messages`
- Used for text-only requests when enabled (default)
- Faster with no format conversion overhead
Toggle the Anthropic endpoint:
- Set the `USE_ANTHROPIC_ENDPOINT=true/false` environment variable, or
- Use the dashboard Settings panel toggle, or
- POST to `/config` with `{"endpoint": "anthropic"}` or `{"endpoint": "openai"}` (see the sketch below)
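For the `/config` option, a minimal sketch using Node's built-in `fetch` (request and response shapes follow the POST /config reference above):

```javascript
// Minimal sketch: switch the proxy to the OpenAI-compatible backend path at runtime.
const res = await fetch("http://127.0.0.1:4567/config", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ endpoint: "openai" }),
});

const { success, config } = await res.json();
console.log(success, config.endpoint); // expected: true "openai"
```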
## Features in Detail

### Model Routing
The proxy automatically selects the appropriate GLM model based on the current message:
- glm-4.7: Used for text-only messages (via Anthropic endpoint)
- glm-4.6v: Used when the current message contains images or videos (via OpenAI endpoint)
After processing an image or video, subsequent text-only messages automatically switch back to glm-4.7 for faster responses. Previous media in the conversation history does not force the vision model.
Media detection scans for:
- Direct image/video content blocks
- Base64-encoded images and videos
- Tool results containing images (e.g., screenshots)
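As a rough illustration of this per-message routing (a hand-written sketch, not the actual code in `src/routing/model-router.js`):

```javascript
// Illustrative sketch: pick a model by scanning ONLY the current message.
function hasMedia(message) {
  const blocks = Array.isArray(message.content) ? message.content : [];
  return blocks.some((block) => {
    if (block.type === "image" || block.type === "video") return true;
    // Tool results (e.g., screenshots) can carry nested image blocks.
    if (block.type === "tool_result" && Array.isArray(block.content)) {
      return block.content.some((inner) => inner.type === "image");
    }
    return false;
  });
}

function selectModel(messages) {
  const current = messages[messages.length - 1];
  return hasMedia(current) ? "glm-4.6v" : "glm-4.7";
}
```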
### Video Analysis
GLM-4.6v supports video analysis with up to ~1 hour of video content (128K context). The proxy makes video analysis seamless:
#### Automatic File Path Detection (Claude Code)
When using Claude Code, simply mention a video file path in your message and the proxy will automatically:
- Detect the video file reference
- Read the file from your working directory
- Convert it to a video content block
- Route to the vision model for analysis
Supported patterns:
```
@video.mp4                   # File in current directory
./path/to/video.mp4          # Relative path
../downloads/clip.webm       # Parent directory
/home/user/videos/movie.mov  # Absolute path
~/Videos/recording.mp4       # Home directory
```
Example usage in Claude Code:
```
User: What's happening in @meeting-recording.mp4?
User: Analyze the video at ~/Downloads/demo.mp4
User: Describe /tmp/screen-capture.webm
```
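Under the hood this implies pattern matching along these lines (an illustrative regex, not the exact one in `src/utils/video-detector.js`):

```javascript
// Illustrative sketch: find candidate video file paths by prefix + extension.
const VIDEO_PATH_RE = /(?:@|\.{1,2}\/|~\/|\/)[\w./~-]*\.(?:mp4|webm|mov|mpeg)\b/gi;

const text = "What's happening in @meeting-recording.mp4?";
console.log(text.match(VIDEO_PATH_RE)); // ["@meeting-recording.mp4"]
```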
#### Dashboard Upload
In the web dashboard, you can:
- Drag and drop video files directly into the chat
- Use the file picker to select videos
- Paste video file paths
#### Supported Formats
- MP4 (video/mp4)
- WebM (video/webm)
- MOV (video/quicktime)
- MPEG (video/mpeg)
Body size limit: 50MB (supports most short-to-medium videos)
#### API Format
For programmatic use, send videos in Anthropic-like format:
```json
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What's in this video?"},
      {
        "type": "video",
        "source": {
          "type": "url",
          "url": "https://example.com/video.mp4"
        }
      }
    ]
  }]
}
```
Or with base64:
```json
{
  "type": "video",
  "source": {
    "type": "base64",
    "media_type": "video/mp4",
    "data": "AAAAIGZ0eXBpc29t..."
  }
}
```
### Reasoning

The proxy automatically injects a reasoning prompt before the last user message to encourage step-by-step thinking. The model's reasoning output is:
- Captured from `<reasoning_content>` tags in the response
- Transformed to Anthropic `thinking` blocks
Example response with reasoning:
```json
{
  "content": [
    {"type": "thinking", "thinking": "Let me think about this..."},
    {"type": "text", "text": "The answer is 42."}
  ]
}
```
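Conceptually, the tag-to-block transformation looks something like this (an illustrative sketch; the shipped logic lives in the response transformers):

```javascript
// Illustrative sketch: split raw model output into a thinking block plus text.
function toContentBlocks(raw) {
  const match = raw.match(/<reasoning_content>([\s\S]*?)<\/reasoning_content>/);
  if (!match) return [{ type: "text", text: raw }];
  return [
    { type: "thinking", thinking: match[1].trim() },
    { type: "text", text: raw.replace(match[0], "").trim() },
  ];
}

console.log(
  toContentBlocks("<reasoning_content>Let me think...</reasoning_content>The answer is 42.")
);
```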
### Tool Execution

The proxy provides web search capabilities via Z.ai's MCP servers, with two internal tools:
- `web_search`: Search the web using Z.ai's search MCP
- `web_reader`: Read web page content using Z.ai's reader MCP
#### Claude Code Integration

When `WEB_SEARCH_ENABLED=true` (the default), the proxy automatically intercepts Claude Code's native WebSearch and WebFetch tools. This is useful because:
- Claude Code's native web tools require an Anthropic API subscription
- The proxy routes these calls through Z.ai's MCP servers instead
- No changes needed to Claude Code - it works transparently
When Claude Code calls WebSearch or WebFetch, the proxy:
- Intercepts the tool call before it reaches the API
- Executes the equivalent MCP tool (`web_search` or `web_reader`)
- Returns the result to Claude Code as if the native tool worked
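In outline, the interception is a small name mapping plus an MCP call (an illustrative sketch; `executeMcpTool` is a stand-in for the proxy's MCP client):

```javascript
// Illustrative sketch: map Claude Code's native web tools to MCP-backed tools.
const TOOL_MAP = { WebSearch: "web_search", WebFetch: "web_reader" };

async function interceptToolUse(toolUse, executeMcpTool) {
  const mcpTool = TOOL_MAP[toolUse.name];
  if (!mcpTool) return null; // not a native web tool; pass through to the client

  const result = await executeMcpTool(mcpTool, toolUse.input);
  return {
    type: "tool_result",
    tool_use_id: toolUse.id,
    content: [{ type: "text", text: result }],
  };
}
```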
#### Smart Tool Injection
The proxy uses keyword-based triggers to inject web_search/web_reader tools only when the user explicitly requests web functionality. Trigger phrases include:
- "search the web", "search online", "look up online"
- "latest news", "current news", "recent news"
- "latest docs", "official documentation"
- "what is the latest", "what are the latest"
This prevents unwanted web searches on every request (e.g., during Claude Code startup).
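The trigger check itself amounts to simple substring matching, roughly like this (phrase list abbreviated from the examples above):

```javascript
// Illustrative sketch: inject web tools only on explicit trigger phrases.
const TRIGGERS = [
  "search the web", "search online", "look up online",
  "latest news", "official documentation", "what is the latest",
];

function shouldInjectWebTools(userText) {
  const text = userText.toLowerCase();
  return TRIGGERS.some((phrase) => text.includes(phrase));
}

console.log(shouldInjectWebTools("Search the web for GLM news")); // true
console.log(shouldInjectWebTools("Refactor this function"));      // false
```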
#### Configuration
Toggle in the dashboard settings, or via environment:
```bash
# Disable web search interception (tools passed through to client)
WEB_SEARCH_ENABLED=false ccglm start
```
Client-defined tools are always passed through to the response for client handling.
### Streaming
Both backend paths support full SSE streaming with proper Anthropic event format:
```
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_stop
data: {"type":"message_stop"}
```
Streaming properly handles:
- Text content blocks
- Reasoning/thinking blocks
- Tool use blocks
- Recursive tool execution loops
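For reference, a minimal Node sketch that consumes this stream and prints text deltas (it ignores chunk-boundary edge cases that a robust client would buffer):

```javascript
// Minimal sketch: stream from the proxy and print text deltas as they arrive.
const res = await fetch("http://127.0.0.1:4567/v1/messages", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "claude-sonnet-4-20250514",
    max_tokens: 256,
    stream: true,
    messages: [{ role: "user", content: "Say hello" }],
  }),
});

const decoder = new TextDecoder();
for await (const chunk of res.body) {
  for (const line of decoder.decode(chunk, { stream: true }).split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}
```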
### Error Handling
All errors are returned in Anthropic error format:
```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "messages is required"
  }
}
```
Error types:
- `invalid_request_error` (400): Malformed request
- `authentication_error` (401): Invalid API key
- `rate_limit_error` (429): Rate limit exceeded
- `api_error` (500): Internal server error
- `overloaded_error` (529): API overloaded
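A minimal sketch of producing this envelope from an HTTP status (mapping taken from the list above):

```javascript
// Illustrative sketch: wrap a status code and message in the Anthropic error envelope.
const ERROR_TYPES = {
  400: "invalid_request_error",
  401: "authentication_error",
  429: "rate_limit_error",
  500: "api_error",
  529: "overloaded_error",
};

function toAnthropicError(status, message) {
  return { type: "error", error: { type: ERROR_TYPES[status] ?? "api_error", message } };
}

console.log(toAnthropicError(400, "messages is required"));
```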
### Logging
Structured logging with configurable levels:
```bash
# Enable debug logging
LOG_LEVEL=debug node src/index.js
```
Log format:
```
[2024-01-15T10:30:00.000Z] [INFO] [server] Listening on http://127.0.0.1:4567
[2024-01-15T10:30:05.000Z] [INFO] [request] POST /v1/messages {"messages":3}
[2024-01-15T10:30:05.000Z] [INFO] [routing] Vision request detected in current message, using OpenAI endpoint
[2024-01-15T10:30:06.000Z] [INFO] [tool] web_search completed {"duration":"1234ms","success":true}
[2024-01-15T10:30:07.000Z] [INFO] [response] 200 end_turn {"duration":"2345ms"}
```
## Troubleshooting

### "ZAI_API_KEY environment variable is required"
Set your Z.ai API key:
```bash
export ZAI_API_KEY="your-key-here"
```
"GLM API error: 401 Unauthorized"
Your API key is invalid or expired. Get a new key from https://z.ai.
"GLM API error: 429 Too Many Requests"
Youβve hit the rate limit. Wait and retry, or upgrade your Z.ai plan.
### Requests are slow
GLM-4.7 can take 10-30 seconds for complex requests. For faster responses:
- Use shorter prompts
- Reduce `max_tokens`
### Vision requests show truncated analysis

This should be fixed automatically - the proxy routes vision requests through the OpenAI endpoint, which provides complete image analysis. If you still see truncated results, ensure you're using the latest version.
### Model stays on glm-4.6v after image

The proxy now only checks the current message for images. After an image request, subsequent text-only messages will automatically use glm-4.7. You don't need to start a new conversation.
### Debug logging
Enable debug logs to see full request/response details:
```bash
LOG_LEVEL=debug node src/index.js
```
## Security Considerations
GLM Proxy is designed for localhost development use only. It is not intended for production deployment or multi-user environments.
### Intended Use
- Development tool: For local development and testing with AI coding assistants
- Single-user localhost: Runs on `127.0.0.1` by default for local-only access
- Trusted environment: Assumes the localhost environment is trusted
### Security Model
This proxy operates under a localhost trust model:
- No authentication: The proxy itself has no authentication layer
- API key storage: Z.ai API keys are stored in memory (server) and localStorage (browser dashboard)
- No encryption: HTTP traffic on localhost is unencrypted (acceptable for local development)
- No rate limiting: Relies on upstream Z.ai rate limits
### What NOT to Do
Do not expose this proxy to the public internet. Specifically:
- Do not bind to `0.0.0.0` or your public IP in production
- Do not expose port 4567 (or your configured port) through your firewall
- Do not use in shared hosting or multi-tenant environments
- Do not run in production or as a public service
### API Key Handling
The proxy handles API keys as follows:
- Environment variables: `ZAI_API_KEY` is read from the environment (recommended for CLI use)
- Dashboard configuration: API keys entered in the web UI are stored in browser localStorage
- Runtime updates: API keys can be updated via POST `/config` (stored in memory only)
- Upstream only: Keys are only sent to Z.ai's API endpoints (never logged or exposed)
- Not persisted: Runtime API keys are lost on server restart (use environment variables for persistence)
### Recommended Practices
For safe localhost development:
- Use the default `HOST=127.0.0.1` binding
- Store your `ZAI_API_KEY` in your shell profile or `.env` file (not in version control)
- Use the `ccglm` command, which starts the proxy with safe defaults
- Keep your development environment secure (encrypted disk, screen lock, etc.)
### If You Need Production Deployment
If you must deploy this proxy in a production or shared environment, you will need to add:
- Authentication and authorization (e.g., API keys, OAuth)
- HTTPS/TLS encryption
- Rate limiting and DoS protection
- Input validation and sanitization hardening
- Security headers (CSP, HSTS, etc.)
- Audit logging
- Network isolation and firewall rules
We do not recommend production deployment as this is a development tool, but if you proceed, you assume full responsibility for security hardening.
## Project Structure
```
glmproxy/
├── src/
│   ├── index.js                   # Entry point
│   ├── cli.js                     # CLI entry point (ccglm command)
│   ├── server.js                  # HTTP server with smart routing
│   ├── config.js                  # Configuration with runtime state
│   ├── middleware/
│   │   └── validate.js            # Request validation
│   ├── transformers/
│   │   ├── request.js             # Anthropic -> GLM (with reasoning injection)
│   │   ├── response.js            # GLM -> Anthropic
│   │   ├── messages.js            # Message conversion
│   │   ├── anthropic-request.js   # Request preparer for Anthropic endpoint
│   │   └── anthropic-response.js  # Response cleaner for Anthropic endpoint
│   ├── reasoning/
│   │   └── injector.js            # Reasoning prompt injection
│   ├── routing/
│   │   └── model-router.js        # Model selection (current message only)
│   ├── tools/
│   │   ├── definitions.js         # Tool schemas (web_search, web_reader)
│   │   ├── executor.js            # Tool loop with MCP integration (OpenAI path)
│   │   ├── anthropic-executor.js  # Tool loop for Anthropic path
│   │   └── mcp-client.js          # MCP client
│   ├── streaming/
│   │   ├── sse.js                 # SSE streaming support
│   │   ├── glm-stream.js          # Real-time GLM API streaming
│   │   └── anthropic-stream.js    # Anthropic endpoint streaming
│   └── utils/
│       ├── logger.js              # Structured logging
│       ├── errors.js              # Error classes (Anthropic format)
│       └── video-detector.js      # Auto-detect video paths in messages
├── public/
│   ├── index.html                 # Dashboard entry point
│   ├── css/
│   │   └── styles.css             # Styles with theme variables
│   └── js/
│       ├── app.js                 # Main application orchestrator
│       ├── api.js                 # API client
│       ├── settings.js            # Settings panel
│       ├── mcp-manager.js         # MCP server management
│       ├── theme.js               # Theme switching
│       └── utils.js               # Utility functions
├── package.json
└── README.md
```
## License
MIT