# Configuration Guide

This guide covers all configuration options for the Zen MCP Server. The server is configured through environment variables defined in your `.env` file.

## Quick Start Configuration

**Auto Mode (Recommended):** Set `DEFAULT_MODEL=auto` and let Claude intelligently select the best model for each task:

```env
# Basic configuration
DEFAULT_MODEL=auto
GEMINI_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key
```

## Complete Configuration Reference

### API Keys (At least one required)

**Important:** Use EITHER OpenRouter OR native APIs, not both! Having both creates ambiguity about which provider serves each model.

**Option 1: Native APIs (Recommended for direct access)**

```env
# Google Gemini API
GEMINI_API_KEY=your_gemini_api_key_here
# Get from: https://makersuite.google.com/app/apikey

# OpenAI API
OPENAI_API_KEY=your_openai_api_key_here
# Get from: https://platform.openai.com/api-keys

# X.AI GROK API
XAI_API_KEY=your_xai_api_key_here
# Get from: https://console.x.ai/
```

**Option 2: OpenRouter (Access multiple models through one API)**

```env
# OpenRouter for unified model access
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Get from: https://openrouter.ai/
# If using OpenRouter, comment out native API keys above
```

**Option 3: Custom API Endpoints (Local models)**

```env
# For Ollama, vLLM, LM Studio, etc.
CUSTOM_API_URL=http://localhost:11434/v1  # Ollama example
CUSTOM_API_KEY=                           # Empty for Ollama
CUSTOM_MODEL_NAME=llama3.2                # Default model
```

**Local Model Connection:**
- Use standard localhost URLs since the server runs natively
- Example: `http://localhost:11434/v1` for Ollama

### Model Configuration

**Default Model Selection:**

```env
# Options: 'auto', 'pro', 'flash', 'o3', 'o3-mini', 'o4-mini', etc.
DEFAULT_MODEL=auto  # Claude picks best model for each task (recommended)
```

**Available Models:** The canonical capability data for native providers lives in JSON manifests under `conf/`:

- `conf/openai_models.json` – OpenAI catalogue (can be overridden with `OPENAI_MODELS_CONFIG_PATH`)
- `conf/gemini_models.json` – Gemini catalogue (`GEMINI_MODELS_CONFIG_PATH`)
- `conf/xai_models.json` – X.AI / GROK catalogue (`XAI_MODELS_CONFIG_PATH`)
- `conf/openrouter_models.json` – OpenRouter catalogue (`OPENROUTER_MODELS_CONFIG_PATH`)
- `conf/custom_models.json` – Custom/OpenAI-compatible endpoints (`CUSTOM_MODELS_CONFIG_PATH`)

Each JSON file documents the allowed fields via its `_README` block and controls model aliases, capability limits, and feature flags. Edit these files (or point the matching `*_MODELS_CONFIG_PATH` variable to your own copy) when you want to adjust context windows, enable JSON mode, or expose additional aliases without touching Python code.
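To make the manifest format concrete, here is a minimal sketch of what a `conf/custom_models.json` entry might look like. The field names shown are assumptions based on the capabilities described above (aliases, context window, JSON mode, function calling); the `_README` block inside each shipped file remains the authoritative schema:

```json
{
  "_README": "Illustrative sketch only. See the _README block in the shipped conf/*.json files for the authoritative schema.",
  "models": [
    {
      "model_name": "llama3.2",
      "aliases": ["local-llama", "my-llama"],
      "context_window": 128000,
      "supports_extended_thinking": false,
      "supports_json_mode": false,
      "supports_function_calling": false
    }
  ]
}
```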
The shipped defaults cover:

| Provider | Canonical Models | Notable Aliases |
|----------|-----------------|-----------------|
| OpenAI | `gpt-5`, `gpt-5-pro`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-codex`, `gpt-4.1`, `o3`, `o3-mini`, `o3-pro`, `o4-mini` | `gpt5`, `gpt5pro`, `mini`, `nano`, `codex`, `o3mini`, `o3pro`, `o4mini` |
| Gemini | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-2.0-flash-lite` | `pro`, `gemini-pro`, `flash`, `flash-2.0`, `flashlite` |
| X.AI | `grok-4`, `grok-3`, `grok-3-fast` | `grok`, `grok4`, `grok3`, `grok3fast`, `grokfast` |
| OpenRouter | See `conf/openrouter_models.json` for the continually evolving catalogue | e.g., `opus`, `sonnet`, `flash`, `pro`, `mistral` |
| Custom | User-managed entries such as `llama3.2` | Define your own aliases per entry |

> **Tip:** Copy the JSON file you need, customise it, and point the corresponding `*_MODELS_CONFIG_PATH` environment variable to your version. This lets you enable or disable capabilities (JSON mode, function calling, temperature support) without editing Python.

### Thinking Mode Configuration

**Default Thinking Mode for ThinkDeep:**

```env
# Only applies to models supporting extended thinking (e.g., Gemini 2.5 Pro)
DEFAULT_THINKING_MODE_THINKDEEP=high

# Available modes and token consumption:
#   minimal: 128 tokens    - Quick analysis, fastest response
#   low:     2,048 tokens  - Light reasoning tasks
#   medium:  8,192 tokens  - Balanced reasoning
#   high:    16,384 tokens - Complex analysis (recommended for thinkdeep)
#   max:     32,768 tokens - Maximum reasoning depth
```

### Model Usage Restrictions

Control which models can be used from each provider for cost control, compliance, or standardization:

```env
# Format: Comma-separated list (case-insensitive, whitespace tolerant)
# Empty or unset = all models allowed (default)

# OpenAI model restrictions
OPENAI_ALLOWED_MODELS=o3-mini,o4-mini,mini

# Gemini model restrictions
GOOGLE_ALLOWED_MODELS=flash,pro

# X.AI GROK model restrictions
XAI_ALLOWED_MODELS=grok-3,grok-3-fast,grok-4

# OpenRouter model restrictions (affects models via custom provider)
OPENROUTER_ALLOWED_MODELS=opus,sonnet,mistral
```

**Supported Model Names:** The names/aliases listed in the JSON manifests above are the authoritative source. Keep in mind:

- Aliases are case-insensitive and defined per entry (for example, `mini` maps to `gpt-5-mini` by default, while `flash` maps to `gemini-2.5-flash`); see the equivalence example after this list.
- When you override the manifest files you can add or remove aliases as needed; restriction policies (`*_ALLOWED_MODELS`) automatically pick up those changes.
- Models omitted from a manifest fall back to generic capability detection (where supported) and may have limited feature metadata.
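Because matching is case-insensitive and whitespace tolerant, the following lines express the same restriction:

```env
# Equivalent restriction lists: case and surrounding whitespace are ignored
OPENAI_ALLOWED_MODELS=o4-mini,mini
OPENAI_ALLOWED_MODELS= O4-Mini , MINI
```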
**Example Configurations:**

```env
# Cost control - only cheap models
OPENAI_ALLOWED_MODELS=o4-mini
GOOGLE_ALLOWED_MODELS=flash

# Single model standardization
OPENAI_ALLOWED_MODELS=o4-mini
GOOGLE_ALLOWED_MODELS=pro

# Balanced selection
GOOGLE_ALLOWED_MODELS=flash,pro
XAI_ALLOWED_MODELS=grok,grok-3-fast
```

### Advanced Configuration

**Custom Model Configuration & Manifest Overrides:**

```env
# Override the default locations of the built-in catalogues
OPENAI_MODELS_CONFIG_PATH=/path/to/openai_models.json
GEMINI_MODELS_CONFIG_PATH=/path/to/gemini_models.json
XAI_MODELS_CONFIG_PATH=/path/to/xai_models.json
OPENROUTER_MODELS_CONFIG_PATH=/path/to/openrouter_models.json
CUSTOM_MODELS_CONFIG_PATH=/path/to/custom_models.json
```

**Conversation Settings:**

```env
# How long AI-to-AI conversation threads persist in memory (hours).
# Conversations are auto-purged when Claude closes its MCP connection or
# when a session is quit / re-launched.
CONVERSATION_TIMEOUT_HOURS=5

# Maximum conversation turns (each exchange = 2 turns)
MAX_CONVERSATION_TURNS=20
```

**Logging Configuration:**

```env
# Logging level: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=DEBUG  # Default: shows detailed operational messages
```

## Configuration Examples

### Development Setup

```env
# Development with multiple providers
DEFAULT_MODEL=auto
GEMINI_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key
XAI_API_KEY=your-xai-key
LOG_LEVEL=DEBUG
CONVERSATION_TIMEOUT_HOURS=1
```

### Production Setup

```env
# Production with cost controls
DEFAULT_MODEL=auto
GEMINI_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key
GOOGLE_ALLOWED_MODELS=flash
OPENAI_ALLOWED_MODELS=o4-mini
LOG_LEVEL=INFO
CONVERSATION_TIMEOUT_HOURS=3
```

### Local Development

```env
# Local models only
DEFAULT_MODEL=llama3.2
CUSTOM_API_URL=http://localhost:11434/v1
CUSTOM_API_KEY=
CUSTOM_MODEL_NAME=llama3.2
LOG_LEVEL=DEBUG
```

### OpenRouter Only

```env
# Single API for multiple models
DEFAULT_MODEL=auto
OPENROUTER_API_KEY=your-openrouter-key
OPENROUTER_ALLOWED_MODELS=opus,sonnet,gpt-4
LOG_LEVEL=INFO
```

## Important Notes

**Local Networking:**
- Use standard localhost URLs for local models
- The server runs as a native Python process

**API Key Priority:**
- Native APIs take priority over OpenRouter when both are configured
- Avoid configuring both native and OpenRouter for the same models

**Model Restrictions:**
- Apply to all usage, including auto mode
- Empty/unset = all models allowed
- Invalid model names trigger a warning at startup

**Configuration Changes:**
- Restart the server with `./run-server.sh` after changing `.env`
- Configuration is loaded once at startup

## Related Documentation

- **[Advanced Usage Guide](advanced-usage.md)** - Advanced model usage patterns, thinking modes, and power user workflows
- **[Context Revival Guide](context-revival.md)** - Conversation persistence and context revival across sessions
- **[AI-to-AI Collaboration Guide](ai-collaboration.md)** - Multi-model coordination and conversation threading