feat!: OpenRouter models are now read from conf/openrouter_models.json, while Custom / Self-hosted models are read from conf/custom_models.json

feat: Azure OpenAI / Azure AI Foundry support. Models should be defined in conf/azure_models.json (or a custom path). See .env.example for the required environment variables, or see the README. https://github.com/BeehiveInnovations/zen-mcp-server/issues/265

feat: OpenRouter / Custom / Azure model configurations can now each use a custom config path (see .env.example)

refactor: the model registry class is now abstract; the OpenRouter / Custom Provider / Azure OpenAI registries subclass it

refactor!: the `is_custom` property has been removed from model_capabilities.py (and thus from custom_models.json), since models are now read from separate per-provider configuration files
Author: Fahad
Date: 2025-10-04 21:10:56 +04:00
parent e91ed2a924
commit ff9a07a37a
40 changed files with 1651 additions and 852 deletions


@@ -17,6 +17,15 @@ GEMINI_API_KEY=your_gemini_api_key_here
# Get your OpenAI API key from: https://platform.openai.com/api-keys
OPENAI_API_KEY=your_openai_api_key_here
# Azure OpenAI mirrors OpenAI models through Azure-hosted deployments
# Set the endpoint from Azure Portal. Models are defined in conf/azure_models.json
# (or the file referenced by AZURE_MODELS_CONFIG_PATH).
AZURE_OPENAI_API_KEY=your_azure_openai_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
# AZURE_OPENAI_API_VERSION=2024-02-15-preview
# AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
# AZURE_MODELS_CONFIG_PATH=/absolute/path/to/custom_azure_models.json
# Get your X.AI API key from: https://console.x.ai/
XAI_API_KEY=your_xai_api_key_here
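As a sketch, a provider could pick up the Azure settings above like so. The variable names come from the `.env.example` block; the default API version and fallback config path used here are illustrative assumptions, not the server's actual defaults:

```python
import os

def load_azure_settings() -> dict:
    """Read the Azure-related environment variables shown in .env.example."""
    return {
        "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
        "endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
        # Assumed fallback version; the server may default differently.
        "api_version": os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-15-preview"),
        # Optional comma-separated allow-list of model names.
        "allowed_models": [
            m.strip()
            for m in os.environ.get("AZURE_OPENAI_ALLOWED_MODELS", "").split(",")
            if m.strip()
        ],
        # Assumed fallback path when no custom config path is set.
        "config_path": os.environ.get("AZURE_MODELS_CONFIG_PATH", "conf/azure_models.json"),
    }
```

With `AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini` set, `load_azure_settings()["allowed_models"]` yields `["gpt-4o", "gpt-4o-mini"]`; with the variable unset, the allow-list is empty.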


@@ -3,7 +3,7 @@
[zen_web.webm](https://github.com/user-attachments/assets/851e3911-7f06-47c0-a4ab-a2601236697c)
<div align="center">
  <b>🤖 <a href="https://www.anthropic.com/claude-code">Claude Code</a> OR <a href="https://github.com/google-gemini/gemini-cli">Gemini CLI</a> OR <a href="https://github.com/openai/codex">Codex CLI</a> + [Gemini / OpenAI / Azure / Grok / OpenRouter / DIAL / Ollama / Anthropic / Any Model] = Your Ultimate AI Development Team</b>
</div>
<br/>
@@ -85,6 +85,7 @@ For best results, use Claude Code with:
- **[OpenRouter](https://openrouter.ai/)** - Access multiple models with one API
- **[Gemini](https://makersuite.google.com/app/apikey)** - Google's latest models
- **[OpenAI](https://platform.openai.com/api-keys)** - O3, GPT-5 series
- **[Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/)** - Enterprise deployments of GPT-4o, GPT-4.1, GPT-5 family
- **[X.AI](https://console.x.ai/)** - Grok models
- **[DIAL](https://dialx.ai/)** - Vendor-agnostic model access
- **[Ollama](https://ollama.ai/)** - Local models (free)
@@ -132,6 +133,10 @@ cd zen-mcp-server
👉 **[Complete Setup Guide](docs/getting-started.md)** with detailed installation, configuration for Gemini / Codex, and troubleshooting
👉 **[Cursor & VS Code Setup](docs/getting-started.md#ide-clients)** for IDE integration instructions
## Provider Configuration
Zen activates any provider that has credentials in your `.env`. See `.env.example` for deeper customization.
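The "activate any provider that has credentials" behaviour described above can be sketched as a simple presence check over environment variables. The mapping below is illustrative (the variable names match `.env.example`, but the provider names and the helper itself are hypothetical, not the server's actual registry):

```python
# Illustrative mapping of provider name -> credential environment variable.
PROVIDER_ENV_KEYS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "azure": "AZURE_OPENAI_API_KEY",
    "xai": "XAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",  # assumed variable name
}

def active_providers(env: dict) -> list:
    """Return providers whose credential variable is present and non-empty."""
    return [name for name, key in PROVIDER_ENV_KEYS.items() if env.get(key)]
```

For example, `active_providers({"OPENAI_API_KEY": "sk-..."})` returns `["openai"]`, while an empty environment activates nothing.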
## Core Tools
> **Note:** Each tool comes with its own multi-step workflow, parameters, and descriptions that consume valuable context window space even when not in use. To optimize performance, some tools are disabled by default. See [Tool Configuration](#tool-configuration) below to enable them.
@@ -247,7 +252,7 @@ DISABLED_TOOLS=
- **[Context revival](docs/context-revival.md)** - Continue conversations even after context resets
**Model Support**
- **Multiple providers** - Gemini, OpenAI, Azure, X.AI, OpenRouter, DIAL, Ollama
- **Latest models** - GPT-5, Gemini 2.5 Pro, O3, Grok-4, local Llama
- **[Thinking modes](docs/advanced-usage.md#thinking-modes)** - Control reasoning depth vs cost
- **Vision support** - Analyze images, diagrams, screenshots
@@ -288,6 +293,7 @@ DISABLED_TOOLS=
- [Tools Reference](docs/tools/) - All tools with examples
- [Advanced Usage](docs/advanced-usage.md) - Power user features
- [Configuration](docs/configuration.md) - Environment variables, restrictions
- [Adding Providers](docs/adding_providers.md) - Provider-specific setup (OpenAI, Azure, custom gateways)
- [Model Ranking Guide](docs/model_ranking.md) - How intelligence scores drive auto-mode suggestions
**🔧 Setup & Support**
@@ -303,10 +309,12 @@ Apache 2.0 License - see [LICENSE](LICENSE) file for details.
Built with the power of **Multi-Model AI** collaboration 🤝
- **A**ctual **I**ntelligence by real Humans
- [MCP (Model Context Protocol)](https://modelcontextprotocol.com)
- [Codex CLI](https://developers.openai.com/codex/cli)
- [Claude Code](https://claude.ai/code)
- [Gemini](https://ai.google.dev/)
- [OpenAI](https://openai.com/)
- [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/)
### Star History

conf/azure_models.json Normal file

@@ -0,0 +1,45 @@
{
"_README": {
"description": "Model metadata for Azure OpenAI / Azure AI Foundry-backed provider. The `models` definition can be copied from openrouter_models.json / custom_models.json",
"documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/azure_models.md",
"usage": "Models listed here are exposed through Azure AI Foundry. Aliases are case-insensitive.",
"field_notes": "Matches providers/shared/model_capabilities.py.",
"field_descriptions": {
"model_name": "The model identifier e.g., 'gpt-4'",
"deployment": "Azure model deployment name",
"aliases": "Array of short names users can type instead of the full model name",
"context_window": "Total number of tokens the model can process (input + output combined)",
"max_output_tokens": "Maximum number of tokens the model can generate in a single response",
"supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
"supports_json_mode": "Whether the model can guarantee valid JSON output",
"supports_function_calling": "Whether the model supports function/tool calling",
"supports_images": "Whether the model can process images/visual input",
"max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
"supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
"temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
"description": "Human-readable description of the model",
"intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
}
},
"_example_models": [
{
"model_name": "gpt-4",
"deployment": "gpt-4",
"aliases": [
"gpt4"
],
"context_window": 128000,
"max_output_tokens": 16384,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "GPT-4 (128K context, 16K output)",
"intelligence_score": 10
}
],
"models": []
}
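A minimal sketch of loading `conf/azure_models.json` and checking the fields the `_README` above documents. The required-field set and the helper name are illustrative assumptions, not the server's actual validation:

```python
import json

# Fields every Azure entry needs per the _README above; illustrative subset.
REQUIRED_FIELDS = {"model_name", "deployment", "context_window", "max_output_tokens"}

def load_azure_models(path: str = "conf/azure_models.json") -> list:
    """Load the Azure model config and verify each entry has the core fields."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    models = data.get("models", [])
    for entry in models:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            raise ValueError(
                f"model {entry.get('model_name', '?')} is missing {sorted(missing)}"
            )
    return models
```

Point `path` at a file following the `_example_models` shape above (or at the file referenced by `AZURE_MODELS_CONFIG_PATH`) to get back the validated `models` list.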


@@ -1,383 +1,26 @@
{
"_README": {
"description": "Model metadata for local/self-hosted OpenAI-compatible endpoints (Custom provider).",
"documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/custom_models.md",
"usage": "Each entry will be advertised by the Custom provider. Aliases are case-insensitive.",
"field_notes": "Matches providers/shared/model_capabilities.py.",
"field_descriptions": {
"model_name": "The model identifier e.g., 'llama3.2'",
"aliases": "Array of short names users can type instead of the full model name",
"context_window": "Total number of tokens the model can process (input + output combined)",
"max_output_tokens": "Maximum number of tokens the model can generate in a single response",
"supports_extended_thinking": "Whether the model supports extended reasoning tokens",
"supports_json_mode": "Whether the model can guarantee valid JSON output",
"supports_function_calling": "Whether the model supports function/tool calling",
"supports_images": "Whether the model can process images/visual input",
"max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
"supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
"temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
"description": "Human-readable description of the model",
"intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
}
},
"models": [
{
"model_name": "llama3.2",
"aliases": [
@@ -391,7 +34,6 @@
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "Local Llama 3.2 model via custom endpoint (Ollama/vLLM) - 128K context window (text-only)",
"intelligence_score": 6
}

conf/openrouter_models.json Normal file

@@ -0,0 +1,346 @@
{
"_README": {
"description": "Model metadata for OpenRouter-backed providers.",
"documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/custom_models.md",
"usage": "Models listed here are exposed through OpenRouter. Aliases are case-insensitive.",
"field_notes": "Matches providers/shared/model_capabilities.py.",
"field_descriptions": {
"model_name": "The model identifier - OpenRouter format (e.g., 'anthropic/claude-opus-4') or custom model name (e.g., 'llama3.2')",
"aliases": "Array of short names users can type instead of the full model name",
"context_window": "Total number of tokens the model can process (input + output combined)",
"max_output_tokens": "Maximum number of tokens the model can generate in a single response",
"supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
"supports_json_mode": "Whether the model can guarantee valid JSON output",
"supports_function_calling": "Whether the model supports function/tool calling",
"supports_images": "Whether the model can process images/visual input",
"max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
"supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
"temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
"description": "Human-readable description of the model",
"intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
}
},
"models": [
{
"model_name": "anthropic/claude-sonnet-4.5",
"aliases": [
"sonnet",
"sonnet4.5"
],
"context_window": 200000,
"max_output_tokens": 64000,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": true,
"max_image_size_mb": 5.0,
"description": "Claude Sonnet 4.5 - High-performance model with exceptional reasoning and efficiency",
"intelligence_score": 12
},
{
"model_name": "anthropic/claude-opus-4.1",
"aliases": [
"opus",
"claude-opus"
],
"context_window": 200000,
"max_output_tokens": 64000,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": true,
"max_image_size_mb": 5.0,
"description": "Claude Opus 4.1 - Our most capable and intelligent model yet",
"intelligence_score": 14
},
{
"model_name": "anthropic/claude-sonnet-4.1",
"aliases": [
"sonnet4.1"
],
"context_window": 200000,
"max_output_tokens": 64000,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": true,
"max_image_size_mb": 5.0,
"description": "Claude Sonnet 4.1 - Last generation high-performance model with exceptional reasoning and efficiency",
"intelligence_score": 10
},
{
"model_name": "anthropic/claude-3.5-haiku",
"aliases": [
"haiku"
],
"context_window": 200000,
"max_output_tokens": 64000,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": true,
"max_image_size_mb": 5.0,
"description": "Claude 3 Haiku - Fast and efficient with vision",
"intelligence_score": 8
},
{
"model_name": "google/gemini-2.5-pro",
"aliases": [
"pro",
"gemini-pro",
"gemini",
"pro-openrouter"
],
"context_window": 1048576,
"max_output_tokens": 65536,
"supports_extended_thinking": true,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"description": "Google's Gemini 2.5 Pro via OpenRouter with vision",
"intelligence_score": 18
},
{
"model_name": "google/gemini-2.5-flash",
"aliases": [
"flash",
"gemini-flash"
],
"context_window": 1048576,
"max_output_tokens": 65536,
"supports_extended_thinking": true,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 15.0,
"description": "Google's Gemini 2.5 Flash via OpenRouter with vision",
"intelligence_score": 10
},
{
"model_name": "mistralai/mistral-large-2411",
"aliases": [
"mistral-large",
"mistral"
],
"context_window": 128000,
"max_output_tokens": 32000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "Mistral's largest model (text-only)",
"intelligence_score": 11
},
{
"model_name": "meta-llama/llama-3-70b",
"aliases": [
"llama",
"llama3",
"llama3-70b",
"llama-70b",
"llama3-openrouter"
],
"context_window": 8192,
"max_output_tokens": 8192,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "Meta's Llama 3 70B model (text-only)",
"intelligence_score": 9
},
{
"model_name": "deepseek/deepseek-r1-0528",
"aliases": [
"deepseek-r1",
"deepseek",
"r1",
"deepseek-thinking"
],
"context_window": 65536,
"max_output_tokens": 32768,
"supports_extended_thinking": true,
"supports_json_mode": true,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "DeepSeek R1 with thinking mode - advanced reasoning capabilities (text-only)",
"intelligence_score": 15
},
{
"model_name": "perplexity/llama-3-sonar-large-32k-online",
"aliases": [
"perplexity",
"sonar",
"perplexity-online"
],
"context_window": 32768,
"max_output_tokens": 32768,
"supports_extended_thinking": false,
"supports_json_mode": false,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "Perplexity's online model with web search (text-only)",
"intelligence_score": 9
},
{
"model_name": "openai/o3",
"aliases": [
"o3"
],
"context_window": 200000,
"max_output_tokens": 100000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "OpenAI's o3 model - well-rounded and powerful across domains with vision",
"intelligence_score": 14
},
{
"model_name": "openai/o3-mini",
"aliases": [
"o3-mini",
"o3mini"
],
"context_window": 200000,
"max_output_tokens": 100000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "OpenAI's o3-mini model - balanced performance and speed with vision",
"intelligence_score": 12
},
{
"model_name": "openai/o3-mini-high",
"aliases": [
"o3-mini-high",
"o3mini-high"
],
"context_window": 200000,
"max_output_tokens": 100000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "OpenAI's o3-mini with high reasoning effort - optimized for complex problems with vision",
"intelligence_score": 13
},
{
"model_name": "openai/o3-pro",
"aliases": [
"o3pro"
],
"context_window": 200000,
"max_output_tokens": 100000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "OpenAI's o3-pro model - professional-grade reasoning and analysis with vision",
"intelligence_score": 15
},
{
"model_name": "openai/o4-mini",
"aliases": [
"o4-mini",
"o4mini"
],
"context_window": 200000,
"max_output_tokens": 100000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": false,
"temperature_constraint": "fixed",
"description": "OpenAI's o4-mini model - optimized for shorter contexts with rapid reasoning and vision",
"intelligence_score": 11
},
{
"model_name": "openai/gpt-5",
"aliases": [
"gpt5"
],
"context_window": 400000,
"max_output_tokens": 128000,
"supports_extended_thinking": true,
"supports_json_mode": true,
"supports_function_calling": true,
"supports_images": true,
"max_image_size_mb": 20.0,
"supports_temperature": true,
"temperature_constraint": "range",
"description": "GPT-5 (400K context, 128K output) - Advanced model with reasoning support",
"intelligence_score": 16
},
{
"model_name": "openai/gpt-5-codex",
"aliases": [
"codex",
"gpt5codex"
],
"context_window": 400000,
"max_output_tokens": 128000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"description": "GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows",
"intelligence_score": 17
},
{
"model_name": "openai/gpt-5-mini",
"aliases": [
"gpt5mini"
],
"context_window": 400000,
"max_output_tokens": 128000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"supports_temperature": true,
"temperature_constraint": "fixed",
"description": "GPT-5-mini (400K context, 128K output) - Efficient variant with reasoning support",
"intelligence_score": 10
},
{
"model_name": "openai/gpt-5-nano",
"aliases": [
"gpt5nano"
],
"context_window": 400000,
"max_output_tokens": 128000,
"supports_extended_thinking": false,
"supports_json_mode": true,
"supports_function_calling": false,
"supports_images": false,
"max_image_size_mb": 0.0,
"supports_temperature": true,
"temperature_constraint": "fixed",
"description": "GPT-5 nano (400K context, 128K output) - Fastest, cheapest version of GPT-5 for summarization and classification tasks",
"intelligence_score": 8
}
]
}

@@ -9,6 +9,7 @@ Each provider:
- Defines supported models using `ModelCapabilities` objects
- Implements the minimal abstract hooks (`get_provider_type()` and `generate_content()`)
- Gets wired into `configure_providers()` so environment variables control activation
- Can leverage helper subclasses (e.g., `AzureOpenAIProvider`) when only client wiring differs
### Intelligence score cheatsheet
@@ -31,6 +32,13 @@ features ([details here](model_ranking.md)).
⚠️ **Important**: If you implement a custom `generate_content()`, call `_resolve_model_name()` before invoking the SDK so aliases (e.g. `"gpt"` → `"gpt-4"`) resolve correctly. The shared implementations already do this for you.
**Option C: Azure OpenAI (`AzureOpenAIProvider`)**
- For Azure-hosted deployments of OpenAI models
- Reuses the OpenAI-compatible pipeline but swaps in the `AzureOpenAI` client and a deployment mapping (canonical model → deployment ID)
- Define deployments in [`conf/azure_models.json`](../conf/azure_models.json) (or the file referenced by `AZURE_MODELS_CONFIG_PATH`).
- Entries follow the [`ModelCapabilities`](../providers/shared/model_capabilities.py) schema and must include a `deployment` identifier.
See [Azure OpenAI Configuration](azure_openai.md) for a step-by-step walkthrough.
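To make the canonical-model → deployment mapping concrete, here is a minimal sketch of the lookup the provider performs. The deployment names below are hypothetical examples, not part of any shipped configuration:

```python
# Hypothetical canonical-model -> Azure deployment mapping, as it might be
# loaded from conf/azure_models.json. Deployment IDs are examples only.
deployments = {"gpt-4o": "prod-gpt4o", "gpt-4o-mini": "prod-gpt4o-mini"}

# Reverse lookup lets a request that already names the deployment resolve
# back to the canonical model (matching is case-insensitive).
reverse = {d.lower(): c for c, d in deployments.items()}

def to_deployment(name: str) -> str:
    """Resolve a canonical model name or a deployment ID to the deployment ID."""
    lowered = name.lower()
    if lowered in deployments:
        return deployments[lowered]
    if lowered in reverse:
        return deployments[reverse[lowered]]
    raise ValueError(f"'{name}' is not configured for Azure OpenAI")
```

The real provider builds the same two lookup tables from the registry file, which is why either the canonical name or the deployment ID can appear in a request.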
## Step-by-Step Guide
### 1. Add Provider Type
@@ -227,6 +235,19 @@ DISABLED_TOOLS=debug,tracer
EXAMPLE_ALLOWED_MODELS=example-model-large,example-model-small
```
For Azure OpenAI deployments:
```bash
AZURE_OPENAI_API_KEY=your_azure_openai_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
# Models are defined in conf/azure_models.json (or AZURE_MODELS_CONFIG_PATH)
# AZURE_OPENAI_API_VERSION=2024-02-15-preview
# AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
# AZURE_MODELS_CONFIG_PATH=/absolute/path/to/custom_azure_models.json
```
You can also define Azure models in [`conf/azure_models.json`](../conf/azure_models.json) (the bundled file is empty so you can copy it safely). Each entry mirrors the `ModelCapabilities` schema and must include a `deployment` field. Set `AZURE_MODELS_CONFIG_PATH` if you maintain a custom copy outside the repository.
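As a rough illustration of the activation rule these variables imply (a sketch of the documented behaviour, not the server's actual code):

```python
import os

def azure_settings(env=None):
    """Return Azure OpenAI settings, or None when the provider should stay disabled."""
    env = os.environ if env is None else env
    key = env.get("AZURE_OPENAI_API_KEY", "").strip()
    endpoint = env.get("AZURE_OPENAI_ENDPOINT", "").strip()
    if not key or not endpoint:
        return None  # both values are required before the provider is registered
    return {
        "api_key": key,
        "azure_endpoint": endpoint.rstrip("/"),
        "api_version": env.get("AZURE_OPENAI_API_VERSION", "2024-02-15-preview"),
        # None means the bundled conf/azure_models.json is used
        "models_config_path": env.get("AZURE_MODELS_CONFIG_PATH"),
    }
```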
**Note**: The `description` field in `ModelCapabilities` helps Claude choose the best model in auto mode.
### 5. Test Your Provider

@@ -91,8 +91,8 @@ OPENAI_ALLOWED_MODELS=o3,o4-mini
**Important Notes:**
- Restrictions apply to all usage including auto mode
- `OPENROUTER_ALLOWED_MODELS` only affects models defined in `conf/openrouter_models.json`
- Custom local models (from `conf/custom_models.json`) are not affected by OpenRouter restrictions
## Thinking Modes

docs/azure_openai.md Normal file

@@ -0,0 +1,62 @@
# Azure OpenAI Configuration
Azure OpenAI support lets Zen MCP talk to GPT-4o, GPT-4.1, GPT-5, and o-series deployments that you expose through your Azure resource. This guide describes the configuration expected by the server: a couple of required environment variables plus a JSON manifest that lists every deployment you want to expose.
## 1. Required Environment Variables
Set these entries in your `.env` (or MCP `env` block).
```bash
AZURE_OPENAI_API_KEY=your_azure_openai_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
# AZURE_OPENAI_API_VERSION=2024-02-15-preview
```
Without the key and endpoint the provider is skipped entirely. Leave the key blank only if the endpoint truly allows anonymous access (rare for Azure).
## 2. Define Deployments in `conf/azure_models.json`
Azure models live in `conf/azure_models.json` (or the file pointed to by `AZURE_MODELS_CONFIG_PATH`). Each entry follows the same schema as [`ModelCapabilities`](../providers/shared/model_capabilities.py) with one additional required key: `deployment`. This field must exactly match the deployment name shown in the Azure Portal (for example `prod-gpt4o`). The provider routes requests by that value, so omitting it or using the wrong name will cause the server to skip the model.
```json
{
"models": [
{
"model_name": "gpt-4o",
"deployment": "prod-gpt4o",
"friendly_name": "Azure GPT-4o EU",
"intelligence_score": 18,
"context_window": 600000,
"max_output_tokens": 128000,
"supports_temperature": false,
"temperature_constraint": "fixed",
"aliases": ["gpt4o-eu"]
}
]
}
```
Tips:
- Copy `conf/azure_models.json` into your repo and commit it, or point `AZURE_MODELS_CONFIG_PATH` at a custom path.
- Add one object per deployment. Aliases are optional but help when you want short names like `gpt4o-eu`.
- All capability fields are optional except `model_name`, `deployment`, and `friendly_name`. Anything you omit falls back to conservative defaults.
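A quick way to sanity-check a manifest before restarting the server is a small script like the following (illustrative only; the server performs its own validation on load):

```python
import json

# Keys the Azure registry treats as mandatory for every entry.
REQUIRED_KEYS = {"model_name", "deployment", "friendly_name"}

def check_azure_manifest(text: str) -> list[str]:
    """Return the model names in a manifest, raising if a required key is missing."""
    data = json.loads(text)
    names = []
    for entry in data.get("models", []):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"{entry.get('model_name', '<unnamed>')}: missing {sorted(missing)}")
        names.append(entry["model_name"])
    return names
```

Run it against your `azure_models.json` before pointing `AZURE_MODELS_CONFIG_PATH` at it; a missing `deployment` is the most common reason a model silently fails to appear.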
## 3. Optional Restrictions
Use `AZURE_OPENAI_ALLOWED_MODELS` to limit which Azure models Claude can access:
```bash
AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
```
Aliases are matched case-insensitively.
## 4. Quick Checklist
- [ ] `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` are set
- [ ] `conf/azure_models.json` (or the file referenced by `AZURE_MODELS_CONFIG_PATH`) lists every deployment with the desired metadata
- [ ] Optional: `AZURE_OPENAI_ALLOWED_MODELS` to restrict usage
- [ ] Restart `./run-server.sh` and run `listmodels` to confirm the Azure entries appear with the expected metadata
See also: [`docs/adding_providers.md`](adding_providers.md) for the full provider architecture and [README (Provider Configuration)](../README.md#provider-configuration) for quick-start environment snippets.

@@ -158,6 +158,8 @@ XAI_ALLOWED_MODELS=grok,grok-3-fast
```env
# Override default location of custom_models.json
CUSTOM_MODELS_CONFIG_PATH=/path/to/your/custom_models.json
# Override default location of openrouter_models.json
OPENROUTER_MODELS_CONFIG_PATH=/path/to/your/openrouter_models.json
```
**Conversation Settings:**

@@ -35,7 +35,12 @@ This guide covers setting up multiple AI model providers including OpenRouter, c
## Model Aliases
Zen ships two registries:
- `conf/openrouter_models.json`: metadata for models routed through OpenRouter. Override with `OPENROUTER_MODELS_CONFIG_PATH` if you maintain a custom copy.
- `conf/custom_models.json`: metadata for local or self-hosted OpenAI-compatible endpoints used by the Custom provider. Override with `CUSTOM_MODELS_CONFIG_PATH` if needed.
Copy whichever file you need into your project (or point the corresponding `*_MODELS_CONFIG_PATH` env var at your own copy) and edit it to advertise the models you want.
### OpenRouter Models (Cloud)
@@ -58,7 +63,7 @@ The server uses `conf/custom_models.json` to map convenient aliases to both Open
|-------|-------------------|------|
| `local-llama`, `local` | `llama3.2` | Requires `CUSTOM_API_URL` configured |
View the baseline OpenRouter catalogue in [`conf/openrouter_models.json`](conf/openrouter_models.json) and populate [`conf/custom_models.json`](conf/custom_models.json) with your local models.
To control ordering in auto mode or the `listmodels` summary, adjust the
[`intelligence_score`](model_ranking.md) for each entry (or rely on the automatic
@@ -152,7 +157,7 @@ CUSTOM_MODEL_NAME=your-loaded-model
## Using Models
**Using model aliases (from the registry files):**
```
# OpenRouter models:
"Use opus for deep analysis"  # → anthropic/claude-opus-4
@@ -185,20 +190,20 @@ CUSTOM_MODEL_NAME=your-loaded-model
The system automatically routes models to the appropriate provider:
1. Entries in `conf/custom_models.json` → Always routed through the Custom API (requires `CUSTOM_API_URL`)
2. Entries in `conf/openrouter_models.json` → Routed through OpenRouter (requires `OPENROUTER_API_KEY`)
3. **Unknown models** → Fallback logic based on model name patterns
**Provider Priority Order:**
1. Native APIs (Google, OpenAI) - if API keys are available
2. Custom endpoints - for models declared in `conf/custom_models.json`
3. OpenRouter - catch-all for cloud models
This ensures clean separation between local and cloud models while maintaining flexibility for unknown models.
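The registry-based part of that precedence can be sketched as follows (illustrative pseudologic only, simplified to the two registry files plus the fallback; the native-API step and the server's actual resolver are omitted):

```python
def route(model: str, custom_models: set[str], openrouter_models: set[str]) -> str:
    """Pick a provider for a model following the documented precedence."""
    if model in custom_models:
        return "custom"       # requires CUSTOM_API_URL
    if model in openrouter_models:
        return "openrouter"   # requires OPENROUTER_API_KEY
    return "fallback"         # name-pattern heuristics decide
```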
## Model Configuration
These JSON files define model aliases and capabilities. You can:
1. **Use the default configuration** - Includes popular models with convenient aliases
2. **Customize the configuration** - Add your own models and aliases
@@ -206,7 +211,7 @@ The server uses `conf/custom_models.json` to define model aliases and capabiliti
### Adding Custom Models
Edit `conf/openrouter_models.json` to tweak OpenRouter behaviour or `conf/custom_models.json` to add local models. Each entry maps directly onto [`ModelCapabilities`](../providers/shared/model_capabilities.py).
#### Adding an OpenRouter Model
@@ -232,7 +237,6 @@ Edit `conf/custom_models.json` to add new models. The configuration supports bot
"supports_extended_thinking": false, "supports_extended_thinking": false,
"supports_json_mode": false, "supports_json_mode": false,
"supports_function_calling": false, "supports_function_calling": false,
"is_custom": true,
"description": "My custom Ollama/vLLM model" "description": "My custom Ollama/vLLM model"
} }
``` ```
@@ -244,10 +248,9 @@ Edit `conf/custom_models.json` to add new models. The configuration supports bot
- `supports_extended_thinking`: Whether the model has extended reasoning capabilities
- `supports_json_mode`: Whether the model can guarantee valid JSON output
- `supports_function_calling`: Whether the model supports function/tool calling
- `description`: Human-readable description of the model
**Important:** Keep OpenRouter and Custom models in their respective files so that requests are routed correctly.
## Available Models

@@ -4,6 +4,7 @@
|----------|-------------|
| [Getting Started](getting-started.md) | Installation paths, prerequisite setup, and first-run guidance. |
| [Adding Providers](adding_providers.md) | How to register new AI providers and advertise capabilities. |
| [Azure OpenAI](azure_openai.md) | Configure Azure deployments, capability overrides, and env mappings. |
| [Model Ranking](model_ranking.md) | How intelligence scores translate into auto-mode ordering. |
| [Custom Models](custom_models.md) | Configure OpenRouter/custom models and aliases. |
| [Adding Tools](adding_tools.md) | Create new tools using the shared base classes. |

@@ -25,7 +25,7 @@ feature_bonus = (
+ (1 if supports_json_mode else 0)
+ (1 if supports_images else 0)
)
penalty = 1 if provider == CUSTOM else 0
effective_rank = clamp(base + ctx_bonus + output_bonus + feature_bonus - penalty, 0, 100)
```

@@ -1,5 +1,6 @@
"""Model provider abstractions for supporting multiple AI providers.""" """Model provider abstractions for supporting multiple AI providers."""
from .azure_openai import AzureOpenAIProvider
from .base import ModelProvider from .base import ModelProvider
from .gemini import GeminiModelProvider from .gemini import GeminiModelProvider
from .openai_compatible import OpenAICompatibleProvider from .openai_compatible import OpenAICompatibleProvider
@@ -13,6 +14,7 @@ __all__ = [
"ModelResponse", "ModelResponse",
"ModelCapabilities", "ModelCapabilities",
"ModelProviderRegistry", "ModelProviderRegistry",
"AzureOpenAIProvider",
"GeminiModelProvider", "GeminiModelProvider",
"OpenAIModelProvider", "OpenAIModelProvider",
"OpenAICompatibleProvider", "OpenAICompatibleProvider",

providers/azure_openai.py Normal file

@@ -0,0 +1,342 @@
"""Azure OpenAI provider built on the OpenAI-compatible implementation."""
from __future__ import annotations
import logging
from dataclasses import asdict, replace
try: # pragma: no cover - optional dependency
from openai import AzureOpenAI
except ImportError: # pragma: no cover
AzureOpenAI = None # type: ignore[assignment]
from utils.env import get_env, suppress_env_vars
from .azure_registry import AzureModelRegistry
from .openai_compatible import OpenAICompatibleProvider
from .openai_provider import OpenAIModelProvider
from .shared import ModelCapabilities, ModelResponse, ProviderType, TemperatureConstraint
logger = logging.getLogger(__name__)
class AzureOpenAIProvider(OpenAICompatibleProvider):
"""Thin Azure wrapper that reuses the OpenAI-compatible request pipeline."""
FRIENDLY_NAME = "Azure OpenAI"
DEFAULT_API_VERSION = "2024-02-15-preview"
# The OpenAI-compatible base expects subclasses to expose capabilities via
# ``get_all_model_capabilities``. Azure deployments are user-defined, so we
# build the catalogue dynamically from environment configuration instead of
# relying on a static ``MODEL_CAPABILITIES`` map.
MODEL_CAPABILITIES: dict[str, ModelCapabilities] = {}
def __init__(
self,
api_key: str,
*,
azure_endpoint: str | None = None,
api_version: str | None = None,
deployments: dict[str, object] | None = None,
**kwargs,
) -> None:
# Let the OpenAI-compatible base handle shared configuration such as
# timeouts, restriction-aware allowlists, and logging. ``base_url`` maps
# directly onto Azure's endpoint URL.
super().__init__(api_key, base_url=azure_endpoint, **kwargs)
if not azure_endpoint:
azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
if not azure_endpoint:
raise ValueError("Azure OpenAI endpoint is required via parameter or AZURE_OPENAI_ENDPOINT")
self.azure_endpoint = azure_endpoint.rstrip("/")
self.api_version = api_version or get_env("AZURE_OPENAI_API_VERSION", self.DEFAULT_API_VERSION)
registry_specs = self._load_registry_entries()
override_specs = self._normalise_deployments(deployments or {}) if deployments else {}
self._model_specs = self._merge_specs(registry_specs, override_specs)
if not self._model_specs:
raise ValueError(
"Azure OpenAI provider requires at least one configured deployment. "
"Populate conf/azure_models.json or set AZURE_MODELS_CONFIG_PATH."
)
self._capabilities = self._build_capabilities_map()
self._deployment_map = {name: spec["deployment"] for name, spec in self._model_specs.items()}
self._deployment_alias_lookup = {
deployment.lower(): canonical for canonical, deployment in self._deployment_map.items()
}
self._canonical_lookup = {name.lower(): name for name in self._model_specs.keys()}
self._invalidate_capability_cache()
# ------------------------------------------------------------------
# Capability helpers
# ------------------------------------------------------------------
def get_all_model_capabilities(self) -> dict[str, ModelCapabilities]:
return dict(self._capabilities)
def get_provider_type(self) -> ProviderType:
return ProviderType.AZURE
def get_capabilities(self, model_name: str) -> ModelCapabilities: # type: ignore[override]
lowered = model_name.lower()
if lowered in self._deployment_alias_lookup:
canonical = self._deployment_alias_lookup[lowered]
return super().get_capabilities(canonical)
canonical = self._canonical_lookup.get(lowered)
if canonical:
return super().get_capabilities(canonical)
return super().get_capabilities(model_name)
def validate_model_name(self, model_name: str) -> bool: # type: ignore[override]
lowered = model_name.lower()
if lowered in self._deployment_alias_lookup or lowered in self._canonical_lookup:
return True
return super().validate_model_name(model_name)
def _build_capabilities_map(self) -> dict[str, ModelCapabilities]:
capabilities: dict[str, ModelCapabilities] = {}
for canonical_name, spec in self._model_specs.items():
template_capability: ModelCapabilities | None = spec.get("capability")
overrides = spec.get("overrides", {})
if template_capability:
cloned = replace(template_capability)
else:
template = OpenAIModelProvider.MODEL_CAPABILITIES.get(canonical_name)
if template:
friendly = template.friendly_name.replace("OpenAI", "Azure OpenAI", 1)
cloned = replace(
template,
provider=ProviderType.AZURE,
friendly_name=friendly,
aliases=list(template.aliases),
)
else:
deployment_name = spec.get("deployment", "")
cloned = ModelCapabilities(
provider=ProviderType.AZURE,
model_name=canonical_name,
friendly_name=f"Azure OpenAI ({canonical_name})",
description=f"Azure deployment '{deployment_name}' for {canonical_name}",
aliases=[],
)
if overrides:
overrides = dict(overrides)
temp_override = overrides.get("temperature_constraint")
if isinstance(temp_override, str):
overrides["temperature_constraint"] = TemperatureConstraint.create(temp_override)
aliases_override = overrides.get("aliases")
if isinstance(aliases_override, str):
overrides["aliases"] = [alias.strip() for alias in aliases_override.split(",") if alias.strip()]
provider_override = overrides.get("provider")
if provider_override:
overrides.pop("provider", None)
try:
cloned = replace(cloned, **overrides)
except TypeError:
base_data = asdict(cloned)
base_data.update(overrides)
base_data["provider"] = ProviderType.AZURE
temp_value = base_data.get("temperature_constraint")
if isinstance(temp_value, str):
base_data["temperature_constraint"] = TemperatureConstraint.create(temp_value)
cloned = ModelCapabilities(**base_data)
if cloned.provider != ProviderType.AZURE:
cloned.provider = ProviderType.AZURE
capabilities[canonical_name] = cloned
return capabilities
def _load_registry_entries(self) -> dict[str, dict]:
try:
registry = AzureModelRegistry()
except Exception as exc: # pragma: no cover - registry failure should not crash provider
logger.warning("Unable to load Azure model registry: %s", exc)
return {}
entries: dict[str, dict] = {}
for model_name, capability, extra in registry.iter_entries():
deployment = extra.get("deployment")
if not deployment:
logger.warning("Azure model '%s' missing deployment in registry", model_name)
continue
entries[model_name] = {"deployment": deployment, "capability": capability}
return entries
@staticmethod
def _merge_specs(
registry_specs: dict[str, dict],
override_specs: dict[str, dict],
) -> dict[str, dict]:
specs: dict[str, dict] = {}
for canonical, entry in registry_specs.items():
specs[canonical] = {
"deployment": entry.get("deployment"),
"capability": entry.get("capability"),
"overrides": {},
}
for canonical, entry in override_specs.items():
spec = specs.get(canonical, {"deployment": None, "capability": None, "overrides": {}})
deployment = entry.get("deployment")
if deployment:
spec["deployment"] = deployment
overrides = {k: v for k, v in entry.items() if k not in {"deployment"}}
overrides.pop("capability", None)
if overrides:
spec["overrides"].update(overrides)
specs[canonical] = spec
return {k: v for k, v in specs.items() if v.get("deployment")}
@staticmethod
def _normalise_deployments(mapping: dict[str, object]) -> dict[str, dict]:
normalised: dict[str, dict] = {}
for canonical, spec in mapping.items():
canonical_name = (canonical or "").strip()
if not canonical_name:
continue
deployment_name: str | None = None
overrides: dict[str, object] = {}
if isinstance(spec, str):
deployment_name = spec.strip()
elif isinstance(spec, dict):
deployment_name = spec.get("deployment") or spec.get("deployment_name")
overrides = {k: v for k, v in spec.items() if k not in {"deployment", "deployment_name"}}
if not deployment_name:
continue
normalised[canonical_name] = {"deployment": deployment_name.strip(), **overrides}
return normalised
# ------------------------------------------------------------------
# Azure-specific configuration
# ------------------------------------------------------------------
@property
def client(self): # type: ignore[override]
"""Instantiate the Azure OpenAI client on first use."""
if self._client is None:
if AzureOpenAI is None:
raise ImportError(
"Azure OpenAI support requires the 'openai' package. Install it with `pip install openai`."
)
import httpx
proxy_env_vars = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "http_proxy", "https_proxy", "all_proxy"]
with suppress_env_vars(*proxy_env_vars):
try:
timeout_config = self.timeout_config
http_client = httpx.Client(timeout=timeout_config, follow_redirects=True)
client_kwargs = {
"api_key": self.api_key,
"azure_endpoint": self.azure_endpoint,
"api_version": self.api_version,
"http_client": http_client,
}
if self.DEFAULT_HEADERS:
client_kwargs["default_headers"] = self.DEFAULT_HEADERS.copy()
logger.debug(
"Initializing Azure OpenAI client endpoint=%s api_version=%s timeouts=%s",
self.azure_endpoint,
self.api_version,
timeout_config,
)
self._client = AzureOpenAI(**client_kwargs)
except Exception as exc:
logger.error("Failed to create Azure OpenAI client: %s", exc)
raise
return self._client
# ------------------------------------------------------------------
# Request delegation
# ------------------------------------------------------------------
def generate_content(
self,
prompt: str,
model_name: str,
system_prompt: str | None = None,
temperature: float = 0.3,
max_output_tokens: int | None = None,
images: list[str] | None = None,
**kwargs,
) -> ModelResponse:
canonical_name, deployment_name = self._resolve_canonical_and_deployment(model_name)
# Delegate to the shared OpenAI-compatible implementation using the
# deployment name: Azure requires the deployment identifier in the
# ``model`` field. The returned ``ModelResponse`` is normalised so
# downstream consumers continue to see the canonical model name.
raw_response = super().generate_content(
prompt=prompt,
model_name=deployment_name,
system_prompt=system_prompt,
temperature=temperature,
max_output_tokens=max_output_tokens,
images=images,
**kwargs,
)
capabilities = self._capabilities.get(canonical_name)
friendly_name = capabilities.friendly_name if capabilities else self.FRIENDLY_NAME
return ModelResponse(
content=raw_response.content,
usage=raw_response.usage,
model_name=canonical_name,
friendly_name=friendly_name,
provider=ProviderType.AZURE,
metadata={**raw_response.metadata, "deployment": deployment_name},
)
def _resolve_canonical_and_deployment(self, model_name: str) -> tuple[str, str]:
resolved_canonical = self._resolve_model_name(model_name)
if resolved_canonical not in self._deployment_map:
# The base resolver may hand back the deployment alias. Try to map it
# back to a canonical entry.
for canonical, deployment in self._deployment_map.items():
if deployment.lower() == resolved_canonical.lower():
return canonical, deployment
raise ValueError(f"Model '{model_name}' is not configured for Azure OpenAI")
return resolved_canonical, self._deployment_map[resolved_canonical]
def _parse_allowed_models(self) -> set[str] | None: # type: ignore[override]
# Support both AZURE_ALLOWED_MODELS (inherited behaviour) and the
# clearer AZURE_OPENAI_ALLOWED_MODELS alias.
explicit = get_env("AZURE_OPENAI_ALLOWED_MODELS")
if explicit:
models = {m.strip().lower() for m in explicit.split(",") if m.strip()}
if models:
logger.info("Configured allowed models for Azure OpenAI: %s", sorted(models))
self._allowed_alias_cache = {}
return models
return super()._parse_allowed_models()

@@ -0,0 +1,45 @@
"""Registry loader for Azure OpenAI model configurations."""
from __future__ import annotations
import logging
from .model_registry_base import CAPABILITY_FIELD_NAMES, CustomModelRegistryBase
from .shared import ModelCapabilities, ProviderType, TemperatureConstraint
logger = logging.getLogger(__name__)
class AzureModelRegistry(CustomModelRegistryBase):
"""Load Azure-specific model metadata from configuration files."""
def __init__(self, config_path: str | None = None) -> None:
super().__init__(
env_var_name="AZURE_MODELS_CONFIG_PATH",
default_filename="azure_models.json",
config_path=config_path,
)
self.reload()
def _extra_keys(self) -> set[str]:
return {"deployment", "deployment_name"}
def _provider_default(self) -> ProviderType:
return ProviderType.AZURE
def _default_friendly_name(self, model_name: str) -> str:
return f"Azure OpenAI ({model_name})"
def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
deployment = entry.pop("deployment", None) or entry.pop("deployment_name", None)
if not deployment:
raise ValueError(f"Azure model '{entry.get('model_name')}' is missing required 'deployment' field")
temp_hint = entry.get("temperature_constraint")
if isinstance(temp_hint, str):
entry["temperature_constraint"] = TemperatureConstraint.create(temp_hint)
filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
filtered.setdefault("provider", ProviderType.AZURE)
capability = ModelCapabilities(**filtered)
return capability, {"deployment": deployment}

@@ -1,10 +1,10 @@
"""Custom API provider implementation.""" """Custom API provider implementation."""
import logging import logging
from typing import Optional
from utils.env import get_env from utils.env import get_env
from .custom_registry import CustomEndpointModelRegistry
from .openai_compatible import OpenAICompatibleProvider from .openai_compatible import OpenAICompatibleProvider
from .openrouter_registry import OpenRouterModelRegistry from .openrouter_registry import OpenRouterModelRegistry
from .shared import ModelCapabilities, ProviderType from .shared import ModelCapabilities, ProviderType
@@ -31,8 +31,8 @@ class CustomProvider(OpenAICompatibleProvider):
FRIENDLY_NAME = "Custom API"
# Model registry for managing configurations and aliases
_registry: CustomEndpointModelRegistry | None = None
def __init__(self, api_key: str = "", base_url: str = "", **kwargs):
"""Initialize Custom provider for local/self-hosted models.
@@ -78,9 +78,9 @@ class CustomProvider(OpenAICompatibleProvider):
super().__init__(api_key, base_url=base_url, **kwargs)
# Initialize model registry
if CustomProvider._registry is None:
CustomProvider._registry = CustomEndpointModelRegistry()
# Log loaded models and aliases only on first load
models = self._registry.list_models()
aliases = self._registry.list_aliases()
@@ -92,8 +92,8 @@ class CustomProvider(OpenAICompatibleProvider):
def _lookup_capabilities( def _lookup_capabilities(
self, self,
canonical_name: str, canonical_name: str,
requested_name: Optional[str] = None, requested_name: str | None = None,
) -> Optional[ModelCapabilities]: ) -> ModelCapabilities | None:
"""Return capabilities for models explicitly marked as custom.""" """Return capabilities for models explicitly marked as custom."""
builtin = super()._lookup_capabilities(canonical_name, requested_name) builtin = super()._lookup_capabilities(canonical_name, requested_name)
@@ -101,12 +101,12 @@ class CustomProvider(OpenAICompatibleProvider):
return builtin return builtin
registry_entry = self._registry.resolve(canonical_name) registry_entry = self._registry.resolve(canonical_name)
if registry_entry and getattr(registry_entry, "is_custom", False): if registry_entry:
registry_entry.provider = ProviderType.CUSTOM registry_entry.provider = ProviderType.CUSTOM
return registry_entry return registry_entry
logging.debug( logging.debug(
"Custom provider cannot resolve model '%s'; ensure it is declared with 'is_custom': true in custom_models.json", "Custom provider cannot resolve model '%s'; ensure it is declared in custom_models.json",
canonical_name, canonical_name,
) )
return None return None
@@ -151,6 +151,15 @@ class CustomProvider(OpenAICompatibleProvider):
return base_model return base_model
logging.debug(f"Model '{model_name}' not found in registry, using as-is") logging.debug(f"Model '{model_name}' not found in registry, using as-is")
# Attempt to resolve via OpenRouter registry so aliases still map cleanly
openrouter_registry = OpenRouterModelRegistry()
openrouter_config = openrouter_registry.resolve(model_name)
if openrouter_config:
resolved = openrouter_config.model_name
self._alias_cache[cache_key] = resolved
self._alias_cache.setdefault(resolved.lower(), resolved)
return resolved
self._alias_cache[cache_key] = model_name self._alias_cache[cache_key] = model_name
return model_name return model_name
@@ -160,9 +169,9 @@ class CustomProvider(OpenAICompatibleProvider):
if not self._registry: if not self._registry:
return {} return {}
capabilities: dict[str, ModelCapabilities] = {} capabilities = {}
for model_name in self._registry.list_models(): for model in self._registry.list_models():
config = self._registry.resolve(model_name) config = self._registry.resolve(model)
if config and getattr(config, "is_custom", False): if config:
capabilities[model_name] = config capabilities[model] = config
return capabilities return capabilities
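The lookup order the custom provider implements above is: its own registry first, then the shared OpenRouter registry so familiar aliases keep resolving, and finally the name is passed through unchanged. A stand-alone sketch of that precedence, with plain dicts standing in for the two registries (the function and dict names are illustrative, not project API):

```python
def resolve_model_name(name: str, custom_registry: dict, openrouter_registry: dict) -> str:
    """Resolve an alias: custom registry wins, then OpenRouter, then pass-through."""
    key = name.lower()
    if key in custom_registry:          # models declared in custom_models.json
        return custom_registry[key]
    if key in openrouter_registry:      # fall back to OpenRouter aliases
        return openrouter_registry[key]
    return name                         # unknown names are used as-is
```

This mirrors why removing `is_custom` is safe: each registry now only contains its own models, so precedence replaces per-entry flags.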


@@ -0,0 +1,26 @@
"""Registry for models exposed via custom (local) OpenAI-compatible endpoints."""

from __future__ import annotations

from .model_registry_base import CAPABILITY_FIELD_NAMES, CapabilityModelRegistry
from .shared import ModelCapabilities, ProviderType


class CustomEndpointModelRegistry(CapabilityModelRegistry):
    def __init__(self, config_path: str | None = None) -> None:
        super().__init__(
            env_var_name="CUSTOM_MODELS_CONFIG_PATH",
            default_filename="custom_models.json",
            provider=ProviderType.CUSTOM,
            friendly_prefix="Custom ({model})",
            config_path=config_path,
        )
        self.reload()

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        entry["provider"] = ProviderType.CUSTOM
        entry.setdefault("friendly_name", f"Custom ({entry['model_name']})")
        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
        filtered.setdefault("provider", ProviderType.CUSTOM)
        capability = ModelCapabilities(**filtered)
        return capability, {}
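Since the custom registry forces `provider` to `CUSTOM` and no longer honours `is_custom`, a `conf/custom_models.json` entry only needs capability fields. An illustrative entry (model name, aliases, and limits are placeholders for whatever your local endpoint serves):

```json
{
  "models": [
    {
      "model_name": "llama3.2",
      "aliases": ["local-llama", "llama"],
      "context_window": 128000,
      "max_output_tokens": 32768,
      "supports_images": false,
      "temperature_constraint": "range"
    }
  ]
}
```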


@@ -0,0 +1,241 @@
"""Shared infrastructure for JSON-backed model registries."""

from __future__ import annotations

import importlib.resources
import json
import logging
from collections.abc import Iterable
from dataclasses import fields
from pathlib import Path

from utils.env import get_env
from utils.file_utils import read_json_file

from .shared import ModelCapabilities, ProviderType, TemperatureConstraint

logger = logging.getLogger(__name__)

CAPABILITY_FIELD_NAMES = {field.name for field in fields(ModelCapabilities)}


class CustomModelRegistryBase:
    """Load and expose capability metadata from a JSON manifest."""

    def __init__(
        self,
        *,
        env_var_name: str,
        default_filename: str,
        config_path: str | None = None,
    ) -> None:
        self._env_var_name = env_var_name
        self._default_filename = default_filename
        self._use_resources = False
        self._resource_package = "conf"
        self._default_path = Path(__file__).parent.parent / "conf" / default_filename

        if config_path:
            self.config_path = Path(config_path)
        else:
            env_path = get_env(env_var_name)
            if env_path:
                self.config_path = Path(env_path)
            else:
                try:
                    resource = importlib.resources.files(self._resource_package).joinpath(default_filename)
                    if hasattr(resource, "read_text"):
                        self._use_resources = True
                        self.config_path = None
                    else:
                        raise AttributeError("resource accessor not available")
                except Exception:
                    self.config_path = Path(__file__).parent.parent / "conf" / default_filename

        self.alias_map: dict[str, str] = {}
        self.model_map: dict[str, ModelCapabilities] = {}
        self._extras: dict[str, dict] = {}

    def reload(self) -> None:
        data = self._load_config_data()
        configs = [config for config in self._parse_models(data) if config is not None]
        self._build_maps(configs)

    def list_models(self) -> list[str]:
        return list(self.model_map.keys())

    def list_aliases(self) -> list[str]:
        return list(self.alias_map.keys())

    def resolve(self, name_or_alias: str) -> ModelCapabilities | None:
        key = name_or_alias.lower()
        canonical = self.alias_map.get(key)
        if canonical:
            return self.model_map.get(canonical)
        for model_name in self.model_map:
            if model_name.lower() == key:
                return self.model_map[model_name]
        return None

    def get_capabilities(self, name_or_alias: str) -> ModelCapabilities | None:
        return self.resolve(name_or_alias)

    def get_entry(self, model_name: str) -> dict | None:
        return self._extras.get(model_name)

    def iter_entries(self) -> Iterable[tuple[str, ModelCapabilities, dict]]:
        for model_name, capability in self.model_map.items():
            yield model_name, capability, self._extras.get(model_name, {})

    # ------------------------------------------------------------------
    # Internal helpers
    # ------------------------------------------------------------------
    def _load_config_data(self) -> dict:
        if self._use_resources:
            try:
                resource = importlib.resources.files(self._resource_package).joinpath(self._default_filename)
                if hasattr(resource, "read_text"):
                    config_text = resource.read_text(encoding="utf-8")
                else:  # pragma: no cover - legacy Python fallback
                    with resource.open("r", encoding="utf-8") as handle:
                        config_text = handle.read()
                data = json.loads(config_text)
            except FileNotFoundError:
                logger.debug("Packaged %s not found", self._default_filename)
                return {"models": []}
            except Exception as exc:
                logger.warning("Failed to read packaged %s: %s", self._default_filename, exc)
                return {"models": []}
            return data or {"models": []}

        if not self.config_path:
            raise FileNotFoundError("Registry configuration path is not set")

        if not self.config_path.exists():
            logger.debug("Model registry config not found at %s", self.config_path)
            if self.config_path == self._default_path:
                fallback = Path.cwd() / "conf" / self._default_filename
                if fallback != self.config_path and fallback.exists():
                    logger.debug("Falling back to %s", fallback)
                    self.config_path = fallback
                else:
                    return {"models": []}
            else:
                return {"models": []}

        data = read_json_file(str(self.config_path))
        return data or {"models": []}

    @property
    def use_resources(self) -> bool:
        return self._use_resources

    def _parse_models(self, data: dict) -> Iterable[ModelCapabilities | None]:
        for raw in data.get("models", []):
            if not isinstance(raw, dict):
                continue
            yield self._convert_entry(raw)

    def _convert_entry(self, raw: dict) -> ModelCapabilities | None:
        entry = dict(raw)
        model_name = entry.get("model_name")
        if not model_name:
            return None

        aliases = entry.get("aliases")
        if isinstance(aliases, str):
            entry["aliases"] = [alias.strip() for alias in aliases.split(",") if alias.strip()]

        entry.setdefault("friendly_name", self._default_friendly_name(model_name))

        temperature_hint = entry.get("temperature_constraint")
        if isinstance(temperature_hint, str):
            entry["temperature_constraint"] = TemperatureConstraint.create(temperature_hint)
        elif temperature_hint is None:
            entry["temperature_constraint"] = TemperatureConstraint.create("range")

        if "max_tokens" in entry:
            raise ValueError(
                "`max_tokens` is no longer supported. Use `max_output_tokens` in your model configuration."
            )

        unknown_keys = set(entry.keys()) - CAPABILITY_FIELD_NAMES - self._extra_keys()
        if unknown_keys:
            raise ValueError("Unsupported fields in model configuration: " + ", ".join(sorted(unknown_keys)))

        capability, extras = self._finalise_entry(entry)
        capability.provider = self._provider_default()
        self._extras[capability.model_name] = extras or {}
        return capability

    def _default_friendly_name(self, model_name: str) -> str:
        return model_name

    def _extra_keys(self) -> set[str]:
        return set()

    def _provider_default(self) -> ProviderType:
        return ProviderType.OPENROUTER

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        return ModelCapabilities(**{k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}), {}

    def _build_maps(self, configs: Iterable[ModelCapabilities]) -> None:
        alias_map: dict[str, str] = {}
        model_map: dict[str, ModelCapabilities] = {}
        for config in configs:
            if not config:
                continue
            model_map[config.model_name] = config
            model_name_lower = config.model_name.lower()
            if model_name_lower not in alias_map:
                alias_map[model_name_lower] = config.model_name
            for alias in config.aliases:
                alias_lower = alias.lower()
                if alias_lower in alias_map and alias_map[alias_lower] != config.model_name:
                    raise ValueError(
                        f"Duplicate alias '{alias}' found for models '{alias_map[alias_lower]}' and '{config.model_name}'"
                    )
                alias_map[alias_lower] = config.model_name
        self.alias_map = alias_map
        self.model_map = model_map


class CapabilityModelRegistry(CustomModelRegistryBase):
    """Registry that returns `ModelCapabilities` objects with alias support."""

    def __init__(
        self,
        *,
        env_var_name: str,
        default_filename: str,
        provider: ProviderType,
        friendly_prefix: str,
        config_path: str | None = None,
    ) -> None:
        self._provider = provider
        self._friendly_prefix = friendly_prefix
        super().__init__(
            env_var_name=env_var_name,
            default_filename=default_filename,
            config_path=config_path,
        )
        self.reload()

    def _provider_default(self) -> ProviderType:
        return self._provider

    def _default_friendly_name(self, model_name: str) -> str:
        return self._friendly_prefix.format(model=model_name)

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
        filtered.setdefault("provider", self._provider_default())
        capability = ModelCapabilities(**filtered)
        return capability, {}
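The `_build_maps` helper above is the piece every registry shares: lowercase canonical names and aliases all map to the canonical name, and an alias claimed by two different models is a hard configuration error. A self-contained sketch of that invariant (simplified to `model_name -> aliases` dicts; the real code works on `ModelCapabilities` objects):

```python
def build_alias_map(models: dict[str, list[str]]) -> dict[str, str]:
    """Map lowercase names and aliases to canonical model names,
    rejecting any alias claimed by two different models."""
    alias_map: dict[str, str] = {}
    for model_name, aliases in models.items():
        # the canonical name itself is registered for case-insensitive lookup
        alias_map.setdefault(model_name.lower(), model_name)
        for alias in aliases:
            alias_lower = alias.lower()
            if alias_map.get(alias_lower, model_name) != model_name:
                raise ValueError(f"Duplicate alias '{alias}'")
            alias_map[alias_lower] = model_name
    return alias_map
```

Because OpenRouter and custom models now live in separate JSON files, each registry builds its own map and duplicate detection only applies within one file.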


@@ -8,7 +8,7 @@ from urllib.parse import urlparse
from openai import OpenAI

from utils.env import get_env, suppress_env_vars
from utils.image_utils import validate_image

from .base import ModelProvider

@@ -257,80 +257,74 @@ class OpenAICompatibleProvider(ModelProvider):
    def client(self):
        """Lazy initialization of OpenAI client with security checks and timeout configuration."""
        if self._client is None:
            import httpx

            proxy_env_vars = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "http_proxy", "https_proxy", "all_proxy"]

            with suppress_env_vars(*proxy_env_vars):
                try:
                    # Create a custom httpx client that explicitly avoids proxy parameters
                    timeout_config = (
                        self.timeout_config
                        if hasattr(self, "timeout_config") and self.timeout_config
                        else httpx.Timeout(30.0)
                    )

                    # Create httpx client with minimal config to avoid proxy conflicts
                    # Note: proxies parameter was removed in httpx 0.28.0

                    # Check for test transport injection
                    if hasattr(self, "_test_transport"):
                        # Use custom transport for testing (HTTP recording/replay)
                        http_client = httpx.Client(
                            transport=self._test_transport,
                            timeout=timeout_config,
                            follow_redirects=True,
                        )
                    else:
                        # Normal production client
                        http_client = httpx.Client(
                            timeout=timeout_config,
                            follow_redirects=True,
                        )

                    # Keep client initialization minimal to avoid proxy parameter conflicts
                    client_kwargs = {
                        "api_key": self.api_key,
                        "http_client": http_client,
                    }

                    if self.base_url:
                        client_kwargs["base_url"] = self.base_url

                    if self.organization:
                        client_kwargs["organization"] = self.organization

                    # Add default headers if any
                    if self.DEFAULT_HEADERS:
                        client_kwargs["default_headers"] = self.DEFAULT_HEADERS.copy()

                    logging.debug(
                        "OpenAI client initialized with custom httpx client and timeout: %s",
                        timeout_config,
                    )

                    # Create OpenAI client with custom httpx client
                    self._client = OpenAI(**client_kwargs)

                except Exception as e:
                    # If all else fails, try absolute minimal client without custom httpx
                    logging.warning(
                        "Failed to create client with custom httpx, falling back to minimal config: %s",
                        e,
                    )
                    try:
                        minimal_kwargs = {"api_key": self.api_key}
                        if self.base_url:
                            minimal_kwargs["base_url"] = self.base_url
                        self._client = OpenAI(**minimal_kwargs)
                    except Exception as fallback_error:
                        logging.error("Even minimal OpenAI client creation failed: %s", fallback_error)
                        raise

        return self._client
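This hunk replaces the hand-rolled save/delete/restore of proxy variables with a `suppress_env_vars` context manager from `utils/env`, which guarantees restoration via `finally` even if client construction raises. A minimal sketch of such a helper (the project's actual implementation in `utils/env.py` may differ):

```python
import os
from contextlib import contextmanager


@contextmanager
def suppress_env_vars(*names):
    """Temporarily remove the named environment variables, restoring them on exit."""
    # pop only the variables that are actually set, remembering their values
    saved = {name: os.environ.pop(name) for name in names if name in os.environ}
    try:
        yield
    finally:
        # restore everything, even if the body raised
        os.environ.update(saved)
```

Centralizing this removes the duplicated try/finally bookkeeping from every provider that must hide `HTTP_PROXY`-style variables from httpx.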


@@ -103,16 +103,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
            model_name="o3-mini",
            friendly_name="OpenAI (O3-mini)",
            intelligence_score=12,
            context_window=200_000,
            max_output_tokens=65536,
            supports_extended_thinking=False,
            supports_system_prompts=True,
            supports_streaming=True,
            supports_function_calling=True,
            supports_json_mode=True,
            supports_images=True,
            max_image_size_mb=20.0,
            supports_temperature=False,
            temperature_constraint=TemperatureConstraint.create("fixed"),
            description="Fast O3 variant (200K context) - Balanced performance/speed, moderate complexity",
            aliases=["o3mini"],

@@ -122,16 +122,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
            model_name="o3-pro",
            friendly_name="OpenAI (O3-Pro)",
            intelligence_score=15,
            context_window=200_000,
            max_output_tokens=65536,
            supports_extended_thinking=False,
            supports_system_prompts=True,
            supports_streaming=True,
            supports_function_calling=True,
            supports_json_mode=True,
            supports_images=True,
            max_image_size_mb=20.0,
            supports_temperature=False,
            temperature_constraint=TemperatureConstraint.create("fixed"),
            description="Professional-grade reasoning (200K context) - EXTREMELY EXPENSIVE: Only for the most complex problems requiring universe-scale complexity analysis OR when the user explicitly asks for this model. Use sparingly for critical architectural decisions or exceptionally complex debugging that other models cannot handle.",
            aliases=["o3pro"],

@@ -141,16 +141,15 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
            model_name="o4-mini",
            friendly_name="OpenAI (O4-mini)",
            intelligence_score=11,
            context_window=200_000,
            supports_extended_thinking=False,
            supports_system_prompts=True,
            supports_streaming=True,
            supports_function_calling=True,
            supports_json_mode=True,
            supports_images=True,
            max_image_size_mb=20.0,
            supports_temperature=False,
            temperature_constraint=TemperatureConstraint.create("fixed"),
            description="Latest reasoning model (200K context) - Optimized for shorter contexts, rapid reasoning",
            aliases=["o4mini"],

@@ -160,16 +159,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
            model_name="gpt-4.1",
            friendly_name="OpenAI (GPT 4.1)",
            intelligence_score=13,
            context_window=1_000_000,
            max_output_tokens=32_768,
            supports_extended_thinking=False,
            supports_system_prompts=True,
            supports_streaming=True,
            supports_function_calling=True,
            supports_json_mode=True,
            supports_images=True,
            max_image_size_mb=20.0,
            supports_temperature=True,
            temperature_constraint=TemperatureConstraint.create("range"),
            description="GPT-4.1 (1M context) - Advanced reasoning model with large context window",
            aliases=["gpt4.1"],

@@ -178,19 +177,19 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
            provider=ProviderType.OPENAI,
            model_name="gpt-5-codex",
            friendly_name="OpenAI (GPT-5 Codex)",
            intelligence_score=17,
            context_window=400_000,
            max_output_tokens=128_000,
            supports_extended_thinking=True,
            supports_system_prompts=True,
            supports_streaming=True,
            supports_function_calling=True,
            supports_json_mode=True,
            supports_images=True,
            max_image_size_mb=20.0,
            supports_temperature=True,
            temperature_constraint=TemperatureConstraint.create("range"),
            description="GPT-5 Codex (400K context) - Specialized for coding, refactoring, and software architecture.",
            aliases=["gpt5-codex", "codex", "gpt-5-code", "gpt5-code"],
        ),
    }

@@ -282,7 +281,7 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
        if category == ToolModelCategory.EXTENDED_REASONING:
            # Prefer models with extended thinking support
            # GPT-5-Codex first for coding tasks
            preferred = find_first(["gpt-5-codex", "o3", "o3-pro", "gpt-5"])
            return preferred if preferred else allowed_models[0]


@@ -1,7 +1,6 @@
"""OpenRouter provider implementation."""

import logging

from utils.env import get_env

@@ -42,7 +41,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
    }

    # Model registry for managing configurations and aliases
    _registry: OpenRouterModelRegistry | None = None

    def __init__(self, api_key: str, **kwargs):
        """Initialize OpenRouter provider.

@@ -70,8 +69,8 @@ class OpenRouterProvider(OpenAICompatibleProvider):
    def _lookup_capabilities(
        self,
        canonical_name: str,
        requested_name: str | None = None,
    ) -> ModelCapabilities | None:
        """Fetch OpenRouter capabilities from the registry or build a generic fallback."""

        capabilities = self._registry.get_capabilities(canonical_name)

@@ -143,7 +142,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
            # Custom models belong to CustomProvider; skip them here so the two
            # providers don't race over the same registrations (important for tests
            # that stub the registry with minimal objects lacking attrs).
            if config.provider == ProviderType.CUSTOM:
                continue

            if restriction_service:

@@ -211,7 +210,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
                continue

            # See note in list_models: respect the CustomProvider boundary.
            if config.provider == ProviderType.CUSTOM:
                continue

            capabilities[model_name] = config


@@ -1,293 +1,38 @@
"""OpenRouter model registry for managing model configurations and aliases.""" """OpenRouter model registry for managing model configurations and aliases."""
import importlib.resources from __future__ import annotations
import logging
from pathlib import Path
from typing import Optional
from utils.env import get_env from .model_registry_base import CAPABILITY_FIELD_NAMES, CapabilityModelRegistry
from .shared import ModelCapabilities, ProviderType
# Import handled via importlib.resources.files() calls directly
from utils.file_utils import read_json_file
from .shared import (
ModelCapabilities,
ProviderType,
TemperatureConstraint,
)
class OpenRouterModelRegistry: class OpenRouterModelRegistry(CapabilityModelRegistry):
"""In-memory view of OpenRouter and custom model metadata. """Capability registry backed by `conf/openrouter_models.json`."""
Role def __init__(self, config_path: str | None = None) -> None:
Parse the packaged ``conf/custom_models.json`` (or user-specified super().__init__(
overrides), construct alias and capability maps, and serve those env_var_name="OPENROUTER_MODELS_CONFIG_PATH",
structures to providers that rely on OpenRouter semantics (both the default_filename="openrouter_models.json",
OpenRouter provider itself and the Custom provider). provider=ProviderType.OPENROUTER,
friendly_prefix="OpenRouter ({model})",
config_path=config_path,
)
Key duties def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
* Load :class:`ModelCapabilities` definitions from configuration files provider_override = entry.get("provider")
* Maintain a case-insensitive alias → canonical name map for fast if isinstance(provider_override, str):
resolution entry_provider = ProviderType(provider_override.lower())
* Provide helpers to list models, list aliases, and resolve an arbitrary elif isinstance(provider_override, ProviderType):
name to its capability object without repeatedly touching the file entry_provider = provider_override
system.
"""
def __init__(self, config_path: Optional[str] = None):
"""Initialize the registry.
Args:
config_path: Path to config file. If None, uses default locations.
"""
self.alias_map: dict[str, str] = {} # alias -> model_name
self.model_map: dict[str, ModelCapabilities] = {} # model_name -> config
# Determine config path and loading strategy
self.use_resources = False
if config_path:
# Direct config_path parameter
self.config_path = Path(config_path)
else: else:
# Check environment variable first entry_provider = ProviderType.OPENROUTER
env_path = get_env("CUSTOM_MODELS_CONFIG_PATH")
if env_path:
# Environment variable path
self.config_path = Path(env_path)
else:
# Try importlib.resources for robust packaging support
self.config_path = None
self.use_resources = False
try: if entry_provider == ProviderType.CUSTOM:
resource_traversable = importlib.resources.files("conf").joinpath("custom_models.json") entry.setdefault("friendly_name", f"Custom ({entry['model_name']})")
if hasattr(resource_traversable, "read_text"): else:
self.use_resources = True entry.setdefault("friendly_name", f"OpenRouter ({entry['model_name']})")
else:
raise AttributeError("read_text not available")
except Exception:
pass
if not self.use_resources: filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
# Fallback to file system paths filtered.setdefault("provider", entry_provider)
potential_paths = [ capability = ModelCapabilities(**filtered)
Path(__file__).parent.parent / "conf" / "custom_models.json", return capability, {}
Path.cwd() / "conf" / "custom_models.json",
]
for path in potential_paths:
if path.exists():
self.config_path = path
break
if self.config_path is None:
self.config_path = potential_paths[0]
# Load configuration
self.reload()
def reload(self) -> None:
"""Reload configuration from disk."""
try:
configs = self._read_config()
self._build_maps(configs)
caller_info = ""
try:
import inspect
caller_frame = inspect.currentframe().f_back
if caller_frame:
caller_name = caller_frame.f_code.co_name
caller_file = (
caller_frame.f_code.co_filename.split("/")[-1] if caller_frame.f_code.co_filename else "unknown"
)
# Look for tool context
while caller_frame:
frame_locals = caller_frame.f_locals
if "self" in frame_locals and hasattr(frame_locals["self"], "get_name"):
tool_name = frame_locals["self"].get_name()
caller_info = f" (called from {tool_name} tool)"
break
caller_frame = caller_frame.f_back
if not caller_info:
caller_info = f" (called from {caller_name} in {caller_file})"
except Exception:
# If frame inspection fails, just continue without caller info
pass
logging.debug(
f"Loaded {len(self.model_map)} OpenRouter models with {len(self.alias_map)} aliases{caller_info}"
)
except ValueError as e:
# Re-raise ValueError only for duplicate aliases (critical config errors)
logging.error(f"Failed to load OpenRouter model configuration: {e}")
# Initialize with empty maps on failure
self.alias_map = {}
self.model_map = {}
if "Duplicate alias" in str(e):
raise
except Exception as e:
logging.error(f"Failed to load OpenRouter model configuration: {e}")
# Initialize with empty maps on failure
self.alias_map = {}
self.model_map = {}
    def _read_config(self) -> list[ModelCapabilities]:
        """Read configuration from file or package resources.

        Returns:
            List of model configurations
        """
        try:
            if self.use_resources:
                # Use importlib.resources for packaged environments
                try:
                    resource_path = importlib.resources.files("conf").joinpath("custom_models.json")
                    if hasattr(resource_path, "read_text"):
                        # Python 3.9+
                        config_text = resource_path.read_text(encoding="utf-8")
                    else:
                        # Python 3.8 fallback
                        with resource_path.open("r", encoding="utf-8") as f:
                            config_text = f.read()

                    import json

                    data = json.loads(config_text)
                    logging.debug("Loaded OpenRouter config from package resources")
                except Exception as e:
                    logging.warning(f"Failed to load config from resources: {e}")
                    return []
            else:
                # Use file path loading
                if not self.config_path.exists():
                    logging.warning(f"OpenRouter model config not found at {self.config_path}")
                    return []

                # Use centralized JSON reading utility
                data = read_json_file(str(self.config_path))
                logging.debug(f"Loaded OpenRouter config from file: {self.config_path}")

            if data is None:
                location = "resources" if self.use_resources else str(self.config_path)
                raise ValueError(f"Could not read or parse JSON from {location}")

            # Parse models
            configs = []
            for model_data in data.get("models", []):
                # Create ModelCapabilities directly from JSON data.
                # Convert the temperature_constraint string into a TemperatureConstraint object.
                temp_constraint_str = model_data.get("temperature_constraint")
                temp_constraint = TemperatureConstraint.create(temp_constraint_str or "range")

                # Set provider-specific defaults based on is_custom flag
                is_custom = model_data.get("is_custom", False)
                if is_custom:
                    model_data.setdefault("provider", ProviderType.CUSTOM)
                    model_data.setdefault("friendly_name", f"Custom ({model_data.get('model_name', 'Unknown')})")
                else:
                    model_data.setdefault("provider", ProviderType.OPENROUTER)
                    model_data.setdefault("friendly_name", f"OpenRouter ({model_data.get('model_name', 'Unknown')})")

                # Replace the string form with the constraint object before constructing ModelCapabilities
                model_data["temperature_constraint"] = temp_constraint

                config = ModelCapabilities(**model_data)
                configs.append(config)

            return configs

        except ValueError:
            # Re-raise ValueError for specific config errors
            raise
        except Exception as e:
            location = "resources" if self.use_resources else str(self.config_path)
            raise ValueError(f"Error reading config from {location}: {e}")
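With this commit, OpenRouter entries move to conf/openrouter_models.json. A minimal entry using only fields the parser above reads is sketched below; the values are illustrative, and optional fields such as `friendly_name` or `temperature_constraint` fall back to the defaults shown in `_read_config`:

```json
{
  "models": [
    {
      "model_name": "openai/gpt-4o",
      "aliases": ["gpt4o", "4o"],
      "context_window": 128000,
      "max_output_tokens": 16384,
      "temperature_constraint": "range"
    }
  ]
}
```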
    def _build_maps(self, configs: list[ModelCapabilities]) -> None:
        """Build alias and model maps from configurations.

        Args:
            configs: List of model configurations
        """
        alias_map = {}
        model_map = {}

        for config in configs:
            # Add to model map
            model_map[config.model_name] = config

            # Add the model_name itself as an alias for case-insensitive lookup,
            # but only if it's not already in the aliases list
            model_name_lower = config.model_name.lower()
            aliases_lower = [alias.lower() for alias in config.aliases]
            if model_name_lower not in aliases_lower:
                if model_name_lower in alias_map:
                    existing_model = alias_map[model_name_lower]
                    if existing_model != config.model_name:
                        raise ValueError(
                            f"Duplicate model name '{config.model_name}' (case-insensitive) found for models "
                            f"'{existing_model}' and '{config.model_name}'"
                        )
                else:
                    alias_map[model_name_lower] = config.model_name

            # Add aliases
            for alias in config.aliases:
                alias_lower = alias.lower()
                if alias_lower in alias_map:
                    existing_model = alias_map[alias_lower]
                    raise ValueError(
                        f"Duplicate alias '{alias}' found for models '{existing_model}' and '{config.model_name}'"
                    )
                alias_map[alias_lower] = config.model_name

        # Atomic update
        self.alias_map = alias_map
        self.model_map = model_map
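The case-insensitive duplicate detection in `_build_maps` can be illustrated with a minimal standalone sketch (simplified names, not the repo's actual classes):

```python
def build_alias_map(models: dict[str, list[str]]) -> dict[str, str]:
    """Map lowercased model names and aliases to canonical model names.

    models: canonical model name -> list of aliases.
    Raises ValueError on a case-insensitive collision between two models.
    """
    alias_map: dict[str, str] = {}
    for model_name, aliases in models.items():
        # The canonical name itself becomes a lookup key, like in _build_maps
        for key in [model_name, *aliases]:
            key_lower = key.lower()
            existing = alias_map.get(key_lower)
            if existing is not None and existing != model_name:
                raise ValueError(
                    f"Duplicate alias '{key}' found for models '{existing}' and '{model_name}'"
                )
            alias_map[key_lower] = model_name
    return alias_map
```

Lookups then work regardless of the caller's casing, and a config that maps the same alias to two different models fails fast at load time rather than resolving unpredictably.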
    def resolve(self, name_or_alias: str) -> Optional[ModelCapabilities]:
        """Resolve a model name or alias to configuration.

        Args:
            name_or_alias: Model name or alias to resolve

        Returns:
            Model configuration if found, None otherwise
        """
        # Try alias lookup (case-insensitive) - this now includes model names too
        alias_lower = name_or_alias.lower()
        if alias_lower in self.alias_map:
            model_name = self.alias_map[alias_lower]
            return self.model_map.get(model_name)

        return None

    def get_capabilities(self, name_or_alias: str) -> Optional[ModelCapabilities]:
        """Get model capabilities for a name or alias.

        Args:
            name_or_alias: Model name or alias

        Returns:
            ModelCapabilities if found, None otherwise
        """
        # Registry now returns ModelCapabilities directly
        return self.resolve(name_or_alias)

    def get_model_config(self, name_or_alias: str) -> Optional[ModelCapabilities]:
        """Backward-compatible wrapper used by providers and older tests."""
        return self.resolve(name_or_alias)

    def list_models(self) -> list[str]:
        """List all available model names."""
        return list(self.model_map.keys())

    def list_aliases(self) -> list[str]:
        """List all available aliases."""
        return list(self.alias_map.keys())

View File

@@ -38,6 +38,7 @@ class ModelProviderRegistry:
     PROVIDER_PRIORITY_ORDER = [
         ProviderType.GOOGLE,  # Direct Gemini access
         ProviderType.OPENAI,  # Direct OpenAI access
+        ProviderType.AZURE,  # Azure-hosted OpenAI deployments
         ProviderType.XAI,  # Direct X.AI GROK access
         ProviderType.DIAL,  # DIAL unified API access
         ProviderType.CUSTOM,  # Local/self-hosted models
@@ -123,6 +124,21 @@ class ModelProviderRegistry:
                     provider_kwargs["base_url"] = gemini_base_url
                     logging.info(f"Initialized Gemini provider with custom endpoint: {gemini_base_url}")
                 provider = provider_class(**provider_kwargs)
+            elif provider_type == ProviderType.AZURE:
+                if not api_key:
+                    return None
+                azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
+                if not azure_endpoint:
+                    logging.warning("AZURE_OPENAI_ENDPOINT missing - skipping Azure OpenAI provider")
+                    return None
+                azure_version = get_env("AZURE_OPENAI_API_VERSION")
+                provider = provider_class(
+                    api_key=api_key,
+                    azure_endpoint=azure_endpoint,
+                    api_version=azure_version,
+                )
             else:
                 if not api_key:
                     return None
@@ -318,6 +334,7 @@ class ModelProviderRegistry:
         key_mapping = {
             ProviderType.GOOGLE: "GEMINI_API_KEY",
             ProviderType.OPENAI: "OPENAI_API_KEY",
+            ProviderType.AZURE: "AZURE_OPENAI_API_KEY",
             ProviderType.XAI: "XAI_API_KEY",
             ProviderType.OPENROUTER: "OPENROUTER_API_KEY",
             ProviderType.CUSTOM: "CUSTOM_API_KEY",  # Can be empty for providers that don't need auth

View File

@@ -53,7 +53,6 @@ class ModelCapabilities:
     # Additional attributes
     max_image_size_mb: float = 0.0
-    is_custom: bool = False
     temperature_constraint: TemperatureConstraint = field(
         default_factory=lambda: RangeTemperatureConstraint(0.0, 2.0, 0.3)
     )
@@ -102,9 +101,6 @@ class ModelCapabilities:
         if self.supports_images:
             score += 1
-        if self.is_custom:
-            score -= 1
         return max(0, min(100, score))

     @staticmethod

View File

@@ -10,6 +10,7 @@ class ProviderType(Enum):
     GOOGLE = "google"
     OPENAI = "openai"
+    AZURE = "azure"
     XAI = "xai"
     OPENROUTER = "openrouter"
     CUSTOM = "custom"

View File

@@ -21,7 +21,7 @@ py-modules = ["server", "config"]
 "*" = ["conf/*.json"]

 [tool.setuptools.data-files]
-"conf" = ["conf/custom_models.json"]
+"conf" = ["conf/custom_models.json", "conf/openrouter_models.json", "conf/azure_models.json"]

 [project.scripts]
 zen-mcp-server = "server:run"

View File

@@ -377,6 +377,7 @@ def configure_providers():
         value = get_env(key)
         logger.debug(f"  {key}: {'[PRESENT]' if value else '[MISSING]'}")

     from providers import ModelProviderRegistry
+    from providers.azure_openai import AzureOpenAIProvider
     from providers.custom import CustomProvider
     from providers.dial import DIALModelProvider
     from providers.gemini import GeminiModelProvider
@@ -411,6 +412,27 @@ def configure_providers():
         else:
             logger.debug("OpenAI API key is placeholder value")

+    # Check for Azure OpenAI configuration
+    azure_key = get_env("AZURE_OPENAI_API_KEY")
+    azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
+    azure_models_available = False
+    if azure_key and azure_key != "your_azure_openai_key_here" and azure_endpoint:
+        try:
+            from providers.azure_registry import AzureModelRegistry
+
+            azure_registry = AzureModelRegistry()
+            if azure_registry.list_models():
+                valid_providers.append("Azure OpenAI")
+                has_native_apis = True
+                azure_models_available = True
+                logger.info("Azure OpenAI configuration detected")
+            else:
+                logger.warning(
+                    "Azure OpenAI models configuration is empty. Populate conf/azure_models.json or set AZURE_MODELS_CONFIG_PATH."
+                )
+        except Exception as exc:
+            logger.warning(f"Failed to load Azure OpenAI models: {exc}")
+
     # Check for X.AI API key
     xai_key = get_env("XAI_API_KEY")
     if xai_key and xai_key != "your_xai_api_key_here":
@@ -468,6 +490,10 @@ def configure_providers():
         ModelProviderRegistry.register_provider(ProviderType.OPENAI, OpenAIModelProvider)
         registered_providers.append(ProviderType.OPENAI.value)
         logger.debug(f"Registered provider: {ProviderType.OPENAI.value}")
+    if azure_models_available:
+        ModelProviderRegistry.register_provider(ProviderType.AZURE, AzureOpenAIProvider)
+        registered_providers.append(ProviderType.AZURE.value)
+        logger.debug(f"Registered provider: {ProviderType.AZURE.value}")
     if xai_key and xai_key != "your_xai_api_key_here":
         ModelProviderRegistry.register_provider(ProviderType.XAI, XAIModelProvider)
         registered_providers.append(ProviderType.XAI.value)
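The Azure check above loads deployments from conf/azure_models.json (or the file named by AZURE_MODELS_CONFIG_PATH). A plausible shape for an entry is sketched below, inferred from the tests in this commit (the `deployment` field mapping a canonical model name to an Azure deployment); consult the shipped conf/azure_models.json for the authoritative schema:

```json
{
  "models": [
    {
      "model_name": "gpt-4o",
      "deployment": "prod-gpt4o",
      "friendly_name": "Azure GPT-4o",
      "aliases": ["azure-gpt4o"],
      "context_window": 128000,
      "max_output_tokens": 16384
    }
  ]
}
```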

View File

@@ -64,6 +64,14 @@ def test_error_listing_respects_env_restrictions(monkeypatch, reset_registry):
     monkeypatch.setenv("OPENAI_API_KEY", "test-openai")
     monkeypatch.setenv("OPENROUTER_API_KEY", "test-openrouter")
     monkeypatch.delenv("XAI_API_KEY", raising=False)
+    # Ensure Azure provider stays disabled regardless of developer workstation env
+    for azure_var in (
+        "AZURE_OPENAI_API_KEY",
+        "AZURE_OPENAI_ENDPOINT",
+        "AZURE_OPENAI_ALLOWED_MODELS",
+        "AZURE_MODELS_CONFIG_PATH",
+    ):
+        monkeypatch.delenv(azure_var, raising=False)
     monkeypatch.setenv("ZEN_MCP_FORCE_ENV_OVERRIDE", "false")
     env_config.reload_env({"ZEN_MCP_FORCE_ENV_OVERRIDE": "false"})
     try:
@@ -103,6 +111,13 @@ def test_error_listing_respects_env_restrictions(monkeypatch, reset_registry):
         for var in ("XAI_API_KEY", "CUSTOM_API_URL", "CUSTOM_API_KEY", "DIAL_API_KEY"):
             monkeypatch.delenv(var, raising=False)
+        for azure_var in (
+            "AZURE_OPENAI_API_KEY",
+            "AZURE_OPENAI_ENDPOINT",
+            "AZURE_OPENAI_ALLOWED_MODELS",
+            "AZURE_MODELS_CONFIG_PATH",
+        ):
+            monkeypatch.delenv(azure_var, raising=False)
         ModelProviderRegistry.reset_for_testing()
         model_restrictions._restriction_service = None
@@ -136,6 +151,13 @@ def test_error_listing_without_restrictions_shows_full_catalog(monkeypatch, rese
     monkeypatch.setenv("OPENROUTER_API_KEY", "test-openrouter")
     monkeypatch.setenv("XAI_API_KEY", "test-xai")
     monkeypatch.setenv("ZEN_MCP_FORCE_ENV_OVERRIDE", "false")
+    for azure_var in (
+        "AZURE_OPENAI_API_KEY",
+        "AZURE_OPENAI_ENDPOINT",
+        "AZURE_OPENAI_ALLOWED_MODELS",
+        "AZURE_MODELS_CONFIG_PATH",
+    ):
+        monkeypatch.delenv(azure_var, raising=False)
     env_config.reload_env({"ZEN_MCP_FORCE_ENV_OVERRIDE": "false"})
     try:
         import dotenv

View File

@@ -0,0 +1,145 @@
import sys
import types

import pytest

if "openai" not in sys.modules:  # pragma: no cover - test shim for optional dependency
    stub = types.ModuleType("openai")
    stub.AzureOpenAI = object  # Replaced with a mock inside tests
    sys.modules["openai"] = stub

from providers.azure_openai import AzureOpenAIProvider
from providers.shared import ModelCapabilities, ProviderType


class _DummyResponse:
    def __init__(self):
        self.choices = [
            types.SimpleNamespace(
                message=types.SimpleNamespace(content="hello"),
                finish_reason="stop",
            )
        ]
        self.model = "prod-gpt4o"
        self.id = "resp-123"
        self.created = 0
        self.usage = types.SimpleNamespace(
            prompt_tokens=5,
            completion_tokens=3,
            total_tokens=8,
        )


@pytest.fixture
def dummy_azure_client(monkeypatch):
    captured = {}

    class _DummyAzureClient:
        def __init__(self, **kwargs):
            captured["client_kwargs"] = kwargs
            self.chat = types.SimpleNamespace(completions=types.SimpleNamespace(create=self._create_completion))
            self.responses = types.SimpleNamespace(create=self._create_response)

        def _create_completion(self, **kwargs):
            captured["request_kwargs"] = kwargs
            return _DummyResponse()

        def _create_response(self, **kwargs):
            captured["responses_kwargs"] = kwargs
            return _DummyResponse()

    monkeypatch.delenv("AZURE_OPENAI_ALLOWED_MODELS", raising=False)
    monkeypatch.setattr("providers.azure_openai.AzureOpenAI", _DummyAzureClient)
    return captured


def test_generate_content_uses_deployment_mapping(dummy_azure_client):
    provider = AzureOpenAIProvider(
        api_key="key",
        azure_endpoint="https://example.openai.azure.com/",
        deployments={"gpt-4o": "prod-gpt4o"},
    )

    result = provider.generate_content("hello", "gpt-4o")

    assert dummy_azure_client["request_kwargs"]["model"] == "prod-gpt4o"
    assert result.model_name == "gpt-4o"
    assert result.provider == ProviderType.AZURE
    assert provider.validate_model_name("prod-gpt4o")


def test_generate_content_accepts_deployment_alias(dummy_azure_client):
    provider = AzureOpenAIProvider(
        api_key="key",
        azure_endpoint="https://example.openai.azure.com/",
        deployments={"gpt-4o-mini": "mini-deployment"},
    )

    # Calling with the deployment alias should still resolve properly.
    result = provider.generate_content("hi", "mini-deployment")

    assert dummy_azure_client["request_kwargs"]["model"] == "mini-deployment"
    assert result.model_name == "gpt-4o-mini"


def test_client_initialization_uses_endpoint_and_version(dummy_azure_client):
    provider = AzureOpenAIProvider(
        api_key="key",
        azure_endpoint="https://example.openai.azure.com/",
        api_version="2024-03-15-preview",
        deployments={"gpt-4o": "prod"},
    )

    _ = provider.client

    assert dummy_azure_client["client_kwargs"]["azure_endpoint"] == "https://example.openai.azure.com"
    assert dummy_azure_client["client_kwargs"]["api_version"] == "2024-03-15-preview"


def test_deployment_overrides_capabilities(dummy_azure_client):
    provider = AzureOpenAIProvider(
        api_key="key",
        azure_endpoint="https://example.openai.azure.com/",
        deployments={
            "gpt-4o": {
                "deployment": "prod-gpt4o",
                "friendly_name": "Azure GPT-4o EU",
                "intelligence_score": 19,
                "supports_temperature": False,
                "temperature_constraint": "fixed",
            }
        },
    )

    caps = provider.get_capabilities("gpt-4o")

    assert caps.friendly_name == "Azure GPT-4o EU"
    assert caps.intelligence_score == 19
    assert not caps.supports_temperature


def test_registry_configuration_merges_capabilities(dummy_azure_client, monkeypatch):
    def fake_registry_entries(self):
        capability = ModelCapabilities(
            provider=ProviderType.AZURE,
            model_name="gpt-4o",
            friendly_name="Azure GPT-4o Registry",
            context_window=500_000,
            max_output_tokens=128_000,
        )
        return {"gpt-4o": {"deployment": "registry-deployment", "capability": capability}}

    monkeypatch.setattr(AzureOpenAIProvider, "_load_registry_entries", fake_registry_entries)

    provider = AzureOpenAIProvider(
        api_key="key",
        azure_endpoint="https://example.openai.azure.com/",
    )

    # Capability should come from registry
    caps = provider.get_capabilities("gpt-4o")
    assert caps.friendly_name == "Azure GPT-4o Registry"
    assert caps.context_window == 500_000

    # API call should use deployment defined in registry
    provider.generate_content("hello", "gpt-4o")
    assert dummy_azure_client["request_kwargs"]["model"] == "registry-deployment"
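The two-way name resolution these tests exercise (canonical model name in, Azure deployment name out, and deployment aliases accepted on input) can be sketched as a small standalone helper; this is a simplified illustration, not the provider's actual implementation, which also merges registry capabilities:

```python
class DeploymentResolver:
    """Two-way lookup between canonical model names and Azure deployment names."""

    def __init__(self, deployments: dict[str, str]):
        # canonical model name -> Azure deployment name
        self._to_deployment = dict(deployments)
        # reverse map so callers may pass the deployment name directly
        self._to_canonical = {d: c for c, d in deployments.items()}

    def deployment_for(self, name: str) -> str:
        """Return the deployment to send to the Azure API."""
        if name in self._to_deployment:
            return self._to_deployment[name]
        if name in self._to_canonical:
            return name  # already a deployment name
        raise KeyError(f"Unknown model or deployment: {name}")

    def canonical_for(self, name: str) -> str:
        """Return the canonical model name to report back to the caller."""
        if name in self._to_canonical:
            return self._to_canonical[name]
        if name in self._to_deployment:
            return name  # already canonical
        raise KeyError(f"Unknown model or deployment: {name}")
```

This mirrors why `test_generate_content_accepts_deployment_alias` passes: whichever name the caller uses, the request carries the deployment name while the result reports the canonical model name.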

View File

@@ -34,8 +34,7 @@ class TestCustomOpenAITemperatureParameterFix:
         config_models = [
             {
                 "model_name": "gpt-5-2025-08-07",
-                "provider": "ProviderType.OPENAI",
-                "is_custom": True,
+                "provider": "openai",
                 "context_window": 400000,
                 "max_output_tokens": 128000,
                 "supports_extended_thinking": True,

View File

@@ -62,9 +62,9 @@ class TestCustomProvider:
             with pytest.raises(ValueError):
                 provider.get_capabilities("o3")

-            # Test with a custom model (is_custom=true)
+            # Test with a custom model from the local registry
             capabilities = provider.get_capabilities("local-llama")
-            assert capabilities.provider == ProviderType.CUSTOM  # local-llama has is_custom=true
+            assert capabilities.provider == ProviderType.CUSTOM
             assert capabilities.context_window > 0
         finally:

View File

@@ -181,7 +181,7 @@ class TestModelEnumeration:
         # Configure environment with OpenRouter access only
         self._setup_environment({"OPENROUTER_API_KEY": "test-openrouter-key"})

-        # Create a temporary custom model config with a free variant
+        # Create a temporary OpenRouter model config with a free variant
         custom_config = {
             "models": [
                 {
@@ -199,9 +199,9 @@ class TestModelEnumeration:
             ]
         }
-        config_path = tmp_path / "custom_models.json"
+        config_path = tmp_path / "openrouter_models.json"
         config_path.write_text(json.dumps(custom_config), encoding="utf-8")
-        monkeypatch.setenv("CUSTOM_MODELS_CONFIG_PATH", str(config_path))
+        monkeypatch.setenv("OPENROUTER_MODELS_CONFIG_PATH", str(config_path))

         # Reset cached registries so the temporary config is loaded
         from tools.shared.base_tool import BaseTool

View File

@@ -366,8 +366,8 @@ class TestCustomProviderOpenRouterRestrictions:
         assert not provider.validate_model_name("sonnet")
         assert not provider.validate_model_name("haiku")

-        # Should still validate custom models (is_custom=true) regardless of restrictions
-        assert provider.validate_model_name("local-llama")  # This has is_custom=true
+        # Should still validate custom models defined in conf/custom_models.json
+        assert provider.validate_model_name("local-llama")

     @patch.dict(os.environ, {"OPENROUTER_ALLOWED_MODELS": "opus", "OPENROUTER_API_KEY": "test-key"})
     def test_custom_provider_openrouter_capabilities_restrictions(self):
@@ -389,7 +389,7 @@ class TestCustomProviderOpenRouterRestrictions:
         with pytest.raises(ValueError):
             provider.get_capabilities("haiku")

-        # Should still work for custom models (is_custom=true)
+        # Should still work for custom models
         capabilities = provider.get_capabilities("local-llama")
         assert capabilities.provider == ProviderType.CUSTOM

View File

@@ -172,7 +172,7 @@ class TestOpenRouterAutoMode:
             def mock_resolve(model_name):
                 if model_name in model_names:
                     mock_config = Mock()
-                    mock_config.is_custom = False
+                    mock_config.provider = ProviderType.OPENROUTER
                     mock_config.aliases = []  # Empty list of aliases
                     mock_config.get_effective_capability_rank = Mock(return_value=50)  # Add ranking method
                     return mock_config

View File

@@ -3,6 +3,7 @@
 import json
 import os
 import tempfile
+from unittest.mock import patch

 import pytest
@@ -49,7 +50,7 @@ class TestOpenRouterModelRegistry:
             os.unlink(temp_path)

     def test_environment_variable_override(self):
-        """Test OPENROUTER_MODELS_PATH environment variable."""
+        """Test OPENROUTER_MODELS_CONFIG_PATH environment variable."""
         # Create custom config
         config_data = {
             "models": [
@@ -63,8 +64,8 @@ class TestOpenRouterModelRegistry:
         try:
             # Set environment variable
-            original_env = os.environ.get("CUSTOM_MODELS_CONFIG_PATH")
-            os.environ["CUSTOM_MODELS_CONFIG_PATH"] = temp_path
+            original_env = os.environ.get("OPENROUTER_MODELS_CONFIG_PATH")
+            os.environ["OPENROUTER_MODELS_CONFIG_PATH"] = temp_path

             # Create registry without explicit path
             registry = OpenRouterModelRegistry()
@@ -76,9 +77,9 @@ class TestOpenRouterModelRegistry:
         finally:
             # Restore environment
             if original_env is not None:
-                os.environ["CUSTOM_MODELS_CONFIG_PATH"] = original_env
+                os.environ["OPENROUTER_MODELS_CONFIG_PATH"] = original_env
             else:
-                del os.environ["CUSTOM_MODELS_CONFIG_PATH"]
+                del os.environ["OPENROUTER_MODELS_CONFIG_PATH"]
             os.unlink(temp_path)

     def test_alias_resolution(self):
@@ -161,7 +162,7 @@ class TestOpenRouterModelRegistry:
             os.unlink(temp_path)

     def test_backwards_compatibility_max_tokens(self):
-        """Test that old max_tokens field is no longer supported (should result in empty registry)."""
+        """Test that legacy max_tokens field maps to max_output_tokens."""
         config_data = {
             "models": [
                 {
@@ -178,19 +179,17 @@ class TestOpenRouterModelRegistry:
             temp_path = f.name

         try:
-            # Should gracefully handle the error and result in empty registry
-            registry = OpenRouterModelRegistry(config_path=temp_path)
-            # Registry should be empty due to config error
-            assert len(registry.list_models()) == 0
-            assert len(registry.list_aliases()) == 0
-            assert registry.resolve("old") is None
+            with patch.dict("os.environ", {}, clear=True):
+                with pytest.raises(ValueError, match="max_output_tokens"):
+                    OpenRouterModelRegistry(config_path=temp_path)
         finally:
             os.unlink(temp_path)

     def test_missing_config_file(self):
         """Test behavior with missing config file."""
         # Use a non-existent path
-        registry = OpenRouterModelRegistry(config_path="/non/existent/path.json")
+        with patch.dict("os.environ", {}, clear=True):
+            registry = OpenRouterModelRegistry(config_path="/non/existent/path.json")

         # Should initialize with empty maps
         assert len(registry.list_models()) == 0

View File

@@ -1,5 +1,7 @@
 """Tests for uvx path resolution functionality."""

+import json
+import tempfile
 from pathlib import Path
 from unittest.mock import patch
@@ -18,8 +20,8 @@ class TestUvxPathResolution:
     def test_config_path_resolution(self):
         """Test that the config path resolution finds the config file in multiple locations."""
         # Check that the config file exists in the development location
-        config_file = Path(__file__).parent.parent / "conf" / "custom_models.json"
-        assert config_file.exists(), "Config file should exist in conf/custom_models.json"
+        config_file = Path(__file__).parent.parent / "conf" / "openrouter_models.json"
+        assert config_file.exists(), "Config file should exist in conf/openrouter_models.json"

         # Test that a registry can find and use the config
         registry = OpenRouterModelRegistry()
@@ -34,7 +36,7 @@ class TestUvxPathResolution:
     def test_explicit_config_path_override(self):
         """Test that explicit config path works correctly."""
-        config_path = Path(__file__).parent.parent / "conf" / "custom_models.json"
+        config_path = Path(__file__).parent.parent / "conf" / "openrouter_models.json"

         registry = OpenRouterModelRegistry(config_path=str(config_path))
@@ -44,41 +46,62 @@ class TestUvxPathResolution:
     def test_environment_variable_override(self):
         """Test that CUSTOM_MODELS_CONFIG_PATH environment variable works."""
-        config_path = Path(__file__).parent.parent / "conf" / "custom_models.json"
+        config_path = Path(__file__).parent.parent / "conf" / "openrouter_models.json"

-        with patch.dict("os.environ", {"CUSTOM_MODELS_CONFIG_PATH": str(config_path)}):
+        with patch.dict("os.environ", {"OPENROUTER_MODELS_CONFIG_PATH": str(config_path)}):
             registry = OpenRouterModelRegistry()

             # Should use environment path
             assert registry.config_path == config_path
             assert len(registry.list_models()) > 0

-    @patch("providers.openrouter_registry.importlib.resources.files")
-    @patch("pathlib.Path.exists")
-    def test_multiple_path_fallback(self, mock_exists, mock_files):
-        """Test that multiple path resolution works for different deployment scenarios."""
-        # Make resources loading fail to trigger file system fallback
+    @patch("providers.model_registry_base.importlib.resources.files")
+    def test_multiple_path_fallback(self, mock_files):
+        """Test that file-system fallback works when resource loading fails."""
         mock_files.side_effect = Exception("Resource loading failed")

-        # Simulate dev path failing, and working directory path succeeding
-        # The third `True` is for the check within `reload()`
-        mock_exists.side_effect = [False, True, True]
-
-        registry = OpenRouterModelRegistry()
-        # Should have fallen back to file system mode
-        assert not registry.use_resources, "Should fall back to file system when resources fail"
-        # Assert that the registry fell back to the second potential path
-        assert registry.config_path == Path.cwd() / "conf" / "custom_models.json"
-        # Should load models successfully
-        assert len(registry.list_models()) > 0
+        with tempfile.TemporaryDirectory() as tmpdir:
+            temp_dir = Path(tmpdir)
+            conf_dir = temp_dir / "conf"
+            conf_dir.mkdir(parents=True, exist_ok=True)
+            config_path = conf_dir / "openrouter_models.json"
+            config_path.write_text(
+                json.dumps(
+                    {
+                        "models": [
+                            {
+                                "model_name": "test/model",
+                                "aliases": ["testalias"],
+                                "context_window": 1024,
+                                "max_output_tokens": 512,
+                            }
+                        ]
+                    },
+                    indent=2,
+                )
+            )
+
+            original_exists = Path.exists
+
+            def fake_exists(path_self):
+                if str(path_self).endswith("conf/openrouter_models.json") and path_self != config_path:
+                    return False
+                if path_self == config_path:
+                    return True
+                return original_exists(path_self)
+
+            with patch("pathlib.Path.cwd", return_value=temp_dir), patch("pathlib.Path.exists", fake_exists):
+                registry = OpenRouterModelRegistry()
+
+            assert not registry.use_resources
+            assert registry.config_path == config_path
+            assert "test/model" in registry.list_models()

     def test_missing_config_handling(self):
         """Test behavior when config file is missing."""
         # Use a non-existent path
-        registry = OpenRouterModelRegistry(config_path="/nonexistent/path/config.json")
+        with patch.dict("os.environ", {}, clear=True):
+            registry = OpenRouterModelRegistry(config_path="/nonexistent/path/config.json")

         # Should gracefully handle missing config
         assert len(registry.list_models()) == 0

View File

@@ -166,8 +166,10 @@ class TestXAIProvider:
         """Test model restrictions functionality."""
         # Clear cached restriction service
         import utils.model_restrictions
+        from providers.registry import ModelProviderRegistry

         utils.model_restrictions._restriction_service = None
+        ModelProviderRegistry.reset_for_testing()

         provider = XAIModelProvider("test-key")
@@ -187,8 +189,10 @@ class TestXAIProvider:
         """Test multiple models in restrictions."""
         # Clear cached restriction service
         import utils.model_restrictions
+        from providers.registry import ModelProviderRegistry

         utils.model_restrictions._restriction_service = None
+        ModelProviderRegistry.reset_for_testing()

         provider = XAIModelProvider("test-key")

View File

@@ -11,6 +11,8 @@ from typing import Any, Optional
 from mcp.types import TextContent

+from providers.custom_registry import CustomEndpointModelRegistry
+from providers.openrouter_registry import OpenRouterModelRegistry
 from tools.models import ToolModelCategory, ToolOutput
 from tools.shared.base_models import ToolRequest
 from tools.shared.base_tool import BaseTool
@@ -80,7 +82,6 @@ class ListModelsTool(BaseTool):
         Returns:
             Formatted list of models by provider
         """
-        from providers.openrouter_registry import OpenRouterModelRegistry
         from providers.registry import ModelProviderRegistry
         from providers.shared import ProviderType
         from utils.model_restrictions import get_restriction_service
@@ -99,6 +100,7 @@ class ListModelsTool(BaseTool):
         provider_info = {
             ProviderType.GOOGLE: {"name": "Google Gemini", "env_key": "GEMINI_API_KEY"},
             ProviderType.OPENAI: {"name": "OpenAI", "env_key": "OPENAI_API_KEY"},
+            ProviderType.AZURE: {"name": "Azure OpenAI", "env_key": "AZURE_OPENAI_API_KEY"},
             ProviderType.XAI: {"name": "X.AI (Grok)", "env_key": "XAI_API_KEY"},
             ProviderType.DIAL: {"name": "AI DIAL", "env_key": "DIAL_API_KEY"},
         }
@@ -317,12 +319,12 @@ class ListModelsTool(BaseTool):
         output_lines.append("**Description**: Local models via Ollama, vLLM, LM Studio, etc.")

         try:
-            registry = OpenRouterModelRegistry()
+            registry = CustomEndpointModelRegistry()
             custom_models = []

             for alias in registry.list_aliases():
                 config = registry.resolve(alias)
-                if config and config.is_custom:
+                if config:
                     custom_models.append((alias, config))

             if custom_models:

View File

@@ -82,6 +82,7 @@ class BaseTool(ABC):
     # Class-level cache for OpenRouter registry to avoid multiple loads
     _openrouter_registry_cache = None
+    _custom_registry_cache = None

     @classmethod
     def _get_openrouter_registry(cls):
@@ -94,6 +95,16 @@ class BaseTool(ABC):
             logger.debug("Created cached OpenRouter registry instance")
         return BaseTool._openrouter_registry_cache

+    @classmethod
+    def _get_custom_registry(cls):
+        """Get cached custom-endpoint registry instance."""
+        if BaseTool._custom_registry_cache is None:
+            from providers.custom_registry import CustomEndpointModelRegistry
+
+            BaseTool._custom_registry_cache = CustomEndpointModelRegistry()
+            logger.debug("Created cached Custom registry instance")
+        return BaseTool._custom_registry_cache
+
     def __init__(self):
         # Cache tool metadata at initialization to avoid repeated calls
         self.name = self.get_name()
@@ -266,14 +277,10 @@ class BaseTool(ABC):
         custom_url = get_env("CUSTOM_API_URL")
         if custom_url:
             try:
-                registry = self._get_openrouter_registry()
-                # Find all custom models (is_custom=true)
+                registry = self._get_custom_registry()
                 for alias in registry.list_aliases():
-                    config = registry.resolve(alias)
-                    # Check if this is a custom model that requires custom endpoints
-                    if config and config.is_custom:
-                        if alias not in all_models:
-                            all_models.append(alias)
+                    if alias not in all_models:
+                        all_models.append(alias)
             except Exception as e:
                 import logging
@@ -1282,12 +1289,7 @@ When recommending searches, be specific about what information you need and why
         try:
             registry = self._get_openrouter_registry()
-            # Include every known alias so MCP enum matches registry capabilities
             for alias in registry.list_aliases():
-                config = registry.resolve(alias)
-                if config and config.is_custom:
-                    # Custom-only models require CUSTOM_API_URL; defer to custom block
-                    continue
                 if alias not in all_models:
                     all_models.append(alias)
         except Exception as exc:  # pragma: no cover - logged for observability
@@ -1299,10 +1301,9 @@ When recommending searches, be specific about what information you need and why
         custom_url = get_env("CUSTOM_API_URL")
         if custom_url:
             try:
-                registry = self._get_openrouter_registry()
+                registry = self._get_custom_registry()
                 for alias in registry.list_aliases():
-                    config = registry.resolve(alias)
-                    if config and config.is_custom and alias not in all_models:
-                        all_models.append(alias)
+                    if alias not in all_models:
+                        all_models.append(alias)
             except Exception as exc:  # pragma: no cover - logged for observability
                 import logging

View File

@@ -4,6 +4,7 @@ from __future__ import annotations

 import os
 from collections.abc import Mapping
+from contextlib import contextmanager
 from pathlib import Path

 try:
@@ -86,3 +87,25 @@ def get_all_env() -> dict[str, str | None]:
     """Expose the loaded .env mapping for diagnostics/logging."""
     return dict(_DOTENV_VALUES)
+
+
+@contextmanager
+def suppress_env_vars(*names: str):
+    """Temporarily remove environment variables during the context.
+
+    Args:
+        names: Environment variable names to remove. Empty or falsy names are ignored.
+    """
+    removed: dict[str, str] = {}
+    try:
+        for name in names:
+            if not name:
+                continue
+            if name in os.environ:
+                removed[name] = os.environ[name]
+                del os.environ[name]
+        yield
+    finally:
+        for name, value in removed.items():
+            os.environ[name] = value
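The new `suppress_env_vars` helper hides selected environment variables for the duration of a `with` block and restores them afterwards, even if the block raises. A quick usage sketch, with the context manager inlined from the diff so the snippet is self-contained (in the project it would be imported from the env module instead):

```python
import os
from contextlib import contextmanager


@contextmanager
def suppress_env_vars(*names):
    """Temporarily remove environment variables during the context."""
    removed = {}
    try:
        for name in names:
            if not name:
                continue
            if name in os.environ:
                removed[name] = os.environ[name]
                del os.environ[name]
        yield
    finally:
        # Restore every variable we removed, even on exceptions
        for name, value in removed.items():
            os.environ[name] = value


os.environ["OPENAI_API_KEY"] = "sk-test"
with suppress_env_vars("OPENAI_API_KEY", "SOME_UNSET_VAR"):
    # The key is invisible here; unset names are simply skipped
    assert "OPENAI_API_KEY" not in os.environ
assert os.environ["OPENAI_API_KEY"] == "sk-test"  # restored on exit
```

The `try`/`finally` around the removal loop means variables already deleted are restored even if removing a later name (or the body of the `with` block) raises, which is what makes the helper safe for test isolation.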