feat!: breaking change - OpenRouter models are now read from conf/openrouter_models.json while Custom / Self-hosted models are read from conf/custom_models.json
feat: Azure OpenAI / Azure AI Foundry support. Models should be defined in conf/azure_models.json (or a custom path). See .env.example for environment variables, or see the README. https://github.com/BeehiveInnovations/zen-mcp-server/issues/265
feat: OpenRouter / Custom models / Azure can each use a custom config path now (see .env.example)
refactor: Model registry class made abstract; OpenRouter / Custom provider / Azure OpenAI now subclass it
refactor!: breaking change - the `is_custom` property has been removed from model_capabilities.py (and thus from custom_models.json) since each provider's models are now read from separate configuration files
.env.example
@@ -17,6 +17,15 @@ GEMINI_API_KEY=your_gemini_api_key_here
 # Get your OpenAI API key from: https://platform.openai.com/api-keys
 OPENAI_API_KEY=your_openai_api_key_here
 
+# Azure OpenAI mirrors OpenAI models through Azure-hosted deployments
+# Set the endpoint from Azure Portal. Models are defined in conf/azure_models.json
+# (or the file referenced by AZURE_MODELS_CONFIG_PATH).
+AZURE_OPENAI_API_KEY=your_azure_openai_key_here
+AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
+# AZURE_OPENAI_API_VERSION=2024-02-15-preview
+# AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
+# AZURE_MODELS_CONFIG_PATH=/absolute/path/to/custom_azure_models.json
+
 # Get your X.AI API key from: https://console.x.ai/
 XAI_API_KEY=your_xai_api_key_here
 
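Zen activates a provider only when its credentials are present in the environment; for Azure that check might look like the following sketch, which uses the variable names from this .env.example but is not the server's real detection logic:

```python
import os

def azure_configured(env=None) -> bool:
    """True when both Azure credentials are set (illustrative sketch only)."""
    env = os.environ if env is None else env
    return bool(env.get("AZURE_OPENAI_API_KEY")) and bool(env.get("AZURE_OPENAI_ENDPOINT"))
```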
20  README.md
@@ -3,7 +3,7 @@
 [zen_web.webm](https://github.com/user-attachments/assets/851e3911-7f06-47c0-a4ab-a2601236697c)
 
 <div align="center">
-<b>🤖 <a href="https://www.anthropic.com/claude-code">Claude Code</a> OR <a href="https://github.com/google-gemini/gemini-cli">Gemini CLI</a> OR <a href="https://github.com/openai/codex">Codex CLI</a> + [Gemini / OpenAI / Grok / OpenRouter / DIAL / Ollama / Anthropic / Any Model] = Your Ultimate AI Development Team</b>
+<b>🤖 <a href="https://www.anthropic.com/claude-code">Claude Code</a> OR <a href="https://github.com/google-gemini/gemini-cli">Gemini CLI</a> OR <a href="https://github.com/openai/codex">Codex CLI</a> + [Gemini / OpenAI / Azure / Grok / OpenRouter / DIAL / Ollama / Anthropic / Any Model] = Your Ultimate AI Development Team</b>
 </div>
 
 <br/>
@@ -85,6 +85,7 @@ For best results, use Claude Code with:
 - **[OpenRouter](https://openrouter.ai/)** - Access multiple models with one API
 - **[Gemini](https://makersuite.google.com/app/apikey)** - Google's latest models
 - **[OpenAI](https://platform.openai.com/api-keys)** - O3, GPT-5 series
+- **[Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/)** - Enterprise deployments of GPT-4o, GPT-4.1, GPT-5 family
 - **[X.AI](https://console.x.ai/)** - Grok models
 - **[DIAL](https://dialx.ai/)** - Vendor-agnostic model access
 - **[Ollama](https://ollama.ai/)** - Local models (free)
@@ -132,6 +133,10 @@ cd zen-mcp-server
 👉 **[Complete Setup Guide](docs/getting-started.md)** with detailed installation, configuration for Gemini / Codex, and troubleshooting
 👉 **[Cursor & VS Code Setup](docs/getting-started.md#ide-clients)** for IDE integration instructions
 
+## Provider Configuration
+
+Zen activates any provider that has credentials in your `.env`. See `.env.example` for deeper customization.
+
 ## Core Tools
 
 > **Note:** Each tool comes with its own multi-step workflow, parameters, and descriptions that consume valuable context window space even when not in use. To optimize performance, some tools are disabled by default. See [Tool Configuration](#tool-configuration) below to enable them.
@@ -247,7 +252,7 @@ DISABLED_TOOLS=
 - **[Context revival](docs/context-revival.md)** - Continue conversations even after context resets
 
 **Model Support**
-- **Multiple providers** - Gemini, OpenAI, X.AI, OpenRouter, DIAL, Ollama
+- **Multiple providers** - Gemini, OpenAI, Azure, X.AI, OpenRouter, DIAL, Ollama
 - **Latest models** - GPT-5, Gemini 2.5 Pro, O3, Grok-4, local Llama
 - **[Thinking modes](docs/advanced-usage.md#thinking-modes)** - Control reasoning depth vs cost
 - **Vision support** - Analyze images, diagrams, screenshots
@@ -288,6 +293,7 @@ DISABLED_TOOLS=
 - [Tools Reference](docs/tools/) - All tools with examples
 - [Advanced Usage](docs/advanced-usage.md) - Power user features
 - [Configuration](docs/configuration.md) - Environment variables, restrictions
+- [Adding Providers](docs/adding_providers.md) - Provider-specific setup (OpenAI, Azure, custom gateways)
 - [Model Ranking Guide](docs/model_ranking.md) - How intelligence scores drive auto-mode suggestions
 
 **🔧 Setup & Support**
@@ -303,10 +309,12 @@ Apache 2.0 License - see [LICENSE](LICENSE) file for details.
 
 Built with the power of **Multi-Model AI** collaboration 🤝
 - **A**ctual **I**ntelligence by real Humans
-- [MCP (Model Context Protocol)](https://modelcontextprotocol.com) by Anthropic
-- [Claude Code](https://claude.ai/code) - Your AI coding orchestrator
-- [Gemini 2.5 Pro & Flash](https://ai.google.dev/) - Extended thinking & fast analysis
-- [OpenAI O3 & GPT-5](https://openai.com/) - Strong reasoning & latest capabilities
+- [MCP (Model Context Protocol)](https://modelcontextprotocol.com)
+- [Codex CLI](https://developers.openai.com/codex/cli)
+- [Claude Code](https://claude.ai/code)
+- [Gemini](https://ai.google.dev/)
+- [OpenAI](https://openai.com/)
+- [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/)
 
 ### Star History
 
45  conf/azure_models.json Normal file
@@ -0,0 +1,45 @@
+{
+  "_README": {
+    "description": "Model metadata for Azure OpenAI / Azure AI Foundry-backed provider. The `models` definition can be copied from openrouter_models.json / custom_models.json",
+    "documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/azure_models.md",
+    "usage": "Models listed here are exposed through Azure AI Foundry. Aliases are case-insensitive.",
+    "field_notes": "Matches providers/shared/model_capabilities.py.",
+    "field_descriptions": {
+      "model_name": "The model identifier e.g., 'gpt-4'",
+      "deployment": "Azure model deployment name",
+      "aliases": "Array of short names users can type instead of the full model name",
+      "context_window": "Total number of tokens the model can process (input + output combined)",
+      "max_output_tokens": "Maximum number of tokens the model can generate in a single response",
+      "supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
+      "supports_json_mode": "Whether the model can guarantee valid JSON output",
+      "supports_function_calling": "Whether the model supports function/tool calling",
+      "supports_images": "Whether the model can process images/visual input",
+      "max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
+      "supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
+      "temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
+      "description": "Human-readable description of the model",
+      "intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
+    }
+  },
+  "_example_models": [
+    {
+      "model_name": "gpt-4",
+      "deployment": "gpt-4",
+      "aliases": [
+        "gpt4"
+      ],
+      "context_window": 128000,
+      "max_output_tokens": 16384,
+      "supports_extended_thinking": false,
+      "supports_json_mode": true,
+      "supports_function_calling": false,
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
+      "supports_temperature": false,
+      "temperature_constraint": "fixed",
+      "description": "GPT-4 (128K context, 16K output)",
+      "intelligence_score": 10
+    }
+  ],
+  "models": []
+}
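The `_README` above states that aliases are case-insensitive; a lookup helper consistent with that rule could look like the following sketch (not the provider's actual resolver):

```python
def resolve_model(name, models):
    """Find a model entry by case-insensitive model_name or alias; None if absent."""
    needle = name.lower()
    for entry in models:
        if entry["model_name"].lower() == needle:
            return entry
        if any(alias.lower() == needle for alias in entry.get("aliases", [])):
            return entry
    return None
```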
conf/custom_models.json
@@ -1,383 +1,26 @@
 {
   "_README": {
-    "description": "Unified model configuration for multiple AI providers and endpoints, including OpenRouter",
+    "description": "Model metadata for local/self-hosted OpenAI-compatible endpoints (Custom provider).",
-    "providers_supported": [
-      "OpenRouter - Access to GPT-4, Claude, Mistral, etc. via unified API",
-      "Custom API endpoints - Local models (Ollama, vLLM, LM Studio, etc.)",
-      "Self-hosted APIs - Any OpenAI-compatible endpoint"
-    ],
     "documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/custom_models.md",
-    "usage": "Models can be accessed via aliases (e.g., 'opus', 'local-llama') or full names (e.g., 'anthropic/claude-opus-4', 'llama3.2')",
+    "usage": "Each entry will be advertised by the Custom provider. Aliases are case-insensitive.",
-    "instructions": [
+    "field_notes": "Matches providers/shared/model_capabilities.py.",
-      "Add new models by copying an existing entry and modifying it",
-      "Aliases are case-insensitive and should be unique across all models",
-      "context_window is the model's total context window size in tokens (input + output)",
-      "Set supports_* flags based on the model's actual capabilities",
-      "Set is_custom=true for models that should ONLY work with custom endpoints (Ollama, vLLM, etc.)",
-      "Models not listed here will use generic defaults (32K context window, basic features)",
-      "For OpenRouter models: Use official OpenRouter model names (e.g., 'anthropic/claude-opus-4')",
-      "For local/custom models: Use model names as they appear in your API (e.g., 'llama3.2', 'gpt-3.5-turbo')"
-    ],
     "field_descriptions": {
-      "model_name": "The model identifier - OpenRouter format (e.g., 'anthropic/claude-opus-4') or custom model name (e.g., 'llama3.2')",
+      "model_name": "The model identifier e.g., 'llama3.2'",
       "aliases": "Array of short names users can type instead of the full model name",
       "context_window": "Total number of tokens the model can process (input + output combined)",
       "max_output_tokens": "Maximum number of tokens the model can generate in a single response",
-      "supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
+      "supports_extended_thinking": "Whether the model supports extended reasoning tokens",
       "supports_json_mode": "Whether the model can guarantee valid JSON output",
       "supports_function_calling": "Whether the model supports function/tool calling",
       "supports_images": "Whether the model can process images/visual input",
       "max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
       "supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
       "temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
-      "is_custom": "Set to true for models that should ONLY be used with custom API endpoints (Ollama, vLLM, etc.). False or omitted for OpenRouter/cloud models.",
       "description": "Human-readable description of the model",
       "intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
-    },
-    "example_custom_model": {
-      "model_name": "my-local-model",
-      "aliases": [
-        "shortname",
-        "nickname",
-        "abbrev"
-      ],
-      "context_window": 128000,
-      "max_output_tokens": 32768,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 10.0,
-      "supports_temperature": true,
-      "temperature_constraint": "range",
-      "is_custom": true,
-      "description": "Example custom/local model for Ollama, vLLM, etc.",
-      "intelligence_score": 12
     }
   },
   "models": [
-    {
-      "model_name": "anthropic/claude-sonnet-4.5",
-      "aliases": [
-        "sonnet",
-        "sonnet4.5"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 64000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": true,
-      "max_image_size_mb": 5.0,
-      "description": "Claude Sonnet 4.5 - High-performance model with exceptional reasoning and efficiency",
-      "intelligence_score": 12
-    },
-    {
-      "model_name": "anthropic/claude-opus-4.1",
-      "aliases": [
-        "opus",
-        "claude-opus"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 64000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": true,
-      "max_image_size_mb": 5.0,
-      "description": "Claude Opus 4.1 - Our most capable and intelligent model yet",
-      "intelligence_score": 14
-    },
-    {
-      "model_name": "anthropic/claude-sonnet-4.1",
-      "aliases": [
-        "sonnet4.1"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 64000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": true,
-      "max_image_size_mb": 5.0,
-      "description": "Claude Sonnet 4.1 - Last generation high-performance model with exceptional reasoning and efficiency",
-      "intelligence_score": 10
-    },
-    {
-      "model_name": "anthropic/claude-3.5-haiku",
-      "aliases": [
-        "haiku"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 64000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": true,
-      "max_image_size_mb": 5.0,
-      "description": "Claude 3 Haiku - Fast and efficient with vision",
-      "intelligence_score": 8
-    },
-    {
-      "model_name": "google/gemini-2.5-pro",
-      "aliases": [
-        "pro",
-        "gemini-pro",
-        "gemini",
-        "pro-openrouter"
-      ],
-      "context_window": 1048576,
-      "max_output_tokens": 65536,
-      "supports_extended_thinking": true,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "description": "Google's Gemini 2.5 Pro via OpenRouter with vision",
-      "intelligence_score": 18
-    },
-    {
-      "model_name": "google/gemini-2.5-flash",
-      "aliases": [
-        "flash",
-        "gemini-flash"
-      ],
-      "context_window": 1048576,
-      "max_output_tokens": 65536,
-      "supports_extended_thinking": true,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 15.0,
-      "description": "Google's Gemini 2.5 Flash via OpenRouter with vision",
-      "intelligence_score": 10
-    },
-    {
-      "model_name": "mistralai/mistral-large-2411",
-      "aliases": [
-        "mistral-large",
-        "mistral"
-      ],
-      "context_window": 128000,
-      "max_output_tokens": 32000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "description": "Mistral's largest model (text-only)",
-      "intelligence_score": 11
-    },
-    {
-      "model_name": "meta-llama/llama-3-70b",
-      "aliases": [
-        "llama",
-        "llama3",
-        "llama3-70b",
-        "llama-70b",
-        "llama3-openrouter"
-      ],
-      "context_window": 8192,
-      "max_output_tokens": 8192,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "description": "Meta's Llama 3 70B model (text-only)",
-      "intelligence_score": 9
-    },
-    {
-      "model_name": "deepseek/deepseek-r1-0528",
-      "aliases": [
-        "deepseek-r1",
-        "deepseek",
-        "r1",
-        "deepseek-thinking"
-      ],
-      "context_window": 65536,
-      "max_output_tokens": 32768,
-      "supports_extended_thinking": true,
-      "supports_json_mode": true,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "description": "DeepSeek R1 with thinking mode - advanced reasoning capabilities (text-only)",
-      "intelligence_score": 15
-    },
-    {
-      "model_name": "perplexity/llama-3-sonar-large-32k-online",
-      "aliases": [
-        "perplexity",
-        "sonar",
-        "perplexity-online"
-      ],
-      "context_window": 32768,
-      "max_output_tokens": 32768,
-      "supports_extended_thinking": false,
-      "supports_json_mode": false,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "description": "Perplexity's online model with web search (text-only)",
-      "intelligence_score": 9
-    },
-    {
-      "model_name": "openai/o3",
-      "aliases": [
-        "o3"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 100000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": false,
-      "temperature_constraint": "fixed",
-      "description": "OpenAI's o3 model - well-rounded and powerful across domains with vision",
-      "intelligence_score": 14
-    },
-    {
-      "model_name": "openai/o3-mini",
-      "aliases": [
-        "o3-mini",
-        "o3mini"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 100000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": false,
-      "temperature_constraint": "fixed",
-      "description": "OpenAI's o3-mini model - balanced performance and speed with vision",
-      "intelligence_score": 12
-    },
-    {
-      "model_name": "openai/o3-mini-high",
-      "aliases": [
-        "o3-mini-high",
-        "o3mini-high"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 100000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": false,
-      "temperature_constraint": "fixed",
-      "description": "OpenAI's o3-mini with high reasoning effort - optimized for complex problems with vision",
-      "intelligence_score": 13
-    },
-    {
-      "model_name": "openai/o3-pro",
-      "aliases": [
-        "o3pro"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 100000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": false,
-      "temperature_constraint": "fixed",
-      "description": "OpenAI's o3-pro model - professional-grade reasoning and analysis with vision",
-      "intelligence_score": 15
-    },
-    {
-      "model_name": "openai/o4-mini",
-      "aliases": [
-        "o4-mini",
-        "o4mini"
-      ],
-      "context_window": 200000,
-      "max_output_tokens": 100000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": false,
-      "temperature_constraint": "fixed",
-      "description": "OpenAI's o4-mini model - optimized for shorter contexts with rapid reasoning and vision",
-      "intelligence_score": 11
-    },
-    {
-      "model_name": "openai/gpt-5",
-      "aliases": [
-        "gpt5"
-      ],
-      "context_window": 400000,
-      "max_output_tokens": 128000,
-      "supports_extended_thinking": true,
-      "supports_json_mode": true,
-      "supports_function_calling": true,
-      "supports_images": true,
-      "max_image_size_mb": 20.0,
-      "supports_temperature": true,
-      "temperature_constraint": "range",
-      "description": "GPT-5 (400K context, 128K output) - Advanced model with reasoning support",
-      "intelligence_score": 16
-    },
-    {
-      "model_name": "openai/gpt-5-codex",
-      "aliases": [
-        "codex",
-        "gpt5codex"
-      ],
-      "context_window": 400000,
-      "max_output_tokens": 128000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "is_custom": false,
-      "description": "GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows",
-      "intelligence_score": 17
-    },
-    {
-      "model_name": "openai/gpt-5-mini",
-      "aliases": [
-        "gpt5mini"
-      ],
-      "context_window": 400000,
-      "max_output_tokens": 128000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "supports_temperature": true,
-      "temperature_constraint": "fixed",
-      "description": "GPT-5-mini (400K context, 128K output) - Efficient variant with reasoning support",
-      "intelligence_score": 10
-    },
-    {
-      "model_name": "openai/gpt-5-nano",
-      "aliases": [
-        "gpt5nano"
-      ],
-      "context_window": 400000,
-      "max_output_tokens": 128000,
-      "supports_extended_thinking": false,
-      "supports_json_mode": true,
-      "supports_function_calling": false,
-      "supports_images": false,
-      "max_image_size_mb": 0.0,
-      "supports_temperature": true,
-      "temperature_constraint": "fixed",
-      "description": "GPT-5 nano (400K context, 128K output) - Fastest, cheapest version of GPT-5 for summarization and classification tasks",
-      "intelligence_score": 8
-    },
     {
       "model_name": "llama3.2",
       "aliases": [
@@ -391,7 +34,6 @@
       "supports_function_calling": false,
       "supports_images": false,
       "max_image_size_mb": 0.0,
-      "is_custom": true,
       "description": "Local Llama 3.2 model via custom endpoint (Ollama/vLLM) - 128K context window (text-only)",
       "intelligence_score": 6
     }
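Because the `is_custom` flag above was removed from model_capabilities.py, user-maintained custom_models.json files that still carry it should drop the key. A hedged one-off migration sketch (not something shipped with the server):

```python
import json
from pathlib import Path

def strip_is_custom(path):
    """Delete the retired is_custom flag from every model entry; return how many were removed."""
    p = Path(path)
    data = json.loads(p.read_text())
    removed = 0
    for entry in data.get("models", []):
        if "is_custom" in entry:
            del entry["is_custom"]
            removed += 1
    p.write_text(json.dumps(data, indent=2))
    return removed
```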
346  conf/openrouter_models.json Normal file
@@ -0,0 +1,346 @@
|
{
|
||||||
|
"_README": {
|
||||||
|
"description": "Model metadata for OpenRouter-backed providers.",
|
||||||
|
"documentation": "https://github.com/BeehiveInnovations/zen-mcp-server/blob/main/docs/custom_models.md",
|
||||||
|
"usage": "Models listed here are exposed through OpenRouter. Aliases are case-insensitive.",
|
||||||
|
"field_notes": "Matches providers/shared/model_capabilities.py.",
|
||||||
|
"field_descriptions": {
|
||||||
|
"model_name": "The model identifier - OpenRouter format (e.g., 'anthropic/claude-opus-4') or custom model name (e.g., 'llama3.2')",
|
||||||
|
"aliases": "Array of short names users can type instead of the full model name",
|
||||||
|
"context_window": "Total number of tokens the model can process (input + output combined)",
|
||||||
|
"max_output_tokens": "Maximum number of tokens the model can generate in a single response",
|
||||||
|
"supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
|
||||||
|
"supports_json_mode": "Whether the model can guarantee valid JSON output",
|
||||||
|
"supports_function_calling": "Whether the model supports function/tool calling",
|
||||||
|
"supports_images": "Whether the model can process images/visual input",
|
||||||
|
"max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
|
||||||
|
"supports_temperature": "Whether the model accepts temperature parameter in API calls (set to false for O3/O4 reasoning models)",
|
||||||
|
"temperature_constraint": "Type of temperature constraint: 'fixed' (fixed value), 'range' (continuous range), 'discrete' (specific values), or omit for default range",
|
||||||
|
"description": "Human-readable description of the model",
|
||||||
|
"intelligence_score": "1-20 human rating used as the primary signal for auto-mode model ordering"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"models": [
|
||||||
|
{
|
||||||
|
"model_name": "anthropic/claude-sonnet-4.5",
|
||||||
|
"aliases": [
|
||||||
|
"sonnet",
|
||||||
|
"sonnet4.5"
|
||||||
|
],
|
||||||
|
"context_window": 200000,
|
||||||
|
"max_output_tokens": 64000,
|
||||||
|
"supports_extended_thinking": false,
|
||||||
|
"supports_json_mode": false,
|
||||||
|
"supports_function_calling": false,
|
||||||
|
"supports_images": true,
|
||||||
|
"max_image_size_mb": 5.0,
|
||||||
|
"description": "Claude Sonnet 4.5 - High-performance model with exceptional reasoning and efficiency",
|
||||||
|
"intelligence_score": 12
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"model_name": "anthropic/claude-opus-4.1",
|
||||||
|
"aliases": [
|
||||||
|
"opus",
|
||||||
|
"claude-opus"
|
||||||
|
],
|
||||||
|
"context_window": 200000,
|
||||||
|
"max_output_tokens": 64000,
|
||||||
|
"supports_extended_thinking": false,
|
||||||
|
"supports_json_mode": false,
|
||||||
|
"supports_function_calling": false,
|
||||||
|
"supports_images": true,
|
||||||
|
"max_image_size_mb": 5.0,
|
||||||
|
"description": "Claude Opus 4.1 - Our most capable and intelligent model yet",
|
||||||
|
"intelligence_score": 14
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"model_name": "anthropic/claude-sonnet-4.1",
|
||||||
|
"aliases": [
|
||||||
|
"sonnet4.1"
|
||||||
|
],
|
||||||
|
"context_window": 200000,
|
||||||
|
"max_output_tokens": 64000,
|
||||||
|
"supports_extended_thinking": false,
|
||||||
|
"supports_json_mode": false,
|
||||||
|
"supports_function_calling": false,
|
||||||
|
"supports_images": true,
|
||||||
|
"max_image_size_mb": 5.0,
|
||||||
|
"description": "Claude Sonnet 4.1 - Last generation high-performance model with exceptional reasoning and efficiency",
|
||||||
|
"intelligence_score": 10
|
||||||
|
},
|
||||||
|
    {
      "model_name": "anthropic/claude-3.5-haiku",
      "aliases": [
        "haiku"
      ],
      "context_window": 200000,
      "max_output_tokens": 64000,
      "supports_extended_thinking": false,
      "supports_json_mode": false,
      "supports_function_calling": false,
      "supports_images": true,
      "max_image_size_mb": 5.0,
      "description": "Claude 3.5 Haiku - Fast and efficient with vision",
      "intelligence_score": 8
    },
    {
      "model_name": "google/gemini-2.5-pro",
      "aliases": [
        "pro",
        "gemini-pro",
        "gemini",
        "pro-openrouter"
      ],
      "context_window": 1048576,
      "max_output_tokens": 65536,
      "supports_extended_thinking": true,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "description": "Google's Gemini 2.5 Pro via OpenRouter with vision",
      "intelligence_score": 18
    },
    {
      "model_name": "google/gemini-2.5-flash",
      "aliases": [
        "flash",
        "gemini-flash"
      ],
      "context_window": 1048576,
      "max_output_tokens": 65536,
      "supports_extended_thinking": true,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 15.0,
      "description": "Google's Gemini 2.5 Flash via OpenRouter with vision",
      "intelligence_score": 10
    },
    {
      "model_name": "mistralai/mistral-large-2411",
      "aliases": [
        "mistral-large",
        "mistral"
      ],
      "context_window": 128000,
      "max_output_tokens": 32000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "description": "Mistral's largest model (text-only)",
      "intelligence_score": 11
    },
    {
      "model_name": "meta-llama/llama-3-70b",
      "aliases": [
        "llama",
        "llama3",
        "llama3-70b",
        "llama-70b",
        "llama3-openrouter"
      ],
      "context_window": 8192,
      "max_output_tokens": 8192,
      "supports_extended_thinking": false,
      "supports_json_mode": false,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "description": "Meta's Llama 3 70B model (text-only)",
      "intelligence_score": 9
    },
    {
      "model_name": "deepseek/deepseek-r1-0528",
      "aliases": [
        "deepseek-r1",
        "deepseek",
        "r1",
        "deepseek-thinking"
      ],
      "context_window": 65536,
      "max_output_tokens": 32768,
      "supports_extended_thinking": true,
      "supports_json_mode": true,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "description": "DeepSeek R1 with thinking mode - advanced reasoning capabilities (text-only)",
      "intelligence_score": 15
    },
    {
      "model_name": "perplexity/llama-3-sonar-large-32k-online",
      "aliases": [
        "perplexity",
        "sonar",
        "perplexity-online"
      ],
      "context_window": 32768,
      "max_output_tokens": 32768,
      "supports_extended_thinking": false,
      "supports_json_mode": false,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "description": "Perplexity's online model with web search (text-only)",
      "intelligence_score": 9
    },
    {
      "model_name": "openai/o3",
      "aliases": [
        "o3"
      ],
      "context_window": 200000,
      "max_output_tokens": 100000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "description": "OpenAI's o3 model - well-rounded and powerful across domains with vision",
      "intelligence_score": 14
    },
    {
      "model_name": "openai/o3-mini",
      "aliases": [
        "o3-mini",
        "o3mini"
      ],
      "context_window": 200000,
      "max_output_tokens": 100000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "description": "OpenAI's o3-mini model - balanced performance and speed with vision",
      "intelligence_score": 12
    },
    {
      "model_name": "openai/o3-mini-high",
      "aliases": [
        "o3-mini-high",
        "o3mini-high"
      ],
      "context_window": 200000,
      "max_output_tokens": 100000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "description": "OpenAI's o3-mini with high reasoning effort - optimized for complex problems with vision",
      "intelligence_score": 13
    },
    {
      "model_name": "openai/o3-pro",
      "aliases": [
        "o3pro"
      ],
      "context_window": 200000,
      "max_output_tokens": 100000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "description": "OpenAI's o3-pro model - professional-grade reasoning and analysis with vision",
      "intelligence_score": 15
    },
    {
      "model_name": "openai/o4-mini",
      "aliases": [
        "o4-mini",
        "o4mini"
      ],
      "context_window": 200000,
      "max_output_tokens": 100000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "description": "OpenAI's o4-mini model - optimized for shorter contexts with rapid reasoning and vision",
      "intelligence_score": 11
    },
    {
      "model_name": "openai/gpt-5",
      "aliases": [
        "gpt5"
      ],
      "context_window": 400000,
      "max_output_tokens": 128000,
      "supports_extended_thinking": true,
      "supports_json_mode": true,
      "supports_function_calling": true,
      "supports_images": true,
      "max_image_size_mb": 20.0,
      "supports_temperature": true,
      "temperature_constraint": "range",
      "description": "GPT-5 (400K context, 128K output) - Advanced model with reasoning support",
      "intelligence_score": 16
    },
    {
      "model_name": "openai/gpt-5-codex",
      "aliases": [
        "codex",
        "gpt5codex"
      ],
      "context_window": 400000,
      "max_output_tokens": 128000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "description": "GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows",
      "intelligence_score": 17
    },
    {
      "model_name": "openai/gpt-5-mini",
      "aliases": [
        "gpt5mini"
      ],
      "context_window": 400000,
      "max_output_tokens": 128000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "supports_temperature": true,
      "temperature_constraint": "fixed",
      "description": "GPT-5-mini (400K context, 128K output) - Efficient variant with reasoning support",
      "intelligence_score": 10
    },
    {
      "model_name": "openai/gpt-5-nano",
      "aliases": [
        "gpt5nano"
      ],
      "context_window": 400000,
      "max_output_tokens": 128000,
      "supports_extended_thinking": false,
      "supports_json_mode": true,
      "supports_function_calling": false,
      "supports_images": false,
      "max_image_size_mb": 0.0,
      "supports_temperature": true,
      "temperature_constraint": "fixed",
      "description": "GPT-5 nano (400K context, 128K output) - Fastest, cheapest version of GPT-5 for summarization and classification tasks",
      "intelligence_score": 8
    }
  ]
}
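A registry in this shape is typically consumed by folding every alias into a lookup table keyed case-insensitively. A minimal sketch of that idea follows; the inline data is an illustrative excerpt of the entries above, and `build_alias_map` is a hypothetical helper, not the server's actual loader:

```python
import json

# Illustrative excerpt of conf/openrouter_models.json; only the fields needed
# for alias resolution are shown.
REGISTRY = json.loads("""
{
  "models": [
    {"model_name": "anthropic/claude-opus-4.1", "aliases": ["opus", "claude-opus"]},
    {"model_name": "google/gemini-2.5-flash", "aliases": ["flash", "gemini-flash"]}
  ]
}
""")


def build_alias_map(registry: dict) -> dict[str, str]:
    """Map every alias (and canonical name) to its canonical model name."""
    alias_map: dict[str, str] = {}
    for entry in registry["models"]:
        canonical = entry["model_name"]
        alias_map[canonical.lower()] = canonical
        for alias in entry.get("aliases", []):
            alias_map[alias.lower()] = canonical
    return alias_map


aliases = build_alias_map(REGISTRY)
print(aliases["opus"])   # resolves the short name to the full OpenRouter id
```

Lower-casing both sides makes lookups forgiving of how users type the short names.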
@@ -9,6 +9,7 @@ Each provider:
 - Defines supported models using `ModelCapabilities` objects
 - Implements the minimal abstract hooks (`get_provider_type()` and `generate_content()`)
 - Gets wired into `configure_providers()` so environment variables control activation
+- Can leverage helper subclasses (e.g., `AzureOpenAIProvider`) when only client wiring differs
 
 ### Intelligence score cheatsheet
 
@@ -31,6 +32,13 @@ features ([details here](model_ranking.md)).
 
 ⚠️ **Important**: If you implement a custom `generate_content()`, call `_resolve_model_name()` before invoking the SDK so aliases (e.g. `"gpt"` → `"gpt-4"`) resolve correctly. The shared implementations already do this for you.
 
+**Option C: Azure OpenAI (`AzureOpenAIProvider`)**
+- For Azure-hosted deployments of OpenAI models
+- Reuses the OpenAI-compatible pipeline but swaps in the `AzureOpenAI` client and a deployment mapping (canonical model → deployment ID)
+- Define deployments in [`conf/azure_models.json`](../conf/azure_models.json) (or the file referenced by `AZURE_MODELS_CONFIG_PATH`).
+- Entries follow the [`ModelCapabilities`](../providers/shared/model_capabilities.py) schema and must include a `deployment` identifier.
+See [Azure OpenAI Configuration](azure_openai.md) for a step-by-step walkthrough.
+
 ## Step-by-Step Guide
 
 ### 1. Add Provider Type
@@ -227,6 +235,19 @@ DISABLED_TOOLS=debug,tracer
 EXAMPLE_ALLOWED_MODELS=example-model-large,example-model-small
 ```
+
+For Azure OpenAI deployments:
+
+```bash
+AZURE_OPENAI_API_KEY=your_azure_openai_key_here
+AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
+# Models are defined in conf/azure_models.json (or AZURE_MODELS_CONFIG_PATH)
+# AZURE_OPENAI_API_VERSION=2024-02-15-preview
+# AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
+# AZURE_MODELS_CONFIG_PATH=/absolute/path/to/custom_azure_models.json
+```
+
+You can also define Azure models in [`conf/azure_models.json`](../conf/azure_models.json) (the bundled file is empty so you can copy it safely). Each entry mirrors the `ModelCapabilities` schema and must include a `deployment` field. Set `AZURE_MODELS_CONFIG_PATH` if you maintain a custom copy outside the repository.
 
 **Note**: The `description` field in `ModelCapabilities` helps Claude choose the best model in auto mode.
 
 ### 5. Test Your Provider
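The deployment mapping that this configuration feeds can be sketched in a few lines; `model_specs` below stands in for the parsed manifest (illustrative names), while the two derived dicts mirror the lookups the Azure provider in this commit builds internally:

```python
# Hypothetical parsed manifest entries (illustrative deployment names).
model_specs = {
    "gpt-4o": {"deployment": "prod-gpt4o"},
    "gpt-4o-mini": {"deployment": "prod-gpt4o-mini"},
}

# Canonical model name -> Azure deployment ID actually sent on the wire.
deployment_map = {name: spec["deployment"] for name, spec in model_specs.items()}

# Case-insensitive reverse lookup so a deployment name resolves back to the
# canonical model it serves.
deployment_alias_lookup = {
    dep.lower(): canonical for canonical, dep in deployment_map.items()
}

print(deployment_map["gpt-4o"])
```

A request for `gpt-4o` is thus routed to whatever deployment name you chose in the Azure Portal, without leaking Azure-specific identifiers into the rest of the server.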
@@ -91,8 +91,8 @@ OPENAI_ALLOWED_MODELS=o3,o4-mini
 
 **Important Notes:**
 - Restrictions apply to all usage including auto mode
--`OPENROUTER_ALLOWED_MODELS` only affects OpenRouter models accessed via custom provider (where `is_custom: false` in custom_models.json)
-- Custom local models (`is_custom: true`) are not affected by any restrictions
+- `OPENROUTER_ALLOWED_MODELS` only affects models defined in `conf/openrouter_models.json`
+- Custom local models (from `conf/custom_models.json`) are not affected by OpenRouter restrictions
 
 ## Thinking Modes
 
docs/azure_openai.md (new file, 62 lines)
@@ -0,0 +1,62 @@
# Azure OpenAI Configuration

Azure OpenAI support lets Zen MCP talk to GPT-4o, GPT-4.1, GPT-5, and o-series deployments that you expose through your Azure resource. This guide describes the configuration the server expects: a couple of required environment variables plus a JSON manifest that lists every deployment you want to expose.

## 1. Required Environment Variables

Set these entries in your `.env` (or MCP `env` block).

```bash
AZURE_OPENAI_API_KEY=your_azure_openai_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
# AZURE_OPENAI_API_VERSION=2024-02-15-preview
```

Without the key and endpoint, the provider is skipped entirely. Leave the key blank only if the endpoint truly allows anonymous access (rare for Azure).

## 2. Define Deployments in `conf/azure_models.json`

Azure models live in `conf/azure_models.json` (or the file pointed to by `AZURE_MODELS_CONFIG_PATH`). Each entry follows the same schema as [`ModelCapabilities`](../providers/shared/model_capabilities.py) with one additional required key: `deployment`. This field must exactly match the deployment name shown in the Azure Portal (for example `prod-gpt4o`). The provider routes requests by that value, so omitting it or using the wrong name will cause the server to skip the model.

```json
{
  "models": [
    {
      "model_name": "gpt-4o",
      "deployment": "prod-gpt4o",
      "friendly_name": "Azure GPT-4o EU",
      "intelligence_score": 18,
      "context_window": 600000,
      "max_output_tokens": 128000,
      "supports_temperature": false,
      "temperature_constraint": "fixed",
      "aliases": ["gpt4o-eu"]
    }
  ]
}
```

Tips:

- Copy `conf/azure_models.json` into your repo and commit it, or point `AZURE_MODELS_CONFIG_PATH` at a custom path.
- Add one object per deployment. Aliases are optional but help when you want short names like `gpt4o-eu`.
- All capability fields are optional except `model_name`, `deployment`, and `friendly_name`. Anything you omit falls back to conservative defaults.
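A quick way to sanity-check a manifest before restarting the server is to enforce those three required keys. The snippet below is a standalone illustrative sketch (`validate_azure_manifest` is a hypothetical helper, not part of the shipped tooling):

```python
import json


def validate_azure_manifest(text: str) -> list[str]:
    """Return the model names in a manifest, raising if a required key is missing.

    Mirrors the documented rules: every entry needs model_name, deployment,
    and friendly_name; the other capability fields are optional.
    """
    manifest = json.loads(text)
    names = []
    for entry in manifest.get("models", []):
        for key in ("model_name", "deployment", "friendly_name"):
            if key not in entry:
                raise ValueError(
                    f"entry {entry.get('model_name', '?')} missing required key: {key}"
                )
        names.append(entry["model_name"])
    return names


sample = (
    '{"models": [{"model_name": "gpt-4o", "deployment": "prod-gpt4o",'
    ' "friendly_name": "Azure GPT-4o EU"}]}'
)
print(validate_azure_manifest(sample))
```

Running it against your copy of the manifest catches a missing `deployment` before the server silently skips the model.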
## 3. Optional Restrictions

Use `AZURE_OPENAI_ALLOWED_MODELS` to limit which Azure models Claude can access:

```bash
AZURE_OPENAI_ALLOWED_MODELS=gpt-4o,gpt-4o-mini
```

Aliases are matched case-insensitively.

## 4. Quick Checklist

- [ ] `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` are set
- [ ] `conf/azure_models.json` (or the file referenced by `AZURE_MODELS_CONFIG_PATH`) lists every deployment with the desired metadata
- [ ] Optional: `AZURE_OPENAI_ALLOWED_MODELS` to restrict usage
- [ ] Restart `./run-server.sh` and run `listmodels` to confirm the Azure entries appear with the expected metadata

See also: [`docs/adding_providers.md`](adding_providers.md) for the full provider architecture and [README (Provider Configuration)](../README.md#provider-configuration) for quick-start environment snippets.
@@ -158,6 +158,8 @@ XAI_ALLOWED_MODELS=grok,grok-3-fast
 ```env
 # Override default location of custom_models.json
 CUSTOM_MODELS_CONFIG_PATH=/path/to/your/custom_models.json
+# Override default location of openrouter_models.json
+OPENROUTER_MODELS_CONFIG_PATH=/path/to/your/openrouter_models.json
 ```
 
 **Conversation Settings:**
@@ -244,4 +246,4 @@ LOG_LEVEL=INFO
 
 - **[Advanced Usage Guide](advanced-usage.md)** - Advanced model usage patterns, thinking modes, and power user workflows
 - **[Context Revival Guide](context-revival.md)** - Conversation persistence and context revival across sessions
 - **[AI-to-AI Collaboration Guide](ai-collaboration.md)** - Multi-model coordination and conversation threading
@@ -35,7 +35,12 @@ This guide covers setting up multiple AI model providers including OpenRouter, c
 
 ## Model Aliases
 
-The server uses `conf/custom_models.json` to map convenient aliases to both OpenRouter and custom model names. This unified registry supports both cloud models (via OpenRouter) and local models (via custom endpoints).
+Zen ships two registries:
+
+- `conf/openrouter_models.json` – metadata for models routed through OpenRouter. Override with `OPENROUTER_MODELS_CONFIG_PATH` if you maintain a custom copy.
+- `conf/custom_models.json` – metadata for local or self-hosted OpenAI-compatible endpoints used by the Custom provider. Override with `CUSTOM_MODELS_CONFIG_PATH` if needed.
+
+Copy whichever file you need into your project (or point the corresponding `*_MODELS_CONFIG_PATH` env var at your own copy) and edit it to advertise the models you want.
 
 ### OpenRouter Models (Cloud)
 
@@ -58,7 +63,7 @@ The server uses `conf/custom_models.json` to map convenient aliases to both Open
 |-------|-------------------|------|
 | `local-llama`, `local` | `llama3.2` | Requires `CUSTOM_API_URL` configured |
 
-View the full list in [`conf/custom_models.json`](conf/custom_models.json).
+View the baseline OpenRouter catalogue in [`conf/openrouter_models.json`](conf/openrouter_models.json) and populate [`conf/custom_models.json`](conf/custom_models.json) with your local models.
 
 To control ordering in auto mode or the `listmodels` summary, adjust the
 [`intelligence_score`](model_ranking.md) for each entry (or rely on the automatic
@@ -152,7 +157,7 @@ CUSTOM_MODEL_NAME=your-loaded-model
 
 ## Using Models
 
-**Using model aliases (from conf/custom_models.json):**
+**Using model aliases (from the registry files):**
 ```
 # OpenRouter models:
 "Use opus for deep analysis" # → anthropic/claude-opus-4
@@ -185,20 +190,20 @@ CUSTOM_MODEL_NAME=your-loaded-model
 
 The system automatically routes models to the appropriate provider:
 
-1. **Models with `is_custom: true`** → Always routed to Custom API (requires `CUSTOM_API_URL`)
-2. **Models with `is_custom: false` or omitted** → Routed to OpenRouter (requires `OPENROUTER_API_KEY`)
+1. Entries in `conf/custom_models.json` → Always routed through the Custom API (requires `CUSTOM_API_URL`)
+2. Entries in `conf/openrouter_models.json` → Routed through OpenRouter (requires `OPENROUTER_API_KEY`)
 3. **Unknown models** → Fallback logic based on model name patterns
 
 **Provider Priority Order:**
 1. Native APIs (Google, OpenAI) - if API keys are available
-2. Custom endpoints - for models marked with `is_custom: true`
+2. Custom endpoints - for models declared in `conf/custom_models.json`
 3. OpenRouter - catch-all for cloud models
 
 This ensures clean separation between local and cloud models while maintaining flexibility for unknown models.
 
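That routing order reduces to a simple lookup over the two registries. The sketch below is an illustrative reduction, not the server's actual code, and the inline model sets are assumptions:

```python
# Model names loaded from the two registry files (illustrative entries).
custom_models = {"llama3.2"}                        # conf/custom_models.json
openrouter_models = {"anthropic/claude-opus-4.1"}   # conf/openrouter_models.json


def route(model: str) -> str:
    """Pick a provider for a model name, mirroring the documented order."""
    if model in custom_models:
        return "custom"       # requires CUSTOM_API_URL
    if model in openrouter_models:
        return "openrouter"   # requires OPENROUTER_API_KEY
    return "fallback"         # name-pattern heuristics for unknown models


print(route("llama3.2"))   # → custom
```

Because each registry file feeds exactly one provider, a model can never be claimed by both the Custom API and OpenRouter.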
 ## Model Configuration
 
-The server uses `conf/custom_models.json` to define model aliases and capabilities. You can:
+These JSON files define model aliases and capabilities. You can:
 
 1. **Use the default configuration** - Includes popular models with convenient aliases
 2. **Customize the configuration** - Add your own models and aliases
@@ -206,7 +211,7 @@ The server uses `conf/custom_models.json` to define model aliases and capabiliti
 
 ### Adding Custom Models
 
-Edit `conf/custom_models.json` to add new models. The configuration supports both OpenRouter (cloud) and custom endpoint (local) models.
+Edit `conf/openrouter_models.json` to tweak OpenRouter behaviour or `conf/custom_models.json` to add local models. Each entry maps directly onto [`ModelCapabilities`](../providers/shared/model_capabilities.py).
 
 #### Adding an OpenRouter Model
 
@@ -232,7 +237,6 @@ Edit `conf/custom_models.json` to add new models. The configuration supports bot
 "supports_extended_thinking": false,
 "supports_json_mode": false,
 "supports_function_calling": false,
-"is_custom": true,
 "description": "My custom Ollama/vLLM model"
 }
 ```
@@ -244,10 +248,9 @@ Edit `conf/custom_models.json` to add new models. The configuration supports bot
 - `supports_extended_thinking`: Whether the model has extended reasoning capabilities
 - `supports_json_mode`: Whether the model can guarantee valid JSON output
 - `supports_function_calling`: Whether the model supports function/tool calling
-- `is_custom`: **Set to `true` for models that should ONLY work with custom endpoints** (Ollama, vLLM, etc.)
 - `description`: Human-readable description of the model
 
-**Important:** Always set `is_custom: true` for local models. This ensures they're only used when `CUSTOM_API_URL` is configured and prevents conflicts with OpenRouter.
+**Important:** Keep OpenRouter and Custom models in their respective files so that requests are routed correctly.
 
 ## Available Models
 
@@ -4,6 +4,7 @@
 |----------|-------------|
 | [Getting Started](getting-started.md) | Installation paths, prerequisite setup, and first-run guidance. |
 | [Adding Providers](adding_providers.md) | How to register new AI providers and advertise capabilities. |
+| [Azure OpenAI](azure_openai.md) | Configure Azure deployments, capability overrides, and env mappings. |
 | [Model Ranking](model_ranking.md) | How intelligence scores translate into auto-mode ordering. |
 | [Custom Models](custom_models.md) | Configure OpenRouter/custom models and aliases. |
 | [Adding Tools](adding_tools.md) | Create new tools using the shared base classes. |
@@ -25,7 +25,7 @@ feature_bonus = (
     + (1 if supports_json_mode else 0)
     + (1 if supports_images else 0)
 )
-penalty = 1 if is_custom else 0
+penalty = 1 if provider == CUSTOM else 0
 
 effective_rank = clamp(base + ctx_bonus + output_bonus + feature_bonus - penalty, 0, 100)
 ```
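Transcribed into runnable form (with an assumed boolean standing in for the `provider == CUSTOM` check):

```python
def clamp(value: int, lo: int, hi: int) -> int:
    """Constrain value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def effective_rank(base: int, ctx_bonus: int, output_bonus: int,
                   feature_bonus: int, is_custom_provider: bool) -> int:
    # Direct transcription of the formula above; the one-point penalty applies
    # only to models served by the Custom provider.
    penalty = 1 if is_custom_provider else 0
    return clamp(base + ctx_bonus + output_bonus + feature_bonus - penalty, 0, 100)


print(effective_rank(50, 3, 2, 2, True))   # → 56
```

The clamp keeps the result inside the 0-100 band even when a high base score picks up every bonus.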
@@ -1,5 +1,6 @@
 """Model provider abstractions for supporting multiple AI providers."""
 
+from .azure_openai import AzureOpenAIProvider
 from .base import ModelProvider
 from .gemini import GeminiModelProvider
 from .openai_compatible import OpenAICompatibleProvider
@@ -13,6 +14,7 @@ __all__ = [
     "ModelResponse",
     "ModelCapabilities",
     "ModelProviderRegistry",
+    "AzureOpenAIProvider",
     "GeminiModelProvider",
     "OpenAIModelProvider",
     "OpenAICompatibleProvider",
providers/azure_openai.py (new file, 342 lines)
@@ -0,0 +1,342 @@
"""Azure OpenAI provider built on the OpenAI-compatible implementation."""

from __future__ import annotations

import logging
from dataclasses import asdict, replace

try:  # pragma: no cover - optional dependency
    from openai import AzureOpenAI
except ImportError:  # pragma: no cover
    AzureOpenAI = None  # type: ignore[assignment]

from utils.env import get_env, suppress_env_vars

from .azure_registry import AzureModelRegistry
from .openai_compatible import OpenAICompatibleProvider
from .openai_provider import OpenAIModelProvider
from .shared import ModelCapabilities, ModelResponse, ProviderType, TemperatureConstraint

logger = logging.getLogger(__name__)


class AzureOpenAIProvider(OpenAICompatibleProvider):
    """Thin Azure wrapper that reuses the OpenAI-compatible request pipeline."""

    FRIENDLY_NAME = "Azure OpenAI"
    DEFAULT_API_VERSION = "2024-02-15-preview"

    # The OpenAI-compatible base expects subclasses to expose capabilities via
    # ``get_all_model_capabilities``. Azure deployments are user-defined, so we
    # build the catalogue dynamically from environment configuration instead of
    # relying on a static ``MODEL_CAPABILITIES`` map.
    MODEL_CAPABILITIES: dict[str, ModelCapabilities] = {}

    def __init__(
        self,
        api_key: str,
        *,
        azure_endpoint: str | None = None,
        api_version: str | None = None,
        deployments: dict[str, object] | None = None,
        **kwargs,
    ) -> None:
        # Let the OpenAI-compatible base handle shared configuration such as
        # timeouts, restriction-aware allowlists, and logging. ``base_url`` maps
        # directly onto Azure's endpoint URL.
        super().__init__(api_key, base_url=azure_endpoint, **kwargs)

        if not azure_endpoint:
            azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
        if not azure_endpoint:
            raise ValueError("Azure OpenAI endpoint is required via parameter or AZURE_OPENAI_ENDPOINT")

        self.azure_endpoint = azure_endpoint.rstrip("/")
        self.api_version = api_version or get_env("AZURE_OPENAI_API_VERSION", self.DEFAULT_API_VERSION)

        registry_specs = self._load_registry_entries()
        override_specs = self._normalise_deployments(deployments or {}) if deployments else {}

        self._model_specs = self._merge_specs(registry_specs, override_specs)
        if not self._model_specs:
            raise ValueError(
                "Azure OpenAI provider requires at least one configured deployment. "
                "Populate conf/azure_models.json or set AZURE_MODELS_CONFIG_PATH."
            )

        self._capabilities = self._build_capabilities_map()
        self._deployment_map = {name: spec["deployment"] for name, spec in self._model_specs.items()}
        self._deployment_alias_lookup = {
            deployment.lower(): canonical for canonical, deployment in self._deployment_map.items()
        }
        self._canonical_lookup = {name.lower(): name for name in self._model_specs.keys()}
        self._invalidate_capability_cache()

    # ------------------------------------------------------------------
    # Capability helpers
    # ------------------------------------------------------------------
    def get_all_model_capabilities(self) -> dict[str, ModelCapabilities]:
|
||||||
|
return dict(self._capabilities)
|
||||||
|
|
||||||
|
def get_provider_type(self) -> ProviderType:
|
||||||
|
return ProviderType.AZURE
|
||||||
|
|
||||||
|
def get_capabilities(self, model_name: str) -> ModelCapabilities: # type: ignore[override]
|
||||||
|
lowered = model_name.lower()
|
||||||
|
if lowered in self._deployment_alias_lookup:
|
||||||
|
canonical = self._deployment_alias_lookup[lowered]
|
||||||
|
return super().get_capabilities(canonical)
|
||||||
|
canonical = self._canonical_lookup.get(lowered)
|
||||||
|
if canonical:
|
||||||
|
return super().get_capabilities(canonical)
|
||||||
|
return super().get_capabilities(model_name)
|
||||||
|
|
||||||
|
def validate_model_name(self, model_name: str) -> bool: # type: ignore[override]
|
||||||
|
lowered = model_name.lower()
|
||||||
|
if lowered in self._deployment_alias_lookup or lowered in self._canonical_lookup:
|
||||||
|
return True
|
||||||
|
return super().validate_model_name(model_name)
|
||||||
|
|
||||||
|
def _build_capabilities_map(self) -> dict[str, ModelCapabilities]:
|
||||||
|
capabilities: dict[str, ModelCapabilities] = {}
|
||||||
|
|
||||||
|
for canonical_name, spec in self._model_specs.items():
|
||||||
|
template_capability: ModelCapabilities | None = spec.get("capability")
|
||||||
|
overrides = spec.get("overrides", {})
|
||||||
|
|
||||||
|
if template_capability:
|
||||||
|
cloned = replace(template_capability)
|
||||||
|
else:
|
||||||
|
template = OpenAIModelProvider.MODEL_CAPABILITIES.get(canonical_name)
|
||||||
|
|
||||||
|
if template:
|
||||||
|
friendly = template.friendly_name.replace("OpenAI", "Azure OpenAI", 1)
|
||||||
|
cloned = replace(
|
||||||
|
template,
|
||||||
|
provider=ProviderType.AZURE,
|
||||||
|
friendly_name=friendly,
|
||||||
|
aliases=list(template.aliases),
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
deployment_name = spec.get("deployment", "")
|
||||||
|
cloned = ModelCapabilities(
|
||||||
|
provider=ProviderType.AZURE,
|
||||||
|
model_name=canonical_name,
|
||||||
|
friendly_name=f"Azure OpenAI ({canonical_name})",
|
||||||
|
description=f"Azure deployment '{deployment_name}' for {canonical_name}",
|
||||||
|
aliases=[],
|
||||||
|
)
|
||||||
|
|
||||||
|
if overrides:
|
||||||
|
overrides = dict(overrides)
|
||||||
|
temp_override = overrides.get("temperature_constraint")
|
||||||
|
if isinstance(temp_override, str):
|
||||||
|
overrides["temperature_constraint"] = TemperatureConstraint.create(temp_override)
|
||||||
|
|
||||||
|
aliases_override = overrides.get("aliases")
|
||||||
|
if isinstance(aliases_override, str):
|
||||||
|
overrides["aliases"] = [alias.strip() for alias in aliases_override.split(",") if alias.strip()]
|
||||||
|
provider_override = overrides.get("provider")
|
||||||
|
if provider_override:
|
||||||
|
overrides.pop("provider", None)
|
||||||
|
|
||||||
|
try:
|
||||||
|
cloned = replace(cloned, **overrides)
|
||||||
|
except TypeError:
|
||||||
|
base_data = asdict(cloned)
|
||||||
|
base_data.update(overrides)
|
||||||
|
base_data["provider"] = ProviderType.AZURE
|
||||||
|
temp_value = base_data.get("temperature_constraint")
|
||||||
|
if isinstance(temp_value, str):
|
||||||
|
base_data["temperature_constraint"] = TemperatureConstraint.create(temp_value)
|
||||||
|
cloned = ModelCapabilities(**base_data)
|
||||||
|
|
||||||
|
if cloned.provider != ProviderType.AZURE:
|
||||||
|
cloned.provider = ProviderType.AZURE
|
||||||
|
|
||||||
|
capabilities[canonical_name] = cloned
|
||||||
|
|
||||||
|
return capabilities
|
||||||
|
|
||||||
|
def _load_registry_entries(self) -> dict[str, dict]:
|
||||||
|
try:
|
||||||
|
registry = AzureModelRegistry()
|
||||||
|
except Exception as exc: # pragma: no cover - registry failure should not crash provider
|
||||||
|
logger.warning("Unable to load Azure model registry: %s", exc)
|
||||||
|
return {}
|
||||||
|
|
||||||
|
entries: dict[str, dict] = {}
|
||||||
|
for model_name, capability, extra in registry.iter_entries():
|
||||||
|
deployment = extra.get("deployment")
|
||||||
|
if not deployment:
|
||||||
|
logger.warning("Azure model '%s' missing deployment in registry", model_name)
|
||||||
|
continue
|
||||||
|
entries[model_name] = {"deployment": deployment, "capability": capability}
|
||||||
|
|
||||||
|
return entries
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _merge_specs(
|
||||||
|
registry_specs: dict[str, dict],
|
||||||
|
override_specs: dict[str, dict],
|
||||||
|
) -> dict[str, dict]:
|
||||||
|
specs: dict[str, dict] = {}
|
||||||
|
|
||||||
|
for canonical, entry in registry_specs.items():
|
||||||
|
specs[canonical] = {
|
||||||
|
"deployment": entry.get("deployment"),
|
||||||
|
"capability": entry.get("capability"),
|
||||||
|
"overrides": {},
|
||||||
|
}
|
||||||
|
|
||||||
|
for canonical, entry in override_specs.items():
|
||||||
|
spec = specs.get(canonical, {"deployment": None, "capability": None, "overrides": {}})
|
||||||
|
deployment = entry.get("deployment")
|
||||||
|
if deployment:
|
||||||
|
spec["deployment"] = deployment
|
||||||
|
overrides = {k: v for k, v in entry.items() if k not in {"deployment"}}
|
||||||
|
overrides.pop("capability", None)
|
||||||
|
if overrides:
|
||||||
|
spec["overrides"].update(overrides)
|
||||||
|
specs[canonical] = spec
|
||||||
|
|
||||||
|
return {k: v for k, v in specs.items() if v.get("deployment")}
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _normalise_deployments(mapping: dict[str, object]) -> dict[str, dict]:
|
||||||
|
normalised: dict[str, dict] = {}
|
||||||
|
for canonical, spec in mapping.items():
|
||||||
|
canonical_name = (canonical or "").strip()
|
||||||
|
if not canonical_name:
|
||||||
|
continue
|
||||||
|
|
||||||
|
deployment_name: str | None = None
|
||||||
|
overrides: dict[str, object] = {}
|
||||||
|
|
||||||
|
if isinstance(spec, str):
|
||||||
|
deployment_name = spec.strip()
|
||||||
|
elif isinstance(spec, dict):
|
||||||
|
deployment_name = spec.get("deployment") or spec.get("deployment_name")
|
||||||
|
overrides = {k: v for k, v in spec.items() if k not in {"deployment", "deployment_name"}}
|
||||||
|
|
||||||
|
if not deployment_name:
|
||||||
|
continue
|
||||||
|
|
||||||
|
normalised[canonical_name] = {"deployment": deployment_name.strip(), **overrides}
|
||||||
|
|
||||||
|
return normalised
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Azure-specific configuration
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
@property
|
||||||
|
def client(self): # type: ignore[override]
|
||||||
|
"""Instantiate the Azure OpenAI client on first use."""
|
||||||
|
|
||||||
|
if self._client is None:
|
||||||
|
if AzureOpenAI is None:
|
||||||
|
raise ImportError(
|
||||||
|
"Azure OpenAI support requires the 'openai' package. Install it with `pip install openai`."
|
||||||
|
)
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
|
||||||
|
proxy_env_vars = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "http_proxy", "https_proxy", "all_proxy"]
|
||||||
|
|
||||||
|
with suppress_env_vars(*proxy_env_vars):
|
||||||
|
try:
|
||||||
|
timeout_config = self.timeout_config
|
||||||
|
|
||||||
|
http_client = httpx.Client(timeout=timeout_config, follow_redirects=True)
|
||||||
|
|
||||||
|
client_kwargs = {
|
||||||
|
"api_key": self.api_key,
|
||||||
|
"azure_endpoint": self.azure_endpoint,
|
||||||
|
"api_version": self.api_version,
|
||||||
|
"http_client": http_client,
|
||||||
|
}
|
||||||
|
|
||||||
|
if self.DEFAULT_HEADERS:
|
||||||
|
client_kwargs["default_headers"] = self.DEFAULT_HEADERS.copy()
|
||||||
|
|
||||||
|
logger.debug(
|
||||||
|
"Initializing Azure OpenAI client endpoint=%s api_version=%s timeouts=%s",
|
||||||
|
self.azure_endpoint,
|
||||||
|
self.api_version,
|
||||||
|
timeout_config,
|
||||||
|
)
|
||||||
|
|
||||||
|
self._client = AzureOpenAI(**client_kwargs)
|
||||||
|
|
||||||
|
except Exception as exc:
|
||||||
|
logger.error("Failed to create Azure OpenAI client: %s", exc)
|
||||||
|
raise
|
||||||
|
|
||||||
|
return self._client
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Request delegation
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
def generate_content(
|
||||||
|
self,
|
||||||
|
prompt: str,
|
||||||
|
model_name: str,
|
||||||
|
system_prompt: str | None = None,
|
||||||
|
temperature: float = 0.3,
|
||||||
|
max_output_tokens: int | None = None,
|
||||||
|
images: list[str] | None = None,
|
||||||
|
**kwargs,
|
||||||
|
) -> ModelResponse:
|
||||||
|
canonical_name, deployment_name = self._resolve_canonical_and_deployment(model_name)
|
||||||
|
|
||||||
|
# Delegate to the shared OpenAI-compatible implementation using the
|
||||||
|
# deployment name – Azure requires the deployment identifier in the
|
||||||
|
# ``model`` field. The returned ``ModelResponse`` is normalised so
|
||||||
|
# downstream consumers continue to see the canonical model name.
|
||||||
|
raw_response = super().generate_content(
|
||||||
|
prompt=prompt,
|
||||||
|
model_name=deployment_name,
|
||||||
|
system_prompt=system_prompt,
|
||||||
|
temperature=temperature,
|
||||||
|
max_output_tokens=max_output_tokens,
|
||||||
|
images=images,
|
||||||
|
**kwargs,
|
||||||
|
)
|
||||||
|
|
||||||
|
capabilities = self._capabilities.get(canonical_name)
|
||||||
|
friendly_name = capabilities.friendly_name if capabilities else self.FRIENDLY_NAME
|
||||||
|
|
||||||
|
return ModelResponse(
|
||||||
|
content=raw_response.content,
|
||||||
|
usage=raw_response.usage,
|
||||||
|
model_name=canonical_name,
|
||||||
|
friendly_name=friendly_name,
|
||||||
|
provider=ProviderType.AZURE,
|
||||||
|
metadata={**raw_response.metadata, "deployment": deployment_name},
|
||||||
|
)
|
||||||
|
|
||||||
|
def _resolve_canonical_and_deployment(self, model_name: str) -> tuple[str, str]:
|
||||||
|
resolved_canonical = self._resolve_model_name(model_name)
|
||||||
|
|
||||||
|
if resolved_canonical not in self._deployment_map:
|
||||||
|
# The base resolver may hand back the deployment alias. Try to map it
|
||||||
|
# back to a canonical entry.
|
||||||
|
for canonical, deployment in self._deployment_map.items():
|
||||||
|
if deployment.lower() == resolved_canonical.lower():
|
||||||
|
return canonical, deployment
|
||||||
|
raise ValueError(f"Model '{model_name}' is not configured for Azure OpenAI")
|
||||||
|
|
||||||
|
return resolved_canonical, self._deployment_map[resolved_canonical]
|
||||||
|
|
||||||
|
def _parse_allowed_models(self) -> set[str] | None: # type: ignore[override]
|
||||||
|
# Support both AZURE_ALLOWED_MODELS (inherited behaviour) and the
|
||||||
|
# clearer AZURE_OPENAI_ALLOWED_MODELS alias.
|
||||||
|
explicit = get_env("AZURE_OPENAI_ALLOWED_MODELS")
|
||||||
|
if explicit:
|
||||||
|
models = {m.strip().lower() for m in explicit.split(",") if m.strip()}
|
||||||
|
if models:
|
||||||
|
logger.info("Configured allowed models for Azure OpenAI: %s", sorted(models))
|
||||||
|
self._allowed_alias_cache = {}
|
||||||
|
return models
|
||||||
|
|
||||||
|
return super()._parse_allowed_models()
|
||||||
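The canonical-name / deployment-name round-trip performed by `_resolve_canonical_and_deployment` above can be sketched standalone. This is an illustrative reduction over a plain dict, not the provider class itself, and the model/deployment names are hypothetical:

```python
# Sketch of the lookup: resolve a requested name to (canonical model,
# Azure deployment), accepting either the canonical model name or the
# deployment alias, case-insensitively.
def resolve_canonical_and_deployment(model_name: str, deployment_map: dict[str, str]) -> tuple[str, str]:
    lowered = model_name.lower()
    # Direct canonical-name match (case-insensitive)
    for canonical, deployment in deployment_map.items():
        if canonical.lower() == lowered:
            return canonical, deployment
    # Fall back to matching the deployment alias itself
    for canonical, deployment in deployment_map.items():
        if deployment.lower() == lowered:
            return canonical, deployment
    raise ValueError(f"Model '{model_name}' is not configured for Azure OpenAI")

deployments = {"gpt-4o": "prod-gpt4o"}
print(resolve_canonical_and_deployment("GPT-4O", deployments))      # ('gpt-4o', 'prod-gpt4o')
print(resolve_canonical_and_deployment("prod-gpt4o", deployments))  # ('gpt-4o', 'prod-gpt4o')
```

Either spelling maps back to the same pair, which is why downstream consumers only ever see the canonical model name.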
45 providers/azure_registry.py Normal file
@@ -0,0 +1,45 @@
"""Registry loader for Azure OpenAI model configurations."""

from __future__ import annotations

import logging

from .model_registry_base import CAPABILITY_FIELD_NAMES, CustomModelRegistryBase
from .shared import ModelCapabilities, ProviderType, TemperatureConstraint

logger = logging.getLogger(__name__)


class AzureModelRegistry(CustomModelRegistryBase):
    """Load Azure-specific model metadata from configuration files."""

    def __init__(self, config_path: str | None = None) -> None:
        super().__init__(
            env_var_name="AZURE_MODELS_CONFIG_PATH",
            default_filename="azure_models.json",
            config_path=config_path,
        )
        self.reload()

    def _extra_keys(self) -> set[str]:
        return {"deployment", "deployment_name"}

    def _provider_default(self) -> ProviderType:
        return ProviderType.AZURE

    def _default_friendly_name(self, model_name: str) -> str:
        return f"Azure OpenAI ({model_name})"

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        deployment = entry.pop("deployment", None) or entry.pop("deployment_name", None)
        if not deployment:
            raise ValueError(f"Azure model '{entry.get('model_name')}' is missing required 'deployment' field")

        temp_hint = entry.get("temperature_constraint")
        if isinstance(temp_hint, str):
            entry["temperature_constraint"] = TemperatureConstraint.create(temp_hint)

        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
        filtered.setdefault("provider", ProviderType.AZURE)
        capability = ModelCapabilities(**filtered)
        return capability, {"deployment": deployment}
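Going by `_finalise_entry`, a `conf/azure_models.json` entry needs at least `model_name` and `deployment` (or `deployment_name`) under a top-level `models` array; the other fields here are illustrative capability overrides, and the concrete model/deployment names are made up, not a complete or authoritative schema:

```json
{
  "models": [
    {
      "model_name": "gpt-4o",
      "deployment": "prod-gpt4o",
      "aliases": ["4o"],
      "temperature_constraint": "range"
    }
  ]
}
```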
@@ -1,10 +1,10 @@
 """Custom API provider implementation."""

 import logging
-from typing import Optional

 from utils.env import get_env

+from .custom_registry import CustomEndpointModelRegistry
 from .openai_compatible import OpenAICompatibleProvider
 from .openrouter_registry import OpenRouterModelRegistry
 from .shared import ModelCapabilities, ProviderType
@@ -31,8 +31,8 @@ class CustomProvider(OpenAICompatibleProvider):

     FRIENDLY_NAME = "Custom API"

-    # Model registry for managing configurations and aliases (shared with OpenRouter)
-    _registry: Optional[OpenRouterModelRegistry] = None
+    # Model registry for managing configurations and aliases
+    _registry: CustomEndpointModelRegistry | None = None

     def __init__(self, api_key: str = "", base_url: str = "", **kwargs):
         """Initialize Custom provider for local/self-hosted models.
@@ -78,9 +78,9 @@ class CustomProvider(OpenAICompatibleProvider):

         super().__init__(api_key, base_url=base_url, **kwargs)

-        # Initialize model registry (shared with OpenRouter for consistent aliases)
+        # Initialize model registry
         if CustomProvider._registry is None:
-            CustomProvider._registry = OpenRouterModelRegistry()
+            CustomProvider._registry = CustomEndpointModelRegistry()
             # Log loaded models and aliases only on first load
             models = self._registry.list_models()
             aliases = self._registry.list_aliases()
@@ -92,8 +92,8 @@ class CustomProvider(OpenAICompatibleProvider):
     def _lookup_capabilities(
         self,
         canonical_name: str,
-        requested_name: Optional[str] = None,
-    ) -> Optional[ModelCapabilities]:
+        requested_name: str | None = None,
+    ) -> ModelCapabilities | None:
         """Return capabilities for models explicitly marked as custom."""

         builtin = super()._lookup_capabilities(canonical_name, requested_name)
@@ -101,12 +101,12 @@ class CustomProvider(OpenAICompatibleProvider):
             return builtin

         registry_entry = self._registry.resolve(canonical_name)
-        if registry_entry and getattr(registry_entry, "is_custom", False):
+        if registry_entry:
             registry_entry.provider = ProviderType.CUSTOM
             return registry_entry

         logging.debug(
-            "Custom provider cannot resolve model '%s'; ensure it is declared with 'is_custom': true in custom_models.json",
+            "Custom provider cannot resolve model '%s'; ensure it is declared in custom_models.json",
             canonical_name,
         )
         return None
@@ -151,6 +151,15 @@ class CustomProvider(OpenAICompatibleProvider):
             return base_model

         logging.debug(f"Model '{model_name}' not found in registry, using as-is")
+        # Attempt to resolve via OpenRouter registry so aliases still map cleanly
+        openrouter_registry = OpenRouterModelRegistry()
+        openrouter_config = openrouter_registry.resolve(model_name)
+        if openrouter_config:
+            resolved = openrouter_config.model_name
+            self._alias_cache[cache_key] = resolved
+            self._alias_cache.setdefault(resolved.lower(), resolved)
+            return resolved
+
         self._alias_cache[cache_key] = model_name
         return model_name

@@ -160,9 +169,9 @@ class CustomProvider(OpenAICompatibleProvider):
         if not self._registry:
             return {}

-        capabilities: dict[str, ModelCapabilities] = {}
-        for model_name in self._registry.list_models():
-            config = self._registry.resolve(model_name)
-            if config and getattr(config, "is_custom", False):
-                capabilities[model_name] = config
+        capabilities = {}
+        for model in self._registry.list_models():
+            config = self._registry.resolve(model)
+            if config:
+                capabilities[model] = config
         return capabilities
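The alias fallback added to `_resolve_model_name` above (custom registry first, then the OpenRouter registry, then the name as-is) can be reduced to a standalone sketch. The registries are modelled here as plain lowercase-keyed dicts and the model names are hypothetical, not real registry contents:

```python
# Sketch of the two-stage alias resolution: prefer the custom registry,
# fall back to the OpenRouter registry, and finally use the name unchanged.
def resolve_model_name(name: str, custom: dict[str, str], openrouter: dict[str, str]) -> str:
    key = name.lower()
    if key in custom:
        return custom[key]
    if key in openrouter:
        return openrouter[key]
    return name

openrouter_aliases = {"opus": "anthropic/claude-opus"}
print(resolve_model_name("OPUS", {}, openrouter_aliases))   # anthropic/claude-opus
print(resolve_model_name("my-local-model", {}, {}))         # my-local-model
```

This ordering keeps locally declared models authoritative while still letting shared OpenRouter aliases map cleanly.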
26 providers/custom_registry.py Normal file
@@ -0,0 +1,26 @@
"""Registry for models exposed via custom (local) OpenAI-compatible endpoints."""

from __future__ import annotations

from .model_registry_base import CAPABILITY_FIELD_NAMES, CapabilityModelRegistry
from .shared import ModelCapabilities, ProviderType


class CustomEndpointModelRegistry(CapabilityModelRegistry):
    def __init__(self, config_path: str | None = None) -> None:
        super().__init__(
            env_var_name="CUSTOM_MODELS_CONFIG_PATH",
            default_filename="custom_models.json",
            provider=ProviderType.CUSTOM,
            friendly_prefix="Custom ({model})",
            config_path=config_path,
        )
        self.reload()

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        entry["provider"] = ProviderType.CUSTOM
        entry.setdefault("friendly_name", f"Custom ({entry['model_name']})")
        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
        filtered.setdefault("provider", ProviderType.CUSTOM)
        capability = ModelCapabilities(**filtered)
        return capability, {}
241 providers/model_registry_base.py Normal file
@@ -0,0 +1,241 @@
|
|||||||
|
"""Shared infrastructure for JSON-backed model registries."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import importlib.resources
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
from collections.abc import Iterable
|
||||||
|
from dataclasses import fields
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from utils.env import get_env
|
||||||
|
from utils.file_utils import read_json_file
|
||||||
|
|
||||||
|
from .shared import ModelCapabilities, ProviderType, TemperatureConstraint
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
CAPABILITY_FIELD_NAMES = {field.name for field in fields(ModelCapabilities)}
|
||||||
|
|
||||||
|
|
||||||
|
class CustomModelRegistryBase:
|
||||||
|
"""Load and expose capability metadata from a JSON manifest."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
env_var_name: str,
|
||||||
|
default_filename: str,
|
||||||
|
config_path: str | None = None,
|
||||||
|
) -> None:
|
||||||
|
self._env_var_name = env_var_name
|
||||||
|
self._default_filename = default_filename
|
||||||
|
self._use_resources = False
|
||||||
|
self._resource_package = "conf"
|
||||||
|
self._default_path = Path(__file__).parent.parent / "conf" / default_filename
|
||||||
|
|
||||||
|
if config_path:
|
||||||
|
self.config_path = Path(config_path)
|
||||||
|
else:
|
||||||
|
env_path = get_env(env_var_name)
|
||||||
|
if env_path:
|
||||||
|
self.config_path = Path(env_path)
|
||||||
|
else:
|
||||||
|
try:
|
||||||
|
resource = importlib.resources.files(self._resource_package).joinpath(default_filename)
|
||||||
|
if hasattr(resource, "read_text"):
|
||||||
|
self._use_resources = True
|
||||||
|
self.config_path = None
|
||||||
|
else:
|
||||||
|
raise AttributeError("resource accessor not available")
|
||||||
|
except Exception:
|
||||||
|
self.config_path = Path(__file__).parent.parent / "conf" / default_filename
|
||||||
|
|
||||||
|
self.alias_map: dict[str, str] = {}
|
||||||
|
self.model_map: dict[str, ModelCapabilities] = {}
|
||||||
|
self._extras: dict[str, dict] = {}
|
||||||
|
|
||||||
|
def reload(self) -> None:
|
||||||
|
data = self._load_config_data()
|
||||||
|
configs = [config for config in self._parse_models(data) if config is not None]
|
||||||
|
self._build_maps(configs)
|
||||||
|
|
||||||
|
def list_models(self) -> list[str]:
|
||||||
|
return list(self.model_map.keys())
|
||||||
|
|
||||||
|
def list_aliases(self) -> list[str]:
|
||||||
|
return list(self.alias_map.keys())
|
||||||
|
|
||||||
|
def resolve(self, name_or_alias: str) -> ModelCapabilities | None:
|
||||||
|
key = name_or_alias.lower()
|
||||||
|
canonical = self.alias_map.get(key)
|
||||||
|
if canonical:
|
||||||
|
return self.model_map.get(canonical)
|
||||||
|
|
||||||
|
for model_name in self.model_map:
|
||||||
|
if model_name.lower() == key:
|
||||||
|
return self.model_map[model_name]
|
||||||
|
return None
|
||||||
|
|
||||||
|
def get_capabilities(self, name_or_alias: str) -> ModelCapabilities | None:
|
||||||
|
return self.resolve(name_or_alias)
|
||||||
|
|
||||||
|
def get_entry(self, model_name: str) -> dict | None:
|
||||||
|
return self._extras.get(model_name)
|
||||||
|
|
||||||
|
def iter_entries(self) -> Iterable[tuple[str, ModelCapabilities, dict]]:
|
||||||
|
for model_name, capability in self.model_map.items():
|
||||||
|
yield model_name, capability, self._extras.get(model_name, {})
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Internal helpers
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
def _load_config_data(self) -> dict:
|
||||||
|
if self._use_resources:
|
||||||
|
try:
|
||||||
|
resource = importlib.resources.files(self._resource_package).joinpath(self._default_filename)
|
||||||
|
if hasattr(resource, "read_text"):
|
||||||
|
config_text = resource.read_text(encoding="utf-8")
|
||||||
|
else: # pragma: no cover - legacy Python fallback
|
||||||
|
with resource.open("r", encoding="utf-8") as handle:
|
||||||
|
config_text = handle.read()
|
||||||
|
data = json.loads(config_text)
|
||||||
|
except FileNotFoundError:
|
||||||
|
logger.debug("Packaged %s not found", self._default_filename)
|
||||||
|
return {"models": []}
|
||||||
|
except Exception as exc:
|
||||||
|
logger.warning("Failed to read packaged %s: %s", self._default_filename, exc)
|
||||||
|
return {"models": []}
|
||||||
|
return data or {"models": []}
|
||||||
|
|
||||||
|
if not self.config_path:
|
||||||
|
raise FileNotFoundError("Registry configuration path is not set")
|
||||||
|
|
||||||
|
if not self.config_path.exists():
|
||||||
|
logger.debug("Model registry config not found at %s", self.config_path)
|
||||||
|
if self.config_path == self._default_path:
|
||||||
|
fallback = Path.cwd() / "conf" / self._default_filename
|
||||||
|
if fallback != self.config_path and fallback.exists():
|
||||||
|
logger.debug("Falling back to %s", fallback)
|
||||||
|
self.config_path = fallback
|
||||||
|
else:
|
||||||
|
return {"models": []}
|
||||||
|
else:
|
||||||
|
return {"models": []}
|
||||||
|
|
||||||
|
data = read_json_file(str(self.config_path))
|
||||||
|
return data or {"models": []}
|
||||||
|
|
||||||
|
@property
|
||||||
|
def use_resources(self) -> bool:
|
||||||
|
return self._use_resources
|
||||||
|
|
||||||
|
def _parse_models(self, data: dict) -> Iterable[ModelCapabilities | None]:
|
||||||
|
for raw in data.get("models", []):
|
||||||
|
if not isinstance(raw, dict):
|
||||||
|
continue
|
||||||
|
yield self._convert_entry(raw)
|
||||||
|
|
||||||
|
def _convert_entry(self, raw: dict) -> ModelCapabilities | None:
|
||||||
|
entry = dict(raw)
|
||||||
|
model_name = entry.get("model_name")
|
||||||
|
if not model_name:
|
||||||
|
return None
|
||||||
|
|
||||||
|
aliases = entry.get("aliases")
|
||||||
|
if isinstance(aliases, str):
|
||||||
|
entry["aliases"] = [alias.strip() for alias in aliases.split(",") if alias.strip()]
|
||||||
|
|
||||||
|
entry.setdefault("friendly_name", self._default_friendly_name(model_name))
|
||||||
|
|
||||||
|
temperature_hint = entry.get("temperature_constraint")
|
||||||
|
if isinstance(temperature_hint, str):
|
||||||
|
entry["temperature_constraint"] = TemperatureConstraint.create(temperature_hint)
|
||||||
|
elif temperature_hint is None:
|
||||||
|
entry["temperature_constraint"] = TemperatureConstraint.create("range")
|
||||||
|
|
||||||
|
if "max_tokens" in entry:
|
||||||
|
raise ValueError(
|
||||||
|
"`max_tokens` is no longer supported. Use `max_output_tokens` in your model configuration."
|
||||||
|
)
|
||||||
|
|
||||||
|
unknown_keys = set(entry.keys()) - CAPABILITY_FIELD_NAMES - self._extra_keys()
|
||||||
|
if unknown_keys:
|
||||||
|
raise ValueError("Unsupported fields in model configuration: " + ", ".join(sorted(unknown_keys)))
|
||||||
|
|
||||||
|
capability, extras = self._finalise_entry(entry)
|
||||||
|
capability.provider = self._provider_default()
|
||||||
|
self._extras[capability.model_name] = extras or {}
|
||||||
|
return capability
|
||||||
|
|
||||||
|
def _default_friendly_name(self, model_name: str) -> str:
|
||||||
|
return model_name
|
||||||
|
|
||||||
|
def _extra_keys(self) -> set[str]:
|
||||||
|
return set()
|
||||||
|
|
||||||
|
def _provider_default(self) -> ProviderType:
|
||||||
|
return ProviderType.OPENROUTER
|
||||||
|
|
||||||
|
def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
|
||||||
|
return ModelCapabilities(**{k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}), {}
|
||||||
|
|
||||||
|
def _build_maps(self, configs: Iterable[ModelCapabilities]) -> None:
|
||||||
|
        alias_map: dict[str, str] = {}
        model_map: dict[str, ModelCapabilities] = {}

        for config in configs:
            if not config:
                continue
            model_map[config.model_name] = config

            model_name_lower = config.model_name.lower()
            if model_name_lower not in alias_map:
                alias_map[model_name_lower] = config.model_name

            for alias in config.aliases:
                alias_lower = alias.lower()
                if alias_lower in alias_map and alias_map[alias_lower] != config.model_name:
                    raise ValueError(
                        f"Duplicate alias '{alias}' found for models '{alias_map[alias_lower]}' and '{config.model_name}'"
                    )
                alias_map[alias_lower] = config.model_name

        self.alias_map = alias_map
        self.model_map = model_map


class CapabilityModelRegistry(CustomModelRegistryBase):
    """Registry that returns `ModelCapabilities` objects with alias support."""

    def __init__(
        self,
        *,
        env_var_name: str,
        default_filename: str,
        provider: ProviderType,
        friendly_prefix: str,
        config_path: str | None = None,
    ) -> None:
        self._provider = provider
        self._friendly_prefix = friendly_prefix
        super().__init__(
            env_var_name=env_var_name,
            default_filename=default_filename,
            config_path=config_path,
        )
        self.reload()

    def _provider_default(self) -> ProviderType:
        return self._provider

    def _default_friendly_name(self, model_name: str) -> str:
        return self._friendly_prefix.format(model=model_name)

    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
        filtered.setdefault("provider", self._provider_default())
        capability = ModelCapabilities(**filtered)
        return capability, {}
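The duplicate-alias guard above is easiest to see in isolation. Below is a minimal, self-contained sketch of the same map-building logic; `FakeCapabilities` is a stand-in for the real `ModelCapabilities` dataclass, keeping only the two fields the map builder touches.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCapabilities:
    # Stand-in for ModelCapabilities; only the fields the map builder reads.
    model_name: str
    aliases: list[str] = field(default_factory=list)


def build_maps(configs):
    """Mirror the registry logic: case-insensitive alias -> canonical name."""
    alias_map: dict[str, str] = {}
    model_map: dict[str, FakeCapabilities] = {}
    for config in configs:
        if not config:
            continue
        model_map[config.model_name] = config
        # The model name itself becomes a lowercase alias.
        model_name_lower = config.model_name.lower()
        if model_name_lower not in alias_map:
            alias_map[model_name_lower] = config.model_name
        for alias in config.aliases:
            alias_lower = alias.lower()
            # Two different models claiming the same alias is a config error.
            if alias_lower in alias_map and alias_map[alias_lower] != config.model_name:
                raise ValueError(
                    f"Duplicate alias '{alias}' found for models "
                    f"'{alias_map[alias_lower]}' and '{config.model_name}'"
                )
            alias_map[alias_lower] = config.model_name
    return alias_map, model_map


alias_map, model_map = build_maps(
    [FakeCapabilities("GPT-4.1", aliases=["gpt4.1"]), FakeCapabilities("o3-mini", aliases=["o3mini"])]
)
assert alias_map["gpt4.1"] == "GPT-4.1"   # explicit alias resolves
assert alias_map["gpt-4.1"] == "GPT-4.1"  # model name resolves case-insensitively
```

The same-model exemption in the `if` means a model may harmlessly list its own name among its aliases, while a collision across two models fails fast at load time.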
@@ -8,7 +8,7 @@ from urllib.parse import urlparse
 
 from openai import OpenAI
 
-from utils.env import get_env
+from utils.env import get_env, suppress_env_vars
 from utils.image_utils import validate_image
 
 from .base import ModelProvider
@@ -257,80 +257,74 @@ class OpenAICompatibleProvider(ModelProvider):
     def client(self):
         """Lazy initialization of OpenAI client with security checks and timeout configuration."""
         if self._client is None:
-            import os
-
             import httpx
 
-            # Temporarily disable proxy environment variables to prevent httpx from detecting them
-            original_env = {}
             proxy_env_vars = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "http_proxy", "https_proxy", "all_proxy"]
 
-            for var in proxy_env_vars:
-                if var in os.environ:
-                    original_env[var] = os.environ[var]
-                    del os.environ[var]
-
-            try:
-                # Create a custom httpx client that explicitly avoids proxy parameters
-                timeout_config = (
-                    self.timeout_config
-                    if hasattr(self, "timeout_config") and self.timeout_config
-                    else httpx.Timeout(30.0)
-                )
-
-                # Create httpx client with minimal config to avoid proxy conflicts
-                # Note: proxies parameter was removed in httpx 0.28.0
-                # Check for test transport injection
-                if hasattr(self, "_test_transport"):
-                    # Use custom transport for testing (HTTP recording/replay)
-                    http_client = httpx.Client(
-                        transport=self._test_transport,
-                        timeout=timeout_config,
-                        follow_redirects=True,
-                    )
-                else:
-                    # Normal production client
-                    http_client = httpx.Client(
-                        timeout=timeout_config,
-                        follow_redirects=True,
-                    )
-
-                # Keep client initialization minimal to avoid proxy parameter conflicts
-                client_kwargs = {
-                    "api_key": self.api_key,
-                    "http_client": http_client,
-                }
-
-                if self.base_url:
-                    client_kwargs["base_url"] = self.base_url
-
-                if self.organization:
-                    client_kwargs["organization"] = self.organization
-
-                # Add default headers if any
-                if self.DEFAULT_HEADERS:
-                    client_kwargs["default_headers"] = self.DEFAULT_HEADERS.copy()
-
-                logging.debug(f"OpenAI client initialized with custom httpx client and timeout: {timeout_config}")
-
-                # Create OpenAI client with custom httpx client
-                self._client = OpenAI(**client_kwargs)
-
-            except Exception as e:
-                # If all else fails, try absolute minimal client without custom httpx
-                logging.warning(f"Failed to create client with custom httpx, falling back to minimal config: {e}")
-                try:
-                    minimal_kwargs = {"api_key": self.api_key}
-                    if self.base_url:
-                        minimal_kwargs["base_url"] = self.base_url
-                    self._client = OpenAI(**minimal_kwargs)
-                except Exception as fallback_error:
-                    logging.error(f"Even minimal OpenAI client creation failed: {fallback_error}")
-                    raise
-                finally:
-                    # Restore original proxy environment variables
-                    for var, value in original_env.items():
-                        os.environ[var] = value
+            with suppress_env_vars(*proxy_env_vars):
+                try:
+                    # Create a custom httpx client that explicitly avoids proxy parameters
+                    timeout_config = (
+                        self.timeout_config
+                        if hasattr(self, "timeout_config") and self.timeout_config
+                        else httpx.Timeout(30.0)
+                    )
+
+                    # Create httpx client with minimal config to avoid proxy conflicts
+                    # Note: proxies parameter was removed in httpx 0.28.0
+                    # Check for test transport injection
+                    if hasattr(self, "_test_transport"):
+                        # Use custom transport for testing (HTTP recording/replay)
+                        http_client = httpx.Client(
+                            transport=self._test_transport,
+                            timeout=timeout_config,
+                            follow_redirects=True,
+                        )
+                    else:
+                        # Normal production client
+                        http_client = httpx.Client(
+                            timeout=timeout_config,
+                            follow_redirects=True,
+                        )
+
+                    # Keep client initialization minimal to avoid proxy parameter conflicts
+                    client_kwargs = {
+                        "api_key": self.api_key,
+                        "http_client": http_client,
+                    }
+
+                    if self.base_url:
+                        client_kwargs["base_url"] = self.base_url
+
+                    if self.organization:
+                        client_kwargs["organization"] = self.organization
+
+                    # Add default headers if any
+                    if self.DEFAULT_HEADERS:
+                        client_kwargs["default_headers"] = self.DEFAULT_HEADERS.copy()
+
+                    logging.debug(
+                        "OpenAI client initialized with custom httpx client and timeout: %s",
+                        timeout_config,
+                    )
+
+                    # Create OpenAI client with custom httpx client
+                    self._client = OpenAI(**client_kwargs)
+
+                except Exception as e:
+                    # If all else fails, try absolute minimal client without custom httpx
+                    logging.warning(
+                        "Failed to create client with custom httpx, falling back to minimal config: %s",
+                        e,
+                    )
+                    try:
+                        minimal_kwargs = {"api_key": self.api_key}
+                        if self.base_url:
+                            minimal_kwargs["base_url"] = self.base_url
+                        self._client = OpenAI(**minimal_kwargs)
+                    except Exception as fallback_error:
+                        logging.error("Even minimal OpenAI client creation failed: %s", fallback_error)
+                        raise
 
         return self._client
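The refactor above replaces the hand-rolled save/delete/restore of proxy variables with a `with suppress_env_vars(...)` block imported from `utils.env`. A plausible minimal sketch of such a context manager is shown below; the actual implementation in `utils/env.py` may differ in details.

```python
import os
from contextlib import contextmanager


@contextmanager
def suppress_env_vars(*names):
    """Temporarily remove the named environment variables, restoring them on exit."""
    saved = {}
    try:
        for name in names:
            if name in os.environ:
                # pop() both removes the variable and remembers its value
                saved[name] = os.environ.pop(name)
        yield
    finally:
        # Restore even if the body raised, mirroring the old try/finally.
        os.environ.update(saved)


os.environ["HTTP_PROXY"] = "http://proxy.local:3128"
with suppress_env_vars("HTTP_PROXY", "HTTPS_PROXY"):
    assert "HTTP_PROXY" not in os.environ  # hidden inside the block
assert os.environ["HTTP_PROXY"] == "http://proxy.local:3128"  # restored afterwards
```

Centralizing this in a context manager removes the bug surface of the original code, where an early `return` or a missed `finally` could leave the caller's environment permanently mutated.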
@@ -103,16 +103,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             model_name="o3-mini",
             friendly_name="OpenAI (O3-mini)",
             intelligence_score=12,
-            context_window=200_000,  # 200K tokens
-            max_output_tokens=65536,  # 64K max output tokens
+            context_window=200_000,
+            max_output_tokens=65536,
             supports_extended_thinking=False,
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
             supports_json_mode=True,
-            supports_images=True,  # O3 models support vision
-            max_image_size_mb=20.0,  # 20MB per OpenAI docs
-            supports_temperature=False,  # O3 models don't accept temperature parameter
+            supports_images=True,
+            max_image_size_mb=20.0,
+            supports_temperature=False,
             temperature_constraint=TemperatureConstraint.create("fixed"),
             description="Fast O3 variant (200K context) - Balanced performance/speed, moderate complexity",
             aliases=["o3mini"],
@@ -122,16 +122,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             model_name="o3-pro",
             friendly_name="OpenAI (O3-Pro)",
             intelligence_score=15,
-            context_window=200_000,  # 200K tokens
-            max_output_tokens=65536,  # 64K max output tokens
+            context_window=200_000,
+            max_output_tokens=65536,
             supports_extended_thinking=False,
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
             supports_json_mode=True,
-            supports_images=True,  # O3 models support vision
-            max_image_size_mb=20.0,  # 20MB per OpenAI docs
-            supports_temperature=False,  # O3 models don't accept temperature parameter
+            supports_images=True,
+            max_image_size_mb=20.0,
+            supports_temperature=False,
             temperature_constraint=TemperatureConstraint.create("fixed"),
             description="Professional-grade reasoning (200K context) - EXTREMELY EXPENSIVE: Only for the most complex problems requiring universe-scale complexity analysis OR when the user explicitly asks for this model. Use sparingly for critical architectural decisions or exceptionally complex debugging that other models cannot handle.",
             aliases=["o3pro"],
@@ -141,16 +141,15 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             model_name="o4-mini",
             friendly_name="OpenAI (O4-mini)",
             intelligence_score=11,
-            context_window=200_000,  # 200K tokens
-            max_output_tokens=65536,  # 64K max output tokens
+            context_window=200_000,
             supports_extended_thinking=False,
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
             supports_json_mode=True,
-            supports_images=True,  # O4 models support vision
-            max_image_size_mb=20.0,  # 20MB per OpenAI docs
-            supports_temperature=False,  # O4 models don't accept temperature parameter
+            supports_images=True,
+            max_image_size_mb=20.0,
+            supports_temperature=False,
             temperature_constraint=TemperatureConstraint.create("fixed"),
             description="Latest reasoning model (200K context) - Optimized for shorter contexts, rapid reasoning",
             aliases=["o4mini"],
@@ -160,16 +159,16 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             model_name="gpt-4.1",
             friendly_name="OpenAI (GPT 4.1)",
             intelligence_score=13,
-            context_window=1_000_000,  # 1M tokens
+            context_window=1_000_000,
             max_output_tokens=32_768,
             supports_extended_thinking=False,
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
             supports_json_mode=True,
-            supports_images=True,  # GPT-4.1 supports vision
-            max_image_size_mb=20.0,  # 20MB per OpenAI docs
-            supports_temperature=True,  # Regular models accept temperature parameter
+            supports_images=True,
+            max_image_size_mb=20.0,
+            supports_temperature=True,
             temperature_constraint=TemperatureConstraint.create("range"),
             description="GPT-4.1 (1M context) - Advanced reasoning model with large context window",
             aliases=["gpt4.1"],
@@ -178,19 +177,19 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             provider=ProviderType.OPENAI,
             model_name="gpt-5-codex",
             friendly_name="OpenAI (GPT-5 Codex)",
-            intelligence_score=17,  # Higher than GPT-5 for coding tasks
-            context_window=400_000,  # 400K tokens (same as GPT-5)
-            max_output_tokens=128_000,  # 128K output tokens
-            supports_extended_thinking=True,  # Responses API supports reasoning tokens
+            intelligence_score=17,
+            context_window=400_000,
+            max_output_tokens=128_000,
+            supports_extended_thinking=True,
             supports_system_prompts=True,
             supports_streaming=True,
-            supports_function_calling=True,  # Enhanced for agentic software engineering
+            supports_function_calling=True,
             supports_json_mode=True,
-            supports_images=True,  # Screenshots, wireframes, diagrams
-            max_image_size_mb=20.0,  # 20MB per OpenAI docs
+            supports_images=True,
+            max_image_size_mb=20.0,
             supports_temperature=True,
             temperature_constraint=TemperatureConstraint.create("range"),
-            description="GPT-5 Codex (400K context) - Uses Responses API for 40-80% cost savings. Specialized for coding, refactoring, and software architecture. 3% better performance on SWE-bench.",
+            description="GPT-5 Codex (400K context) - Specialized for coding, refactoring, and software architecture.",
             aliases=["gpt5-codex", "codex", "gpt-5-code", "gpt5-code"],
         ),
     }
@@ -282,7 +281,7 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
 
         if category == ToolModelCategory.EXTENDED_REASONING:
             # Prefer models with extended thinking support
-            # GPT-5-Codex first for coding tasks (uses Responses API with 40-80% cost savings)
+            # GPT-5-Codex first for coding tasks
             preferred = find_first(["gpt-5-codex", "o3", "o3-pro", "gpt-5"])
             return preferred if preferred else allowed_models[0]
@@ -1,7 +1,6 @@
 """OpenRouter provider implementation."""
 
 import logging
-from typing import Optional
 
 from utils.env import get_env
 
@@ -42,7 +41,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
     }
 
     # Model registry for managing configurations and aliases
-    _registry: Optional[OpenRouterModelRegistry] = None
+    _registry: OpenRouterModelRegistry | None = None
 
     def __init__(self, api_key: str, **kwargs):
         """Initialize OpenRouter provider.
@@ -70,8 +69,8 @@ class OpenRouterProvider(OpenAICompatibleProvider):
     def _lookup_capabilities(
         self,
         canonical_name: str,
-        requested_name: Optional[str] = None,
-    ) -> Optional[ModelCapabilities]:
+        requested_name: str | None = None,
+    ) -> ModelCapabilities | None:
         """Fetch OpenRouter capabilities from the registry or build a generic fallback."""
 
         capabilities = self._registry.get_capabilities(canonical_name)
@@ -143,7 +142,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
             # Custom models belong to CustomProvider; skip them here so the two
             # providers don't race over the same registrations (important for tests
             # that stub the registry with minimal objects lacking attrs).
-            if hasattr(config, "is_custom") and config.is_custom is True:
+            if config.provider == ProviderType.CUSTOM:
                 continue
 
             if restriction_service:
@@ -211,7 +210,7 @@ class OpenRouterProvider(OpenAICompatibleProvider):
                 continue
 
             # See note in list_models: respect the CustomProvider boundary.
-            if hasattr(config, "is_custom") and config.is_custom is True:
+            if config.provider == ProviderType.CUSTOM:
                 continue
 
             capabilities[model_name] = config
@@ -1,293 +1,38 @@
 """OpenRouter model registry for managing model configurations and aliases."""
 
-import importlib.resources
-import logging
-from pathlib import Path
-from typing import Optional
-
-from utils.env import get_env
-
-# Import handled via importlib.resources.files() calls directly
-from utils.file_utils import read_json_file
-
-from .shared import (
-    ModelCapabilities,
-    ProviderType,
-    TemperatureConstraint,
-)
-
-
-class OpenRouterModelRegistry:
-    """In-memory view of OpenRouter and custom model metadata.
-
-    Role
-        Parse the packaged ``conf/custom_models.json`` (or user-specified
-        overrides), construct alias and capability maps, and serve those
-        structures to providers that rely on OpenRouter semantics (both the
-        OpenRouter provider itself and the Custom provider).
-
-    Key duties
-        * Load :class:`ModelCapabilities` definitions from configuration files
-        * Maintain a case-insensitive alias → canonical name map for fast
-          resolution
-        * Provide helpers to list models, list aliases, and resolve an arbitrary
-          name to its capability object without repeatedly touching the file
-          system.
-    """
-
-    def __init__(self, config_path: Optional[str] = None):
-        """Initialize the registry.
-
-        Args:
-            config_path: Path to config file. If None, uses default locations.
-        """
-        self.alias_map: dict[str, str] = {}  # alias -> model_name
-        self.model_map: dict[str, ModelCapabilities] = {}  # model_name -> config
-
-        # Determine config path and loading strategy
-        self.use_resources = False
-        if config_path:
-            # Direct config_path parameter
-            self.config_path = Path(config_path)
-        else:
-            # Check environment variable first
-            env_path = get_env("CUSTOM_MODELS_CONFIG_PATH")
-            if env_path:
-                # Environment variable path
-                self.config_path = Path(env_path)
-            else:
-                # Try importlib.resources for robust packaging support
-                self.config_path = None
-                self.use_resources = False
-
-                try:
-                    resource_traversable = importlib.resources.files("conf").joinpath("custom_models.json")
-                    if hasattr(resource_traversable, "read_text"):
-                        self.use_resources = True
-                    else:
-                        raise AttributeError("read_text not available")
-                except Exception:
-                    pass
-
-                if not self.use_resources:
-                    # Fallback to file system paths
-                    potential_paths = [
-                        Path(__file__).parent.parent / "conf" / "custom_models.json",
-                        Path.cwd() / "conf" / "custom_models.json",
-                    ]
-
-                    for path in potential_paths:
-                        if path.exists():
-                            self.config_path = path
-                            break
-
-                    if self.config_path is None:
-                        self.config_path = potential_paths[0]
-
-        # Load configuration
-        self.reload()
-
-    def reload(self) -> None:
-        """Reload configuration from disk."""
-        try:
-            configs = self._read_config()
-            self._build_maps(configs)
-            caller_info = ""
-            try:
-                import inspect
-
-                caller_frame = inspect.currentframe().f_back
-                if caller_frame:
-                    caller_name = caller_frame.f_code.co_name
-                    caller_file = (
-                        caller_frame.f_code.co_filename.split("/")[-1] if caller_frame.f_code.co_filename else "unknown"
-                    )
-                    # Look for tool context
-                    while caller_frame:
-                        frame_locals = caller_frame.f_locals
-                        if "self" in frame_locals and hasattr(frame_locals["self"], "get_name"):
-                            tool_name = frame_locals["self"].get_name()
-                            caller_info = f" (called from {tool_name} tool)"
-                            break
-                        caller_frame = caller_frame.f_back
-                    if not caller_info:
-                        caller_info = f" (called from {caller_name} in {caller_file})"
-            except Exception:
-                # If frame inspection fails, just continue without caller info
-                pass
-
-            logging.debug(
-                f"Loaded {len(self.model_map)} OpenRouter models with {len(self.alias_map)} aliases{caller_info}"
-            )
-        except ValueError as e:
-            # Re-raise ValueError only for duplicate aliases (critical config errors)
-            logging.error(f"Failed to load OpenRouter model configuration: {e}")
-            # Initialize with empty maps on failure
-            self.alias_map = {}
-            self.model_map = {}
-            if "Duplicate alias" in str(e):
-                raise
-        except Exception as e:
-            logging.error(f"Failed to load OpenRouter model configuration: {e}")
-            # Initialize with empty maps on failure
-            self.alias_map = {}
-            self.model_map = {}
-
-    def _read_config(self) -> list[ModelCapabilities]:
-        """Read configuration from file or package resources.
-
-        Returns:
-            List of model configurations
-        """
-        try:
-            if self.use_resources:
-                # Use importlib.resources for packaged environments
-                try:
-                    resource_path = importlib.resources.files("conf").joinpath("custom_models.json")
-                    if hasattr(resource_path, "read_text"):
-                        # Python 3.9+
-                        config_text = resource_path.read_text(encoding="utf-8")
-                    else:
-                        # Python 3.8 fallback
-                        with resource_path.open("r", encoding="utf-8") as f:
-                            config_text = f.read()
-
-                    import json
-
-                    data = json.loads(config_text)
-                    logging.debug("Loaded OpenRouter config from package resources")
-                except Exception as e:
-                    logging.warning(f"Failed to load config from resources: {e}")
-                    return []
-            else:
-                # Use file path loading
-                if not self.config_path.exists():
-                    logging.warning(f"OpenRouter model config not found at {self.config_path}")
-                    return []
-
-                # Use centralized JSON reading utility
-                data = read_json_file(str(self.config_path))
-                logging.debug(f"Loaded OpenRouter config from file: {self.config_path}")
-
-            if data is None:
-                location = "resources" if self.use_resources else str(self.config_path)
-                raise ValueError(f"Could not read or parse JSON from {location}")
-
-            # Parse models
-            configs = []
-            for model_data in data.get("models", []):
-                # Create ModelCapabilities directly from JSON data
-                # Handle temperature_constraint conversion
-                temp_constraint_str = model_data.get("temperature_constraint")
-                temp_constraint = TemperatureConstraint.create(temp_constraint_str or "range")
-
-                # Set provider-specific defaults based on is_custom flag
-                is_custom = model_data.get("is_custom", False)
-                if is_custom:
-                    model_data.setdefault("provider", ProviderType.CUSTOM)
-                    model_data.setdefault("friendly_name", f"Custom ({model_data.get('model_name', 'Unknown')})")
-                else:
-                    model_data.setdefault("provider", ProviderType.OPENROUTER)
-                    model_data.setdefault("friendly_name", f"OpenRouter ({model_data.get('model_name', 'Unknown')})")
-                model_data["temperature_constraint"] = temp_constraint
-
-                # Remove the string version of temperature_constraint before creating ModelCapabilities
-                if "temperature_constraint" in model_data and isinstance(model_data["temperature_constraint"], str):
-                    del model_data["temperature_constraint"]
-                model_data["temperature_constraint"] = temp_constraint
-
-                config = ModelCapabilities(**model_data)
-                configs.append(config)
-
-            return configs
-        except ValueError:
-            # Re-raise ValueError for specific config errors
-            raise
-        except Exception as e:
-            location = "resources" if self.use_resources else str(self.config_path)
-            raise ValueError(f"Error reading config from {location}: {e}")
-
-    def _build_maps(self, configs: list[ModelCapabilities]) -> None:
-        """Build alias and model maps from configurations.
-
-        Args:
-            configs: List of model configurations
-        """
-        alias_map = {}
-        model_map = {}
-
-        for config in configs:
-            # Add to model map
-            model_map[config.model_name] = config
-
-            # Add the model_name itself as an alias for case-insensitive lookup
-            # But only if it's not already in the aliases list
-            model_name_lower = config.model_name.lower()
-            aliases_lower = [alias.lower() for alias in config.aliases]
-
-            if model_name_lower not in aliases_lower:
-                if model_name_lower in alias_map:
-                    existing_model = alias_map[model_name_lower]
-                    if existing_model != config.model_name:
-                        raise ValueError(
-                            f"Duplicate model name '{config.model_name}' (case-insensitive) found for models "
-                            f"'{existing_model}' and '{config.model_name}'"
-                        )
-                else:
-                    alias_map[model_name_lower] = config.model_name
-
-            # Add aliases
-            for alias in config.aliases:
-                alias_lower = alias.lower()
-                if alias_lower in alias_map:
-                    existing_model = alias_map[alias_lower]
-                    raise ValueError(
-                        f"Duplicate alias '{alias}' found for models '{existing_model}' and '{config.model_name}'"
-                    )
-                alias_map[alias_lower] = config.model_name
-
-        # Atomic update
-        self.alias_map = alias_map
-        self.model_map = model_map
-
-    def resolve(self, name_or_alias: str) -> Optional[ModelCapabilities]:
-        """Resolve a model name or alias to configuration.
-
-        Args:
-            name_or_alias: Model name or alias to resolve
-
-        Returns:
-            Model configuration if found, None otherwise
-        """
-        # Try alias lookup (case-insensitive) - this now includes model names too
-        alias_lower = name_or_alias.lower()
-        if alias_lower in self.alias_map:
-            model_name = self.alias_map[alias_lower]
-            return self.model_map.get(model_name)
-
-        return None
-
-    def get_capabilities(self, name_or_alias: str) -> Optional[ModelCapabilities]:
-        """Get model capabilities for a name or alias.
-
-        Args:
-            name_or_alias: Model name or alias
-
-        Returns:
-            ModelCapabilities if found, None otherwise
-        """
-        # Registry now returns ModelCapabilities directly
-        return self.resolve(name_or_alias)
-
-    def get_model_config(self, name_or_alias: str) -> Optional[ModelCapabilities]:
-        """Backward-compatible wrapper used by providers and older tests."""
-
-        return self.resolve(name_or_alias)
-
-    def list_models(self) -> list[str]:
-        """List all available model names."""
-        return list(self.model_map.keys())
-
-    def list_aliases(self) -> list[str]:
-        """List all available aliases."""
-        return list(self.alias_map.keys())
+from __future__ import annotations
+
+from .model_registry_base import CAPABILITY_FIELD_NAMES, CapabilityModelRegistry
+from .shared import ModelCapabilities, ProviderType
+
+
+class OpenRouterModelRegistry(CapabilityModelRegistry):
+    """Capability registry backed by `conf/openrouter_models.json`."""
+
+    def __init__(self, config_path: str | None = None) -> None:
+        super().__init__(
+            env_var_name="OPENROUTER_MODELS_CONFIG_PATH",
+            default_filename="openrouter_models.json",
+            provider=ProviderType.OPENROUTER,
+            friendly_prefix="OpenRouter ({model})",
+            config_path=config_path,
+        )
+
+    def _finalise_entry(self, entry: dict) -> tuple[ModelCapabilities, dict]:
+        provider_override = entry.get("provider")
+        if isinstance(provider_override, str):
+            entry_provider = ProviderType(provider_override.lower())
+        elif isinstance(provider_override, ProviderType):
+            entry_provider = provider_override
+        else:
+            entry_provider = ProviderType.OPENROUTER
+
+        if entry_provider == ProviderType.CUSTOM:
+            entry.setdefault("friendly_name", f"Custom ({entry['model_name']})")
+        else:
+            entry.setdefault("friendly_name", f"OpenRouter ({entry['model_name']})")
+
+        filtered = {k: v for k, v in entry.items() if k in CAPABILITY_FIELD_NAMES}
+        filtered.setdefault("provider", entry_provider)
+        capability = ModelCapabilities(**filtered)
+        return capability, {}
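With `is_custom` removed, routing between OpenRouter and the Custom provider now hinges on each config entry's `provider` value, as `_finalise_entry` shows. A standalone sketch of that resolution rule, using a stand-in enum rather than the real `providers.shared.ProviderType`:

```python
from enum import Enum


class ProviderType(Enum):
    # Local stand-in for providers.shared.ProviderType.
    OPENROUTER = "openrouter"
    CUSTOM = "custom"


def resolve_entry_provider(entry: dict) -> ProviderType:
    """Mirror _finalise_entry's handling: accept a string, an enum, or nothing."""
    provider_override = entry.get("provider")
    if isinstance(provider_override, str):
        # JSON configs carry strings; normalize case before the enum lookup.
        return ProviderType(provider_override.lower())
    if isinstance(provider_override, ProviderType):
        return provider_override
    # No override: the registry's own provider wins.
    return ProviderType.OPENROUTER


assert resolve_entry_provider({"provider": "CUSTOM"}) is ProviderType.CUSTOM
assert resolve_entry_provider({}) is ProviderType.OPENROUTER
```

This is what lets `OpenRouterProvider.list_models` skip entries with `config.provider == ProviderType.CUSTOM` without any `hasattr` probing.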
@@ -38,6 +38,7 @@ class ModelProviderRegistry:
     PROVIDER_PRIORITY_ORDER = [
         ProviderType.GOOGLE,  # Direct Gemini access
         ProviderType.OPENAI,  # Direct OpenAI access
+        ProviderType.AZURE,  # Azure-hosted OpenAI deployments
         ProviderType.XAI,  # Direct X.AI GROK access
         ProviderType.DIAL,  # DIAL unified API access
         ProviderType.CUSTOM,  # Local/self-hosted models
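For context on how a priority list like `PROVIDER_PRIORITY_ORDER` is typically consumed (a sketch, not this project's actual selection code; the helper name and string provider names are illustrative assumptions): the first provider in the ordering that is actually registered wins.

```python
from typing import Optional


def pick_provider(priority: list[str], registered: set[str]) -> Optional[str]:
    # Walk the priority order and return the first provider that is registered.
    # Hypothetical helper: the real registry iterates ProviderType members.
    for provider in priority:
        if provider in registered:
            return provider
    return None


# Direct APIs outrank aggregators when both are configured:
pick_provider(["google", "openai", "azure", "openrouter"], {"openrouter", "azure"})  # "azure"
```

This is why inserting AZURE between the direct OpenAI entry and the aggregators matters: an Azure deployment is preferred over routing the same model through OpenRouter.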
@@ -123,6 +124,21 @@ class ModelProviderRegistry:
                     provider_kwargs["base_url"] = gemini_base_url
                     logging.info(f"Initialized Gemini provider with custom endpoint: {gemini_base_url}")
                 provider = provider_class(**provider_kwargs)
+            elif provider_type == ProviderType.AZURE:
+                if not api_key:
+                    return None
+
+                azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
+                if not azure_endpoint:
+                    logging.warning("AZURE_OPENAI_ENDPOINT missing – skipping Azure OpenAI provider")
+                    return None
+
+                azure_version = get_env("AZURE_OPENAI_API_VERSION")
+                provider = provider_class(
+                    api_key=api_key,
+                    azure_endpoint=azure_endpoint,
+                    api_version=azure_version,
+                )
             else:
                 if not api_key:
                     return None
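The Azure branch above only constructs a provider when both a key and an endpoint are present, while the API version stays optional. A minimal standalone sketch of that guard (the function name and dict-based env are assumptions for illustration, not the registry's API):

```python
from typing import Optional


def azure_client_kwargs(env: dict) -> Optional[dict]:
    # Mirror the guard above: a missing key or endpoint means the provider is skipped.
    api_key = env.get("AZURE_OPENAI_API_KEY")
    endpoint = env.get("AZURE_OPENAI_ENDPOINT")
    if not api_key or not endpoint:
        return None
    return {
        "api_key": api_key,
        "azure_endpoint": endpoint,
        "api_version": env.get("AZURE_OPENAI_API_VERSION"),  # optional, may stay None
    }
```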
@@ -318,6 +334,7 @@ class ModelProviderRegistry:
         key_mapping = {
             ProviderType.GOOGLE: "GEMINI_API_KEY",
             ProviderType.OPENAI: "OPENAI_API_KEY",
+            ProviderType.AZURE: "AZURE_OPENAI_API_KEY",
             ProviderType.XAI: "XAI_API_KEY",
             ProviderType.OPENROUTER: "OPENROUTER_API_KEY",
             ProviderType.CUSTOM: "CUSTOM_API_KEY",  # Can be empty for providers that don't need auth
@@ -53,7 +53,6 @@ class ModelCapabilities:

     # Additional attributes
     max_image_size_mb: float = 0.0
-    is_custom: bool = False
     temperature_constraint: TemperatureConstraint = field(
         default_factory=lambda: RangeTemperatureConstraint(0.0, 2.0, 0.3)
     )
@@ -102,9 +101,6 @@ class ModelCapabilities:
         if self.supports_images:
             score += 1

-        if self.is_custom:
-            score -= 1
-
         return max(0, min(100, score))

     @staticmethod
@@ -10,6 +10,7 @@ class ProviderType(Enum):

     GOOGLE = "google"
     OPENAI = "openai"
+    AZURE = "azure"
     XAI = "xai"
     OPENROUTER = "openrouter"
     CUSTOM = "custom"
@@ -21,7 +21,7 @@ py-modules = ["server", "config"]
 "*" = ["conf/*.json"]

 [tool.setuptools.data-files]
-"conf" = ["conf/custom_models.json"]
+"conf" = ["conf/custom_models.json", "conf/openrouter_models.json", "conf/azure_models.json"]

 [project.scripts]
 zen-mcp-server = "server:run"
server.py (26 lines changed)
@@ -377,6 +377,7 @@ def configure_providers():
         value = get_env(key)
         logger.debug(f"  {key}: {'[PRESENT]' if value else '[MISSING]'}")
     from providers import ModelProviderRegistry
+    from providers.azure_openai import AzureOpenAIProvider
     from providers.custom import CustomProvider
     from providers.dial import DIALModelProvider
     from providers.gemini import GeminiModelProvider
@@ -411,6 +412,27 @@ def configure_providers():
     else:
         logger.debug("OpenAI API key is placeholder value")

+    # Check for Azure OpenAI configuration
+    azure_key = get_env("AZURE_OPENAI_API_KEY")
+    azure_endpoint = get_env("AZURE_OPENAI_ENDPOINT")
+    azure_models_available = False
+    if azure_key and azure_key != "your_azure_openai_key_here" and azure_endpoint:
+        try:
+            from providers.azure_registry import AzureModelRegistry
+
+            azure_registry = AzureModelRegistry()
+            if azure_registry.list_models():
+                valid_providers.append("Azure OpenAI")
+                has_native_apis = True
+                azure_models_available = True
+                logger.info("Azure OpenAI configuration detected")
+            else:
+                logger.warning(
+                    "Azure OpenAI models configuration is empty. Populate conf/azure_models.json or set AZURE_MODELS_CONFIG_PATH."
+                )
+        except Exception as exc:
+            logger.warning(f"Failed to load Azure OpenAI models: {exc}")
+
     # Check for X.AI API key
     xai_key = get_env("XAI_API_KEY")
     if xai_key and xai_key != "your_xai_api_key_here":
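Note that the detection above treats the `.env.example` placeholder value as "not configured". A hedged sketch of that check in isolation (the helper name is made up for this example):

```python
def azure_configured(env: dict) -> bool:
    # A key equal to the .env.example placeholder counts as unconfigured,
    # and the endpoint is required as well.
    key = env.get("AZURE_OPENAI_API_KEY")
    endpoint = env.get("AZURE_OPENAI_ENDPOINT")
    return bool(key and key != "your_azure_openai_key_here" and endpoint)
```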
@@ -468,6 +490,10 @@ def configure_providers():
         ModelProviderRegistry.register_provider(ProviderType.OPENAI, OpenAIModelProvider)
         registered_providers.append(ProviderType.OPENAI.value)
         logger.debug(f"Registered provider: {ProviderType.OPENAI.value}")
+    if azure_models_available:
+        ModelProviderRegistry.register_provider(ProviderType.AZURE, AzureOpenAIProvider)
+        registered_providers.append(ProviderType.AZURE.value)
+        logger.debug(f"Registered provider: {ProviderType.AZURE.value}")
     if xai_key and xai_key != "your_xai_api_key_here":
         ModelProviderRegistry.register_provider(ProviderType.XAI, XAIModelProvider)
         registered_providers.append(ProviderType.XAI.value)
@@ -64,6 +64,14 @@ def test_error_listing_respects_env_restrictions(monkeypatch, reset_registry):
     monkeypatch.setenv("OPENAI_API_KEY", "test-openai")
     monkeypatch.setenv("OPENROUTER_API_KEY", "test-openrouter")
     monkeypatch.delenv("XAI_API_KEY", raising=False)
+    # Ensure Azure provider stays disabled regardless of developer workstation env
+    for azure_var in (
+        "AZURE_OPENAI_API_KEY",
+        "AZURE_OPENAI_ENDPOINT",
+        "AZURE_OPENAI_ALLOWED_MODELS",
+        "AZURE_MODELS_CONFIG_PATH",
+    ):
+        monkeypatch.delenv(azure_var, raising=False)
     monkeypatch.setenv("ZEN_MCP_FORCE_ENV_OVERRIDE", "false")
     env_config.reload_env({"ZEN_MCP_FORCE_ENV_OVERRIDE": "false"})
     try:
@@ -103,6 +111,13 @@ def test_error_listing_respects_env_restrictions(monkeypatch, reset_registry):

         for var in ("XAI_API_KEY", "CUSTOM_API_URL", "CUSTOM_API_KEY", "DIAL_API_KEY"):
             monkeypatch.delenv(var, raising=False)
+        for azure_var in (
+            "AZURE_OPENAI_API_KEY",
+            "AZURE_OPENAI_ENDPOINT",
+            "AZURE_OPENAI_ALLOWED_MODELS",
+            "AZURE_MODELS_CONFIG_PATH",
+        ):
+            monkeypatch.delenv(azure_var, raising=False)

         ModelProviderRegistry.reset_for_testing()
         model_restrictions._restriction_service = None
@@ -136,6 +151,13 @@ def test_error_listing_without_restrictions_shows_full_catalog(monkeypatch, rese
     monkeypatch.setenv("OPENROUTER_API_KEY", "test-openrouter")
     monkeypatch.setenv("XAI_API_KEY", "test-xai")
     monkeypatch.setenv("ZEN_MCP_FORCE_ENV_OVERRIDE", "false")
+    for azure_var in (
+        "AZURE_OPENAI_API_KEY",
+        "AZURE_OPENAI_ENDPOINT",
+        "AZURE_OPENAI_ALLOWED_MODELS",
+        "AZURE_MODELS_CONFIG_PATH",
+    ):
+        monkeypatch.delenv(azure_var, raising=False)
     env_config.reload_env({"ZEN_MCP_FORCE_ENV_OVERRIDE": "false"})
     try:
         import dotenv
tests/test_azure_openai_provider.py (new file, 145 lines)
@@ -0,0 +1,145 @@
+import sys
+import types
+
+import pytest
+
+if "openai" not in sys.modules:  # pragma: no cover - test shim for optional dependency
+    stub = types.ModuleType("openai")
+    stub.AzureOpenAI = object  # Replaced with a mock inside tests
+    sys.modules["openai"] = stub
+
+from providers.azure_openai import AzureOpenAIProvider
+from providers.shared import ModelCapabilities, ProviderType
+
+
+class _DummyResponse:
+    def __init__(self):
+        self.choices = [
+            types.SimpleNamespace(
+                message=types.SimpleNamespace(content="hello"),
+                finish_reason="stop",
+            )
+        ]
+        self.model = "prod-gpt4o"
+        self.id = "resp-123"
+        self.created = 0
+        self.usage = types.SimpleNamespace(
+            prompt_tokens=5,
+            completion_tokens=3,
+            total_tokens=8,
+        )
+
+
+@pytest.fixture
+def dummy_azure_client(monkeypatch):
+    captured = {}
+
+    class _DummyAzureClient:
+        def __init__(self, **kwargs):
+            captured["client_kwargs"] = kwargs
+            self.chat = types.SimpleNamespace(completions=types.SimpleNamespace(create=self._create_completion))
+            self.responses = types.SimpleNamespace(create=self._create_response)
+
+        def _create_completion(self, **kwargs):
+            captured["request_kwargs"] = kwargs
+            return _DummyResponse()
+
+        def _create_response(self, **kwargs):
+            captured["responses_kwargs"] = kwargs
+            return _DummyResponse()
+
+    monkeypatch.delenv("AZURE_OPENAI_ALLOWED_MODELS", raising=False)
+    monkeypatch.setattr("providers.azure_openai.AzureOpenAI", _DummyAzureClient)
+    return captured
+
+
+def test_generate_content_uses_deployment_mapping(dummy_azure_client):
+    provider = AzureOpenAIProvider(
+        api_key="key",
+        azure_endpoint="https://example.openai.azure.com/",
+        deployments={"gpt-4o": "prod-gpt4o"},
+    )
+
+    result = provider.generate_content("hello", "gpt-4o")
+
+    assert dummy_azure_client["request_kwargs"]["model"] == "prod-gpt4o"
+    assert result.model_name == "gpt-4o"
+    assert result.provider == ProviderType.AZURE
+    assert provider.validate_model_name("prod-gpt4o")
+
+
+def test_generate_content_accepts_deployment_alias(dummy_azure_client):
+    provider = AzureOpenAIProvider(
+        api_key="key",
+        azure_endpoint="https://example.openai.azure.com/",
+        deployments={"gpt-4o-mini": "mini-deployment"},
+    )
+
+    # Calling with the deployment alias should still resolve properly.
+    result = provider.generate_content("hi", "mini-deployment")
+
+    assert dummy_azure_client["request_kwargs"]["model"] == "mini-deployment"
+    assert result.model_name == "gpt-4o-mini"
+
+
+def test_client_initialization_uses_endpoint_and_version(dummy_azure_client):
+    provider = AzureOpenAIProvider(
+        api_key="key",
+        azure_endpoint="https://example.openai.azure.com/",
+        api_version="2024-03-15-preview",
+        deployments={"gpt-4o": "prod"},
+    )
+
+    _ = provider.client
+
+    assert dummy_azure_client["client_kwargs"]["azure_endpoint"] == "https://example.openai.azure.com"
+    assert dummy_azure_client["client_kwargs"]["api_version"] == "2024-03-15-preview"
+
+
+def test_deployment_overrides_capabilities(dummy_azure_client):
+    provider = AzureOpenAIProvider(
+        api_key="key",
+        azure_endpoint="https://example.openai.azure.com/",
+        deployments={
+            "gpt-4o": {
+                "deployment": "prod-gpt4o",
+                "friendly_name": "Azure GPT-4o EU",
+                "intelligence_score": 19,
+                "supports_temperature": False,
+                "temperature_constraint": "fixed",
+            }
+        },
+    )
+
+    caps = provider.get_capabilities("gpt-4o")
+    assert caps.friendly_name == "Azure GPT-4o EU"
+    assert caps.intelligence_score == 19
+    assert not caps.supports_temperature
+
+
+def test_registry_configuration_merges_capabilities(dummy_azure_client, monkeypatch):
+    def fake_registry_entries(self):
+        capability = ModelCapabilities(
+            provider=ProviderType.AZURE,
+            model_name="gpt-4o",
+            friendly_name="Azure GPT-4o Registry",
+            context_window=500_000,
+            max_output_tokens=128_000,
+        )
+        return {"gpt-4o": {"deployment": "registry-deployment", "capability": capability}}

+    monkeypatch.setattr(AzureOpenAIProvider, "_load_registry_entries", fake_registry_entries)
+
+    provider = AzureOpenAIProvider(
+        api_key="key",
+        azure_endpoint="https://example.openai.azure.com/",
+    )
+
+    # Capability should come from registry
+    caps = provider.get_capabilities("gpt-4o")
+    assert caps.friendly_name == "Azure GPT-4o Registry"
+    assert caps.context_window == 500_000
+
+    # API call should use deployment defined in registry
+    provider.generate_content("hello", "gpt-4o")
+    assert dummy_azure_client["request_kwargs"]["model"] == "registry-deployment"
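The tests above exercise a two-way lookup between canonical model names and Azure deployment ids: calls with either name end up on the same deployment, and the result reports the canonical name. A toy version of that resolution (an illustrative sketch, not the provider's actual implementation):

```python
from typing import Optional


def resolve_deployment(deployments: dict[str, str], name: str) -> Optional[tuple[str, str]]:
    # Accept either the canonical model name or the deployment id,
    # and return the (model_name, deployment_id) pair in both cases.
    if name in deployments:
        return name, deployments[name]
    for model, deployment in deployments.items():
        if deployment == name:
            return model, deployment
    return None


# Either side of the mapping resolves to the same pair:
assert resolve_deployment({"gpt-4o": "prod-gpt4o"}, "gpt-4o") == ("gpt-4o", "prod-gpt4o")
assert resolve_deployment({"gpt-4o": "prod-gpt4o"}, "prod-gpt4o") == ("gpt-4o", "prod-gpt4o")
```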
@@ -34,8 +34,7 @@ class TestCustomOpenAITemperatureParameterFix:
         config_models = [
             {
                 "model_name": "gpt-5-2025-08-07",
-                "provider": "ProviderType.OPENAI",
-                "is_custom": True,
+                "provider": "openai",
                 "context_window": 400000,
                 "max_output_tokens": 128000,
                 "supports_extended_thinking": True,
@@ -62,9 +62,9 @@ class TestCustomProvider:
             with pytest.raises(ValueError):
                 provider.get_capabilities("o3")

-            # Test with a custom model (is_custom=true)
+            # Test with a custom model from the local registry
             capabilities = provider.get_capabilities("local-llama")
-            assert capabilities.provider == ProviderType.CUSTOM  # local-llama has is_custom=true
+            assert capabilities.provider == ProviderType.CUSTOM
             assert capabilities.context_window > 0

         finally:
@@ -181,7 +181,7 @@ class TestModelEnumeration:
         # Configure environment with OpenRouter access only
         self._setup_environment({"OPENROUTER_API_KEY": "test-openrouter-key"})

-        # Create a temporary custom model config with a free variant
+        # Create a temporary OpenRouter model config with a free variant
         custom_config = {
             "models": [
                 {
@@ -199,9 +199,9 @@ class TestModelEnumeration:
             ]
         }

-        config_path = tmp_path / "custom_models.json"
+        config_path = tmp_path / "openrouter_models.json"
         config_path.write_text(json.dumps(custom_config), encoding="utf-8")
-        monkeypatch.setenv("CUSTOM_MODELS_CONFIG_PATH", str(config_path))
+        monkeypatch.setenv("OPENROUTER_MODELS_CONFIG_PATH", str(config_path))

         # Reset cached registries so the temporary config is loaded
         from tools.shared.base_tool import BaseTool
@@ -366,8 +366,8 @@ class TestCustomProviderOpenRouterRestrictions:
         assert not provider.validate_model_name("sonnet")
         assert not provider.validate_model_name("haiku")

-        # Should still validate custom models (is_custom=true) regardless of restrictions
-        assert provider.validate_model_name("local-llama")  # This has is_custom=true
+        # Should still validate custom models defined in conf/custom_models.json
+        assert provider.validate_model_name("local-llama")

     @patch.dict(os.environ, {"OPENROUTER_ALLOWED_MODELS": "opus", "OPENROUTER_API_KEY": "test-key"})
     def test_custom_provider_openrouter_capabilities_restrictions(self):
@@ -389,7 +389,7 @@ class TestCustomProviderOpenRouterRestrictions:
         with pytest.raises(ValueError):
             provider.get_capabilities("haiku")

-        # Should still work for custom models (is_custom=true)
+        # Should still work for custom models
         capabilities = provider.get_capabilities("local-llama")
         assert capabilities.provider == ProviderType.CUSTOM

@@ -172,7 +172,7 @@ class TestOpenRouterAutoMode:
         def mock_resolve(model_name):
             if model_name in model_names:
                 mock_config = Mock()
-                mock_config.is_custom = False
+                mock_config.provider = ProviderType.OPENROUTER
                 mock_config.aliases = []  # Empty list of aliases
                 mock_config.get_effective_capability_rank = Mock(return_value=50)  # Add ranking method
                 return mock_config
@@ -3,6 +3,7 @@
 import json
 import os
 import tempfile
+from unittest.mock import patch

 import pytest

@@ -49,7 +50,7 @@ class TestOpenRouterModelRegistry:
             os.unlink(temp_path)

     def test_environment_variable_override(self):
-        """Test OPENROUTER_MODELS_PATH environment variable."""
+        """Test OPENROUTER_MODELS_CONFIG_PATH environment variable."""
         # Create custom config
         config_data = {
             "models": [
@@ -63,8 +64,8 @@ class TestOpenRouterModelRegistry:

         try:
             # Set environment variable
-            original_env = os.environ.get("CUSTOM_MODELS_CONFIG_PATH")
-            os.environ["CUSTOM_MODELS_CONFIG_PATH"] = temp_path
+            original_env = os.environ.get("OPENROUTER_MODELS_CONFIG_PATH")
+            os.environ["OPENROUTER_MODELS_CONFIG_PATH"] = temp_path

             # Create registry without explicit path
             registry = OpenRouterModelRegistry()
@@ -76,9 +77,9 @@ class TestOpenRouterModelRegistry:
         finally:
             # Restore environment
             if original_env is not None:
-                os.environ["CUSTOM_MODELS_CONFIG_PATH"] = original_env
+                os.environ["OPENROUTER_MODELS_CONFIG_PATH"] = original_env
             else:
-                del os.environ["CUSTOM_MODELS_CONFIG_PATH"]
+                del os.environ["OPENROUTER_MODELS_CONFIG_PATH"]
             os.unlink(temp_path)

     def test_alias_resolution(self):
@@ -161,7 +162,7 @@ class TestOpenRouterModelRegistry:
             os.unlink(temp_path)

     def test_backwards_compatibility_max_tokens(self):
-        """Test that old max_tokens field is no longer supported (should result in empty registry)."""
+        """Test that legacy max_tokens field maps to max_output_tokens."""
         config_data = {
             "models": [
                 {
@@ -178,19 +179,17 @@ class TestOpenRouterModelRegistry:
             temp_path = f.name

         try:
-            # Should gracefully handle the error and result in empty registry
-            registry = OpenRouterModelRegistry(config_path=temp_path)
-            # Registry should be empty due to config error
-            assert len(registry.list_models()) == 0
-            assert len(registry.list_aliases()) == 0
-            assert registry.resolve("old") is None
+            with patch.dict("os.environ", {}, clear=True):
+                with pytest.raises(ValueError, match="max_output_tokens"):
+                    OpenRouterModelRegistry(config_path=temp_path)
         finally:
             os.unlink(temp_path)

     def test_missing_config_file(self):
         """Test behavior with missing config file."""
         # Use a non-existent path
-        registry = OpenRouterModelRegistry(config_path="/non/existent/path.json")
+        with patch.dict("os.environ", {}, clear=True):
+            registry = OpenRouterModelRegistry(config_path="/non/existent/path.json")

         # Should initialize with empty maps
         assert len(registry.list_models()) == 0
@@ -1,5 +1,7 @@
 """Tests for uvx path resolution functionality."""

+import json
+import tempfile
 from pathlib import Path
 from unittest.mock import patch

@@ -18,8 +20,8 @@ class TestUvxPathResolution:
     def test_config_path_resolution(self):
         """Test that the config path resolution finds the config file in multiple locations."""
         # Check that the config file exists in the development location
-        config_file = Path(__file__).parent.parent / "conf" / "custom_models.json"
-        assert config_file.exists(), "Config file should exist in conf/custom_models.json"
+        config_file = Path(__file__).parent.parent / "conf" / "openrouter_models.json"
+        assert config_file.exists(), "Config file should exist in conf/openrouter_models.json"

         # Test that a registry can find and use the config
         registry = OpenRouterModelRegistry()
@@ -34,7 +36,7 @@ class TestUvxPathResolution:

     def test_explicit_config_path_override(self):
         """Test that explicit config path works correctly."""
-        config_path = Path(__file__).parent.parent / "conf" / "custom_models.json"
+        config_path = Path(__file__).parent.parent / "conf" / "openrouter_models.json"

         registry = OpenRouterModelRegistry(config_path=str(config_path))

@@ -44,41 +46,62 @@ class TestUvxPathResolution:

     def test_environment_variable_override(self):
         """Test that CUSTOM_MODELS_CONFIG_PATH environment variable works."""
-        config_path = Path(__file__).parent.parent / "conf" / "custom_models.json"
+        config_path = Path(__file__).parent.parent / "conf" / "openrouter_models.json"

-        with patch.dict("os.environ", {"CUSTOM_MODELS_CONFIG_PATH": str(config_path)}):
+        with patch.dict("os.environ", {"OPENROUTER_MODELS_CONFIG_PATH": str(config_path)}):
             registry = OpenRouterModelRegistry()

             # Should use environment path
             assert registry.config_path == config_path
             assert len(registry.list_models()) > 0

-    @patch("providers.openrouter_registry.importlib.resources.files")
-    @patch("pathlib.Path.exists")
-    def test_multiple_path_fallback(self, mock_exists, mock_files):
-        """Test that multiple path resolution works for different deployment scenarios."""
-        # Make resources loading fail to trigger file system fallback
+    @patch("providers.model_registry_base.importlib.resources.files")
+    def test_multiple_path_fallback(self, mock_files):
+        """Test that file-system fallback works when resource loading fails."""
         mock_files.side_effect = Exception("Resource loading failed")

-        # Simulate dev path failing, and working directory path succeeding
-        # The third `True` is for the check within `reload()`
-        mock_exists.side_effect = [False, True, True]
+        with tempfile.TemporaryDirectory() as tmpdir:
+            temp_dir = Path(tmpdir)
+            conf_dir = temp_dir / "conf"
+            conf_dir.mkdir(parents=True, exist_ok=True)
+            config_path = conf_dir / "openrouter_models.json"
+            config_path.write_text(
+                json.dumps(
+                    {
+                        "models": [
+                            {
+                                "model_name": "test/model",
+                                "aliases": ["testalias"],
+                                "context_window": 1024,
+                                "max_output_tokens": 512,
+                            }
+                        ]
+                    },
+                    indent=2,
+                )
+            )

-        registry = OpenRouterModelRegistry()
+            original_exists = Path.exists

-        # Should have fallen back to file system mode
-        assert not registry.use_resources, "Should fall back to file system when resources fail"
+            def fake_exists(path_self):
+                if str(path_self).endswith("conf/openrouter_models.json") and path_self != config_path:
+                    return False
+                if path_self == config_path:
+                    return True
+                return original_exists(path_self)

-        # Assert that the registry fell back to the second potential path
-        assert registry.config_path == Path.cwd() / "conf" / "custom_models.json"
+            with patch("pathlib.Path.cwd", return_value=temp_dir), patch("pathlib.Path.exists", fake_exists):
+                registry = OpenRouterModelRegistry()

-        # Should load models successfully
-        assert len(registry.list_models()) > 0
+                assert not registry.use_resources
+                assert registry.config_path == config_path
+                assert "test/model" in registry.list_models()

     def test_missing_config_handling(self):
         """Test behavior when config file is missing."""
         # Use a non-existent path
-        registry = OpenRouterModelRegistry(config_path="/nonexistent/path/config.json")
+        with patch.dict("os.environ", {}, clear=True):
+            registry = OpenRouterModelRegistry(config_path="/nonexistent/path/config.json")

         # Should gracefully handle missing config
         assert len(registry.list_models()) == 0
@@ -166,8 +166,10 @@ class TestXAIProvider:
         """Test model restrictions functionality."""
         # Clear cached restriction service
         import utils.model_restrictions
+        from providers.registry import ModelProviderRegistry

         utils.model_restrictions._restriction_service = None
+        ModelProviderRegistry.reset_for_testing()

         provider = XAIModelProvider("test-key")

@@ -187,8 +189,10 @@ class TestXAIProvider:
         """Test multiple models in restrictions."""
         # Clear cached restriction service
         import utils.model_restrictions
+        from providers.registry import ModelProviderRegistry

         utils.model_restrictions._restriction_service = None
+        ModelProviderRegistry.reset_for_testing()

         provider = XAIModelProvider("test-key")

@@ -11,6 +11,8 @@ from typing import Any, Optional

 from mcp.types import TextContent

+from providers.custom_registry import CustomEndpointModelRegistry
+from providers.openrouter_registry import OpenRouterModelRegistry
 from tools.models import ToolModelCategory, ToolOutput
 from tools.shared.base_models import ToolRequest
 from tools.shared.base_tool import BaseTool

@@ -80,7 +82,6 @@ class ListModelsTool(BaseTool):
         Returns:
             Formatted list of models by provider
         """
-        from providers.openrouter_registry import OpenRouterModelRegistry
         from providers.registry import ModelProviderRegistry
         from providers.shared import ProviderType
         from utils.model_restrictions import get_restriction_service

@@ -99,6 +100,7 @@ class ListModelsTool(BaseTool):
         provider_info = {
             ProviderType.GOOGLE: {"name": "Google Gemini", "env_key": "GEMINI_API_KEY"},
             ProviderType.OPENAI: {"name": "OpenAI", "env_key": "OPENAI_API_KEY"},
+            ProviderType.AZURE: {"name": "Azure OpenAI", "env_key": "AZURE_OPENAI_API_KEY"},
             ProviderType.XAI: {"name": "X.AI (Grok)", "env_key": "XAI_API_KEY"},
             ProviderType.DIAL: {"name": "AI DIAL", "env_key": "DIAL_API_KEY"},
         }

@@ -317,12 +319,12 @@ class ListModelsTool(BaseTool):
         output_lines.append("**Description**: Local models via Ollama, vLLM, LM Studio, etc.")

         try:
-            registry = OpenRouterModelRegistry()
+            registry = CustomEndpointModelRegistry()
             custom_models = []

             for alias in registry.list_aliases():
                 config = registry.resolve(alias)
-                if config and config.is_custom:
+                if config:
                     custom_models.append((alias, config))

             if custom_models:
@@ -82,6 +82,7 @@ class BaseTool(ABC):

     # Class-level cache for OpenRouter registry to avoid multiple loads
     _openrouter_registry_cache = None
+    _custom_registry_cache = None

     @classmethod
     def _get_openrouter_registry(cls):

@@ -94,6 +95,16 @@ class BaseTool(ABC):
             logger.debug("Created cached OpenRouter registry instance")
         return BaseTool._openrouter_registry_cache

+    @classmethod
+    def _get_custom_registry(cls):
+        """Get cached custom-endpoint registry instance."""
+        if BaseTool._custom_registry_cache is None:
+            from providers.custom_registry import CustomEndpointModelRegistry
+
+            BaseTool._custom_registry_cache = CustomEndpointModelRegistry()
+            logger.debug("Created cached Custom registry instance")
+        return BaseTool._custom_registry_cache
+
     def __init__(self):
         # Cache tool metadata at initialization to avoid repeated calls
         self.name = self.get_name()

@@ -266,14 +277,10 @@ class BaseTool(ABC):
         custom_url = get_env("CUSTOM_API_URL")
         if custom_url:
             try:
-                registry = self._get_openrouter_registry()
-                # Find all custom models (is_custom=true)
+                registry = self._get_custom_registry()
                 for alias in registry.list_aliases():
-                    config = registry.resolve(alias)
-                    # Check if this is a custom model that requires custom endpoints
-                    if config and config.is_custom:
-                        if alias not in all_models:
-                            all_models.append(alias)
+                    if alias not in all_models:
+                        all_models.append(alias)
             except Exception as e:
                 import logging

@@ -1282,12 +1289,7 @@ When recommending searches, be specific about what information you need and why
         try:
             registry = self._get_openrouter_registry()

-            # Include every known alias so MCP enum matches registry capabilities
             for alias in registry.list_aliases():
-                config = registry.resolve(alias)
-                if config and config.is_custom:
-                    # Custom-only models require CUSTOM_API_URL; defer to custom block
-                    continue
                 if alias not in all_models:
                     all_models.append(alias)
         except Exception as exc:  # pragma: no cover - logged for observability

@@ -1299,10 +1301,9 @@ When recommending searches, be specific about what information you need and why
         custom_url = get_env("CUSTOM_API_URL")
         if custom_url:
             try:
-                registry = self._get_openrouter_registry()
+                registry = self._get_custom_registry()
                 for alias in registry.list_aliases():
-                    config = registry.resolve(alias)
-                    if config and config.is_custom and alias not in all_models:
+                    if alias not in all_models:
                         all_models.append(alias)
             except Exception as exc:  # pragma: no cover - logged for observability
                 import logging
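The new `_get_custom_registry` classmethod mirrors the existing OpenRouter cache: lazily construct one shared registry on first use, writing through the base class so every tool instance and subclass reuses it. A generic sketch of that pattern, with placeholder names standing in for the real classes:

```python
class Registry:
    """Stand-in for a registry class that is expensive to construct."""
    instances = 0

    def __init__(self):
        Registry.instances += 1


class Tool:
    _registry_cache = None  # class-level slot, shared across instances

    @classmethod
    def _get_registry(cls):
        # Read/write through the base class explicitly so subclasses
        # share one cache instead of shadowing it per subclass.
        if Tool._registry_cache is None:
            Tool._registry_cache = Registry()
        return Tool._registry_cache


a = Tool._get_registry()
b = Tool._get_registry()
assert a is b and Registry.instances == 1  # constructed exactly once
```

Writing `Tool._registry_cache` rather than `cls._registry_cache` is the design choice that keeps the cache truly global: assigning via `cls` would create a separate attribute on each subclass.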
utils/env.py (+23)

@@ -4,6 +4,7 @@ from __future__ import annotations

 import os
 from collections.abc import Mapping
+from contextlib import contextmanager
 from pathlib import Path

 try:

@@ -86,3 +87,25 @@ def get_all_env() -> dict[str, str | None]:
     """Expose the loaded .env mapping for diagnostics/logging."""

     return dict(_DOTENV_VALUES)
+
+
+@contextmanager
+def suppress_env_vars(*names: str):
+    """Temporarily remove environment variables during the context.
+
+    Args:
+        names: Environment variable names to remove. Empty or falsy names are ignored.
+    """
+
+    removed: dict[str, str] = {}
+    try:
+        for name in names:
+            if not name:
+                continue
+            if name in os.environ:
+                removed[name] = os.environ[name]
+                del os.environ[name]
+        yield
+    finally:
+        for name, value in removed.items():
+            os.environ[name] = value
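The new `suppress_env_vars` helper hides the named variables for the duration of the `with` block and restores them in the `finally` clause even if the body raises. A self-contained copy of the helper with an illustrative usage (the variable name is made up):

```python
import os
from contextlib import contextmanager


@contextmanager
def suppress_env_vars(*names: str):
    """Hide the named environment variables; restore them on exit."""
    removed: dict[str, str] = {}
    try:
        for name in names:
            if not name:
                continue  # empty/falsy names are ignored
            if name in os.environ:
                removed[name] = os.environ[name]
                del os.environ[name]
        yield
    finally:
        # Restore originals even when the with-body raised.
        for name, value in removed.items():
            os.environ[name] = value


os.environ["DEMO_KEY"] = "secret"
with suppress_env_vars("DEMO_KEY", ""):
    assert "DEMO_KEY" not in os.environ
assert os.environ["DEMO_KEY"] == "secret"  # restored afterwards
```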