feat: all native providers now read from catalog files, like the OpenRouter / Custom configs, allowing greater control over model capabilities
@@ -67,16 +67,26 @@ CUSTOM_MODEL_NAME=llama3.2 # Default model
DEFAULT_MODEL=auto # Claude picks best model for each task (recommended)
```

**Available Models:**

- **`auto`**: Claude automatically selects the optimal model
- **`pro`** (Gemini 2.5 Pro): Extended thinking, deep analysis
- **`flash`** (Gemini 2.0 Flash): Ultra-fast responses
- **`o3`**: Strong logical reasoning (200K context)
- **`o3-mini`**: Balanced speed/quality (200K context)
- **`o4-mini`**: Latest reasoning model, optimized for shorter contexts
- **`grok-3`**: GROK-3 advanced reasoning (131K context)
- **`grok-4`**: GROK-4 flagship model (256K context)
- **Custom models**: via OpenRouter or local APIs
- **Available Models:** The canonical capability data for native providers lives in JSON manifests under `conf/`:
- `conf/openai_models.json` – OpenAI catalogue (can be overridden with `OPENAI_MODELS_CONFIG_PATH`)
- `conf/gemini_models.json` – Gemini catalogue (`GEMINI_MODELS_CONFIG_PATH`)
- `conf/xai_models.json` – X.AI / GROK catalogue (`XAI_MODELS_CONFIG_PATH`)
- `conf/openrouter_models.json` – OpenRouter catalogue (`OPENROUTER_MODELS_CONFIG_PATH`)
- `conf/custom_models.json` – Custom/OpenAI-compatible endpoints (`CUSTOM_MODELS_CONFIG_PATH`)

Each JSON file documents the allowed fields via its `_README` block and controls model aliases, capability limits, and feature flags. Edit these files (or point the matching `*_MODELS_CONFIG_PATH` variable to your own copy) when you want to adjust context windows, enable JSON mode, or expose additional aliases without touching Python code.
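
As a rough illustration of what a manifest entry can look like, here is a sketch; the exact field names (`model_name`, `aliases`, `context_window`, and the capability flags) are assumptions, so consult the `_README` block in the file you are editing for the authoritative schema:

```json
{
  "models": [
    {
      "model_name": "gemini-2.5-flash",
      "aliases": ["flash", "flash2.5"],
      "context_window": 1048576,
      "supports_json_mode": true,
      "supports_function_calling": true
    }
  ]
}
```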

The shipped defaults cover:

| Provider | Canonical Models | Notable Aliases |
|----------|-----------------|-----------------|
| OpenAI | `gpt-5`, `gpt-5-pro`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-codex`, `gpt-4.1`, `o3`, `o3-mini`, `o3-pro`, `o4-mini` | `gpt5`, `gpt5pro`, `mini`, `nano`, `codex`, `o3mini`, `o3pro`, `o4mini` |
| Gemini | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-2.0-flash-lite` | `pro`, `gemini-pro`, `flash`, `flash-2.0`, `flashlite` |
| X.AI | `grok-4`, `grok-3`, `grok-3-fast` | `grok`, `grok4`, `grok3`, `grok3fast`, `grokfast` |
| OpenRouter | See `conf/openrouter_models.json` for the continually evolving catalogue | e.g., `opus`, `sonnet`, `flash`, `pro`, `mistral` |
| Custom | User-managed entries such as `llama3.2` | Define your own aliases per entry |

> **Tip:** Copy the JSON file you need, customise it, and point the corresponding `*_MODELS_CONFIG_PATH` environment variable to your version. This lets you enable or disable capabilities (JSON mode, function calling, temperature support) without editing Python.
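
For example, a minimal override might look like the following (the path is a placeholder for wherever you keep your copy):

```env
# Point the Gemini catalogue at a customised copy (example path)
GEMINI_MODELS_CONFIG_PATH=/path/to/my/gemini_models.json
```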

### Thinking Mode Configuration

@@ -114,28 +124,11 @@ XAI_ALLOWED_MODELS=grok-3,grok-3-fast,grok-4
OPENROUTER_ALLOWED_MODELS=opus,sonnet,mistral
```

**Supported Model Names:**

**Supported Model Names:** The names/aliases listed in the JSON manifests above are the authoritative source. Keep in mind:

**OpenAI Models:**
- `o3` (200K context, high reasoning)
- `o3-mini` (200K context, balanced)
- `o4-mini` (200K context, latest balanced)
- `mini` (shorthand for o4-mini)

**Gemini Models:**
- `gemini-2.5-flash` (1M context, fast)
- `gemini-2.5-pro` (1M context, powerful)
- `flash` (shorthand for Flash model)
- `pro` (shorthand for Pro model)

**X.AI GROK Models:**
- `grok-4` (256K context, flagship Grok model with reasoning, vision, and structured outputs)
- `grok-3` (131K context, advanced reasoning)
- `grok-3-fast` (131K context, higher performance)
- `grok` (shorthand for grok-4)
- `grok4` (shorthand for grok-4)
- `grok3` (shorthand for grok-3)
- `grokfast` (shorthand for grok-3-fast)

- Aliases are case-insensitive and defined per entry (for example, `mini` maps to `gpt-5-mini` by default, while `flash` maps to `gemini-2.5-flash`).
- When you override the manifest files you can add or remove aliases as needed; restriction policies (`*_ALLOWED_MODELS`) automatically pick up those changes.
- Models omitted from a manifest fall back to generic capability detection (where supported) and may have limited feature metadata.
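
The case-insensitive alias resolution described above can be sketched as follows; this is not the server's actual implementation, and the manifest field names here are assumptions:

```python
import json

# Hypothetical manifest snippet mirroring the conf/*.json shape;
# the field names are assumptions, not the project's actual schema.
MANIFEST = json.loads("""
{
  "models": [
    {"model_name": "gemini-2.5-flash", "aliases": ["flash"]},
    {"model_name": "gemini-2.5-pro", "aliases": ["pro", "gemini-pro"]}
  ]
}
""")

def build_alias_map(manifest):
    """Map every lowercased canonical name and alias to the canonical name."""
    alias_map = {}
    for entry in manifest["models"]:
        canonical = entry["model_name"]
        alias_map[canonical.lower()] = canonical
        for alias in entry.get("aliases", []):
            alias_map[alias.lower()] = canonical
    return alias_map

def resolve(name, alias_map):
    # Case-insensitive lookup; returns None for models absent from the manifest
    return alias_map.get(name.lower())

alias_map = build_alias_map(MANIFEST)
print(resolve("FLASH", alias_map))       # gemini-2.5-flash
print(resolve("Gemini-Pro", alias_map))  # gemini-2.5-pro
print(resolve("unknown", alias_map))     # None
```

Because every lookup is normalised with `.lower()`, adding or removing aliases in the JSON is all that is needed; no code changes are required.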

**Example Configurations:**

```env
@@ -154,12 +147,14 @@ XAI_ALLOWED_MODELS=grok,grok-3-fast

### Advanced Configuration

**Custom Model Configuration:**

**Custom Model Configuration & Manifest Overrides:**

```env
# Override default location of custom_models.json
CUSTOM_MODELS_CONFIG_PATH=/path/to/your/custom_models.json
# Override default location of openrouter_models.json
OPENROUTER_MODELS_CONFIG_PATH=/path/to/your/openrouter_models.json
# Override default location of built-in catalogues
OPENAI_MODELS_CONFIG_PATH=/path/to/openai_models.json
GEMINI_MODELS_CONFIG_PATH=/path/to/gemini_models.json
XAI_MODELS_CONFIG_PATH=/path/to/xai_models.json
OPENROUTER_MODELS_CONFIG_PATH=/path/to/openrouter_models.json
CUSTOM_MODELS_CONFIG_PATH=/path/to/custom_models.json
```

**Conversation Settings:**

@@ -35,27 +35,33 @@ This guide covers setting up multiple AI model providers including OpenRouter, c

## Model Aliases

Zen ships two registries:

Zen ships multiple registries:

- `conf/openrouter_models.json` – metadata for models routed through OpenRouter. Override with `OPENROUTER_MODELS_CONFIG_PATH` if you maintain a custom copy.
- `conf/custom_models.json` – metadata for local or self-hosted OpenAI-compatible endpoints used by the Custom provider. Override with `CUSTOM_MODELS_CONFIG_PATH` if needed.
- `conf/openai_models.json` – native OpenAI catalogue (override with `OPENAI_MODELS_CONFIG_PATH`)
- `conf/gemini_models.json` – native Google Gemini catalogue (`GEMINI_MODELS_CONFIG_PATH`)
- `conf/xai_models.json` – native X.AI / GROK catalogue (`XAI_MODELS_CONFIG_PATH`)
- `conf/openrouter_models.json` – OpenRouter catalogue (`OPENROUTER_MODELS_CONFIG_PATH`)
- `conf/custom_models.json` – local/self-hosted OpenAI-compatible catalogue (`CUSTOM_MODELS_CONFIG_PATH`)

Copy whichever file you need into your project (or point the corresponding `*_MODELS_CONFIG_PATH` env var at your own copy) and edit it to advertise the models you want.

### OpenRouter Models (Cloud)

| Alias | Maps to OpenRouter Model |
|-------|-------------------------|
| `opus` | `anthropic/claude-opus-4` |
| `sonnet`, `claude` | `anthropic/claude-sonnet-4` |
| `haiku` | `anthropic/claude-3.5-haiku` |
| `gpt4o`, `4o` | `openai/gpt-4o` |
| `gpt4o-mini`, `4o-mini` | `openai/gpt-4o-mini` |
| `pro`, `gemini` | `google/gemini-2.5-pro` |
| `flash` | `google/gemini-2.5-flash` |
| `mistral` | `mistral/mistral-large` |
| `deepseek`, `coder` | `deepseek/deepseek-coder` |
| `perplexity` | `perplexity/llama-3-sonar-large-32k-online` |

The curated defaults in `conf/openrouter_models.json` include popular entries such as:

| Alias | Canonical Model | Highlights |
|-------|-----------------|------------|
| `opus`, `claude-opus` | `anthropic/claude-opus-4.1` | Flagship Claude reasoning model with vision |
| `sonnet`, `sonnet4.5` | `anthropic/claude-sonnet-4.5` | Balanced Claude with high context window |
| `haiku` | `anthropic/claude-3.5-haiku` | Fast Claude option with vision |
| `pro`, `gemini` | `google/gemini-2.5-pro` | Frontier Gemini with extended thinking |
| `flash` | `google/gemini-2.5-flash` | Ultra-fast Gemini with vision |
| `mistral` | `mistralai/mistral-large-2411` | Frontier Mistral (text only) |
| `llama3` | `meta-llama/llama-3-70b` | Large open-weight text model |
| `deepseek-r1` | `deepseek/deepseek-r1-0528` | DeepSeek reasoning model |
| `perplexity` | `perplexity/llama-3-sonar-large-32k-online` | Search-augmented model |

Consult the JSON file for the full list, aliases, and capability flags. Add new entries as OpenRouter releases additional models.

### Custom/Local Models

@@ -65,6 +71,14 @@ Copy whichever file you need into your project (or point the corresponding `*_MO

View the baseline OpenRouter catalogue in [`conf/openrouter_models.json`](conf/openrouter_models.json) and populate [`conf/custom_models.json`](conf/custom_models.json) with your local models.
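
For instance, a local entry might look roughly like this; `llama3.2` is the example model name used elsewhere in these docs, but the other keys are assumptions, so check the file's `_README` block for the real schema:

```json
{
  "models": [
    {
      "model_name": "llama3.2",
      "aliases": ["local-llama", "llama"],
      "context_window": 128000,
      "supports_json_mode": false
    }
  ]
}
```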

Native catalogues (`conf/openai_models.json`, `conf/gemini_models.json`, `conf/xai_models.json`) follow the same schema. Updating those files lets you:

- Expose new aliases (e.g., map `enterprise-pro` to `gpt-5-pro`)
- Advertise support for JSON mode or vision if the upstream provider adds it
- Adjust token limits when providers increase context windows

Because providers load the manifests on import, you can tweak capabilities without touching Python. Restart the server after editing the JSON files so changes are picked up.

To control ordering in auto mode or the `listmodels` summary, adjust the [`intelligence_score`](model_ranking.md) for each entry (or rely on the automatic heuristic described there).
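
That ordering boils down to a descending sort on `intelligence_score`; a minimal sketch (the entries and scores below are made up for illustration):

```python
# Hypothetical catalogue entries; only intelligence_score matters for ranking
entries = [
    {"model_name": "flash-lite", "intelligence_score": 6},
    {"model_name": "pro", "intelligence_score": 18},
    {"model_name": "flash", "intelligence_score": 10},
]

# Rank descending by score, treating a missing score as 0
ranked = sorted(entries, key=lambda e: e.get("intelligence_score", 0), reverse=True)
print([e["model_name"] for e in ranked])  # ['pro', 'flash', 'flash-lite']
```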