docs: update advanced usage and configuration to include new GPT-5.1 models and enhance tool parameters

Bjorn Melin
2025-11-14 01:09:40 -07:00
parent 4d3d177d91
commit 807c9df70e
14 changed files with 83 additions and 35 deletions


@@ -4,16 +4,33 @@ This guide covers advanced features, configuration options, and workflows for po
## Table of Contents ## Table of Contents
- [Model Configuration](#model-configuration) - [Advanced Usage Guide](#advanced-usage-guide)
- [Model Usage Restrictions](#model-usage-restrictions) - [Table of Contents](#table-of-contents)
- [Thinking Modes](#thinking-modes) - [Model Configuration](#model-configuration)
- [Tool Parameters](#tool-parameters) - [Model Usage Restrictions](#model-usage-restrictions)
- [Context Revival: AI Memory Beyond Context Limits](#context-revival-ai-memory-beyond-context-limits) - [Thinking Modes](#thinking-modes)
- [Collaborative Workflows](#collaborative-workflows) - [Thinking Modes \& Token Budgets](#thinking-modes--token-budgets)
- [Working with Large Prompts](#working-with-large-prompts) - [How to Use Thinking Modes](#how-to-use-thinking-modes)
- [Vision Support](#vision-support) - [Optimizing Token Usage \& Costs](#optimizing-token-usage--costs)
- [Web Search Integration](#web-search-integration) - [Tool Parameters](#tool-parameters)
- [System Prompts](#system-prompts) - [File-Processing Tools](#file-processing-tools)
- [Context Revival: AI Memory Beyond Context Limits](#context-revival-ai-memory-beyond-context-limits)
- [**The Breakthrough**](#the-breakthrough)
- [Key Benefits](#key-benefits)
- [Quick Example](#quick-example)
- [Collaborative Workflows](#collaborative-workflows)
- [Design → Review → Implement](#design--review--implement)
- [Code → Review → Fix](#code--review--fix)
- [Debug → Analyze → Solution → Precommit Check → Publish](#debug--analyze--solution--precommit-check--publish)
- [Refactor → Review → Implement → Test](#refactor--review--implement--test)
- [Tool Selection Guidance](#tool-selection-guidance)
- [Vision Support](#vision-support)
- [Working with Large Prompts](#working-with-large-prompts)
- [Web Search Integration](#web-search-integration)
- [System Prompts](#system-prompts)
- [Prompt Architecture](#prompt-architecture)
- [Specialized Expertise](#specialized-expertise)
- [Customization](#customization)
## Model Configuration ## Model Configuration
@@ -41,6 +58,9 @@ Regardless of your default configuration, you can specify models per request:
| **`o3-mini`** | OpenAI | 200K tokens | Balanced speed/quality | Moderate complexity tasks | | **`o3-mini`** | OpenAI | 200K tokens | Balanced speed/quality | Moderate complexity tasks |
| **`o4-mini`** | OpenAI | 200K tokens | Latest reasoning model | Optimized for shorter contexts | | **`o4-mini`** | OpenAI | 200K tokens | Latest reasoning model | Optimized for shorter contexts |
| **`gpt4.1`** | OpenAI | 1M tokens | Latest GPT-4 with extended context | Large codebase analysis, comprehensive reviews | | **`gpt4.1`** | OpenAI | 1M tokens | Latest GPT-4 with extended context | Large codebase analysis, comprehensive reviews |
| **`gpt5.1`** (GPT-5.1) | OpenAI | 400K tokens | Flagship reasoning model with configurable thinking effort | Complex problems, balanced agent/coding flows |
| **`gpt5.1-codex`** (GPT-5.1 Codex) | OpenAI | 400K tokens | Agentic coding specialization (Responses API) | Advanced coding tasks, structured code generation |
| **`gpt5.1-codex-mini`** (GPT-5.1 Codex mini) | OpenAI | 400K tokens | Cost-efficient Codex variant with streaming | Balanced coding tasks, cost-conscious development |
| **`gpt5`** (GPT-5) | OpenAI | 400K tokens | Advanced model with reasoning support | Complex problems requiring advanced reasoning | | **`gpt5`** (GPT-5) | OpenAI | 400K tokens | Advanced model with reasoning support | Complex problems requiring advanced reasoning |
| **`gpt5-mini`** (GPT-5 Mini) | OpenAI | 400K tokens | Efficient variant with reasoning | Balanced performance and capability | | **`gpt5-mini`** (GPT-5 Mini) | OpenAI | 400K tokens | Efficient variant with reasoning | Balanced performance and capability |
| **`gpt5-nano`** (GPT-5 Nano) | OpenAI | 400K tokens | Fastest, cheapest GPT-5 variant | Summarization and classification tasks | | **`gpt5-nano`** (GPT-5 Nano) | OpenAI | 400K tokens | Fastest, cheapest GPT-5 variant | Summarization and classification tasks |
@@ -61,6 +81,10 @@ cloud models (expensive/powerful) AND local models (free/private) in the same co
- **Flash Lite 2.0**: Text-only lightweight model (no thinking support) - **Flash Lite 2.0**: Text-only lightweight model (no thinking support)
- **O3/O4 Models**: Excellent reasoning, systematic analysis, 200K context - **O3/O4 Models**: Excellent reasoning, systematic analysis, 200K context
- **GPT-4.1**: Extended context window (1M tokens), general capabilities - **GPT-4.1**: Extended context window (1M tokens), general capabilities
- **GPT-5.1 Series**: Latest flagship reasoning models, 400K context
- **GPT-5.1**: Flagship model with configurable thinking effort and vision
- **GPT-5.1 Codex**: Agentic coding specialization (Responses API, non-streaming)
- **GPT-5.1 Codex mini**: Cost-efficient Codex variant with streaming support
- **GPT-5 Series**: Advanced reasoning models, 400K context - **GPT-5 Series**: Advanced reasoning models, 400K context
- **GPT-5**: Full-featured with reasoning support and vision - **GPT-5**: Full-featured with reasoning support and vision
- **GPT-5 Mini**: Balanced efficiency and capability - **GPT-5 Mini**: Balanced efficiency and capability
@@ -161,7 +185,7 @@ All tools that work with files support **both individual files and entire direct
**`analyze`** - Analyze files or directories **`analyze`** - Analyze files or directories
- `files`: List of file paths or directories (required) - `files`: List of file paths or directories (required)
- `question`: What to analyze (required) - `question`: What to analyze (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `analysis_type`: architecture|performance|security|quality|general - `analysis_type`: architecture|performance|security|quality|general
- `output_format`: summary|detailed|actionable - `output_format`: summary|detailed|actionable
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
@@ -176,7 +200,7 @@ All tools that work with files support **both individual files and entire direct
**`codereview`** - Review code files or directories **`codereview`** - Review code files or directories
- `files`: List of file paths or directories (required) - `files`: List of file paths or directories (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `review_type`: full|security|performance|quick - `review_type`: full|security|performance|quick
- `focus_on`: Specific aspects to focus on - `focus_on`: Specific aspects to focus on
- `standards`: Coding standards to enforce - `standards`: Coding standards to enforce
@@ -192,7 +216,7 @@ All tools that work with files support **both individual files and entire direct
**`debug`** - Debug with file context **`debug`** - Debug with file context
- `error_description`: Description of the issue (required) - `error_description`: Description of the issue (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `error_context`: Stack trace or logs - `error_context`: Stack trace or logs
- `files`: Files or directories related to the issue - `files`: Files or directories related to the issue
- `runtime_info`: Environment details - `runtime_info`: Environment details
@@ -208,7 +232,7 @@ All tools that work with files support **both individual files and entire direct
**`thinkdeep`** - Extended analysis with file context **`thinkdeep`** - Extended analysis with file context
- `current_analysis`: Your current thinking (required) - `current_analysis`: Your current thinking (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `problem_context`: Additional context - `problem_context`: Additional context
- `focus_areas`: Specific aspects to focus on - `focus_areas`: Specific aspects to focus on
- `files`: Files or directories for context - `files`: Files or directories for context
@@ -224,7 +248,7 @@ All tools that work with files support **both individual files and entire direct
**`testgen`** - Comprehensive test generation with edge case coverage **`testgen`** - Comprehensive test generation with edge case coverage
- `files`: Code files or directories to generate tests for (required) - `files`: Code files or directories to generate tests for (required)
- `prompt`: Description of what to test, testing objectives, and scope (required) - `prompt`: Description of what to test, testing objectives, and scope (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `test_examples`: Optional existing test files as style/pattern reference - `test_examples`: Optional existing test files as style/pattern reference
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
@@ -239,7 +263,7 @@ All tools that work with files support **both individual files and entire direct
- `files`: Code files or directories to analyze for refactoring opportunities (required) - `files`: Code files or directories to analyze for refactoring opportunities (required)
- `prompt`: Description of refactoring goals, context, and specific areas of focus (required) - `prompt`: Description of refactoring goals, context, and specific areas of focus (required)
- `refactor_type`: codesmells|decompose|modernize|organization (required) - `refactor_type`: codesmells|decompose|modernize|organization (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `focus_areas`: Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security') - `focus_areas`: Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security')
- `style_guide_examples`: Optional existing code files to use as style/pattern reference - `style_guide_examples`: Optional existing code files to use as style/pattern reference
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
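
The parameter lists above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the documented `model` and `thinking_mode` values are the only ones you want to allow; `build_analyze_args` is a hypothetical helper, not part of the server, and the server itself may accept further aliases.

```python
# Values copied from the `model` and `thinking_mode` parameter lists above.
KNOWN_MODELS = (
    "auto", "pro", "flash", "flash-2.0", "flashlite",
    "o3", "o3-mini", "o4-mini", "gpt4.1",
    "gpt5.1", "gpt5.1-codex", "gpt5.1-codex-mini",
    "gpt5", "gpt5-mini", "gpt5-nano",
)
THINKING_MODES = ("minimal", "low", "medium", "high", "max")


def build_analyze_args(files, question, model=None, thinking_mode="medium"):
    """Hypothetical helper: assemble and validate arguments for `analyze`."""
    if not files:
        raise ValueError("files is required")
    if not question:
        raise ValueError("question is required")
    if model is not None and model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if thinking_mode not in THINKING_MODES:
        raise ValueError(f"unknown thinking_mode: {thinking_mode!r}")

    args = {
        "files": list(files),
        "question": question,
        "thinking_mode": thinking_mode,
    }
    if model is not None:
        # Leaving `model` out lets the server default apply.
        args["model"] = model
    return args


request = build_analyze_args(["src/"], "Where are the hot paths?", model="gpt5.1-codex")
```

Omitting `model` keeps the "server default" behaviour described in the lists, which is why the sketch only adds the key when a value is passed.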


@@ -63,7 +63,7 @@ CUSTOM_MODEL_NAME=llama3.2 # Default model
**Default Model Selection:** **Default Model Selection:**
```env ```env
# Options: 'auto', 'pro', 'flash', 'o3', 'o3-mini', 'o4-mini', etc. # Options: 'auto', 'pro', 'flash', 'gpt5.1', 'gpt5.1-codex', 'gpt5.1-codex-mini', 'o3', 'o3-mini', 'o4-mini', etc.
DEFAULT_MODEL=auto # Claude picks best model for each task (recommended) DEFAULT_MODEL=auto # Claude picks best model for each task (recommended)
``` ```
@@ -81,12 +81,14 @@ DEFAULT_MODEL=auto # Claude picks best model for each task (recommended)
| Provider | Canonical Models | Notable Aliases | | Provider | Canonical Models | Notable Aliases |
|----------|-----------------|-----------------| |----------|-----------------|-----------------|
| OpenAI | `gpt-5`, `gpt-5-pro`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-codex`, `gpt-4.1`, `o3`, `o3-mini`, `o3-pro`, `o4-mini` | `gpt5`, `gpt5pro`, `mini`, `nano`, `codex`, `o3mini`, `o3pro`, `o4mini` | | OpenAI | `gpt-5.1`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`, `gpt-5`, `gpt-5-pro`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-codex`, `gpt-4.1`, `o3`, `o3-mini`, `o3-pro`, `o4-mini` | `gpt5.1`, `gpt-5.1`, `5.1`, `gpt5.1-codex`, `codex-5.1`, `codex-mini`, `gpt5`, `gpt5pro`, `mini`, `nano`, `codex`, `o3mini`, `o3pro`, `o4mini` |
| Gemini | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-2.0-flash-lite` | `pro`, `gemini-pro`, `flash`, `flash-2.0`, `flashlite` | | Gemini | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-2.0-flash-lite` | `pro`, `gemini-pro`, `flash`, `flash-2.0`, `flashlite` |
| X.AI | `grok-4`, `grok-3`, `grok-3-fast` | `grok`, `grok4`, `grok3`, `grok3fast`, `grokfast` | | X.AI | `grok-4`, `grok-3`, `grok-3-fast` | `grok`, `grok4`, `grok3`, `grok3fast`, `grokfast` |
| OpenRouter | See `conf/openrouter_models.json` for the continually evolving catalogue | e.g., `opus`, `sonnet`, `flash`, `pro`, `mistral` | | OpenRouter | See `conf/openrouter_models.json` for the continually evolving catalogue | e.g., `opus`, `sonnet`, `flash`, `pro`, `mistral` |
| Custom | User-managed entries such as `llama3.2` | Define your own aliases per entry | | Custom | User-managed entries such as `llama3.2` | Define your own aliases per entry |
Latest OpenAI entries (`gpt-5.1`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`) mirror the official model cards released on November 13, 2025: all three expose 400K-token contexts with 128K-token outputs, reasoning-token support, and multimodal inputs. `gpt-5.1-codex` is Responses-only with streaming disabled, while the base `gpt-5.1` and Codex mini support streaming along with full code-generation flags. Update your manifests if you run custom deployments so these capability bits stay accurate.
> **Tip:** Copy the JSON file you need, customise it, and point the corresponding `*_MODELS_CONFIG_PATH` environment variable to your version. This lets you enable or disable capabilities (JSON mode, function calling, temperature support, code generation) without editing Python. > **Tip:** Copy the JSON file you need, customise it, and point the corresponding `*_MODELS_CONFIG_PATH` environment variable to your version. This lets you enable or disable capabilities (JSON mode, function calling, temperature support, code generation) without editing Python.
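
The copy-and-customise workflow from the tip could look roughly like the sketch below. It rests on assumptions that are not confirmed by the text above: the manifest layout (`{"models": [...]}` with a `model_name` key per entry) and the concrete variable name `OPENAI_MODELS_CONFIG_PATH` as one expansion of `*_MODELS_CONFIG_PATH`. Check your own copy of the manifest before relying on either.

```python
import json
import os
import tempfile
from pathlib import Path


def write_custom_manifest(manifest, model_name, overrides, dest):
    """Copy a manifest dict, override flags on one model, and save it."""
    updated = json.loads(json.dumps(manifest))  # deep copy via JSON round-trip
    # Assumed layout: {"models": [{"model_name": ..., <capability flags>}, ...]}
    for entry in updated["models"]:
        if entry.get("model_name") == model_name:
            entry.update(overrides)
            break
    else:
        raise KeyError(f"{model_name} not found in manifest")
    dest.write_text(json.dumps(updated, indent=2))
    return dest


# Stand-in data; a real run would start from a copy of conf/openai_models.json.
sample = {"models": [{"model_name": "gpt-5.1-codex", "allow_code_generation": True}]}

custom_path = write_custom_manifest(
    sample,
    "gpt-5.1-codex",
    {"allow_code_generation": False},
    Path(tempfile.mkdtemp()) / "openai_models.json",
)

# Assumed variable name, following the `*_MODELS_CONFIG_PATH` pattern.
os.environ["OPENAI_MODELS_CONFIG_PATH"] = str(custom_path)
```

The original manifest is left untouched, so the shipped defaults remain available if the override needs to be rolled back.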
### Code Generation Capability ### Code Generation Capability
@@ -105,7 +107,7 @@ The `allow_code_generation` capability enables models to generate complete, prod
**When to Enable:** **When to Enable:**
- **Enable for**: Models MORE capable than your primary CLI's model (e.g., GPT-5, GPT-5 Pro when using Claude Code with Sonnet 4.5) - **Enable for**: Models MORE capable than your primary CLI's model (e.g., GPT-5.1 Codex, GPT-5 Pro, GPT-5.1 when using Claude Code with Sonnet 4.5)
- **Purpose**: Get complete implementations from a more powerful reasoning model that your primary CLI can then review and apply - **Purpose**: Get complete implementations from a more powerful reasoning model that your primary CLI can then review and apply
- **Use case**: Large-scale implementations, major refactoring, complete module creation - **Use case**: Large-scale implementations, major refactoring, complete module creation
@@ -169,7 +171,7 @@ Control which models can be used from each provider for cost control, compliance
# Empty or unset = all models allowed (default) # Empty or unset = all models allowed (default)
# OpenAI model restrictions # OpenAI model restrictions
OPENAI_ALLOWED_MODELS=o3-mini,o4-mini,mini OPENAI_ALLOWED_MODELS=gpt-5.1-codex-mini,gpt-5-mini,o3-mini,o4-mini,mini
# Gemini model restrictions # Gemini model restrictions
GOOGLE_ALLOWED_MODELS=flash,pro GOOGLE_ALLOWED_MODELS=flash,pro
@@ -193,12 +195,17 @@ OPENROUTER_ALLOWED_MODELS=opus,sonnet,mistral
OPENAI_ALLOWED_MODELS=o4-mini OPENAI_ALLOWED_MODELS=o4-mini
GOOGLE_ALLOWED_MODELS=flash GOOGLE_ALLOWED_MODELS=flash
# High-performance setup
OPENAI_ALLOWED_MODELS=gpt-5.1-codex,gpt-5.1
GOOGLE_ALLOWED_MODELS=pro
# Single model standardization # Single model standardization
OPENAI_ALLOWED_MODELS=o4-mini OPENAI_ALLOWED_MODELS=o4-mini
GOOGLE_ALLOWED_MODELS=pro GOOGLE_ALLOWED_MODELS=pro
# Balanced selection # Balanced selection
GOOGLE_ALLOWED_MODELS=flash,pro GOOGLE_ALLOWED_MODELS=flash,pro
OPENAI_ALLOWED_MODELS=gpt-5.1-codex-mini,gpt-5-mini,o4-mini
XAI_ALLOWED_MODELS=grok,grok-3-fast XAI_ALLOWED_MODELS=grok,grok-3-fast
``` ```
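
To illustrate how a comma-separated allow-list such as `OPENAI_ALLOWED_MODELS` can be interpreted, here is a minimal sketch. It assumes plain, case-insensitive name matching and treats an empty or unset value as "all models allowed", as the comment in the block above states. The function names are hypothetical, and the real server may additionally resolve aliases such as `mini`.

```python
def parse_allowed(raw):
    """Return a set of allowed names, or None when everything is allowed."""
    if raw is None:
        return None
    names = {part.strip().lower() for part in raw.split(",") if part.strip()}
    return names or None  # empty string behaves like an unset variable


def is_allowed(model, raw):
    """Check one model name against a raw allow-list value."""
    allowed = parse_allowed(raw)
    return allowed is None or model.lower() in allowed


high_performance = "gpt-5.1-codex,gpt-5.1"
print(is_allowed("gpt-5.1", high_performance))
print(is_allowed("o4-mini", high_performance))
print(is_allowed("o4-mini", ""))
```

With the "High-performance setup" value, `gpt-5.1` passes and `o4-mini` is rejected, while an empty value lets every model through.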
@@ -240,6 +247,8 @@ LOG_LEVEL=DEBUG # Default: shows detailed operational messages
DEFAULT_MODEL=auto DEFAULT_MODEL=auto
GEMINI_API_KEY=your-gemini-key GEMINI_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key OPENAI_API_KEY=your-openai-key
GOOGLE_ALLOWED_MODELS=flash,pro
OPENAI_ALLOWED_MODELS=gpt-5.1-codex-mini,gpt-5-mini,o4-mini
XAI_API_KEY=your-xai-key XAI_API_KEY=your-xai-key
LOG_LEVEL=DEBUG LOG_LEVEL=DEBUG
CONVERSATION_TIMEOUT_HOURS=1 CONVERSATION_TIMEOUT_HOURS=1
@@ -252,7 +261,7 @@ DEFAULT_MODEL=auto
GEMINI_API_KEY=your-gemini-key GEMINI_API_KEY=your-gemini-key
OPENAI_API_KEY=your-openai-key OPENAI_API_KEY=your-openai-key
GOOGLE_ALLOWED_MODELS=flash GOOGLE_ALLOWED_MODELS=flash
OPENAI_ALLOWED_MODELS=o4-mini OPENAI_ALLOWED_MODELS=gpt-5.1-codex-mini,o4-mini
LOG_LEVEL=INFO LOG_LEVEL=INFO
CONVERSATION_TIMEOUT_HOURS=3 CONVERSATION_TIMEOUT_HOURS=3
``` ```


@@ -61,6 +61,9 @@ The curated defaults in `conf/openrouter_models.json` include popular entries su
| `llama3` | `meta-llama/llama-3-70b` | Large open-weight text model | | `llama3` | `meta-llama/llama-3-70b` | Large open-weight text model |
| `deepseek-r1` | `deepseek/deepseek-r1-0528` | DeepSeek reasoning model | | `deepseek-r1` | `deepseek/deepseek-r1-0528` | DeepSeek reasoning model |
| `perplexity` | `perplexity/llama-3-sonar-large-32k-online` | Search-augmented model | | `perplexity` | `perplexity/llama-3-sonar-large-32k-online` | Search-augmented model |
| `gpt5.1`, `gpt-5.1`, `5.1` | `openai/gpt-5.1` | Flagship GPT-5.1 with reasoning and vision |
| `gpt5.1-codex`, `codex-5.1` | `openai/gpt-5.1-codex` | Agentic coding specialization (Responses API) |
| `codex-mini`, `gpt5.1-codex-mini` | `openai/gpt-5.1-codex-mini` | Cost-efficient Codex variant with streaming |
Consult the JSON file for the full list, aliases, and capability flags. Add new entries as OpenRouter releases additional models. Consult the JSON file for the full list, aliases, and capability flags. Add new entries as OpenRouter releases additional models.
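
The alias rows added to the table can be read as a lookup from alias to canonical OpenRouter name. The sketch below hard-codes only the three GPT-5.1 rows from this diff; `resolve` is a hypothetical helper, and the server's own resolution logic may differ (for example in how it treats case or unknown names).

```python
# Canonical OpenRouter names and their aliases, taken from the table rows above.
ALIASES = {
    "openai/gpt-5.1": ("gpt5.1", "gpt-5.1", "5.1"),
    "openai/gpt-5.1-codex": ("gpt5.1-codex", "codex-5.1"),
    "openai/gpt-5.1-codex-mini": ("codex-mini", "gpt5.1-codex-mini"),
}

# Invert the mapping once so each lookup is a single dict access.
LOOKUP = {
    alias: canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def resolve(name):
    """Map an alias to its canonical name; pass unknown names through."""
    return LOOKUP.get(name.strip().lower(), name)


print(resolve("5.1"))
print(resolve("codex-mini"))
```

Names that are not listed as aliases are returned unchanged, so canonical names can be passed straight through.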
@@ -78,6 +81,18 @@ Native catalogues (`conf/openai_models.json`, `conf/gemini_models.json`, `conf/x
- Advertise support for JSON mode or vision if the upstream provider adds it - Advertise support for JSON mode or vision if the upstream provider adds it
- Adjust token limits when providers increase context windows - Adjust token limits when providers increase context windows
### Latest OpenAI releases
OpenAI's November 13, 2025 drop introduced `gpt-5.1`, `gpt-5.1-codex`, and `gpt-5.1-codex-mini`, all of which now ship in `conf/openai_models.json`:
| Model | Highlights | Notes |
|-------|------------|-------|
| `gpt-5.1` | 400K context, 128K output, multimodal IO, configurable reasoning effort | Streaming enabled; use for balanced agent/coding flows |
| `gpt-5.1-codex` | Responses-only agentic coding version of GPT-5.1 | Streaming disabled; `use_openai_response_api=true`; `allow_code_generation=true` |
| `gpt-5.1-codex-mini` | Cost-efficient Codex variant | Streaming enabled, retains 400K context and code-generation flag |
These entries include pricing-friendly aliases (`gpt5.1`, `codex-5.1`, `codex-mini`) plus updated capability flags (`supports_extended_thinking`, `allow_code_generation`). Copy the manifest if you operate custom deployment names so downstream providers inherit the same metadata.
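
The capability notes in the table above can be captured as data, which makes it easy to filter models by feature. This is a sketch only: `ModelCaps` and its field names are hypothetical, the values are transcribed from the table and the accompanying configuration notes, and 400K is approximated as 400,000 tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCaps:
    """Hypothetical record of the capabilities listed in the table."""
    name: str
    context_tokens: int  # "400K", approximated as 400_000
    streaming: bool
    allow_code_generation: bool


CATALOG = (
    ModelCaps("gpt-5.1", 400_000, streaming=True, allow_code_generation=True),
    ModelCaps("gpt-5.1-codex", 400_000, streaming=False, allow_code_generation=True),
    ModelCaps("gpt-5.1-codex-mini", 400_000, streaming=True, allow_code_generation=True),
)


def streaming_models(catalog=CATALOG):
    """Names of models that the table marks as streaming-enabled."""
    return [caps.name for caps in catalog if caps.streaming]


print(streaming_models())
```

Filtering on `streaming` excludes `gpt-5.1-codex`, which matches its "Streaming disabled" note in the table.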
Because providers load the manifests on import, you can tweak capabilities without touching Python. Restart the server after editing the JSON files so changes are picked up. Because providers load the manifests on import, you can tweak capabilities without touching Python. Restart the server after editing the JSON files so changes are picked up.
To control ordering in auto mode or the `listmodels` summary, adjust the To control ordering in auto mode or the `listmodels` summary, adjust the


@@ -29,7 +29,7 @@ You need at least one API key. Choose based on your needs:
**OpenAI:** **OpenAI:**
- Visit [OpenAI Platform](https://platform.openai.com/api-keys) - Visit [OpenAI Platform](https://platform.openai.com/api-keys)
- Generate an API key for O3, GPT-5 access - Generate an API key for GPT-5.1, GPT-5.1-Codex, GPT-5, O3 access
**X.AI (Grok):** **X.AI (Grok):**
- Visit [X.AI Console](https://console.x.ai/) - Visit [X.AI Console](https://console.x.ai/)
@@ -287,7 +287,7 @@ Add your API keys (at least one required):
```env ```env
# Choose your providers (at least one required) # Choose your providers (at least one required)
GEMINI_API_KEY=your-gemini-api-key-here # For Gemini models GEMINI_API_KEY=your-gemini-api-key-here # For Gemini models
OPENAI_API_KEY=your-openai-api-key-here # For O3, GPT-5 OPENAI_API_KEY=your-openai-api-key-here # For GPT-5.1, GPT-5.1-Codex, O3
XAI_API_KEY=your-xai-api-key-here # For Grok models XAI_API_KEY=your-xai-api-key-here # For Grok models
OPENROUTER_API_KEY=your-openrouter-key # For multiple models OPENROUTER_API_KEY=your-openrouter-key # For multiple models
@@ -498,7 +498,7 @@ DEFAULT_MODEL=auto
GEMINI_API_KEY=your-key GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key OPENAI_API_KEY=your-key
GOOGLE_ALLOWED_MODELS=flash,pro GOOGLE_ALLOWED_MODELS=flash,pro
OPENAI_ALLOWED_MODELS=o4-mini,o3-mini OPENAI_ALLOWED_MODELS=gpt-5.1-codex-mini,gpt-5-mini,o4-mini
``` ```
### Cost-Optimized Setup ### Cost-Optimized Setup
@@ -514,7 +514,7 @@ DEFAULT_MODEL=auto
GEMINI_API_KEY=your-key GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key OPENAI_API_KEY=your-key
GOOGLE_ALLOWED_MODELS=pro GOOGLE_ALLOWED_MODELS=pro
OPENAI_ALLOWED_MODELS=o3 OPENAI_ALLOWED_MODELS=gpt-5.1-codex,gpt-5.1
``` ```
### Local-First Setup ### Local-First Setup


@@ -39,7 +39,7 @@ A straightforward rubric that mirrors typical provider tiers:
| Intelligence | Guidance | | Intelligence | Guidance |
|--------------|----------| |--------------|----------|
| 18–19 | Frontier reasoning models (Gemini 2.5 Pro, GPT-5) | | 18–19 | Frontier reasoning models (Gemini 2.5 Pro, GPT-5.1 Codex, GPT-5.1, GPT-5) |
| 15–17 | Strong general models with large context (O3 Pro, DeepSeek R1) | | 15–17 | Strong general models with large context (O3 Pro, DeepSeek R1) |
| 12–14 | Balanced assistants (Claude Opus/Sonnet, Mistral Large) | | 12–14 | Balanced assistants (Claude Opus/Sonnet, Mistral Large) |
| 9–11 | Fast distillations (Gemini Flash, GPT-5 Mini, Mistral medium) | | 9–11 | Fast distillations (Gemini Flash, GPT-5 Mini, Mistral medium) |


@@ -64,7 +64,7 @@ This workflow ensures methodical analysis before expert insights, resulting in d
**Initial Configuration (used in step 1):** **Initial Configuration (used in step 1):**
- `prompt`: What to analyze or look for (required) - `prompt`: What to analyze or look for (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `analysis_type`: architecture|performance|security|quality|general (default: general) - `analysis_type`: architecture|performance|security|quality|general (default: general)
- `output_format`: summary|detailed|actionable (default: detailed) - `output_format`: summary|detailed|actionable (default: detailed)
- `temperature`: Temperature for analysis (0-1, default 0.2) - `temperature`: Temperature for analysis (0-1, default 0.2)


@@ -52,7 +52,7 @@ word verdict in the end.
## Tool Parameters ## Tool Parameters
- `prompt`: Your question or discussion topic (required) - `prompt`: Your question or discussion topic (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `absolute_file_paths`: Optional absolute file or directory paths for additional context - `absolute_file_paths`: Optional absolute file or directory paths for additional context
- `images`: Optional images for visual context (absolute paths) - `images`: Optional images for visual context (absolute paths)
- `working_directory_absolute_path`: **Required** - Absolute path to an existing directory where generated code artifacts will be saved - `working_directory_absolute_path`: **Required** - Absolute path to an existing directory where generated code artifacts will be saved


@@ -79,7 +79,7 @@ The above prompt will simultaneously run two separate `codereview` tools with tw
**Initial Review Configuration (used in step 1):** **Initial Review Configuration (used in step 1):**
- `prompt`: User's summary of what the code does, expected behavior, constraints, and review objectives (required) - `prompt`: User's summary of what the code does, expected behavior, constraints, and review objectives (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `review_type`: full|security|performance|quick (default: full) - `review_type`: full|security|performance|quick (default: full)
- `focus_on`: Specific aspects to focus on (e.g., "security vulnerabilities", "performance bottlenecks") - `focus_on`: Specific aspects to focus on (e.g., "security vulnerabilities", "performance bottlenecks")
- `standards`: Coding standards to enforce (e.g., "PEP8", "ESLint", "Google Style Guide") - `standards`: Coding standards to enforce (e.g., "PEP8", "ESLint", "Google Style Guide")


@@ -72,7 +72,7 @@ This structured approach ensures Claude performs methodical groundwork before ex
- `images`: Visual debugging materials (error screenshots, logs, etc.) - `images`: Visual debugging materials (error screenshots, logs, etc.)
**Model Selection:** **Model Selection:**
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
- `use_assistant_model`: Whether to use expert analysis phase (default: true, set to false to use Claude only) - `use_assistant_model`: Whether to use expert analysis phase (default: true, set to false to use Claude only)


@@ -140,7 +140,7 @@ Use zen and perform a thorough precommit ensuring there aren't any new regressio
**Initial Configuration (used in step 1):** **Initial Configuration (used in step 1):**
- `path`: Starting directory to search for repos (REQUIRED for step 1, must be absolute path) - `path`: Starting directory to search for repos (REQUIRED for step 1, must be absolute path)
- `prompt`: The original user request description for the changes (required for context) - `prompt`: The original user request description for the changes (required for context)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `compare_to`: Compare against a branch/tag instead of local changes (optional) - `compare_to`: Compare against a branch/tag instead of local changes (optional)
- `severity_filter`: critical|high|medium|low|all (default: all) - `severity_filter`: critical|high|medium|low|all (default: all)
- `include_staged`: Include staged changes in the review (default: true) - `include_staged`: Include staged changes in the review (default: true)


@@ -102,7 +102,7 @@ This results in Claude first performing its own expert analysis, encouraging it
**Initial Configuration (used in step 1):** **Initial Configuration (used in step 1):**
- `prompt`: Description of refactoring goals, context, and specific areas of focus (required) - `prompt`: Description of refactoring goals, context, and specific areas of focus (required)
- `refactor_type`: codesmells|decompose|modernize|organization (default: codesmells) - `refactor_type`: codesmells|decompose|modernize|organization (default: codesmells)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `focus_areas`: Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security') - `focus_areas`: Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security')
- `style_guide_examples`: Optional existing code files to use as style/pattern reference (absolute paths) - `style_guide_examples`: Optional existing code files to use as style/pattern reference (absolute paths)
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)


@@ -85,7 +85,7 @@ security remediation plan using planner
- `images`: Architecture diagrams, security documentation, or visual references - `images`: Architecture diagrams, security documentation, or visual references
**Initial Security Configuration (used in step 1):** **Initial Security Configuration (used in step 1):**
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `security_scope`: Application context, technology stack, and security boundary definition (required) - `security_scope`: Application context, technology stack, and security boundary definition (required)
- `threat_level`: low|medium|high|critical (default: medium) - determines assessment depth and urgency - `threat_level`: low|medium|high|critical (default: medium) - determines assessment depth and urgency
- `compliance_requirements`: List of compliance frameworks to assess against (e.g., ["PCI DSS", "SOC2"]) - `compliance_requirements`: List of compliance frameworks to assess against (e.g., ["PCI DSS", "SOC2"])


@@ -69,7 +69,7 @@ Test generation excels with extended reasoning models like Gemini Pro or O3, whi
**Initial Configuration (used in step 1):** **Initial Configuration (used in step 1):**
- `prompt`: Description of what to test, testing objectives, and specific scope/focus areas (required) - `prompt`: Description of what to test, testing objectives, and specific scope/focus areas (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `test_examples`: Optional existing test files or directories to use as style/pattern reference (absolute paths) - `test_examples`: Optional existing test files or directories to use as style/pattern reference (absolute paths)
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only) - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
- `use_assistant_model`: Whether to use expert test generation phase (default: true, set to false to use Claude only) - `use_assistant_model`: Whether to use expert test generation phase (default: true, set to false to use Claude only)


@@ -30,7 +30,7 @@ with the best architecture for my project
## Tool Parameters ## Tool Parameters
- `prompt`: Your current thinking/analysis to extend and validate (required) - `prompt`: Your current thinking/analysis to extend and validate (required)
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default) - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5.1|gpt5.1-codex|gpt5.1-codex-mini|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `problem_context`: Additional context about the problem or goal - `problem_context`: Additional context about the problem or goal
- `focus_areas`: Specific aspects to focus on (architecture, performance, security, etc.) - `focus_areas`: Specific aspects to focus on (architecture, performance, security, etc.)
- `files`: Optional file paths or directories for additional context (absolute paths) - `files`: Optional file paths or directories for additional context (absolute paths)