GPT-5, GPT-5-mini support

Improvements to model name resolution
Improved instructions for multi-step workflows when continuation is available
Improved instructions for chat tool
Improved preferred model resolution, moved code from registry -> each provider
Updated tests
Fahad
2025-08-08 08:51:34 +05:00
parent 9a4791cb06
commit 1a8ec2e12f
30 changed files with 792 additions and 483 deletions

@@ -37,13 +37,13 @@ OPENROUTER_API_KEY=your_openrouter_api_key_here
# Optional: Default model to use
# Options: 'auto' (Claude picks best model), 'pro', 'flash', 'o3', 'o3-mini', 'o4-mini', 'o4-mini-high',
-# 'grok', 'opus-4', 'sonnet-4', or any DIAL model if DIAL is configured
+# 'gpt-5', 'gpt-5-mini', 'grok', 'opus-4', 'sonnet-4', or any DIAL model if DIAL is configured
# When set to 'auto', Claude will select the best model for each task
# Defaults to 'auto' if not specified
DEFAULT_MODEL=auto
# Optional: Default thinking mode for ThinkDeep tool
-# NOTE: Only applies to models that support extended thinking (e.g., Gemini 2.5 Pro)
+# NOTE: Only applies to models that support extended thinking (e.g., Gemini 2.5 Pro, GPT-5 models)
# Flash models (2.0) will use system prompt engineering instead
# Token consumption per mode:
# minimal: 128 tokens - Quick analysis, fastest response
@@ -65,6 +65,8 @@ DEFAULT_THINKING_MODE_THINKDEEP=high
# - o3-mini (200K context, balanced)
# - o4-mini (200K context, latest balanced, temperature=1.0 only)
# - o4-mini-high (200K context, enhanced reasoning, temperature=1.0 only)
+# - gpt-5 (400K context, 128K output, reasoning tokens)
+# - gpt-5-mini (400K context, 128K output, reasoning tokens)
# - mini (shorthand for o4-mini)
#
# Supported Google/Gemini models: