diff --git a/README.md b/README.md
index 75c29b2..709dec6 100644
--- a/README.md
+++ b/README.md
@@ -125,7 +125,7 @@ and review into consideration to aid with its final pre-commit review.
 For best results when using [Claude Code](https://claude.ai/code):
 
 - **Sonnet 4.5** - All agentic work and orchestration
-- **Gemini 2.5 Pro** OR **GPT-5-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis
+- **Gemini 3.0 Pro** OR **GPT-5-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis
@@ -134,7 +134,7 @@ For best results when using [Claude Code](https://claude.ai/code):
 For best results when using [Codex CLI](https://developers.openai.com/codex/cli):
 
 - **GPT-5 Codex Medium** - All agentic work and orchestration
-- **Gemini 2.5 Pro** OR **GPT-5-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis
+- **Gemini 3.0 Pro** OR **GPT-5-Pro** - Deep thinking, additional code reviews, debugging and validations, pre-commit analysis
 ## Quick Start (5 minutes)
@@ -205,7 +205,7 @@ Zen activates any provider that has credentials in your `.env`. See `.env.exampl
 **Collaboration & Planning** *(Enabled by default)*
 
 - **[`clink`](docs/tools/clink.md)** - Bridge requests to external AI CLIs (Gemini planner, codereviewer, etc.)
-- **[`chat`](docs/tools/chat.md)** - Brainstorm ideas, get second opinions, validate approaches. With capable models (GPT-5 Pro, Gemini 2.5 Pro), generates complete code / implementation
+- **[`chat`](docs/tools/chat.md)** - Brainstorm ideas, get second opinions, validate approaches. With capable models (GPT-5 Pro, Gemini 3.0 Pro), generates complete code / implementation
 - **[`thinkdeep`](docs/tools/thinkdeep.md)** - Extended reasoning, edge case analysis, alternative perspectives
 - **[`planner`](docs/tools/planner.md)** - Break down complex projects into structured, actionable plans
 - **[`consensus`](docs/tools/consensus.md)** - Get expert opinions from multiple AI models with stance steering
@@ -379,7 +379,7 @@ DISABLED_TOOLS=
 **Model Support**
 
 - **Multiple providers** - Gemini, OpenAI, Azure, X.AI, OpenRouter, DIAL, Ollama
-- **Latest models** - GPT-5, Gemini 2.5 Pro, O3, Grok-4, local Llama
+- **Latest models** - GPT-5, Gemini 3.0 Pro, O3, Grok-4, local Llama
 - **[Thinking modes](docs/advanced-usage.md#thinking-modes)** - Control reasoning depth vs cost
 - **Vision support** - Analyze images, diagrams, screenshots
diff --git a/docs/advanced-usage.md b/docs/advanced-usage.md
index d701f8f..1bde2ef 100644
--- a/docs/advanced-usage.md
+++ b/docs/advanced-usage.md
@@ -33,7 +33,7 @@ Regardless of your default configuration, you can specify models per request:
 | Model | Provider | Context | Strengths | Auto Mode Usage |
 |-------|----------|---------|-----------|------------------|
-| **`pro`** (Gemini 2.5 Pro) | Google | 1M tokens | Extended thinking (up to 32K tokens), deep analysis | Complex architecture, security reviews, deep debugging |
+| **`pro`** (Gemini 3.0 Pro) | Google | 1M tokens | Extended thinking (up to 32K tokens), deep analysis | Complex architecture, security reviews, deep debugging |
 | **`flash`** (Gemini 2.5 Flash) | Google | 1M tokens | Ultra-fast responses with thinking | Quick checks, formatting, simple analysis |
 | **`flash-2.0`** (Gemini 2.0 Flash) | Google | 1M tokens | Latest fast model with audio/video support | Quick analysis with multimodal input |
 | **`flashlite`** (Gemini 2.0 Flash Lite) | Google | 1M tokens | Lightweight text-only model | Fast text processing without vision |
@@ -58,7 +58,7 @@ cloud models (expensive/powerful) AND local models (free/private) in the same co
 **Model Capabilities:**
 - **Gemini Models**: Support thinking modes (minimal to max), web search, 1M context
-  - **Pro 2.5**: Deep analysis with max 32K thinking tokens
+  - **Pro 3.0**: Deep analysis with max 32K thinking tokens
   - **Flash 2.5**: Ultra-fast with thinking support (24K thinking tokens)
   - **Flash 2.0**: Latest fast model with audio/video input (24K thinking tokens)
   - **Flash Lite 2.0**: Text-only lightweight model (no thinking support)
@@ -107,7 +107,7 @@ OPENAI_ALLOWED_MODELS=o3,o4-mini
 ### Thinking Modes & Token Budgets
 
-These only apply to models that support customizing token usage for extended thinking, such as Gemini 2.5 Pro.
+These only apply to models that support customizing token usage for extended thinking, such as Gemini 3.0 Pro.
 | Mode | Token Budget | Use Case | Cost Impact |
 |------|-------------|----------|-------------|
@@ -155,7 +155,7 @@ These only apply to models that support customizing token usage for extended thi
 # Complex debugging, letting claude pick the best model
 "Use zen to debug this race condition with max thinking mode"
 
-# Architecture analysis with Gemini 2.5 Pro
+# Architecture analysis with Gemini 3.0 Pro
 "Analyze the entire src/ directory architecture with high thinking using pro"
 ```
@@ -346,7 +346,7 @@ To help choose the right tool for your needs:
 The Zen MCP server supports vision-capable models for analyzing images, diagrams, screenshots, and visual content. Vision support works seamlessly with all tools and conversation threading.
 
 **Supported Models:**
-- **Gemini 2.5 Pro & Flash**: Excellent for diagrams, architecture analysis, UI mockups (up to 20MB total)
+- **Gemini 3.0 Pro & Flash**: Excellent for diagrams, architecture analysis, UI mockups (up to 20MB total)
 - **OpenAI O3/O4 series**: Strong for visual debugging, error screenshots (up to 20MB total)
 - **Claude models via OpenRouter**: Good for code screenshots, visual analysis (up to 5MB total)
 - **Custom models**: Support varies by model, with 40MB maximum enforced for abuse prevention
diff --git a/docs/configuration.md b/docs/configuration.md
index 0127f52..097c5cc 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -151,7 +151,9 @@ The `allow_code_generation` capability enables models to generate complete, prod
 **Default Thinking Mode for ThinkDeep:**
 
 ```env
-# Only applies to models supporting extended thinking (e.g., Gemini 2.5 Pro)
+# Only applies to models supporting extended thinking (e.g., Gemini 3.0 Pro)
+# Starting with Gemini 3.0 Pro, keep the thinking level at `high`
+
 DEFAULT_THINKING_MODE_THINKDEEP=high
 
 # Available modes and token consumption:
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 9a487b2..b29a371 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -25,7 +25,7 @@ You need at least one API key. Choose based on your needs:
 **Gemini (Google):**
 - Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
 - Generate an API key
-- **Note**: For Gemini 2.5 Pro, use a paid API key (free tier has limited access)
+- **Note**: For Gemini 3.0 / 2.5 Pro, use a paid API key (free tier has limited access)
 
 **OpenAI:**
 - Visit [OpenAI Platform](https://platform.openai.com/api-keys)
diff --git a/docs/model_ranking.md b/docs/model_ranking.md
index d843996..5b5dc1b 100644
--- a/docs/model_ranking.md
+++ b/docs/model_ranking.md
@@ -37,14 +37,14 @@ of the work so you can enforce organisational preferences easily.
 A straightforward rubric that mirrors typical provider tiers:
 
-| Intelligence | Guidance |
-|--------------|----------|
-| 18–19 | Frontier reasoning models (Gemini 2.5 Pro, GPT‑5.1 Codex, GPT‑5.1, GPT‑5) |
-| 15–17 | Strong general models with large context (O3 Pro, DeepSeek R1) |
-| 12–14 | Balanced assistants (Claude Opus/Sonnet, Mistral Large) |
-| 9–11 | Fast distillations (Gemini Flash, GPT-5 Mini, Mistral medium) |
-| 6–8 | Local or efficiency-focused models (Llama 3 70B, Claude Haiku) |
-| ≤5 | Experimental/lightweight models |
+| Intelligence | Guidance |
+|--------------|-------------------------------------------------------------------------------------------|
+| 18–19 | Frontier reasoning models (Gemini 3.0 Pro, Gemini 2.5 Pro, GPT‑5.1 Codex, GPT‑5.1, GPT‑5) |
+| 15–17 | Strong general models with large context (O3 Pro, DeepSeek R1) |
+| 12–14 | Balanced assistants (Claude Opus/Sonnet, Mistral Large) |
+| 9–11 | Fast distillations (Gemini Flash, GPT-5 Mini, Mistral medium) |
+| 6–8 | Local or efficiency-focused models (Llama 3 70B, Claude Haiku) |
+| ≤5 | Experimental/lightweight models |
 
 Record the reasoning for your scores so future updates stay consistent.
diff --git a/docs/tools/chat.md b/docs/tools/chat.md
index ed19e17..8fba541 100644
--- a/docs/tools/chat.md
+++ b/docs/tools/chat.md
@@ -39,7 +39,7 @@ word verdict in the end.
 - **Collaborative thinking partner** for your analysis and planning
 - **Get second opinions** on your designs and approaches
 - **Brainstorm solutions** and explore alternatives together
-- **Structured code generation**: When using GPT-5 Pro or Gemini 2.5 Pro, get complete, production-ready implementations saved to `zen_generated.code` for your CLI to review and apply
+- **Structured code generation**: When using GPT-5.1 or Gemini 3.0 / 2.5 Pro, get complete, production-ready implementations saved to `zen_generated.code` for your CLI to review and apply
 - **Validate your checklists** and implementation plans
 - **General development questions** and explanations
 - **Technology comparisons** and best practices
@@ -62,11 +62,11 @@ word verdict in the end.
 ## Structured Code Generation
 
-When using advanced reasoning models like **GPT-5 Pro** or **Gemini 2.5 Pro**, the chat tool can generate complete, production-ready code implementations in a structured format.
+When using advanced reasoning models like **GPT-5 Pro** or **Gemini 3.0 Pro**, the chat tool can generate complete, production-ready code implementations in a structured format.
 
 ### How It Works
 
-1. You ask your AI agent to implement a complex new feature using `chat` with a higher-reasoning model such as **GPT-5 Pro** or **Gemini 2.5 Pro**
+1. You ask your AI agent to implement a complex new feature using `chat` with a higher-reasoning model such as **GPT-5 Pro** or **Gemini 3.0 Pro**
 2. The model generates structured implementation and shares the complete implementation with Zen
 3. Zen saves the code to `zen_generated.code` and asks AI agent to implement the plan
 4. AI agent continues from the previous context, reads the file, applies the implementation
diff --git a/docs/tools/precommit.md b/docs/tools/precommit.md
index ca9d7d2..8d1a0ae 100644
--- a/docs/tools/precommit.md
+++ b/docs/tools/precommit.md
@@ -59,7 +59,7 @@ to **Publish all Changes** was made after a rigorous set of prompts where it exa
 from different angles (performance, bugs, anti-patterns etc). After confirming that unit tests were passing,
 it went ahead with an initial review with O3; came back clean - all good to go. O3 was happy, everything looked great.
 
-Then it shared the changes and relevant code with Gemini 2.5 Pro - the following is the outcome:
+Then it shared the changes and relevant code with Gemini 3.0 Pro - the following is the outcome:
 
 ```
 ⏺ zen:precommit (MCP)(step: "FINAL VALIDATION COMPLETE. All changes have been thoroughly analyzed and are ready for
@@ -85,7 +85,7 @@ Then it shared the changes and relevant code with Gemini 2.5 Pro - the following
 ⏺ 🔍 Pre-Commit Validation Complete
 
-  Based on comprehensive analysis using both O3 and Gemini 2.5 Pro models, I can provide the following assessment:
+  Based on comprehensive analysis using both O3 and Gemini 3.0 Pro models, I can provide the following assessment:
 
   ✅ APPROVED WITH CRITICAL FIX REQUIRED