feat!: Full code can now be generated by an external model and shared with the AI tool (Claude Code / Codex, etc.)!

Model definitions now support a new `allow_code_generation` flag, to be used only with higher-reasoning models such as GPT-5 Pro and Gemini 2.5 Pro.

When `true`, the `chat` tool can request that the external model generate a full implementation, update, or set of instructions, and then share the result with the calling agent.

This effectively lets us use more powerful models such as GPT-5 Pro (available API-only or through the $200 Pro plan in the ChatGPT app) to generate code or entire implementations for us.
Fahad
2025-10-07 18:49:13 +04:00
parent 04f7ce5b03
commit ece8a5ebed
29 changed files with 1008 additions and 122 deletions

View File

```diff
@@ -52,6 +52,9 @@ from tools.simple.base import SimpleTool
 class ChatRequest(ToolRequest):
     prompt: str = Field(..., description="Your question or idea.")
     files: list[str] | None = Field(default_factory=list)
+    working_directory: str = Field(
+        ..., description="Absolute full directory path where the assistant AI can save generated code for implementation."
+    )

 class ChatTool(SimpleTool):
     def get_name(self) -> str:  # required by BaseTool
@@ -67,10 +70,17 @@ class ChatTool(SimpleTool):
         return ChatRequest

     def get_tool_fields(self) -> dict[str, dict[str, object]]:
-        return {"prompt": {"type": "string", "description": "Your question."}, "files": SimpleTool.FILES_FIELD}
+        return {
+            "prompt": {"type": "string", "description": "Your question."},
+            "files": SimpleTool.FILES_FIELD,
+            "working_directory": {
+                "type": "string",
+                "description": "Absolute full directory path where the assistant AI can save generated code for implementation.",
+            },
+        }

     def get_required_fields(self) -> list[str]:
-        return ["prompt"]
+        return ["prompt", "working_directory"]

     async def prepare_prompt(self, request: ChatRequest) -> str:
         return self.prepare_chat_style_prompt(request)
```
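To see the effect of the schema change in isolation, here is a minimal sketch, assuming `ToolRequest` is a Pydantic model (a stand-in stub is used below, since the real base class lives in the repo): omitting `working_directory` now fails validation.

```python
from pydantic import BaseModel, Field, ValidationError

class ToolRequest(BaseModel):  # stand-in stub for the repo's real base class
    pass

class ChatRequest(ToolRequest):
    prompt: str = Field(..., description="Your question or idea.")
    files: list[str] | None = Field(default_factory=list)
    working_directory: str = Field(
        ...,
        description="Absolute full directory path where the assistant AI can save generated code for implementation.",
    )

try:
    ChatRequest(prompt="Refactor the parser")  # working_directory omitted
except ValidationError as exc:
    print(exc.errors()[0]["loc"])  # -> ('working_directory',)
```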

View File

@@ -75,7 +75,7 @@ DEFAULT_MODEL=auto # Claude picks best model for each task (recommended)
- `conf/dial_models.json` DIAL aggregation catalogue (`DIAL_MODELS_CONFIG_PATH`)
- `conf/custom_models.json` Custom/OpenAI-compatible endpoints (`CUSTOM_MODELS_CONFIG_PATH`)
Each JSON file documents the allowed fields via its `_README` block and controls model aliases, capability limits, and feature flags (including `allow_code_generation`). Edit these files (or point the matching `*_MODELS_CONFIG_PATH` variable to your own copy) when you want to adjust context windows, enable JSON mode, enable structured code generation, or expose additional aliases without touching Python code.
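As a rough illustration of that mechanism (not the server's actual loader; `OPENAI_MODELS_CONFIG_PATH` is assumed here by analogy with the variables listed above):

```python
import json
import os
from pathlib import Path

# Illustrative only: resolve the catalogue, honouring an override via the
# matching *_MODELS_CONFIG_PATH variable (OPENAI_MODELS_CONFIG_PATH is an
# assumed name following that pattern).
path = Path(os.environ.get("OPENAI_MODELS_CONFIG_PATH", "conf/openai_models.json"))
catalogue = json.loads(path.read_text())

# Feature flags such as allow_code_generation sit on each model entry.
for entry in catalogue.get("models", []):
    if entry.get("allow_code_generation"):
        print(entry["model_name"], "can return structured code")
```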
The shipped defaults cover:
@@ -87,7 +87,63 @@ DEFAULT_MODEL=auto # Claude picks best model for each task (recommended)
| OpenRouter | See `conf/openrouter_models.json` for the continually evolving catalogue | e.g., `opus`, `sonnet`, `flash`, `pro`, `mistral` |
| Custom | User-managed entries such as `llama3.2` | Define your own aliases per entry |
> **Tip:** Copy the JSON file you need, customise it, and point the corresponding `*_MODELS_CONFIG_PATH` environment variable to your version. This lets you enable or disable capabilities (JSON mode, function calling, temperature support, code generation) without editing Python.
### Code Generation Capability
**`allow_code_generation` Flag:**
The `allow_code_generation` capability enables models to generate complete, production-ready implementations in a structured format. When enabled, the `chat` tool will inject special instructions for substantial code generation tasks.
```json
{
"model_name": "gpt-5",
"allow_code_generation": true,
...
}
```
**When to Enable:**
- **Enable for**: Models MORE capable than your primary CLI's model (e.g., GPT-5, GPT-5 Pro when using Claude Code with Sonnet 4.5)
- **Purpose**: Get complete implementations from a more powerful reasoning model that your primary CLI can then review and apply
- **Use case**: Large-scale implementations, major refactoring, complete module creation
**Important Guidelines:**
1. Only enable for models significantly more capable than your primary CLI to ensure high-quality generated code
2. The capability triggers structured code output (`<GENERATED-CODE>` blocks) for substantial implementation requests
3. Minor code changes still use inline code blocks regardless of this setting
4. Generated code is saved to `zen_generated.code` in the user's working directory (see the sketch below)
5. Your CLI receives instructions to review and apply the generated code systematically
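Concretely, points 4 and 5 amount to a file hand-off; a minimal sketch of the save step (hypothetical helper name, not the server's actual code):

```python
from pathlib import Path

def save_generated_code(working_directory: str, code: str) -> Path:
    """Hypothetical helper: persist the model's structured output.

    Mirrors the documented behaviour: substantial implementations are
    written to zen_generated.code inside the caller-supplied working
    directory, where the CLI can review and apply them.
    """
    target = Path(working_directory) / "zen_generated.code"
    target.write_text(code, encoding="utf-8")
    return target
```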
**Example Configuration:**
```json
// OpenAI models configuration (conf/openai_models.json)
{
"models": [
{
"model_name": "gpt-5",
"allow_code_generation": true,
"intelligence_score": 18,
...
},
{
"model_name": "gpt-5-pro",
"allow_code_generation": true,
"intelligence_score": 19,
...
}
]
}
```
**Typical Workflow:**
1. You ask your AI agent to implement a complex new feature using `chat` with a higher-reasoning model such as **gpt-5-pro**
2. GPT-5 Pro generates a structured implementation and shares it with Zen
3. Zen saves the code to `zen_generated.code` and asks the AI agent to implement the plan
4. The AI agent continues from the previous context, reads the file, and applies the implementation
### Thinking Mode Configuration

View File

@@ -39,13 +39,14 @@ word verdict in the end.
- **Collaborative thinking partner** for your analysis and planning
- **Get second opinions** on your designs and approaches
- **Brainstorm solutions** and explore alternatives together
- **Structured code generation**: When using GPT-5 Pro or Gemini 2.5 Pro, get complete, production-ready implementations saved to `zen_generated.code` for your CLI to review and apply
- **Validate your checklists** and implementation plans
- **General development questions** and explanations
- **Technology comparisons** and best practices
- **Architecture and design discussions**
- **File reference support**: `"Use gemini to explain this algorithm with context from algorithm.py"`
- **Image support**: Include screenshots, diagrams, UI mockups for visual analysis: `"Chat with gemini about this error dialog screenshot to understand the user experience issue"`
- **Dynamic collaboration**: Models can request additional files or context during the conversation if needed for a more thorough response
- **Web search awareness**: Automatically identifies when online research would help and instructs Claude to perform targeted searches using continuation IDs
## Tool Parameters
@@ -54,10 +55,48 @@ word verdict in the end.
- `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default)
- `files`: Optional files for context (absolute paths)
- `images`: Optional images for visual context (absolute paths)
- `working_directory`: **Required** - Absolute directory path where generated code artifacts will be saved (see the example call below)
- `temperature`: Response creativity (0-1, default 0.5)
- `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
- `continuation_id`: Continue previous conversations
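Taken together, a `chat` invocation that can trigger code generation might carry arguments shaped like this (illustrative values only; field names follow the parameter list above):

```python
# Illustrative tool-call arguments; paths and values are placeholders.
arguments = {
    "prompt": "Implement the new caching layer as discussed.",
    "model": "gpt5",                              # any alias from the list above
    "files": ["/abs/path/to/cache_design.md"],    # optional context
    "working_directory": "/abs/path/to/project",  # required: where zen_generated.code is saved
    "temperature": 0.5,
    "continuation_id": None,                      # set to resume a previous conversation
}
```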
## Structured Code Generation
When using advanced reasoning models like **GPT-5 Pro** or **Gemini 2.5 Pro**, the chat tool can generate complete, production-ready code implementations in a structured format.
### How It Works
1. You ask your AI agent to implement a complex new feature using `chat` with a higher-reasoning model such as **GPT-5 Pro** or **Gemini 2.5 Pro**
2. The model generates a structured implementation and shares it with Zen
3. Zen saves the code to `zen_generated.code` and asks the AI agent to implement the plan
4. The AI agent continues from the previous context, reads the file, and applies the implementation, as sketched below
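Steps 3 and 4 reduce to a simple hand-off through the file system; a sketch of the consuming side (hypothetical path):

```python
from pathlib import Path

# Hypothetical: the agent reads the artifact Zen saved in the
# working directory supplied with the original chat request.
artifact = Path("/abs/path/to/project") / "zen_generated.code"
if artifact.exists():
    implementation = artifact.read_text(encoding="utf-8")
    # ...review and apply the plan step by step...
```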
### When Code Generation Activates
The structured format activates for **substantial implementation work**:
- Creating new features from scratch with multiple files or significant code
- Major refactoring across multiple files or large sections
- Implementing new modules, components, or subsystems
- Large-scale updates affecting substantial portions of the codebase
- Complete rewrites of functions, algorithms, or approaches
For minor changes (small tweaks, bug fixes, algorithm improvements), the model responds normally with inline code blocks.
### Example Usage
```
chat with gpt-5-pro and ask it to make me a standalone, classic version of the
Pacman game using pygame that I can run from the commandline. Give me a single
script to execute in the end with any / all dependencies setup for me.
Do everything using pygame, we have no external resources / images / audio at
hand. Instead of ghosts, it'll be different geometric shapes moving around
in the maze that Pacman can eat (so there are no baddies). Pacman gets to eat
everything including bread-crumbs and large geometric shapes but make me the
classic maze / walls that it navigates within using keyboard arrow keys.
```
See the [Configuration Guide](../configuration.md#code-generation-capability) for details on the `allow_code_generation` flag.
## Usage Examples
**Basic Development Chat:**