Compare commits: micn/inter...micn/auton (4 commits)

| Author | SHA1 | Date |
|---|---|---|
| | f2ed303550 | |
| | 93121c18e0 | |
| | ed84a940f9 | |
| | 3128b5d8b9 | |

CHANGELOG.md (new file, 33 lines)

@@ -0,0 +1,33 @@
# Changelog

## [Unreleased]

### Added

**Interactive Requirements Mode**
- **AI-Enhanced Interactive Requirements**: New `--interactive-requirements` flag for autonomous mode
  - User enters a brief description of what they want to build
  - AI automatically enhances the input into a structured requirements.md document
  - Generates professional markdown with:
    - Project title and overview
    - Organized requirements (functional, technical, quality)
    - Acceptance criteria
  - User can review, accept, edit manually, or cancel before proceeding
  - Seamlessly transitions to autonomous mode

**Autonomous Mode Configuration**
- **Autonomous Mode Configuration**: Added the ability to specify different models for the coach and player agents in autonomous mode
  - New `[autonomous]` configuration section in `g3.toml` (see the example after this list)
  - `coach_provider` and `coach_model` options for the coach agent
  - `player_provider` and `player_model` options for the player agent
  - `Config::for_coach()` and `Config::for_player()` methods to generate role-specific configurations
  - Comprehensive test suite for autonomous configuration
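
A minimal example of the new section; the provider and model values below are placeholders borrowed from the test suite, not recommendations:

```toml
# g3.toml (illustrative values)
[autonomous]
coach_provider = "anthropic"
coach_model = "claude-3-opus-20240229"
player_provider = "databricks"
player_model = "databricks-dbrx-instruct"
```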

### Changed

- Autonomous mode now uses `config.for_player()` for the player agent
- Coach agent creation now uses `config.for_coach()` for the coach agent

### Benefits

- **Cost Optimization**: Use cheaper models for execution, expensive models for review
- **Speed Optimization**: Use faster models for iteration, thorough models for validation
- **Specialization**: Leverage different providers' strengths for different roles

Cargo.lock (generated, 1 line changed)

@@ -1316,7 +1316,6 @@ dependencies = [
 "dirs 5.0.1",
 "serde",
 "shellexpand",
 "tempfile",
 "thiserror 1.0.69",
 "toml",
]

README.md (346 lines changed)

@@ -2,122 +2,14 @@
G3 is a coding AI agent designed to help you complete tasks by writing code and executing commands. Built in Rust, it provides a flexible architecture for interacting with various Large Language Model (LLM) providers while offering powerful code generation and task automation capabilities.

## Architecture Overview

G3 follows a modular architecture organized as a Rust workspace with multiple crates, each responsible for specific functionality:

### Core Components

#### **g3-core**

The heart of the agent system, containing:
- **Agent Engine**: Main orchestration logic for handling conversations, tool execution, and task management
- **Context Window Management**: Intelligent tracking of token usage with context thinning (50-80%) and auto-summarization at 80% capacity
- **Tool System**: Built-in tools for file operations, shell commands, computer control, TODO management, and structured output
- **Streaming Response Parser**: Real-time parsing of LLM responses with tool call detection and execution
- **Task Execution**: Support for single and iterative task execution with automatic retry logic

#### **g3-providers**

Abstraction layer for LLM providers:
- **Provider Interface**: Common trait-based API for different LLM backends
- **Multiple Provider Support**:
  - Anthropic (Claude models)
  - Databricks (DBRX and other models)
  - Local/embedded models via llama.cpp with Metal acceleration on macOS
- **OAuth Authentication**: Built-in OAuth flow support for secure provider authentication
- **Provider Registry**: Dynamic provider management and selection

#### **g3-config**

Configuration management system:
- Environment-based configuration
- Provider credentials and settings
- Model selection and parameters
- Runtime configuration options

#### **g3-execution**

Task execution framework:
- Task planning and decomposition
- Execution strategies (sequential, parallel)
- Error handling and retry mechanisms
- Progress tracking and reporting

#### **g3-computer-control**

Computer control capabilities:
- Mouse and keyboard automation
- UI element inspection and interaction
- Screenshot capture and window management
- OCR text extraction via Tesseract

#### **g3-cli**

Command-line interface:
- Interactive terminal interface
- Task submission and monitoring
- Configuration management commands
- Session management

### Error Handling & Resilience

G3 includes robust error handling with automatic retry logic (a sketch of the backoff calculation follows the list):
- **Recoverable Error Detection**: Automatically identifies recoverable errors (rate limits, network issues, server errors, timeouts)
- **Exponential Backoff with Jitter**: Implements intelligent retry delays to avoid overwhelming services
- **Detailed Error Logging**: Captures comprehensive error context including stack traces, request/response data, and session information
- **Error Persistence**: Saves detailed error logs to `logs/errors/` for post-mortem analysis
- **Graceful Degradation**: Non-recoverable errors are logged with full context before terminating
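
A minimal sketch of exponential backoff with jitter as described above; the function, constants, and use of the `rand` crate are illustrative assumptions, not G3's actual implementation:

```rust
use rand::Rng;
use std::time::Duration;

/// Illustrative only: delay before retry `attempt` (0-based), doubling a
/// base delay, capping it, and adding random jitter so concurrent clients
/// don't all retry in lockstep.
fn backoff_delay(attempt: u32) -> Duration {
    let base_ms: u64 = 500;
    let cap_ms: u64 = 30_000;
    let exp = base_ms.saturating_mul(2u64.saturating_pow(attempt)).min(cap_ms);
    let jitter = rand::thread_rng().gen_range(0..=exp / 2);
    Duration::from_millis(exp + jitter)
}
```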

## Key Features

### Intelligent Context Management

- Automatic context window monitoring with percentage-based tracking
- Smart auto-summarization when approaching token limits
- **Context thinning** at 50%, 60%, 70%, 80% thresholds - automatically replaces large tool results with file references (see the sketch after this list)
- Conversation history preservation through summaries
- Dynamic token allocation for different providers (4k to 200k+ tokens)
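
A minimal sketch of that threshold logic; the names and return values are assumptions for illustration, not G3's actual code:

```rust
/// Illustrative only: decide what to do as context usage grows. Thinning
/// fires once per threshold; auto-summarization takes over at 80% capacity.
const THINNING_THRESHOLDS: [f32; 4] = [0.5, 0.6, 0.7, 0.8];

fn context_action(used: u32, max: u32, last_threshold: f32) -> Option<&'static str> {
    let usage = used as f32 / max as f32;
    if usage >= 0.8 {
        return Some("auto-summarize conversation history");
    }
    THINNING_THRESHOLDS
        .iter()
        .find(|&&t| usage >= t && t > last_threshold)
        .map(|_| "replace large tool results with file references")
}
```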

### Tool Ecosystem

- **File Operations**: Read, write, and edit files with line-range precision
- **Shell Integration**: Execute system commands with output capture
- **Code Generation**: Structured code generation with syntax awareness
- **TODO Management**: Read and write TODO lists with markdown checkbox format
- **Computer Control** (Experimental): Automate desktop applications
  - Mouse and keyboard control
  - UI element inspection
  - Screenshot capture and window management
  - OCR text extraction from images and screen regions
  - Window listing and identification
- **Final Output**: Formatted result presentation

### Provider Flexibility

- Support for multiple LLM providers through a unified interface
- Hot-swappable providers without code changes
- Provider-specific optimizations and feature support
- Local model support for offline operation

### Task Automation

- Single-shot task execution for quick operations
- Iterative task mode for complex, multi-step workflows
- Automatic error recovery and retry logic
- Progress tracking and intermediate result handling

## Language & Technology Stack

- **Language**: Rust (2021 edition)
- **Async Runtime**: Tokio for concurrent operations
- **HTTP Client**: Reqwest for API communications
- **Serialization**: Serde for JSON handling
- **CLI Framework**: Clap for command-line parsing
- **Logging**: Tracing for structured logging
- **Local Models**: llama.cpp with Metal acceleration support

## Use Cases

G3 is designed for:
- Automated code generation and refactoring
- File manipulation and project scaffolding
- System administration tasks
- Data processing and transformation
- API integration and testing
- Documentation generation
- Complex multi-step workflows
- Desktop application automation and testing

- **Multiple LLM Providers**: Anthropic (Claude), Databricks, OpenAI, and local models via llama.cpp
- **Autonomous Mode**: Coach-player feedback loop for complex tasks
- **Intelligent Context Management**: Auto-summarization and context thinning at 50-80% thresholds
- **Rich Tool Ecosystem**: File operations, shell commands, computer control, browser automation
- **Streaming Responses**: Real-time output with tool call detection
- **Error Recovery**: Automatic retry logic with exponential backoff

## Getting Started

@@ -125,56 +17,234 @@ G3 is designed for:

```bash
# Build the project
cargo build --release

# Run G3
cargo run

# Execute a task
# Execute a single task
g3 "implement a function to calculate fibonacci numbers"

# Start autonomous mode with interactive requirements
g3 --autonomous --interactive-requirements
```

## Configuration

Create `~/.config/g3/config.toml`:

```toml
[providers]
default_provider = "databricks"

[providers.anthropic]
api_key = "sk-ant-..."
model = "claude-3-5-sonnet-20241022"
max_tokens = 4096

[providers.databricks]
host = "https://your-workspace.cloud.databricks.com"
model = "databricks-meta-llama-3-1-70b-instruct"
max_tokens = 4096
use_oauth = true

[agent]
max_context_length = 8192
enable_streaming = true

# Optional: Use different models for coach and player in autonomous mode
[autonomous]
coach_provider = "anthropic"
coach_model = "claude-3-5-sonnet-20241022" # Thorough review
player_provider = "databricks"
player_model = "databricks-meta-llama-3-1-70b-instruct" # Fast execution
```

## Autonomous Mode (Coach-Player Loop)

G3 features an autonomous mode where two agents collaborate (a rough sketch of the loop follows this list):
- **Player Agent**: Executes tasks and implements solutions
- **Coach Agent**: Reviews work and provides feedback
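
At a high level the loop looks roughly like this; a simplified pseudocode sketch, not the actual `run_autonomous` implementation:

```rust
// Illustrative pseudocode of the coach-player loop.
let mut feedback = String::new();
for turn in 1..=max_turns {
    // Player implements (or revises) against the requirements plus prior feedback.
    player.execute(&requirements, &feedback)?;
    // Coach reviews the result and produces feedback for the next turn.
    feedback = coach.review(&workspace)?;
    if feedback.contains("IMPLEMENTATION_APPROVED") {
        break; // Coach approved the implementation.
    }
}
```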

### Option 1: Interactive Requirements with AI Enhancement (Recommended)

```bash
g3 --autonomous --interactive-requirements
```

**How it works:**
1. Describe what you want to build (it can be brief)
2. Press **Ctrl+D** (Unix/Mac) or **Ctrl+Z** (Windows)
3. AI enhances your input into a structured requirements document
4. Review the enhanced requirements
5. Choose to proceed, edit manually, or cancel
6. If accepted, autonomous mode starts automatically

**Example:**
```
You type: "build a todo app with cli in python"

AI generates:
# Todo List CLI Application

## Overview
A command-line todo list application built in Python...

## Functional Requirements
1. Add tasks with descriptions
2. Mark tasks as complete
3. Delete tasks
...
```

### Option 2: Direct Requirements

```bash
g3 --autonomous --requirements "Build a REST API with CRUD operations for user management"
```

### Option 3: Requirements File

Create `requirements.md` in your workspace:

```markdown
# Project Requirements

1. Create a REST API with user endpoints
2. Use SQLite for storage
3. Include input validation
4. Write unit tests
```

Then run:

```bash
g3 --autonomous
```

### Why Different Models for Coach and Player?

Configure different models in the `[autonomous]` section to:
- **Optimize Cost**: Use a cheaper model for execution and an expensive one for review
- **Optimize Speed**: Use a fast model for iteration and a thorough one for validation
- **Specialize**: Leverage provider strengths (e.g., Claude for analysis, Llama for code)

If not configured, both agents use the `default_provider` and its model.

## Command-Line Options

```bash
# Autonomous mode
g3 --autonomous --interactive-requirements
g3 --autonomous --requirements "Your requirements"
g3 --autonomous --max-turns 10

# Single-shot mode
g3 "your task here"

# Options
--workspace <DIR>   # Set workspace directory
--provider <NAME>   # Override provider (anthropic, databricks, openai)
--model <NAME>      # Override model
--quiet             # Disable log files
--webdriver         # Enable browser automation
--show-prompt       # Show system prompt
--show-code         # Show generated code
```

## Architecture Overview

G3 is organized as a Rust workspace with multiple crates:

- **g3-core**: Agent engine, context management, tool system, streaming parser
- **g3-providers**: LLM provider abstraction (Anthropic, Databricks, OpenAI, local models)
- **g3-config**: Configuration management
- **g3-execution**: Task execution framework
- **g3-computer-control**: Mouse/keyboard automation, OCR, screenshots
- **g3-cli**: Command-line interface

### Key Capabilities

**Intelligent Context Management**
- Automatic context window monitoring with percentage-based tracking
- Smart auto-summarization when approaching token limits
- Context thinning at 50%, 60%, 70%, 80% thresholds
- Dynamic token allocation (4k to 200k+ tokens)

**Tool Ecosystem**
- File operations (read, write, edit with line-range precision)
- Shell command execution
- TODO management
- Computer control (experimental): mouse, keyboard, OCR, screenshots
- Browser automation via WebDriver (Safari)

**Error Handling**
- Automatic retry logic with exponential backoff
- Recoverable error detection (rate limits, network issues, timeouts)
- Detailed error logging to `logs/errors/`

## WebDriver Browser Automation

G3 includes WebDriver support for browser automation tasks using Safari.

**One-Time Setup** (macOS only):

Safari Remote Automation must be enabled before using WebDriver tools. Run this once:

**One-Time Setup** (macOS):

```bash
# Option 1: Use the provided script
./scripts/enable-safari-automation.sh

# Option 2: Enable manually
# Enable Safari Remote Automation
safaridriver --enable # Requires password

# Option 3: Enable via Safari UI
# Or via Safari UI:
# Safari → Preferences → Advanced → Show Develop menu
# Then: Develop → Allow Remote Automation
```

**For detailed setup instructions and troubleshooting**, see [WebDriver Setup Guide](docs/webdriver-setup.md).

**Usage**:

**Usage**: Run G3 with the `--webdriver` flag to enable browser automation tools.

```bash
g3 --webdriver "scrape the top stories from Hacker News"
```

See [docs/webdriver-setup.md](docs/webdriver-setup.md) for detailed setup.

## Computer Control (Experimental)

G3 can interact with your computer's GUI for automation tasks.

Enable in config:

```toml
[computer_control]
enabled = true
require_confirmation = true
```

Grant accessibility permissions:
- **macOS**: System Preferences → Security & Privacy → Accessibility
- **Linux**: Ensure X11 or Wayland access
- **Windows**: Run as administrator (first time)

**Available Tools**: `mouse_click`, `type_text`, `find_element`, `take_screenshot`, `extract_text`, `find_text_on_screen`, `list_windows`

**Setup**: Enable in config with `computer_control.enabled = true` and grant OS accessibility permissions:
- **macOS**: System Preferences → Security & Privacy → Accessibility
- **Linux**: Ensure X11 or Wayland access
- **Windows**: Run as administrator (first time only)

## Use Cases

- Automated code generation and refactoring
- File manipulation and project scaffolding
- System administration tasks
- Data processing and transformation
- API integration and testing
- Documentation generation
- Complex multi-step workflows
- Desktop application automation

## Session Logs

G3 automatically saves session logs for each interaction in the `logs/` directory. These logs contain:

G3 automatically saves session logs to the `logs/` directory:
- Complete conversation history
- Token usage statistics
- Timestamps and session status

The `logs/` directory is created automatically on first use and is excluded from version control.

Disable with the `--quiet` flag.
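
Based on the fields the coach-feedback extraction reads back elsewhere in this change (`context_window` with `used_tokens` and a `conversation_history` array of `role`/`content` messages), a session log plausibly looks something like the sketch below; the exact schema is an assumption:

```json
{
  "context_window": {
    "used_tokens": 1234,
    "conversation_history": [
      { "role": "user", "content": "implement a fibonacci function" },
      { "role": "assistant", "content": "..." }
    ]
  }
}
```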

## Technology Stack

- **Language**: Rust (2021 edition)
- **Async Runtime**: Tokio
- **HTTP Client**: Reqwest
- **Serialization**: Serde
- **CLI Framework**: Clap
- **Logging**: Tracing
- **Local Models**: llama.cpp with Metal acceleration

## License

@@ -182,4 +252,4 @@ MIT License - see LICENSE file for details

## Contributing

G3 is an open-source project. Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Contributions welcome! Please see CONTRIBUTING.md for guidelines.

@@ -1,24 +0,0 @@
[providers]
default_provider = "databricks"
# Specify different providers for coach and player in autonomous mode
coach = "databricks" # Provider for coach (code reviewer) - can be more powerful/expensive
player = "anthropic" # Provider for player (code implementer) - can be faster/cheaper

[providers.databricks]
host = "https://your-workspace.cloud.databricks.com"
# token = "your-databricks-token" # Optional - will use OAuth if not provided
model = "databricks-claude-sonnet-4"
max_tokens = 4096
temperature = 0.1
use_oauth = true

[providers.anthropic]
api_key = "your-anthropic-api-key"
model = "claude-3-haiku-20240307" # Using a faster model for player
max_tokens = 4096
temperature = 0.3 # Slightly higher temperature for more creative implementations

[agent]
max_context_length = 8192
enable_streaming = true
timeout_seconds = 60
@@ -1,10 +1,5 @@
[providers]
default_provider = "databricks"
# Optional: Specify different providers for coach and player in autonomous mode
# If not specified, will use default_provider for both
# coach = "databricks" # Provider for coach (code reviewer)
# player = "anthropic" # Provider for player (code implementer)
# Note: Make sure the specified providers are configured below

[providers.databricks]
host = "https://your-workspace.cloud.databricks.com"
@@ -103,14 +103,11 @@ fn extract_coach_feedback_from_logs(
    coach_result: &g3_core::TaskResult,
    coach_agent: &g3_core::Agent<ConsoleUiWriter>,
    output: &SimpleOutput,
) -> Result<String> {
    // CORRECT APPROACH: Get the session ID from the current coach agent
    // and read its specific log file directly

) -> String {
    // Get the coach agent's session ID
    let session_id = coach_agent
        .get_session_id()
        .ok_or_else(|| anyhow::anyhow!("Coach agent has no session ID"))?;
        .expect("Coach agent has no session ID");

    // Construct the log file path for this specific coach session
    let logs_dir = std::path::Path::new("logs");
@@ -123,15 +120,75 @@ fn extract_coach_feedback_from_logs(
    if let Some(context_window) = log_json.get("context_window") {
        if let Some(conversation_history) = context_window.get("conversation_history") {
            if let Some(messages) = conversation_history.as_array() {
                // Simply get the last message content - this is the coach's final feedback
                if let Some(last_message) = messages.last() {
                    if let Some(content) = last_message.get("content") {
                        if let Some(content_str) = content.as_str() {
                            output.print(&format!(
                                "✅ Extracted coach feedback from session: {}",
                                session_id
                            ));
                            return Ok(content_str.to_string());
                // Look for the last assistant message (regardless of tool used)
                for message in messages.iter().rev() {
                    if let Some(role) = message.get("role") {
                        if role.as_str() == Some("assistant") {
                            if let Some(content) = message.get("content") {
                                if let Some(content_str) = content.as_str() {
                                    // First, check if this is plain text feedback (no tool call)
                                    // This happens when the coach returns final feedback directly
                                    if !content_str.contains("{\"tool\"") {
                                        let trimmed = content_str.trim();
                                        if !trimmed.is_empty() {
                                            output.print(&format!(
                                                "✅ Extracted coach feedback from session: {} ({} chars) [plain text]",
                                                session_id,
                                                trimmed.len()
                                            ));
                                            return trimmed.to_string();
                                        }
                                    }

                                    // Look for ANY tool call in the message
                                    // Pattern: {"tool": "...", "args": {...}}
                                    if let Some(tool_start) = content_str.find("{\"tool\"") {
                                        let json_part = &content_str[tool_start..];

                                        // Find the end of the JSON object
                                        if let Some(json_end) = find_json_end(json_part) {
                                            let json_str = &json_part[..json_end];

                                            if let Ok(tool_call) = serde_json::from_str::<serde_json::Value>(json_str) {
                                                if let Some(args) = tool_call.get("args") {
                                                    // Try to extract feedback from different possible fields
                                                    let feedback = if let Some(summary) = args.get("summary") {
                                                        // final_output tool uses "summary"
                                                        summary.as_str().map(|s| s.to_string())
                                                    } else if let Some(content) = args.get("content") {
                                                        // todo_write and other tools might use "content"
                                                        content.as_str().map(|s| s.to_string())
                                                    } else {
                                                        // Fallback: use the entire args as JSON string
                                                        Some(serde_json::to_string_pretty(args).unwrap_or_default())
                                                    };

                                                    if let Some(feedback_str) = feedback {
                                                        if !feedback_str.trim().is_empty() {
                                                            output.print(&format!(
                                                                "✅ Extracted coach feedback from session: {} ({} chars)",
                                                                session_id,
                                                                feedback_str.len()
                                                            ));

                                                            // Validate feedback length
                                                            if feedback_str.len() < 80 && !feedback_str.contains("IMPLEMENTATION_APPROVED") {
                                                                panic!(
                                                                    "Coach feedback is too short ({} chars): '{}'",
                                                                    feedback_str.len(),
                                                                    feedback_str
                                                                );
                                                            }

                                                            return feedback_str;
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
@@ -156,6 +213,35 @@ fn extract_coach_feedback_from_logs(
    );
}

/// Helper function to find the end of a JSON object using brace counting
fn find_json_end(json_str: &str) -> Option<usize> {
    let mut depth = 0;
    let mut in_string = false;
    let mut escape_next = false;

    for (i, ch) in json_str.char_indices() {
        if escape_next {
            escape_next = false;
            continue;
        }

        match ch {
            '\\' if in_string => escape_next = true,
            '"' => in_string = !in_string,
            '{' if !in_string => depth += 1,
            '}' if !in_string => {
                depth -= 1;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }

    None
}
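
// A usage sketch (illustrative, not part of the original change): pairing
// `find_json_end` with `str::find` to pull an embedded tool call out of a
// message, as the extraction code above does.
#[cfg(test)]
mod find_json_end_usage_example {
    use super::find_json_end;

    #[test]
    fn extracts_embedded_tool_call() {
        let content = r#"preamble {"tool": "final_output", "args": {"summary": "done"}} trailer"#;
        let start = content.find("{\"tool\"").unwrap();
        let rest = &content[start..];
        let end = find_json_end(rest).unwrap();
        let tool_call: serde_json::Value = serde_json::from_str(&rest[..end]).unwrap();
        assert_eq!(tool_call["tool"], "final_output");
    }
}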

use clap::Parser;
use g3_config::Config;
use g3_core::{project::Project, ui_writer::UiWriter, Agent};

@@ -239,6 +325,10 @@ pub struct Cli {
    /// Disable log file creation (no logs/ directory or session logs)
    #[arg(long)]
    pub quiet: bool,

    /// Enable WebDriver tools for browser automation (Safari)
    #[arg(long)]
    pub webdriver: bool,
}

pub async fn run() -> Result<()> {

@@ -331,19 +421,20 @@ pub async fn run() -> Result<()> {
        cli.model.clone(),
    )?;

    // Create a simple output writer for the enhancement task
    let ui_writer = ConsoleUiWriter::new();
    let mut temp_agent = Agent::new_with_readme_and_quiet(
        temp_config,
        ui_writer,
        None,
        true, // quiet mode
        true, // quiet mode for enhancement
    ).await?;

    // Craft the enhancement prompt
    // Create enhancement prompt
    let enhancement_prompt = format!(
        r#"You are a requirements analyst. Take this brief user input and expand it into a structured requirements document.
        r#"Convert the following user input into a well-structured requirements.md document.

USER INPUT:
User Input:
{}

Create a professional requirements document with:

@@ -433,12 +524,17 @@ Output ONLY the markdown content, no explanations or meta-commentary."#,
    }

    // Load configuration with CLI overrides
    let config = Config::load_with_overrides(
    let mut config = Config::load_with_overrides(
        cli.config.as_deref(),
        cli.provider.clone(),
        cli.model.clone(),
    )?;

    // Override webdriver setting from CLI flag
    if cli.webdriver {
        config.webdriver.enabled = true;
    }
    // Validate provider if specified
    if let Some(ref provider) = cli.provider {
        let valid_providers = ["anthropic", "databricks", "embedded", "openai"];

@@ -466,7 +562,8 @@ Output ONLY the markdown content, no explanations or meta-commentary."#,

    let mut agent = if cli.autonomous {
        Agent::new_autonomous_with_readme_and_quiet(
            config.clone(),
            // Use player-specific config in autonomous mode
            config.for_player()?,
            ui_writer,
            combined_content.clone(),
            cli.quiet,

@@ -1373,6 +1470,10 @@ async fn run_autonomous(
    loop {
        let turn_start_time = Instant::now();
        let turn_start_tokens = agent.get_context_window().used_tokens;

        // Reset filter suppression state at the start of each turn
        g3_core::fixed_filter_json::reset_fixed_json_tool_state();

        // Skip player turn if it's the first turn and implementation files exist
        if !(turn == 1 && skip_first_player) {
            output.print(&format!(

@@ -1535,10 +1636,10 @@ async fn run_autonomous(
    // Use the same config with overrides that was passed to the player agent
    let base_config = agent.get_config().clone();
    let coach_config = base_config.for_coach()?;

    // Reset filter suppression state before creating coach agent
    g3_core::fixed_filter_json::reset_fixed_json_tool_state();

    let ui_writer = ConsoleUiWriter::new();
    let mut coach_agent =
        Agent::new_autonomous_with_readme_and_quiet(coach_config, ui_writer, None, quiet).await?;

@@ -1689,7 +1790,7 @@ Remember: Be clear in your review and concise in your feedback. APPROVE if the i

    // Extract the complete coach feedback from final_output
    let coach_feedback_text =
        extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output)?;
        extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output);

    // Log the size of the feedback for debugging
    info!(

@@ -1716,6 +1817,15 @@ Remember: Be clear in your review and concise in your feedback. APPROVE if the i

    output.print_smart(&format!("Coach feedback:\n{}", coach_feedback_text));

    // Record turn metrics before checking for approval or max turns
    let turn_duration = turn_start_time.elapsed();
    let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
    turn_metrics.push(TurnMetrics {
        turn_number: turn,
        tokens_used: turn_tokens,
        wall_clock_time: turn_duration,
    });

    // Check if coach approved the implementation
    if coach_result.is_approved() || coach_feedback_text.contains("IMPLEMENTATION_APPROVED") {
        output.print("\n=== SESSION COMPLETED - IMPLEMENTATION APPROVED ===");

@@ -1724,6 +1834,7 @@ Remember: Be clear in your review and concise in your feedback. APPROVE if the i
        break;
    }

    // Increment turn counter after recording metrics but before checking max turns
    // Check if we've reached max turns
    if turn >= max_turns {
        output.print("\n=== SESSION COMPLETED - MAX TURNS REACHED ===");

@@ -1733,14 +1844,7 @@ Remember: Be clear in your review and concise in your feedback. APPROVE if the i

    // Store coach feedback for next iteration
    coach_feedback = coach_feedback_text;
    // Record turn metrics before incrementing
    let turn_duration = turn_start_time.elapsed();
    let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
    turn_metrics.push(TurnMetrics {
        turn_number: turn,
        tokens_used: turn_tokens,
        wall_clock_time: turn_duration,
    });

    turn += 1;

    output.print("🔄 Coach provided feedback for next iteration");

@@ -12,6 +12,3 @@ thiserror = { workspace = true }
toml = "0.8"
shellexpand = "3.0"
dirs = "5.0"

[dev-dependencies]
tempfile = "3.8"

crates/g3-config/src/autonomous_config_tests.rs (new file, 131 lines)

@@ -0,0 +1,131 @@
#[cfg(test)]
mod autonomous_config_tests {
    use crate::{Config, AnthropicConfig, DatabricksConfig};

    #[test]
    fn test_default_autonomous_config() {
        let config = Config::default();
        assert!(config.autonomous.coach_provider.is_none());
        assert!(config.autonomous.coach_model.is_none());
        assert!(config.autonomous.player_provider.is_none());
        assert!(config.autonomous.player_model.is_none());
    }

    #[test]
    fn test_for_coach_with_overrides() {
        let mut config = Config::default();

        // Set up base config with anthropic
        config.providers.anthropic = Some(AnthropicConfig {
            api_key: "test-key".to_string(),
            model: "claude-3-5-sonnet-20241022".to_string(),
            max_tokens: Some(4096),
            temperature: Some(0.1),
        });

        // Set coach overrides
        config.autonomous.coach_provider = Some("anthropic".to_string());
        config.autonomous.coach_model = Some("claude-3-opus-20240229".to_string());

        let coach_config = config.for_coach().unwrap();

        // Verify coach uses overridden provider and model
        assert_eq!(coach_config.providers.default_provider, "anthropic");
        assert_eq!(
            coach_config.providers.anthropic.as_ref().unwrap().model,
            "claude-3-opus-20240229"
        );
    }

    #[test]
    fn test_for_player_with_overrides() {
        let mut config = Config::default();

        // Set up base config with databricks
        config.providers.databricks = Some(DatabricksConfig {
            host: "https://test.databricks.com".to_string(),
            token: Some("test-token".to_string()),
            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
            max_tokens: Some(4096),
            temperature: Some(0.1),
            use_oauth: Some(false),
        });

        // Set player overrides
        config.autonomous.player_provider = Some("databricks".to_string());
        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());

        let player_config = config.for_player().unwrap();

        // Verify player uses overridden provider and model
        assert_eq!(player_config.providers.default_provider, "databricks");
        assert_eq!(
            player_config.providers.databricks.as_ref().unwrap().model,
            "databricks-dbrx-instruct"
        );
    }

    #[test]
    fn test_no_overrides_uses_defaults() {
        let mut config = Config::default();
        config.providers.default_provider = "databricks".to_string();

        let coach_config = config.for_coach().unwrap();
        let player_config = config.for_player().unwrap();

        // Both should use the default provider when no overrides
        assert_eq!(coach_config.providers.default_provider, "databricks");
        assert_eq!(player_config.providers.default_provider, "databricks");
    }

    #[test]
    fn test_provider_override_only() {
        let mut config = Config::default();

        config.providers.anthropic = Some(AnthropicConfig {
            api_key: "test-key".to_string(),
            model: "claude-3-5-sonnet-20241022".to_string(),
            max_tokens: Some(4096),
            temperature: Some(0.1),
        });

        // Only override provider, not model
        config.autonomous.coach_provider = Some("anthropic".to_string());

        let coach_config = config.for_coach().unwrap();

        // Should use overridden provider with its default model
        assert_eq!(coach_config.providers.default_provider, "anthropic");
        assert_eq!(
            coach_config.providers.anthropic.as_ref().unwrap().model,
            "claude-3-5-sonnet-20241022"
        );
    }

    #[test]
    fn test_model_override_only() {
        let mut config = Config::default();
        config.providers.default_provider = "databricks".to_string();

        config.providers.databricks = Some(DatabricksConfig {
            host: "https://test.databricks.com".to_string(),
            token: Some("test-token".to_string()),
            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
            max_tokens: Some(4096),
            temperature: Some(0.1),
            use_oauth: Some(false),
        });

        // Only override model, not provider
        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());

        let player_config = config.for_player().unwrap();

        // Should use default provider with overridden model
        assert_eq!(player_config.providers.default_provider, "databricks");
        assert_eq!(
            player_config.providers.databricks.as_ref().unwrap().model,
            "databricks-dbrx-instruct"
        );
    }
}
@@ -2,12 +2,16 @@ use serde::{Deserialize, Serialize};
use anyhow::Result;
use std::path::Path;

#[cfg(test)]
mod autonomous_config_tests;

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
    pub providers: ProvidersConfig,
    pub agent: AgentConfig,
    pub computer_control: ComputerControlConfig,
    pub webdriver: WebDriverConfig,
    pub autonomous: AutonomousConfig,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -17,8 +21,6 @@ pub struct ProvidersConfig {
    pub databricks: Option<DatabricksConfig>,
    pub embedded: Option<EmbeddedConfig>,
    pub default_provider: String,
    pub coach: Option<String>,  // Provider to use for coach in autonomous mode
    pub player: Option<String>, // Provider to use for player in autonomous mode
}

#[derive(Debug, Clone, Serialize, Deserialize)]
@@ -88,6 +90,20 @@ impl Default for WebDriverConfig {
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AutonomousConfig {
    pub coach_provider: Option<String>,
    pub coach_model: Option<String>,
    pub player_provider: Option<String>,
    pub player_model: Option<String>,
}

impl Default for AutonomousConfig {
    fn default() -> Self {
        Self { coach_provider: None, coach_model: None, player_provider: None, player_model: None }
    }
}

impl Default for ComputerControlConfig {
    fn default() -> Self {
        Self {
@@ -114,8 +130,6 @@ impl Default for Config {
            }),
            embedded: None,
            default_provider: "databricks".to_string(),
            coach: None,  // Will use default_provider if not specified
            player: None, // Will use default_provider if not specified
        },
        agent: AgentConfig {
            max_context_length: 8192,
@@ -124,6 +138,7 @@ impl Default for Config {
        },
        computer_control: ComputerControlConfig::default(),
        webdriver: WebDriverConfig::default(),
        autonomous: AutonomousConfig::default(),
        }
    }
}
@@ -228,8 +243,6 @@ impl Config {
            threads: Some(8),
        }),
        default_provider: "embedded".to_string(),
        coach: None,  // Will use default_provider if not specified
        player: None, // Will use default_provider if not specified
    },
    agent: AgentConfig {
        max_context_length: 8192,
@@ -238,6 +251,7 @@ impl Config {
    },
    computer_control: ComputerControlConfig::default(),
    webdriver: WebDriverConfig::default(),
    autonomous: AutonomousConfig::default(),
    }
}

@@ -307,66 +321,77 @@ impl Config {
        Ok(config)
    }

    /// Get the provider to use for coach mode in autonomous execution
    pub fn get_coach_provider(&self) -> &str {
        self.providers.coach
            .as_deref()
            .unwrap_or(&self.providers.default_provider)
    }

    /// Get the provider to use for player mode in autonomous execution
    pub fn get_player_provider(&self) -> &str {
        self.providers.player
            .as_deref()
            .unwrap_or(&self.providers.default_provider)
    }

    /// Create a copy of the config with a different default provider
    pub fn with_provider_override(&self, provider: &str) -> Result<Self> {
        // Validate that the provider is configured
        match provider {
            "anthropic" if self.providers.anthropic.is_none() => {
                return Err(anyhow::anyhow!(
                    "Provider '{}' is specified but not configured. Please add {} configuration to your config file.",
                    provider, provider
                ));
            }
            "databricks" if self.providers.databricks.is_none() => {
                return Err(anyhow::anyhow!(
                    "Provider '{}' is specified but not configured. Please add {} configuration to your config file.",
                    provider, provider
                ));
            }
            "embedded" if self.providers.embedded.is_none() => {
                return Err(anyhow::anyhow!(
                    "Provider '{}' is specified but not configured. Please add {} configuration to your config file.",
                    provider, provider
                ));
            }
            "openai" if self.providers.openai.is_none() => {
                return Err(anyhow::anyhow!(
                    "Provider '{}' is specified but not configured. Please add {} configuration to your config file.",
                    provider, provider
                ));
            }
            _ => {} // Provider is configured or unknown (will be caught later)
    /// Create a config for the coach agent in autonomous mode
    pub fn for_coach(&self) -> Result<Self> {
        let mut config = self.clone();

        // Apply coach-specific overrides if configured
        if let Some(ref coach_provider) = self.autonomous.coach_provider {
            config.providers.default_provider = coach_provider.clone();
        }

        if let Some(ref coach_model) = self.autonomous.coach_model {
            // Apply model override to the coach's provider
            match config.providers.default_provider.as_str() {
                "anthropic" => {
                    if let Some(ref mut anthropic) = config.providers.anthropic {
                        anthropic.model = coach_model.clone();
                    } else {
                        return Err(anyhow::anyhow!(
                            "Coach provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
                        ));
                    }
                }
                "databricks" => {
                    if let Some(ref mut databricks) = config.providers.databricks {
                        databricks.model = coach_model.clone();
                    } else {
                        return Err(anyhow::anyhow!(
                            "Coach provider 'databricks' is not configured. Please add databricks configuration to your config file."
                        ));
                    }
                }
                _ => {}
            }
        }

        let mut config = self.clone();
        config.providers.default_provider = provider.to_string();
        Ok(config)
    }

    /// Create a copy of the config for coach mode in autonomous execution
    pub fn for_coach(&self) -> Result<Self> {
        self.with_provider_override(self.get_coach_provider())
    }

    /// Create a copy of the config for player mode in autonomous execution
    /// Create a config for the player agent in autonomous mode
    pub fn for_player(&self) -> Result<Self> {
        self.with_provider_override(self.get_player_provider())
        let mut config = self.clone();

        // Apply player-specific overrides if configured
        if let Some(ref player_provider) = self.autonomous.player_provider {
            config.providers.default_provider = player_provider.clone();
        }

        if let Some(ref player_model) = self.autonomous.player_model {
            // Apply model override to the player's provider
            match config.providers.default_provider.as_str() {
                "anthropic" => {
                    if let Some(ref mut anthropic) = config.providers.anthropic {
                        anthropic.model = player_model.clone();
                    } else {
                        return Err(anyhow::anyhow!(
                            "Player provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
                        ));
                    }
                }
                "databricks" => {
                    if let Some(ref mut databricks) = config.providers.databricks {
                        databricks.model = player_model.clone();
                    } else {
                        return Err(anyhow::anyhow!(
                            "Player provider 'databricks' is not configured. Please add databricks configuration to your config file."
                        ));
                    }
                }
                _ => {}
            }
        }

        Ok(config)
    }
}

#[cfg(test)]
mod tests;
@@ -1,131 +0,0 @@
#[cfg(test)]
mod tests {
    use crate::Config;
    use std::fs;
    use tempfile::TempDir;

    #[test]
    fn test_coach_player_providers() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");

        // Write a test configuration with coach and player providers
        let config_content = r#"
[providers]
default_provider = "databricks"
coach = "anthropic"
player = "embedded"

[providers.databricks]
host = "https://test.databricks.com"
token = "test-token"
model = "test-model"

[providers.anthropic]
api_key = "test-key"
model = "claude-3"

[providers.embedded]
model_path = "test.gguf"
model_type = "llama"

[agent]
max_context_length = 8192
enable_streaming = true
timeout_seconds = 60
"#;

        fs::write(&config_path, config_content).unwrap();

        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();

        // Test that the providers are correctly identified
        assert_eq!(config.providers.default_provider, "databricks");
        assert_eq!(config.get_coach_provider(), "anthropic");
        assert_eq!(config.get_player_provider(), "embedded");

        // Test creating coach config
        let coach_config = config.for_coach().unwrap();
        assert_eq!(coach_config.providers.default_provider, "anthropic");

        // Test creating player config
        let player_config = config.for_player().unwrap();
        assert_eq!(player_config.providers.default_provider, "embedded");
    }

    #[test]
    fn test_coach_player_fallback_to_default() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");

        // Write a test configuration WITHOUT coach and player providers
        let config_content = r#"
[providers]
default_provider = "databricks"

[providers.databricks]
host = "https://test.databricks.com"
token = "test-token"
model = "test-model"

[agent]
max_context_length = 8192
enable_streaming = true
timeout_seconds = 60
"#;

        fs::write(&config_path, config_content).unwrap();

        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();

        // Test that coach and player fall back to default provider
        assert_eq!(config.get_coach_provider(), "databricks");
        assert_eq!(config.get_player_provider(), "databricks");

        // Test creating coach config (should use default)
        let coach_config = config.for_coach().unwrap();
        assert_eq!(coach_config.providers.default_provider, "databricks");

        // Test creating player config (should use default)
        let player_config = config.for_player().unwrap();
        assert_eq!(player_config.providers.default_provider, "databricks");
    }

    #[test]
    fn test_invalid_provider_error() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");

        // Write a test configuration with an unconfigured provider
        let config_content = r#"
[providers]
default_provider = "databricks"
coach = "openai" # OpenAI is not configured

[providers.databricks]
host = "https://test.databricks.com"
token = "test-token"
model = "test-model"

[agent]
max_context_length = 8192
enable_streaming = true
timeout_seconds = 60
"#;

        fs::write(&config_path, config_content).unwrap();

        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();

        // Test that trying to create a coach config with unconfigured provider fails
        let result = config.for_coach();
        assert!(result.is_err());
        assert!(result.unwrap_err().to_string().contains("not configured"));
    }
}
@@ -599,32 +599,13 @@ impl<W: UiWriter> Agent<W> {
    ) -> Result<Self> {
        let mut providers = ProviderRegistry::new();

        // In autonomous mode, we need to register both coach and player providers
        // Otherwise, only register the default provider
        let providers_to_register: Vec<String> = if is_autonomous {
            let mut providers = vec![config.providers.default_provider.clone()];
            if let Some(coach) = &config.providers.coach {
                if !providers.contains(coach) {
                    providers.push(coach.clone());
                }
            }
            if let Some(player) = &config.providers.player {
                if !providers.contains(player) {
                    providers.push(player.clone());
                }
            }
            providers
        } else {
            vec![config.providers.default_provider.clone()]
        };

        // Only register providers that are configured AND selected as the default provider
        // This prevents unnecessary initialization of heavy providers like embedded models

        // Register embedded provider if configured AND it's the default provider
        if let Some(embedded_config) = &config.providers.embedded {
            if providers_to_register.contains(&"embedded".to_string()) {
                info!("Initializing embedded provider");
            if config.providers.default_provider == "embedded" {
                info!("Initializing embedded provider (selected as default)");
                let embedded_provider = g3_providers::EmbeddedProvider::new(
                    embedded_config.model_path.clone(),
                    embedded_config.model_type.clone(),
@@ -636,31 +617,14 @@ impl<W: UiWriter> Agent<W> {
                )?;
                providers.register(embedded_provider);
            } else {
                info!("Embedded provider configured but not needed, skipping initialization");
            }
        }

        // Register OpenAI provider if configured AND it's the default provider
        if let Some(openai_config) = &config.providers.openai {
            if providers_to_register.contains(&"openai".to_string()) {
                info!("Initializing OpenAI provider");
                let openai_provider = g3_providers::OpenAIProvider::new(
                    openai_config.api_key.clone(),
                    Some(openai_config.model.clone()),
                    openai_config.base_url.clone(),
                    openai_config.max_tokens,
                    openai_config.temperature,
                )?;
                providers.register(openai_provider);
            } else {
                info!("OpenAI provider configured but not needed, skipping initialization");
                info!("Embedded provider configured but not selected as default, skipping initialization");
            }
        }

        // Register Anthropic provider if configured AND it's the default provider
        if let Some(anthropic_config) = &config.providers.anthropic {
            if providers_to_register.contains(&"anthropic".to_string()) {
                info!("Initializing Anthropic provider");
            if config.providers.default_provider == "anthropic" {
                info!("Initializing Anthropic provider (selected as default)");
                let anthropic_provider = g3_providers::AnthropicProvider::new(
                    anthropic_config.api_key.clone(),
                    Some(anthropic_config.model.clone()),
@@ -669,14 +633,14 @@ impl<W: UiWriter> Agent<W> {
                )?;
                providers.register(anthropic_provider);
            } else {
                info!("Anthropic provider configured but not needed, skipping initialization");
                info!("Anthropic provider configured but not selected as default, skipping initialization");
            }
        }

        // Register Databricks provider if configured AND it's the default provider
        if let Some(databricks_config) = &config.providers.databricks {
            if providers_to_register.contains(&"databricks".to_string()) {
                info!("Initializing Databricks provider");
            if config.providers.default_provider == "databricks" {
                info!("Initializing Databricks provider (selected as default)");

                let databricks_provider = if let Some(token) = &databricks_config.token {
                    // Use token-based authentication
@@ -700,7 +664,7 @@ impl<W: UiWriter> Agent<W> {

                providers.register(databricks_provider);
            } else {
                info!("Databricks provider configured but not needed, skipping initialization");
                info!("Databricks provider configured but not selected as default, skipping initialization");
            }
        }

@@ -783,9 +747,6 @@ impl<W: UiWriter> Agent<W> {
                config.agent.max_context_length as u32
            }
        }
        "openai" => {
            192000
        }
        "anthropic" => {
            // Claude models have large context windows
            200000 // Default for Claude models
@@ -1073,6 +1034,7 @@ Template:
        };

        // Get max_tokens from provider configuration
        // For Databricks, this should be much higher to support large file generation
        let max_tokens = match provider.name() {
            "databricks" => {
                // Use the model's maximum limit for Databricks to allow large file generation

@@ -156,9 +156,8 @@ impl AnthropicProvider {
            .post(ANTHROPIC_API_URL)
            .header("x-api-key", &self.api_key)
            .header("anthropic-version", ANTHROPIC_VERSION)
            // Anthropic beta 1m context window. Enable if needed. It costs extra, so check first.
            // .header("anthropic-beta", "context-1m-2025-08-07")
            .header("content-type", "application/json");

        if streaming {
            builder = builder.header("accept", "text/event-stream");
        }

@@ -88,12 +88,10 @@ pub mod anthropic;
pub mod databricks;
pub mod embedded;
pub mod oauth;
pub mod openai;

pub use anthropic::AnthropicProvider;
pub use databricks::DatabricksProvider;
pub use embedded::EmbeddedProvider;
pub use openai::OpenAIProvider;

/// Provider registry for managing multiple LLM providers
pub struct ProviderRegistry {
@@ -1,495 +0,0 @@
use anyhow::Result;
use async_trait::async_trait;
use bytes::Bytes;
use futures_util::stream::StreamExt;
use reqwest::Client;
use serde::Deserialize;
use serde_json::json;
use tokio::sync::mpsc;
use tokio_stream::wrappers::ReceiverStream;
use tracing::{debug, error};

use crate::{
    CompletionChunk, CompletionRequest, CompletionResponse, CompletionStream, LLMProvider,
    Message, MessageRole, Tool, ToolCall, Usage,
};

#[derive(Clone)]
pub struct OpenAIProvider {
    client: Client,
    api_key: String,
    model: String,
    base_url: String,
    max_tokens: Option<u32>,
    _temperature: Option<f32>,
}

impl OpenAIProvider {
    pub fn new(
        api_key: String,
        model: Option<String>,
        base_url: Option<String>,
        max_tokens: Option<u32>,
        temperature: Option<f32>,
    ) -> Result<Self> {
        Ok(Self {
            client: Client::new(),
            api_key,
            model: model.unwrap_or_else(|| "gpt-4o".to_string()),
            base_url: base_url.unwrap_or_else(|| "https://api.openai.com/v1".to_string()),
            max_tokens,
            _temperature: temperature,
        })
    }

    fn create_request_body(
        &self,
        messages: &[Message],
        tools: Option<&[Tool]>,
        stream: bool,
        max_tokens: Option<u32>,
        _temperature: Option<f32>,
    ) -> serde_json::Value {
        let mut body = json!({
            "model": self.model,
            "messages": convert_messages(messages),
            "stream": stream,
        });

        if let Some(max_tokens) = max_tokens.or(self.max_tokens) {
            body["max_completion_tokens"] = json!(max_tokens);
        }

        // OpenAI calls with a temperature setting seem to fail, so don't send one.
        // if let Some(temperature) = temperature.or(self.temperature) {
        //     body["temperature"] = json!(temperature);
        // }

        if let Some(tools) = tools {
            if !tools.is_empty() {
                body["tools"] = json!(convert_tools(tools));
            }
        }

        if stream {
            body["stream_options"] = json!({
                "include_usage": true,
            });
        }

        body
    }

    async fn parse_streaming_response(
        &self,
        mut stream: impl futures_util::Stream<Item = reqwest::Result<Bytes>> + Unpin,
        tx: mpsc::Sender<Result<CompletionChunk>>,
    ) -> Option<Usage> {
        let mut buffer = String::new();
        let mut accumulated_content = String::new();
        let mut accumulated_usage: Option<Usage> = None;
        let mut current_tool_calls: Vec<OpenAIStreamingToolCall> = Vec::new();

        while let Some(chunk_result) = stream.next().await {
            match chunk_result {
                Ok(chunk) => {
                    let chunk_str = match std::str::from_utf8(&chunk) {
                        Ok(s) => s,
                        Err(e) => {
                            error!("Failed to parse chunk as UTF-8: {}", e);
                            continue;
                        }
                    };

                    buffer.push_str(chunk_str);

                    // Process complete lines
                    while let Some(line_end) = buffer.find('\n') {
                        let line = buffer[..line_end].trim().to_string();
                        buffer.drain(..line_end + 1);

                        if line.is_empty() {
                            continue;
                        }

                        // Parse Server-Sent Events format
                        if let Some(data) = line.strip_prefix("data: ") {
                            if data == "[DONE]" {
                                debug!("Received stream completion marker");

                                // Send final chunk with accumulated content and tool calls
                                if !accumulated_content.is_empty() || !current_tool_calls.is_empty() {
                                    let tool_calls = if current_tool_calls.is_empty() {
                                        None
                                    } else {
                                        Some(
                                            current_tool_calls
                                                .iter()
                                                .filter_map(|tc| tc.to_tool_call())
                                                .collect(),
                                        )
                                    };

                                    let final_chunk = CompletionChunk {
                                        content: accumulated_content.clone(),
                                        finished: true,
                                        tool_calls,
                                        usage: accumulated_usage.clone(),
                                    };
                                    let _ = tx.send(Ok(final_chunk)).await;
                                }

                                return accumulated_usage;
                            }

                            // Parse the JSON data
                            match serde_json::from_str::<OpenAIStreamChunk>(data) {
                                Ok(chunk_data) => {
                                    // Handle content
                                    for choice in &chunk_data.choices {
                                        if let Some(content) = &choice.delta.content {
                                            accumulated_content.push_str(content);

                                            let chunk = CompletionChunk {
                                                content: content.clone(),
                                                finished: false,
                                                tool_calls: None,
                                                usage: None,
                                            };
                                            if tx.send(Ok(chunk)).await.is_err() {
                                                debug!("Receiver dropped, stopping stream");
                                                return accumulated_usage;
                                            }
                                        }

                                        // Handle tool calls
                                        if let Some(delta_tool_calls) = &choice.delta.tool_calls {
                                            for delta_tool_call in delta_tool_calls {
                                                if let Some(index) = delta_tool_call.index {
                                                    // Ensure we have enough tool calls in our vector
                                                    while current_tool_calls.len() <= index {
                                                        current_tool_calls
                                                            .push(OpenAIStreamingToolCall::default());
                                                    }

                                                    let tool_call = &mut current_tool_calls[index];

                                                    if let Some(id) = &delta_tool_call.id {
                                                        tool_call.id = Some(id.clone());
                                                    }

                                                    if let Some(function) = &delta_tool_call.function {
                                                        if let Some(name) = &function.name {
                                                            tool_call.name = Some(name.clone());
                                                        }
                                                        if let Some(arguments) = &function.arguments {
                                                            tool_call.arguments.push_str(arguments);
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    }

                                    // Handle usage
                                    if let Some(usage) = chunk_data.usage {
                                        accumulated_usage = Some(Usage {
                                            prompt_tokens: usage.prompt_tokens,
                                            completion_tokens: usage.completion_tokens,
                                            total_tokens: usage.total_tokens,
                                        });
                                    }
                                }
                                Err(e) => {
                                    debug!("Failed to parse stream chunk: {} - Data: {}", e, data);
                                }
                            }
                        }
                    }
                }
                Err(e) => {
                    error!("Stream error: {}", e);
                    let _ = tx.send(Err(anyhow::anyhow!("Stream error: {}", e))).await;
                    return accumulated_usage;
                }
            }
        }

        // Send final chunk if we haven't already
        let tool_calls = if current_tool_calls.is_empty() {
            None
        } else {
            Some(
                current_tool_calls
                    .iter()
                    .filter_map(|tc| tc.to_tool_call())
                    .collect(),
            )
        };

        let final_chunk = CompletionChunk {
            content: String::new(),
            finished: true,
            tool_calls,
            usage: accumulated_usage.clone(),
        };
        let _ = tx.send(Ok(final_chunk)).await;

        accumulated_usage
    }
}

#[async_trait]
impl LLMProvider for OpenAIProvider {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
        debug!(
            "Processing OpenAI completion request with {} messages",
            request.messages.len()
        );

        let body = self.create_request_body(
            &request.messages,
            request.tools.as_deref(),
            false,
            request.max_tokens,
            request.temperature,
        );

        debug!("Sending request to OpenAI API: model={}", self.model);

        let response = self
            .client
            .post(&format!("{}/chat/completions", self.base_url))
            .header("Authorization", format!("Bearer {}", self.api_key))
            .json(&body)
            .send()
            .await?;

        let status = response.status();
        if !status.is_success() {
            let error_text = response
                .text()
                .await
                .unwrap_or_else(|_| "Unknown error".to_string());
            return Err(anyhow::anyhow!("OpenAI API error {}: {}", status, error_text));
        }

        let openai_response: OpenAIResponse = response.json().await?;

        let content = openai_response
            .choices
            .first()
            .and_then(|choice| choice.message.content.clone())
            .unwrap_or_default();

        let usage = Usage {
            prompt_tokens: openai_response.usage.prompt_tokens,
            completion_tokens: openai_response.usage.completion_tokens,
            total_tokens: openai_response.usage.total_tokens,
        };

        debug!(
            "OpenAI completion successful: {} tokens generated",
            usage.completion_tokens
        );

        Ok(CompletionResponse {
            content,
            usage,
            model: self.model.clone(),
        })
    }

    async fn stream(&self, request: CompletionRequest) -> Result<CompletionStream> {
        debug!(
            "Processing OpenAI streaming request with {} messages",
            request.messages.len()
        );

        let body = self.create_request_body(
            &request.messages,
            request.tools.as_deref(),
            true,
            request.max_tokens,
            request.temperature,
        );

        debug!("Sending streaming request to OpenAI API: model={}", self.model);

        let response = self
            .client
            .post(&format!("{}/chat/completions", self.base_url))
            .header("Authorization", format!("Bearer {}", self.api_key))
            .json(&body)
            .send()
            .await?;

        let status = response.status();
        if !status.is_success() {
            let error_text = response
                .text()
                .await
                .unwrap_or_else(|_| "Unknown error".to_string());
            return Err(anyhow::anyhow!("OpenAI API error {}: {}", status, error_text));
        }

        let stream = response.bytes_stream();
        let (tx, rx) = mpsc::channel(100);

        // Spawn task to process the stream
        let provider = self.clone();
        tokio::spawn(async move {
            let usage = provider.parse_streaming_response(stream, tx).await;
            // Log the final usage if available
            if let Some(usage) = usage {
                debug!(
                    "Stream completed with usage - prompt: {}, completion: {}, total: {}",
                    usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
                );
            }
        });

        Ok(ReceiverStream::new(rx))
    }

    fn name(&self) -> &str {
        "openai"
    }

    fn model(&self) -> &str {
        &self.model
    }

    fn has_native_tool_calling(&self) -> bool {
        // OpenAI models support native tool calling
        true
    }
}

fn convert_messages(messages: &[Message]) -> Vec<serde_json::Value> {
    messages
        .iter()
        .map(|msg| {
            json!({
                "role": match msg.role {
                    MessageRole::System => "system",
                    MessageRole::User => "user",
                    MessageRole::Assistant => "assistant",
                },
                "content": msg.content,
            })
        })
        .collect()
}

fn convert_tools(tools: &[Tool]) -> Vec<serde_json::Value> {
    tools
        .iter()
        .map(|tool| {
            json!({
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.input_schema,
                }
            })
        })
        .collect()
}

// OpenAI API response structures
#[derive(Debug, Deserialize)]
struct OpenAIResponse {
    choices: Vec<OpenAIChoice>,
    usage: OpenAIUsage,
}

#[derive(Debug, Deserialize)]
struct OpenAIChoice {
    message: OpenAIMessage,
}

#[allow(dead_code)]
#[derive(Debug, Deserialize)]
struct OpenAIMessage {
    content: Option<String>,
    #[serde(default)]
    tool_calls: Option<Vec<OpenAIToolCall>>,
}

#[allow(dead_code)]
#[derive(Debug, Deserialize)]
struct OpenAIToolCall {
    id: String,
    function: OpenAIFunction,
}

#[allow(dead_code)]
#[derive(Debug, Deserialize)]
struct OpenAIFunction {
    name: String,
    arguments: String,
}

// Streaming tool call accumulator
#[derive(Debug, Default)]
struct OpenAIStreamingToolCall {
    id: Option<String>,
    name: Option<String>,
    arguments: String,
}

impl OpenAIStreamingToolCall {
    fn to_tool_call(&self) -> Option<ToolCall> {
        let id = self.id.as_ref()?;
        let name = self.name.as_ref()?;

        let args = serde_json::from_str(&self.arguments).unwrap_or(serde_json::Value::Null);

        Some(ToolCall {
            id: id.clone(),
            tool: name.clone(),
            args,
        })
    }
}

#[derive(Debug, Deserialize)]
struct OpenAIUsage {
    prompt_tokens: u32,
    completion_tokens: u32,
    total_tokens: u32,
}

// Streaming response structures
#[derive(Debug, Deserialize)]
struct OpenAIStreamChunk {
    choices: Vec<OpenAIStreamChoice>,
    usage: Option<OpenAIUsage>,
}

#[derive(Debug, Deserialize)]
struct OpenAIStreamChoice {
    delta: OpenAIDelta,
}

#[derive(Debug, Deserialize)]
struct OpenAIDelta {
    content: Option<String>,
    #[serde(default)]
    tool_calls: Option<Vec<OpenAIDeltaToolCall>>,
}

#[derive(Debug, Deserialize)]
struct OpenAIDeltaToolCall {
    index: Option<usize>,
    id: Option<String>,
    function: Option<OpenAIDeltaFunction>,
}

#[derive(Debug, Deserialize)]
struct OpenAIDeltaFunction {
    name: Option<String>,
    arguments: Option<String>,
}
@@ -1,75 +0,0 @@
# Coach-Player Provider Configuration

G3 supports specifying different LLM providers for the coach and player agents when running in autonomous mode, so each role can be tuned to its job:

- **Player**: the agent that implements code; often best served by a faster, more cost-effective model
- **Coach**: the agent that reviews code; often best served by a more powerful, analytical model

## Configuration

In your `config.toml` file, under the `[providers]` section, you can specify:

```toml
[providers]
default_provider = "databricks" # Used for normal operations
coach = "databricks"            # Provider for the coach (code reviewer)
player = "anthropic"            # Provider for the player (code implementer)
```

If `coach` or `player` is not specified, that role falls back to the `default_provider`.
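For intuition, the fallback behaves like a tiny resolver on the provider config. The following is a minimal Rust sketch only; the struct and method names below are assumptions for illustration, not the actual g3 source:

```rust
// Illustrative sketch of the provider fallback described above.
// The struct and helper names are assumptions, not the real g3 types.
#[derive(Clone)]
pub struct ProvidersConfig {
    pub default_provider: String,
    pub coach: Option<String>,  // optional override for the coach agent
    pub player: Option<String>, // optional override for the player agent
}

impl ProvidersConfig {
    /// Pick a role's provider, falling back to `default_provider`.
    fn resolve<'a>(&'a self, role: &'a Option<String>) -> &'a str {
        role.as_deref().unwrap_or(&self.default_provider)
    }

    pub fn coach_provider(&self) -> &str {
        self.resolve(&self.coach)
    }

    pub fn player_provider(&self) -> &str {
        self.resolve(&self.player)
    }
}
```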
## Example Use Cases

### Cost Optimization

Use a cheaper, faster model for initial implementations (player) and a more powerful model for review (coach):

```toml
coach = "anthropic"  # Claude Sonnet for thorough review
player = "anthropic" # Claude Haiku for quick implementation
```

### Speed vs Quality Trade-off

Use a local embedded model for fast iterations (player) and a cloud model for quality review (coach):

```toml
coach = "databricks" # Cloud model for quality review
player = "embedded"  # Local model for fast implementation
```

### Specialized Models

Use different models optimized for different tasks:

```toml
coach = "databricks" # Model fine-tuned for code review
player = "openai"    # Model optimized for code generation
```
## Requirements

- Both providers must be properly configured in your config file (an illustrative sketch follows this list)
- Each provider must have valid credentials
- The models specified for each provider must be accessible
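As a hypothetical example of "properly configured", a config wiring up both roles might look like the following. The per-provider section layout and keys shown here are assumptions for illustration; consult your own provider sections for the exact fields:

```toml
# Illustrative only — the per-provider keys are hypothetical.
[providers]
default_provider = "databricks"
coach = "databricks"
player = "anthropic"

[providers.databricks]          # hypothetical section layout
model = "databricks-claude-sonnet"

[providers.anthropic]           # hypothetical section layout
api_key = "env:ANTHROPIC_API_KEY"
model = "claude-3-5-haiku"
```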
## How It Works

When running in autonomous mode (`g3 --autonomous`), the system will:

1. Use the `player` provider (or default) for the initial implementation
2. Switch to the `coach` provider (or default) for code review
3. Return to the `player` provider for implementing feedback
4. Continue this cycle for the specified number of turns

The providers are logged at startup so you can verify which models are being used:

```
🎮 Player provider: anthropic
👨‍🏫 Coach provider: databricks
ℹ️ Using different providers for player and coach
```
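The turn cycle above can be pictured with a short, self-contained sketch. Every type and method here (`Player`, `Coach`, `implement`, `review`, `apply_feedback`) is an illustrative stand-in, not the real g3 API; in g3 the two agents are built from the role-specific provider configurations:

```rust
// Illustrative sketch of the player/coach turn cycle described above.
// All types and methods are stand-ins, not the real g3 API.
struct Review {
    approved: bool,
    notes: String,
}

struct Player; // built with the `player` provider
struct Coach;  // built with the `coach` provider

impl Player {
    fn implement(&self, requirements: &str) -> String {
        format!("implementation of: {requirements}")
    }
    fn apply_feedback(&self, code: &str, review: &Review) -> String {
        format!("{code} [revised: {}]", review.notes)
    }
}

impl Coach {
    fn review(&self, code: &str) -> Review {
        Review {
            approved: code.contains("[revised"),
            notes: "add error handling".to_string(),
        }
    }
}

fn main() {
    let (player, coach) = (Player, Coach);
    let mut code = player.implement("parse CSV input");
    for _turn in 0..5 {                   // bounded by the configured turn count
        let review = coach.review(&code); // coach provider reviews
        if review.approved {
            break;
        }
        code = player.apply_feedback(&code, &review); // player revises
    }
    println!("final: {code}");
}
```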
## Benefits

- **Cost Efficiency**: Use expensive models only where they add the most value
- **Speed Optimization**: Use faster models for iterative development
- **Specialization**: Leverage models that excel at specific tasks
- **Flexibility**: Easy to experiment with different provider combinations