Add --interactive-requirements flag for AI-enhanced requirements mode

- Adds new --interactive-requirements CLI flag for autonomous mode
- Prompts user for brief requirements input
- Uses AI to enhance and structure requirements into proper markdown
- Shows enhanced requirements and allows user to approve/edit/cancel
- Saves to requirements.md and proceeds with autonomous mode if approved
- Includes test script for manual verification
Merge pull request #7 from jochenx/jochen-add-openai-and-multi-providers
2025-10-22 14:58:35 +11:00 · 2025-10-22 13:46:16 +11:00 · 2025-10-22 13:20:45 +11:00 · 2025-10-21 16:59:13 +11:00 · 2025-10-21 16:00:58 +11:00 · 2025-10-21 14:34:41 +11:00
38 changed files with 5268 additions and 432 deletions
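
Review note: the approve/edit/cancel step from the commit message can be read as a small pure function, which would also make it unit-testable. The sketch below mirrors the string `match` in `crates/g3-cli/src/lib.rs` (trim, lowercase, unrecognised input cancels). The `RequirementsChoice` enum and both function names are hypothetical and do not appear in the PR.

```rust
/// Outcome of the confirmation prompt shown after the enhanced
/// requirements are printed. Hypothetical type: the PR matches on strings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequirementsChoice {
    /// Save requirements.md and continue into autonomous mode.
    Approve,
    /// Save requirements.md, then exit so the user can edit it by hand.
    Edit,
    /// Save nothing and exit.
    Cancel,
}

/// Map raw user input to a choice, following the `match` in the diff:
/// input is trimmed and lowercased, and anything unrecognised cancels.
pub fn parse_requirements_choice(input: &str) -> RequirementsChoice {
    match input.trim().to_lowercase().as_str() {
        "y" | "yes" => RequirementsChoice::Approve,
        "e" | "edit" => RequirementsChoice::Edit,
        _ => RequirementsChoice::Cancel,
    }
}

/// In the diff, only the approve and edit branches write requirements.md.
pub fn writes_requirements_file(choice: RequirementsChoice) -> bool {
    matches!(choice, RequirementsChoice::Approve | RequirementsChoice::Edit)
}
```

Keeping the parsing separate from the `println!` and `std::fs::write` calls would let the three branches be covered without driving stdin.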
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -4,7 +4,8 @@ members = [
    "crates/g3-core", 
    "crates/g3-providers",
    "crates/g3-config",
-    "crates/g3-execution"
+    "crates/g3-execution",
    "crates/g3-computer-control"
 ]
 resolver = "2"
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -29,7 +29,8 @@ g3/
 │   ├── g3-core/                  # Core agent engine, tools, and streaming logic
 │   ├── g3-providers/             # LLM provider abstractions and implementations
 │   ├── g3-config/                # Configuration management
-│   └── g3-execution/             # Code execution engine
+│   ├── g3-execution/             # Code execution engine
 │   └── g3-computer-control/      # Computer control and automation
 ├── logs/                         # Session logs (auto-created)
 ├── README.md                     # Project documentation
 └── DESIGN.md                     # This design document
@@ -48,6 +49,7 @@ g3/
 │ • Retro TUI     │    │ • Tool system   │    │ • Embedded      │
 │ • Autonomous    │    │ • Streaming     │    │   (llama.cpp)   │
 │   mode          │    │ • Task exec     │    │ • OAuth flow    │
 │                 │    │ • TODO mgmt     │    │                 │
 └─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
@@ -59,7 +61,18 @@ g3/
                    │ • Shell cmds    │    │ • Env overrides │
                    │ • Streaming     │    │ • Provider      │
                    │ • Error hdlg    │    │   settings      │
-                    └─────────────────┘    └─────────────────┘
+                    └─────────────────┘    │ • Computer      │
                             │              │   control cfg   │
                             │              └─────────────────┘
                             │                       │
                    ┌─────────────────┐             │
                    │ g3-computer-    │◄────────────┘
                    │   control       │
                    │ • Mouse/kbd     │
                    │ • Screenshots   │
                    │ • OCR/Tesseract │
                    │ • Windows/UI    │
                    └─────────────────┘
 ```
 ## Core Components
@@ -79,6 +92,7 @@ g3/
 - **Streaming Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Session Management**: Automatic session logging with detailed conversation history and token usage
 - **Error Recovery**: Sophisticated error classification and retry logic for recoverable errors
 - **TODO Management**: In-memory TODO list with read/write tools for task tracking
 **Available Tools:**
 - `shell`: Execute shell commands with streaming output
@@ -86,7 +100,15 @@ g3/
 - `write_file`: Create or overwrite files with content
 - `str_replace`: Apply unified diffs to files with precise editing
 - `final_output`: Signal task completion with detailed summaries
- **Project Management**: Workspace handling, requirements.md processing for autonomous mode
+- `todo_read`: Read the entire TODO list content
 - `todo_write`: Write or overwrite the entire TODO list
 - `mouse_click`: Click the mouse at specific coordinates
 - `type_text`: Type text at the current cursor position
 - `find_element`: Find UI elements by text, role, or attributes
 - `take_screenshot`: Capture screenshots of screen, region, or window
 - `extract_text`: Extract text from images or screen regions using OCR
 - `find_text_on_screen`: Find text visually on screen and return coordinates
 - `list_windows`: List all open windows with IDs and titles
 ### 2. g3-providers: LLM Provider Abstraction
@@ -172,6 +194,26 @@ g3/
 - **Validation**: Configuration validation with helpful error messages
 - **Flexible Paths**: Support for shell expansion (`~`, environment variables)
 ### 6. g3-computer-control: Computer Control & Automation
 **Primary Responsibilities:**
 - Cross-platform computer control and automation
 - Mouse and keyboard input simulation
 - Window management and screenshot capture
 - OCR text extraction from images and screen regions
 **Platform Support:**
 - **macOS**: Core Graphics, Cocoa, screencapture integration
 - **Linux**: X11/Xtest for input, X11 for window management
 - **Windows**: Win32 APIs for input and window control
 **Key Features:**
 - **OCR Integration**: Tesseract-based text extraction from images
 - **Window Management**: List, identify, and capture specific application windows
 - **UI Automation**: Find elements, simulate clicks, type text
 - **Screenshot Capture**: Full screen, regions, or specific windows
 - **Accessibility**: Requires OS-level permissions for automation
 ## Advanced Features
 ### Context Window Management
@@ -180,6 +222,7 @@ G3 implements sophisticated context window management:
 - **Automatic Monitoring**: Tracks token usage with percentage-based thresholds
 - **Smart Summarization**: Auto-triggers at 80% capacity to prevent context overflow
 - **Context Thinning**: Progressive thinning at 50%, 60%, 70%, 80% thresholds - replaces large tool results with file references
 - **Conversation Preservation**: Maintains conversation continuity through intelligent summaries
 - **Provider-Specific Limits**: Adapts to different model context windows (4k to 200k+ tokens)
 - **Cumulative Tracking**: Monitors total token usage across entire sessions
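
Review note: the thinning thresholds listed above (50%, 60%, 70%, 80%) could be checked with something like the following. This is a minimal sketch, not the g3-core implementation: the function names, the `u32` token counts, and the rounding-down behaviour are all assumptions.

```rust
/// Thinning thresholds named in the design notes, as percentages of capacity.
const THINNING_THRESHOLDS: [u8; 4] = [50, 60, 70, 80];

/// Percentage of the context window in use, rounded down and capped at 100.
/// Returns None when the window size is zero.
pub fn percent_used(used_tokens: u32, max_tokens: u32) -> Option<u8> {
    if max_tokens == 0 {
        return None;
    }
    // Widen to u64 so the multiplication by 100 cannot overflow.
    let pct = (used_tokens as u64 * 100) / max_tokens as u64;
    Some(pct.min(100) as u8)
}

/// Highest thinning threshold that current usage has reached, if any.
pub fn highest_threshold_crossed(used_tokens: u32, max_tokens: u32) -> Option<u8> {
    let pct = percent_used(used_tokens, max_tokens)?;
    THINNING_THRESHOLDS
        .iter()
        .copied()
        .filter(|&threshold| pct >= threshold)
        .max()
}
```

With the example `max_context_length = 8192`, 5000 used tokens is about 61% of capacity, so the 60% threshold would be the highest one reached.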
@@ -354,20 +397,23 @@ This design document reflects the current state of G3 as a mature, production-re
 ### Fully Implemented
 - ✅ **Core Agent Engine**: Complete with streaming, tool execution, and context management
 - ✅ **Provider System**: Anthropic, Databricks, and Embedded providers with OAuth support
- ✅ **Tool System**: All 5 core tools (shell, read_file, write_file, str_replace, final_output)
+- ✅ **Tool System**: 13 tools including file ops, shell, TODO management, and computer control
 - ✅ **CLI Interface**: Interactive mode, single-shot mode, retro TUI
 - ✅ **Autonomous Mode**: Coach-player feedback loop with requirements.md processing
 - ✅ **Configuration**: TOML-based config with environment overrides
 - ✅ **Error Handling**: Comprehensive retry logic and error classification
 - ✅ **Session Logging**: Automatic session tracking and JSON logs
- ✅ **Context Management**: Auto-summarization at 80% capacity
+- ✅ **Context Management**: Context thinning (50-80%) and auto-summarization at 80% capacity
 - ✅ **Computer Control**: Cross-platform automation with OCR support
 - ✅ **TODO Management**: In-memory TODO list with read/write tools
 ### Architecture Highlights
- **Workspace**: 5 crates with clear separation of concerns
+- **Workspace**: 6 crates with clear separation of concerns
 - **Dependencies**: Modern Rust ecosystem (Tokio, Clap, Serde, etc.)
 - **Streaming**: Real-time response processing with tool call detection
 - **Cross-Platform**: Works on macOS, Linux, and Windows
- **GPU Support**: Metal acceleration for local models on macOS
+- **GPU Support**: Metal acceleration for local models on macOS, CUDA on Linux
 - **OCR Support**: Tesseract integration for text extraction from images
 ### Key Files
 - `src/main.rs`: main entry point delegating to g3-cli
@@ -376,3 +422,5 @@ This design document reflects the current state of G3 as a mature, production-re
 - `crates/g3-providers/src/lib.rs`: provider trait and registry
 - `crates/g3-config/src/lib.rs`: configuration management
 - `crates/g3-execution/src/lib.rs`: code execution engine
 - `crates/g3-computer-control/src/lib.rs`: computer control and automation
 - `crates/g3-computer-control/src/platform/`: platform-specific implementations
--- a/README.md
+++ b/README.md
@@ -11,8 +11,8 @@ G3 follows a modular architecture organized as a Rust workspace with multiple cr
 #### **g3-core**
 The heart of the agent system, containing:
 - **Agent Engine**: Main orchestration logic for handling conversations, tool execution, and task management
- **Context Window Management**: Intelligent tracking of token usage with auto-summarization capabilities when approaching context limits (~80% capacity)
+- **Context Window Management**: Intelligent tracking of token usage with context thinning (50-80%) and auto-summarization at 80% capacity
- **Tool System**: Built-in tools for file operations (read, write, edit), shell command execution, and structured output generation
+- **Tool System**: Built-in tools for file operations, shell commands, computer control, TODO management, and structured output
 - **Streaming Response Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Task Execution**: Support for single and iterative task execution with automatic retry logic
@@ -40,6 +40,13 @@ Task execution framework:
 - Error handling and retry mechanisms
 - Progress tracking and reporting
 #### **g3-computer-control**
 Computer control capabilities:
 - Mouse and keyboard automation
 - UI element inspection and interaction
 - Screenshot capture and window management
 - OCR text extraction via Tesseract
 #### **g3-cli**
 Command-line interface:
 - Interactive terminal interface
@@ -61,13 +68,21 @@ G3 includes robust error handling with automatic retry logic:
 ### Intelligent Context Management
 - Automatic context window monitoring with percentage-based tracking
 - Smart auto-summarization when approaching token limits
 - **Context thinning** at 50%, 60%, 70%, 80% thresholds - automatically replaces large tool results with file references
 - Conversation history preservation through summaries
- Dynamic token allocation for different providers
+- Dynamic token allocation for different providers (4k to 200k+ tokens)
 ### Tool Ecosystem
 - **File Operations**: Read, write, and edit files with line-range precision
 - **Shell Integration**: Execute system commands with output capture
 - **Code Generation**: Structured code generation with syntax awareness
 - **TODO Management**: Read and write TODO lists with markdown checkbox format
 - **Computer Control** (Experimental): Automate desktop applications
  - Mouse and keyboard control
  - UI element inspection
  - Screenshot capture and window management
  - OCR text extraction from images and screen regions
  - Window listing and identification
 - **Final Output**: Formatted result presentation
 ### Provider Flexibility
@@ -98,10 +113,11 @@ G3 is designed for:
 - Automated code generation and refactoring
 - File manipulation and project scaffolding
 - System administration tasks
- Data processing and transformation
+- Data processing and transformation  
 - API integration and testing
 - Documentation generation
 - Complex multi-step workflows
 - Desktop application automation and testing
 ## Getting Started
@@ -116,6 +132,41 @@ cargo run
 g3 "implement a function to calculate fibonacci numbers"
 ```
 ## WebDriver Browser Automation
 G3 includes WebDriver support for browser automation tasks using Safari.
 **One-Time Setup** (macOS only):
 Safari Remote Automation must be enabled before using WebDriver tools. Run this once:
 ```bash
 # Option 1: Use the provided script
 ./scripts/enable-safari-automation.sh
 # Option 2: Enable manually
 safaridriver --enable  # Requires password
 # Option 3: Enable via Safari UI
 # Safari → Preferences → Advanced → Show Develop menu
 # Then: Develop → Allow Remote Automation
 ```
 **For detailed setup instructions and troubleshooting**, see [WebDriver Setup Guide](docs/webdriver-setup.md).
 **Usage**: Run G3 with the `--webdriver` flag to enable browser automation tools.
 ## Computer Control (Experimental)
 G3 can interact with your computer's GUI for automation tasks:
 **Available Tools**: `mouse_click`, `type_text`, `find_element`, `take_screenshot`, `extract_text`, `find_text_on_screen`, `list_windows`
 **Setup**: Enable in config with `computer_control.enabled = true` and grant OS accessibility permissions:
 - **macOS**: System Preferences → Security & Privacy → Accessibility  
 - **Linux**: Ensure X11 or Wayland access
 - **Windows**: Run as administrator (first time only)
 ## Session Logs
 G3 automatically saves session logs for each interaction in the `logs/` directory. These logs contain:
--- a/config.coach-player.example.toml
+++ b/config.coach-player.example.toml
@@ -0,0 +1,24 @@
 [providers]
 default_provider = "databricks"
 # Specify different providers for coach and player in autonomous mode
 coach = "databricks"    # Provider for coach (code reviewer) - can be more powerful/expensive
 player = "anthropic"    # Provider for player (code implementer) - can be faster/cheaper
 [providers.databricks]
 host = "https://your-workspace.cloud.databricks.com"
 # token = "your-databricks-token"  # Optional - will use OAuth if not provided
 model = "databricks-claude-sonnet-4"
 max_tokens = 4096
 temperature = 0.1
 use_oauth = true
 [providers.anthropic]
 api_key = "your-anthropic-api-key"
 model = "claude-3-haiku-20240307"  # Using a faster model for player
 max_tokens = 4096
 temperature = 0.3  # Slightly higher temperature for more creative implementations
 [agent]
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
--- a/config.example.toml
+++ b/config.example.toml
@@ -1,5 +1,10 @@
 [providers]
 default_provider = "databricks"
 # Optional: Specify different providers for coach and player in autonomous mode
 # If not specified, will use default_provider for both
 # coach = "databricks"    # Provider for coach (code reviewer)
 # player = "anthropic"    # Provider for player (code implementer)
 # Note: Make sure the specified providers are configured below
 [providers.databricks]
 host = "https://your-workspace.cloud.databricks.com"
@@ -13,3 +18,8 @@ use_oauth = true
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
 [computer_control]
 enabled = false  # Set to true to enable computer control (requires OS permissions)
 require_confirmation = true
 max_actions_per_second = 5
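
Review note: the config comments say that `coach` and `player` fall back to `default_provider` when unset. A minimal sketch of that fallback is below. The `ProviderSelection` struct, the `Role` enum, and `provider_for` are hypothetical names for illustration; the PR itself calls `for_coach()` on the config, whose body is not shown in this diff.

```rust
/// Which side of the coach-player loop is asking for a provider.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Coach,
    Player,
}

/// Hypothetical mirror of the `[providers]` keys in config.example.toml.
pub struct ProviderSelection {
    pub default_provider: String,
    pub coach: Option<String>,
    pub player: Option<String>,
}

impl ProviderSelection {
    /// Return the role-specific provider if one is set,
    /// otherwise fall back to `default_provider`.
    pub fn provider_for(&self, role: Role) -> &str {
        let chosen = match role {
            Role::Coach => self.coach.as_deref(),
            Role::Player => self.player.as_deref(),
        };
        chosen.unwrap_or(self.default_provider.as_str())
    }
}
```

The config note "Make sure the specified providers are configured below" suggests a further check that the resolved name has a matching `[providers.<name>]` table; that check is left out here.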
--- a/crates/g3-cli/src/lib.rs
+++ b/crates/g3-cli/src/lib.rs
@@ -1,7 +1,5 @@
 use anyhow::Result;
 use std::time::{Duration, Instant};
 /// Extract coach feedback by reading from the coach agent's specific log file
 /// Uses the coach agent's session ID to find the exact log file
 #[derive(Debug, Clone)]
 struct TurnMetrics {
@@ -21,7 +19,7 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    // Find max values for scaling
    let max_tokens = turn_metrics.iter().map(|t| t.tokens_used).max().unwrap_or(1);
    let max_time_ms = turn_metrics.iter()
-        .map(|t| t.wall_clock_time.as_millis() as u32)
+        .map(|t| t.wall_clock_time.as_millis().min(u32::MAX as u128) as u32)
        .max()
        .unwrap_or(1);
@@ -35,7 +33,7 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    histogram.push_str(&format!("   {} = Wall Clock Time (max: {:.1}s)\n\n", TIME_CHAR, max_time_ms as f64 / 1000.0));
    for metrics in turn_metrics {
-        let turn_time_ms = metrics.wall_clock_time.as_millis() as u32;
+        let turn_time_ms = metrics.wall_clock_time.as_millis().min(u32::MAX as u128) as u32;
        // Calculate bar lengths (proportional to max values)
        let token_bar_len = if max_tokens > 0 {
@@ -99,18 +97,25 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    histogram
 }
-fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_agent: &g3_core::Agent<ConsoleUiWriter>, output: &SimpleOutput) -> Result<String> {
+/// Extract coach feedback by reading from the coach agent's specific log file
 /// Uses the coach agent's session ID to find the exact log file
 fn extract_coach_feedback_from_logs(
    coach_result: &g3_core::TaskResult,
    coach_agent: &g3_core::Agent<ConsoleUiWriter>,
    output: &SimpleOutput,
 ) -> Result<String> {
    // CORRECT APPROACH: Get the session ID from the current coach agent
    // and read its specific log file directly
-    
+
    // Get the coach agent's session ID
-    let session_id = coach_agent.get_session_id()
+    let session_id = coach_agent
        .get_session_id()
        .ok_or_else(|| anyhow::anyhow!("Coach agent has no session ID"))?;
-    
+
    // Construct the log file path for this specific coach session
    let logs_dir = std::path::Path::new("logs");
    let log_file_path = logs_dir.join(format!("g3_session_{}.json", session_id));
-    
+
    // Read the coach agent's specific log file
    if log_file_path.exists() {
        if let Ok(log_content) = std::fs::read_to_string(&log_file_path) {
@@ -122,7 +127,10 @@ fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_a
                            if let Some(last_message) = messages.last() {
                                if let Some(content) = last_message.get("content") {
                                    if let Some(content_str) = content.as_str() {
-                                        output.print(&format!("✅ Extracted coach feedback from session: {}", session_id));
+                                        output.print(&format!(
                                            "✅ Extracted coach feedback from session: {}",
                                            session_id
                                        ));
                                        return Ok(content_str.to_string());
                                    }
                                }
@@ -133,8 +141,19 @@ fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_a
            }
        }
    }
-    
+
-    Err(anyhow::anyhow!("Could not extract feedback from coach session: {}", session_id))
+    // If we couldn't extract from logs, panic with detailed error
    panic!(
        "CRITICAL: Could not extract coach feedback from session: {}\n\
         Log file path: {:?}\n\
         Log file exists: {}\n\
         This indicates the coach did not call any tool or the log is corrupted.\n\
         Coach result response length: {} chars",
        session_id,
        log_file_path,
        log_file_path.exists(),
        coach_result.response.len()
    );
 }
 use clap::Parser;
@@ -197,6 +216,10 @@ pub struct Cli {
    #[arg(long, value_name = "TEXT")]
    pub requirements: Option<String>,
    /// Interactive mode: prompt for requirements and save to requirements.md before starting autonomous mode
    #[arg(long)]
    pub interactive_requirements: bool,
    /// Use retro terminal UI (inspired by 80s sci-fi)
    #[arg(long)]
    pub retro: bool,
@@ -284,6 +307,112 @@ pub async fn run() -> Result<()> {
    // Create project model
    let project = if cli.autonomous {
        // Handle interactive requirements mode with AI enhancement
        if cli.interactive_requirements {
            println!("\n📝 Interactive Requirements Mode");
            println!("================================\n");
            println!("Describe what you want to build (can be brief):");
            println!("Press Ctrl+D (Unix) or Ctrl+Z (Windows) when done.\n");
            use std::io::{self, Read, Write};
            let mut requirements_input = String::new();
            io::stdin().read_to_string(&mut requirements_input)?;
            if requirements_input.trim().is_empty() {
                anyhow::bail!("No requirements provided. Exiting.");
            }
            println!("\n🤖 Enhancing your requirements with AI...\n");
            // Create a temporary agent to enhance the requirements
            let temp_config = Config::load_with_overrides(
                cli.config.as_deref(),
                cli.provider.clone(),
                cli.model.clone(),
            )?;
            let ui_writer = ConsoleUiWriter::new();
            let mut temp_agent = Agent::new_with_readme_and_quiet(
                temp_config,
                ui_writer,
                None,
                true, // quiet mode
            ).await?;
            // Craft the enhancement prompt
            let enhancement_prompt = format!(
                r#"You are a requirements analyst. Take this brief user input and expand it into a structured requirements document.
 USER INPUT:
 {}
 Create a professional requirements document with:
 1. A clear project title (# heading)
 2. An overview section explaining what will be built
 3. Organized requirements (functional, technical, quality)
 4. Acceptance criteria
 5. Any technical constraints or preferences mentioned
 Format as proper markdown. Be specific and actionable. If the user's input is vague, make reasonable assumptions but keep it focused on what they described.
 Output ONLY the markdown content, no explanations or meta-commentary."#,
                requirements_input.trim()
            );
            // Execute enhancement task
            let result = temp_agent
                .execute_task_with_timing(&enhancement_prompt, None, false, false, false, false)
                .await?;
            let enhanced_requirements = result.response.trim().to_string();
            // Show the enhanced requirements
            println!("\n📋 Enhanced Requirements Document:");
            println!("{}\n", "=".repeat(60));
            println!("{}", enhanced_requirements);
            println!("{}\n", "=".repeat(60));
            // Ask for confirmation
            println!("\n❓ Is this requirements document acceptable?");
            println!("   [y] Yes, proceed with autonomous mode");
            println!("   [e] Edit and save manually");
            println!("   [n] No, cancel\n");
            print!("Your choice (y/e/n): ");
            io::stdout().flush()?;
            let mut choice = String::new();
            io::stdin().read_line(&mut choice)?;
            let choice = choice.trim().to_lowercase();
            let requirements_path = workspace_dir.join("requirements.md");
            match choice.as_str() {
                "y" | "yes" => {
                    // Save enhanced requirements
                    std::fs::write(&requirements_path, &enhanced_requirements)?;
                    println!("\n✅ Requirements saved to: {}", requirements_path.display());
                    println!("🚀 Starting autonomous mode...\n");
                }
                "e" | "edit" => {
                    // Save enhanced requirements for manual editing
                    std::fs::write(&requirements_path, &enhanced_requirements)?;
                    println!("\n✅ Requirements saved to: {}", requirements_path.display());
                    println!("📝 Please edit the file and run: g3 --autonomous");
                    println!("   Exiting for now.\n");
                    return Ok(());
                }
                "n" | "no" => {
                    println!("\n❌ Cancelled. No files were saved.\n");
                    return Ok(());
                }
                _ => {
                    println!("\n❌ Invalid choice. Cancelled.\n");
                    return Ok(());
                }
            }
        }
        if let Some(requirements_text) = cli.requirements {
            // Use requirements text override
            Project::new_autonomous_with_requirements(workspace_dir.clone(), requirements_text)?
@@ -309,14 +438,15 @@ pub async fn run() -> Result<()> {
        cli.provider.clone(),
        cli.model.clone(),
    )?;
-    
+
    // Validate provider if specified
    if let Some(ref provider) = cli.provider {
        let valid_providers = ["anthropic", "databricks", "embedded", "openai"];
        if !valid_providers.contains(&provider.as_str()) {
            return Err(anyhow::anyhow!(
-                "Invalid provider '{}'. Valid options: {:?}", 
+                "Invalid provider '{}'. Valid options: {:?}",
-                provider, valid_providers
+                provider,
                valid_providers
            ));
        }
    }
@@ -335,9 +465,21 @@ pub async fn run() -> Result<()> {
    };
    let mut agent = if cli.autonomous {
-        Agent::new_autonomous_with_readme_and_quiet(config.clone(), ui_writer, combined_content.clone(), cli.quiet).await?
+        Agent::new_autonomous_with_readme_and_quiet(
            config.clone(),
            ui_writer,
            combined_content.clone(),
            cli.quiet,
        )
        .await?
    } else {
-        Agent::new_with_readme_and_quiet(config.clone(), ui_writer, combined_content.clone(), cli.quiet).await?
+        Agent::new_with_readme_and_quiet(
            config.clone(),
            ui_writer,
            combined_content.clone(),
            cli.quiet,
        )
        .await?
    };
    // Execute task, autonomous mode, or start interactive mode
@@ -374,7 +516,7 @@ pub async fn run() -> Result<()> {
        if cli.retro {
            // Use retro terminal UI
            run_interactive_retro(
-                config,  // Already has overrides applied
+                config, // Already has overrides applied
                cli.show_prompt,
                cli.show_code,
                cli.theme,
@@ -1119,7 +1261,10 @@ async fn run_autonomous(
        output.print("❌ Error: requirements.md not found in workspace directory");
        output.print("   Please either:");
        output.print("   1. Create a requirements.md file with your project requirements at:");
-        output.print(&format!("      {}/requirements.md", project.workspace().display()));
+        output.print(&format!(
            "      {}/requirements.md",
            project.workspace().display()
        ));
        output.print("   2. Or use the --requirements flag to provide requirements text directly:");
        output.print("      g3 --autonomous --requirements \"Your requirements here\"");
        output.print("");
@@ -1254,11 +1399,17 @@ async fn run_autonomous(
            // If there's no coach feedback on subsequent turns, this is an error
            if coach_feedback.is_empty() {
                if turn > 1 {
-                    return Err(anyhow::anyhow!("Player mode error: No coach feedback received on turn {}", turn));
+                    return Err(anyhow::anyhow!(
                        "Player mode error: No coach feedback received on turn {}",
                        turn
                    ));
                }
                output.print("📋 Player starting initial implementation (no prior coach feedback)");
            } else {
-                output.print(&format!("📋 Player received coach feedback ({} chars):", coach_feedback.len()));
+                output.print(&format!(
                    "📋 Player received coach feedback ({} chars):",
                    coach_feedback.len()
                ));
                output.print(&format!("{}", coach_feedback));
            }
            output.print(""); // Empty line for readability
@@ -1356,7 +1507,7 @@ async fn run_autonomous(
                ));
                // Record turn metrics before incrementing
                let turn_duration = turn_start_time.elapsed();
-                let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+                let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
                turn_metrics.push(TurnMetrics {
                    turn_number: turn,
                    tokens_used: turn_tokens,
@@ -1382,9 +1533,15 @@ async fn run_autonomous(
        // Create a new agent instance for coach mode to ensure fresh context
        // Use the same config with overrides that was passed to the player agent
-        let config = agent.get_config().clone();
+        let base_config = agent.get_config().clone();
        let coach_config = base_config.for_coach()?;
        // Reset filter suppression state before creating coach agent
        g3_core::fixed_filter_json::reset_fixed_json_tool_state();
        let ui_writer = ConsoleUiWriter::new();
-        let mut coach_agent = Agent::new_autonomous_with_readme_and_quiet(config, ui_writer, None, quiet).await?;
+        let mut coach_agent =
            Agent::new_autonomous_with_readme_and_quiet(coach_config, ui_writer, None, quiet).await?;
        // Ensure coach agent is also in the workspace directory
        project.enter_workspace()?;
@@ -1414,13 +1571,13 @@ CRITICAL INSTRUCTIONS:
 3. Focus ONLY on what needs to be fixed or improved
 4. Do NOT include your analysis process, file contents, or compilation output in the summary
-If the implementation correctly meets all requirements and compiles without errors:
+If the implementation generally meets all requirements and compiles without errors:
 - Call final_output with summary: 'IMPLEMENTATION_APPROVED'
 If improvements are needed:
 - Call final_output with a brief summary listing ONLY the specific issues to fix
-Remember: Be thorough in your review but concise in your feedback. APPROVE if the implementation works and generally fits the requirements.",
+Remember: Be clear in your review and concise in your feedback. APPROVE if the implementation works and generally fits the requirements. Don't be picky.",
            requirements
        );
@@ -1511,7 +1668,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
            coach_feedback = "The implementation needs review. Please ensure all requirements are met and the code compiles without errors.".to_string();
            // Record turn metrics before incrementing
            let turn_duration = turn_start_time.elapsed();
-            let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+            let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
            turn_metrics.push(TurnMetrics {
                turn_number: turn,
                tokens_used: turn_tokens,
@@ -1531,7 +1688,8 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
        let coach_result = coach_result_opt.unwrap();
        // Extract the complete coach feedback from final_output
-        let coach_feedback_text = extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output)?;
+        let coach_feedback_text =
            extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output)?;
        // Log the size of the feedback for debugging
        info!(
@@ -1546,7 +1704,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
            coach_feedback = "The implementation needs review. Please ensure all requirements are met and the code compiles without errors.".to_string();
            // Record turn metrics before incrementing
            let turn_duration = turn_start_time.elapsed();
-            let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+            let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
            turn_metrics.push(TurnMetrics {
                turn_number: turn,
                tokens_used: turn_tokens,
@@ -1577,7 +1735,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
        coach_feedback = coach_feedback_text;
        // Record turn metrics before incrementing
        let turn_duration = turn_start_time.elapsed();
-        let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+        let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
        turn_metrics.push(TurnMetrics {
            turn_number: turn,
            tokens_used: turn_tokens,
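
Review note on the `saturating_sub` change in the hunks above. Plain `-` on unsigned integers panics in debug builds (and wraps in release builds) when the right-hand side is larger. That could plausibly happen here if the context window's `used_tokens` shrinks during a turn, for example after the context is compacted; that cause is an assumption, since the diff does not state it. A minimal sketch of the difference follows. The function and parameter names are hypothetical stand-ins for the counters in the diff, and the `u32` width is assumed.

```rust
/// Tokens consumed during a turn, clamped at zero.
///
/// `used_now` and `used_at_start` are hypothetical stand-ins for
/// `agent.get_context_window().used_tokens` and `turn_start_tokens`.
pub fn turn_tokens(used_now: u32, used_at_start: u32) -> u32 {
    // saturating_sub returns 0 instead of underflowing when
    // used_now < used_at_start.
    used_now.saturating_sub(used_at_start)
}

/// Same computation, but reports the underflow case explicitly
/// instead of hiding it behind a zero.
pub fn turn_tokens_checked(used_now: u32, used_at_start: u32) -> Option<u32> {
    used_now.checked_sub(used_at_start)
}
```

`saturating_sub` keeps the metrics code panic-free at the cost of recording a zero for such turns; `checked_sub` would be the alternative if those turns should be logged or flagged instead.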
--- a/crates/g3-cli/src/ui_writer_impl.rs
+++ b/crates/g3-cli/src/ui_writer_impl.rs
@@ -10,6 +10,7 @@ pub struct ConsoleUiWriter {
    current_tool_args: Mutex<Vec<(String, String)>>,
    current_output_line: Mutex<Option<String>>,
    output_line_printed: Mutex<bool>,
    in_todo_tool: Mutex<bool>,
 }
 impl ConsoleUiWriter {
@@ -19,6 +20,60 @@ impl ConsoleUiWriter {
            current_tool_args: Mutex::new(Vec::new()),
            current_output_line: Mutex::new(None),
            output_line_printed: Mutex::new(false),
            in_todo_tool: Mutex::new(false),
        }
    }
    fn print_todo_line(&self, line: &str) {
        // Transform and print todo list lines elegantly
        let trimmed = line.trim();
        // Skip the "📝 TODO list:" prefix line
        if trimmed.starts_with("📝 TODO list:") || trimmed == "📝 TODO list is empty" {
            return;
        }
        // Handle empty lines
        if trimmed.is_empty() {
            println!();
            return;
        }
        // Detect indentation level
        let indent_count = line.chars().take_while(|c| c.is_whitespace()).count();
        let indent = "  ".repeat(indent_count / 2); // Convert spaces to visual indent
        // Format based on line type
        if trimmed.starts_with("- [ ]") {
            // Incomplete task
            let task = trimmed.strip_prefix("- [ ]").unwrap_or(trimmed).trim();
            println!("{}☐ {}", indent, task);
        } else if trimmed.starts_with("- [x]") || trimmed.starts_with("- [X]") {
            // Completed task
            let task = trimmed.strip_prefix("- [x]")
                .or_else(|| trimmed.strip_prefix("- [X]"))
                .unwrap_or(trimmed)
                .trim();
            println!("{}\x1b[2m☑ {}\x1b[0m", indent, task);
        } else if trimmed.starts_with("- ") {
            // Regular bullet point
            let item = trimmed.strip_prefix("- ").unwrap_or(trimmed).trim();
            println!("{}• {}", indent, item);
        } else if trimmed.starts_with("# ") {
            // Heading
            let heading = trimmed.strip_prefix("# ").unwrap_or(trimmed).trim();
            println!("\n\x1b[1m{}\x1b[0m", heading);
        } else if trimmed.starts_with("## ") {
            // Subheading
            let subheading = trimmed.strip_prefix("## ").unwrap_or(trimmed).trim();
            println!("\n\x1b[1m{}\x1b[0m", subheading);
        } else if trimmed.starts_with("**") && trimmed.ends_with("**") {
            // Bold text (section marker)
            let text = trimmed.trim_start_matches("**").trim_end_matches("**");
            println!("{}\x1b[1m{}\x1b[0m", indent, text);
        } else {
            // Regular text or note
            println!("{}{}", indent, trimmed);
        }
    }
 }
@@ -53,6 +108,15 @@ impl UiWriter for ConsoleUiWriter {
        // Store the tool name and clear args for collection
        *self.current_tool_name.lock().unwrap() = Some(tool_name.to_string());
        self.current_tool_args.lock().unwrap().clear();
        // Check if this is a todo tool call
        let is_todo = tool_name == "todo_read" || tool_name == "todo_write";
        *self.in_todo_tool.lock().unwrap() = is_todo;
        // For todo tools, we'll skip the normal header and print a custom one later
        if is_todo {
            return;
        }
    }
    fn print_tool_arg(&self, key: &str, value: &str) {
@@ -75,6 +139,12 @@ impl UiWriter for ConsoleUiWriter {
    }
    fn print_tool_output_header(&self) {
        // Skip normal header for todo tools
        if *self.in_todo_tool.lock().unwrap() {
            println!(); // Just add a newline
            return;
        }
        println!();
        // Now print the tool header with the most important arg in bold green
        if let Some(tool_name) = self.current_tool_name.lock().unwrap().as_ref() {
@@ -115,8 +185,8 @@ impl UiWriter for ConsoleUiWriter {
                    String::new()
                };
-                // Print with bold green formatting using ANSI escape codes
+                // Print with bold green tool name, purple (non-bold) for pipe and args
-                println!("┌─\x1b[1;32m {} | {}{}\x1b[0m", tool_name, display_value, header_suffix);
+                println!("┌─\x1b[1;32m {}\x1b[0m\x1b[35m | {}{}\x1b[0m", tool_name, display_value, header_suffix);
            } else {
                // Print with bold green formatting using ANSI escape codes
                println!("┌─\x1b[1;32m {}\x1b[0m", tool_name);
@@ -144,10 +214,21 @@ impl UiWriter for ConsoleUiWriter {
    }
    fn print_tool_output_line(&self, line: &str) {
        // Special handling for todo tools
        if *self.in_todo_tool.lock().unwrap() {
            self.print_todo_line(line);
            return;
        }
        println!("│ \x1b[2m{}\x1b[0m", line);
    }
    fn print_tool_output_summary(&self, count: usize) {
        // Skip for todo tools
        if *self.in_todo_tool.lock().unwrap() {
            return;
        }
        println!(
            "│ \x1b[2m({} line{})\x1b[0m",
            count,
@@ -156,7 +237,55 @@ impl UiWriter for ConsoleUiWriter {
    }
    fn print_tool_timing(&self, duration_str: &str) {
-        println!("└─ ⚡️ {}", duration_str);
+        // For todo tools, just print a simple completion message
        if *self.in_todo_tool.lock().unwrap() {
            println!();
            *self.in_todo_tool.lock().unwrap() = false;
            return;
        }
        // Parse the duration string to determine color
        // Format is like "1.5s", "500ms", "2m 30.0s"
        let color_code = if duration_str.ends_with("ms") {
            // Milliseconds - use default color (< 1s)
            ""
        } else if duration_str.contains('m') {
            // Contains minutes
            // Extract minutes value
            if let Some(m_pos) = duration_str.find('m') {
                if let Ok(minutes) = duration_str[..m_pos].trim().parse::<u32>() {
                    if minutes >= 5 {
                        "\x1b[31m" // Red for >= 5 minutes
                    } else {
                        "\x1b[38;5;208m" // Orange for >= 1 minute but < 5 minutes
                    }
                } else {
                    "" // Default color if parsing fails
                }
            } else {
                "" // Default color if 'm' not found (shouldn't happen)
            }
        } else if duration_str.ends_with('s') {
            // Seconds only
            if let Some(s_value) = duration_str.strip_suffix('s') {
                if let Ok(seconds) = s_value.trim().parse::<f64>() {
                    if seconds >= 1.0 {
                        "\x1b[33m" // Yellow for >= 1 second
                    } else {
                        "" // Default color for < 1 second
                    }
                } else {
                    "" // Default color if parsing fails
                }
            } else {
                "" // Default color
            }
        } else {
            // Unrecognized format - use default color
            ""
        };
        println!("└─ ⚡️ {}{}\x1b[0m", color_code, duration_str);
        println!();
        // Clear the stored tool info
        *self.current_tool_name.lock().unwrap() = None;
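
Review note on `print_tool_timing`: the nested color selection above might be easier to test if pulled into a pure function. The sketch below is one possible refactor, not the code in this PR. The function name `timing_color` is hypothetical; the thresholds and ANSI codes are copied from the hunk, and the input formats ("500ms", "1.5s", "2m 30.0s") are the ones listed in its comment.

```rust
/// Pick an ANSI color prefix for a formatted duration string.
/// Returns "" when the default terminal color should be used.
pub fn timing_color(duration_str: &str) -> &'static str {
    // Milliseconds are always under one second: default color.
    if duration_str.ends_with("ms") {
        return "";
    }
    // "2m 30.0s": everything before the first 'm' is the minute count.
    if let Some((minutes, _rest)) = duration_str.split_once('m') {
        return match minutes.trim().parse::<u32>() {
            Ok(m) if m >= 5 => "\x1b[31m",  // red: 5 minutes or more
            Ok(_) => "\x1b[38;5;208m",      // orange: under 5 minutes
            Err(_) => "",                   // unparseable: default color
        };
    }
    // "1.5s": seconds only.
    match duration_str
        .strip_suffix('s')
        .and_then(|s| s.trim().parse::<f64>().ok())
    {
        Some(secs) if secs >= 1.0 => "\x1b[33m", // yellow: 1 second or more
        _ => "",
    }
}
```

With this shape, the printing code would reduce to a single `println!` that prefixes `duration_str` with `timing_color(duration_str)`.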
--- a/crates/g3-computer-control/Cargo.toml
+++ b/crates/g3-computer-control/Cargo.toml
@@ -0,0 +1,46 @@
 [package]
 name = "g3-computer-control"
 version = "0.1.0"
 edition = "2021"
 [dependencies]
 # Workspace dependencies
 tokio = { workspace = true }
 anyhow = { workspace = true }
 thiserror = { workspace = true }
 serde = { workspace = true }
 serde_json = { workspace = true }
 tracing = { workspace = true }
 uuid = { workspace = true }
 shellexpand = "3.1"
 # Async trait support
 async-trait = "0.1"
 # WebDriver support
 fantoccini = "0.21"
 # OCR dependencies
 tesseract = "0.14"
 # macOS dependencies
 [target.'cfg(target_os = "macos")'.dependencies]
 core-graphics = "0.23"
 core-foundation = "0.9"
 cocoa = "0.25"
 objc = "0.2"
 image = "0.24"
 # Linux dependencies
 [target.'cfg(target_os = "linux")'.dependencies]
 x11 = { version = "2.21", features = ["xlib", "xtest"] }
 image = "0.24"
 # Windows dependencies
 [target.'cfg(target_os = "windows")'.dependencies]
 windows = { version = "0.52", features = [
    "Win32_Foundation",
    "Win32_UI_WindowsAndMessaging",
    "Win32_UI_Input_KeyboardAndMouse",
    "Win32_Graphics_Gdi",
 ] }
--- a/crates/g3-computer-control/examples/debug_screenshot.rs
+++ b/crates/g3-computer-control/examples/debug_screenshot.rs
@@ -0,0 +1,46 @@
 use core_graphics::display::CGDisplay;
 fn main() {
    let display = CGDisplay::main();
    let image = display.image().expect("Failed to capture screen");
    println!("CGImage properties:");
    println!("  Width: {}", image.width());
    println!("  Height: {}", image.height());
    println!("  Bits per component: {}", image.bits_per_component());
    println!("  Bits per pixel: {}", image.bits_per_pixel());
    println!("  Bytes per row: {}", image.bytes_per_row());
    let data = image.data();
    let expected_size = image.width() * image.height() * 4;
    println!("  Data length: {}", data.len());
    println!("  Expected (w*h*4): {}", expected_size);
    // Check if there's padding in rows
    let bytes_per_row = image.bytes_per_row();
    let width = image.width();
    let expected_bytes_per_row = width * 4;
    println!("\nRow alignment:");
    println!("  Actual bytes per row: {}", bytes_per_row);
    println!("  Expected (width * 4): {}", expected_bytes_per_row);
    println!("  Padding per row: {}", bytes_per_row - expected_bytes_per_row);
    // Sample some pixels from different locations
    println!("\nFirst 3 pixels (raw bytes):");
    for i in 0..3 {
        let offset = i * 4;
        println!("  Pixel {}: [{:3}, {:3}, {:3}, {:3}]", 
                 i, data[offset], data[offset+1], data[offset+2], data[offset+3]);
    }
    // Check a pixel from the middle
    let mid_row = image.height() / 2;
    let mid_col = image.width() / 2;
    let mid_offset = (mid_row * bytes_per_row + mid_col * 4) as usize;
    println!("\nMiddle pixel (row {}, col {}):", mid_row, mid_col);
    println!("  Offset: {}", mid_offset);
    if mid_offset + 3 < data.len() as usize {
        println!("  Bytes: [{:3}, {:3}, {:3}, {:3}]", 
                 data[mid_offset], data[mid_offset+1], data[mid_offset+2], data[mid_offset+3]);
    }
 }
--- a/crates/g3-computer-control/examples/list_windows.rs
+++ b/crates/g3-computer-control/examples/list_windows.rs
@@ -0,0 +1,56 @@
 use core_graphics::window::{kCGWindowListOptionOnScreenOnly, kCGNullWindowID, CGWindowListCopyWindowInfo};
 use core_foundation::dictionary::CFDictionary;
 use core_foundation::string::CFString;
 use core_foundation::base::TCFType;
 fn main() {
    println!("Listing all on-screen windows...");
    println!("{:<10} {:<25} {}", "Window ID", "Owner", "Title");
    println!("{}", "-".repeat(80));
    unsafe {
        let window_list = CGWindowListCopyWindowInfo(
            kCGWindowListOptionOnScreenOnly,
            kCGNullWindowID
        );
        // Wrap the returned array exactly once: each create-rule wrapper releases
        // the array on drop, so wrapping it twice would over-release it.
        let array = core_foundation::array::CFArray::<CFDictionary>::wrap_under_create_rule(window_list);
        let count = array.len();
        for i in 0..count {
            let dict = array.get(i).unwrap();
            // Get window ID
            let window_id_key = CFString::from_static_string("kCGWindowNumber");
            let window_id: i64 = if let Some(value) = dict.find(window_id_key.as_concrete_TypeRef()) {
                let num: core_foundation::number::CFNumber = TCFType::wrap_under_get_rule(*value as *const _);
                num.to_i64().unwrap_or(0)
            } else {
                0
            };
            // Get owner name
            let owner_key = CFString::from_static_string("kCGWindowOwnerName");
            let owner: String = if let Some(value) = dict.find(owner_key.as_concrete_TypeRef()) {
                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
                s.to_string()
            } else {
                "Unknown".to_string()
            };
            // Get window name/title
            let name_key = CFString::from_static_string("kCGWindowName");
            let title: String = if let Some(value) = dict.find(name_key.as_concrete_TypeRef()) {
                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
                s.to_string()
            } else {
                "".to_string()
            };
            // Only print windows owned by iTerm or Terminal
            if owner.contains("iTerm") || owner.contains("Terminal") {
                println!("{:<10} {:<25} {}", window_id, owner, title);
            }
        }
    }
 }
--- a/crates/g3-computer-control/examples/safari_demo.rs
+++ b/crates/g3-computer-control/examples/safari_demo.rs
@@ -0,0 +1,64 @@
 use g3_computer_control::SafariDriver;
 use g3_computer_control::webdriver::WebDriverController;
 use anyhow::Result;
 #[tokio::main]
 async fn main() -> Result<()> {
    println!("Safari WebDriver Demo");
    println!("=====================\n");
    println!("Make sure to:");
    println!("1. Enable 'Allow Remote Automation' in Safari's Develop menu");
    println!("2. Run: /usr/bin/safaridriver --enable");
    println!("3. Start safaridriver in another terminal: safaridriver --port 4444\n");
    println!("Connecting to SafariDriver...");
    let mut driver = SafariDriver::new().await?;
    println!("✅ Connected!\n");
    // Navigate to a website
    println!("Navigating to example.com...");
    driver.navigate("https://example.com").await?;
    println!("✅ Navigated\n");
    // Get page title
    let title = driver.title().await?;
    println!("Page title: {}\n", title);
    // Get current URL
    let url = driver.current_url().await?;
    println!("Current URL: {}\n", url);
    // Find an element
    println!("Finding h1 element...");
    let mut h1 = driver.find_element("h1").await?;
    let h1_text = h1.text().await?;
    println!("H1 text: {}\n", h1_text);
    // Find all paragraphs
    println!("Finding all paragraphs...");
    let paragraphs = driver.find_elements("p").await?;
    println!("Found {} paragraphs\n", paragraphs.len());
    // Get page source
    println!("Getting page source...");
    let source = driver.page_source().await?;
    println!("Page source length: {} bytes\n", source.len());
    // Execute JavaScript
    println!("Executing JavaScript...");
    let result = driver.execute_script("return document.title", vec![]).await?;
    println!("JS result: {:?}\n", result);
    // Take a screenshot
    println!("Taking screenshot...");
    driver.screenshot("/tmp/safari_demo.png").await?;
    println!("✅ Screenshot saved to /tmp/safari_demo.png\n");
    // Close the browser
    println!("Closing browser...");
    driver.quit().await?;
    println!("✅ Done!");
    Ok(())
 }
--- a/crates/g3-computer-control/examples/test_permission_prompt.rs
+++ b/crates/g3-computer-control/examples/test_permission_prompt.rs
@@ -0,0 +1,21 @@
 use g3_computer_control::{create_controller, ComputerController};
 #[tokio::main]
 async fn main() {
    println!("Testing screenshot with permission prompt...");
    let controller = create_controller().expect("Failed to create controller");
    match controller.take_screenshot("/tmp/test_with_prompt.png", None, None).await {
        Ok(_) => {
            println!("\n✅ Screenshot saved to /tmp/test_with_prompt.png");
            println!("Opening screenshot...");
            let _ = std::process::Command::new("open")
                .arg("/tmp/test_with_prompt.png")
                .spawn();
        }
        Err(e) => {
            println!("❌ Screenshot failed: {}", e);
        }
    }
 }
--- a/crates/g3-computer-control/examples/test_screencapture_direct.rs
+++ b/crates/g3-computer-control/examples/test_screencapture_direct.rs
@@ -0,0 +1,39 @@
 use std::process::Command;
 fn main() {
    let path = "/tmp/rust_screencapture_test.png";
    println!("Testing screencapture command from Rust...");
    let mut cmd = Command::new("screencapture");
    cmd.arg("-x"); // No sound
    cmd.arg(path);
    println!("Command: {:?}", cmd);
    match cmd.output() {
        Ok(output) => {
            println!("Exit status: {}", output.status);
            println!("Stdout: {}", String::from_utf8_lossy(&output.stdout));
            println!("Stderr: {}", String::from_utf8_lossy(&output.stderr));
            if output.status.success() {
                println!("\n✅ Screenshot saved to: {}", path);
                // Check file exists and size
                if let Ok(metadata) = std::fs::metadata(path) {
                    println!("File size: {} bytes ({:.1} MB)", metadata.len(), metadata.len() as f64 / 1_000_000.0);
                }
                // Open it
                let _ = Command::new("open").arg(path).spawn();
                println!("\nOpened screenshot - please verify it looks correct!");
            } else {
                println!("\n❌ Screenshot failed!");
            }
        }
        Err(e) => {
            println!("❌ Failed to execute screencapture: {}", e);
        }
    }
 }
--- a/crates/g3-computer-control/examples/test_screenshot_fix.rs
+++ b/crates/g3-computer-control/examples/test_screenshot_fix.rs
@@ -0,0 +1,69 @@
 use core_graphics::display::CGDisplay;
 use image::{ImageBuffer, RgbaImage};
 use std::path::Path;
 fn main() {
    let display = CGDisplay::main();
    let image = display.image().expect("Failed to capture screen");
    let width = image.width() as u32;
    let height = image.height() as u32;
    let bytes_per_row = image.bytes_per_row() as usize;
    let data = image.data();
    println!("Testing screenshot fix...");
    println!("Image: {}x{}, bytes_per_row: {}", width, height, bytes_per_row);
    println!("Expected bytes per row: {}", width * 4);
    println!("Padding per row: {} bytes", bytes_per_row - (width as usize * 4));
    // OLD METHOD (broken) - treating data as continuous
    println!("\n=== OLD METHOD (BROKEN) ===");
    let mut old_rgba = Vec::with_capacity(data.len() as usize);
    for chunk in data.chunks_exact(4) {
        old_rgba.push(chunk[2]); // R
        old_rgba.push(chunk[1]); // G
        old_rgba.push(chunk[0]); // B
        old_rgba.push(chunk[3]); // A
    }
    println!("Converted {} pixels", old_rgba.len() / 4);
    println!("Expected {} pixels", width * height);
    // NEW METHOD (fixed) - handling row padding
    println!("\n=== NEW METHOD (FIXED) ===");
    let mut new_rgba = Vec::with_capacity((width * height * 4) as usize);
    for row in 0..height as usize {
        let row_start = row * bytes_per_row;
        let row_end = row_start + (width as usize * 4);
        for chunk in data[row_start..row_end].chunks_exact(4) {
            new_rgba.push(chunk[2]); // R
            new_rgba.push(chunk[1]); // G
            new_rgba.push(chunk[0]); // B
            new_rgba.push(chunk[3]); // A
        }
    }
    println!("Converted {} pixels", new_rgba.len() / 4);
    println!("Expected {} pixels", width * height);
    // Save a small crop from both methods
    let crop_size = 200;
    // Old method crop
    let old_crop: Vec<u8> = old_rgba.iter().take((crop_size * crop_size * 4) as usize).copied().collect();
    if let Some(old_img) = ImageBuffer::from_raw(crop_size, crop_size, old_crop) {
        let old_img: RgbaImage = old_img;
        old_img.save("/tmp/screenshot_old_method.png").unwrap();
        println!("\nSaved OLD method crop to: /tmp/screenshot_old_method.png");
    }
    // New method crop
    let new_crop: Vec<u8> = new_rgba.iter().take((crop_size * crop_size * 4) as usize).copied().collect();
    if let Some(new_img) = ImageBuffer::from_raw(crop_size, crop_size, new_crop) {
        let new_img: RgbaImage = new_img;
        new_img.save("/tmp/screenshot_new_method.png").unwrap();
        println!("Saved NEW method crop to: /tmp/screenshot_new_method.png");
    }
    println!("\nOpen both images to compare:");
    println!("  open /tmp/screenshot_old_method.png /tmp/screenshot_new_method.png");
 }
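
Review note on the row-padding fix exercised by `test_screenshot_fix.rs`: the "new method" can be expressed as a standalone, bounds-checked helper that works on any byte slice, which makes it testable without a display. This is a sketch under assumptions: the helper name `bgra_rows_to_rgba` is hypothetical, and the source buffer is assumed to be 4 bytes per pixel in B, G, R, A order, as the channel swap in the example implies.

```rust
/// Convert a padded BGRA buffer to tightly packed RGBA.
///
/// `bytes_per_row` is the stride of the source buffer and may exceed
/// `width * 4` when rows are padded. Returns `None` if the stride is
/// too small or the buffer is shorter than the dimensions require.
pub fn bgra_rows_to_rgba(
    data: &[u8],
    width: usize,
    height: usize,
    bytes_per_row: usize,
) -> Option<Vec<u8>> {
    let row_len = width.checked_mul(4)?;
    if bytes_per_row < row_len {
        return None;
    }
    let mut rgba = Vec::with_capacity(row_len.checked_mul(height)?);
    for row in 0..height {
        // Step by the stride, but read only the visible pixels of each row.
        let start = row.checked_mul(bytes_per_row)?;
        let pixels = data.get(start..start.checked_add(row_len)?)?;
        for px in pixels.chunks_exact(4) {
            // Swap B and R; keep G and A in place.
            rgba.extend_from_slice(&[px[2], px[1], px[0], px[3]]);
        }
    }
    Some(rgba)
}
```

Using `get` instead of direct slicing turns a malformed buffer into `None` rather than a panic, which seems preferable for screenshot data coming from the OS.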
--- a/crates/g3-computer-control/examples/test_window_capture.rs
+++ b/crates/g3-computer-control/examples/test_window_capture.rs
@@ -0,0 +1,45 @@
 use g3_computer_control::create_controller;
 #[tokio::main]
 async fn main() {
    println!("Testing window-specific screenshot capture...");
    let controller = create_controller().expect("Failed to create controller");
    // Test 1: Capture iTerm2 window
    println!("\n1. Capturing iTerm2 window...");
    match controller.take_screenshot("/tmp/iterm_window.png", None, Some("iTerm2")).await {
        Ok(_) => {
            println!("   ✅ iTerm2 window captured to /tmp/iterm_window.png");
            let _ = std::process::Command::new("open").arg("/tmp/iterm_window.png").spawn();
        }
        Err(e) => println!("   ❌ Failed: {}", e),
    }
    // Wait a moment for the image to open
    tokio::time::sleep(tokio::time::Duration::from_secs(2)).await;
    // Test 2: Full screen capture for comparison
    println!("\n2. Capturing full screen for comparison...");
    match controller.take_screenshot("/tmp/fullscreen.png", None, None).await {
        Ok(_) => {
            println!("   ✅ Full screen captured to /tmp/fullscreen.png");
            let _ = std::process::Command::new("open").arg("/tmp/fullscreen.png").spawn();
        }
        Err(e) => println!("   ❌ Failed: {}", e),
    }
    println!("\n=== Comparison ===");
    println!("iTerm window:  /tmp/iterm_window.png (should show ONLY iTerm window)");
    println!("Full screen:   /tmp/fullscreen.png (should show entire desktop)");
    // Show file sizes
    if let Ok(meta1) = std::fs::metadata("/tmp/iterm_window.png") {
        if let Ok(meta2) = std::fs::metadata("/tmp/fullscreen.png") {
            println!("\nFile sizes:");
            println!("  iTerm window: {:.1} MB", meta1.len() as f64 / 1_000_000.0);
            println!("  Full screen:  {:.1} MB", meta2.len() as f64 / 1_000_000.0);
            println!("\nWindow capture should be smaller than full screen.");
        }
    }
 }
--- a/crates/g3-computer-control/src/lib.rs
+++ b/crates/g3-computer-control/src/lib.rs
@@ -0,0 +1,35 @@
 pub mod types;
 pub mod platform;
 pub mod webdriver;
 // Re-export webdriver types for convenience
 pub use webdriver::{WebDriverController, WebElement, safari::SafariDriver};
 use anyhow::Result;
 use async_trait::async_trait;
 use types::*;
 #[async_trait]
 pub trait ComputerController: Send + Sync {
    // Screen capture
    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()>;
    // OCR operations
    async fn extract_text_from_screen(&self, region: Rect) -> Result<String>;
    async fn extract_text_from_image(&self, path: &str) -> Result<String>;
 }
 // Platform-specific constructor
 pub fn create_controller() -> Result<Box<dyn ComputerController>> {
    #[cfg(target_os = "macos")]
    return Ok(Box::new(platform::macos::MacOSController::new()?));
    #[cfg(target_os = "linux")]
    return Ok(Box::new(platform::linux::LinuxController::new()?));
    #[cfg(target_os = "windows")]
    return Ok(Box::new(platform::windows::WindowsController::new()?));
    #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
    anyhow::bail!("Unsupported platform")
 }
--- a/crates/g3-computer-control/src/platform/linux.rs
+++ b/crates/g3-computer-control/src/platform/linux.rs
@@ -0,0 +1,161 @@
 use crate::{ComputerController, types::*};
 use anyhow::Result;
 use async_trait::async_trait;
 use tesseract::Tesseract;
 use uuid::Uuid;
 pub struct LinuxController {
    // Placeholder for X11 connection or other state
 }
 impl LinuxController {
    pub fn new() -> Result<Self> {
        // Initialize X11 connection
        tracing::warn!("Linux computer control not fully implemented");
        Ok(Self {})
    }
 }
 #[async_trait]
 impl ComputerController for LinuxController {
    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn click(&self, _button: MouseButton) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn double_click(&self, _button: MouseButton) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn type_text(&self, _text: &str) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn press_key(&self, _key: &str) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn list_windows(&self) -> Result<Vec<Window>> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn focus_window(&self, _window_id: &str) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
        anyhow::bail!("Linux implementation not yet available")
    }
    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("which")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(output) if output.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract:\n  \
                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
                RHEL/CentOS:   sudo yum install tesseract\n  \
                Arch Linux:    sudo pacman -S tesseract\n\n\
                After installation, restart your terminal and try again.");
        }
        // Initialize Tesseract
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
            })?;
        let text = tess.set_image(_path)
            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
        // Get confidence (simplified - would need more complex API calls for per-word confidence)
        let confidence = 0.85; // Placeholder
        Ok(OCRResult {
            text,
            confidence,
            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
        })
    }
    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("which")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(output) if output.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract:\n  \
                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
                RHEL/CentOS:   sudo yum install tesseract\n  \
                Arch Linux:    sudo pacman -S tesseract\n\n\
                After installation, restart your terminal and try again.");
        }
        // Take full screen screenshot
        let temp_path = format!("/tmp/g3_ocr_search_{}.png", Uuid::new_v4());
        self.take_screenshot(&temp_path, None, None).await?;
        // Use Tesseract to find text with bounding boxes
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
            })?;
        let full_text = tess.set_image(temp_path.as_str())
            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
        // Clean up temp file
        let _ = std::fs::remove_file(&temp_path);
        // Simple text search - full implementation would use get_component_images
        // to get bounding boxes for each word
        if full_text.contains(_text) {
            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
            Ok(Some(Point { x: 0, y: 0 }))
        } else {
            Ok(None)
        }
    }
 }
--- a/crates/g3-computer-control/src/platform/macos.rs
+++ b/crates/g3-computer-control/src/platform/macos.rs
@@ -0,0 +1,125 @@
 use crate::{ComputerController, types::Rect};
 use anyhow::Result;
 use async_trait::async_trait;
 use std::path::Path;
 use tesseract::Tesseract;
 pub struct MacOSController {
    // Empty struct for now
 }
 impl MacOSController {
    pub fn new() -> Result<Self> {
        Ok(Self {})
    }
 }
 #[async_trait]
 impl ComputerController for MacOSController {
    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
        // Determine the temporary directory for screenshots
        let temp_dir = std::env::var("TMPDIR")
            .or_else(|_| std::env::var("HOME").map(|h| format!("{}/tmp", h)))
            .unwrap_or_else(|_| "/tmp".to_string());
        // Ensure temp directory exists
        std::fs::create_dir_all(&temp_dir)?;
        // If path is relative or doesn't specify a directory, use temp_dir
        let final_path = if path.starts_with('/') {
            path.to_string()
        } else {
            format!("{}/{}", temp_dir.trim_end_matches('/'), path)
        };
        let path_obj = Path::new(&final_path);
        if let Some(parent) = path_obj.parent() {
            std::fs::create_dir_all(parent)?;
        }
        let mut cmd = std::process::Command::new("screencapture");
        // Add flags
        cmd.arg("-x"); // No sound
        if let Some(region) = region {
            // Capture specific region: -R x,y,width,height
            cmd.arg("-R");
            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
        }
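        if let Some(app_name) = window_id {
            // `window_id` is treated as an application name here; AppleScript
            // resolves it to the numeric ID of that app's first window.
            // Escape backslashes and quotes so the name cannot break out of
            // the AppleScript string literal.
            let escaped = app_name.replace('\\', "\\\\").replace('"', "\\\"");
            let script = format!(r#"tell application "{}" to id of window 1"#, escaped);
            let output = std::process::Command::new("osascript")
                .arg("-e")
                .arg(&script)
                .output()?;
            if output.status.success() {
                let window_id_str = String::from_utf8_lossy(&output.stdout).trim().to_string();
                cmd.arg(format!("-l{}", window_id_str));
            } else {
                // Without a window ID, screencapture falls back to the full screen
                tracing::warn!("Could not resolve a window ID for '{}', capturing full screen", app_name);
            }
        }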
        cmd.arg(&final_path);
        let screenshot_result = cmd.output()?;
        if !screenshot_result.status.success() {
            let stderr = String::from_utf8_lossy(&screenshot_result.stderr);
            return Err(anyhow::anyhow!("screencapture failed: {}", stderr));
        }
        Ok(())
    }
    async fn extract_text_from_screen(&self, region: Rect) -> Result<String> {
        // Take screenshot of region first
        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
        self.take_screenshot(&temp_path, Some(region), None).await?;
        // Extract text from the screenshot
        let result = self.extract_text_from_image(&temp_path).await?;
        // Clean up temp file
        let _ = std::fs::remove_file(&temp_path);
        Ok(result)
    }
    async fn extract_text_from_image(&self, path: &str) -> Result<String> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("which")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract:\n  macOS:   brew install tesseract\n  \
                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
                sudo yum install tesseract (RHEL/CentOS)\n  \
                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
                After installation, restart your terminal and try again.");
        }
        // Initialize Tesseract
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    macOS:   brew reinstall tesseract\n  \
                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
                    Windows: Reinstall tesseract and ensure language files are included", e)
            })?;
        let text = tess.set_image(path)
            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", path, e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
        Ok(text)
    }
 }
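Review note on the path handling in `take_screenshot` above: the absolute-versus-relative decision can be pulled into a small pure function, which makes it testable without running `screencapture`. The following is only a sketch under assumptions; `resolve_screenshot_path` is a hypothetical name that does not exist in this PR, and it uses `Path::is_absolute` in place of the `starts_with('/')` check.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of this PR): keep absolute paths unchanged
/// and place every other path under `temp_dir`.
fn resolve_screenshot_path(path: &str, temp_dir: &Path) -> PathBuf {
    let candidate = Path::new(path);
    if candidate.is_absolute() {
        candidate.to_path_buf()
    } else {
        // `join` inserts the separator itself, so a trailing slash on
        // `temp_dir` needs no manual trimming.
        temp_dir.join(candidate)
    }
}
```

With a helper like this, the caller would only need to create the parent directory of the returned path before invoking `screencapture`.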
--- a/crates/g3-computer-control/src/platform/macos.rs.bak
+++ b/crates/g3-computer-control/src/platform/macos.rs.bak
@@ -0,0 +1,425 @@
 use crate::{ComputerController, types::*};
 use anyhow::Result;
 use async_trait::async_trait;
 use core_graphics::display::CGPoint;
 use core_graphics::event::{CGEvent, CGEventType, CGMouseButton, CGEventTapLocation};
 use core_graphics::event_source::{CGEventSource, CGEventSourceStateID};
 use std::path::Path;
 use tesseract::Tesseract;
 // MacOSController doesn't store CGEventSource to avoid Send/Sync issues
 // We create it fresh for each operation
 pub struct MacOSController {
    // Empty struct - event source created per operation
 }
 impl MacOSController {
    pub fn new() -> Result<Self> {
        // Test that we can create an event source
        let _event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
            .map_err(|_| anyhow::anyhow!("Failed to create event source. Make sure Accessibility permissions are granted."))?;
        Ok(Self {})
    }
    fn key_to_keycode(&self, key: &str) -> Result<u16> {
        // Map key names to macOS keycodes
        let keycode = match key.to_lowercase().as_str() {
            "return" | "enter" => 36,
            "tab" => 48,
            "space" => 49,
            "delete" | "backspace" => 51,
            "escape" | "esc" => 53,
            "command" | "cmd" => 55,
            "shift" => 56,
            "capslock" => 57,
            "option" | "alt" => 58,
            "control" | "ctrl" => 59,
            "left" => 123,
            "right" => 124,
            "down" => 125,
            "up" => 126,
            _ => anyhow::bail!("Unknown key: {}", key),
        };
        Ok(keycode)
    }
 }
 #[async_trait]
 impl ComputerController for MacOSController {
    async fn move_mouse(&self, x: i32, y: i32) -> Result<()> {
        let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
            .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
        let point = CGPoint::new(x as f64, y as f64);
        let event = CGEvent::new_mouse_event(
            event_source,
            CGEventType::MouseMoved,
            point,
            CGMouseButton::Left,
        ).map_err(|_| anyhow::anyhow!("Failed to create mouse move event"))?;
        event.post(CGEventTapLocation::HID);
        Ok(())
    }
    async fn click(&self, button: MouseButton) -> Result<()> {
        let (cg_button, down_type, up_type) = match button {
            MouseButton::Left => (CGMouseButton::Left, CGEventType::LeftMouseDown, CGEventType::LeftMouseUp),
            MouseButton::Right => (CGMouseButton::Right, CGEventType::RightMouseDown, CGEventType::RightMouseUp),
            MouseButton::Middle => (CGMouseButton::Center, CGEventType::OtherMouseDown, CGEventType::OtherMouseUp),
        };
        let point = {
            // Get current mouse position
            let temp_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
            let event = CGEvent::new(temp_source)
                .map_err(|_| anyhow::anyhow!("Failed to get mouse position"))?;
            event.location()
        };
        {
            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
            // Mouse down
            let down_event = CGEvent::new_mouse_event(
                event_source,
                down_type,
                point,
                cg_button,
            ).map_err(|_| anyhow::anyhow!("Failed to create mouse down event"))?;
            down_event.post(CGEventTapLocation::HID);
        } // event_source and down_event dropped here
        // Small delay
        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
        {
            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
            let up_event = CGEvent::new_mouse_event(
                event_source,
                up_type,
                point,
                cg_button,
            ).map_err(|_| anyhow::anyhow!("Failed to create mouse up event"))?;
            up_event.post(CGEventTapLocation::HID);
        } // event_source and up_event dropped here
        Ok(())
    }
    async fn double_click(&self, button: MouseButton) -> Result<()> {
        self.click(button).await?;
        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
        self.click(button).await?;
        Ok(())
    }
    async fn type_text(&self, text: &str) -> Result<()> {
        for ch in text.chars() {
            {
                let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                    .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
                // Create keyboard event for character
                let event = CGEvent::new_keyboard_event(
                    event_source,
                    0, // keycode (0 for unicode)
                    true,
                ).map_err(|_| anyhow::anyhow!("Failed to create keyboard event"))?;
                // Set unicode string
                let mut utf16_buf = [0u16; 2];
                let utf16_slice = ch.encode_utf16(&mut utf16_buf);
                event.set_string_from_utf16_unchecked(utf16_slice);
                event.post(CGEventTapLocation::HID);
            } // event_source and event dropped here
            tokio::time::sleep(tokio::time::Duration::from_millis(10)).await;
        }
        Ok(())
    }
    async fn press_key(&self, key: &str) -> Result<()> {
        let keycode = self.key_to_keycode(key)?;
        {
            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
            // Key down
            let down_event = CGEvent::new_keyboard_event(
                event_source,
                keycode,
                true,
            ).map_err(|_| anyhow::anyhow!("Failed to create key down event"))?;
            down_event.post(CGEventTapLocation::HID);
        } // event_source and down_event dropped here
        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
        {
            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
            // Key up
            let up_event = CGEvent::new_keyboard_event(
                event_source,
                keycode,
                false,
            ).map_err(|_| anyhow::anyhow!("Failed to create key up event"))?;
            up_event.post(CGEventTapLocation::HID);
        } // event_source and up_event dropped here
        Ok(())
    }
    async fn list_windows(&self) -> Result<Vec<Window>> {
        // Note: Full implementation would use CGWindowListCopyWindowInfo
        // For now, return empty list as this requires more complex FFI
        tracing::warn!("list_windows not fully implemented on macOS");
        Ok(vec![])
    }
    async fn focus_window(&self, _window_id: &str) -> Result<()> {
        // Note: Full implementation would use NSWorkspace to activate application
        tracing::warn!("focus_window not fully implemented on macOS");
        Ok(())
    }
    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
        // Note: Full implementation would use Accessibility API
        tracing::warn!("get_window_bounds not fully implemented on macOS");
        Ok(Rect { x: 0, y: 0, width: 800, height: 600 })
    }
    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
        // Note: Full implementation would use macOS Accessibility API
        tracing::warn!("find_element not fully implemented on macOS");
        Ok(None)
    }
    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
        // Note: Full implementation would use Accessibility API
        tracing::warn!("get_element_text not fully implemented on macOS");
        Ok(String::new())
    }
    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
        // Note: Full implementation would use Accessibility API
        tracing::warn!("get_element_bounds not fully implemented on macOS");
        Ok(Rect { x: 0, y: 0, width: 100, height: 30 })
    }
    async fn take_screenshot(&self, path: &str, _region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
        // Use native macOS screencapture command which handles all the format complexities
        // Screen Recording permission is not probed here. On the first screenshot we
        // warn once and open the settings pane, unless G3_SKIP_PERMISSION_CHECK is set.
        // Without the permission, captures show only wallpaper/menubar and no windows.
        let needs_permission_check = std::env::var("G3_SKIP_PERMISSION_CHECK").is_err();
        if needs_permission_check {
            // Try to open Screen Recording settings if this is the first screenshot
            static PERMISSION_PROMPTED: std::sync::atomic::AtomicBool = std::sync::atomic::AtomicBool::new(false);
            if !PERMISSION_PROMPTED.swap(true, std::sync::atomic::Ordering::Relaxed) {
                tracing::warn!("\n=== Screen Recording Permission Required ===\n\
                    macOS requires explicit permission to capture window content.\n\
                    If screenshots only show wallpaper/menubar (no windows):\n\n\
                    1. Open System Settings > Privacy & Security > Screen Recording\n\
                    2. Enable permission for your terminal (iTerm/Terminal) or g3\n\
                    3. Restart your terminal if needed\n\n\
                    Opening Screen Recording settings now...\n");
                // Try to open the settings (non-blocking)
                let _ = std::process::Command::new("open")
                    .arg("x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture")
                    .spawn();
            }
        }
        let path_obj = Path::new(path);
        if let Some(parent) = path_obj.parent() {
            std::fs::create_dir_all(parent)?;
        }
        let mut cmd = std::process::Command::new("screencapture");
        // Add flags
        cmd.arg("-x"); // No sound
        if let Some(window_id) = window_id {
            // Capture specific window by getting its bounds and using region capture
            // window_id format: "AppName" or "AppName:WindowTitle"
            let app_name = window_id.split(':').next().unwrap_or(window_id);
            // Use AppleScript to get window bounds
            let script = format!(
                r#"tell application "{}"
                    tell current window
                        get bounds
                    end tell
                end tell"#,
                app_name
            );
            let output = std::process::Command::new("osascript")
                .arg("-e")
                .arg(&script)
                .output()
                .map_err(|e| anyhow::anyhow!("Failed to get window bounds: {}", e))?;
            if output.status.success() {
                let bounds_str = String::from_utf8_lossy(&output.stdout);
                let bounds: Vec<i32> = bounds_str
                    .trim()
                    .split(',')
                    .filter_map(|s| s.trim().parse().ok())
                    .collect();
                if bounds.len() == 4 {
                    let (left, top, right, bottom) = (bounds[0], bounds[1], bounds[2], bounds[3]);
                    let width = right - left;
                    let height = bottom - top;
                    cmd.arg("-R");
                    cmd.arg(format!("{},{},{},{}", left, top, width, height));
                    tracing::debug!("Capturing window '{}' at region: {},{} {}x{}", app_name, left, top, width, height);
                } else {
                    tracing::warn!("Failed to parse window bounds, capturing full screen");
                }
            } else {
                tracing::warn!("Failed to get window bounds for '{}', capturing full screen", app_name);
            }
        } else if let Some(region) = _region {
            // Capture specific region: -R x,y,width,height
            cmd.arg("-R");
            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
        }
        cmd.arg(path);
        let output = cmd.output()
            .map_err(|e| anyhow::anyhow!("Failed to execute screencapture: {}", e))?;
        if !output.status.success() {
            let stderr = String::from_utf8_lossy(&output.stderr);
            anyhow::bail!("screencapture failed: {}", stderr);
        }
        tracing::debug!("Screenshot saved using screencapture: {}", path);
        Ok(())
    }
    async fn extract_text_from_screen(&self, region: Rect) -> Result<OCRResult> {
        // Take screenshot of region first
        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
        self.take_screenshot(&temp_path, Some(region), None).await?;
        // Extract text from the screenshot
        let result = self.extract_text_from_image(&temp_path).await?;
        // Clean up temp file
        let _ = std::fs::remove_file(&temp_path);
        Ok(result)
    }
    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("which")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract:\n  macOS:   brew install tesseract\n  \
                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
                sudo yum install tesseract (RHEL/CentOS)\n  \
                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
                After installation, restart your terminal and try again.");
        }
        // Initialize Tesseract
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    macOS:   brew reinstall tesseract\n  \
                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
                    Windows: Reinstall tesseract and ensure language files are included", e)
            })?;
        let text = tess.set_image(_path)
            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
        // Get confidence (simplified - would need more complex API calls for per-word confidence)
        let confidence = 0.85; // Placeholder
        Ok(OCRResult {
            text,
            confidence,
            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
        })
    }
    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("which")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract:\n  macOS:   brew install tesseract\n  \
                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
                sudo yum install tesseract (RHEL/CentOS)\n  \
                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
                After installation, restart your terminal and try again.");
        }
        // Take full screen screenshot
        let temp_path = format!("/tmp/g3_ocr_search_{}.png", uuid::Uuid::new_v4());
        self.take_screenshot(&temp_path, None, None).await?;
        // Use Tesseract to find text with bounding boxes
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    macOS:   brew reinstall tesseract\n  \
                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
                    Windows: Reinstall tesseract and ensure language files are included", e)
            })?;
        let full_text = tess.set_image(temp_path.as_str())
            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
        // Clean up temp file
        let _ = std::fs::remove_file(&temp_path);
        // Simple text search - full implementation would use get_component_images
        // to get bounding boxes for each word
        if full_text.contains(_text) {
            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
            Ok(Some(Point { x: 0, y: 0 }))
        } else {
            Ok(None)
        }
    }
 }
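Review note on the window capture in the backup file above: AppleScript reports bounds as `left, top, right, bottom`, while `screencapture -R` expects `x,y,width,height`. The conversion can be sketched as a standalone function. This is an illustration under assumptions, not code from this PR: `bounds_to_region_arg` is a hypothetical name, and unlike the `filter_map` version above it rejects the whole string when any field fails to parse, and also rejects empty or inverted rectangles.

```rust
/// Hypothetical helper (not part of this PR): turn "left, top, right, bottom"
/// into the "x,y,width,height" argument used with `screencapture -R`.
fn bounds_to_region_arg(bounds: &str) -> Option<String> {
    // Parse every comma-separated field; a single bad field yields None.
    let values: Vec<i32> = bounds
        .trim()
        .split(',')
        .map(|field| field.trim().parse::<i32>().ok())
        .collect::<Option<Vec<i32>>>()?;
    if values.len() != 4 {
        return None;
    }
    let (left, top, right, bottom) = (values[0], values[1], values[2], values[3]);
    // Reject empty or inverted rectangles instead of passing them on.
    if right <= left || bottom <= top {
        return None;
    }
    Some(format!("{},{},{},{}", left, top, right - left, bottom - top))
}
```

Returning `None` lets the caller fall back to a full-screen capture, which matches the fallback already used when the bounds cannot be parsed.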
--- a/crates/g3-computer-control/src/platform/mod.rs
+++ b/crates/g3-computer-control/src/platform/mod.rs
@@ -0,0 +1,8 @@
 #[cfg(target_os = "macos")]
 pub mod macos;
 #[cfg(target_os = "linux")]
 pub mod linux;
 #[cfg(target_os = "windows")]
 pub mod windows;
--- a/crates/g3-computer-control/src/platform/windows.rs
+++ b/crates/g3-computer-control/src/platform/windows.rs
@@ -0,0 +1,162 @@
 use crate::{ComputerController, types::*};
 use anyhow::Result;
 use async_trait::async_trait;
 use tesseract::Tesseract;
 use uuid::Uuid;
 pub struct WindowsController {
    // Placeholder for Windows-specific state
 }
 impl WindowsController {
    pub fn new() -> Result<Self> {
        tracing::warn!("Windows computer control not fully implemented");
        Ok(Self {})
    }
 }
 #[async_trait]
 impl ComputerController for WindowsController {
    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn click(&self, _button: MouseButton) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn double_click(&self, _button: MouseButton) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn type_text(&self, _text: &str) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn press_key(&self, _key: &str) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn list_windows(&self) -> Result<Vec<Window>> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn focus_window(&self, _window_id: &str) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
        anyhow::bail!("Windows implementation not yet available")
    }
    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("where")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract on Windows:\n  \
                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
                2. Run the installer and follow the instructions\n  \
                3. Add tesseract to your PATH environment variable\n  \
                4. Restart your terminal/command prompt\n\n\
                After installation, restart your terminal and try again.");
        }
        // Initialize Tesseract
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
                    2. Make sure to select 'Additional language data' during installation\n  \
                    3. Ensure tesseract is in your PATH", e)
            })?;
        let text = tess.set_image(_path)
            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
        // Get confidence (simplified - would need more complex API calls for per-word confidence)
        let confidence = 0.85; // Placeholder
        Ok(OCRResult {
            text,
            confidence,
            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
        })
    }
    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
        // Check if tesseract is available on the system
        let tesseract_check = std::process::Command::new("where")
            .arg("tesseract")
            .output();
        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
                To install tesseract on Windows:\n  \
                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
                2. Run the installer and follow the instructions\n  \
                3. Add tesseract to your PATH environment variable\n  \
                4. Restart your terminal/command prompt\n\n\
                After installation, restart your terminal and try again.");
        }
        // Take full screen screenshot
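        // Use the OS temp directory rather than assuming a fixed Temp folder exists
        let temp_path = std::env::temp_dir()
            .join(format!("g3_ocr_search_{}.png", uuid::Uuid::new_v4()))
            .to_string_lossy()
            .into_owned();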
        self.take_screenshot(&temp_path, None, None).await?;
        // Use Tesseract to find text with bounding boxes
        let tess = Tesseract::new(None, Some("eng"))
            .map_err(|e| {
                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
                    This usually means:\n1. Tesseract is not properly installed\n\
                    2. Language data files are missing\n\nTo fix:\n  \
                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
                    2. Make sure to select 'Additional language data' during installation\n  \
                    3. Ensure tesseract is in your PATH", e)
            })?;
        let full_text = tess.set_image(temp_path.as_str())
            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
            .get_text()
            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
        // Clean up temp file
        let _ = std::fs::remove_file(&temp_path);
        // Simple text search - full implementation would use get_component_images
        // to get bounding boxes for each word
        if full_text.contains(_text) {
            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
            Ok(Some(Point { x: 0, y: 0 }))
        } else {
            Ok(None)
        }
    }
 }
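Review note on the repeated Tesseract availability check in the platform files above: the same `which`/`where` probe appears in several methods and could be shared. The following is a rough sketch under assumptions; `is_on_path` is a hypothetical name, not something this PR defines, and it treats a missing lookup tool the same as a missing program.

```rust
use std::process::Command;

/// Hypothetical helper (not part of this PR): report whether `program` can be
/// found on PATH, using `where` on Windows and `which` elsewhere.
fn is_on_path(program: &str) -> bool {
    let finder = if cfg!(windows) { "where" } else { "which" };
    Command::new(finder)
        .arg(program)
        .output() // captures stdout/stderr instead of printing them
        .map(|out| out.status.success())
        .unwrap_or(false) // failing to spawn the lookup tool counts as "not found"
}
```

Each OCR method could then begin with `if !is_on_path("tesseract") { ... }` and keep its platform-specific installation message.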
--- a/crates/g3-computer-control/src/types.rs
+++ b/crates/g3-computer-control/src/types.rs
@@ -0,0 +1,9 @@
 use serde::{Deserialize, Serialize};
 #[derive(Debug, Clone, Copy, Serialize, Deserialize)]
 pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub width: i32,
    pub height: i32,
 }
--- a/crates/g3-computer-control/src/webdriver/mod.rs
+++ b/crates/g3-computer-control/src/webdriver/mod.rs
@@ -0,0 +1,111 @@
 pub mod safari;
 use anyhow::Result;
 use async_trait::async_trait;
 use serde_json::Value;
 /// WebDriver controller for browser automation
 #[async_trait]
 pub trait WebDriverController: Send + Sync {
    /// Navigate to a URL
    async fn navigate(&mut self, url: &str) -> Result<()>;
    /// Get the current URL
    async fn current_url(&self) -> Result<String>;
    /// Get the page title
    async fn title(&self) -> Result<String>;
    /// Find an element by CSS selector
    async fn find_element(&mut self, selector: &str) -> Result<WebElement>;
    /// Find multiple elements by CSS selector
    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>>;
    /// Execute JavaScript in the browser
    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value>;
    /// Get the page source (HTML)
    async fn page_source(&self) -> Result<String>;
    /// Take a screenshot and save to path
    async fn screenshot(&mut self, path: &str) -> Result<()>;
    /// Close the current window/tab
    async fn close(&mut self) -> Result<()>;
    /// Quit the browser session
    async fn quit(self) -> Result<()>;
 }
 /// Represents a web element in the DOM
 pub struct WebElement {
    pub(crate) inner: fantoccini::elements::Element,
 }
 impl WebElement {
    /// Click the element
    pub async fn click(&mut self) -> Result<()> {
        self.inner.click().await?;
        Ok(())
    }
    /// Send keys/text to the element
    pub async fn send_keys(&mut self, text: &str) -> Result<()> {
        self.inner.send_keys(text).await?;
        Ok(())
    }
    /// Clear the element's content (for input fields)
    pub async fn clear(&mut self) -> Result<()> {
        self.inner.clear().await?;
        Ok(())
    }
    /// Get the element's text content
    pub async fn text(&self) -> Result<String> {
        Ok(self.inner.text().await?)
    }
    /// Get an attribute value
    pub async fn attr(&self, name: &str) -> Result<Option<String>> {
        Ok(self.inner.attr(name).await?)
    }
    /// Get a property value
    pub async fn prop(&self, name: &str) -> Result<Option<String>> {
        Ok(self.inner.prop(name).await?)
    }
    /// Get the element's HTML
    pub async fn html(&self, inner: bool) -> Result<String> {
        Ok(self.inner.html(inner).await?)
    }
    /// Check if element is displayed
    pub async fn is_displayed(&self) -> Result<bool> {
        Ok(self.inner.is_displayed().await?)
    }
    /// Check if element is enabled
    pub async fn is_enabled(&self) -> Result<bool> {
        Ok(self.inner.is_enabled().await?)
    }
    /// Check if element is selected (for checkboxes/radio buttons)
    pub async fn is_selected(&self) -> Result<bool> {
        Ok(self.inner.is_selected().await?)
    }
    /// Find a child element by CSS selector
    pub async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
        let elem = self.inner.find(fantoccini::Locator::Css(selector)).await?;
        Ok(WebElement { inner: elem })
    }
    /// Find multiple child elements by CSS selector
    pub async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
        let elems = self.inner.find_all(fantoccini::Locator::Css(selector)).await?;
        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
    }
 }
--- a/crates/g3-computer-control/src/webdriver/safari.rs
+++ b/crates/g3-computer-control/src/webdriver/safari.rs
@@ -0,0 +1,212 @@
 use super::{WebDriverController, WebElement};
 use anyhow::{Context, Result};
 use async_trait::async_trait;
 use fantoccini::{Client, ClientBuilder};
 use serde_json::Value;
 use std::time::Duration;
 /// SafariDriver WebDriver controller
 pub struct SafariDriver {
    client: Client,
 }
 impl SafariDriver {
    /// Create a new SafariDriver instance
    /// 
    /// Connects to a SafariDriver instance listening on port 4444, the default used by this crate.
    /// Enable "Allow Remote Automation" in Safari's Develop menu first, or run
    /// `safaridriver --enable` once to apply the same setting from the command line.
    ///
    /// Start SafariDriver manually with:
    /// ```bash
    /// /usr/bin/safaridriver --port 4444
    /// ```
    pub async fn new() -> Result<Self> {
        Self::with_port(4444).await
    }
    /// Create a new SafariDriver instance with a custom port
    pub async fn with_port(port: u16) -> Result<Self> {
        let url = format!("http://localhost:{}", port);
        let mut caps = serde_json::Map::new();
        caps.insert("browserName".to_string(), Value::String("safari".to_string()));
        let client = ClientBuilder::native()
            .capabilities(caps)
            .connect(&url)
            .await
            .context("Failed to connect to SafariDriver. Make sure SafariDriver is running and 'Allow Remote Automation' is enabled in Safari's Develop menu.")?;
        Ok(Self { client })
    }
    /// Go back in browser history
    pub async fn back(&mut self) -> Result<()> {
        self.client.back().await?;
        Ok(())
    }
    /// Go forward in browser history
    pub async fn forward(&mut self) -> Result<()> {
        self.client.forward().await?;
        Ok(())
    }
    /// Refresh the current page
    pub async fn refresh(&mut self) -> Result<()> {
        self.client.refresh().await?;
        Ok(())
    }
    /// Get all window handles
    pub async fn window_handles(&mut self) -> Result<Vec<String>> {
        let handles = self.client.windows().await?;
        Ok(handles.into_iter()
            .map(|h| h.into())
            .collect())
    }
    /// Switch to a window by handle
    pub async fn switch_to_window(&mut self, handle: &str) -> Result<()> {
        let window_handle: fantoccini::wd::WindowHandle = handle.to_string().try_into()?;
        self.client.switch_to_window(window_handle).await?;
        Ok(())
    }
    /// Get the current window handle
    pub async fn current_window_handle(&mut self) -> Result<String> {
        Ok(self.client.window().await?.into())
    }
    /// Close the current window
    pub async fn close_window(&mut self) -> Result<()> {
        self.client.close_window().await?;
        Ok(())
    }
    /// Create a new window/tab
    pub async fn new_window(&mut self, is_tab: bool) -> Result<String> {
        // fantoccini takes a flag here: true opens a tab, false opens a window
        let response = self.client.new_window(is_tab).await?;
        Ok(response.handle.into())
    }
    /// Get cookies
    pub async fn get_cookies(&mut self) -> Result<Vec<fantoccini::cookies::Cookie<'static>>> {
        Ok(self.client.get_all_cookies().await?)
    }
    /// Add a cookie
    pub async fn add_cookie(&mut self, cookie: fantoccini::cookies::Cookie<'static>) -> Result<()> {
        self.client.add_cookie(cookie).await?;
        Ok(())
    }
    /// Delete all cookies
    pub async fn delete_all_cookies(&mut self) -> Result<()> {
        self.client.delete_all_cookies().await?;
        Ok(())
    }
    /// Wait for an element to appear (with timeout)
    pub async fn wait_for_element(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
        let start = std::time::Instant::now();
        let poll_interval = Duration::from_millis(100);
        loop {
            if let Ok(elem) = self.find_element(selector).await {
                return Ok(elem);
            }
            if start.elapsed() >= timeout {
                anyhow::bail!("Timeout waiting for element: {}", selector);
            }
            tokio::time::sleep(poll_interval).await;
        }
    }
    /// Wait for an element to be visible (with timeout)
    pub async fn wait_for_visible(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
        let start = std::time::Instant::now();
        let poll_interval = Duration::from_millis(100);
        loop {
            if let Ok(elem) = self.find_element(selector).await {
                if elem.is_displayed().await.unwrap_or(false) {
                    return Ok(elem);
                }
            }
            if start.elapsed() >= timeout {
                anyhow::bail!("Timeout waiting for element to be visible: {}", selector);
            }
            tokio::time::sleep(poll_interval).await;
        }
    }
 }
 #[async_trait]
 impl WebDriverController for SafariDriver {
    async fn navigate(&mut self, url: &str) -> Result<()> {
        self.client.goto(url).await?;
        Ok(())
    }
    async fn current_url(&self) -> Result<String> {
        Ok(self.client.current_url().await?.to_string())
    }
    async fn title(&self) -> Result<String> {
        Ok(self.client.title().await?)
    }
    async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
        let elem = self.client.find(fantoccini::Locator::Css(selector)).await
            .with_context(|| format!("Failed to find element with selector: {}", selector))?;
        Ok(WebElement { inner: elem })
    }
    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
        let elems = self.client.find_all(fantoccini::Locator::Css(selector)).await?;
        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
    }
    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value> {
        Ok(self.client.execute(script, args).await?)
    }
    async fn page_source(&self) -> Result<String> {
        Ok(self.client.source().await?)
    }
    async fn screenshot(&mut self, path: &str) -> Result<()> {
        let screenshot_data = self.client.screenshot().await?;
        // Expand tilde in path
        let expanded_path = shellexpand::tilde(path);
        let path_str = expanded_path.as_ref();
        // Create parent directories if needed
        if let Some(parent) = std::path::Path::new(path_str).parent() {
            std::fs::create_dir_all(parent)
                .context("Failed to create parent directories for screenshot")?;
        }
        std::fs::write(path_str, screenshot_data)
            .context("Failed to write screenshot to file")?;
        Ok(())
    }
    async fn close(&mut self) -> Result<()> {
        self.client.close_window().await?;
        Ok(())
    }
    async fn quit(mut self) -> Result<()> {
        self.client.close().await?;
        Ok(())
    }
 }
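Note on `wait_for_element` / `wait_for_visible`: both follow the same poll-until-deadline pattern. A minimal synchronous sketch of that pattern, using only the standard library, is shown here for reference. The names `wait_until` and `probe` are hypothetical and do not appear in the PR; the real methods are async and sleep with `tokio::time::sleep`.

```rust
use std::time::{Duration, Instant};

/// Poll `probe` until it yields a value or `timeout` elapses.
/// `wait_until` and `probe` are hypothetical names used only in this sketch.
fn wait_until<T>(
    mut probe: impl FnMut() -> Option<T>,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<T, String> {
    let start = Instant::now();
    loop {
        // Probe before checking the deadline, so a condition that already
        // holds succeeds even with a zero timeout.
        if let Some(value) = probe() {
            return Ok(value);
        }
        if start.elapsed() >= timeout {
            return Err(format!("timed out after {:?}", timeout));
        }
        std::thread::sleep(poll_interval);
    }
}

fn main() {
    // Stand-in for an element that only appears on the third lookup.
    let mut attempts = 0;
    let found = wait_until(
        || {
            attempts += 1;
            if attempts >= 3 { Some("#submit") } else { None }
        },
        Duration::from_secs(1),
        Duration::from_millis(10),
    );
    println!("{:?} after {} attempts", found, attempts);
}
```

Probing before the deadline check matches the order used in the diff: the lookup always runs at least once, and the error is only returned after a failed lookup.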
--- a/crates/g3-computer-control/tests/integration_test.rs
+++ b/crates/g3-computer-control/tests/integration_test.rs
@@ -0,0 +1,62 @@
 use g3_computer_control::*;
 #[tokio::test]
 async fn test_mouse_movement() {
    let controller = create_controller().expect("Failed to create controller");
    // Move mouse to center of screen (assuming 1920x1080)
    let result = controller.move_mouse(960, 540).await;
    assert!(result.is_ok(), "Failed to move mouse: {:?}", result.err());
 }
 #[tokio::test]
 async fn test_typing() {
    let controller = create_controller().expect("Failed to create controller");
    // Type some text
    let result = controller.type_text("Hello, World!").await;
    assert!(result.is_ok(), "Failed to type text: {:?}", result.err());
 }
 #[tokio::test]
 async fn test_screenshot() {
    let controller = create_controller().expect("Failed to create controller");
    // Take screenshot
    let path = "/tmp/test_screenshot.png";
    let result = controller.take_screenshot(path, None, None).await;
    assert!(result.is_ok(), "Failed to take screenshot: {:?}", result.err());
    // Verify file exists
    assert!(std::path::Path::new(path).exists(), "Screenshot file was not created");
    // Clean up
    let _ = std::fs::remove_file(path);
 }
 #[tokio::test]
 async fn test_click() {
    let controller = create_controller().expect("Failed to create controller");
    // Click at the current cursor position (click takes no coordinates)
    let result = controller.click(types::MouseButton::Left).await;
    assert!(result.is_ok(), "Failed to click: {:?}", result.err());
 }
 #[tokio::test]
 async fn test_double_click() {
    let controller = create_controller().expect("Failed to create controller");
    // Double click
    let result = controller.double_click(types::MouseButton::Left).await;
    assert!(result.is_ok(), "Failed to double click: {:?}", result.err());
 }
 #[tokio::test]
 async fn test_press_key() {
    let controller = create_controller().expect("Failed to create controller");
    // Press escape key
    let result = controller.press_key("escape").await;
    assert!(result.is_ok(), "Failed to press key: {:?}", result.err());
 }
--- a/crates/g3-config/Cargo.toml
+++ b/crates/g3-config/Cargo.toml
@@ -12,3 +12,6 @@ thiserror = { workspace = true }
 toml = "0.8"
 shellexpand = "3.0"
 dirs = "5.0"
 [dev-dependencies]
 tempfile = "3.8"
--- a/crates/g3-config/src/lib.rs
+++ b/crates/g3-config/src/lib.rs
@@ -6,6 +6,8 @@ use std::path::Path;
 pub struct Config {
    pub providers: ProvidersConfig,
    pub agent: AgentConfig,
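    // Fall back to defaults so existing config files without these sections still load
    #[serde(default)]
    pub computer_control: ComputerControlConfig,
    #[serde(default)]
    pub webdriver: WebDriverConfig,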
 }
 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -15,6 +17,8 @@ pub struct ProvidersConfig {
    pub databricks: Option<DatabricksConfig>,
    pub embedded: Option<EmbeddedConfig>,
    pub default_provider: String,
    pub coach: Option<String>,  // Provider to use for coach in autonomous mode
    pub player: Option<String>, // Provider to use for player in autonomous mode
 }
 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -62,6 +66,38 @@ pub struct AgentConfig {
    pub timeout_seconds: u64,
 }
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct ComputerControlConfig {
    pub enabled: bool,
    pub require_confirmation: bool,
    pub max_actions_per_second: u32,
 }
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct WebDriverConfig {
    pub enabled: bool,
    pub safari_port: u16,
 }
 impl Default for WebDriverConfig {
    fn default() -> Self {
        Self {
            enabled: false,
            safari_port: 4444,
        }
    }
 }
 impl Default for ComputerControlConfig {
    fn default() -> Self {
        Self {
            enabled: false, // Disabled by default for safety
            require_confirmation: true,
            max_actions_per_second: 5,
        }
    }
 }
 impl Default for Config {
    fn default() -> Self {
        Self {
@@ -78,12 +114,16 @@ impl Default for Config {
                }),
                embedded: None,
                default_provider: "databricks".to_string(),
                coach: None,  // Will use default_provider if not specified
                player: None, // Will use default_provider if not specified
            },
            agent: AgentConfig {
                max_context_length: 8192,
                enable_streaming: true,
                timeout_seconds: 60,
            },
            computer_control: ComputerControlConfig::default(),
            webdriver: WebDriverConfig::default(),
        }
    }
 }
@@ -188,12 +228,16 @@ impl Config {
                    threads: Some(8),
                }),
                default_provider: "embedded".to_string(),
                coach: None,  // Will use default_provider if not specified
                player: None, // Will use default_provider if not specified
            },
            agent: AgentConfig {
                max_context_length: 8192,
                enable_streaming: true,
                timeout_seconds: 60,
            },
            computer_control: ComputerControlConfig::default(),
            webdriver: WebDriverConfig::default(),
        }
    }
@@ -262,4 +306,67 @@ impl Config {
        Ok(config)
    }
    /// Get the provider to use for coach mode in autonomous execution
    pub fn get_coach_provider(&self) -> &str {
        self.providers.coach
            .as_deref()
            .unwrap_or(&self.providers.default_provider)
    }
    /// Get the provider to use for player mode in autonomous execution
    pub fn get_player_provider(&self) -> &str {
        self.providers.player
            .as_deref()
            .unwrap_or(&self.providers.default_provider)
    }
    /// Create a copy of the config with a different default provider
    pub fn with_provider_override(&self, provider: &str) -> Result<Self> {
        // Validate that the provider is configured
        let configured = match provider {
            "anthropic" => self.providers.anthropic.is_some(),
            "databricks" => self.providers.databricks.is_some(),
            "embedded" => self.providers.embedded.is_some(),
            "openai" => self.providers.openai.is_some(),
            _ => true, // Unknown provider names are caught later
        };
        if !configured {
            return Err(anyhow::anyhow!(
                "Provider '{}' is specified but not configured. Please add {} configuration to your config file.",
                provider, provider
            ));
        }
        let mut config = self.clone();
        config.providers.default_provider = provider.to_string();
        Ok(config)
    }
    /// Create a copy of the config for coach mode in autonomous execution
    pub fn for_coach(&self) -> Result<Self> {
        self.with_provider_override(self.get_coach_provider())
    }
    /// Create a copy of the config for player mode in autonomous execution
    pub fn for_player(&self) -> Result<Self> {
        self.with_provider_override(self.get_player_provider())
    }
 }
 #[cfg(test)]
 mod tests;
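Note on the coach/player lookup: the fallback rule is "use the role-specific provider if set, otherwise the default". A self-contained sketch of that rule follows. `Providers`, its `configured` field, and `validate` are simplified stand-ins invented for this note. The real `ProvidersConfig` stores one `Option` per provider, and unlike the PR's `with_provider_override`, this sketch also rejects unknown provider names.

```rust
/// Trimmed-down stand-in for the provider settings (hypothetical type).
struct Providers {
    default_provider: String,
    coach: Option<String>,
    player: Option<String>,
    /// Names of providers that have a configuration section. This is an
    /// assumption of the sketch; the real struct uses one Option per provider.
    configured: Vec<String>,
}

impl Providers {
    /// Coach provider, falling back to the default when unset.
    fn coach_provider(&self) -> &str {
        self.coach.as_deref().unwrap_or(&self.default_provider)
    }

    /// Player provider, falling back to the default when unset.
    fn player_provider(&self) -> &str {
        self.player.as_deref().unwrap_or(&self.default_provider)
    }

    /// Reject a provider name that has no configuration section.
    fn validate(&self, provider: &str) -> Result<(), String> {
        if self.configured.iter().any(|p| p == provider) {
            Ok(())
        } else {
            Err(format!("Provider '{}' is specified but not configured", provider))
        }
    }
}

fn main() {
    let providers = Providers {
        default_provider: "databricks".to_string(),
        coach: Some("anthropic".to_string()),
        player: None,
        configured: vec!["databricks".to_string()],
    };
    // The coach uses its own provider; the player falls back to the default.
    println!("coach = {}", providers.coach_provider());
    println!("player = {}", providers.player_provider());
    // The coach provider has no configuration section, so validation fails.
    println!("{:?}", providers.validate(providers.coach_provider()));
}
```

`as_deref()` turns `&Option<String>` into `Option<&str>`, which lets both getters return a borrowed `&str` without cloning.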
--- a/crates/g3-config/src/tests.rs
+++ b/crates/g3-config/src/tests.rs
@@ -0,0 +1,131 @@
 #[cfg(test)]
 mod tests {
    use crate::Config;
    use std::fs;
    use tempfile::TempDir;
    #[test]
    fn test_coach_player_providers() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");
        // Write a test configuration with coach and player providers
        let config_content = r#"
 [providers]
 default_provider = "databricks"
 coach = "anthropic"
 player = "embedded"
 [providers.databricks]
 host = "https://test.databricks.com"
 token = "test-token"
 model = "test-model"
 [providers.anthropic]
 api_key = "test-key"
 model = "claude-3"
 [providers.embedded]
 model_path = "test.gguf"
 model_type = "llama"
 [agent]
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
 "#;
        fs::write(&config_path, config_content).unwrap();
        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();
        // Test that the providers are correctly identified
        assert_eq!(config.providers.default_provider, "databricks");
        assert_eq!(config.get_coach_provider(), "anthropic");
        assert_eq!(config.get_player_provider(), "embedded");
        // Test creating coach config
        let coach_config = config.for_coach().unwrap();
        assert_eq!(coach_config.providers.default_provider, "anthropic");
        // Test creating player config
        let player_config = config.for_player().unwrap();
        assert_eq!(player_config.providers.default_provider, "embedded");
    }
    #[test]
    fn test_coach_player_fallback_to_default() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");
        // Write a test configuration WITHOUT coach and player providers
        let config_content = r#"
 [providers]
 default_provider = "databricks"
 [providers.databricks]
 host = "https://test.databricks.com"
 token = "test-token"
 model = "test-model"
 [agent]
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
 "#;
        fs::write(&config_path, config_content).unwrap();
        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();
        // Test that coach and player fall back to default provider
        assert_eq!(config.get_coach_provider(), "databricks");
        assert_eq!(config.get_player_provider(), "databricks");
        // Test creating coach config (should use default)
        let coach_config = config.for_coach().unwrap();
        assert_eq!(coach_config.providers.default_provider, "databricks");
        // Test creating player config (should use default)
        let player_config = config.for_player().unwrap();
        assert_eq!(player_config.providers.default_provider, "databricks");
    }
    #[test]
    fn test_invalid_provider_error() {
        // Create a temporary directory for the test config
        let temp_dir = TempDir::new().unwrap();
        let config_path = temp_dir.path().join("test_config.toml");
        // Write a test configuration with an unconfigured provider
        let config_content = r#"
 [providers]
 default_provider = "databricks"
 coach = "openai"  # OpenAI is not configured
 [providers.databricks]
 host = "https://test.databricks.com"
 token = "test-token"
 model = "test-model"
 [agent]
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
 "#;
        fs::write(&config_path, config_content).unwrap();
        // Load the configuration
        let config = Config::load(Some(config_path.to_str().unwrap())).unwrap();
        // Test that trying to create a coach config with unconfigured provider fails
        let result = config.for_coach();
        assert!(result.is_err());
        assert!(result.unwrap_err().to_string().contains("not configured"));
    }
 }
--- a/crates/g3-core/Cargo.toml
+++ b/crates/g3-core/Cargo.toml
@@ -8,6 +8,7 @@ description = "Core engine for G3 AI coding agent"
 g3-providers = { path = "../g3-providers" }
 g3-config = { path = "../g3-config" }
 g3-execution = { path = "../g3-execution" }
 g3-computer-control = { path = "../g3-computer-control" }
 tokio = { workspace = true }
 reqwest = { workspace = true }
 anyhow = { workspace = true }
@@ -23,3 +24,4 @@ futures-util = "0.3"
 chrono = { version = "0.4", features = ["serde"] }
 rand = "0.8"
 regex = "1.0"
 shellexpand = "3.1"
--- a/crates/g3-core/src/lib.rs
+++ b/crates/g3-core/src/lib.rs
--- a/crates/g3-core/src/tilde_expansion_tests.rs
+++ b/crates/g3-core/src/tilde_expansion_tests.rs
@@ -0,0 +1,36 @@
 #[cfg(test)]
 mod tilde_expansion_tests {
    use std::env;
    #[test]
    fn test_tilde_expansion() {
        // Test that shellexpand works
        let path_with_tilde = "~/test.txt";
        let expanded = shellexpand::tilde(path_with_tilde);
        // Get the actual home directory
        let home = env::var("HOME").expect("HOME environment variable not set");
        // Verify expansion happened
        assert_eq!(expanded.as_ref(), format!("{}/test.txt", home));
        assert!(!expanded.contains("~"));
    }
    #[test]
    fn test_tilde_expansion_with_subdirs() {
        let path_with_tilde = "~/Documents/test.txt";
        let expanded = shellexpand::tilde(path_with_tilde);
        let home = env::var("HOME").expect("HOME environment variable not set");
        assert_eq!(expanded.as_ref(), format!("{}/Documents/test.txt", home));
    }
    #[test]
    fn test_no_tilde_unchanged() {
        let path_without_tilde = "/absolute/path/test.txt";
        let expanded = shellexpand::tilde(path_without_tilde);
        assert_eq!(expanded.as_ref(), path_without_tilde);
    }
 }
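Note on the tilde tests: the three cases above (home-relative path, nested path, absolute path left alone) can be summarised by a small standard-library sketch. `expand_tilde` is a hypothetical helper written for this note. The PR itself relies on `shellexpand::tilde`, and this sketch does not claim to match its behaviour in every edge case.

```rust
use std::env;

/// Expand a leading `~` using the given home directory.
/// Hypothetical helper for illustration only.
fn expand_tilde(path: &str, home: &str) -> String {
    if path == "~" {
        home.to_string()
    } else if let Some(rest) = path.strip_prefix("~/") {
        // Avoid a doubled slash when `home` already ends with one.
        format!("{}/{}", home.trim_end_matches('/'), rest)
    } else {
        // No leading tilde (or a `~user` form): leave the path unchanged.
        path.to_string()
    }
}

fn main() {
    // Assumes HOME is set, as the tests in this file do; falls back to "/".
    let home = env::var("HOME").unwrap_or_else(|_| "/".to_string());
    println!("{}", expand_tilde("~/Documents/test.txt", &home));
    println!("{}", expand_tilde("/absolute/path/test.txt", &home));
}
```

Passing `home` as a parameter keeps the helper deterministic, so it can be tested without depending on the environment.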
--- a/crates/g3-core/tests/test_context_thinning.rs
+++ b/crates/g3-core/tests/test_context_thinning.rs
@@ -0,0 +1,157 @@
 use g3_core::ContextWindow;
 use g3_providers::{Message, MessageRole};
 #[test]
 fn test_thinning_thresholds() {
    let mut context = ContextWindow::new(10000);
    // At 0%, should not thin
    assert!(!context.should_thin());
    // Simulate reaching 50% usage
    context.used_tokens = 5000;
    assert!(context.should_thin());
    // After thinning at 50%, should not thin again until next threshold
    context.last_thinning_percentage = 50;
    assert!(!context.should_thin());
    // At 60%, should thin again
    context.used_tokens = 6000;
    assert!(context.should_thin());
    // After thinning at 60%, should not thin
    context.last_thinning_percentage = 60;
    assert!(!context.should_thin());
    // At 70%, should thin
    context.used_tokens = 7000;
    assert!(context.should_thin());
    // At 80%, should thin
    context.last_thinning_percentage = 70;
    context.used_tokens = 8000;
    assert!(context.should_thin());
    // After 80%, should not thin (compaction takes over)
    context.last_thinning_percentage = 80;
    context.used_tokens = 8500;
    assert!(!context.should_thin());
 }
 #[test]
 fn test_thin_context_basic() {
    let mut context = ContextWindow::new(10000);
    // Add 9 messages; the first third covers indices 0-2
    for i in 0..9 {
        if i % 2 == 0 {
            context.add_message(Message {
                role: MessageRole::Assistant,
                content: format!("Assistant message {}", i),
            });
        } else {
            // Add tool results with varying sizes
            let content = if i == 1 {
                // Large tool result (> 1000 chars)
                format!("Tool result: {}", "x".repeat(1500))
            } else if i == 3 {
                // Another large tool result
                format!("Tool result: {}", "y".repeat(2000))
            } else {
                // Small tool result (< 1000 chars)
                format!("Tool result: small result {}", i)
            };
            context.add_message(Message {
                role: MessageRole::User,
                content,
            });
        }
    }
    // Trigger thinning at 50%
    context.used_tokens = 5000;
    let summary = context.thin_context();
    println!("Thinning summary: {}", summary);
    // Exactly 1 large tool result (index 1) falls in the first third
    assert!(summary.contains("1 tool result"), "Summary was: {}", summary);
    assert!(summary.contains("50%"));
    // Check that the large tool results were replaced
    let first_third_end = context.conversation_history.len() / 3;
    for i in 0..first_third_end {
        if let Some(msg) = context.conversation_history.get(i) {
            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
                if msg.content.len() > 1000 {
                    panic!("Found un-thinned large tool result at index {}", i);
                }
            }
        }
    }
 }
 #[test]
 fn test_thin_context_no_large_results() {
    let mut context = ContextWindow::new(10000);
    // Add only small messages
    for i in 0..9 {
        context.add_message(Message {
            role: MessageRole::User,
            content: format!("Tool result: small {}", i),
        });
    }
    context.used_tokens = 5000;
    let summary = context.thin_context();
    // Should report no large results found
    assert!(summary.contains("no large tool results found"));
 }
 #[test]
 fn test_thin_context_only_affects_first_third() {
    let mut context = ContextWindow::new(10000);
    // Add 12 messages (first third = 4 messages)
    for i in 0..12 {
        let content = if i % 2 == 1 {
            // All odd indices are large tool results
            format!("Tool result: {}", "x".repeat(1500))
        } else {
            format!("Assistant message {}", i)
        };
        let role = if i % 2 == 1 {
            MessageRole::User
        } else {
            MessageRole::Assistant
        };
        context.add_message(Message { role, content });
    }
    context.used_tokens = 5000;
    let summary = context.thin_context();
    // First third is 4 messages (indices 0-3), so only indices 1 and 3 should be thinned
    // That's 2 tool results
    assert!(summary.contains("2 tool results"));
    // Check that messages after the first third are NOT thinned
    let first_third_end = context.conversation_history.len() / 3;
    for i in first_third_end..context.conversation_history.len() {
        if let Some(msg) = context.conversation_history.get(i) {
            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
                // These should still be large (not thinned)
                if i % 2 == 1 {
                    assert!(msg.content.len() > 1000, 
                        "Message at index {} should not have been thinned", i);
                }
            }
        }
    }
 }
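Note on `test_thinning_thresholds`: the assertions imply that thinning fires once per 10% step from 50% to 80%, and never again once the 80% step has been handled. One rule consistent with every assertion in that test is sketched here. `ContextSketch`, `THIN_THRESHOLDS`, and `percentage_used` are hypothetical names inferred from the test, not taken from the `ContextWindow` implementation, which may differ.

```rust
/// Thresholds (percent of the window) at which thinning is attempted.
/// Inferred from the test assertions; an assumption, not the real constant.
const THIN_THRESHOLDS: [u32; 4] = [50, 60, 70, 80];

/// Minimal stand-in for the fields the test touches (hypothetical type).
struct ContextSketch {
    total_tokens: u32,
    used_tokens: u32,
    last_thinning_percentage: u32,
}

impl ContextSketch {
    /// Integer percentage of the window currently in use.
    fn percentage_used(&self) -> u32 {
        if self.total_tokens == 0 {
            return 0;
        }
        (self.used_tokens as u64 * 100 / self.total_tokens as u64) as u32
    }

    /// Thin when usage has reached a threshold that has not been handled yet.
    fn should_thin(&self) -> bool {
        let pct = self.percentage_used();
        THIN_THRESHOLDS
            .iter()
            .any(|&t| pct >= t && t > self.last_thinning_percentage)
    }
}

fn main() {
    let mut ctx = ContextSketch {
        total_tokens: 10_000,
        used_tokens: 0,
        last_thinning_percentage: 0,
    };
    // Walk through some of the usage levels exercised by the test.
    for used in [0, 5_000, 6_000, 8_500] {
        ctx.used_tokens = used;
        println!("{}% used -> thin: {}", ctx.percentage_used(), ctx.should_thin());
    }
}
```

Under this rule, 85% usage with `last_thinning_percentage == 80` returns false because no threshold above 80 exists, which matches the test's "compaction takes over" case.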
--- a/crates/g3-providers/src/anthropic.rs
+++ b/crates/g3-providers/src/anthropic.rs
@@ -156,8 +156,9 @@ impl AnthropicProvider {
            .post(ANTHROPIC_API_URL)
            .header("x-api-key", &self.api_key)
            .header("anthropic-version", ANTHROPIC_VERSION)
            // Anthropic beta 1m context window. Enable if needed. It costs extra, so check first.
            // .header("anthropic-beta", "context-1m-2025-08-07")
            .header("content-type", "application/json");
        if streaming {
            builder = builder.header("accept", "text/event-stream");
        }
--- a/crates/g3-providers/src/lib.rs
+++ b/crates/g3-providers/src/lib.rs
@@ -88,10 +88,12 @@ pub mod anthropic;
 pub mod databricks;
 pub mod embedded;
 pub mod oauth;
 pub mod openai;
 pub use anthropic::AnthropicProvider;
 pub use databricks::DatabricksProvider;
 pub use embedded::EmbeddedProvider;
 pub use openai::OpenAIProvider;
 /// Provider registry for managing multiple LLM providers
 pub struct ProviderRegistry {
--- a/crates/g3-providers/src/openai.rs
+++ b/crates/g3-providers/src/openai.rs
@@ -0,0 +1,495 @@
 use anyhow::Result;
 use async_trait::async_trait;
 use bytes::Bytes;
 use futures_util::stream::StreamExt;
 use reqwest::Client;
 use serde::Deserialize;
 use serde_json::json;
 use tokio::sync::mpsc;
 use tokio_stream::wrappers::ReceiverStream;
 use tracing::{debug, error};
 use crate::{
    CompletionChunk, CompletionRequest, CompletionResponse, CompletionStream, LLMProvider,
    Message, MessageRole, Tool, ToolCall, Usage,
 };
 #[derive(Clone)]
 pub struct OpenAIProvider {
    client: Client,
    api_key: String,
    model: String,
    base_url: String,
    max_tokens: Option<u32>,
    _temperature: Option<f32>,
 }
 impl OpenAIProvider {
    pub fn new(
        api_key: String,
        model: Option<String>,
        base_url: Option<String>,
        max_tokens: Option<u32>,
        temperature: Option<f32>,
    ) -> Result<Self> {
        Ok(Self {
            client: Client::new(),
            api_key,
            model: model.unwrap_or_else(|| "gpt-4o".to_string()),
            base_url: base_url.unwrap_or_else(|| "https://api.openai.com/v1".to_string()),
            max_tokens,
            _temperature: temperature,
        })
    }
    fn create_request_body(
        &self,
        messages: &[Message],
        tools: Option<&[Tool]>,
        stream: bool,
        max_tokens: Option<u32>,
        _temperature: Option<f32>,
    ) -> serde_json::Value {
        let mut body = json!({
            "model": self.model,
            "messages": convert_messages(messages),
            "stream": stream,
        });
        if let Some(max_tokens) = max_tokens.or(self.max_tokens) {
            body["max_completion_tokens"] = json!(max_tokens);
        }
        // Requests that set `temperature` have been observed to fail, so it is deliberately not sent.
        // if let Some(temperature) = temperature.or(self.temperature) {
        //     body["temperature"] = json!(temperature);
        // }
        if let Some(tools) = tools {
            if !tools.is_empty() {
                body["tools"] = json!(convert_tools(tools));
            }
        }
        if stream {
            body["stream_options"] = json!({
                "include_usage": true,
            });
        }
        body
    }
    async fn parse_streaming_response(
        &self,
        mut stream: impl futures_util::Stream<Item = reqwest::Result<Bytes>> + Unpin,
        tx: mpsc::Sender<Result<CompletionChunk>>,
    ) -> Option<Usage> {
        // Buffer raw bytes rather than text: a multi-byte UTF-8 character can be
        // split across network chunks, and decoding each chunk on its own would
        // drop the whole chunk.
        let mut buffer: Vec<u8> = Vec::new();
        let mut accumulated_content = String::new();
        let mut accumulated_usage: Option<Usage> = None;
        let mut current_tool_calls: Vec<OpenAIStreamingToolCall> = Vec::new();
        while let Some(chunk_result) = stream.next().await {
            match chunk_result {
                Ok(chunk) => {
                    buffer.extend_from_slice(&chunk);
                    // Process complete lines; a newline byte never occurs inside
                    // a multi-byte UTF-8 sequence, so each line decodes cleanly
                    while let Some(line_end) = buffer.iter().position(|&b| b == b'\n') {
                        let line_bytes: Vec<u8> = buffer.drain(..=line_end).collect();
                        let line = String::from_utf8_lossy(&line_bytes).trim().to_string();
                        if line.is_empty() {
                            continue;
                        }
                        // Parse Server-Sent Events format
                        if let Some(data) = line.strip_prefix("data: ") {
                            if data == "[DONE]" {
                                debug!("Received stream completion marker");
                                // Send final chunk with accumulated content and tool calls
                                if !accumulated_content.is_empty() || !current_tool_calls.is_empty() {
                                    let tool_calls = if current_tool_calls.is_empty() {
                                        None
                                    } else {
                                        Some(
                                            current_tool_calls
                                                .iter()
                                                .filter_map(|tc| tc.to_tool_call())
                                                .collect(),
                                        )
                                    };
                                    let final_chunk = CompletionChunk {
                                        content: accumulated_content.clone(),
                                        finished: true,
                                        tool_calls,
                                        usage: accumulated_usage.clone(),
                                    };
                                    let _ = tx.send(Ok(final_chunk)).await;
                                }
                                return accumulated_usage;
                            }
                            // Parse the JSON data
                            match serde_json::from_str::<OpenAIStreamChunk>(data) {
                                Ok(chunk_data) => {
                                    // Handle content
                                    for choice in &chunk_data.choices {
                                        if let Some(content) = &choice.delta.content {
                                            accumulated_content.push_str(content);
                                            let chunk = CompletionChunk {
                                                content: content.clone(),
                                                finished: false,
                                                tool_calls: None,
                                                usage: None,
                                            };
                                            if tx.send(Ok(chunk)).await.is_err() {
                                                debug!("Receiver dropped, stopping stream");
                                                return accumulated_usage;
                                            }
                                        }
                                        // Handle tool calls
                                        if let Some(delta_tool_calls) = &choice.delta.tool_calls {
                                            for delta_tool_call in delta_tool_calls {
                                                if let Some(index) = delta_tool_call.index {
                                                    // Ensure we have enough tool calls in our vector
                                                    while current_tool_calls.len() <= index {
                                                        current_tool_calls
                                                            .push(OpenAIStreamingToolCall::default());
                                                    }
                                                    let tool_call = &mut current_tool_calls[index];
                                                    if let Some(id) = &delta_tool_call.id {
                                                        tool_call.id = Some(id.clone());
                                                    }
                                                    if let Some(function) = &delta_tool_call.function {
                                                        if let Some(name) = &function.name {
                                                            tool_call.name = Some(name.clone());
                                                        }
                                                        if let Some(arguments) = &function.arguments {
                                                            tool_call.arguments.push_str(arguments);
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    }
                                    // Handle usage
                                    if let Some(usage) = chunk_data.usage {
                                        accumulated_usage = Some(Usage {
                                            prompt_tokens: usage.prompt_tokens,
                                            completion_tokens: usage.completion_tokens,
                                            total_tokens: usage.total_tokens,
                                        });
                                    }
                                }
                                Err(e) => {
                                    debug!("Failed to parse stream chunk: {} - Data: {}", e, data);
                                }
                            }
                        }
                    }
                }
                Err(e) => {
                    error!("Stream error: {}", e);
                    let _ = tx.send(Err(anyhow::anyhow!("Stream error: {}", e))).await;
                    return accumulated_usage;
                }
            }
        }
        // Send final chunk if we haven't already
        let tool_calls = if current_tool_calls.is_empty() {
            None
        } else {
            Some(
                current_tool_calls
                    .iter()
                    .filter_map(|tc| tc.to_tool_call())
                    .collect(),
            )
        };
        let final_chunk = CompletionChunk {
            content: String::new(),
            finished: true,
            tool_calls,
            usage: accumulated_usage.clone(),
        };
        let _ = tx.send(Ok(final_chunk)).await;
        accumulated_usage
    }
 }
 #[async_trait]
 impl LLMProvider for OpenAIProvider {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
        debug!(
            "Processing OpenAI completion request with {} messages",
            request.messages.len()
        );
        let body = self.create_request_body(
            &request.messages,
            request.tools.as_deref(),
            false,
            request.max_tokens,
            request.temperature,
        );
        debug!("Sending request to OpenAI API: model={}", self.model);
        let response = self
            .client
            .post(&format!("{}/chat/completions", self.base_url))
            .header("Authorization", format!("Bearer {}", self.api_key))
            .json(&body)
            .send()
            .await?;
        let status = response.status();
        if !status.is_success() {
            let error_text = response
                .text()
                .await
                .unwrap_or_else(|_| "Unknown error".to_string());
            return Err(anyhow::anyhow!("OpenAI API error {}: {}", status, error_text));
        }
        let openai_response: OpenAIResponse = response.json().await?;
        let content = openai_response
            .choices
            .first()
            .and_then(|choice| choice.message.content.clone())
            .unwrap_or_default();
        let usage = Usage {
            prompt_tokens: openai_response.usage.prompt_tokens,
            completion_tokens: openai_response.usage.completion_tokens,
            total_tokens: openai_response.usage.total_tokens,
        };
        debug!(
            "OpenAI completion successful: {} tokens generated",
            usage.completion_tokens
        );
        Ok(CompletionResponse {
            content,
            usage,
            model: self.model.clone(),
        })
    }
    async fn stream(&self, request: CompletionRequest) -> Result<CompletionStream> {
        debug!(
            "Processing OpenAI streaming request with {} messages",
            request.messages.len()
        );
        let body = self.create_request_body(
            &request.messages,
            request.tools.as_deref(),
            true,
            request.max_tokens,
            request.temperature,
        );
        debug!("Sending streaming request to OpenAI API: model={}", self.model);
        let response = self
            .client
            .post(&format!("{}/chat/completions", self.base_url))
            .header("Authorization", format!("Bearer {}", self.api_key))
            .json(&body)
            .send()
            .await?;
        let status = response.status();
        if !status.is_success() {
            let error_text = response
                .text()
                .await
                .unwrap_or_else(|_| "Unknown error".to_string());
            return Err(anyhow::anyhow!("OpenAI API error {}: {}", status, error_text));
        }
        let stream = response.bytes_stream();
        let (tx, rx) = mpsc::channel(100);
        // Spawn task to process the stream
        let provider = self.clone();
        tokio::spawn(async move {
            let usage = provider.parse_streaming_response(stream, tx).await;
            // Log the final usage if available
            if let Some(usage) = usage {
                debug!(
                    "Stream completed with usage - prompt: {}, completion: {}, total: {}",
                    usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
                );
            }
        });
        Ok(ReceiverStream::new(rx))
    }
    fn name(&self) -> &str {
        "openai"
    }
    fn model(&self) -> &str {
        &self.model
    }
    fn has_native_tool_calling(&self) -> bool {
        // OpenAI models support native tool calling
        true
    }
 }
 fn convert_messages(messages: &[Message]) -> Vec<serde_json::Value> {
    messages
        .iter()
        .map(|msg| {
            json!({
                "role": match msg.role {
                    MessageRole::System => "system",
                    MessageRole::User => "user",
                    MessageRole::Assistant => "assistant",
                },
                "content": msg.content,
            })
        })
        .collect()
 }
 fn convert_tools(tools: &[Tool]) -> Vec<serde_json::Value> {
    tools
        .iter()
        .map(|tool| {
            json!({
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.input_schema,
                }
            })
        })
        .collect()
 }
 // OpenAI API response structures
 #[derive(Debug, Deserialize)]
 struct OpenAIResponse {
    choices: Vec<OpenAIChoice>,
    usage: OpenAIUsage,
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIChoice {
    message: OpenAIMessage,
 }
 #[allow(dead_code)]
 #[derive(Debug, Deserialize)]
 struct OpenAIMessage {
    content: Option<String>,
    #[serde(default)]
    tool_calls: Option<Vec<OpenAIToolCall>>,
 }
 #[allow(dead_code)]
 #[derive(Debug, Deserialize)]
 struct OpenAIToolCall {
    id: String,
    function: OpenAIFunction,
 }
 #[allow(dead_code)]
 #[derive(Debug, Deserialize)]
 struct OpenAIFunction {
    name: String,
    arguments: String,
 }
 // Streaming tool call accumulator
 #[derive(Debug, Default)]
 struct OpenAIStreamingToolCall {
    id: Option<String>,
    name: Option<String>,
    arguments: String,
 }
 impl OpenAIStreamingToolCall {
    fn to_tool_call(&self) -> Option<ToolCall> {
        let id = self.id.as_ref()?;
        let name = self.name.as_ref()?;
        let args = serde_json::from_str(&self.arguments).unwrap_or(serde_json::Value::Null);
        Some(ToolCall {
            id: id.clone(),
            tool: name.clone(),
            args,
        })
    }
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIUsage {
    prompt_tokens: u32,
    completion_tokens: u32,
    total_tokens: u32,
 }
 // Streaming response structures
 #[derive(Debug, Deserialize)]
 struct OpenAIStreamChunk {
    choices: Vec<OpenAIStreamChoice>,
    usage: Option<OpenAIUsage>,
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIStreamChoice {
    delta: OpenAIDelta,
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIDelta {
    content: Option<String>,
    #[serde(default)]
    tool_calls: Option<Vec<OpenAIDeltaToolCall>>,
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIDeltaToolCall {
    index: Option<usize>,
    id: Option<String>,
    function: Option<OpenAIDeltaFunction>,
 }
 #[derive(Debug, Deserialize)]
 struct OpenAIDeltaFunction {
    name: Option<String>,
    arguments: Option<String>,
 }
--- a/docs/coach-player-providers.md
+++ b/docs/coach-player-providers.md
@@ -0,0 +1,75 @@
 # Coach-Player Provider Configuration
 G3 now supports specifying different LLM providers for the coach and player agents when running in autonomous mode. This allows you to optimize for different requirements:
 - **Player**: The agent that implements code - might benefit from a faster, more cost-effective model
 - **Coach**: The agent that reviews code - might benefit from a more powerful, analytical model
 ## Configuration
 In your `config.toml` file, under the `[providers]` section, you can specify:
 ```toml
 [providers]
 default_provider = "databricks"  # Used for normal operations
 coach = "databricks"              # Provider for coach (code reviewer)
 player = "anthropic"              # Provider for player (code implementer)
 ```
 If `coach` or `player` is not specified, it defaults to the `default_provider`.
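The fallback rule can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `ProvidersConfig`, `coach_provider`, and `player_provider` are hypothetical names chosen to mirror the `[providers]` keys above, and the actual types in `g3-config` may look different.

```rust
/// Hypothetical mirror of the `[providers]` table (assumed shape, not the real g3-config type).
#[derive(Debug, Clone)]
pub struct ProvidersConfig {
    pub default_provider: String,
    pub coach: Option<String>,
    pub player: Option<String>,
}

impl ProvidersConfig {
    /// Provider name used for the coach; falls back to `default_provider` when unset.
    pub fn coach_provider(&self) -> &str {
        self.coach
            .as_deref()
            .unwrap_or(self.default_provider.as_str())
    }

    /// Provider name used for the player; falls back to `default_provider` when unset.
    pub fn player_provider(&self) -> &str {
        self.player
            .as_deref()
            .unwrap_or(self.default_provider.as_str())
    }
}
```

With `default_provider = "databricks"`, no `coach` key, and `player = "anthropic"`, this sketch resolves the coach to `databricks` and the player to `anthropic`.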
 ## Example Use Cases
 ### Cost Optimization
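 Use a cheaper, faster model for initial implementations (player) and a more powerful model for review (coach). Both keys name the same provider in this example, so these two lines alone do not select different models; the comments describe the intended pairing: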
 ```toml
 coach = "anthropic"  # Claude Sonnet for thorough review
 player = "anthropic" # Claude Haiku for quick implementation
 ```
 ### Speed vs Quality Trade-off
 Use a local embedded model for fast iterations (player) and a cloud model for quality review (coach):
 ```toml
 coach = "databricks"  # Cloud model for quality review
 player = "embedded"   # Local model for fast implementation
 ```
 ### Specialized Models
 Use different models optimized for different tasks:
 ```toml
 coach = "databricks"  # Model fine-tuned for code review
 player = "openai"     # Model optimized for code generation
 ```
 ## Requirements
 - Both providers must be properly configured in your config file
 - Each provider must have valid credentials
 - The models specified for each provider must be accessible
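A startup check for the first requirement might look like the sketch below. It is an assumption-laden illustration, not G3's actual validation: `check_providers` is a hypothetical helper, and it only verifies that each role's provider name appears among the configured providers. It does not test credentials or model availability.

```rust
use std::collections::BTreeSet;

/// Hypothetical check: both role providers must appear in the set of configured providers.
/// Returns an error message naming the first role whose provider is missing.
pub fn check_providers(
    coach: &str,
    player: &str,
    configured: &BTreeSet<String>,
) -> Result<(), String> {
    for (role, name) in [("coach", coach), ("player", player)] {
        if !configured.contains(name) {
            return Err(format!("{} provider '{}' is not configured", role, name));
        }
    }
    Ok(())
}
```

Failing early with the role name in the message makes a misspelled `coach` or `player` value easier to spot than a provider error in the middle of a run.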
 ## How It Works
 When running in autonomous mode (`g3 --autonomous`), the system will:
 1. Use the `player` provider (or default) for the initial implementation
 2. Switch to the `coach` provider (or default) for code review
 3. Return to the `player` provider for implementing feedback
 4. Continue this cycle for the specified number of turns
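The alternation in the steps above can be sketched as a simple schedule. The names `Role`, `schedule`, and `provider_for` are hypothetical, and the sketch assumes that one turn is exactly one player pass followed by one coach review; the real loop may stop early or order things differently.

```rust
/// Which agent is active in a given step (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Player,
    Coach,
}

/// Order of agent activations for `turns` rounds: player implements, coach reviews, repeat.
pub fn schedule(turns: usize) -> Vec<Role> {
    let mut steps = Vec::with_capacity(turns * 2);
    for _ in 0..turns {
        steps.push(Role::Player);
        steps.push(Role::Coach);
    }
    steps
}

/// Picks the provider name for the active role.
pub fn provider_for<'a>(role: Role, player: &'a str, coach: &'a str) -> &'a str {
    match role {
        Role::Player => player,
        Role::Coach => coach,
    }
}
```

Under these assumptions, two turns with `player = "anthropic"` and `coach = "databricks"` would call the providers in the order anthropic, databricks, anthropic, databricks.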
 The providers are logged at startup so you can verify which models are being used:
 ```
 🎮 Player provider: anthropic
 👨‍🏫 Coach provider: databricks
 ℹ️  Using different providers for player and coach
 ```
 ## Benefits
 - **Cost Efficiency**: Use expensive models only where they add the most value
 - **Speed Optimization**: Use faster models for iterative development
 - **Specialization**: Leverage models that excel at specific tasks
 - **Flexibility**: Easy to experiment with different provider combinations
--- a/test-ai-requirements.sh
+++ b/test-ai-requirements.sh
@@ -0,0 +1,39 @@
 #!/bin/bash
 # Test script for AI-enhanced interactive requirements mode
 echo "Testing AI-enhanced interactive requirements mode..."
 echo ""
 # Create a test workspace
 TEST_WORKSPACE="/tmp/g3-test-interactive-$(date +%s)"
 mkdir -p "$TEST_WORKSPACE"
 echo "Test workspace: $TEST_WORKSPACE"
 echo ""
 # Create sample brief input
 BRIEF_INPUT="build a calculator cli in rust with basic operations"
 echo "Brief input:"
 echo "---"
 echo "$BRIEF_INPUT"
 echo "---"
 echo ""
 echo "This will:"
 echo "1. Send brief input to AI"
 echo "2. AI generates structured requirements.md"
 echo "3. Show enhanced requirements"
 echo "4. Prompt for confirmation (y/e/n)"
 echo ""
 echo "To test manually, run:"
 echo "cargo run -- --autonomous --interactive-requirements --workspace $TEST_WORKSPACE"
 echo ""
 echo "Then type: $BRIEF_INPUT"
 echo "Press Ctrl+D"
 echo "Review the AI-generated requirements"
 echo "Choose 'y' to proceed, 'e' to edit, or 'n' to cancel"
 echo ""
 echo "Test workspace will be at: $TEST_WORKSPACE"
Author	SHA1	Message	Date
Michael Neale	af6d37a8e2	Add --interactive-requirements flag for AI-enhanced requirements mode - Adds new --interactive-requirements CLI flag for autonomous mode - Prompts user for brief requirements input - Uses AI to enhance and structure requirements into proper markdown - Shows enhanced requirements and allows user to approve/edit/cancel - Saves to requirements.md and proceeds with autonomous mode if approved - Includes test script for manual verification	2025-10-22 14:58:35 +11:00
Dhanji R. Prasanna	c1c6680e03	Merge pull request #7 from jochenx/jochen-add-openai-and-multi-providers coach/player provider split + add OpenAI	2025-10-22 13:46:16 +11:00
Jochen	f2d8e744bb	fix panic in CLI parser	2025-10-22 13:20:45 +11:00
Jochen	010a43d203	coach/player provider split + add OpenAI Allows coach and player LLM providers to be separately specified. Also adds OpenAI provider	2025-10-21 16:59:13 +11:00
Dhanji Prasanna	758e255af8	dont run safaridriver --enable each time	2025-10-21 16:00:58 +11:00
Dhanji Prasanna	393826ae02	webdriver tools	2025-10-21 14:34:41 +11:00
Dhanji Prasanna	3afad3d61f	progressive context thinning	2025-10-20 15:29:44 +11:00
Dhanji Prasanna	2488cc54d5	docs: update README and DESIGN to reflect current project state - Add g3-computer-control crate to architecture documentation - Document all 13 tools including computer control and TODO management - Add context thinning feature documentation (50-80% thresholds) - Update tool ecosystem section with complete tool list - Remove broken link to non-existent COMPUTER_CONTROL.md - Update workspace count from 5 to 6 crates - Add platform-specific implementation details for computer control - Document OCR support via Tesseract - Clarify setup instructions for computer control features	2025-10-20 15:03:22 +11:00
Dhanji Prasanna	2ad0c9a3fd	todo list formatting	2025-10-20 14:27:53 +11:00
Dhanji Prasanna	2008a81193	fix to pass feedback to player (broken by todo system)	2025-10-20 14:12:08 +11:00
Dhanji Prasanna	776f5034b8	TODO tools	2025-10-20 10:50:53 +11:00
Dhanji Prasanna	92bece957b	colorizing tool calls	2025-10-18 16:09:30 +11:00
Dhanji Prasanna	767299ff4e	minor	2025-10-18 16:03:58 +11:00
Dhanji Prasanna	9d35449be8	~ expansion for read_file and str_replace	2025-10-18 16:01:15 +11:00
Dhanji Prasanna	da652bf287	computer control tools	2025-10-18 14:16:50 +11:00
Dhanji Prasanna	a566171203	small turn completing bug	2025-10-18 13:25:23 +11:00
Dhanji Prasanna	347c9e1e00	colorize timing based on duration	2025-10-17 13:54:21 +11:00
Dhanji Prasanna	aa7eda0331	fix wall clock timing	2025-10-17 10:36:21 +11:00
Dhanji Prasanna	e42c76f3b9	Tune coach pickiness down	2025-10-17 10:28:08 +11:00
Dhanji Prasanna	dd211fab1c	panic fix	2025-10-17 09:50:01 +11:00
Dhanji R. Prasanna	bcece38473	Merge pull request #5 from dhanji/micn/agent-tweaks load AGENTS.md if there	2025-10-16 15:06:14 +11:00