Revert "don't need this"

This reverts commit 93121c18e0.
don't need this
2025-10-22 14:53:25 +11:00 · 2025-10-22 14:30:13 +11:00 · 2025-10-22 14:27:17 +11:00 · 2025-10-22 14:19:00 +11:00 · 2025-10-21 16:00:58 +11:00 · 2025-10-21 14:34:41 +11:00
33 changed files with 4961 additions and 537 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,33 @@
+# Changelog
+
+## [Unreleased]
+
+### Added
+
+**Interactive Requirements Mode**
+- **AI-Enhanced Interactive Requirements**: New `--interactive-requirements` flag for autonomous mode
+  - User enters a brief description of what they want to build
+  - AI automatically enhances the input into a structured `requirements.md` document
+  - Generates professional markdown with:
+    - Project title and overview
+    - Organized requirements (functional, technical, quality)
+    - Acceptance criteria
+  - User can review, accept, edit manually, or cancel before proceeding
+  - Seamlessly transitions to autonomous mode
+
+**Autonomous Mode Configuration**
+- **Coach and Player Models**: Added the ability to specify different models for the coach and player agents in autonomous mode
+  - New `[autonomous]` configuration section in `g3.toml`
+  - `coach_provider` and `coach_model` options for coach agent
+  - `player_provider` and `player_model` options for player agent
+  - `Config::for_coach()` and `Config::for_player()` methods to generate role-specific configurations
+  - Comprehensive test suite for autonomous configuration
+
+### Changed
+- Autonomous mode now uses `config.for_player()` for the player agent
+- Coach agent creation now uses `config.for_coach()`
+
+### Benefits
+- **Cost Optimization**: Use cheaper models for execution, expensive models for review
+- **Speed Optimization**: Use faster models for iteration, thorough models for validation
+- **Specialization**: Leverage different providers' strengths for different roles
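The `Config::for_coach()` and `Config::for_player()` entries in the changelog above might work roughly as follows. This is a minimal sketch, not the code from this commit: the `Config` and `Autonomous` field layouts and the `with_overrides` helper are assumptions made for illustration. Only the option names (`coach_provider`, `coach_model`, `player_provider`, `player_model`) and the fallback to the default provider come from the changelog and README.

```rust
// Hypothetical, simplified config types; the real g3-config structs may differ.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    default_provider: String,
    default_model: String,
    autonomous: Autonomous,
}

// Mirrors the option names of the `[autonomous]` section.
#[derive(Debug, Clone, Default, PartialEq)]
struct Autonomous {
    coach_provider: Option<String>,
    coach_model: Option<String>,
    player_provider: Option<String>,
    player_model: Option<String>,
}

impl Config {
    // Assumed helper: clone the base config and apply whichever overrides are set.
    fn with_overrides(&self, provider: &Option<String>, model: &Option<String>) -> Config {
        let mut cfg = self.clone();
        if let Some(p) = provider {
            cfg.default_provider = p.clone();
        }
        if let Some(m) = model {
            cfg.default_model = m.clone();
        }
        cfg
    }

    // Role-specific config for the coach; falls back to the defaults when unset.
    fn for_coach(&self) -> Config {
        self.with_overrides(&self.autonomous.coach_provider, &self.autonomous.coach_model)
    }

    // Role-specific config for the player; falls back to the defaults when unset.
    fn for_player(&self) -> Config {
        self.with_overrides(&self.autonomous.player_provider, &self.autonomous.player_model)
    }
}
```

With only the coach options set, `for_coach()` would return a config pointing at the coach provider and model, while `for_player()` would return a copy of the base config unchanged.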
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -4,7 +4,8 @@ members = [
    "crates/g3-core", 
    "crates/g3-providers",
    "crates/g3-config",
-    "crates/g3-execution"
+    "crates/g3-execution",
+    "crates/g3-computer-control"
 ]
 resolver = "2"

--- a/DESIGN.md
+++ b/DESIGN.md
@@ -29,7 +29,8 @@ g3/
 │   ├── g3-core/                  # Core agent engine, tools, and streaming logic
 │   ├── g3-providers/             # LLM provider abstractions and implementations
 │   ├── g3-config/                # Configuration management
-│   └── g3-execution/             # Code execution engine
+│   ├── g3-execution/             # Code execution engine
+│   └── g3-computer-control/      # Computer control and automation
 ├── logs/                         # Session logs (auto-created)
 ├── README.md                     # Project documentation
 └── DESIGN.md                     # This design document
@@ -48,6 +49,7 @@ g3/
 │ • Retro TUI     │    │ • Tool system   │    │ • Embedded      │
 │ • Autonomous    │    │ • Streaming     │    │   (llama.cpp)   │
 │   mode          │    │ • Task exec     │    │ • OAuth flow    │
+│                 │    │ • TODO mgmt     │    │                 │
 └─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
@@ -59,7 +61,18 @@ g3/
                    │ • Shell cmds    │    │ • Env overrides │
                    │ • Streaming     │    │ • Provider      │
                    │ • Error hdlg    │    │   settings      │
-                    └─────────────────┘    └─────────────────┘
+                    └─────────────────┘    │ • Computer      │
+                             │              │   control cfg   │
+                             │              └─────────────────┘
+                             │                       │
+                    ┌─────────────────┐             │
+                    │ g3-computer-    │◄────────────┘
+                    │   control       │
+                    │ • Mouse/kbd     │
+                    │ • Screenshots   │
+                    │ • OCR/Tesseract │
+                    │ • Windows/UI    │
+                    └─────────────────┘
 ```

 ## Core Components
@@ -79,6 +92,7 @@ g3/
 - **Streaming Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Session Management**: Automatic session logging with detailed conversation history and token usage
 - **Error Recovery**: Sophisticated error classification and retry logic for recoverable errors
+- **TODO Management**: In-memory TODO list with read/write tools for task tracking

 **Available Tools:**
 - `shell`: Execute shell commands with streaming output
@@ -86,7 +100,15 @@ g3/
 - `write_file`: Create or overwrite files with content
 - `str_replace`: Apply unified diffs to files with precise editing
 - `final_output`: Signal task completion with detailed summaries
-- **Project Management**: Workspace handling, requirements.md processing for autonomous mode
+- `todo_read`: Read the entire TODO list content
+- `todo_write`: Write or overwrite the entire TODO list
+- `mouse_click`: Click the mouse at specific coordinates
+- `type_text`: Type text at the current cursor position
+- `find_element`: Find UI elements by text, role, or attributes
+- `take_screenshot`: Capture screenshots of screen, region, or window
+- `extract_text`: Extract text from images or screen regions using OCR
+- `find_text_on_screen`: Find text visually on screen and return coordinates
+- `list_windows`: List all open windows with IDs and titles

 ### 2. g3-providers: LLM Provider Abstraction

@@ -172,6 +194,26 @@ g3/
 - **Validation**: Configuration validation with helpful error messages
 - **Flexible Paths**: Support for shell expansion (`~`, environment variables)

+### 6. g3-computer-control: Computer Control & Automation
+
+**Primary Responsibilities:**
+- Cross-platform computer control and automation
+- Mouse and keyboard input simulation
+- Window management and screenshot capture
+- OCR text extraction from images and screen regions
+
+**Platform Support:**
+- **macOS**: Core Graphics, Cocoa, screencapture integration
+- **Linux**: X11/Xtest for input, X11 for window management
+- **Windows**: Win32 APIs for input and window control
+
+**Key Features:**
+- **OCR Integration**: Tesseract-based text extraction from images
+- **Window Management**: List, identify, and capture specific application windows
+- **UI Automation**: Find elements, simulate clicks, type text
+- **Screenshot Capture**: Full screen, regions, or specific windows
+- **Accessibility**: Requires OS-level permissions for automation
+
 ## Advanced Features

 ### Context Window Management
@@ -180,6 +222,7 @@ G3 implements sophisticated context window management:

 - **Automatic Monitoring**: Tracks token usage with percentage-based thresholds
 - **Smart Summarization**: Auto-triggers at 80% capacity to prevent context overflow
+- **Context Thinning**: Progressive thinning at the 50%, 60%, 70%, and 80% thresholds, replacing large tool results with file references
 - **Conversation Preservation**: Maintains conversation continuity through intelligent summaries
 - **Provider-Specific Limits**: Adapts to different model context windows (4k to 200k+ tokens)
 - **Cumulative Tracking**: Monitors total token usage across entire sessions
@@ -354,20 +397,23 @@ This design document reflects the current state of G3 as a mature, production-re
 ### Fully Implemented
 - ✅ **Core Agent Engine**: Complete with streaming, tool execution, and context management
 - ✅ **Provider System**: Anthropic, Databricks, and Embedded providers with OAuth support
-- ✅ **Tool System**: All 5 core tools (shell, read_file, write_file, str_replace, final_output)
+- ✅ **Tool System**: 14 tools including file ops, shell, TODO management, and computer control
 - ✅ **CLI Interface**: Interactive mode, single-shot mode, retro TUI
 - ✅ **Autonomous Mode**: Coach-player feedback loop with requirements.md processing
 - ✅ **Configuration**: TOML-based config with environment overrides
 - ✅ **Error Handling**: Comprehensive retry logic and error classification
 - ✅ **Session Logging**: Automatic session tracking and JSON logs
-- ✅ **Context Management**: Auto-summarization at 80% capacity
+- ✅ **Context Management**: Context thinning (50-80%) and auto-summarization at 80% capacity
+- ✅ **Computer Control**: Cross-platform automation with OCR support
+- ✅ **TODO Management**: In-memory TODO list with read/write tools

 ### Architecture Highlights
-- **Workspace**: 5 crates with clear separation of concerns
+- **Workspace**: 6 crates with clear separation of concerns
 - **Dependencies**: Modern Rust ecosystem (Tokio, Clap, Serde, etc.)
 - **Streaming**: Real-time response processing with tool call detection
 - **Cross-Platform**: Works on macOS, Linux, and Windows
-- **GPU Support**: Metal acceleration for local models on macOS
+- **GPU Support**: Metal acceleration for local models on macOS, CUDA on Linux
+- **OCR Support**: Tesseract integration for text extraction from images

 ### Key Files
 - `src/main.rs`: main entry point delegating to g3-cli
@@ -376,3 +422,5 @@ This design document reflects the current state of G3 as a mature, production-re
 - `crates/g3-providers/src/lib.rs`: provider trait and registry
 - `crates/g3-config/src/lib.rs`: configuration management
 - `crates/g3-execution/src/lib.rs`: code execution engine
+- `crates/g3-computer-control/src/lib.rs`: computer control and automation
+- `crates/g3-computer-control/src/platform/`: platform-specific implementations
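The context thinning behaviour described in the DESIGN.md hunks above (progressive thresholds at 50%, 60%, 70%, and 80%, with large tool results replaced by file references) could be sketched like this. The function names, the `last_applied` bookkeeping, and the placeholder text are assumptions made for illustration; only the threshold values and the general idea come from the document.

```rust
/// Thinning thresholds, as percentages of the context window (from DESIGN.md).
const THIN_THRESHOLDS: [u8; 4] = [50, 60, 70, 80];

/// Returns the highest threshold that usage has crossed and that has not
/// been applied yet, or `None` if no new thinning pass is due.
/// `last_applied` is a hypothetical record of the last threshold handled.
fn next_thinning_threshold(
    used_tokens: usize,
    max_tokens: usize,
    last_applied: Option<u8>,
) -> Option<u8> {
    if max_tokens == 0 {
        return None;
    }
    // Integer percentage of the window in use, capped at 100.
    let percent = (used_tokens.saturating_mul(100) / max_tokens).min(100) as u8;
    THIN_THRESHOLDS
        .iter()
        .copied()
        .rev()
        .find(|&t| percent >= t && last_applied.map_or(true, |prev| t > prev))
}

/// Replaces an oversized tool result with a short reference to a file that
/// is assumed to already hold the full output.
fn thin_tool_result(result: &str, saved_path: &str, max_chars: usize) -> String {
    if result.chars().count() > max_chars {
        format!("[tool result moved to {}]", saved_path)
    } else {
        result.to_string()
    }
}
```

In this sketch a session at half capacity triggers the 50% pass once; the next pass only fires after usage crosses 60%.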
--- a/README.md
+++ b/README.md
@@ -2,106 +2,14 @@

 G3 is a coding AI agent designed to help you complete tasks by writing code and executing commands. Built in Rust, it provides a flexible architecture for interacting with various Large Language Model (LLM) providers while offering powerful code generation and task automation capabilities.

-## Architecture Overview
-
-G3 follows a modular architecture organized as a Rust workspace with multiple crates, each responsible for specific functionality:
-
-### Core Components
-
-#### **g3-core**
-The heart of the agent system, containing:
-- **Agent Engine**: Main orchestration logic for handling conversations, tool execution, and task management
-- **Context Window Management**: Intelligent tracking of token usage with auto-summarization capabilities when approaching context limits (~80% capacity)
-- **Tool System**: Built-in tools for file operations (read, write, edit), shell command execution, and structured output generation
-- **Streaming Response Parser**: Real-time parsing of LLM responses with tool call detection and execution
-- **Task Execution**: Support for single and iterative task execution with automatic retry logic
-
-#### **g3-providers**
-Abstraction layer for LLM providers:
-- **Provider Interface**: Common trait-based API for different LLM backends
-- **Multiple Provider Support**: 
-  - Anthropic (Claude models)
-  - Databricks (DBRX and other models)
-  - Local/embedded models via llama.cpp with Metal acceleration on macOS
-- **OAuth Authentication**: Built-in OAuth flow support for secure provider authentication
-- **Provider Registry**: Dynamic provider management and selection
-
-#### **g3-config**
-Configuration management system:
-- Environment-based configuration
-- Provider credentials and settings
-- Model selection and parameters
-- Runtime configuration options
-
-#### **g3-execution**
-Task execution framework:
-- Task planning and decomposition
-- Execution strategies (sequential, parallel)
-- Error handling and retry mechanisms
-- Progress tracking and reporting
-
-#### **g3-cli**
-Command-line interface:
-- Interactive terminal interface
-- Task submission and monitoring
-- Configuration management commands
-- Session management
-
-### Error Handling & Resilience
-
-G3 includes robust error handling with automatic retry logic:
-- **Recoverable Error Detection**: Automatically identifies recoverable errors (rate limits, network issues, server errors, timeouts)
-- **Exponential Backoff with Jitter**: Implements intelligent retry delays to avoid overwhelming services
-- **Detailed Error Logging**: Captures comprehensive error context including stack traces, request/response data, and session information
-- **Error Persistence**: Saves detailed error logs to `logs/errors/` for post-mortem analysis
-- **Graceful Degradation**: Non-recoverable errors are logged with full context before terminating
-
 ## Key Features

-### Intelligent Context Management
-- Automatic context window monitoring with percentage-based tracking
-- Smart auto-summarization when approaching token limits
-- Conversation history preservation through summaries
-- Dynamic token allocation for different providers
-
-### Tool Ecosystem
-- **File Operations**: Read, write, and edit files with line-range precision
-- **Shell Integration**: Execute system commands with output capture
-- **Code Generation**: Structured code generation with syntax awareness
-- **Final Output**: Formatted result presentation
-
-### Provider Flexibility
-- Support for multiple LLM providers through a unified interface
-- Hot-swappable providers without code changes
-- Provider-specific optimizations and feature support
-- Local model support for offline operation
-
-### Task Automation
-- Single-shot task execution for quick operations
-- Iterative task mode for complex, multi-step workflows
-- Automatic error recovery and retry logic
-- Progress tracking and intermediate result handling
-
-## Language & Technology Stack
-
-- **Language**: Rust (2021 edition)
-- **Async Runtime**: Tokio for concurrent operations
-- **HTTP Client**: Reqwest for API communications
-- **Serialization**: Serde for JSON handling
-- **CLI Framework**: Clap for command-line parsing
-- **Logging**: Tracing for structured logging
-- **Local Models**: llama.cpp with Metal acceleration support
-
-## Use Cases
-
-G3 is designed for:
-- Automated code generation and refactoring
-- File manipulation and project scaffolding
-- System administration tasks
-- Data processing and transformation
-- API integration and testing
-- Documentation generation
-- Complex multi-step workflows
+- **Multiple LLM Providers**: Anthropic (Claude), Databricks, OpenAI, and local models via llama.cpp
+- **Autonomous Mode**: Coach-player feedback loop for complex tasks
+- **Intelligent Context Management**: Context thinning at 50%, 60%, 70%, and 80% of capacity, with auto-summarization at 80%
+- **Rich Tool Ecosystem**: File operations, shell commands, computer control, browser automation
+- **Streaming Responses**: Real-time output with tool call detection
+- **Error Recovery**: Automatic retry logic with exponential backoff

 ## Getting Started

@@ -109,21 +17,234 @@ G3 is designed for:
 # Build the project
 cargo build --release

-# Run G3
-cargo run
-
-# Execute a task
+# Execute a single task
 g3 "implement a function to calculate fibonacci numbers"
+
+# Start autonomous mode with interactive requirements
+g3 --autonomous --interactive-requirements
 ```

+## Configuration
+
+Create `~/.config/g3/config.toml`:
+
+```toml
+[providers]
+default_provider = "databricks"
+
+[providers.anthropic]
+api_key = "sk-ant-..."
+model = "claude-3-5-sonnet-20241022"
+max_tokens = 4096
+
+[providers.databricks]
+host = "https://your-workspace.cloud.databricks.com"
+model = "databricks-meta-llama-3-1-70b-instruct"
+max_tokens = 4096
+use_oauth = true
+
+[agent]
+max_context_length = 8192
+enable_streaming = true
+
+# Optional: Use different models for coach and player in autonomous mode
+[autonomous]
+coach_provider = "anthropic"
+coach_model = "claude-3-5-sonnet-20241022"  # Thorough review
+player_provider = "databricks"
+player_model = "databricks-meta-llama-3-1-70b-instruct"  # Fast execution
+```
+
+## Autonomous Mode (Coach-Player Loop)
+
+G3 features an autonomous mode where two agents collaborate:
+- **Player Agent**: Executes tasks and implements solutions
+- **Coach Agent**: Reviews work and provides feedback
+
+### Option 1: Interactive Requirements with AI Enhancement (Recommended)
+
+```bash
+g3 --autonomous --interactive-requirements
+```
+
+**How it works:**
+1. Describe what you want to build (can be brief)
+2. Press **Ctrl+D** (Unix/Mac) or **Ctrl+Z** then **Enter** (Windows) to end input
+3. AI enhances your input into a structured requirements document
+4. Review the enhanced requirements
+5. Choose to proceed, edit manually, or cancel
+6. If accepted, autonomous mode starts automatically
+
+**Example:**
+```
+You type: "build a todo app with cli in python"
+
+AI generates:
+# Todo List CLI Application
+
+## Overview
+A command-line todo list application built in Python...
+
+## Functional Requirements
+1. Add tasks with descriptions
+2. Mark tasks as complete
+3. Delete tasks
+...
+```
+
+### Option 2: Direct Requirements
+
+```bash
+g3 --autonomous --requirements "Build a REST API with CRUD operations for user management"
+```
+
+### Option 3: Requirements File
+
+Create `requirements.md` in your workspace:
+
+```markdown
+# Project Requirements
+
+1. Create a REST API with user endpoints
+2. Use SQLite for storage
+3. Include input validation
+4. Write unit tests
+```
+
+Then run:
+
+```bash
+g3 --autonomous
+```
+
+### Why Different Models for Coach and Player?
+
+Configure different models in the `[autonomous]` section to:
+- **Optimize Cost**: Use a cheaper model for execution and a more expensive one for review
+- **Optimize Speed**: Use a fast model for iteration and a more thorough one for validation
+- **Specialize**: Leverage provider strengths (e.g., Claude for analysis, Llama for code)
+
+If not configured, both agents use the `default_provider` and its model.
+
+## Command-Line Options
+
+```bash
+# Autonomous mode
+g3 --autonomous --interactive-requirements
+g3 --autonomous --requirements "Your requirements"
+g3 --autonomous --max-turns 10
+
+# Single-shot mode
+g3 "your task here"
+
+# Options
+--workspace <DIR>          # Set workspace directory
+--provider <NAME>          # Override provider (anthropic, databricks, openai)
+--model <NAME>             # Override model
+--quiet                    # Disable log files
+--webdriver                # Enable browser automation
+--show-prompt              # Show system prompt
+--show-code                # Show generated code
+```
+
+## Architecture Overview
+
+G3 is organized as a Rust workspace with multiple crates:
+
+- **g3-core**: Agent engine, context management, tool system, streaming parser
+- **g3-providers**: LLM provider abstraction (Anthropic, Databricks, OpenAI, local models)
+- **g3-config**: Configuration management
+- **g3-execution**: Task execution framework
+- **g3-computer-control**: Mouse/keyboard automation, OCR, screenshots
+- **g3-cli**: Command-line interface
+
+### Key Capabilities
+
+**Intelligent Context Management**
+- Automatic context window monitoring with percentage-based tracking
+- Smart auto-summarization when approaching token limits
+- Context thinning at 50%, 60%, 70%, 80% thresholds
+- Dynamic token allocation (4k to 200k+ tokens)
+
+**Tool Ecosystem**
+- File operations (read, write, edit with line-range precision)
+- Shell command execution
+- TODO management
+- Computer control (experimental): mouse, keyboard, OCR, screenshots
+- Browser automation via WebDriver (Safari)
+
+**Error Handling**
+- Automatic retry logic with exponential backoff
+- Recoverable error detection (rate limits, network issues, timeouts)
+- Detailed error logging to `logs/errors/`
+
+## WebDriver Browser Automation
+
+**One-Time Setup** (macOS):
+
+```bash
+# Enable Safari Remote Automation
+safaridriver --enable  # Requires password
+
+# Or via Safari UI:
+# Safari → Preferences → Advanced → Show Develop menu
+# Then: Develop → Allow Remote Automation
+```
+
+**Usage**:
+
+```bash
+g3 --webdriver "scrape the top stories from Hacker News"
+```
+
+See [docs/webdriver-setup.md](docs/webdriver-setup.md) for detailed setup.
+
+## Computer Control (Experimental)
+
+Enable in config:
+
+```toml
+[computer_control]
+enabled = true
+require_confirmation = true
+```
+
+Grant accessibility permissions:
+- **macOS**: System Preferences → Security & Privacy → Accessibility
+- **Linux**: Ensure X11 or Wayland access
+- **Windows**: Run as administrator (first time)
+
+**Available Tools**: `mouse_click`, `type_text`, `find_element`, `take_screenshot`, `extract_text`, `find_text_on_screen`, `list_windows`
+
+## Use Cases
+
+- Automated code generation and refactoring
+- File manipulation and project scaffolding
+- System administration tasks
+- Data processing and transformation
+- API integration and testing
+- Documentation generation
+- Complex multi-step workflows
+- Desktop application automation
+
 ## Session Logs

-G3 automatically saves session logs for each interaction in the `logs/` directory. These logs contain:
+G3 automatically saves session logs to the `logs/` directory. Each log contains:
 - Complete conversation history
 - Token usage statistics
 - Timestamps and session status

-The `logs/` directory is created automatically on first use and is excluded from version control.
+Disable session logging with the `--quiet` flag.
+
+## Technology Stack
+
+- **Language**: Rust (2021 edition)
+- **Async Runtime**: Tokio
+- **HTTP Client**: Reqwest
+- **Serialization**: Serde
+- **CLI Framework**: Clap
+- **Logging**: Tracing
+- **Local Models**: llama.cpp with Metal acceleration

 ## License

@@ -131,4 +252,4 @@ MIT License - see LICENSE file for details

 ## Contributing

-G3 is an open-source project. Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
+Contributions welcome! Please see CONTRIBUTING.md for guidelines.
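The coach-player loop that the README diff above describes (the player implements, the coach reviews, repeated up to `--max-turns`) can be sketched as a small driver. This is an illustration under stated assumptions, not the g3 implementation: `run_coach_player_loop` and the closure-based agents are hypothetical, and treating `IMPLEMENTATION_APPROVED` as the approval marker is inferred from the feedback-length check in `crates/g3-cli/src/lib.rs`.

```rust
/// Runs up to `max_turns` rounds. Each round the player produces work
/// (given the previous coach feedback, if any) and the coach reviews it.
/// Returns the turn on which the coach approved, or `None` if it never did.
fn run_coach_player_loop<P, C>(max_turns: usize, mut player: P, mut coach: C) -> Option<usize>
where
    P: FnMut(Option<&str>) -> String, // hypothetical player: feedback in, work out
    C: FnMut(&str) -> String,         // hypothetical coach: work in, feedback out
{
    let mut feedback: Option<String> = None;
    for turn in 1..=max_turns {
        let work = player(feedback.as_deref());
        let review = coach(&work);
        // Assumed approval marker, taken from the string checked in lib.rs.
        if review.contains("IMPLEMENTATION_APPROVED") {
            return Some(turn);
        }
        feedback = Some(review);
    }
    None
}
```

Passing closures keeps the sketch independent of any provider; in practice each closure would wrap an agent built from the role-specific configuration.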
--- a/config.example.toml
+++ b/config.example.toml
@@ -13,3 +13,8 @@ use_oauth = true
 max_context_length = 8192
 enable_streaming = true
 timeout_seconds = 60
+
+[computer_control]
+enabled = false  # Set to true to enable computer control (requires OS permissions)
+require_confirmation = true
+max_actions_per_second = 5
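The diff does not show how `max_actions_per_second` is enforced. One plausible reading is that it caps the rate of simulated input actions. The sketch below shows that interpretation only; `min_action_interval` and `ActionThrottle` are hypothetical names and are not part of g3-computer-control.

```rust
use std::time::{Duration, Instant};

/// Minimum spacing between actions for a given rate cap.
/// Returns `None` for a cap of 0, treated here as "no actions allowed".
fn min_action_interval(max_actions_per_second: u32) -> Option<Duration> {
    if max_actions_per_second == 0 {
        None
    } else {
        Some(Duration::from_secs(1) / max_actions_per_second)
    }
}

/// Hypothetical throttle that tracks when the last action ran.
struct ActionThrottle {
    interval: Duration,
    last_action: Option<Instant>,
}

impl ActionThrottle {
    fn new(interval: Duration) -> Self {
        ActionThrottle { interval, last_action: None }
    }

    /// How long the caller should wait at `now` before the next action.
    fn delay_before_next(&self, now: Instant) -> Duration {
        match self.last_action {
            Some(prev) => self.interval.saturating_sub(now.duration_since(prev)),
            None => Duration::ZERO,
        }
    }

    /// Records that an action was performed at `now`.
    fn record(&mut self, now: Instant) {
        self.last_action = Some(now);
    }
}
```

With the example value of 5, the interval works out to 200 ms between actions. Taking `now` as a parameter keeps the throttle testable without sleeping.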
--- a/crates/g3-cli/src/lib.rs
+++ b/crates/g3-cli/src/lib.rs
@@ -1,7 +1,5 @@
 use anyhow::Result;
 use std::time::{Duration, Instant};
-/// Extract coach feedback by reading from the coach agent's specific log file
-/// Uses the coach agent's session ID to find the exact log file

 #[derive(Debug, Clone)]
 struct TurnMetrics {
@@ -21,7 +19,7 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    // Find max values for scaling
    let max_tokens = turn_metrics.iter().map(|t| t.tokens_used).max().unwrap_or(1);
    let max_time_ms = turn_metrics.iter()
-        .map(|t| t.wall_clock_time.as_millis() as u32)
+        .map(|t| t.wall_clock_time.as_millis().min(u32::MAX as u128) as u32)
        .max()
        .unwrap_or(1);
    
@@ -35,7 +33,7 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    histogram.push_str(&format!("   {} = Wall Clock Time (max: {:.1}s)\n\n", TIME_CHAR, max_time_ms as f64 / 1000.0));
    
    for metrics in turn_metrics {
-        let turn_time_ms = metrics.wall_clock_time.as_millis() as u32;
+        let turn_time_ms = metrics.wall_clock_time.as_millis().min(u32::MAX as u128) as u32;
        
        // Calculate bar lengths (proportional to max values)
        let token_bar_len = if max_tokens > 0 {
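The two hunks above replace a plain `as u32` cast with a clamp. `Duration::as_millis()` returns a `u128`, and `as u32` keeps only the low 32 bits, so a very long turn would wrap around instead of saturating. The standalone sketch below compares the old cast, the clamp used in this commit, and an equivalent `u32::try_from` spelling; the function names are made up for illustration.

```rust
use std::convert::TryFrom;
use std::time::Duration;

/// The old behaviour: silently keeps only the low 32 bits.
fn millis_truncating(d: Duration) -> u32 {
    d.as_millis() as u32
}

/// The behaviour introduced in this commit: clamp before casting.
fn millis_clamped(d: Duration) -> u32 {
    d.as_millis().min(u32::MAX as u128) as u32
}

/// An equivalent spelling using a checked conversion.
fn millis_try_from(d: Duration) -> u32 {
    u32::try_from(d.as_millis()).unwrap_or(u32::MAX)
}
```

For a duration of exactly 2^32 ms (about 49.7 days) the truncating cast yields 0, while both saturating variants yield `u32::MAX`.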
@@ -99,18 +97,22 @@ fn generate_turn_histogram(turn_metrics: &[TurnMetrics]) -> String {
    histogram
 }

-fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_agent: &g3_core::Agent<ConsoleUiWriter>, output: &SimpleOutput) -> Result<String> {
-    // CORRECT APPROACH: Get the session ID from the current coach agent
-    // and read its specific log file directly
-    
+/// Extract coach feedback by reading from the coach agent's specific log file
+/// Uses the coach agent's session ID to find the exact log file
+fn extract_coach_feedback_from_logs(
+    coach_result: &g3_core::TaskResult,
+    coach_agent: &g3_core::Agent<ConsoleUiWriter>,
+    output: &SimpleOutput,
+) -> String {
    // Get the coach agent's session ID
-    let session_id = coach_agent.get_session_id()
-        .ok_or_else(|| anyhow::anyhow!("Coach agent has no session ID"))?;
-    
+    let session_id = coach_agent
+        .get_session_id()
+        .expect("Coach agent has no session ID");
+
    // Construct the log file path for this specific coach session
    let logs_dir = std::path::Path::new("logs");
    let log_file_path = logs_dir.join(format!("g3_session_{}.json", session_id));
-    
+
    // Read the coach agent's specific log file
    if log_file_path.exists() {
        if let Ok(log_content) = std::fs::read_to_string(&log_file_path) {
@@ -118,12 +120,75 @@ fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_a
                if let Some(context_window) = log_json.get("context_window") {
                    if let Some(conversation_history) = context_window.get("conversation_history") {
                        if let Some(messages) = conversation_history.as_array() {
-                            // Simply get the last message content - this is the coach's final feedback
-                            if let Some(last_message) = messages.last() {
-                                if let Some(content) = last_message.get("content") {
-                                    if let Some(content_str) = content.as_str() {
-                                        output.print(&format!("✅ Extracted coach feedback from session: {}", session_id));
-                                        return Ok(content_str.to_string());
+                            // Look for the last assistant message (regardless of tool used)
+                            for message in messages.iter().rev() {
+                                if let Some(role) = message.get("role") {
+                                    if role.as_str() == Some("assistant") {
+                                        if let Some(content) = message.get("content") {
+                                            if let Some(content_str) = content.as_str() {
+                                                // First, check if this is plain text feedback (no tool call)
+                                                // This happens when the coach returns final feedback directly
+                                                if !content_str.contains("{\"tool\"") {
+                                                    let trimmed = content_str.trim();
+                                                    if !trimmed.is_empty() {
+                                                        output.print(&format!(
+                                                            "✅ Extracted coach feedback from session: {} ({} chars) [plain text]",
+                                                            session_id,
+                                                            trimmed.len()
+                                                        ));
+                                                        return trimmed.to_string();
+                                                    }
+                                                }
+                                                
+                                                // Look for ANY tool call in the message
+                                                // Pattern: {"tool": "...", "args": {...}}
+                                                if let Some(tool_start) = content_str.find("{\"tool\"") {
+                                                    let json_part = &content_str[tool_start..];
+                                                    
+                                                    // Find the end of the JSON object
+                                                    if let Some(json_end) = find_json_end(json_part) {
+                                                        let json_str = &json_part[..json_end];
+                                                        
+                                                        if let Ok(tool_call) = serde_json::from_str::<serde_json::Value>(json_str) {
+                                                            if let Some(args) = tool_call.get("args") {
+                                                                // Try to extract feedback from different possible fields
+                                                                let feedback = if let Some(summary) = args.get("summary") {
+                                                                    // final_output tool uses "summary"
+                                                                    summary.as_str().map(|s| s.to_string())
+                                                                } else if let Some(content) = args.get("content") {
+                                                                    // todo_write and other tools might use "content"
+                                                                    content.as_str().map(|s| s.to_string())
+                                                                } else {
+                                                                    // Fallback: use the entire args as JSON string
+                                                                    Some(serde_json::to_string_pretty(args).unwrap_or_default())
+                                                                };
+                                                                
+                                                                if let Some(feedback_str) = feedback {
+                                                                    if !feedback_str.trim().is_empty() {
+                                                                        output.print(&format!(
+                                                                            "✅ Extracted coach feedback from session: {} ({} chars)",
+                                                                            session_id,
+                                                                            feedback_str.len()
+                                                                        ));
+                                                                        
+                                                                        // Validate feedback length
+                                                                        if feedback_str.len() < 80 && !feedback_str.contains("IMPLEMENTATION_APPROVED") {
+                                                                            panic!(
+                                                                                "Coach feedback is too short ({} chars): '{}'",
+                                                                                feedback_str.len(),
+                                                                                feedback_str
+                                                                            );
+                                                                        }
+                                                                        
+                                                                        return feedback_str;
+                                                                    }
+                                                                }
+                                                            }
+                                                        }
+                                                    }
+                                                }
+                                            }
+                                        }
                                    }
                                }
                            }
@@ -133,8 +198,48 @@ fn extract_coach_feedback_from_logs(_coach_result: &g3_core::TaskResult, coach_a
            }
        }
    }
+
+    // If we couldn't extract from logs, panic with detailed error
+    panic!(
+        "CRITICAL: Could not extract coach feedback from session: {}\n\
+         Log file path: {:?}\n\
+         Log file exists: {}\n\
+         This indicates the coach did not call any tool or the log is corrupted.\n\
+         Coach result response length: {} chars",
+        session_id,
+        log_file_path,
+        log_file_path.exists(),
+        coach_result.response.len()
+    );
+}
+
+/// Returns the byte offset just past the first complete top-level JSON object in `json_str`, found by counting braces outside string literals
+fn find_json_end(json_str: &str) -> Option<usize> {
+    let mut depth = 0;
+    let mut in_string = false;
+    let mut escape_next = false;
    
-    Err(anyhow::anyhow!("Could not extract feedback from coach session: {}", session_id))
+    for (i, ch) in json_str.char_indices() {
+        if escape_next {
+            escape_next = false;
+            continue;
+        }
+        
+        match ch {
+            '\\' if in_string => escape_next = true,
+            '"' => in_string = !in_string,
+            '{' if !in_string => depth += 1,
+            '}' if !in_string => {
+                depth -= 1;
+                if depth == 0 {
+                    return Some(i + 1);
+                }
+            }
+            _ => {}
+        }
+    }
+    
+    None
 }

 use clap::Parser;
@@ -197,6 +302,10 @@ pub struct Cli {
    #[arg(long, value_name = "TEXT")]
    pub requirements: Option<String>,

+    /// Interactive mode: prompt for requirements and save to requirements.md before starting autonomous mode
+    #[arg(long)]
+    pub interactive_requirements: bool,
+
    /// Use retro terminal UI (inspired by 80s sci-fi)
    #[arg(long)]
    pub retro: bool,
@@ -216,6 +325,10 @@ pub struct Cli {
    /// Disable log file creation (no logs/ directory or session logs)
    #[arg(long)]
    pub quiet: bool,
+
+    /// Enable WebDriver tools for browser automation (Safari)
+    #[arg(long)]
+    pub webdriver: bool,
 }

 pub async fn run() -> Result<()> {
@@ -284,6 +397,113 @@ pub async fn run() -> Result<()> {

    // Create project model
    let project = if cli.autonomous {
+        // Handle interactive requirements mode with AI enhancement
+        if cli.interactive_requirements {
+            println!("\n📝 Interactive Requirements Mode");
+            println!("================================\n");
+            println!("Describe what you want to build (can be brief):");
+            println!("Press Ctrl+D (Unix) or Ctrl+Z then Enter (Windows) when done.\n");
+            
+            use std::io::{self, Read, Write};
+            let mut requirements_input = String::new();
+            io::stdin().read_to_string(&mut requirements_input)?;
+            
+            if requirements_input.trim().is_empty() {
+                anyhow::bail!("No requirements provided. Exiting.");
+            }
+            
+            println!("\n🤖 Enhancing your requirements with AI...\n");
+            
+            // Create a temporary agent to enhance the requirements
+            let temp_config = Config::load_with_overrides(
+                cli.config.as_deref(),
+                cli.provider.clone(),
+                cli.model.clone(),
+            )?;
+            
+            // Create a simple output writer for the enhancement task
+            let ui_writer = ConsoleUiWriter::new();
+            let mut temp_agent = Agent::new_with_readme_and_quiet(
+                temp_config,
+                ui_writer,
+                None,
+                true, // quiet mode for enhancement
+            ).await?;
+            
+            // Create enhancement prompt
+            let enhancement_prompt = format!(
+                r#"Convert the following user input into a well-structured requirements.md document.
+
+User Input:
+{}
+
+Create a professional requirements document with:
+1. A clear project title (# heading)
+2. An overview section explaining what will be built
+3. Organized requirements (functional, technical, quality)
+4. Acceptance criteria
+5. Any technical constraints or preferences mentioned
+
+Format as proper markdown. Be specific and actionable. If the user's input is vague, make reasonable assumptions but keep it focused on what they described.
+
+Output ONLY the markdown content, no explanations or meta-commentary."#,
+                requirements_input.trim()
+            );
+            
+            // Execute enhancement task
+            let result = temp_agent
+                .execute_task_with_timing(&enhancement_prompt, None, false, false, false, false)
+                .await?;
+            
+            let enhanced_requirements = result.response.trim().to_string();
+            
+            // Show the enhanced requirements
+            println!("\n📋 Enhanced Requirements Document:");
+            println!("{}\n", "=".repeat(60));
+            println!("{}", enhanced_requirements);
+            println!("{}\n", "=".repeat(60));
+            
+            // Ask for confirmation
+            println!("\n❓ Is this requirements document acceptable?");
+            println!("   [y] Yes, proceed with autonomous mode");
+            println!("   [e] Edit and save manually");
+            println!("   [n] No, cancel\n");
+            
+            print!("Your choice (y/e/n): ");
+            io::stdout().flush()?;
+            
+            let mut choice = String::new();
+            io::stdin().read_line(&mut choice)?;
+            let choice = choice.trim().to_lowercase();
+            
+            let requirements_path = workspace_dir.join("requirements.md");
+            
+            match choice.as_str() {
+                "y" | "yes" => {
+                    // Save enhanced requirements
+                    std::fs::write(&requirements_path, &enhanced_requirements)?;
+                    println!("\n✅ Requirements saved to: {}", requirements_path.display());
+                    println!("🚀 Starting autonomous mode...\n");
+                }
+                "e" | "edit" => {
+                    // Save enhanced requirements for manual editing
+                    std::fs::write(&requirements_path, &enhanced_requirements)?;
+                    println!("\n✅ Requirements saved to: {}", requirements_path.display());
+                    println!("📝 Please edit the file and run: g3 --autonomous");
+                    println!("   Exiting for now.\n");
+                    return Ok(());
+                }
+                "n" | "no" => {
+                    println!("\n❌ Cancelled. No files were saved.\n");
+                    return Ok(());
+                }
+                _ => {
+                    println!("\n❌ Invalid choice. Cancelled.\n");
+                    return Ok(());
+                }
+            }
+        }
+        
        if let Some(requirements_text) = cli.requirements {
            // Use requirements text override
            Project::new_autonomous_with_requirements(workspace_dir.clone(), requirements_text)?
@@ -304,19 +524,25 @@ pub async fn run() -> Result<()> {
    }

    // Load configuration with CLI overrides
-    let config = Config::load_with_overrides(
+    let mut config = Config::load_with_overrides(
        cli.config.as_deref(),
        cli.provider.clone(),
        cli.model.clone(),
    )?;
-    
+
+    // Override webdriver setting from CLI flag
+    if cli.webdriver {
+        config.webdriver.enabled = true;
+    }
+
    // Validate provider if specified
    if let Some(ref provider) = cli.provider {
        let valid_providers = ["anthropic", "databricks", "embedded", "openai"];
        if !valid_providers.contains(&provider.as_str()) {
            return Err(anyhow::anyhow!(
-                "Invalid provider '{}'. Valid options: {:?}", 
-                provider, valid_providers
+                "Invalid provider '{}'. Valid options: {:?}",
+                provider,
+                valid_providers
            ));
        }
    }
@@ -335,9 +561,22 @@ pub async fn run() -> Result<()> {
    };
    
    let mut agent = if cli.autonomous {
-        Agent::new_autonomous_with_readme_and_quiet(config.clone(), ui_writer, combined_content.clone(), cli.quiet).await?
+        Agent::new_autonomous_with_readme_and_quiet(
+            // Use player-specific config in autonomous mode
+            config.for_player()?,
+            ui_writer,
+            combined_content.clone(),
+            cli.quiet,
+        )
+        .await?
    } else {
-        Agent::new_with_readme_and_quiet(config.clone(), ui_writer, combined_content.clone(), cli.quiet).await?
+        Agent::new_with_readme_and_quiet(
+            config.clone(),
+            ui_writer,
+            combined_content.clone(),
+            cli.quiet,
+        )
+        .await?
    };

    // Execute task, autonomous mode, or start interactive mode
@@ -374,7 +613,7 @@ pub async fn run() -> Result<()> {
        if cli.retro {
            // Use retro terminal UI
            run_interactive_retro(
-                config,  // Already has overrides applied
+                config, // Already has overrides applied
                cli.show_prompt,
                cli.show_code,
                cli.theme,
@@ -1119,7 +1358,10 @@ async fn run_autonomous(
        output.print("❌ Error: requirements.md not found in workspace directory");
        output.print("   Please either:");
        output.print("   1. Create a requirements.md file with your project requirements at:");
-        output.print(&format!("      {}/requirements.md", project.workspace().display()));
+        output.print(&format!(
+            "      {}/requirements.md",
+            project.workspace().display()
+        ));
        output.print("   2. Or use the --requirements flag to provide requirements text directly:");
        output.print("      g3 --autonomous --requirements \"Your requirements here\"");
        output.print("");
@@ -1228,6 +1470,10 @@ async fn run_autonomous(
    loop {
        let turn_start_time = Instant::now();
        let turn_start_tokens = agent.get_context_window().used_tokens;
+        
+        // Reset filter suppression state at the start of each turn
+        g3_core::fixed_filter_json::reset_fixed_json_tool_state();
+        
        // Skip player turn if it's the first turn and implementation files exist
        if !(turn == 1 && skip_first_player) {
            output.print(&format!(
@@ -1254,11 +1500,17 @@ async fn run_autonomous(
            // If there's no coach feedback on subsequent turns, this is an error
            if coach_feedback.is_empty() {
                if turn > 1 {
-                    return Err(anyhow::anyhow!("Player mode error: No coach feedback received on turn {}", turn));
+                    return Err(anyhow::anyhow!(
+                        "Player mode error: No coach feedback received on turn {}",
+                        turn
+                    ));
                }
                output.print("📋 Player starting initial implementation (no prior coach feedback)");
            } else {
-                output.print(&format!("📋 Player received coach feedback ({} chars):", coach_feedback.len()));
+                output.print(&format!(
+                    "📋 Player received coach feedback ({} chars):",
+                    coach_feedback.len()
+                ));
                output.print(&format!("{}", coach_feedback));
            }
            output.print(""); // Empty line for readability
@@ -1356,7 +1608,7 @@ async fn run_autonomous(
                ));
                // Record turn metrics before incrementing
                let turn_duration = turn_start_time.elapsed();
-                let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+                let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
                turn_metrics.push(TurnMetrics {
                    turn_number: turn,
                    tokens_used: turn_tokens,
@@ -1382,9 +1634,15 @@ async fn run_autonomous(

        // Create a new agent instance for coach mode to ensure fresh context
        // Use the same config with overrides that was passed to the player agent
-        let config = agent.get_config().clone();
+        let base_config = agent.get_config().clone();
+        let coach_config = base_config.for_coach()?;
+        
+        // Reset filter suppression state before creating coach agent
+        g3_core::fixed_filter_json::reset_fixed_json_tool_state();
+        
        let ui_writer = ConsoleUiWriter::new();
-        let mut coach_agent = Agent::new_autonomous_with_readme_and_quiet(config, ui_writer, None, quiet).await?;
+        let mut coach_agent =
+            Agent::new_autonomous_with_readme_and_quiet(coach_config, ui_writer, None, quiet).await?;

        // Ensure coach agent is also in the workspace directory
        project.enter_workspace()?;
@@ -1414,13 +1672,13 @@ CRITICAL INSTRUCTIONS:
 3. Focus ONLY on what needs to be fixed or improved
 4. Do NOT include your analysis process, file contents, or compilation output in the summary

-If the implementation correctly meets all requirements and compiles without errors:
+If the implementation generally meets the requirements and compiles without errors:
 - Call final_output with summary: 'IMPLEMENTATION_APPROVED'

 If improvements are needed:
 - Call final_output with a brief summary listing ONLY the specific issues to fix

-Remember: Be thorough in your review but concise in your feedback. APPROVE if the implementation works and generally fits the requirements.",
+Remember: Be clear in your review and concise in your feedback. APPROVE if the implementation works and generally fits the requirements. Don't be picky.",
            requirements
        );

@@ -1511,7 +1769,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
            coach_feedback = "The implementation needs review. Please ensure all requirements are met and the code compiles without errors.".to_string();
            // Record turn metrics before incrementing
            let turn_duration = turn_start_time.elapsed();
-            let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+            let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
            turn_metrics.push(TurnMetrics {
                turn_number: turn,
                tokens_used: turn_tokens,
@@ -1531,7 +1789,8 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
        let coach_result = coach_result_opt.unwrap();

        // Extract the complete coach feedback from final_output
-        let coach_feedback_text = extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output)?;
+        let coach_feedback_text =
+            extract_coach_feedback_from_logs(&coach_result, &coach_agent, &output);

        // Log the size of the feedback for debugging
        info!(
@@ -1546,7 +1805,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
            coach_feedback = "The implementation needs review. Please ensure all requirements are met and the code compiles without errors.".to_string();
            // Record turn metrics before incrementing
            let turn_duration = turn_start_time.elapsed();
-            let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
+            let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
            turn_metrics.push(TurnMetrics {
                turn_number: turn,
                tokens_used: turn_tokens,
@@ -1558,6 +1817,15 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th

        output.print_smart(&format!("Coach feedback:\n{}", coach_feedback_text));

+        // Record turn metrics before checking for approval or max turns
+        let turn_duration = turn_start_time.elapsed();
+        let turn_tokens = agent.get_context_window().used_tokens.saturating_sub(turn_start_tokens);
+        turn_metrics.push(TurnMetrics {
+            turn_number: turn,
+            tokens_used: turn_tokens,
+            wall_clock_time: turn_duration,
+        });
+
        // Check if coach approved the implementation
        if coach_result.is_approved() || coach_feedback_text.contains("IMPLEMENTATION_APPROVED") {
            output.print("\n=== SESSION COMPLETED - IMPLEMENTATION APPROVED ===");
@@ -1566,6 +1834,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th
            break;
        }

+        // Metrics for this turn are already recorded above; `turn` is incremented only after the max-turns check
        // Check if we've reached max turns
        if turn >= max_turns {
            output.print("\n=== SESSION COMPLETED - MAX TURNS REACHED ===");
@@ -1575,14 +1844,7 @@ Remember: Be thorough in your review but concise in your feedback. APPROVE if th

        // Store coach feedback for next iteration
        coach_feedback = coach_feedback_text;
-        // Record turn metrics before incrementing
-        let turn_duration = turn_start_time.elapsed();
-        let turn_tokens = agent.get_context_window().used_tokens - turn_start_tokens;
-        turn_metrics.push(TurnMetrics {
-            turn_number: turn,
-            tokens_used: turn_tokens,
-            wall_clock_time: turn_duration,
-        });
+        
        turn += 1;

        output.print("🔄 Coach provided feedback for next iteration");
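
The brace-counting approach used by `find_json_end` in the diff above can be checked in isolation. The following is a minimal sketch that reproduces the same logic as a standalone function, assuming `depth` is a signed integer (the diff leaves its type to inference); the sample inputs in the usage note are made up for illustration and are not taken from real coach logs.

```rust
/// Returns the byte offset just past the first complete top-level JSON
/// object in `json_str`, or `None` if the braces never balance.
fn find_json_end(json_str: &str) -> Option<usize> {
    // Signed on purpose: a stray `}` before any `{` drives depth negative
    // instead of overflowing.
    let mut depth: i32 = 0;
    let mut in_string = false;
    let mut escape_next = false;

    for (i, ch) in json_str.char_indices() {
        // The character after a backslash inside a string is always skipped,
        // so an escaped quote does not end the string.
        if escape_next {
            escape_next = false;
            continue;
        }

        match ch {
            '\\' if in_string => escape_next = true,
            '"' => in_string = !in_string,
            '{' if !in_string => depth += 1,
            '}' if !in_string => {
                depth -= 1;
                if depth == 0 {
                    // `}` is one byte wide, so `i + 1` is the exclusive end.
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }

    None
}
```

For example, `find_json_end(r#"{"a": 1} trailing"#)` yields `Some(8)`, so `&input[..8]` is the object itself. Braces inside string values are ignored, and an unterminated object yields `None`. One property worth noting: input with a stray `}` ahead of the first `{` never returns to depth zero, so callers should slice from the first `{` before calling.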
--- a/crates/g3-cli/src/ui_writer_impl.rs
+++ b/crates/g3-cli/src/ui_writer_impl.rs
@@ -10,6 +10,7 @@ pub struct ConsoleUiWriter {
    current_tool_args: Mutex<Vec<(String, String)>>,
    current_output_line: Mutex<Option<String>>,
    output_line_printed: Mutex<bool>,
+    in_todo_tool: Mutex<bool>,
 }

 impl ConsoleUiWriter {
@@ -19,6 +20,60 @@ impl ConsoleUiWriter {
            current_tool_args: Mutex::new(Vec::new()),
            current_output_line: Mutex::new(None),
            output_line_printed: Mutex::new(false),
+            in_todo_tool: Mutex::new(false),
+        }
+    }
+
+    fn print_todo_line(&self, line: &str) {
+        // Transform and print todo list lines elegantly
+        let trimmed = line.trim();
+        
+        // Skip the "📝 TODO list:" prefix line and the empty-list message
+        if trimmed.starts_with("📝 TODO list:") || trimmed == "📝 TODO list is empty" {
+            return;
+        }
+        
+        // Handle empty lines
+        if trimmed.is_empty() {
+            println!();
+            return;
+        }
+        
+        // Detect indentation level
+        let indent_count = line.chars().take_while(|c| c.is_whitespace()).count();
+        let indent = "  ".repeat(indent_count / 2); // Convert spaces to visual indent
+        
+        // Format based on line type
+        if trimmed.starts_with("- [ ]") {
+            // Incomplete task
+            let task = trimmed.strip_prefix("- [ ]").unwrap_or(trimmed).trim();
+            println!("{}☐ {}", indent, task);
+        } else if trimmed.starts_with("- [x]") || trimmed.starts_with("- [X]") {
+            // Completed task
+            let task = trimmed.strip_prefix("- [x]")
+                .or_else(|| trimmed.strip_prefix("- [X]"))
+                .unwrap_or(trimmed)
+                .trim();
+            println!("{}\x1b[2m☑ {}\x1b[0m", indent, task);
+        } else if trimmed.starts_with("- ") {
+            // Regular bullet point
+            let item = trimmed.strip_prefix("- ").unwrap_or(trimmed).trim();
+            println!("{}• {}", indent, item);
+        } else if trimmed.starts_with("# ") {
+            // Heading
+            let heading = trimmed.strip_prefix("# ").unwrap_or(trimmed).trim();
+            println!("\n\x1b[1m{}\x1b[0m", heading);
+        } else if trimmed.starts_with("## ") {
+            // Subheading
+            let subheading = trimmed.strip_prefix("## ").unwrap_or(trimmed).trim();
+            println!("\n\x1b[1m{}\x1b[0m", subheading);
+        } else if trimmed.starts_with("**") && trimmed.ends_with("**") {
+            // Bold text (section marker)
+            let text = trimmed.trim_start_matches("**").trim_end_matches("**");
+            println!("{}\x1b[1m{}\x1b[0m", indent, text);
+        } else {
+            // Regular text or note
+            println!("{}{}", indent, trimmed);
        }
    }
 }
@@ -53,6 +108,15 @@ impl UiWriter for ConsoleUiWriter {
        // Store the tool name and clear args for collection
        *self.current_tool_name.lock().unwrap() = Some(tool_name.to_string());
        self.current_tool_args.lock().unwrap().clear();
+        
+        // Check if this is a todo tool call
+        let is_todo = tool_name == "todo_read" || tool_name == "todo_write";
+        *self.in_todo_tool.lock().unwrap() = is_todo;
+        
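+        // Todo tools render without the normal header: the flag set above is
+        // checked by print_tool_output_header, print_tool_output_line,
+        // print_tool_output_summary and print_tool_timing, so nothing more
+        // needs to happen here.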
    }

    fn print_tool_arg(&self, key: &str, value: &str) {
@@ -75,6 +139,12 @@ impl UiWriter for ConsoleUiWriter {
    }

    fn print_tool_output_header(&self) {
+        // Skip normal header for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            println!(); // Just add a newline
+            return;
+        }
+        
        println!();
        // Now print the tool header with the most important arg in bold green
        if let Some(tool_name) = self.current_tool_name.lock().unwrap().as_ref() {
@@ -115,8 +185,8 @@ impl UiWriter for ConsoleUiWriter {
                    String::new()
                };

-                // Print with bold green formatting using ANSI escape codes
-                println!("┌─\x1b[1;32m {} | {}{}\x1b[0m", tool_name, display_value, header_suffix);
+                // Print with bold green tool name, purple (non-bold) for pipe and args
+                println!("┌─\x1b[1;32m {}\x1b[0m\x1b[35m | {}{}\x1b[0m", tool_name, display_value, header_suffix);
            } else {
                // Print with bold green formatting using ANSI escape codes
                println!("┌─\x1b[1;32m {}\x1b[0m", tool_name);
@@ -144,10 +214,21 @@ impl UiWriter for ConsoleUiWriter {
    }

    fn print_tool_output_line(&self, line: &str) {
+        // Special handling for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            self.print_todo_line(line);
+            return;
+        }
+        
        println!("│ \x1b[2m{}\x1b[0m", line);
    }

    fn print_tool_output_summary(&self, count: usize) {
+        // Skip for todo tools
+        if *self.in_todo_tool.lock().unwrap() {
+            return;
+        }
+        
        println!(
            "│ \x1b[2m({} line{})\x1b[0m",
            count,
@@ -156,7 +237,55 @@ impl UiWriter for ConsoleUiWriter {
    }

    fn print_tool_timing(&self, duration_str: &str) {
-        println!("└─ ⚡️ {}", duration_str);
+        // For todo tools, print only a blank line and clear the todo flag
+        if *self.in_todo_tool.lock().unwrap() {
+            println!();
+            *self.in_todo_tool.lock().unwrap() = false;
+            return;
+        }
+        
+        // Parse the duration string to determine color
+        // Format is like "1.5s", "500ms", "2m 30.0s"
+        let color_code = if duration_str.ends_with("ms") {
+            // Milliseconds - use default color (< 1s)
+            ""
+        } else if duration_str.contains('m') {
+            // Contains minutes
+            // Extract minutes value
+            if let Some(m_pos) = duration_str.find('m') {
+                if let Ok(minutes) = duration_str[..m_pos].trim().parse::<u32>() {
+                    if minutes >= 5 {
+                        "\x1b[31m" // Red for >= 5 minutes
+                    } else {
+                        "\x1b[38;5;208m" // Orange for >= 1 minute but < 5 minutes
+                    }
+                } else {
+                    "" // Default color if parsing fails
+                }
+            } else {
+                "" // Default color if 'm' not found (shouldn't happen)
+            }
+        } else if duration_str.ends_with('s') {
+            // Seconds only
+            if let Some(s_value) = duration_str.strip_suffix('s') {
+                if let Ok(seconds) = s_value.trim().parse::<f64>() {
+                    if seconds >= 1.0 {
+                        "\x1b[33m" // Yellow for >= 1 second
+                    } else {
+                        "" // Default color for < 1 second
+                    }
+                } else {
+                    "" // Default color if parsing fails
+                }
+            } else {
+                "" // Default color
+            }
+        } else {
+            // Unrecognized format - use default color
+            ""
+        };
+
+        println!("└─ ⚡️ {}{}\x1b[0m", color_code, duration_str);
        println!();
        // Clear the stored tool info
        *self.current_tool_name.lock().unwrap() = None;
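
The duration-to-color decision in `print_tool_timing` above is easier to test when pulled out of the printing code. A minimal sketch, assuming the same input formats the diff's comment lists ("500ms", "1.5s", "2m 30.0s"); the function name `timing_color` is hypothetical and does not exist in the codebase.

```rust
/// Picks an ANSI color prefix for a formatted duration string.
/// Returns an empty string when the default terminal color should be used.
fn timing_color(duration_str: &str) -> &'static str {
    // Milliseconds are always under one second: default color.
    if duration_str.ends_with("ms") {
        return "";
    }

    // "2m 30.0s": everything before the first 'm' is the minutes value.
    if let Some(m_pos) = duration_str.find('m') {
        return match duration_str[..m_pos].trim().parse::<u32>() {
            Ok(minutes) if minutes >= 5 => "\x1b[31m", // red
            Ok(_) => "\x1b[38;5;208m",                 // orange (256-color)
            Err(_) => "",                              // unparseable
        };
    }

    // "1.5s": seconds only.
    match duration_str.strip_suffix('s').map(|s| s.trim().parse::<f64>()) {
        Some(Ok(seconds)) if seconds >= 1.0 => "\x1b[33m", // yellow
        _ => "",
    }
}
```

With this shape, the caller reduces to a single `println!("└─ ⚡️ {}{}\x1b[0m", timing_color(duration_str), duration_str)`, and the thresholds can be covered by unit tests without capturing stdout.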
--- a/crates/g3-computer-control/Cargo.toml
+++ b/crates/g3-computer-control/Cargo.toml
@@ -0,0 +1,46 @@
+[package]
+name = "g3-computer-control"
+version = "0.1.0"
+edition = "2021"
+
+[dependencies]
+# Workspace dependencies
+tokio = { workspace = true }
+anyhow = { workspace = true }
+thiserror = { workspace = true }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+uuid = { workspace = true }
+
+shellexpand = "3.1"
+# Async trait support
+async-trait = "0.1"
+
+# WebDriver support
+fantoccini = "0.21"
+
+# OCR dependencies
+tesseract = "0.14"
+
+# macOS dependencies
+[target.'cfg(target_os = "macos")'.dependencies]
+core-graphics = "0.23"
+core-foundation = "0.9"
+cocoa = "0.25"
+objc = "0.2"
+image = "0.24"
+
+# Linux dependencies
+[target.'cfg(target_os = "linux")'.dependencies]
+x11 = { version = "2.21", features = ["xlib", "xtest"] }
+image = "0.24"
+
+# Windows dependencies
+[target.'cfg(target_os = "windows")'.dependencies]
+windows = { version = "0.52", features = [
+    "Win32_Foundation",
+    "Win32_UI_WindowsAndMessaging",
+    "Win32_UI_Input_KeyboardAndMouse",
+    "Win32_Graphics_Gdi",
+] }
--- a/crates/g3-computer-control/examples/debug_screenshot.rs
+++ b/crates/g3-computer-control/examples/debug_screenshot.rs
@@ -0,0 +1,46 @@
+use core_graphics::display::CGDisplay;
+
+fn main() {
+    let display = CGDisplay::main();
+    let image = display.image().expect("Failed to capture screen");
+    
+    println!("CGImage properties:");
+    println!("  Width: {}", image.width());
+    println!("  Height: {}", image.height());
+    println!("  Bits per component: {}", image.bits_per_component());
+    println!("  Bits per pixel: {}", image.bits_per_pixel());
+    println!("  Bytes per row: {}", image.bytes_per_row());
+    
+    let data = image.data();
+    let expected_size = image.width() * image.height() * 4;
+    println!("  Data length: {}", data.len());
+    println!("  Expected (w*h*4): {}", expected_size);
+    
+    // Check if there's padding in rows
+    let bytes_per_row = image.bytes_per_row();
+    let width = image.width();
+    let expected_bytes_per_row = width * 4;
+    println!("\nRow alignment:");
+    println!("  Actual bytes per row: {}", bytes_per_row);
+    println!("  Expected (width * 4): {}", expected_bytes_per_row);
+    println!("  Padding per row: {}", bytes_per_row.saturating_sub(expected_bytes_per_row));
+    
+    // Sample some pixels from different locations
+    println!("\nFirst 3 pixels (raw bytes):");
+    for i in 0..3 {
+        let offset = i * 4;
+        println!("  Pixel {}: [{:3}, {:3}, {:3}, {:3}]", 
+                 i, data[offset], data[offset+1], data[offset+2], data[offset+3]);
+    }
+    
+    // Check a pixel from the middle
+    let mid_row = image.height() / 2;
+    let mid_col = image.width() / 2;
+    let mid_offset = (mid_row * bytes_per_row + mid_col * 4) as usize;
+    println!("\nMiddle pixel (row {}, col {}):", mid_row, mid_col);
+    println!("  Offset: {}", mid_offset);
+    if mid_offset + 3 < data.len() as usize {
+        println!("  Bytes: [{:3}, {:3}, {:3}, {:3}]", 
+                 data[mid_offset], data[mid_offset+1], data[mid_offset+2], data[mid_offset+3]);
+    }
+}
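
The row-padding check in `debug_screenshot.rs` above matters because a pixel's byte offset depends on the row stride (`bytes_per_row`), not on `width * 4`. A minimal sketch of that arithmetic on a plain byte slice, assuming 4 bytes per pixel as the example does; the helper names `row_padding` and `read_pixel` are hypothetical, and the buffer in the usage note is synthetic rather than a real screen capture.

```rust
/// Bytes of padding at the end of each row, or 0 if the stride is smaller
/// than `width * bytes_per_pixel` (which would mean a different pixel format).
fn row_padding(width: usize, bytes_per_row: usize, bytes_per_pixel: usize) -> usize {
    bytes_per_row.saturating_sub(width * bytes_per_pixel)
}

/// Reads the 4 raw bytes of the pixel at (`row`, `col`) from a buffer whose
/// rows are `bytes_per_row` bytes apart. Returns `None` when out of bounds.
fn read_pixel(data: &[u8], row: usize, col: usize, bytes_per_row: usize) -> Option<[u8; 4]> {
    // Rows advance by the stride; columns advance by the pixel size.
    let start = row * bytes_per_row + col * 4;
    let bytes = data.get(start..start + 4)?;
    Some([bytes[0], bytes[1], bytes[2], bytes[3]])
}
```

For a 3-pixel-wide image stored with a 16-byte stride, `row_padding(3, 16, 4)` is 4, and the pixel at row 1, column 2 starts at byte 24. Using `slice::get` instead of direct indexing turns an out-of-range coordinate into `None` rather than a panic.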
--- a/crates/g3-computer-control/examples/list_windows.rs
+++ b/crates/g3-computer-control/examples/list_windows.rs
@@ -0,0 +1,56 @@
+use core_graphics::window::{kCGWindowListOptionOnScreenOnly, kCGNullWindowID, CGWindowListCopyWindowInfo};
+use core_foundation::dictionary::CFDictionary;
+use core_foundation::string::CFString;
+use core_foundation::base::TCFType;
+
+fn main() {
+    println!("Listing on-screen terminal windows (iTerm/Terminal)...");
+    println!("{:<10} {:<25} {}", "Window ID", "Owner", "Title");
+    println!("{}", "-".repeat(80));
+    
+    unsafe {
+        let window_list = CGWindowListCopyWindowInfo(
+            kCGWindowListOptionOnScreenOnly,
+            kCGNullWindowID
+        );
+        // Wrap the returned array exactly once: a second wrap_under_create_rule would release it twice
+        let array = core_foundation::array::CFArray::<CFDictionary>::wrap_under_create_rule(window_list);
+        let count = array.len();
+        
+        for i in 0..count {
+            let dict = array.get(i).unwrap();
+            
+            // Get window ID
+            let window_id_key = CFString::from_static_string("kCGWindowNumber");
+            let window_id: i64 = if let Some(value) = dict.find(window_id_key.as_concrete_TypeRef()) {
+                let num: core_foundation::number::CFNumber = TCFType::wrap_under_get_rule(*value as *const _);
+                num.to_i64().unwrap_or(0)
+            } else {
+                0
+            };
+            
+            // Get owner name
+            let owner_key = CFString::from_static_string("kCGWindowOwnerName");
+            let owner: String = if let Some(value) = dict.find(owner_key.as_concrete_TypeRef()) {
+                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
+                s.to_string()
+            } else {
+                "Unknown".to_string()
+            };
+            
+            // Get window name/title
+            let name_key = CFString::from_static_string("kCGWindowName");
+            let title: String = if let Some(value) = dict.find(name_key.as_concrete_TypeRef()) {
+                let s: CFString = TCFType::wrap_under_get_rule(*value as *const _);
+                s.to_string()
+            } else {
+                "".to_string()
+            };
+            
+            // Only show windows owned by iTerm or Terminal
+            if owner.contains("iTerm") || owner.contains("Terminal") {
+                println!("{:<10} {:<25} {}", window_id, owner, title);
+            }
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/safari_demo.rs
+++ b/crates/g3-computer-control/examples/safari_demo.rs
@@ -0,0 +1,64 @@
+use g3_computer_control::SafariDriver;
+use g3_computer_control::webdriver::WebDriverController;
+use anyhow::Result;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    println!("Safari WebDriver Demo");
+    println!("=====================\n");
+    
+    println!("Make sure to:");
+    println!("1. Enable 'Allow Remote Automation' in Safari's Develop menu");
+    println!("2. Run: /usr/bin/safaridriver --enable");
+    println!("3. Start safaridriver in another terminal: safaridriver --port 4444\n");
+    
+    println!("Connecting to SafariDriver...");
+    let mut driver = SafariDriver::new().await?;
+    println!("✅ Connected!\n");
+    
+    // Navigate to a website
+    println!("Navigating to example.com...");
+    driver.navigate("https://example.com").await?;
+    println!("✅ Navigated\n");
+    
+    // Get page title
+    let title = driver.title().await?;
+    println!("Page title: {}\n", title);
+    
+    // Get current URL
+    let url = driver.current_url().await?;
+    println!("Current URL: {}\n", url);
+    
+    // Find an element
+    println!("Finding h1 element...");
+    let mut h1 = driver.find_element("h1").await?;
+    let h1_text = h1.text().await?;
+    println!("H1 text: {}\n", h1_text);
+    
+    // Find all paragraphs
+    println!("Finding all paragraphs...");
+    let paragraphs = driver.find_elements("p").await?;
+    println!("Found {} paragraphs\n", paragraphs.len());
+    
+    // Get page source
+    println!("Getting page source...");
+    let source = driver.page_source().await?;
+    println!("Page source length: {} bytes\n", source.len());
+    
+    // Execute JavaScript
+    println!("Executing JavaScript...");
+    let result = driver.execute_script("return document.title", vec![]).await?;
+    println!("JS result: {:?}\n", result);
+    
+    // Take a screenshot
+    println!("Taking screenshot...");
+    driver.screenshot("/tmp/safari_demo.png").await?;
+    println!("✅ Screenshot saved to /tmp/safari_demo.png\n");
+    
+    // Close the browser
+    println!("Closing browser...");
+    driver.quit().await?;
+    println!("✅ Done!");
+    
+    Ok(())
+}
--- a/crates/g3-computer-control/examples/test_permission_prompt.rs
+++ b/crates/g3-computer-control/examples/test_permission_prompt.rs
@@ -0,0 +1,21 @@
+use g3_computer_control::create_controller;
+
+#[tokio::main]
+async fn main() {
+    println!("Testing screenshot with permission prompt...");
+    
+    let controller = create_controller().expect("Failed to create controller");
+    
+    match controller.take_screenshot("/tmp/test_with_prompt.png", None, None).await {
+        Ok(_) => {
+            println!("\n✅ Screenshot saved to /tmp/test_with_prompt.png");
+            println!("Opening screenshot...");
+            let _ = std::process::Command::new("open")
+                .arg("/tmp/test_with_prompt.png")
+                .spawn();
+        }
+        Err(e) => {
+            println!("❌ Screenshot failed: {}", e);
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/test_screencapture_direct.rs
+++ b/crates/g3-computer-control/examples/test_screencapture_direct.rs
@@ -0,0 +1,39 @@
+use std::process::Command;
+
+fn main() {
+    let path = "/tmp/rust_screencapture_test.png";
+    
+    println!("Testing screencapture command from Rust...");
+    
+    let mut cmd = Command::new("screencapture");
+    cmd.arg("-x"); // No sound
+    cmd.arg(path);
+    
+    println!("Command: {:?}", cmd);
+    
+    match cmd.output() {
+        Ok(output) => {
+            println!("Exit status: {}", output.status);
+            println!("Stdout: {}", String::from_utf8_lossy(&output.stdout));
+            println!("Stderr: {}", String::from_utf8_lossy(&output.stderr));
+            
+            if output.status.success() {
+                println!("\n✅ Screenshot saved to: {}", path);
+                
+                // Check file exists and size
+                if let Ok(metadata) = std::fs::metadata(path) {
+                    println!("File size: {} bytes ({:.1} MB)", metadata.len(), metadata.len() as f64 / 1_000_000.0);
+                }
+                
+                // Open it
+                let _ = Command::new("open").arg(path).spawn();
+                println!("\nOpened screenshot - please verify it looks correct!");
+            } else {
+                println!("\n❌ Screenshot failed!");
+            }
+        }
+        Err(e) => {
+            println!("❌ Failed to execute screencapture: {}", e);
+        }
+    }
+}
--- a/crates/g3-computer-control/examples/test_screenshot_fix.rs
+++ b/crates/g3-computer-control/examples/test_screenshot_fix.rs
@@ -0,0 +1,68 @@
+use core_graphics::display::CGDisplay;
+use image::{ImageBuffer, RgbaImage};
+
+fn main() {
+    let display = CGDisplay::main();
+    let image = display.image().expect("Failed to capture screen");
+    
+    let width = image.width() as u32;
+    let height = image.height() as u32;
+    let bytes_per_row = image.bytes_per_row() as usize;
+    let data = image.data();
+    
+    println!("Testing screenshot fix...");
+    println!("Image: {}x{}, bytes_per_row: {}", width, height, bytes_per_row);
+    println!("Expected bytes per row: {}", width * 4);
+    println!("Padding per row: {} bytes", bytes_per_row - (width as usize * 4));
+    
+    // OLD METHOD (broken) - treating data as continuous
+    println!("\n=== OLD METHOD (BROKEN) ===");
+    let mut old_rgba = Vec::with_capacity(data.len() as usize);
+    for chunk in data.chunks_exact(4) {
+        old_rgba.push(chunk[2]); // R
+        old_rgba.push(chunk[1]); // G
+        old_rgba.push(chunk[0]); // B
+        old_rgba.push(chunk[3]); // A
+    }
+    println!("Converted {} pixels", old_rgba.len() / 4);
+    println!("Expected {} pixels", width * height);
+    
+    // NEW METHOD (fixed) - handling row padding
+    println!("\n=== NEW METHOD (FIXED) ===");
+    let mut new_rgba = Vec::with_capacity((width * height * 4) as usize);
+    for row in 0..height as usize {
+        let row_start = row * bytes_per_row;
+        let row_end = row_start + (width as usize * 4);
+        
+        for chunk in data[row_start..row_end].chunks_exact(4) {
+            new_rgba.push(chunk[2]); // R
+            new_rgba.push(chunk[1]); // G
+            new_rgba.push(chunk[0]); // B
+            new_rgba.push(chunk[3]); // A
+        }
+    }
+    println!("Converted {} pixels", new_rgba.len() / 4);
+    println!("Expected {} pixels", width * height);
+    
+    // Save a 200x200 crop of the top-left corner from both methods
+    let crop_size: u32 = 200;
+    
+    // Old method crop: rows are read at a stride of width * 4, so any row padding skews the image
+    let old_crop: Vec<u8> = old_rgba.chunks((width * 4) as usize).take(crop_size as usize).flat_map(|row| row.iter().take((crop_size * 4) as usize).copied()).collect();
+    if let Some(old_img) = ImageBuffer::from_raw(crop_size, crop_size, old_crop) {
+        let old_img: RgbaImage = old_img;
+        old_img.save("/tmp/screenshot_old_method.png").unwrap();
+        println!("\nSaved OLD method crop to: /tmp/screenshot_old_method.png");
+    }
+    
+    // New method crop: rows are already tightly packed
+    let new_crop: Vec<u8> = new_rgba.chunks((width * 4) as usize).take(crop_size as usize).flat_map(|row| row.iter().take((crop_size * 4) as usize).copied()).collect();
+    if let Some(new_img) = ImageBuffer::from_raw(crop_size, crop_size, new_crop) {
+        let new_img: RgbaImage = new_img;
+        new_img.save("/tmp/screenshot_new_method.png").unwrap();
+        println!("Saved NEW method crop to: /tmp/screenshot_new_method.png");
+    }
+    
+    println!("\nOpen both images to compare:");
+    println!("  open /tmp/screenshot_old_method.png /tmp/screenshot_new_method.png");
+}
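The row-padding fix that `test_screenshot_fix.rs` exercises can also be isolated into a pure function, so it can be checked without a display. The following is a minimal sketch, assuming BGRA input at 4 bytes per pixel; the name `bgra_to_rgba_padded` is hypothetical and not part of the crate.

```rust
/// Convert a BGRA buffer whose rows may carry trailing padding into a
/// tightly packed RGBA buffer.
///
/// Returns `None` if the stride is smaller than one row of pixels or the
/// buffer is too short for the requested dimensions.
/// (Hypothetical helper for illustration; not part of g3-computer-control.)
fn bgra_to_rgba_padded(
    data: &[u8],
    width: usize,
    height: usize,
    bytes_per_row: usize,
) -> Option<Vec<u8>> {
    let row_len = width.checked_mul(4)?;
    if bytes_per_row < row_len {
        return None;
    }

    let mut out = Vec::with_capacity(row_len.checked_mul(height)?);
    for row in 0..height {
        // Each row starts at a multiple of the stride, not of width * 4.
        let start = row.checked_mul(bytes_per_row)?;
        let end = start.checked_add(row_len)?;
        let pixels = data.get(start..end)?;

        // Swap B and R; padding bytes after `end` are never read.
        for px in pixels.chunks_exact(4) {
            out.extend_from_slice(&[px[2], px[1], px[0], px[3]]);
        }
    }
    Some(out)
}
```

Returning `Option` rather than slicing directly avoids a panic when the reported stride and the buffer length disagree, and the last row is allowed to omit its padding.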
--- a/crates/g3-computer-control/examples/test_window_capture.rs
+++ b/crates/g3-computer-control/examples/test_window_capture.rs
@@ -0,0 +1,45 @@
+use g3_computer_control::create_controller;
+
+#[tokio::main]
+async fn main() {
+    println!("Testing window-specific screenshot capture...");
+    
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Test 1: Capture iTerm2 window
+    println!("\n1. Capturing iTerm2 window...");
+    match controller.take_screenshot("/tmp/iterm_window.png", None, Some("iTerm2")).await {
+        Ok(_) => {
+            println!("   ✅ iTerm2 window captured to /tmp/iterm_window.png");
+            let _ = std::process::Command::new("open").arg("/tmp/iterm_window.png").spawn();
+        }
+        Err(e) => println!("   ❌ Failed: {}", e),
+    }
+    
+    // Wait a moment for the image to open
+    tokio::time::sleep(tokio::time::Duration::from_secs(2)).await;
+    
+    // Test 2: Full screen capture for comparison
+    println!("\n2. Capturing full screen for comparison...");
+    match controller.take_screenshot("/tmp/fullscreen.png", None, None).await {
+        Ok(_) => {
+            println!("   ✅ Full screen captured to /tmp/fullscreen.png");
+            let _ = std::process::Command::new("open").arg("/tmp/fullscreen.png").spawn();
+        }
+        Err(e) => println!("   ❌ Failed: {}", e),
+    }
+    
+    println!("\n=== Comparison ===");
+    println!("iTerm window:  /tmp/iterm_window.png (should show ONLY iTerm window)");
+    println!("Full screen:   /tmp/fullscreen.png (should show entire desktop)");
+    
+    // Show file sizes
+    if let Ok(meta1) = std::fs::metadata("/tmp/iterm_window.png") {
+        if let Ok(meta2) = std::fs::metadata("/tmp/fullscreen.png") {
+            println!("\nFile sizes:");
+            println!("  iTerm window: {:.1} MB", meta1.len() as f64 / 1_000_000.0);
+            println!("  Full screen:  {:.1} MB", meta2.len() as f64 / 1_000_000.0);
+            println!("\nWindow capture should be smaller than full screen.");
+        }
+    }
+}
--- a/crates/g3-computer-control/src/lib.rs
+++ b/crates/g3-computer-control/src/lib.rs
@@ -0,0 +1,35 @@
+pub mod types;
+pub mod platform;
+pub mod webdriver;
+
+// Re-export webdriver types for convenience
+pub use webdriver::{WebDriverController, WebElement, safari::SafariDriver};
+
+use anyhow::Result;
+use async_trait::async_trait;
+use types::*;
+
+#[async_trait]
+pub trait ComputerController: Send + Sync {
+    // Screen capture
+    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()>;
+    
+    // OCR operations
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<String>;
+    async fn extract_text_from_image(&self, path: &str) -> Result<String>;
+}
+
+// Platform-specific constructor
+pub fn create_controller() -> Result<Box<dyn ComputerController>> {
+    #[cfg(target_os = "macos")]
+    return Ok(Box::new(platform::macos::MacOSController::new()?));
+    
+    #[cfg(target_os = "linux")]
+    return Ok(Box::new(platform::linux::LinuxController::new()?));
+    
+    #[cfg(target_os = "windows")]
+    return Ok(Box::new(platform::windows::WindowsController::new()?));
+    
+    #[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
+    anyhow::bail!("Unsupported platform")
+}
--- a/crates/g3-computer-control/src/platform/linux.rs
+++ b/crates/g3-computer-control/src/platform/linux.rs
@@ -0,0 +1,161 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use tesseract::Tesseract;
+use uuid::Uuid;
+
+pub struct LinuxController {
+    // Placeholder for X11 connection or other state
+}
+
+impl LinuxController {
+    pub fn new() -> Result<Self> {
+        // TODO: initialize an X11 connection; nothing is set up yet
+        tracing::warn!("Linux computer control not fully implemented");
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for LinuxController {
+    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn double_click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn type_text(&self, _text: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn press_key(&self, _key: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
+        anyhow::bail!("Linux implementation not yet available")
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(output) if output.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  \
+                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
+                RHEL/CentOS:   sudo yum install tesseract\n  \
+                Arch Linux:    sudo pacman -S tesseract\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
+                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
+                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(output) if output.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  \
+                Ubuntu/Debian: sudo apt-get install tesseract-ocr\n  \
+                RHEL/CentOS:   sudo yum install tesseract\n  \
+                Arch Linux:    sudo pacman -S tesseract\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = format!("/tmp/g3_ocr_search_{}.png", Uuid::new_v4());
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    Ubuntu/Debian: sudo apt-get install tesseract-ocr-eng\n  \
+                    RHEL/CentOS:   sudo yum install tesseract-langpack-eng\n  \
+                    Arch Linux:    sudo pacman -S tesseract-data-eng", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
--- a/crates/g3-computer-control/src/platform/macos.rs
+++ b/crates/g3-computer-control/src/platform/macos.rs
@@ -0,0 +1,125 @@
+use crate::{ComputerController, types::Rect};
+use anyhow::Result;
+use async_trait::async_trait;
+use std::path::Path;
+use tesseract::Tesseract;
+
+pub struct MacOSController {
+    // Empty struct for now
+}
+
+impl MacOSController {
+    pub fn new() -> Result<Self> {
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for MacOSController {
+    async fn take_screenshot(&self, path: &str, region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
+        // Determine the temporary directory for screenshots
+        let temp_dir = std::env::var("TMPDIR")
+            .or_else(|_| std::env::var("HOME").map(|h| format!("{}/tmp", h)))
+            .unwrap_or_else(|_| "/tmp".to_string());
+        
+        // Ensure temp directory exists
+        std::fs::create_dir_all(&temp_dir)?;
+        
+        // If path is relative or doesn't specify a directory, use temp_dir
+        let final_path = if path.starts_with('/') {
+            path.to_string()
+        } else {
+            format!("{}/{}", temp_dir.trim_end_matches('/'), path)
+        };
+        
+        let path_obj = Path::new(&final_path);
+        if let Some(parent) = path_obj.parent() {
+            std::fs::create_dir_all(parent)?;
+        }
+        
+        let mut cmd = std::process::Command::new("screencapture");
+        
+        // Add flags
+        cmd.arg("-x"); // No sound
+        
+        if let Some(region) = region {
+            // Capture specific region: -R x,y,width,height
+            cmd.arg("-R");
+            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
+        }
+        
+        if let Some(app_name) = window_id {
+            // Capture a specific window: window_id is treated as an application name
+            // Use AppleScript to look up its window ID; on failure, fall through to a full-screen capture
+            let script = format!(r#"tell application "{}" to id of window 1"#, app_name);
+            let output = std::process::Command::new("osascript")
+                .arg("-e")
+                .arg(&script)
+                .output()?;
+            
+            if output.status.success() {
+                let window_id_str = String::from_utf8_lossy(&output.stdout).trim().to_string();
+                cmd.arg(format!("-l{}", window_id_str));
+            }
+        }
+        
+        cmd.arg(&final_path);
+        
+        let screenshot_result = cmd.output()?;
+        
+        if !screenshot_result.status.success() {
+            let stderr = String::from_utf8_lossy(&screenshot_result.stderr);
+            return Err(anyhow::anyhow!("screencapture failed: {}", stderr));
+        }
+        
+        Ok(())
+    }
+    
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<String> {
+        // Take screenshot of region first
+        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, Some(region), None).await?;
+        
+        // Extract text from the screenshot (error handling is deferred until after cleanup)
+        let result = self.extract_text_from_image(&temp_path).await;
+        
+        // Clean up temp file, even if OCR failed
+        let _ = std::fs::remove_file(&temp_path);
+        
+        result
+    }
+    
+    async fn extract_text_from_image(&self, path: &str) -> Result<String> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(output) if output.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let text = tess.set_image(path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        Ok(text)
+    }
+}
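`take_screenshot` in `macos.rs` interpolates the application name directly into an AppleScript string literal, so a name containing a double quote or backslash would produce a malformed script. The following is a minimal sketch of escaping the name first; `escape_applescript` and `window_id_script` are hypothetical helpers and not part of the crate.

```rust
/// Escape a value for use inside an AppleScript double-quoted string literal.
/// Backslashes and double quotes are the two characters that need a
/// backslash escape there.
/// (Hypothetical helper for illustration; not part of g3-computer-control.)
fn escape_applescript(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            _ => out.push(ch),
        }
    }
    out
}

/// Build the same one-line script that `take_screenshot` passes to
/// `osascript -e`, with the application name escaped.
fn window_id_script(app_name: &str) -> String {
    format!(
        r#"tell application "{}" to id of window 1"#,
        escape_applescript(app_name)
    )
}
```

Keeping the script construction in its own function also makes it possible to unit-test the string without invoking `osascript`.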
--- a/crates/g3-computer-control/src/platform/macos.rs.bak
+++ b/crates/g3-computer-control/src/platform/macos.rs.bak
@@ -0,0 +1,425 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use core_graphics::display::CGPoint;
+use core_graphics::event::{CGEvent, CGEventType, CGMouseButton, CGEventTapLocation};
+use core_graphics::event_source::{CGEventSource, CGEventSourceStateID};
+use std::path::Path;
+use tesseract::Tesseract;
+
+// MacOSController doesn't store CGEventSource to avoid Send/Sync issues
+// We create it fresh for each operation
+pub struct MacOSController {
+    // Empty struct - event source created per operation
+}
+
+impl MacOSController {
+    pub fn new() -> Result<Self> {
+        // Test that we can create an event source
+        let _event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+            .map_err(|_| anyhow::anyhow!("Failed to create event source. Make sure Accessibility permissions are granted."))?;
+        Ok(Self {})
+    }
+    
+    fn key_to_keycode(&self, key: &str) -> Result<u16> {
+        // Map key names to macOS keycodes
+        let keycode = match key.to_lowercase().as_str() {
+            "return" | "enter" => 36,
+            "tab" => 48,
+            "space" => 49,
+            "delete" | "backspace" => 51,
+            "escape" | "esc" => 53,
+            "command" | "cmd" => 55,
+            "shift" => 56,
+            "capslock" => 57,
+            "option" | "alt" => 58,
+            "control" | "ctrl" => 59,
+            "left" => 123,
+            "right" => 124,
+            "down" => 125,
+            "up" => 126,
+            _ => anyhow::bail!("Unknown key: {}", key),
+        };
+        Ok(keycode)
+    }
+}
+
+#[async_trait]
+impl ComputerController for MacOSController {
+    async fn move_mouse(&self, x: i32, y: i32) -> Result<()> {
+        let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+            .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+        let point = CGPoint::new(x as f64, y as f64);
+        let event = CGEvent::new_mouse_event(
+            event_source,
+            CGEventType::MouseMoved,
+            point,
+            CGMouseButton::Left,
+        ).map_err(|_| anyhow::anyhow!("Failed to create mouse move event"))?;
+        
+        event.post(CGEventTapLocation::HID);
+        Ok(())
+    }
+    
+    async fn click(&self, button: MouseButton) -> Result<()> {
+        let (cg_button, down_type, up_type) = match button {
+            MouseButton::Left => (CGMouseButton::Left, CGEventType::LeftMouseDown, CGEventType::LeftMouseUp),
+            MouseButton::Right => (CGMouseButton::Right, CGEventType::RightMouseDown, CGEventType::RightMouseUp),
+            MouseButton::Middle => (CGMouseButton::Center, CGEventType::OtherMouseDown, CGEventType::OtherMouseUp),
+        };
+        
+        let point = {
+            // Get current mouse position
+            let temp_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            let event = CGEvent::new(temp_source)
+                .map_err(|_| anyhow::anyhow!("Failed to get mouse position"))?;
+            let p = event.location();
+            p
+        };
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Mouse down
+            let down_event = CGEvent::new_mouse_event(
+                event_source,
+                down_type,
+                point,
+                cg_button,
+            ).map_err(|_| anyhow::anyhow!("Failed to create mouse down event"))?;
+            down_event.post(CGEventTapLocation::HID);
+        } // event_source and down_event dropped here
+        
+        // Small delay
+        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            let up_event = CGEvent::new_mouse_event(
+                event_source,
+                up_type,
+                point,
+                cg_button,
+            ).map_err(|_| anyhow::anyhow!("Failed to create mouse up event"))?;
+            up_event.post(CGEventTapLocation::HID);
+        } // event_source and up_event dropped here
+        
+        Ok(())
+    }
+    
+    async fn double_click(&self, button: MouseButton) -> Result<()> {
+        self.click(button).await?;
+        tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
+        self.click(button).await?;
+        Ok(())
+    }
+    
+    async fn type_text(&self, text: &str) -> Result<()> {
+        for ch in text.chars() {
+            {
+                let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                    .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+                
+                // Create keyboard event for character
+                let event = CGEvent::new_keyboard_event(
+                    event_source,
+                    0, // keycode (0 for unicode)
+                    true,
+                ).map_err(|_| anyhow::anyhow!("Failed to create keyboard event"))?;
+                
+                // Set unicode string
+                let mut utf16_buf = [0u16; 2];
+                let utf16_slice = ch.encode_utf16(&mut utf16_buf);
+                let utf16_chars: Vec<u16> = utf16_slice.iter().copied().collect();
+                
+                event.set_string_from_utf16_unchecked(utf16_chars.as_slice());
+                event.post(CGEventTapLocation::HID);
+            } // event_source and event dropped here
+            
+            tokio::time::sleep(tokio::time::Duration::from_millis(10)).await;
+        }
+        Ok(())
+    }
+    
+    async fn press_key(&self, key: &str) -> Result<()> {
+        let keycode = self.key_to_keycode(key)?;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Key down
+            let down_event = CGEvent::new_keyboard_event(
+                event_source,
+                keycode,
+                true,
+            ).map_err(|_| anyhow::anyhow!("Failed to create key down event"))?;
+            down_event.post(CGEventTapLocation::HID);
+        } // event_source and down_event dropped here
+        
+        tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
+        
+        {
+            let event_source = CGEventSource::new(CGEventSourceStateID::CombinedSessionState)
+                .map_err(|_| anyhow::anyhow!("Failed to create event source"))?;
+            
+            // Key up
+            let up_event = CGEvent::new_keyboard_event(
+                event_source,
+                keycode,
+                false,
+            ).map_err(|_| anyhow::anyhow!("Failed to create key up event"))?;
+            up_event.post(CGEventTapLocation::HID);
+        } // event_source and up_event dropped here
+        
+        Ok(())
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        // Note: Full implementation would use CGWindowListCopyWindowInfo
+        // For now, return empty list as this requires more complex FFI
+        tracing::warn!("list_windows not fully implemented on macOS");
+        Ok(vec![])
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        // Note: Full implementation would use NSWorkspace to activate application
+        tracing::warn!("focus_window not fully implemented on macOS");
+        Ok(())
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_window_bounds not fully implemented on macOS");
+        Ok(Rect { x: 0, y: 0, width: 800, height: 600 })
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        // Note: Full implementation would use macOS Accessibility API
+        tracing::warn!("find_element not fully implemented on macOS");
+        Ok(None)
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_element_text not fully implemented on macOS");
+        Ok(String::new())
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        // Note: Full implementation would use Accessibility API
+        tracing::warn!("get_element_bounds not fully implemented on macOS");
+        Ok(Rect { x: 0, y: 0, width: 100, height: 30 })
+    }
+    
+    async fn take_screenshot(&self, path: &str, _region: Option<Rect>, window_id: Option<&str>) -> Result<()> {
+        // Use native macOS screencapture command which handles all the format complexities
+        
+        // Screen Recording permission is not checked programmatically: on the first screenshot we
+        // warn and open the settings pane, unless G3_SKIP_PERMISSION_CHECK is set
+        let needs_permission_check = std::env::var("G3_SKIP_PERMISSION_CHECK").is_err();
+        
+        if needs_permission_check {
+            // Try to open Screen Recording settings if this is the first screenshot
+            static PERMISSION_PROMPTED: std::sync::atomic::AtomicBool = std::sync::atomic::AtomicBool::new(false);
+            
+            if !PERMISSION_PROMPTED.swap(true, std::sync::atomic::Ordering::Relaxed) {
+                tracing::warn!("\n=== Screen Recording Permission Required ===\n\
+                    macOS requires explicit permission to capture window content.\n\
+                    If screenshots only show wallpaper/menubar (no windows):\n\n\
+                    1. Open System Settings > Privacy & Security > Screen Recording\n\
+                    2. Enable permission for your terminal (iTerm/Terminal) or g3\n\
+                    3. Restart your terminal if needed\n\n\
+                    Opening Screen Recording settings now...\n");
+                
+                // Try to open the settings (non-blocking)
+                let _ = std::process::Command::new("open")
+                    .arg("x-apple.systempreferences:com.apple.preference.security?Privacy_ScreenCapture")
+                    .spawn();
+            }
+        }
+        
+        let path_obj = Path::new(path);
+        if let Some(parent) = path_obj.parent() {
+            std::fs::create_dir_all(parent)?;
+        }
+        
+        let mut cmd = std::process::Command::new("screencapture");
+        
+        // Add flags
+        cmd.arg("-x"); // No sound
+        
+        if let Some(window_id) = window_id {
+            // Capture specific window by getting its bounds and using region capture
+            // window_id format: "AppName" or "AppName:WindowTitle"
+            let app_name = window_id.split(':').next().unwrap_or(window_id);
+            
+            // Use AppleScript to get window bounds
+            let script = format!(
+                r#"tell application "{}"
+                    tell current window
+                        get bounds
+                    end tell
+                end tell"#,
+                app_name.replace('\\', "\\\\").replace('"', "\\\"") // escape for the AppleScript string literal
+            );
+            
+            let output = std::process::Command::new("osascript")
+                .arg("-e")
+                .arg(&script)
+                .output()
+                .map_err(|e| anyhow::anyhow!("Failed to get window bounds: {}", e))?;
+            
+            if output.status.success() {
+                let bounds_str = String::from_utf8_lossy(&output.stdout);
+                let bounds: Vec<i32> = bounds_str
+                    .trim()
+                    .split(',')
+                    .filter_map(|s| s.trim().parse().ok())
+                    .collect();
+                
+                if bounds.len() == 4 {
+                    let (left, top, right, bottom) = (bounds[0], bounds[1], bounds[2], bounds[3]);
+                    let width = right - left;
+                    let height = bottom - top;
+                    
+                    cmd.arg("-R");
+                    cmd.arg(format!("{},{},{},{}", left, top, width, height));
+                    
+                    tracing::debug!("Capturing window '{}' at region: {},{} {}x{}", app_name, left, top, width, height);
+                } else {
+                    tracing::warn!("Failed to parse window bounds, capturing full screen");
+                }
+            } else {
+                tracing::warn!("Failed to get window bounds for '{}', capturing full screen", app_name);
+            }
+        } else if let Some(region) = _region {
+            // Capture specific region: -R x,y,width,height
+            cmd.arg("-R");
+            cmd.arg(format!("{},{},{},{}", region.x, region.y, region.width, region.height));
+        }
+        
+        cmd.arg(path);
+        
+        let output = cmd.output()
+            .map_err(|e| anyhow::anyhow!("Failed to execute screencapture: {}", e))?;
+        
+        if !output.status.success() {
+            let stderr = String::from_utf8_lossy(&output.stderr);
+            anyhow::bail!("screencapture failed: {}", stderr);
+        }
+        
+        tracing::debug!("Screenshot saved using screencapture: {}", path);
+        
+        Ok(())
+    }
+    
+    }
+    
+    async fn extract_text_from_screen(&self, region: Rect) -> Result<OCRResult> {
+        // Take screenshot of region first
+        let temp_path = format!("/tmp/g3_ocr_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, Some(region), None).await?;
+        
+        // Extract text from the screenshot
+        let result = self.extract_text_from_image(&temp_path).await?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        Ok(result)
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("which")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract:\n  macOS:   brew install tesseract\n  \
+                Linux:   sudo apt-get install tesseract-ocr (Ubuntu/Debian)\n           \
+                sudo yum install tesseract (RHEL/CentOS)\n  \
+                Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = format!("/tmp/g3_ocr_search_{}.png", uuid::Uuid::new_v4());
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    macOS:   brew reinstall tesseract\n  \
+                    Linux:   sudo apt-get install tesseract-ocr-eng\n  \
+                    Windows: Reinstall tesseract and ensure language files are included", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
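
The window-capture branch of `take_screenshot` above turns AppleScript's `left, top, right, bottom` bounds into the `x,y,width,height` form passed to `screencapture -R`. A minimal sketch of that conversion as a pure, testable function follows. `Region`, `region_from_bounds`, and `to_screencapture_arg` are hypothetical names, not part of this change, and the sketch is stricter than the code above: it rejects any unparsable field and any empty or inverted rectangle instead of skipping bad fields.

```rust
/// Region in the `x,y,width,height` form accepted by `screencapture -R`.
/// Hypothetical type for illustration; the crate's own `Rect` has the same fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Region {
    pub x: i32,
    pub y: i32,
    pub width: i32,
    pub height: i32,
}

/// Parse "left, top, right, bottom" edges and convert them to a region.
/// Returns `None` unless the input is exactly four integers describing a
/// rectangle with positive width and height.
pub fn region_from_bounds(bounds: &str) -> Option<Region> {
    let edges: Vec<i32> = bounds
        .trim()
        .split(',')
        .map(|field| field.trim().parse::<i32>())
        .collect::<Result<_, _>>()
        .ok()?;
    // Fails (and returns None) when there are not exactly four fields.
    let [left, top, right, bottom] = <[i32; 4]>::try_from(edges).ok()?;
    let width = right.checked_sub(left)?;
    let height = bottom.checked_sub(top)?;
    if width <= 0 || height <= 0 {
        return None;
    }
    Some(Region { x: left, y: top, width, height })
}

/// Format a region as the argument that follows `-R`.
pub fn to_screencapture_arg(region: Region) -> String {
    format!("{},{},{},{}", region.x, region.y, region.width, region.height)
}
```

With a helper like this, a caller could fall back to a full-screen capture whenever `region_from_bounds` returns `None`, which matches the fallback behaviour logged above.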
--- a/crates/g3-computer-control/src/platform/mod.rs
+++ b/crates/g3-computer-control/src/platform/mod.rs
@@ -0,0 +1,8 @@
+#[cfg(target_os = "macos")]
+pub mod macos;
+
+#[cfg(target_os = "linux")]
+pub mod linux;
+
+#[cfg(target_os = "windows")]
+pub mod windows;
--- a/crates/g3-computer-control/src/platform/windows.rs
+++ b/crates/g3-computer-control/src/platform/windows.rs
@@ -0,0 +1,162 @@
+use crate::{ComputerController, types::*};
+use anyhow::Result;
+use async_trait::async_trait;
+use tesseract::Tesseract;
+use uuid::Uuid;
+
+pub struct WindowsController {
+    // Placeholder for Windows-specific state
+}
+
+impl WindowsController {
+    pub fn new() -> Result<Self> {
+        tracing::warn!("Windows computer control not fully implemented");
+        Ok(Self {})
+    }
+}
+
+#[async_trait]
+impl ComputerController for WindowsController {
+    async fn move_mouse(&self, _x: i32, _y: i32) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn double_click(&self, _button: MouseButton) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn type_text(&self, _text: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn press_key(&self, _key: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn list_windows(&self) -> Result<Vec<Window>> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn focus_window(&self, _window_id: &str) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_window_bounds(&self, _window_id: &str) -> Result<Rect> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn find_element(&self, _selector: &ElementSelector) -> Result<Option<UIElement>> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_element_text(&self, _element_id: &str) -> Result<String> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn get_element_bounds(&self, _element_id: &str) -> Result<Rect> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn take_screenshot(&self, _path: &str, _region: Option<Rect>, _window_id: Option<&str>) -> Result<()> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn extract_text_from_screen(&self, _region: Rect) -> Result<OCRResult> {
+        anyhow::bail!("Windows implementation not yet available")
+    }
+    
+    async fn extract_text_from_image(&self, _path: &str) -> Result<OCRResult> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("where")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract on Windows:\n  \
+                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                2. Run the installer and follow the instructions\n  \
+                3. Add tesseract to your PATH environment variable\n  \
+                4. Restart your terminal/command prompt\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Initialize Tesseract
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                    2. Make sure to select 'Additional language data' during installation\n  \
+                    3. Ensure tesseract is in your PATH", e)
+            })?;
+        
+        let text = tess.set_image(_path)
+            .map_err(|e| anyhow::anyhow!("Failed to load image '{}': {}", _path, e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from image: {}", e))?;
+        
+        // Get confidence (simplified - would need more complex API calls for per-word confidence)
+        let confidence = 0.85; // Placeholder
+        
+        Ok(OCRResult {
+            text,
+            confidence,
+            bounds: Rect { x: 0, y: 0, width: 0, height: 0 }, // Would need image dimensions
+        })
+    }
+    
+    async fn find_text_on_screen(&self, _text: &str) -> Result<Option<Point>> {
+        // Check if tesseract is available on the system
+        let tesseract_check = std::process::Command::new("where")
+            .arg("tesseract")
+            .output();
+        
+        if !matches!(&tesseract_check, Ok(out) if out.status.success()) {
+            anyhow::bail!("Tesseract OCR is not installed on your system.\n\n\
+                To install tesseract on Windows:\n  \
+                1. Download the installer from: https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                2. Run the installer and follow the instructions\n  \
+                3. Add tesseract to your PATH environment variable\n  \
+                4. Restart your terminal/command prompt\n\n\
+                After installation, restart your terminal and try again.");
+        }
+        
+        // Take full screen screenshot
+        let temp_path = std::env::temp_dir().join(format!("g3_ocr_search_{}.png", uuid::Uuid::new_v4())).to_string_lossy().into_owned();
+        self.take_screenshot(&temp_path, None, None).await?;
+        
+        // Use Tesseract to find text with bounding boxes
+        let tess = Tesseract::new(None, Some("eng"))
+            .map_err(|e| {
+                anyhow::anyhow!("Failed to initialize Tesseract: {}\n\n\
+                    This usually means:\n1. Tesseract is not properly installed\n\
+                    2. Language data files are missing\n\nTo fix:\n  \
+                    1. Reinstall tesseract from https://github.com/UB-Mannheim/tesseract/wiki\n  \
+                    2. Make sure to select 'Additional language data' during installation\n  \
+                    3. Ensure tesseract is in your PATH", e)
+            })?;
+        
+        let full_text = tess.set_image(temp_path.as_str())
+            .map_err(|e| anyhow::anyhow!("Failed to load screenshot: {}", e))?
+            .get_text()
+            .map_err(|e| anyhow::anyhow!("Failed to extract text from screen: {}", e))?;
+        
+        // Clean up temp file
+        let _ = std::fs::remove_file(&temp_path);
+        
+        // Simple text search - full implementation would use get_component_images
+        // to get bounding boxes for each word
+        if full_text.contains(_text) {
+            tracing::warn!("Text found but precise coordinates not available in simplified implementation");
+            Ok(Some(Point { x: 0, y: 0 }))
+        } else {
+            Ok(None)
+        }
+    }
+}
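
Both platform files build their OCR scratch-file paths by hand. One possible way to share that logic is a small helper based on `std::env::temp_dir()`, which resolves the platform's temporary directory at run time. This is only a sketch: `ocr_temp_path` is a hypothetical name, and the `unique` argument stands in for the UUID string the code above generates.

```rust
use std::path::PathBuf;

/// Build `<temp dir>/<prefix>_<unique>.png`.
/// `unique` is expected to be something collision-resistant, such as a UUID
/// rendered as a string; this sketch does not generate it.
pub fn ocr_temp_path(prefix: &str, unique: &str) -> PathBuf {
    std::env::temp_dir().join(format!("{}_{}.png", prefix, unique))
}
```

Callers that need a `&str`, as `take_screenshot` does, could convert the result with `to_string_lossy()`.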
--- a/crates/g3-computer-control/src/types.rs
+++ b/crates/g3-computer-control/src/types.rs
@@ -0,0 +1,9 @@
+use serde::{Deserialize, Serialize};
+
+#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
+pub struct Rect {
+    pub x: i32,
+    pub y: i32,
+    pub width: i32,
+    pub height: i32,
+}
--- a/crates/g3-computer-control/src/webdriver/mod.rs
+++ b/crates/g3-computer-control/src/webdriver/mod.rs
@@ -0,0 +1,111 @@
+pub mod safari;
+
+use anyhow::Result;
+use async_trait::async_trait;
+use serde_json::Value;
+
+/// WebDriver controller for browser automation
+#[async_trait]
+pub trait WebDriverController: Send + Sync {
+    /// Navigate to a URL
+    async fn navigate(&mut self, url: &str) -> Result<()>;
+    
+    /// Get the current URL
+    async fn current_url(&self) -> Result<String>;
+    
+    /// Get the page title
+    async fn title(&self) -> Result<String>;
+    
+    /// Find an element by CSS selector
+    async fn find_element(&mut self, selector: &str) -> Result<WebElement>;
+    
+    /// Find multiple elements by CSS selector
+    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>>;
+    
+    /// Execute JavaScript in the browser
+    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value>;
+    
+    /// Get the page source (HTML)
+    async fn page_source(&self) -> Result<String>;
+    
+    /// Take a screenshot and save to path
+    async fn screenshot(&mut self, path: &str) -> Result<()>;
+    
+    /// Close the current window/tab
+    async fn close(&mut self) -> Result<()>;
+    
+    /// Quit the browser session
+    async fn quit(self) -> Result<()>;
+}
+
+/// Represents a web element in the DOM
+pub struct WebElement {
+    pub(crate) inner: fantoccini::elements::Element,
+}
+
+impl WebElement {
+    /// Click the element
+    pub async fn click(&mut self) -> Result<()> {
+        self.inner.click().await?;
+        Ok(())
+    }
+    
+    /// Send keys/text to the element
+    pub async fn send_keys(&mut self, text: &str) -> Result<()> {
+        self.inner.send_keys(text).await?;
+        Ok(())
+    }
+    
+    /// Clear the element's content (for input fields)
+    pub async fn clear(&mut self) -> Result<()> {
+        self.inner.clear().await?;
+        Ok(())
+    }
+    
+    /// Get the element's text content
+    pub async fn text(&self) -> Result<String> {
+        Ok(self.inner.text().await?)
+    }
+    
+    /// Get an attribute value
+    pub async fn attr(&self, name: &str) -> Result<Option<String>> {
+        Ok(self.inner.attr(name).await?)
+    }
+    
+    /// Get a property value
+    pub async fn prop(&self, name: &str) -> Result<Option<String>> {
+        Ok(self.inner.prop(name).await?)
+    }
+    
+    /// Get the element's HTML
+    pub async fn html(&self, inner: bool) -> Result<String> {
+        Ok(self.inner.html(inner).await?)
+    }
+    
+    /// Check if element is displayed
+    pub async fn is_displayed(&self) -> Result<bool> {
+        Ok(self.inner.is_displayed().await?)
+    }
+    
+    /// Check if element is enabled
+    pub async fn is_enabled(&self) -> Result<bool> {
+        Ok(self.inner.is_enabled().await?)
+    }
+    
+    /// Check if element is selected (for checkboxes/radio buttons)
+    pub async fn is_selected(&self) -> Result<bool> {
+        Ok(self.inner.is_selected().await?)
+    }
+    
+    /// Find a child element by CSS selector
+    pub async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
+        let elem = self.inner.find(fantoccini::Locator::Css(selector)).await?;
+        Ok(WebElement { inner: elem })
+    }
+    
+    /// Find multiple child elements by CSS selector
+    pub async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
+        let elems = self.inner.find_all(fantoccini::Locator::Css(selector)).await?;
+        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
+    }
+}
--- a/crates/g3-computer-control/src/webdriver/safari.rs
+++ b/crates/g3-computer-control/src/webdriver/safari.rs
@@ -0,0 +1,212 @@
+use super::{WebDriverController, WebElement};
+use anyhow::{Context, Result};
+use async_trait::async_trait;
+use fantoccini::{Client, ClientBuilder};
+use serde_json::Value;
+use std::time::Duration;
+
+/// SafariDriver WebDriver controller
+pub struct SafariDriver {
+    client: Client,
+}
+
+impl SafariDriver {
+    /// Create a new SafariDriver instance
+    /// 
+    /// This connects to a SafariDriver server listening on port 4444 (the default used here).
+    /// Enable "Allow Remote Automation" in Safari's Develop menu first (or run `safaridriver --enable` once).
+    ///
+    /// Start the SafariDriver server manually with:
+    /// ```bash
+    /// /usr/bin/safaridriver --port 4444
+    /// ```
+    pub async fn new() -> Result<Self> {
+        Self::with_port(4444).await
+    }
+    
+    /// Create a new SafariDriver instance with a custom port
+    pub async fn with_port(port: u16) -> Result<Self> {
+        let url = format!("http://localhost:{}", port);
+        
+        let mut caps = serde_json::Map::new();
+        caps.insert("browserName".to_string(), Value::String("safari".to_string()));
+        
+        let client = ClientBuilder::native()
+            .capabilities(caps)
+            .connect(&url)
+            .await
+            .context("Failed to connect to SafariDriver. Make sure SafariDriver is running and 'Allow Remote Automation' is enabled in Safari's Develop menu.")?;
+        
+        Ok(Self { client })
+    }
+    
+    /// Go back in browser history
+    pub async fn back(&mut self) -> Result<()> {
+        self.client.back().await?;
+        Ok(())
+    }
+    
+    /// Go forward in browser history
+    pub async fn forward(&mut self) -> Result<()> {
+        self.client.forward().await?;
+        Ok(())
+    }
+    
+    /// Refresh the current page
+    pub async fn refresh(&mut self) -> Result<()> {
+        self.client.refresh().await?;
+        Ok(())
+    }
+    
+    /// Get all window handles
+    pub async fn window_handles(&mut self) -> Result<Vec<String>> {
+        let handles = self.client.windows().await?;
+        Ok(handles.into_iter()
+            .map(|h| h.into())
+            .collect())
+    }
+    
+    /// Switch to a window by handle
+    pub async fn switch_to_window(&mut self, handle: &str) -> Result<()> {
+        let window_handle: fantoccini::wd::WindowHandle = handle.to_string().try_into()?;
+        self.client.switch_to_window(window_handle).await?;
+        Ok(())
+    }
+    
+    /// Get the current window handle
+    pub async fn current_window_handle(&mut self) -> Result<String> {
+        Ok(self.client.window().await?.into())
+    }
+    
+    /// Close the current window
+    pub async fn close_window(&mut self) -> Result<()> {
+        self.client.close_window().await?;
+        Ok(())
+    }
+    
+    /// Create a new window/tab
+    pub async fn new_window(&mut self, is_tab: bool) -> Result<String> {
+        // `new_window` takes a bool: true opens a tab, false opens a window
+        let response = self.client.new_window(is_tab).await?;
+        Ok(response.handle.into())
+    }
+    
+    /// Get cookies
+    pub async fn get_cookies(&mut self) -> Result<Vec<fantoccini::cookies::Cookie<'static>>> {
+        Ok(self.client.get_all_cookies().await?)
+    }
+    
+    /// Add a cookie
+    pub async fn add_cookie(&mut self, cookie: fantoccini::cookies::Cookie<'static>) -> Result<()> {
+        self.client.add_cookie(cookie).await?;
+        Ok(())
+    }
+    
+    /// Delete all cookies
+    pub async fn delete_all_cookies(&mut self) -> Result<()> {
+        self.client.delete_all_cookies().await?;
+        Ok(())
+    }
+    
+    /// Wait for an element to appear (with timeout)
+    pub async fn wait_for_element(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
+        let start = std::time::Instant::now();
+        let poll_interval = Duration::from_millis(100);
+        
+        loop {
+            if let Ok(elem) = self.find_element(selector).await {
+                return Ok(elem);
+            }
+            
+            if start.elapsed() >= timeout {
+                anyhow::bail!("Timeout waiting for element: {}", selector);
+            }
+            
+            tokio::time::sleep(poll_interval).await;
+        }
+    }
+    
+    /// Wait for an element to be visible (with timeout)
+    pub async fn wait_for_visible(&mut self, selector: &str, timeout: Duration) -> Result<WebElement> {
+        let start = std::time::Instant::now();
+        let poll_interval = Duration::from_millis(100);
+        
+        loop {
+            if let Ok(elem) = self.find_element(selector).await {
+                if elem.is_displayed().await.unwrap_or(false) {
+                    return Ok(elem);
+                }
+            }
+            
+            if start.elapsed() >= timeout {
+                anyhow::bail!("Timeout waiting for element to be visible: {}", selector);
+            }
+            
+            tokio::time::sleep(poll_interval).await;
+        }
+    }
+}
+
+#[async_trait]
+impl WebDriverController for SafariDriver {
+    async fn navigate(&mut self, url: &str) -> Result<()> {
+        self.client.goto(url).await?;
+        Ok(())
+    }
+    
+    async fn current_url(&self) -> Result<String> {
+        Ok(self.client.current_url().await?.to_string())
+    }
+    
+    async fn title(&self) -> Result<String> {
+        Ok(self.client.title().await?)
+    }
+    
+    async fn find_element(&mut self, selector: &str) -> Result<WebElement> {
+        let elem = self.client.find(fantoccini::Locator::Css(selector)).await
+            .with_context(|| format!("Failed to find element with selector: {}", selector))?;
+        Ok(WebElement { inner: elem })
+    }
+    
+    async fn find_elements(&mut self, selector: &str) -> Result<Vec<WebElement>> {
+        let elems = self.client.find_all(fantoccini::Locator::Css(selector)).await?;
+        Ok(elems.into_iter().map(|inner| WebElement { inner }).collect())
+    }
+    
+    async fn execute_script(&mut self, script: &str, args: Vec<Value>) -> Result<Value> {
+        Ok(self.client.execute(script, args).await?)
+    }
+    
+    async fn page_source(&self) -> Result<String> {
+        Ok(self.client.source().await?)
+    }
+    
+    async fn screenshot(&mut self, path: &str) -> Result<()> {
+        let screenshot_data = self.client.screenshot().await?;
+        
+        // Expand tilde in path
+        let expanded_path = shellexpand::tilde(path);
+        let path_str = expanded_path.as_ref();
+        
+        // Create parent directories if needed
+        if let Some(parent) = std::path::Path::new(path_str).parent() {
+            std::fs::create_dir_all(parent)
+                .context("Failed to create parent directories for screenshot")?;
+        }
+        
+        std::fs::write(path_str, screenshot_data)
+            .context("Failed to write screenshot to file")?;
+        
+        Ok(())
+    }
+    
+    async fn close(&mut self) -> Result<()> {
+        self.client.close_window().await?;
+        Ok(())
+    }
+    
+    async fn quit(mut self) -> Result<()> {
+        self.client.close().await?;
+        Ok(())
+    }
+}
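
`wait_for_element` and `wait_for_visible` above share one shape: try, check the deadline, sleep, repeat. A minimal sketch of that loop as a generic helper follows. It is deliberately synchronous and uses `std::thread::sleep` so that it has no dependencies, whereas the code above is async and sleeps with `tokio::time::sleep`. `poll_until` is a hypothetical name, not part of this change.

```rust
use std::time::{Duration, Instant};

/// Call `attempt` until it yields a value or `timeout` has elapsed.
/// Like the loops above, it always makes at least one attempt, and it
/// checks the deadline after each failed attempt before sleeping.
pub fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    let start = Instant::now();
    loop {
        if let Some(value) = attempt() {
            return Some(value);
        }
        if start.elapsed() >= timeout {
            return None;
        }
        std::thread::sleep(interval);
    }
}
```

An async variant would have the same structure, with the closure returning a future and the sleep swapped for the runtime's timer.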
--- a/crates/g3-computer-control/tests/integration_test.rs
+++ b/crates/g3-computer-control/tests/integration_test.rs
@@ -0,0 +1,62 @@
+use g3_computer_control::*;
+
+#[tokio::test]
+async fn test_mouse_movement() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Move mouse to center of screen (assuming 1920x1080)
+    let result = controller.move_mouse(960, 540).await;
+    assert!(result.is_ok(), "Failed to move mouse: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_typing() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Type some text
+    let result = controller.type_text("Hello, World!").await;
+    assert!(result.is_ok(), "Failed to type text: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_screenshot() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Take screenshot
+    let path = "/tmp/test_screenshot.png";
+    let result = controller.take_screenshot(path, None, None).await;
+    assert!(result.is_ok(), "Failed to take screenshot: {:?}", result.err());
+    
+    // Verify file exists
+    assert!(std::path::Path::new(path).exists(), "Screenshot file was not created");
+    
+    // Clean up
+    let _ = std::fs::remove_file(path);
+}
+
+#[tokio::test]
+async fn test_click() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Click at the current cursor position (the test does not move the mouse first)
+    let result = controller.click(types::MouseButton::Left).await;
+    assert!(result.is_ok(), "Failed to click: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_double_click() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Double click
+    let result = controller.double_click(types::MouseButton::Left).await;
+    assert!(result.is_ok(), "Failed to double click: {:?}", result.err());
+}
+
+#[tokio::test]
+async fn test_press_key() {
+    let controller = create_controller().expect("Failed to create controller");
+    
+    // Press escape key
+    let result = controller.press_key("escape").await;
+    assert!(result.is_ok(), "Failed to press key: {:?}", result.err());
+}
--- a/crates/g3-config/src/autonomous_config_tests.rs
+++ b/crates/g3-config/src/autonomous_config_tests.rs
@@ -0,0 +1,131 @@
+#[cfg(test)]
+mod autonomous_config_tests {
+    use crate::{Config, AnthropicConfig, DatabricksConfig};
+
+    #[test]
+    fn test_default_autonomous_config() {
+        let config = Config::default();
+        assert!(config.autonomous.coach_provider.is_none());
+        assert!(config.autonomous.coach_model.is_none());
+        assert!(config.autonomous.player_provider.is_none());
+        assert!(config.autonomous.player_model.is_none());
+    }
+
+    #[test]
+    fn test_for_coach_with_overrides() {
+        let mut config = Config::default();
+        
+        // Set up base config with anthropic
+        config.providers.anthropic = Some(AnthropicConfig {
+            api_key: "test-key".to_string(),
+            model: "claude-3-5-sonnet-20241022".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+        });
+        
+        // Set coach overrides
+        config.autonomous.coach_provider = Some("anthropic".to_string());
+        config.autonomous.coach_model = Some("claude-3-opus-20240229".to_string());
+        
+        let coach_config = config.for_coach().unwrap();
+        
+        // Verify coach uses overridden provider and model
+        assert_eq!(coach_config.providers.default_provider, "anthropic");
+        assert_eq!(
+            coach_config.providers.anthropic.as_ref().unwrap().model,
+            "claude-3-opus-20240229"
+        );
+    }
+
+    #[test]
+    fn test_for_player_with_overrides() {
+        let mut config = Config::default();
+        
+        // Set up base config with databricks
+        config.providers.databricks = Some(DatabricksConfig {
+            host: "https://test.databricks.com".to_string(),
+            token: Some("test-token".to_string()),
+            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+            use_oauth: Some(false),
+        });
+        
+        // Set player overrides
+        config.autonomous.player_provider = Some("databricks".to_string());
+        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());
+        
+        let player_config = config.for_player().unwrap();
+        
+        // Verify player uses overridden provider and model
+        assert_eq!(player_config.providers.default_provider, "databricks");
+        assert_eq!(
+            player_config.providers.databricks.as_ref().unwrap().model,
+            "databricks-dbrx-instruct"
+        );
+    }
+
+    #[test]
+    fn test_no_overrides_uses_defaults() {
+        let mut config = Config::default();
+        config.providers.default_provider = "databricks".to_string();
+        
+        let coach_config = config.for_coach().unwrap();
+        let player_config = config.for_player().unwrap();
+        
+        // Both should use the default provider when no overrides
+        assert_eq!(coach_config.providers.default_provider, "databricks");
+        assert_eq!(player_config.providers.default_provider, "databricks");
+    }
+
+    #[test]
+    fn test_provider_override_only() {
+        let mut config = Config::default();
+        
+        config.providers.anthropic = Some(AnthropicConfig {
+            api_key: "test-key".to_string(),
+            model: "claude-3-5-sonnet-20241022".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+        });
+        
+        // Only override provider, not model
+        config.autonomous.coach_provider = Some("anthropic".to_string());
+        
+        let coach_config = config.for_coach().unwrap();
+        
+        // Should use overridden provider with its default model
+        assert_eq!(coach_config.providers.default_provider, "anthropic");
+        assert_eq!(
+            coach_config.providers.anthropic.as_ref().unwrap().model,
+            "claude-3-5-sonnet-20241022"
+        );
+    }
+
+    #[test]
+    fn test_model_override_only() {
+        let mut config = Config::default();
+        config.providers.default_provider = "databricks".to_string();
+        
+        config.providers.databricks = Some(DatabricksConfig {
+            host: "https://test.databricks.com".to_string(),
+            token: Some("test-token".to_string()),
+            model: "databricks-meta-llama-3-1-70b-instruct".to_string(),
+            max_tokens: Some(4096),
+            temperature: Some(0.1),
+            use_oauth: Some(false),
+        });
+        
+        // Only override model, not provider
+        config.autonomous.player_model = Some("databricks-dbrx-instruct".to_string());
+        
+        let player_config = config.for_player().unwrap();
+        
+        // Should use default provider with overridden model
+        assert_eq!(player_config.providers.default_provider, "databricks");
+        assert_eq!(
+            player_config.providers.databricks.as_ref().unwrap().model,
+            "databricks-dbrx-instruct"
+        );
+    }
+}
--- a/crates/g3-config/src/lib.rs
+++ b/crates/g3-config/src/lib.rs
@@ -2,10 +2,16 @@ use serde::{Deserialize, Serialize};
 use anyhow::Result;
 use std::path::Path;

+#[cfg(test)]
+mod autonomous_config_tests;
+
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct Config {
    pub providers: ProvidersConfig,
    pub agent: AgentConfig,
+    pub computer_control: ComputerControlConfig,
+    pub webdriver: WebDriverConfig,
+    pub autonomous: AutonomousConfig,
 }

 #[derive(Debug, Clone, Serialize, Deserialize)]
@@ -62,6 +68,52 @@ pub struct AgentConfig {
    pub timeout_seconds: u64,
 }

+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ComputerControlConfig {
+    pub enabled: bool,
+    pub require_confirmation: bool,
+    pub max_actions_per_second: u32,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct WebDriverConfig {
+    pub enabled: bool,
+    pub safari_port: u16,
+}
+
+impl Default for WebDriverConfig {
+    fn default() -> Self {
+        Self {
+            enabled: false,
+            safari_port: 4444,
+        }
+    }
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct AutonomousConfig {
+    pub coach_provider: Option<String>,
+    pub coach_model: Option<String>,
+    pub player_provider: Option<String>,
+    pub player_model: Option<String>,
+}
+
+impl Default for AutonomousConfig {
+    fn default() -> Self {
+        Self { coach_provider: None, coach_model: None, player_provider: None, player_model: None }
+    }
+}
+
+impl Default for ComputerControlConfig {
+    fn default() -> Self {
+        Self {
+            enabled: false, // Disabled by default for safety
+            require_confirmation: true,
+            max_actions_per_second: 5,
+        }
+    }
+}
+
 impl Default for Config {
    fn default() -> Self {
        Self {
@@ -84,6 +136,9 @@ impl Default for Config {
                enable_streaming: true,
                timeout_seconds: 60,
            },
+            computer_control: ComputerControlConfig::default(),
+            webdriver: WebDriverConfig::default(),
+            autonomous: AutonomousConfig::default(),
        }
    }
 }
@@ -194,6 +249,9 @@ impl Config {
                enable_streaming: true,
                timeout_seconds: 60,
            },
+            computer_control: ComputerControlConfig::default(),
+            webdriver: WebDriverConfig::default(),
+            autonomous: AutonomousConfig::default(),
        }
    }
    
@@ -262,4 +320,78 @@ impl Config {
        
        Ok(config)
    }
+    
+    /// Create a config for the coach agent in autonomous mode
+    pub fn for_coach(&self) -> Result<Self> {
+        let mut config = self.clone();
+        
+        // Apply coach-specific overrides if configured
+        if let Some(ref coach_provider) = self.autonomous.coach_provider {
+            config.providers.default_provider = coach_provider.clone();
+        }
+        
+        if let Some(ref coach_model) = self.autonomous.coach_model {
+            // Apply model override to the coach's provider
+            match config.providers.default_provider.as_str() {
+                "anthropic" => {
+                    if let Some(ref mut anthropic) = config.providers.anthropic {
+                        anthropic.model = coach_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Coach provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
+                        ));
+                    }
+                }
+                "databricks" => {
+                    if let Some(ref mut databricks) = config.providers.databricks {
+                        databricks.model = coach_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Coach provider 'databricks' is not configured. Please add databricks configuration to your config file."
+                        ));
+                    }
+                }
+                _ => {} // Unknown provider name: the coach model override is silently ignored
+            }
+        }
+        
+        Ok(config)
+    }
+    
+    /// Create a config for the player agent in autonomous mode
+    pub fn for_player(&self) -> Result<Self> {
+        let mut config = self.clone();
+        
+        // Apply player-specific overrides if configured
+        if let Some(ref player_provider) = self.autonomous.player_provider {
+            config.providers.default_provider = player_provider.clone();
+        }
+        
+        if let Some(ref player_model) = self.autonomous.player_model {
+            // Apply model override to the player's provider
+            match config.providers.default_provider.as_str() {
+                "anthropic" => {
+                    if let Some(ref mut anthropic) = config.providers.anthropic {
+                        anthropic.model = player_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Player provider 'anthropic' is not configured. Please add anthropic configuration to your config file."
+                        ));
+                    }
+                }
+                "databricks" => {
+                    if let Some(ref mut databricks) = config.providers.databricks {
+                        databricks.model = player_model.clone();
+                    } else {
+                        return Err(anyhow::anyhow!(
+                            "Player provider 'databricks' is not configured. Please add databricks configuration to your config file."
+                        ));
+                    }
+                }
+                _ => {} // Unknown provider name: the player model override is silently ignored
+            }
+        }
+        
+        Ok(config)
+    }
 }
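Review note: `for_coach()` and `for_player()` above differ only in which override fields they read and in the role name used in the error text. The following is a minimal sketch of how that shared logic might be factored out. It is not the code in this change: `SketchConfig` and `apply_role_overrides` are hypothetical names, the provider table is simplified to a map of provider name to model name, and, unlike the `_ => {}` arm above, the sketch returns an error for any provider that is not configured.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the real `Config`.
#[derive(Debug, Clone)]
pub struct SketchConfig {
    pub default_provider: String,
    /// Provider name -> model name, for configured providers only.
    pub models: HashMap<String, String>,
}

/// Applies an optional provider override first, then an optional model
/// override to whichever provider is active afterwards.
pub fn apply_role_overrides(
    base: &SketchConfig,
    role: &str,
    provider: Option<&str>,
    model: Option<&str>,
) -> Result<SketchConfig, String> {
    // Work on a copy so the base configuration is never mutated.
    let mut config = base.clone();

    if let Some(p) = provider {
        config.default_provider = p.to_string();
    }

    if let Some(m) = model {
        let active = config.default_provider.clone();
        match config.models.get_mut(&active) {
            Some(slot) => *slot = m.to_string(),
            // Assumption: an unconfigured provider is reported as an error
            // instead of being silently skipped.
            None => {
                return Err(format!(
                    "{} provider '{}' is not configured",
                    role, active
                ));
            }
        }
    }

    Ok(config)
}
```

With a helper of this shape, each role method would reduce to a single call that passes its own pair of `Option` fields and its role name.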
--- a/crates/g3-core/Cargo.toml
+++ b/crates/g3-core/Cargo.toml
@@ -8,6 +8,7 @@ description = "Core engine for G3 AI coding agent"
 g3-providers = { path = "../g3-providers" }
 g3-config = { path = "../g3-config" }
 g3-execution = { path = "../g3-execution" }
+g3-computer-control = { path = "../g3-computer-control" }
 tokio = { workspace = true }
 reqwest = { workspace = true }
 anyhow = { workspace = true }
@@ -23,3 +24,4 @@ futures-util = "0.3"
 chrono = { version = "0.4", features = ["serde"] }
 rand = "0.8"
 regex = "1.0"
+shellexpand = "3.1"
--- a/crates/g3-core/src/lib.rs
+++ b/crates/g3-core/src/lib.rs
--- a/crates/g3-core/src/tilde_expansion_tests.rs
+++ b/crates/g3-core/src/tilde_expansion_tests.rs
@@ -0,0 +1,36 @@
+#[cfg(test)]
+mod tilde_expansion_tests {
+    use std::env;
+
+    #[test]
+    fn test_tilde_expansion() {
+        // Test that shellexpand works
+        let path_with_tilde = "~/test.txt";
+        let expanded = shellexpand::tilde(path_with_tilde);
+        
+        // Get the actual home directory
+        let home = env::var("HOME").expect("HOME environment variable not set");
+        
+        // Verify expansion happened
+        assert_eq!(expanded.as_ref(), format!("{}/test.txt", home));
+        assert!(!expanded.contains("~"));
+    }
+
+    #[test]
+    fn test_tilde_expansion_with_subdirs() {
+        let path_with_tilde = "~/Documents/test.txt";
+        let expanded = shellexpand::tilde(path_with_tilde);
+        
+        let home = env::var("HOME").expect("HOME environment variable not set");
+        
+        assert_eq!(expanded.as_ref(), format!("{}/Documents/test.txt", home));
+    }
+
+    #[test]
+    fn test_no_tilde_unchanged() {
+        let path_without_tilde = "/absolute/path/test.txt";
+        let expanded = shellexpand::tilde(path_without_tilde);
+        
+        assert_eq!(expanded.as_ref(), path_without_tilde);
+    }
+}
--- a/crates/g3-core/tests/test_context_thinning.rs
+++ b/crates/g3-core/tests/test_context_thinning.rs
@@ -0,0 +1,157 @@
+use g3_core::ContextWindow;
+use g3_providers::{Message, MessageRole};
+
+#[test]
+fn test_thinning_thresholds() {
+    let mut context = ContextWindow::new(10000);
+    
+    // At 0%, should not thin
+    assert!(!context.should_thin());
+    
+    // Simulate reaching 50% usage
+    context.used_tokens = 5000;
+    assert!(context.should_thin());
+    
+    // After thinning at 50%, should not thin again until next threshold
+    context.last_thinning_percentage = 50;
+    assert!(!context.should_thin());
+    
+    // At 60%, should thin again
+    context.used_tokens = 6000;
+    assert!(context.should_thin());
+    
+    // After thinning at 60%, should not thin
+    context.last_thinning_percentage = 60;
+    assert!(!context.should_thin());
+    
+    // At 70%, should thin
+    context.used_tokens = 7000;
+    assert!(context.should_thin());
+    
+    // At 80%, should thin
+    context.last_thinning_percentage = 70;
+    context.used_tokens = 8000;
+    assert!(context.should_thin());
+    
+    // After 80%, should not thin (compaction takes over)
+    context.last_thinning_percentage = 80;
+    context.used_tokens = 8500;
+    assert!(!context.should_thin());
+}
+
+#[test]
+fn test_thin_context_basic() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add 9 alternating messages, so the first third is 3 messages (indices 0-2)
+    for i in 0..9 {
+        if i % 2 == 0 {
+            context.add_message(Message {
+                role: MessageRole::Assistant,
+                content: format!("Assistant message {}", i),
+            });
+        } else {
+            // Add tool results with varying sizes
+            let content = if i == 1 {
+                // Large tool result (> 1000 chars)
+                format!("Tool result: {}", "x".repeat(1500))
+            } else if i == 3 {
+                // Another large tool result
+                format!("Tool result: {}", "y".repeat(2000))
+            } else {
+                // Small tool result (< 1000 chars)
+                format!("Tool result: small result {}", i)
+            };
+            
+            context.add_message(Message {
+                role: MessageRole::User,
+                content,
+            });
+        }
+    }
+    
+    // Trigger thinning at 50%
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    println!("Thinning summary: {}", summary);
+    
+    // Only index 1 is a large tool result inside the first third, so exactly 1 is thinned
+    assert!(summary.contains("1 tool result"), "Summary was: {}", summary);
+    assert!(summary.contains("50%"));
+    
+    // Check that no large tool result remains in the first third
+    let first_third_end = context.conversation_history.len() / 3;
+    for i in 0..first_third_end {
+        if let Some(msg) = context.conversation_history.get(i) {
+            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
+                if msg.content.len() > 1000 {
+                    panic!("Found un-thinned large tool result at index {}", i);
+                }
+            }
+        }
+    }
+}
+
+#[test]
+fn test_thin_context_no_large_results() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add only small messages
+    for i in 0..9 {
+        context.add_message(Message {
+            role: MessageRole::User,
+            content: format!("Tool result: small {}", i),
+        });
+    }
+    
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    // Should report no large results found
+    assert!(summary.contains("no large tool results found"));
+}
+
+#[test]
+fn test_thin_context_only_affects_first_third() {
+    let mut context = ContextWindow::new(10000);
+    
+    // Add 12 messages (first third = 4 messages)
+    for i in 0..12 {
+        let content = if i % 2 == 1 {
+            // All odd indices are large tool results
+            format!("Tool result: {}", "x".repeat(1500))
+        } else {
+            format!("Assistant message {}", i)
+        };
+        
+        let role = if i % 2 == 1 {
+            MessageRole::User
+        } else {
+            MessageRole::Assistant
+        };
+        
+        context.add_message(Message { role, content });
+    }
+    
+    context.used_tokens = 5000;
+    let summary = context.thin_context();
+    
+    // First third is 4 messages (indices 0-3), so only indices 1 and 3 should be thinned
+    // That's 2 tool results
+    assert!(summary.contains("2 tool results"));
+    
+    // Check that messages after the first third are NOT thinned
+    let first_third_end = context.conversation_history.len() / 3;
+    for i in first_third_end..context.conversation_history.len() {
+        if let Some(msg) = context.conversation_history.get(i) {
+            if matches!(msg.role, MessageRole::User) && msg.content.starts_with("Tool result:") {
+                // These should still be large (not thinned)
+                if i % 2 == 1 {
+                    assert!(msg.content.len() > 1000, 
+                        "Message at index {} should not have been thinned", i);
+                }
+            }
+        }
+    }
+}
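Review note: the `ContextWindow` implementation is not part of this file, so the behaviour the tests above pin down has to be read off the assertions. The following is a minimal sketch of one implementation that is consistent with them. `ThinningSketch`, the threshold list, the 1000-byte cutoff, and the placeholder text are all assumptions inferred from the tests, not the actual `g3_core` code.

```rust
/// Hypothetical stand-in for `ContextWindow`; field names follow the tests.
pub struct ThinningSketch {
    pub total_tokens: u32,
    pub used_tokens: u32,
    pub last_thinning_percentage: u32,
    pub history: Vec<String>,
}

/// Thresholds inferred from the assertions in `test_thinning_thresholds`.
const THRESHOLDS: [u32; 4] = [50, 60, 70, 80];

/// Size cutoff inferred from the `len() > 1000` checks in the tests.
const LARGE_RESULT_BYTES: usize = 1000;

impl ThinningSketch {
    /// Integer percentage of the window currently in use.
    pub fn percentage_used(&self) -> u32 {
        if self.total_tokens == 0 {
            return 0;
        }
        (self.used_tokens as u64 * 100 / self.total_tokens as u64) as u32
    }

    /// True when usage has reached a threshold that lies above the
    /// percentage recorded at the last thinning pass.
    pub fn should_thin(&self) -> bool {
        let pct = self.percentage_used();
        THRESHOLDS
            .iter()
            .any(|&t| pct >= t && self.last_thinning_percentage < t)
    }

    /// Replaces large tool results in the first third of the history with
    /// a short placeholder and returns how many were replaced.
    pub fn thin_first_third(&mut self) -> usize {
        let end = self.history.len() / 3;
        let mut thinned = 0;
        for msg in self.history[..end].iter_mut() {
            if msg.starts_with("Tool result:") && msg.len() > LARGE_RESULT_BYTES {
                // Placeholder wording is illustrative only.
                *msg = format!("Tool result: [{} bytes elided]", msg.len());
                thinned += 1;
            }
        }
        thinned
    }
}
```

Because the last threshold is 80, nothing above it can trigger another pass, which matches the "compaction takes over" comment in the first test.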
--- a/test-ai-requirements.sh
+++ b/test-ai-requirements.sh
@@ -0,0 +1,39 @@
+#!/bin/bash
+# Prints manual test instructions for AI-enhanced interactive requirements mode
+
+echo "Testing AI-enhanced interactive requirements mode..."
+echo ""
+
+# Create a test workspace
+TEST_WORKSPACE="/tmp/g3-test-interactive-$(date +%s)"
+mkdir -p "$TEST_WORKSPACE"
+
+echo "Test workspace: $TEST_WORKSPACE"
+echo ""
+
+# Create sample brief input
+BRIEF_INPUT="build a calculator cli in rust with basic operations"
+
+echo "Brief input:"
+echo "---"
+echo "$BRIEF_INPUT"
+echo "---"
+echo ""
+
+echo "This will:"
+echo "1. Send brief input to AI"
+echo "2. AI generates structured requirements.md"
+echo "3. Show enhanced requirements"
+echo "4. Prompt for confirmation (y/e/n)"
+echo ""
+
+echo "To test manually, run:"
+echo "cargo run -- --autonomous --interactive-requirements --workspace $TEST_WORKSPACE"
+echo ""
+echo "Then type: $BRIEF_INPUT"
+echo "Press Ctrl+D"
+echo "Review the AI-generated requirements"
+echo "Choose 'y' to proceed, 'e' to edit, or 'n' to cancel"
+echo ""
+
+echo "Test workspace will be at: $TEST_WORKSPACE"
Author	SHA1	Message	Date
Michael Neale	f2ed303550	Revert "don't need this" This reverts commit `93121c18e0`.	2025-10-22 14:53:25 +11:00
Michael Neale	93121c18e0	don't need this	2025-10-22 14:30:13 +11:00
Michael Neale	ed84a940f9	tweak auto mode	2025-10-22 14:27:17 +11:00
Michael Neale	3128b5d8b9	can choose per mode models for auto mode	2025-10-22 14:19:00 +11:00
Dhanji Prasanna	758e255af8	dont run safaridriver --enable each time	2025-10-21 16:00:58 +11:00
Dhanji Prasanna	393826ae02	webdriver tools	2025-10-21 14:34:41 +11:00
Dhanji Prasanna	3afad3d61f	progressive context thinning	2025-10-20 15:29:44 +11:00
Dhanji Prasanna	2488cc54d5	docs: update README and DESIGN to reflect current project state - Add g3-computer-control crate to architecture documentation - Document all 13 tools including computer control and TODO management - Add context thinning feature documentation (50-80% thresholds) - Update tool ecosystem section with complete tool list - Remove broken link to non-existent COMPUTER_CONTROL.md - Update workspace count from 5 to 6 crates - Add platform-specific implementation details for computer control - Document OCR support via Tesseract - Clarify setup instructions for computer control features	2025-10-20 15:03:22 +11:00
Dhanji Prasanna	2ad0c9a3fd	todo list formatting	2025-10-20 14:27:53 +11:00
Dhanji Prasanna	2008a81193	fix to pass feedback to player (broken by todo system)	2025-10-20 14:12:08 +11:00
Dhanji Prasanna	776f5034b8	TODO tools	2025-10-20 10:50:53 +11:00
Dhanji Prasanna	92bece957b	colorizing tool calls	2025-10-18 16:09:30 +11:00
Dhanji Prasanna	767299ff4e	minor	2025-10-18 16:03:58 +11:00
Dhanji Prasanna	9d35449be8	~ expansion for read_file and str_replace	2025-10-18 16:01:15 +11:00
Dhanji Prasanna	da652bf287	computer control tools	2025-10-18 14:16:50 +11:00
Dhanji Prasanna	a566171203	small turn completing bug	2025-10-18 13:25:23 +11:00
Dhanji Prasanna	347c9e1e00	colorize timing based on duration	2025-10-17 13:54:21 +11:00
Dhanji Prasanna	aa7eda0331	fix wall clock timing	2025-10-17 10:36:21 +11:00
Dhanji Prasanna	e42c76f3b9	Tune coach pickiness down	2025-10-17 10:28:08 +11:00
Dhanji Prasanna	dd211fab1c	panic fix	2025-10-17 09:50:01 +11:00
Dhanji R. Prasanna	bcece38473	Merge pull request #5 from dhanji/micn/agent-tweaks load AGENTS.md if there	2025-10-16 15:06:14 +11:00